51
Chen L, Liu G, Zhang T. Integrating machine learning and genome editing for crop improvement. ABIOTECH 2024; 5:262-277. [PMID: 38974863 PMCID: PMC11224061 DOI: 10.1007/s42994-023-00133-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Accepted: 12/18/2023] [Indexed: 07/09/2024]
Abstract
Genome editing is a promising technique that has been broadly utilized for basic gene function studies and trait improvements. Simultaneously, the exponential growth of computational power and big data now promotes the application of machine learning in biological research. In this regard, machine learning shows great potential in the refinement of genome editing systems and crop improvement. Here, we review advances in applying machine learning to genome editing optimization, with emphasis on enhancing editing efficiency and specificity. Additionally, we demonstrate how machine learning bridges genome editing and crop breeding through accurate key site detection and guide RNA design. Finally, we discuss the current challenges and prospects of these two techniques in crop improvement. By integrating advanced genome editing techniques with machine learning, progress in crop breeding will be further accelerated in the future.
Affiliation(s)
- Long Chen
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agricultural College of Yangzhou University, Yangzhou, 225009 China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009 China
- Guanqing Liu
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agricultural College of Yangzhou University, Yangzhou, 225009 China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009 China
- Tao Zhang
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Zhongshan Biological Breeding Laboratory/Key Laboratory of Plant Functional Genomics of the Ministry of Education, Agricultural College of Yangzhou University, Yangzhou, 225009 China
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, 225009 China
52
Bergman S, Tuller T. Strong association between genomic 3D structure and CRISPR cleavage efficiency. PLoS Comput Biol 2024; 20:e1012214. [PMID: 38848440 PMCID: PMC11189236 DOI: 10.1371/journal.pcbi.1012214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 06/20/2024] [Accepted: 05/30/2024] [Indexed: 06/09/2024] Open
Abstract
CRISPR is a gene editing technology which enables precise in-vivo genome editing; but its potential is hampered by its relatively low specificity and sensitivity. Improving CRISPR's on-target and off-target effects requires a better understanding of its mechanism and determinants. Here we demonstrate, for the first time, the chromosomal 3D spatial structure's association with CRISPR's cleavage efficiency, and its predictive capabilities. We used high-resolution Hi-C data to estimate the 3D distance between different regions in the human genome and utilized these spatial properties to generate 3D-based features, characterizing each region's density. We evaluated these features based on empirical, in-vivo CRISPR efficiency data and compared them to 425 features used in state-of-the-art models. The 3D features ranked in the top 13% of the features, and significantly improved the predictive power of LASSO and xgboost models trained with these features. The features indicated that sites with lower spatial density demonstrated higher efficiency. Understanding how CRISPR is affected by the 3D DNA structure provides insight into CRISPR's mechanism in general and improves our ability to correctly predict CRISPR's cleavage as well as design sgRNAs for therapeutic and scientific use.
Affiliation(s)
- Shaked Bergman
- Department of Biomedical Engineering, Tel-Aviv University, Tel Aviv, Israel
- Tamir Tuller
- Department of Biomedical Engineering, Tel-Aviv University, Tel Aviv, Israel
- The Sagol School of Neuroscience, Tel-Aviv University, Tel Aviv, Israel
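The abstract of entry 52 describes 3D-based features that characterize each region's spatial density, derived from Hi-C distance estimates. The abstract does not define those features, so the following is only a minimal sketch of one plausible density measure: the number of other regions lying within a fixed radius. The function name `spatial_density`, the `radius` parameter, and the toy distance matrix are assumptions made for illustration and are not taken from the paper.

```python
from typing import List


def spatial_density(dist: List[List[float]], radius: float) -> List[int]:
    """For each region, count how many other regions lie within `radius`.

    `dist` is a hypothetical symmetric matrix of pairwise 3D distances
    between genomic regions (e.g. estimated from Hi-C contact data).
    """
    n = len(dist)
    return [
        sum(1 for j in range(n) if j != i and dist[i][j] <= radius)
        for i in range(n)
    ]


# Toy distance matrix for four regions; the values are made up.
toy = [
    [0.0, 1.0, 1.5, 9.0],
    [1.0, 0.0, 1.2, 8.5],
    [1.5, 1.2, 0.0, 9.5],
    [9.0, 8.5, 9.5, 0.0],
]
densities = spatial_density(toy, radius=2.0)
print(densities)
```

In this toy case the first three regions are packed together and the fourth is isolated, so the fourth has the lowest density. The abstract reports that sites with lower spatial density showed higher cleavage efficiency.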
53
Yee BJ, Ali NA, Mohd-Naim NFB, Ahmed MU. Exploiting the Specificity of CRISPR/Cas System for Nucleic Acids Amplification-Free Disease Diagnostics in the Point-of-Care. CHEM & BIO ENGINEERING 2024; 1:330-339. [PMID: 39974464 PMCID: PMC11835143 DOI: 10.1021/cbe.3c00112] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/13/2023] [Revised: 12/27/2023] [Accepted: 12/27/2023] [Indexed: 02/21/2025]
Abstract
Rapid and reliable molecular diagnostics employing target nucleic acids and small biomarkers are crucial strategies required for the precise detection of numerous diseases. Although diagnoses based on nucleic acid recognition are some of the most efficient and precise procedures, these tests often require expensive equipment and skilled professionals. Recent advancements in diagnostic innovations, particularly those based on clustered regularly interspaced short palindromic repeats (CRISPR), aim to provide thorough screening at homes, in clinics, and in the field. In comparison to traditional molecular techniques like PCR, CRISPR/Cas-based detection, using the single-stranded nucleic acid trans-cleavage abilities of Cas12 or Cas13, shows significant potential as a molecular diagnostic tool. It offers benefits such as attomolar-level sensitivity, single-base precision, and rapid turnover rates. Both Cas enzymes demonstrate exceptional specificity and sensitivity, holding substantial promise in disease diagnostics and beyond. Consequently, various amplification-free CRISPR/Cas-based detection methods have emerged, aiming to maintain sensitivity despite the absence of pre-amplification. This allows for the detection of non-nucleic acid targets and facilitates integration into point-of-care settings. This Review highlights current advances in amplification-free CRISPR/Cas detection systems in disease diagnostics and investigates their utility in point-of-care settings. Furthermore, the mechanisms of alternative CRISPR-based amplification-free detection of other small molecules, aside from nucleic acids, for disease diagnosis will also be briefly discussed.
Affiliation(s)
- Bong Jing Yee
- Biosensors and Nanobiotechnology Laboratory, Integrated Science Building, Faculty of Science, Universiti Brunei Darussalam, Gadong 1410, Brunei Darussalam
- Nurul Ajeerah Ali
- Biosensors and Nanobiotechnology Laboratory, Integrated Science Building, Faculty of Science, Universiti Brunei Darussalam, Gadong 1410, Brunei Darussalam
- Noor Faizah binti Mohd-Naim
- Biosensors and Nanobiotechnology Laboratory, Integrated Science Building, Faculty of Science, Universiti Brunei Darussalam, Gadong 1410, Brunei Darussalam
- PAPRSB Institute of Health Science, Universiti Brunei Darussalam, Gadong 1410, Brunei Darussalam
- Minhaz Uddin Ahmed
- Biosensors and Nanobiotechnology Laboratory, Integrated Science Building, Faculty of Science, Universiti Brunei Darussalam, Gadong 1410, Brunei Darussalam
54
Lemmens M, Dorsheimer L, Zeller A, Dietz-Baum Y. Non-clinical safety assessment of novel drug modalities: Genome safety perspectives on viral-, nuclease- and nucleotide-based gene therapies. MUTATION RESEARCH. GENETIC TOXICOLOGY AND ENVIRONMENTAL MUTAGENESIS 2024; 896:503767. [PMID: 38821669 DOI: 10.1016/j.mrgentox.2024.503767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 04/08/2024] [Accepted: 05/13/2024] [Indexed: 06/02/2024]
Abstract
Gene therapies have emerged as promising treatments for various conditions including inherited diseases as well as cancer. Ensuring their safe clinical application requires the development of appropriate safety testing strategies. Several guidelines have been provided by health authorities to address these concerns. These guidelines state that non-clinical testing should be carried out on a case-by-case basis depending on the modality. This review focuses on the genome safety assessment of frequently used gene therapy modalities, namely Adeno Associated Viruses (AAVs), Lentiviruses, designer nucleases and mRNAs. Important safety considerations for these modalities, amongst others, are vector integrations into the patient genome (insertional mutagenesis) and off-target editing. Taking into account the constraints of in vivo studies, health authorities endorse the development of novel approach methodologies (NAMs), which are innovative in vitro strategies for genotoxicity testing. This review provides an overview of NAMs applied to viral and CRISPR/Cas9 safety, including next generation sequencing-based methods for integration site analysis and off-target editing. Additionally, NAMs to evaluate the oncogenicity risk arising from unwanted genomic modifications are discussed. Thus, a range of promising techniques are available to support the safe development of gene therapies. Thorough validation, comparisons and correlations with clinical outcomes are essential to identify the most reliable safety testing strategies. By providing a comprehensive overview of these NAMs, this review aims to contribute to a better understanding of the genome safety perspectives of gene therapies.
Affiliation(s)
- Lena Dorsheimer
- Research and Development, Preclinical Safety, Sanofi, Industriepark Hoechst, Frankfurt am Main 65926, Germany.
- Andreas Zeller
- Pharmaceutical Sciences, pRED Innovation Center Basel, Hoffmann-La Roche Ltd, Basel 4070, Switzerland
- Yasmin Dietz-Baum
- Research and Development, Preclinical Safety, Sanofi, Industriepark Hoechst, Frankfurt am Main 65926, Germany
55
Bose S, Banerjee S, Kumar S, Saha A, Nandy D, Hazra S. Review of applications of artificial intelligence (AI) methods in crop research. J Appl Genet 2024; 65:225-240. [PMID: 38216788 DOI: 10.1007/s13353-023-00826-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 12/23/2023] [Accepted: 12/26/2023] [Indexed: 01/14/2024]
Abstract
Sophisticated and modern crop improvement techniques can bridge the gap for feeding the ever-increasing population. Artificial intelligence (AI) refers to the simulation of human intelligence in machines through the application of computational algorithms, machine learning (ML) and deep learning (DL) techniques. These techniques aim to generalise patterns and relationships from historical data, employing various mathematical optimisation techniques to build prediction models that facilitate selection of superior genotypes. These techniques are less resource intensive and can solve the problem based on the analysis of large-scale phenotypic datasets. ML for genomic selection (GS) uses high-throughput genotyping technologies to gather genetic information on a large number of markers across the genome. The prediction of GS models is based on the mathematical relation between genotypic and phenotypic data from the training population. ML techniques have emerged as powerful tools for genome editing through analysing large-scale genomic data and facilitating the development of accurate prediction models. Precise phenotyping is a prerequisite to advance crop breeding for solving agricultural production-related issues. ML algorithms can solve this problem by generating predictive models based on the analysis of large-scale phenotypic datasets. DL models also show potential for reliable, precise phenotyping. This review provides a comprehensive overview of various ML and DL models, their applications, and their potential to enhance the efficiency, specificity and safety of advanced crop improvement protocols such as genomic selection and genome editing, along with phenotypic prediction to promote accelerated breeding.
Affiliation(s)
- Suvojit Bose
- Department of Vegetables and Spice Crops, Uttar Banga Krishi Viswavidyalaya, Pundibari, Cooch Behar, 736165, West Bengal, India
- Soumya Kumar
- School of Agricultural Sciences, JIS University, Kolkata, 700109, West Bengal, India
- Akash Saha
- School of Agricultural Sciences, JIS University, Kolkata, 700109, West Bengal, India
- Debalina Nandy
- School of Agricultural Sciences, JIS University, Kolkata, 700109, West Bengal, India
- Soham Hazra
- Department of Agriculture, Brainware University, Barasat, 700125, West Bengal, India.
56
Mu W, Luo T, Barrera A, Bounds LR, Klann TS, Ter Weele M, Bryois J, Crawford GE, Sullivan PF, Gersbach CA, Love MI, Li Y. Machine learning methods for predicting guide RNA effects in CRISPR epigenome editing experiments. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.18.590188. [PMID: 38659894 PMCID: PMC11042384 DOI: 10.1101/2024.04.18.590188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
CRISPR epigenomic editing technologies enable functional interrogation of non-coding elements. However, current computational methods for guide RNA (gRNA) design do not effectively predict power potential or molecular and cellular impact, and so cannot optimize for the efficient gRNAs that are crucial for successful applications of these technologies. We present "launch-dCas9" (machine LeArning based UNified CompreHensive framework for CRISPR-dCas9) to predict gRNA impact from multiple perspectives, including cell fitness, wildtype abundance (gauging power potential), and gene expression in single cells. Our launch-dCas9, built and evaluated using experiments involving >1 million gRNAs targeted across the human genome, demonstrates relatively high prediction accuracy (AUC up to 0.81) and generalizes across cell lines. Method-prioritized top gRNA(s) are 4.6-fold more likely to exert effects, compared to other gRNAs in the same cis-regulatory region. Furthermore, launch-dCas9 identifies the most critical sequence-related features and functional annotations from >40 features considered. Our results establish launch-dCas9 as a promising approach to design gRNAs for CRISPR epigenomic experiments.
Affiliation(s)
- Wancen Mu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Tianyou Luo
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Alejandro Barrera
- Center for Genomic and Computational Biology, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Lexi R Bounds
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Tyler S Klann
- Center for Genomic and Computational Biology, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Maria Ter Weele
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Julien Bryois
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Gregory E Crawford
- Center for Genomic and Computational Biology, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Pediatrics, Division of Medical Genetics, Duke University Medical Center, Durham, NC, USA
- Patrick F Sullivan
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Charles A Gersbach
- Center for Genomic and Computational Biology, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Michael I Love
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Yun Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
57
Zhu W, Xie H, Chen Y, Zhang G. CrnnCrispr: An Interpretable Deep Learning Method for CRISPR/Cas9 sgRNA On-Target Activity Prediction. Int J Mol Sci 2024; 25:4429. [PMID: 38674012 PMCID: PMC11050447 DOI: 10.3390/ijms25084429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Revised: 04/11/2024] [Accepted: 04/15/2024] [Indexed: 04/28/2024] Open
Abstract
CRISPR/Cas9 is a powerful genome-editing tool in biology, but its wide applications are challenged by a lack of knowledge governing single-guide RNA (sgRNA) activity. Several deep-learning-based methods have been developed for the prediction of on-target activity. However, there is still room for improvement. Here, we proposed a hybrid neural network named CrnnCrispr, which integrates a convolutional neural network and a recurrent neural network for on-target activity prediction. We performed unbiased experiments with four mainstream methods on nine public datasets with varying sample sizes. Additionally, we incorporated a transfer learning strategy to boost the prediction power on small-scale datasets. Our results showed that CrnnCrispr outperformed existing methods in terms of accuracy and generalizability. Finally, we applied a visualization approach to investigate the generalizable nucleotide-position-dependent patterns of sgRNAs for on-target activity, which shows potential in terms of model interpretability and further helps in understanding the principles of sgRNA design.
Affiliation(s)
- Guishan Zhang
- College of Engineering, Shantou University, Shantou 515063, China; (W.Z.); (H.X.); (Y.C.)
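The abstract of entry 57 describes a hybrid convolutional and recurrent network operating on sgRNA sequences, but does not state how sequences are encoded as network input. One-hot encoding is a common choice for sequence models of this kind, and the sketch below illustrates it under that assumption. The function name `one_hot_encode`, the `ACGT` column order, and the example sequence are illustrative and are not taken from the paper.

```python
from typing import List

# Column order of the encoding; an arbitrary but fixed convention.
ALPHABET = "ACGT"


def one_hot_encode(seq: str) -> List[List[int]]:
    """Encode a nucleotide sequence as a (length x 4) one-hot matrix."""
    rows = []
    for base in seq.upper():
        if base not in ALPHABET:
            raise ValueError(f"unexpected base: {base!r}")
        rows.append([1 if base == b else 0 for b in ALPHABET])
    return rows


# Made-up 7-nt example; real sgRNA inputs are longer.
encoded = one_hot_encode("GATTACA")
for base, row in zip("GATTACA", encoded):
    print(base, row)
```

Each row has exactly one nonzero entry, so position and nucleotide identity are both preserved, which is what allows position-dependent patterns to be visualized afterwards.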
58
Walton RT, Qin Y, Blainey PC. CROPseq-multi: a versatile solution for multiplexed perturbation and decoding in pooled CRISPR screens. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.17.585235. [PMID: 38558968 PMCID: PMC10979941 DOI: 10.1101/2024.03.17.585235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Forward genetic screens seek to dissect complex biological systems by systematically perturbing genetic elements and observing the resulting phenotypes. While standard screening methodologies introduce individual perturbations, multiplexing perturbations improves the performance of single-target screens and enables combinatorial screens for the study of genetic interactions. Current tools for multiplexing perturbations are incompatible with pooled screening methodologies that require mRNA-embedded barcodes, including some microscopy and single cell sequencing approaches. Here, we report the development of CROPseq-multi, a CROPseq-inspired lentiviral system to multiplex Streptococcus pyogenes (Sp) Cas9-based perturbations with mRNA-embedded barcodes. CROPseq-multi has equivalent per-guide activity to CROPseq and low lentiviral recombination frequencies. CROPseq-multi is compatible with enrichment screening methodologies and optical pooled screens, and is extensible to screens with single-cell sequencing readouts. For optical pooled screens, an optimized and multiplexed in situ detection protocol improves barcode detection efficiency 10-fold, enables detection of recombination events, and increases decoding efficiency 3-fold relative to CROPseq. CROPseq-multi is a widely applicable multiplexing solution for diverse SpCas9-based genetic screening approaches.
Affiliation(s)
- Russell T. Walton
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biological Engineering, MIT, Cambridge, MA, USA
- Yue Qin
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Eric and Wendy Schmidt Center, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Paul C. Blainey
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biological Engineering, MIT, Cambridge, MA, USA
- Koch Institute for Integrative Cancer Research, MIT, Cambridge, MA, USA
59
Sun J, Guo J, Liu J. CRISPR-M: Predicting sgRNA off-target effect using a multi-view deep learning network. PLoS Comput Biol 2024; 20:e1011972. [PMID: 38483980 DOI: 10.1371/journal.pcbi.1011972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 03/26/2024] [Accepted: 03/05/2024] [Indexed: 03/27/2024] Open
Abstract
Using the CRISPR-Cas9 system to perform base substitutions at the target site is a typical technique for genome editing with the potential for applications in gene therapy and agricultural productivity. When the CRISPR-Cas9 system uses guide RNA to direct the Cas9 endonuclease to the target site, it may misdirect it to a potential off-target site, resulting in unintended genome editing. Although several computational methods have been proposed to predict off-target effects, there is still room for improvement in off-target effect prediction capability. In this paper, we present an effective approach called CRISPR-M with a new encoding scheme and a novel multi-view deep learning model to predict the sgRNA off-target effects for target sites containing indels and mismatches. CRISPR-M takes advantage of convolutional neural networks and bidirectional long short-term memory recurrent neural networks to construct a three-branch multi-view network. Compared with existing methods, CRISPR-M demonstrates significant performance advantages on real-world datasets. Furthermore, experimental analysis of CRISPR-M under multiple metrics reveals its capability to extract features and validates its superiority on sgRNA off-target effect predictions.
Affiliation(s)
- Jialiang Sun
- College of Computer Science, Nankai University, Tianjin, China
- Jun Guo
- College of Software, Northeastern University, Shenyang, China
- Jian Liu
- College of Computer Science, Nankai University, Tianjin, China
- Centre for Bioinformatics and Intelligent Medicine, Nankai University, Tianjin, China
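The abstract of entry 59 concerns off-target sites that differ from the sgRNA by mismatches and indels. The encoding scheme used by CRISPR-M is not described in the abstract, so the sketch below shows only the preliminary bookkeeping such a scheme has to capture: labelling each position of an aligned sgRNA and target pair as a match, a mismatch, or an indel. It assumes both sequences are written in the DNA alphabet, are already aligned to equal length, and use `-` for gaps. The function name `classify_alignment` and the example pair are hypothetical.

```python
from typing import Dict


def classify_alignment(sgrna: str, target: str) -> Dict[str, int]:
    """Count matches, mismatches and indels in a pre-aligned sequence pair.

    Both inputs must have equal length; '-' marks a gap (assumed convention).
    """
    if len(sgrna) != len(target):
        raise ValueError("aligned sequences must have equal length")
    counts = {"match": 0, "mismatch": 0, "indel": 0}
    for g, t in zip(sgrna.upper(), target.upper()):
        if g == "-" or t == "-":
            counts["indel"] += 1      # gap on either strand
        elif g == t:
            counts["match"] += 1
        else:
            counts["mismatch"] += 1
    return counts


# Made-up aligned pair with one gap in the guide and two mismatches.
example = classify_alignment("GAC-TTA", "GATCTAA")
print(example)
```

A learned model would consume a per-position encoding rather than these totals, but the three categories are the distinction that a mismatches-and-indels encoding has to represent.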
60
Li J, Wu P, Cao Z, Huang G, Lu Z, Yan J, Zhang H, Zhou Y, Liu R, Chen H, Ma L, Luo M. Machine learning-based prediction models to guide the selection of Cas9 variants for efficient gene editing. Cell Rep 2024; 43:113765. [PMID: 38358884 DOI: 10.1016/j.celrep.2024.113765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 11/17/2023] [Accepted: 01/25/2024] [Indexed: 02/17/2024] Open
Abstract
The increasing emergence of Cas9 variants has attracted broad interest, as these variants were designed to expand CRISPR applications. New Cas9 variants typically feature higher editing efficiency, improved editing specificity, or alternative PAM sequences. To select Cas9 variants and gRNAs for high-fidelity and efficient genome editing, it is crucial to systematically quantify the editing performances of gRNAs and develop prediction models based on high-quality datasets. Using synthetic gRNA-target paired libraries and next-generation sequencing, we compared the activity and specificity of gRNAs of four SpCas9 variants. The nucleotide composition in the PAM-distal region had more influence on the editing efficiency of HiFi Cas9 and LZ3 Cas9. We further developed machine learning models to predict the gRNA efficiency and specificity for the four Cas9 variants. To aid users from broad research areas, the machine learning models for the predictions of gRNA editing efficiency within human genome sites are available on our website.
Affiliation(s)
- Jianbo Li
- Hubei Provincial Key Laboratory of Developmentally Originated Disease, TaiKang Center for Life and Medical Sciences, School of Basic Medical Sciences, Wuhan University, Wuhan 430072, China; AIdit Therapeutics, 1 Yunmeng Road, Building 1, Hangzhou 310024, Zhejiang, China; Westlake Laboratory, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, Zhejiang, China
- Panfeng Wu
- Hubei Provincial Key Laboratory of Developmentally Originated Disease, TaiKang Center for Life and Medical Sciences, School of Basic Medical Sciences, Wuhan University, Wuhan 430072, China; AIdit Therapeutics, 1 Yunmeng Road, Building 1, Hangzhou 310024, Zhejiang, China; Westlake Laboratory, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, Zhejiang, China
- Zhoutao Cao
- AIdit Therapeutics, 1 Yunmeng Road, Building 1, Hangzhou 310024, Zhejiang, China
- Guanlan Huang
- AIdit Therapeutics, 1 Yunmeng Road, Building 1, Hangzhou 310024, Zhejiang, China
- Zhike Lu
- Westlake Laboratory, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, Zhejiang, China
- Jianfeng Yan
- AIdit Therapeutics, 1 Yunmeng Road, Building 1, Hangzhou 310024, Zhejiang, China
- Heng Zhang
- AIdit Therapeutics, 1 Yunmeng Road, Building 1, Hangzhou 310024, Zhejiang, China; Westlake Laboratory, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, Zhejiang, China
- Yangfan Zhou
- Westlake Laboratory, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, Zhejiang, China
- Rong Liu
- Hubei Provincial Key Laboratory of Developmentally Originated Disease, TaiKang Center for Life and Medical Sciences, School of Basic Medical Sciences, Wuhan University, Wuhan 430072, China
- Hui Chen
- AIdit Therapeutics, 1 Yunmeng Road, Building 1, Hangzhou 310024, Zhejiang, China
- Lijia Ma
- Westlake Laboratory, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, Zhejiang, China.
- Mengcheng Luo
- Hubei Provincial Key Laboratory of Developmentally Originated Disease, TaiKang Center for Life and Medical Sciences, School of Basic Medical Sciences, Wuhan University, Wuhan 430072, China.
61
Xu S, Wei J, Sun S, Zhang J, Chan TF, Li Y. SSBlazer: a genome-wide nucleotide-resolution model for predicting single-strand break sites. Genome Biol 2024; 25:46. [PMID: 38347618 PMCID: PMC10863285 DOI: 10.1186/s13059-024-03179-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Accepted: 01/24/2024] [Indexed: 02/15/2024] Open
Abstract
Single-strand breaks are the major DNA damage in the genome and serve a crucial role in various biological processes. To reveal the significance of single-strand breaks, multiple sequencing-based single-strand break detection methods have been developed, which are costly and unfeasible for large-scale analysis. Hence, we propose SSBlazer, an explainable and scalable deep learning framework for single-strand break site prediction at the nucleotide level. SSBlazer is a lightweight model with robust generalization capabilities across various species and is capable of numerous unexplored SSB-related applications.
Affiliation(s)
- Sheng Xu
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, 100871, Hong Kong SAR, China
- Research Institute of Intelligent Complex Systems, Fudan University, 220 Handan Rd, Shanghai, 200437, China
- Shanghai AI Lab, 422 Jingan Rd, 200041, Shanghai, China
- Junkang Wei
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, 100871, Hong Kong SAR, China.
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, USA.
- Siqi Sun
- Research Institute of Intelligent Complex Systems, Fudan University, 220 Handan Rd, Shanghai, 200437, China
- Shanghai AI Lab, 422 Jingan Rd, 200041, Shanghai, China
- Jizhou Zhang
- School of Life Sciences, The Chinese University of Hong Kong, 100871, Hong Kong SAR, China
- State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, 100871, Hong Kong SAR, China
- Ting-Fung Chan
- School of Life Sciences, The Chinese University of Hong Kong, 100871, Hong Kong SAR, China
- State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, 100871, Hong Kong SAR, China
- Yu Li
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, 100871, Hong Kong SAR, China.
- The CUHK Shenzhen Research Institute, Hi-Tech Park, Nanshan, 518057, Shenzhen, China.
62
Hatzakis N, Kaestel-Hansen J, de Sautu M, Saminathan A, Scanavachi G, Correia R, Nielsen AJ, Bleshoey S, Boomsma W, Kirchhausen T. Deep learning assisted single particle tracking for automated correlation between diffusion and function. RESEARCH SQUARE 2024:rs.3.rs-3716053. [PMID: 38352328 PMCID: PMC10862944 DOI: 10.21203/rs.3.rs-3716053/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
Sub-cellular diffusion in living systems reflects cellular processes and interactions. Recent advances in optical microscopy allow the tracking of this nanoscale diffusion of individual objects with an unprecedented level of precision. However, the agnostic and automated extraction of functional information from the diffusion of molecules and organelles within the sub-cellular environment is labor-intensive and poses a significant challenge. Here we introduce DeepSPT, a deep learning framework to interpret the diffusional 2D or 3D temporal behavior of objects in a rapid and efficient manner, agnostically. Demonstrating its versatility, we have applied DeepSPT to automated mapping of the early events of viral infections, identifying distinct types of endosomal organelles, and clathrin-coated pits and vesicles with up to 95% accuracy and within seconds instead of weeks. The fact that DeepSPT effectively extracts biological information from diffusion alone illustrates that besides structure, motion encodes function at the molecular and subcellular level.
Collapse
63
Luo Y, Chen Y, Xie H, Zhu W, Zhang G. Interpretable CRISPR/Cas9 off-target activities with mismatches and indels prediction using BERT. Comput Biol Med 2024; 169:107932. [PMID: 38199209 DOI: 10.1016/j.compbiomed.2024.107932] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 12/25/2023] [Accepted: 01/01/2024] [Indexed: 01/12/2024]
Abstract
Off-target effects of CRISPR/Cas9 can lead to suboptimal genome editing outcomes. Numerous deep learning-based approaches have achieved excellent performance for off-target prediction; however, few can predict off-target activities with both mismatches and indels between the single guide RNA (sgRNA) and target DNA sequence pair. In addition, data imbalance is a common pitfall for off-target prediction. Moreover, due to the complexity of genomic contexts, generating an interpretable model also remains challenging. To address these issues, we first developed a BERT-based model called CRISPR-BERT for enhancing the prediction of off-target activities with both mismatches and indels. Second, we proposed an adaptive batch-wise class balancing strategy to combat the noise that exists in imbalanced off-target data. Finally, we applied a visualization approach for investigating the generalizable nucleotide position-dependent patterns of sgRNA-DNA pairs for off-target activity. In our comprehensive comparison to existing methods on five mismatches-only datasets and two mismatches-and-indels datasets, CRISPR-BERT achieved the best performance in terms of AUROC and PRAUC. In addition, the visualization analysis demonstrated how implicit knowledge learned by CRISPR-BERT facilitates off-target prediction, which shows potential in model interpretability. Collectively, CRISPR-BERT provides an accurate and interpretable framework for off-target prediction and further contributes to sgRNA optimization in practical use for improved target specificity in CRISPR/Cas9 genome editing. The source code is available at https://github.com/BrokenStringx/CRISPR-BERT.
Affiliation(s)
- Ye Luo
- College of Engineering, Shantou University, Shantou, 515063, China
| | - Yaowen Chen
- College of Engineering, Shantou University, Shantou, 515063, China
| | - HuanZeng Xie
- College of Engineering, Shantou University, Shantou, 515063, China
| | - Wentao Zhu
- College of Engineering, Shantou University, Shantou, 515063, China
| | - Guishan Zhang
- College of Engineering, Shantou University, Shantou, 515063, China.
64
Guo Y, Xue Z, Gong M, Jin S, Wu X, Liu W. CRISPR-TE: a web-based tool to generate single guide RNAs targeting transposable elements. Mob DNA 2024; 15:3. [PMID: 38303094 PMCID: PMC10832116 DOI: 10.1186/s13100-024-00313-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 01/13/2024] [Indexed: 02/03/2024] Open
Abstract
BACKGROUND: The CRISPR/Cas systems have emerged as powerful tools in genome engineering. Recent studies highlighting the crucial role of transposable elements (TEs) have stimulated research interest in manipulating these elements to understand their functions. However, designing single guide RNAs (sgRNAs) that are specific and efficient for TE manipulation is a significant challenge, given their sequence repetitiveness and high copy numbers. While various sgRNA design tools have been developed for gene editing, an optimized sgRNA designer for TE manipulation has yet to be established.
RESULTS: We present CRISPR-TE, a web-based application featuring an accessible graphical user interface, available at https://www.crisprte.cn/, and currently tailored to the human and mouse genomes. CRISPR-TE identifies all potential sgRNAs for TEs and provides a comprehensive solution for efficient TE targeting at both the single copy and subfamily levels. Our analysis shows that sgRNAs targeting TEs can more effectively target evolutionarily young TEs with conserved sequences at the subfamily level.
CONCLUSIONS: CRISPR-TE offers a versatile framework for designing sgRNAs for TE targeting. CRISPR-TE is publicly accessible at https://www.crisprte.cn/ as an online web service and the source code of CRISPR-TE is available at https://github.com/WanluLiuLab/CRISPRTE/.
Affiliation(s)
- Yixin Guo
- Department of Orthopedic Surgery of the Second Affiliated Hospital, and Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, Zhejiang University, Zhejiang, Hangzhou, 310003, China
| | - Ziwei Xue
- Department of Orthopedic Surgery of the Second Affiliated Hospital, and Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, Zhejiang University, Zhejiang, Hangzhou, 310003, China
- Future Health Laboratory, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314100, China
| | - Meiting Gong
- Department of Orthopedic Surgery of the Second Affiliated Hospital, and Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, Zhejiang University, Zhejiang, Hangzhou, 310003, China
| | - Siqian Jin
- Department of Orthopedic Surgery of the Second Affiliated Hospital, and Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, Zhejiang University, Zhejiang, Hangzhou, 310003, China
| | - Xindi Wu
- Department of Orthopedic Surgery of the Second Affiliated Hospital, and Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, Zhejiang University, Zhejiang, Hangzhou, 310003, China
| | - Wanlu Liu
- Department of Orthopedic Surgery of the Second Affiliated Hospital, and Centre of Biomedical Systems and Informatics of Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, Zhejiang University, Zhejiang, Hangzhou, 310003, China.
- Future Health Laboratory, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314100, China.
65
Toufikuzzaman M, Hassan Samee MA, Sohel Rahman M. CRISPR-DIPOFF: an interpretable deep learning approach for CRISPR Cas-9 off-target prediction. Brief Bioinform 2024; 25:bbad530. [PMID: 38388680 PMCID: PMC10883906 DOI: 10.1093/bib/bbad530] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/04/2023] [Revised: 12/14/2023] [Accepted: 12/19/2023] [Indexed: 02/24/2024] Open
Abstract
CRISPR Cas-9 is a groundbreaking genome-editing tool that harnesses bacterial defense systems to alter DNA sequences accurately. This innovative technology holds vast promise in multiple domains like biotechnology, agriculture and medicine. However, such power does not come without its own peril, and one such issue is the potential for unintended modifications (Off-Target), which highlights the need for accurate prediction and mitigation strategies. Though previous studies have demonstrated improvement in Off-Target prediction capability with the application of deep learning, they often struggle with the precision-recall trade-off, limiting their effectiveness, and do not provide proper interpretation of the complex decision-making process of their models. To address these limitations, we have thoroughly explored deep learning networks, particularly the recurrent neural network based models, leveraging their established success in handling sequence data. Furthermore, we have employed a genetic algorithm for hyperparameter tuning to optimize these models' performance. The results from our experiments demonstrate significant performance improvement compared with the current state-of-the-art in Off-Target prediction, highlighting the efficacy of our approach. Furthermore, leveraging the power of the integrated gradient method, we make an effort to interpret our models, resulting in a detailed analysis and understanding of the underlying factors that contribute to Off-Target predictions, in particular the presence of two sub-regions in the seed region of single guide RNA, which extends the established biological hypothesis of Off-Target effects. To the best of our knowledge, our model can be considered as the first model combining high efficacy, interpretability and a desirable balance between precision and recall.
Affiliation(s)
- Md Toufikuzzaman
- Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, 1205, Bangladesh
| | - Md Abul Hassan Samee
- Department of Integrative Physiology, Baylor College of Medicine, Houston, TX 77030, USA
| | - M Sohel Rahman
- Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, 1205, Bangladesh
66
Lopes R, Prasad MK. Beyond the promise: evaluating and mitigating off-target effects in CRISPR gene editing for safer therapeutics. Front Bioeng Biotechnol 2024; 11:1339189. [PMID: 38390600 PMCID: PMC10883050 DOI: 10.3389/fbioe.2023.1339189] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/15/2023] [Accepted: 12/29/2023] [Indexed: 02/24/2024] Open
Abstract
Over the last decade, CRISPR has revolutionized drug development due to its potential to cure genetic diseases that currently do not have any treatment. CRISPR was adapted from bacteria for gene editing in human cells in 2012 and, remarkably, only 11 years later has seen its very first approval as a medicine for the treatment of sickle cell disease and transfusion-dependent beta-thalassemia. However, the application of CRISPR systems is associated with unintended off-target and on-target alterations (including small indels, and structural variations such as translocations, inversions and large deletions), which are a source of risk for patients and a vital concern for the development of safe therapies. In recent years, a wide range of methods has been developed to detect unwanted effects of CRISPR-Cas nuclease activity. In this review, we summarize the different methods for off-target assessment, discuss their strengths and limitations, and highlight strategies to improve the safety of CRISPR systems. Finally, we discuss their relevance and application for the pre-clinical risk assessment of CRISPR therapeutics within the current regulatory context.
Affiliation(s)
- Rui Lopes
- Correspondence: Rui Lopes; Megana K. Prasad
67
Yu Y, Gawlitt S, de Andrade E Sousa LB, Merdivan E, Piraud M, Beisel CL, Barquist L. Improved prediction of bacterial CRISPRi guide efficiency from depletion screens through mixed-effect machine learning and data integration. Genome Biol 2024; 25:13. [PMID: 38200565 PMCID: PMC10782694 DOI: 10.1186/s13059-023-03153-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 06/19/2022] [Accepted: 12/20/2023] [Indexed: 01/12/2024] Open
Abstract
CRISPR interference (CRISPRi) is the leading technique to silence gene expression in bacteria; however, design rules remain poorly defined. We develop a best-in-class prediction algorithm for guide silencing efficiency by systematically investigating factors influencing guide depletion in genome-wide essentiality screens, with the surprising discovery that gene-specific features substantially impact prediction. We develop a mixed-effect random forest regression model that provides better estimates of guide efficiency. We further apply methods from explainable AI to extract interpretable design rules from the model. This study provides a blueprint for predictive models for CRISPR technologies where only indirect measurements of guide activity are available.
Affiliation(s)
- Yanying Yu
- Helmholtz Institute for RNA-Based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, 97080, Germany
| | - Sandra Gawlitt
- Helmholtz Institute for RNA-Based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, 97080, Germany
| | - Erinc Merdivan
- Helmholtz AI, Helmholtz Zentrum München, Neuherberg, 85764, Germany
| | - Marie Piraud
- Helmholtz AI, Helmholtz Zentrum München, Neuherberg, 85764, Germany
| | - Chase L Beisel
- Helmholtz Institute for RNA-Based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, 97080, Germany
- Medical Faculty, University of Würzburg, Würzburg, 97080, Germany
| | - Lars Barquist
- Helmholtz Institute for RNA-Based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, 97080, Germany.
- Medical Faculty, University of Würzburg, Würzburg, 97080, Germany.
68
Ito Y, Inoue S, Nakashima T, Zhang H, Li Y, Kasuya H, Matsukawa T, Wu Z, Yoshikawa T, Kataoka M, Ishikawa T, Kagoya Y. Epigenetic profiles guide improved CRISPR/Cas9-mediated gene knockout in human T cells. Nucleic Acids Res 2024; 52:141-153. [PMID: 37985205 PMCID: PMC10783505 DOI: 10.1093/nar/gkad1076] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/27/2022] [Revised: 10/18/2023] [Accepted: 10/26/2023] [Indexed: 11/22/2023] Open
Abstract
Genetic modification of specific genes is emerging as a useful tool to enhance the functions of antitumor T cells in adoptive immunotherapy. Current advances in CRISPR/Cas9 technology enable gene knockout during in vitro preparation of infused T-cell products through transient transfection of a Cas9-guide RNA (gRNA) ribonucleoprotein complex. However, selecting optimal gRNAs remains a major challenge for efficient gene ablation. Although multiple in silico tools to predict the targeting efficiency have been developed, their performance has not been validated in cultured human T cells. Here, we explored a strategy to select optimal gRNAs using our pooled data on CRISPR/Cas9-mediated gene knockout in human T cells. The currently available prediction tools alone were insufficient to accurately predict the indel percentage in T cells. We used data on the epigenetic profiles of cultured T cells obtained from transposase-accessible chromatin with high-throughput sequencing (ATAC-seq). Combining the epigenetic information with sequence-based prediction tools significantly improved the gene-editing efficiency. We further demonstrate that epigenetically closed regions can be targeted by designing two gRNAs in adjacent regions. Finally, we demonstrate that the gene-editing efficiency of unstimulated T cells can be enhanced through pretreatment with IL-7. These findings enable more efficient gene editing in human T cells.
Affiliation(s)
- Yusuke Ito
- Division of Tumor Immunology, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
- Division of Immune Response, Aichi Cancer Center Research Institute, Nagoya, Japan
| | - Satoshi Inoue
- Division of Tumor Immunology, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
- Division of Immune Response, Aichi Cancer Center Research Institute, Nagoya, Japan
| | - Takahiro Nakashima
- Division of Tumor Immunology, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
- Division of Immune Response, Aichi Cancer Center Research Institute, Nagoya, Japan
- Department of Hematology and Oncology, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
| | - Haosong Zhang
- Division of Tumor Immunology, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
- Division of Immune Response, Aichi Cancer Center Research Institute, Nagoya, Japan
- Division of Cellular Oncology, Department of Cancer Diagnostics and Therapeutics, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Yang Li
- Division of Tumor Immunology, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
- Division of Immune Response, Aichi Cancer Center Research Institute, Nagoya, Japan
- Division of Cellular Oncology, Department of Cancer Diagnostics and Therapeutics, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Hitomi Kasuya
- Division of Immune Response, Aichi Cancer Center Research Institute, Nagoya, Japan
| | - Tetsuya Matsukawa
- Division of Tumor Immunology, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
- Division of Immune Response, Aichi Cancer Center Research Institute, Nagoya, Japan
- Department of Obstetrics and Gynecology, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Zhiwen Wu
- Division of Immune Response, Aichi Cancer Center Research Institute, Nagoya, Japan
| | - Toshiaki Yoshikawa
- Division of Tumor Immunology, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
- Division of Immune Response, Aichi Cancer Center Research Institute, Nagoya, Japan
| | - Mirei Kataoka
- Division of Tumor Immunology, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
| | - Tetsuo Ishikawa
- Department of Extended Intelligence for Medicine, The Ishii-Ishibashi Laboratory, Keio University School of Medicine, Tokyo, Japan
- Advanced Data Science Project, RIKEN Information R&D and Strategy Headquarters, RIKEN, Yokohama, Japan
- Collective Intelligence Research Laboratory, Graduate School of Arts and Sciences, The University of Tokyo, Tokyo, Japan
| | - Yuki Kagoya
- Division of Tumor Immunology, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
- Division of Immune Response, Aichi Cancer Center Research Institute, Nagoya, Japan
- Division of Cellular Oncology, Department of Cancer Diagnostics and Therapeutics, Nagoya University Graduate School of Medicine, Nagoya, Japan
69
Dixit S, Kumar A, Srinivasan K, Vincent PMDR, Ramu Krishnan N. Advancing genome editing with artificial intelligence: opportunities, challenges, and future directions. Front Bioeng Biotechnol 2024; 11:1335901. [PMID: 38260726 PMCID: PMC10800897 DOI: 10.3389/fbioe.2023.1335901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 12/19/2023] [Indexed: 01/24/2024] Open
Abstract
Clustered regularly interspaced short palindromic repeat (CRISPR)-based genome editing (GED) technologies have unlocked exciting possibilities for understanding genes and improving medical treatments. On the other hand, artificial intelligence (AI) helps genome editing achieve more precision, efficiency, and affordability in tackling various diseases, like sickle cell anemia or thalassemia. AI models have been in use for designing guide RNAs (gRNAs) for CRISPR-Cas systems. Tools like DeepCRISPR, CRISTA, and DeepHF can predict optimal gRNAs for a specified target sequence. These predictions take into account multiple factors, including genomic context, Cas protein type, desired mutation type, on-target/off-target scores, potential off-target sites, and the potential impacts of genome editing on gene function and cell phenotype. These models aid in optimizing different genome editing technologies, such as base, prime, and epigenome editing, which are advanced techniques to introduce precise and programmable changes to DNA sequences without relying on the homology-directed repair pathway or donor DNA templates. Furthermore, AI, in collaboration with genome editing and precision medicine, enables personalized treatments based on genetic profiles. AI analyzes patients' genomic data to identify mutations, variations, and biomarkers associated with different diseases such as cancer, diabetes, and Alzheimer's. However, several challenges persist, including high costs, off-target editing, suitable delivery methods for CRISPR cargoes, improving editing efficiency, and ensuring safety in clinical applications. This review explores AI's contribution to improving CRISPR-based genome editing technologies and addresses existing challenges. It also discusses potential areas for future research in AI-driven CRISPR-based genome editing technologies. The integration of AI and genome editing opens up new possibilities for genetics, biomedicine, and healthcare, with significant implications for human health.
Affiliation(s)
- Shriniket Dixit
- School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India
| | - Anant Kumar
- School of Bioscience and Technology, Vellore Institute of Technology, Vellore, India
| | - Kathiravan Srinivasan
- School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India
| | - P. M. Durai Raj Vincent
- School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, India
| | - Nadesh Ramu Krishnan
- School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, India
70
Wei J, Lotfy P, Faizi K, Baungaard S, Gibson E, Wang E, Slabodkin H, Kinnaman E, Chandrasekaran S, Kitano H, Durrant MG, Duffy CV, Pawluk A, Hsu PD, Konermann S. Deep learning and CRISPR-Cas13d ortholog discovery for optimized RNA targeting. Cell Syst 2023; 14:1087-1102.e13. [PMID: 38091991 DOI: 10.1016/j.cels.2023.11.006] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 08/05/2022] [Revised: 05/03/2023] [Accepted: 11/20/2023] [Indexed: 12/23/2023]
Abstract
Effective and precise mammalian transcriptome engineering technologies are needed to accelerate biological discovery and RNA therapeutics. Despite the promise of programmable CRISPR-Cas13 ribonucleases, their utility has been hampered by an incomplete understanding of guide RNA design rules and cellular toxicity resulting from off-target or collateral RNA cleavage. Here, we quantified the performance of over 127,000 RfxCas13d (CasRx) guide RNAs and systematically evaluated seven machine learning models to build a guide efficiency prediction algorithm orthogonally validated across multiple human cell types. Deep learning model interpretation revealed preferred sequence motifs and secondary features for highly efficient guides. We next identified and screened 46 novel Cas13d orthologs, finding that DjCas13d achieves low cellular toxicity and high specificity, even when targeting abundant transcripts in sensitive cell types, including stem cells and neurons. Our Cas13d guide efficiency model was successfully generalized to DjCas13d, illustrating the power of combining machine learning with ortholog discovery to advance RNA targeting in human cells.
Affiliation(s)
- Jingyi Wei
- Department of Bioengineering, Stanford University, Stanford, CA, USA; Department of Biochemistry, Stanford University, Stanford, CA, USA; Arc Institute, Palo Alto, CA, USA
| | - Peter Lotfy
- Laboratory of Molecular and Cell Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Kian Faizi
- Laboratory of Molecular and Cell Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Eleanor Wang
- Laboratory of Molecular and Cell Biology, Salk Institute for Biological Studies, La Jolla, CA, USA; Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA; Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
| | - Hannah Slabodkin
- Department of Biochemistry, Stanford University, Stanford, CA, USA; Arc Institute, Palo Alto, CA, USA
| | - Emily Kinnaman
- Department of Biochemistry, Stanford University, Stanford, CA, USA; Arc Institute, Palo Alto, CA, USA
| | - Sita Chandrasekaran
- Arc Institute, Palo Alto, CA, USA; Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA; Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
| | - Hugo Kitano
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Matthew G Durrant
- Arc Institute, Palo Alto, CA, USA; Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA; Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
| | - Connor V Duffy
- Arc Institute, Palo Alto, CA, USA; Department of Genetics, Stanford University, Stanford, CA, USA
| | - Patrick D Hsu
- Arc Institute, Palo Alto, CA, USA; Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA; Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA.
| | - Silvana Konermann
- Department of Biochemistry, Stanford University, Stanford, CA, USA; Arc Institute, Palo Alto, CA, USA.
71
Aslam I, Shah S, Jabeen S, ELAffendi M, A Abdel Latif A, Ul Haq N, Ali G. A CNN based m5c RNA methylation predictor. Sci Rep 2023; 13:21885. [PMID: 38081880 PMCID: PMC10713599 DOI: 10.1038/s41598-023-48751-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2023] [Accepted: 11/29/2023] [Indexed: 12/18/2023] Open
Abstract
Post-transcriptional modifications of RNA play a key role in a variety of biological processes, such as stability and immune tolerance, RNA splicing, protein translation and RNA degradation. One of these RNA modifications is m5c, which participates in various cellular functions such as RNA structural stability and translation efficiency and has gained popularity among biologists. Detecting RNA m5c methylation sites through biological experiments requires considerable effort, time and money. Most researchers use pre-processed RNA sequences of 41 nucleotides in which the methylated cytosine is at the center. It is therefore possible that some of the information around these motifs has been lost. Conventional methods are unable to process the RNA sequence directly because of its high dimensionality and thus need optimized techniques for better feature extraction. To handle these challenges, the goal of this study is to employ an end-to-end, 1D CNN-based model to classify and interpret m5c methylated sites. Moreover, our aim is to analyze the sequence in its full length, where the methylated cytosine may not be at the center. The evaluation of the proposed architecture showed promising results, outperforming state-of-the-art techniques in terms of sensitivity and accuracy. Our model achieves 96.70% sensitivity and 96.21% accuracy for 41-nucleotide sequences and 96.10% accuracy for full-length sequences.
Affiliation(s)
- Irum Aslam
- Department of Computer Science, COMSATS University Islamabad, Abbottabad Campus, Abbottabad, 22060, KPK, Pakistan
| | - Sajid Shah
- EIAS Data Science Lab, College of Computer and Information Sciences, Prince Sultan University, Rafha, Riyadh, 12435, Saudi Arabia
| | - Saima Jabeen
- College of Engineering, AI Research Center, Alfaisal University, Riyadh, 50927, Saudi Arabia.
| | - Mohammed ELAffendi
- EIAS Data Science Lab, College of Computer and Information Sciences, Prince Sultan University, Rafha, Riyadh, 12435, Saudi Arabia
| | - Asmaa A Abdel Latif
- Public Health and Community Medicine Department (Industrial medicine and occupational health specialty), Faculty of Medicine, Menoufia University, Shibîn el Kôm, Egypt
| | - Nuhman Ul Haq
- Department of Computer Science, COMSATS University Islamabad, Abbottabad Campus, Abbottabad, 22060, KPK, Pakistan
| | - Gauhar Ali
- EIAS Data Science Lab, College of Computer and Information Sciences, Prince Sultan University, Rafha, Riyadh, 12435, Saudi Arabia
72
Zhong Z, Li Z, Yang J, Wang Q. Unified Model to Predict gRNA Efficiency across Diverse Cell Lines and CRISPR-Cas9 Systems. J Chem Inf Model 2023; 63:7320-7329. [PMID: 37983481 DOI: 10.1021/acs.jcim.3c01339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
Computationally predicting the efficiency of a guide RNA (gRNA) from its sequence is crucial to designing the CRISPR-Cas9 system. Currently, machine learning (ML)-based models are widely used for such predictions. However, these ML models often show performance imbalance when applied to multiple data sets from diverse sources, hindering the practical utilization of these tools. To address this issue, we propose a Michaelis-Menten theoretical framework that integrates information from multiple data sets. We demonstrate that the binding free energy can serve as a useful invariant that bridges the data from different experimental setups. Building upon this framework, we develop a new ML model called Uni-deepSG. This model exhibits broad applicability on 27 data sets with different cell types, Cas9 variants, and gRNA designs. Our work confirms the existence of a generalized model for predicting gRNA efficiency and lays the theoretical groundwork necessary to finalize such a model.
Affiliation(s)
- Zhicheng Zhong
- Department of Physics, University of Science and Technology of China, Hefei 230026, Anhui, China
| | - Zeying Li
- Department of Physics, University of Science and Technology of China, Hefei 230026, Anhui, China
| | - Jie Yang
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Qian Wang
- Department of Physics, University of Science and Technology of China, Hefei 230026, Anhui, China
73
Störtz F, Mak JK, Minary P. piCRISPR: Physically informed deep learning models for CRISPR/Cas9 off-target cleavage prediction. ARTIFICIAL INTELLIGENCE IN THE LIFE SCIENCES 2023; 3. [PMID: 38047242 PMCID: PMC10316064 DOI: 10.1016/j.ailsci.2023.100075] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/29/2022] [Revised: 04/02/2023] [Accepted: 04/30/2023] [Indexed: 12/05/2023]
Abstract
CRISPR/Cas programmable nuclease systems have become ubiquitous in the field of gene editing. With progressing development, applications in in vivo therapeutic gene editing are increasingly within reach, yet limited by possible adverse side effects from unwanted edits. Recent years have thus seen continuous development of off-target prediction algorithms trained on in vitro cleavage assay data gained from immortalised cell lines. It has been shown that in contrast to experimental epigenetic features, computed physically informed features are so far underutilised despite bearing considerably larger correlation with cleavage activity. Here, we implement state-of-the-art deep learning algorithms and feature encodings for off-target prediction with emphasis on physically informed features that capture the biological environment of the cleavage site, hence terming our approach piCRISPR. Features were gained from the large, diverse crisprSQL off-target cleavage dataset. We find that our best-performing models highlight the importance of sequence context and chromatin accessibility for cleavage prediction and compare favourably with literature standard prediction performance. We further show that our novel, environmentally sensitive features are crucial to accurate prediction on sequence-identical locus pairs, making them highly relevant for clinical guide design. The source code and trained models can be found ready to use at github.com/florianst/picrispr.
Affiliation(s)
- Florian Störtz
- Department of Computer Science, University of Oxford, Parks Road, Oxford OX1 3QD, UK
- Jeffrey K. Mak
- Department of Computer Science, University of Oxford, Parks Road, Oxford OX1 3QD, UK
- Peter Minary
- Department of Computer Science, University of Oxford, Parks Road, Oxford OX1 3QD, UK

74
Han T, Nazarbekov A, Zou X, Lee SY. Recent advances in systems metabolic engineering. Curr Opin Biotechnol 2023; 84:103004. [PMID: 37778304 DOI: 10.1016/j.copbio.2023.103004] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 07/20/2023] [Revised: 09/03/2023] [Accepted: 09/05/2023] [Indexed: 10/03/2023]
Abstract
Systems metabolic engineering, which integrates metabolic engineering with systems biology, synthetic biology, and evolutionary engineering, has revolutionized the sustainable production of fuels and materials through the creation of efficient microbial cell factories. Recent advancements in systems metabolic engineering targeting different biological components of the host cell have enabled the creation of highly productive microbial cell factories. This article provides a review of the recent tools and strategies used for enzyme-, genetic module-, pathway-, flux-, genome-, and cell-level engineering, supported by illustrative examples. Furthermore, we highlight recent trends in systems metabolic engineering, which involve the application of multiple tools discussed in this review. Finally, the paper addresses the challenges and perspectives of transitioning academic-level metabolic engineering studies to commercial-scale production.
Affiliation(s)
- Taehee Han
- Metabolic and Biomolecular Engineering National Research Laboratory and Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, the Republic of Korea; KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, KAIST, Daejeon 34141, the Republic of Korea; BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, 34141 Daejeon, the Republic of Korea
- Alisher Nazarbekov
- Metabolic and Biomolecular Engineering National Research Laboratory and Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, the Republic of Korea; KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, KAIST, Daejeon 34141, the Republic of Korea
- Xuan Zou
- Metabolic and Biomolecular Engineering National Research Laboratory and Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, the Republic of Korea; KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, KAIST, Daejeon 34141, the Republic of Korea; BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, 34141 Daejeon, the Republic of Korea
- Sang Yup Lee
- Metabolic and Biomolecular Engineering National Research Laboratory and Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, the Republic of Korea; KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, KAIST, Daejeon 34141, the Republic of Korea; BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, 34141 Daejeon, the Republic of Korea; Graduate School of Engineering Biology, KAIST, Daejeon 34141, the Republic of Korea.

75
Santorsola M, Lescai F. The promise of explainable deep learning for omics data analysis: Adding new discovery tools to AI. N Biotechnol 2023; 77:1-11. [PMID: 37329982 DOI: 10.1016/j.nbt.2023.06.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 02/27/2023] [Revised: 06/01/2023] [Accepted: 06/14/2023] [Indexed: 06/19/2023]
Abstract
Deep learning has already revolutionised the way a wide range of data is processed in many areas of daily life. The ability to learn abstractions and relationships from heterogeneous data has provided impressively accurate prediction and classification tools to handle increasingly big datasets. This has a significant impact on the growing wealth of omics datasets, with the unprecedented opportunity for a better understanding of the complexity of living organisms. While this revolution is transforming the way these data are analyzed, explainable deep learning is emerging as an additional tool with the potential to change the way biological data is interpreted. Explainability addresses critical issues such as transparency, which is especially important when computational tools are introduced in clinical environments. Moreover, it empowers artificial intelligence with the capability to provide new insights into the input data, thus adding an element of discovery to these already powerful resources. In this review, we provide an overview of the transformative effects explainable deep learning is having on multiple sectors, ranging from genome engineering and genomics to radiomics, drug design, and clinical trials. We offer a perspective to life scientists, to better understand the potential of these tools, and a motivation to implement them in their research, by suggesting learning resources they can use to take their first steps in this field.
Affiliation(s)
- Francesco Lescai
- Department of Biology and Biotechnology, University of Pavia, Pavia, Italy.

76
Motoche-Monar C, Ordoñez JE, Chang O, Gonzales-Zubiate FA. gRNA Design: How Its Evolution Impacted on CRISPR/Cas9 Systems Refinement. Biomolecules 2023; 13:1698. [PMID: 38136570 PMCID: PMC10741458 DOI: 10.3390/biom13121698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Revised: 06/05/2023] [Accepted: 06/12/2023] [Indexed: 12/24/2023] Open
Abstract
Over the past decade, genetic engineering has witnessed a revolution with the emergence of a relatively new genetic editing tool based on RNA-guided nucleases: the CRISPR/Cas9 system. Since the first report in 1987 and characterization in 2007 as a bacterial defense mechanism, this system has garnered immense interest and research attention. CRISPR systems provide immunity to bacteria against invading genetic material; however, with specific modifications in sequence and structure, the system becomes a precise editing tool capable of modifying the genomes of a wide range of organisms. The refinement of these modifications encompasses diverse approaches, including the development of more accurate nucleases, understanding of the cellular context and epigenetic conditions, and the redesign of guide RNAs (gRNAs). Considering the critical importance of the correct performance of CRISPR/Cas9 systems, our scope will emphasize the latter approach. Hence, we present an overview of the past and the most recent guide RNA web-based design tools, highlighting the evolution of their computational architecture and gRNA characteristics over the years. Our study explains computational approaches that use machine learning techniques, neural networks, and gRNA/target interaction data to enable predictions and classifications. This review could open the door to a dynamic community that uses up-to-date algorithms to optimize and create promising gRNAs, suitable for modern CRISPR/Cas9 engineering.
Affiliation(s)
- Cristofer Motoche-Monar
- School of Biological Sciences and Engineering, Yachay Tech University, Urcuquí 100119, Ecuador
- Julián E. Ordoñez
- School of Biological Sciences and Engineering, Yachay Tech University, Urcuquí 100119, Ecuador
- Oscar Chang
- Departamento de Electrónica, Universidad Simon Bolivar, Caracas 1080, Venezuela
- MIND Research Group, Model Intelligent Networks Development, Urcuquí 100119, Ecuador
- Fernando A. Gonzales-Zubiate
- School of Biological Sciences and Engineering, Yachay Tech University, Urcuquí 100119, Ecuador
- MIND Research Group, Model Intelligent Networks Development, Urcuquí 100119, Ecuador

77
Chen Q, Chuai G, Zhang H, Tang J, Duan L, Guan H, Li W, Li W, Wen J, Zuo E, Zhang Q, Liu Q. Genome-wide CRISPR off-target prediction and optimization using RNA-DNA interaction fingerprints. Nat Commun 2023; 14:7521. [PMID: 37980345 PMCID: PMC10657421 DOI: 10.1038/s41467-023-42695-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 04/03/2023] [Accepted: 10/19/2023] [Indexed: 11/20/2023] Open
Abstract
The powerful CRISPR genome editing system is hindered by its off-target effects, and existing computational tools have achieved limited performance in genome-wide off-target prediction due to a lack of deep understanding of the CRISPR molecular mechanism. In this study, we propose to incorporate molecular dynamics (MD) simulations in the computational analysis of the CRISPR system, and present CRISOT, an integrated tool suite containing four related modules, i.e., CRISOT-FP, CRISOT-Score, CRISOT-Spec, and CRISOT-Opti, for RNA-DNA molecular interaction fingerprint generation, genome-wide CRISPR off-target prediction, sgRNA specificity evaluation, and sgRNA optimization of the Cas9 system, respectively. Our comprehensive computational and experimental tests reveal that CRISOT outperforms existing tools with extensive in silico validations and proof-of-concept experimental validations. In addition, CRISOT shows potential in accurately predicting off-target effects of the base editors and prime editors, indicating that the derived RNA-DNA molecular interaction fingerprint captures the underlying mechanisms of RNA-DNA interaction among distinct CRISPR systems. Collectively, CRISOT provides an efficient and generalizable framework for genome-wide CRISPR off-target prediction, evaluation and sgRNA optimization for improved targeting specificity in CRISPR genome editing.
Affiliation(s)
- Qinchang Chen
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Guohui Chuai
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Haihang Zhang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Gene Editing Technologies (Hainan), Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Jin Tang
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Liwen Duan
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Huan Guan
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Wenhui Li
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Wannian Li
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Jiaying Wen
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Erwei Zuo
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Gene Editing Technologies (Hainan), Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China.
- Qing Zhang
- Roche R&D Center (China) Ltd., China Innovation Center of Roche, Shanghai, 201203, China.
- Ailomics Therapeutics, Shanghai, 201203, China.
- Qi Liu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China.
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China.

78
Veluchamy A, Teles K, Fischle W. CRISPR-broad: combined design of multi-targeting gRNAs and broad, multiplex target finding. Sci Rep 2023; 13:19717. [PMID: 37953351 PMCID: PMC10641073 DOI: 10.1038/s41598-023-46212-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Accepted: 10/29/2023] [Indexed: 11/14/2023] Open
Abstract
In CRISPR-Cas and related nuclease-mediated genome editing, target recognition is based on guide RNAs (gRNAs) that are complementary to selected DNA regions. While single site targeting is fundamental for localized genome editing, targeting to expanded and multiple chromosome elements is desirable for various biological applications such as genome mapping and epigenome editing that make use of different fusion proteins with enzymatically dead Cas9. The current gRNA design tools are not suitable for this task, as these are optimized for defining single gRNAs for unique loci. Here, we introduce CRISPR-broad, a standalone, open-source application that defines gRNAs with multiple but specific targets in large continuous or spread regions of the genome, as defined by the user. This ability to identify multi-targeting gRNAs and corresponding multiple targetable regions in genomes is based on a novel aggregate gRNA scoring derived from on-target windows and off-target sites. Applying the new tool to the genomes of two model species, C. elegans and H. sapiens, we verified its efficiency in determining multi-targeting gRNAs and ranking potential target regions optimized for broad targeting. Further, we demonstrated the general usability of CRISPR-broad by cellular mapping of a large human genome element using dCas9 fused to green fluorescent protein.
Affiliation(s)
- Alaguraj Veluchamy
- Bioscience Program, Division of Biological and Environmental Sciences and Engineering, King Abdullah University of Science and Technology (KAUST), 23955-6900, Thuwal, Kingdom of Saudi Arabia.
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA.
- Kaian Teles
- Bioscience Program, Division of Biological and Environmental Sciences and Engineering, King Abdullah University of Science and Technology (KAUST), 23955-6900, Thuwal, Kingdom of Saudi Arabia
- Wolfgang Fischle
- Bioscience Program, Division of Biological and Environmental Sciences and Engineering, King Abdullah University of Science and Technology (KAUST), 23955-6900, Thuwal, Kingdom of Saudi Arabia.

79
Capponi S, Daniels KG. Harnessing the power of artificial intelligence to advance cell therapy. Immunol Rev 2023; 320:147-165. [PMID: 37415280 DOI: 10.1111/imr.13236] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 05/02/2023] [Accepted: 06/17/2023] [Indexed: 07/08/2023]
Abstract
Cell therapies are powerful technologies in which human cells are reprogrammed for therapeutic applications such as killing cancer cells or replacing defective cells. The technologies underlying cell therapies are increasing in effectiveness and complexity, making rational engineering of cell therapies more difficult. Creating the next generation of cell therapies will require improved experimental approaches and predictive models. Artificial intelligence (AI) and machine learning (ML) methods have revolutionized several fields in biology including genome annotation, protein structure prediction, and enzyme design. In this review, we discuss the potential of combining experimental library screens and AI to build predictive models for the development of modular cell therapy technologies. Advances in DNA synthesis and high-throughput screening techniques enable the construction and screening of libraries of modular cell therapy constructs. AI and ML models trained on this screening data can accelerate the development of cell therapies by generating predictive models, design rules, and improved designs.
Affiliation(s)
- Sara Capponi
- Department of Functional Genomics and Cellular Engineering, IBM Almaden Research Center, San Jose, California, USA
- Center for Cellular Construction, San Francisco, California, USA
- Kyle G Daniels
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, California, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, California, USA

80
Zhao R, Luo W, Wu Y, Zhang L, Liu X, Li J, Yang Y, Wang L, Wang L, Han X, Wang Z, Zhang J, Lv K, Chen T, Xie G. Unmodificated stepless regulation of CRISPR/Cas12a multi-performance. Nucleic Acids Res 2023; 51:10795-10807. [PMID: 37757856 PMCID: PMC10602922 DOI: 10.1093/nar/gkad748] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/21/2023] [Revised: 08/26/2023] [Accepted: 09/01/2023] [Indexed: 09/29/2023] Open
Abstract
As CRISPR technology is extended to more finely divided molecular biology applications, its inherent performance finds it increasingly difficult to cope with the diverse needs of these different fields, and how to control that performance more accurately has become a key issue in developing CRISPR technology to a new stage. Herein, we propose a CRISPR/Cas12a regulation strategy based on the powerful programmability of nucleic acid nanotechnology. Unlike previous difficult and rigid regulation of the core components, Cas nuclease and crRNA, only a simple switch of different external RNA accessories is required to change the reaction kinetics or thermodynamics, thereby finely and almost steplessly regulating the multi-performance of CRISPR/Cas12a, including activity, speed, specificity, compatibility, programmability and sensitivity. In particular, the significantly improved specificity is expected to markedly advance the accuracy of molecular detection and the safety of gene editing. In addition, this strategy was applied to regulate the delayed activation of Cas12a, overcoming the compatibility problem of the one-pot assay without any physical separation or external stimulation, and demonstrating great potential for fine-grained control of CRISPR. This simple but powerful CRISPR regulation strategy without any component modification has pioneering flexibility and versatility, and will unlock the potential for deeper applications of CRISPR technology in many finely divided fields.
Affiliation(s)
- Rong Zhao
- Key Laboratory of Clinical Laboratory Diagnostics (Chinese Ministry of Education), College of Laboratory Medicine, Chongqing Medical Laboratory Microfluidics and SPRi Engineering Research Center, Chongqing Medical University, Chongqing 400016, PR China
- Wang Luo
- Key Laboratory of Clinical Laboratory Diagnostics (Chinese Ministry of Education), College of Laboratory Medicine, Chongqing Medical Laboratory Microfluidics and SPRi Engineering Research Center, Chongqing Medical University, Chongqing 400016, PR China
- You Wu
- Key Laboratory of Clinical Laboratory Diagnostics (Chinese Ministry of Education), College of Laboratory Medicine, Chongqing Medical Laboratory Microfluidics and SPRi Engineering Research Center, Chongqing Medical University, Chongqing 400016, PR China
- Li Zhang
- Key Laboratory of Clinical Laboratory Diagnostics (Chinese Ministry of Education), College of Laboratory Medicine, Chongqing Medical Laboratory Microfluidics and SPRi Engineering Research Center, Chongqing Medical University, Chongqing 400016, PR China
- Xin Liu
- Key Laboratory of Clinical Laboratory Diagnostics (Chinese Ministry of Education), College of Laboratory Medicine, Chongqing Medical Laboratory Microfluidics and SPRi Engineering Research Center, Chongqing Medical University, Chongqing 400016, PR China
- Junjie Li
- Key Laboratory of Clinical Laboratory Diagnostics (Chinese Ministry of Education), College of Laboratory Medicine, Chongqing Medical Laboratory Microfluidics and SPRi Engineering Research Center, Chongqing Medical University, Chongqing 400016, PR China
- Yujun Yang
- Key Laboratory of Clinical Laboratory Diagnostics (Chinese Ministry of Education), College of Laboratory Medicine, Chongqing Medical Laboratory Microfluidics and SPRi Engineering Research Center, Chongqing Medical University, Chongqing 400016, PR China
- Li Wang
- The Center for Clinical Molecular Medical Detection, The First Affiliated Hospital of Chongqing Medical University, Chongqing 400016, PR China
- Luojia Wang
- Key Laboratory of Clinical Laboratory Diagnostics (Chinese Ministry of Education), College of Laboratory Medicine, Chongqing Medical Laboratory Microfluidics and SPRi Engineering Research Center, Chongqing Medical University, Chongqing 400016, PR China
- Xiaole Han
- Key Laboratory of Clinical Laboratory Diagnostics (Chinese Ministry of Education), College of Laboratory Medicine, Chongqing Medical Laboratory Microfluidics and SPRi Engineering Research Center, Chongqing Medical University, Chongqing 400016, PR China
- Zhongzhong Wang
- Key Laboratory of Clinical Laboratory Diagnostics (Chinese Ministry of Education), College of Laboratory Medicine, Chongqing Medical Laboratory Microfluidics and SPRi Engineering Research Center, Chongqing Medical University, Chongqing 400016, PR China
- Jianhong Zhang
- Key Laboratory of Clinical Laboratory Diagnostics (Chinese Ministry of Education), College of Laboratory Medicine, Chongqing Medical Laboratory Microfluidics and SPRi Engineering Research Center, Chongqing Medical University, Chongqing 400016, PR China
- Ke Lv
- Department of Neurosurgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing 400016, PR China
- Tingmei Chen
- Key Laboratory of Clinical Laboratory Diagnostics (Chinese Ministry of Education), College of Laboratory Medicine, Chongqing Medical Laboratory Microfluidics and SPRi Engineering Research Center, Chongqing Medical University, Chongqing 400016, PR China
- Guoming Xie
- Key Laboratory of Clinical Laboratory Diagnostics (Chinese Ministry of Education), College of Laboratory Medicine, Chongqing Medical Laboratory Microfluidics and SPRi Engineering Research Center, Chongqing Medical University, Chongqing 400016, PR China

81
Noshay J, Walker T, Alexander W, Klingeman D, Romero J, Walker A, Prates E, Eckert C, Irle S, Kainer D, Jacobson D. Quantum biological insights into CRISPR-Cas9 sgRNA efficiency from explainable-AI driven feature engineering. Nucleic Acids Res 2023; 51:10147-10161. [PMID: 37738140 PMCID: PMC10602897 DOI: 10.1093/nar/gkad736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Revised: 08/07/2023] [Accepted: 08/29/2023] [Indexed: 09/24/2023] Open
Abstract
CRISPR-Cas9 tools have transformed genetic manipulation capabilities in the laboratory. Empirical rules-of-thumb have been developed for only a narrow range of model organisms, and mechanistic underpinnings for sgRNA efficiency remain poorly understood. This work establishes a novel feature set and new public resource, produced with quantum chemical tensors, for interpreting and predicting sgRNA efficiency. Feature engineering for sgRNA efficiency is performed using an explainable-artificial intelligence model: iterative Random Forest (iRF). By encoding quantitative attributes of position-specific sequences for Escherichia coli sgRNAs, we identify important traits for sgRNA design in bacterial species. Additionally, we show that expanding positional encoding to quantum descriptors of base-pair, dimer, trimer, and tetramer sequences captures intricate interactions in local and neighboring nucleotides of the target DNA. These features highlight variation in CRISPR-Cas9 sgRNA dynamics between E. coli and H. sapiens genomes. These novel encodings of sgRNAs enhance our understanding of the elaborate quantum biological processes involved in CRISPR-Cas9 machinery.
Affiliation(s)
- Jaclyn M Noshay
- Computational and Predictive Biology, Biosciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Tyler Walker
- Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee-Knoxville, Knoxville, TN, USA
- William G Alexander
- Synthetic Biology, Biosciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Dawn M Klingeman
- Synthetic Biology, Biosciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Jonathon Romero
- Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee-Knoxville, Knoxville, TN, USA
- Angelica M Walker
- Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee-Knoxville, Knoxville, TN, USA
- Erica Prates
- Computational and Predictive Biology, Biosciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Carrie Eckert
- Synthetic Biology, Biosciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Stephan Irle
- Computational Sciences and Engineering, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- David Kainer
- Computational and Predictive Biology, Biosciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Daniel A Jacobson
- Computational and Predictive Biology, Biosciences, Oak Ridge National Laboratory, Oak Ridge, TN, USA

82
Yang Y, Li J, Zou Q, Ruan Y, Feng H. Prediction of CRISPR-Cas9 off-target activities with mismatches and indels based on hybrid neural network. Comput Struct Biotechnol J 2023; 21:5039-5048. [PMID: 37867973 PMCID: PMC10589368 DOI: 10.1016/j.csbj.2023.10.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 10/10/2023] [Accepted: 10/11/2023] [Indexed: 10/24/2023] Open
Abstract
The CRISPR/Cas9 system has significantly advanced the field of gene editing, yet its clinical application is constrained by the considerable challenge of off-target effects. Although numerous deep learning models for off-target prediction have been proposed, most struggle to effectively extract the nuanced features of guide RNA (gRNA) and DNA sequence pairs and to mitigate information loss during data transmission within the model. To address these limitations, we introduce a novel Hybrid Neural Network (HNN) model that employs a parallelized network structure to fully extract pertinent features from different positions and types of bases in the sequence to minimize information loss. Notably, this study marks the first application of word embedding techniques to extract information from sequence pairs that contain insertions and deletions (Indels). Comprehensive evaluation across diverse datasets indicates that our proposed model outperforms existing state-of-the-art prediction methods in off-target prediction. The datasets and source codes supporting this study can be found at https://github.com/Yang-k955/CRISPR-HW.
Affiliation(s)
- Yanpeng Yang
- School of Mathematics and Computer science, Zhejiang A&F University, Hangzhou 311300, China
- Jian Li
- School of Mathematics and Computer science, Zhejiang A&F University, Hangzhou 311300, China
- Quan Zou
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yaoping Ruan
- School of Mathematics and Computer science, Zhejiang A&F University, Hangzhou 311300, China
- Hailin Feng
- School of Mathematics and Computer science, Zhejiang A&F University, Hangzhou 311300, China

83
Fischer K, Schnieke A. How genome editing changed the world of large animal research. Front Genome Ed 2023; 5:1272687. [PMID: 37886655 PMCID: PMC10598601 DOI: 10.3389/fgeed.2023.1272687] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/04/2023] [Accepted: 09/22/2023] [Indexed: 10/28/2023] Open
Abstract
The first genetically modified large animals were developed in 1985 by microinjection to increase the growth of agricultural livestock such as pigs. Since then, it has been a difficult trail due to the lack of genetic tools. Although methods and technologies were developed quickly for the main experimental mammal, the mouse, e.g., efficient pronuclear microinjection, gene targeting in embryonic stem cells, and omics data, most of it was-and in part still is-lacking when it comes to livestock. Over the next few decades, progress in genetic engineering of large animals was driven less by research for agriculture but more for biomedical applications, such as the production of pharmaceutical proteins in the milk of sheep, goats, or cows, xeno-organ transplantation, and modeling human diseases. Available technologies determined if a desired animal model could be realized, and efficiencies were generally low. Presented here is a short review of how genome editing tools, specifically CRISPR/Cas, have impacted the large animal field in recent years. Although there will be a focus on genome engineering of pigs for biomedical applications, the general principles and experimental approaches also apply to other livestock species or applications.
Affiliation(s)
| | - Angelika Schnieke
- Chair of Livestock Biotechnology, School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| |
|
84
|
Li L, Vasan L, Kartono B, Clifford K, Attarpour A, Sharma R, Mandrozos M, Kim A, Zhao W, Belotserkovsky A, Verkuyl C, Schmitt-Ulms G. Advances in Recombinant Adeno-Associated Virus Vectors for Neurodegenerative Diseases. Biomedicines 2023; 11:2725. [PMID: 37893099 PMCID: PMC10603849 DOI: 10.3390/biomedicines11102725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 09/29/2023] [Accepted: 10/03/2023] [Indexed: 10/29/2023] Open
Abstract
Recombinant adeno-associated virus (rAAV) vectors are gene therapy delivery tools that offer a promising platform for the treatment of neurodegenerative diseases. Keeping up with developments in this fast-moving area of research is a challenge. This review was thus written with the intention to introduce this field of study to those who are new to it and direct others who are struggling to stay abreast of the literature towards notable recent studies. In ten sections, we briefly highlight early milestones within this field and its first clinical success stories. We showcase current clinical trials, which focus on gene replacement, gene augmentation, or gene suppression strategies. Next, we discuss ongoing efforts to improve the tropism of rAAV vectors for brain applications and introduce pre-clinical research directed toward harnessing rAAV vectors for gene editing applications. Subsequently, we present common genetic elements coded by the single-stranded DNA of rAAV vectors, their so-called payloads. Our focus is on recent advances that are bound to increase treatment efficacies. As needed, we included studies outside the neurodegenerative disease field that showcased improved pre-clinical designs of all-in-one rAAV vectors for gene editing applications. Finally, we discuss risks associated with off-target effects and inadvertent immunogenicity that these technologies harbor as well as the mitigation strategies available to date to make their application safer.
Affiliation(s)
- Leyao Li
- Department of Biochemistry, University of Toronto, Medical Sciences Building, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
- Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Krembil Discovery Centre, 6th Floor, 60 Leonard Avenue, Toronto, ON M5T 0S8, Canada
| | - Lakshmy Vasan
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Medical Sciences Building, 6th Floor, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
| | - Bryan Kartono
- Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Krembil Discovery Centre, 6th Floor, 60 Leonard Avenue, Toronto, ON M5T 0S8, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Medical Sciences Building, 6th Floor, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
| | - Kevan Clifford
- Institute of Medical Science, University of Toronto, Medical Sciences Building, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
- Centre for Addiction and Mental Health (CAMH), 250 College St., Toronto, ON M5T 1R8, Canada
| | - Ahmadreza Attarpour
- Department of Medical Biophysics, University of Toronto, 101 College St., Toronto, ON M5G 1L7, Canada
| | - Raghav Sharma
- Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Krembil Discovery Centre, 6th Floor, 60 Leonard Avenue, Toronto, ON M5T 0S8, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Medical Sciences Building, 6th Floor, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
| | - Matthew Mandrozos
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Medical Sciences Building, 6th Floor, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
| | - Ain Kim
- Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Krembil Discovery Centre, 6th Floor, 60 Leonard Avenue, Toronto, ON M5T 0S8, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Medical Sciences Building, 6th Floor, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
| | - Wenda Zhao
- Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Krembil Discovery Centre, 6th Floor, 60 Leonard Avenue, Toronto, ON M5T 0S8, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Medical Sciences Building, 6th Floor, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
| | - Ari Belotserkovsky
- Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Krembil Discovery Centre, 6th Floor, 60 Leonard Avenue, Toronto, ON M5T 0S8, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Medical Sciences Building, 6th Floor, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
| | - Claire Verkuyl
- Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Krembil Discovery Centre, 6th Floor, 60 Leonard Avenue, Toronto, ON M5T 0S8, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Medical Sciences Building, 6th Floor, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
| | - Gerold Schmitt-Ulms
- Tanz Centre for Research in Neurodegenerative Diseases, University of Toronto, Krembil Discovery Centre, 6th Floor, 60 Leonard Avenue, Toronto, ON M5T 0S8, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Medical Sciences Building, 6th Floor, 1 King’s College Circle, Toronto, ON M5S 1A8, Canada
| |
|
85
|
Liu Y, Fan R, Yi J, Cui Q, Cui C. A fusion framework of deep learning and machine learning for predicting sgRNA cleavage efficiency. Comput Biol Med 2023; 165:107476. [PMID: 37696181 DOI: 10.1016/j.compbiomed.2023.107476] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/11/2023] [Revised: 08/22/2023] [Accepted: 09/04/2023] [Indexed: 09/13/2023]
Abstract
The CRISPR/Cas9 system is a powerful tool for genome editing. Numerous studies have shown that sgRNAs can strongly affect the efficiency of editing. However, it is still not clear what rules should be followed for designing sgRNAs with high cleavage efficiency. At present, several machine learning or deep learning methods have been developed to predict the cleavage efficiency of sgRNAs, but the prediction accuracy of these tools is still not satisfactory. Here we propose a fusion framework of deep learning and machine learning, which first processes the primary sequence and secondary structure features of the sgRNAs using both a convolutional neural network (CNN) and a recurrent neural network (RNN), and then uses the features extracted by the deep neural network to train a conventional machine learning model with LGBM. As a result, the new approach outperformed previous methods. The Spearman's correlation coefficient between predicted and measured sgRNA cleavage efficiency of our model (0.917) is improved by over 5% compared with the most advanced method (0.865), and the mean square error is reduced from 7.89 × 10⁻³ to 4.75 × 10⁻³. Finally, we developed an online tool, CRISep (http://www.cuilab.cn/CRISep), to evaluate the availability of sgRNAs based on our models.
Affiliation(s)
- Yu Liu
- Department of Biomedical Informatics, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
| | - Rui Fan
- Department of Biomedical Informatics, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
| | - Jingkun Yi
- Department of Biomedical Informatics, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
| | - Qinghua Cui
- Department of Biomedical Informatics, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China.
| | - Chunmei Cui
- Department of Biomedical Informatics, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China.
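The two metrics quoted in the Liu et al. abstract above, Spearman's correlation and mean square error, can be computed for any set of predictions. What follows is a minimal sketch in plain Python, not the authors' code: the `measured` and `predicted` lists are made-up efficiency values, and all function names are hypothetical. The last two lines work out the relative changes implied by the reported figures (0.865 to 0.917, and 7.89 × 10⁻³ to 4.75 × 10⁻³).

```python
def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mse(y_true, y_pred):
    """Mean of the squared differences between truth and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_change(old, new):
    """Signed change from old to new, as a fraction of old."""
    return (new - old) / old


# Hypothetical toy data, for illustration only.
measured = [0.10, 0.35, 0.50, 0.70, 0.90]
predicted = [0.15, 0.40, 0.30, 0.65, 0.80]

rho = spearman(measured, predicted)
error = mse(measured, predicted)

# Relative changes implied by the figures reported in the abstract.
rho_gain = relative_change(0.865, 0.917)
mse_drop = relative_change(7.89e-3, 4.75e-3)

print(f"toy rho={rho:.3f}, toy MSE={error:.4f}")
print(f"reported rho gain={rho_gain:.1%}, reported MSE change={mse_drop:.1%}")
```

Read this way, the reported gain in Spearman's correlation is 0.052 in absolute terms and about 6% in relative terms, while the mean square error falls by roughly 40%.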
| |
|
86
|
Zhang G, Luo Y, Dai X, Dai Z. Benchmarking deep learning methods for predicting CRISPR/Cas9 sgRNA on- and off-target activities. Brief Bioinform 2023; 24:bbad333. [PMID: 37775147 DOI: 10.1093/bib/bbad333] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 01/27/2023] [Revised: 08/31/2023] [Accepted: 09/04/2023] [Indexed: 10/01/2023] Open
Abstract
In silico design of single guide RNA (sgRNA) plays a critical role in the clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9 (CRISPR/Cas9) system. Continuous efforts are aimed at improving sgRNA design with efficient on-target activity and reduced off-target mutations. In the last 5 years, an increasing number of deep learning-based methods have achieved breakthrough performance in predicting sgRNA on- and off-target activities. Nevertheless, it is worthwhile to systematically evaluate these methods for their predictive abilities. In this review, we conducted a systematic survey on the progress in prediction of on- and off-target editing. We investigated the performances of 10 mainstream deep learning-based on-target predictors using nine public datasets with different sample sizes. We found that in most scenarios, these methods showed greater predictive power on large- and medium-scale datasets than on small-scale datasets. In addition, we performed unbiased experiments to provide an in-depth comparison of eight representative approaches for off-target prediction on 12 publicly available datasets with various imbalance ratios of positive/negative samples. Most methods showed excellent performance on balanced datasets but have much room for improvement on moderately and severely imbalanced datasets. This study provides comprehensive perspectives on CRISPR/Cas9 sgRNA on- and off-target activity prediction and on improving method development.
Affiliation(s)
- Guishan Zhang
- College of Engineering, Shantou University, Shantou 515063, China
| | - Ye Luo
- College of Engineering, Shantou University, Shantou 515063, China
| | - Xianhua Dai
- School of Cyber Science and Technology, Sun Yat-sen University, Shenzhen 518107, China
- Southern Marine Science and Engineering Guangdong Laboratory, Zhuhai 519000, China
| | - Zhiming Dai
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
- Guangdong Province Key Laboratory of Big Data Analysis and Processing, Sun Yat-sen University, Guangzhou 510006, China
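The Zhang et al. abstract above reports that off-target predictors do well on balanced datasets but less well on imbalanced ones. One general reason imbalance is hard, offered here as an illustration and not as the benchmark's own analysis, is that precision falls as negatives outnumber positives even when a classifier's true-positive and false-positive rates stay fixed. The sketch below assumes hypothetical rates (`TPR = 0.90`, `FPR = 0.05`) and hypothetical negative-to-positive ratios.

```python
def precision_at_ratio(tpr, fpr, neg_per_pos):
    """Precision of a classifier with fixed true-positive rate (tpr) and
    false-positive rate (fpr) when each positive sample is accompanied
    by `neg_per_pos` negative samples."""
    true_pos = tpr                 # expected true positives per positive sample
    false_pos = fpr * neg_per_pos  # expected false positives per positive sample
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point, for illustration only.
TPR, FPR = 0.90, 0.05

# Precision at increasingly imbalanced negative:positive ratios.
results = {ratio: precision_at_ratio(TPR, FPR, ratio) for ratio in (1, 10, 100, 1000)}

for ratio, precision in results.items():
    print(f"{ratio:>5} negatives per positive: precision = {precision:.3f}")
```

The same classifier that looks strong at a 1:1 ratio returns mostly false positives at 100:1, which is why metrics that account for class balance matter when comparing off-target predictors.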
| |
|
87
|
Mantena S, Pillai PP, Petros BA, Welch NL, Myhrvold C, Sabeti PC, Metsky HC. Model-directed generation of CRISPR-Cas13a guide RNAs designs artificial sequences that improve nucleic acid detection. bioRxiv 2023:2023.09.20.557569. [PMID: 37786711 PMCID: PMC10541601 DOI: 10.1101/2023.09.20.557569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2023]
Abstract
Generating maximally-fit biological sequences has the potential to transform CRISPR guide RNA design as it has other areas of biomedicine. Here, we introduce model-directed exploration algorithms (MEAs) for designing maximally-fit, artificial CRISPR-Cas13a guides, with multiple mismatches to any natural sequence, that are tailored for desired properties around nucleic acid diagnostics. We find that MEA-designed guides offer more sensitive detection of diverse pathogens and discrimination of pathogen variants compared to guides derived directly from natural sequences, and illuminate interpretable design principles that broaden Cas13a targeting.
Affiliation(s)
- Sreekar Mantena
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Statistics, Harvard University, Cambridge, MA, USA
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
| | | | - Brittany A. Petros
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Health Sciences and Technology, Harvard Medical School and Massachusetts Institute of Technology, Cambridge, MA, USA
- Harvard/Massachusetts Institute of Technology, MD-PhD Program, Boston, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | | | - Cameron Myhrvold
- Department of Molecular Biology, Princeton University, Princeton, NJ, USA
| | - Pardis C. Sabeti
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | | |
|
88
|
Ham DT, Browne TS, Banglorewala PN, Wilson TL, Michael RK, Gloor GB, Edgell DR. A generalizable Cas9/sgRNA prediction model using machine transfer learning with small high-quality datasets. Nat Commun 2023; 14:5514. [PMID: 37679324 PMCID: PMC10485023 DOI: 10.1038/s41467-023-41143-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/22/2023] [Accepted: 08/24/2023] [Indexed: 09/09/2023] Open
Abstract
The CRISPR/Cas9 nuclease from Streptococcus pyogenes (SpCas9) can be used with single guide RNAs (sgRNAs) as a sequence-specific antimicrobial agent and as a genome-engineering tool. However, current bacterial sgRNA activity models struggle with accurate predictions and do not generalize well, possibly because the underlying datasets used to train the models do not accurately measure SpCas9/sgRNA activity and cannot distinguish on-target cleavage from toxicity. Here, we solve this problem by using a two-plasmid positive selection system to generate high-quality data that more accurately reports on SpCas9/sgRNA cleavage and that separates activity from toxicity. We develop a machine learning architecture (crisprHAL) that can be trained on existing datasets, that shows marked improvements in sgRNA activity prediction accuracy when transfer learning is used with small amounts of high-quality data, and that can generalize predictions to different bacteria. The crisprHAL model recapitulates known SpCas9/sgRNA-target DNA interactions and provides a pathway to a generalizable sgRNA bacterial activity prediction tool that will enable accurate antimicrobial and genome engineering applications.
Affiliation(s)
- Dalton T Ham
- Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
| | - Tyler S Browne
- Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
| | - Pooja N Banglorewala
- Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
| | | | | | - Gregory B Gloor
- Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada.
| | - David R Edgell
- Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada.
| |
|
89
|
Qian Y, Zhou D, Li M, Zhao Y, Liu H, Yang L, Ying Z, Huang G. Application of CRISPR-Cas system in the diagnosis and therapy of ESKAPE infections. Front Cell Infect Microbiol 2023; 13:1223696. [PMID: 37662004 PMCID: PMC10470840 DOI: 10.3389/fcimb.2023.1223696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 07/24/2023] [Indexed: 09/05/2023] Open
Abstract
Antimicrobial-resistant ESKAPE (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species) pathogens represent a global threat to human health. ESKAPE pathogens are the most common opportunistic pathogens in nosocomial infections, and a considerable number of their clinical isolates are not susceptible to conventional antimicrobial therapy. Therefore, innovative therapeutic strategies that can effectively deal with ESKAPE pathogens will bring huge social and economic benefits and ease the suffering of tens of thousands of patients. Among these strategies, the CRISPR (clustered regularly interspaced short palindromic repeats) system has received extra attention due to its high specificity. Regrettably, there is currently no direct CRISPR-system-based anti-infective treatment. This paper reviews the applications of the CRISPR-Cas system in the study of ESKAPE pathogens, aiming to provide directions for research on ideal new drugs and a reference for solving the problems caused by multidrug-resistant (MDR) bacteria in the post-antibiotic era. However, most research is still far from clinical application.
Affiliation(s)
- Yizheng Qian
- Department of Burns and Plastic Surgery, Affiliated Hospital of Zunyi Medical University, Zunyi, China
- The Collaborative Innovation Center of Tissue Damage Repair and Regeneration Medicine of Zunyi Medical University, Zunyi, China
| | - Dapeng Zhou
- Department of Burns and Plastic Surgery, Affiliated Hospital of Zunyi Medical University, Zunyi, China
- The Collaborative Innovation Center of Tissue Damage Repair and Regeneration Medicine of Zunyi Medical University, Zunyi, China
- Department of Burn Plastic and Wound Repair Surgery, The Affiliated Hospital of Youjiang Medical University for Nationalities, Baise, China
| | - Min Li
- Department of Burns and Plastic Surgery, Affiliated Hospital of Zunyi Medical University, Zunyi, China
- The Collaborative Innovation Center of Tissue Damage Repair and Regeneration Medicine of Zunyi Medical University, Zunyi, China
| | - Yongxiang Zhao
- Department of Burns and Plastic Surgery, Affiliated Hospital of Zunyi Medical University, Zunyi, China
- The Collaborative Innovation Center of Tissue Damage Repair and Regeneration Medicine of Zunyi Medical University, Zunyi, China
| | - Huanhuan Liu
- Department of Burns and Plastic Surgery, Affiliated Hospital of Zunyi Medical University, Zunyi, China
- The Collaborative Innovation Center of Tissue Damage Repair and Regeneration Medicine of Zunyi Medical University, Zunyi, China
| | - Li Yang
- Department of Burns and Plastic Surgery, Affiliated Hospital of Zunyi Medical University, Zunyi, China
- The Collaborative Innovation Center of Tissue Damage Repair and Regeneration Medicine of Zunyi Medical University, Zunyi, China
| | - Zhiqin Ying
- Department of Burns and Plastic Surgery, Affiliated Hospital of Zunyi Medical University, Zunyi, China
- The Collaborative Innovation Center of Tissue Damage Repair and Regeneration Medicine of Zunyi Medical University, Zunyi, China
| | - Guangtao Huang
- Department of Burns and Plastic Surgery, Affiliated Hospital of Zunyi Medical University, Zunyi, China
- The Collaborative Innovation Center of Tissue Damage Repair and Regeneration Medicine of Zunyi Medical University, Zunyi, China
- Department of Burn and Plastic Surgery, Department of Wound Repair, Shenzhen Institute of Translational Medicine, The First Affiliated Hospital of Shenzhen University, Shenzhen Second People’s Hospital, Shenzhen, China
| |
|
90
|
Durán-Vinet B, Araya-Castro K, Zaiko A, Pochon X, Wood SA, Stanton JAL, Jeunen GJ, Scriver M, Kardailsky A, Chao TC, Ban DK, Moarefian M, Aran K, Gemmell NJ. CRISPR-Cas-Based Biomonitoring for Marine Environments: Toward CRISPR RNA Design Optimization Via Deep Learning. CRISPR J 2023; 6:316-324. [PMID: 37439822 PMCID: PMC10494903 DOI: 10.1089/crispr.2023.0019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Accepted: 05/30/2023] [Indexed: 07/14/2023] Open
Abstract
Almost all of Earth's oceans are now impacted by multiple anthropogenic stressors, including the spread of nonindigenous species, harmful algal blooms, and pathogens. Early detection is critical to manage these stressors effectively and to protect marine systems and the ecosystem services they provide. Molecular tools have emerged as a promising solution for marine biomonitoring. One of the latest advancements involves utilizing CRISPR-Cas technology to build programmable, rapid, ultrasensitive, and specific diagnostics. CRISPR-based diagnostics (CRISPR-Dx) has the potential to allow robust, reliable, and cost-effective biomonitoring in near real time. However, several challenges must be overcome before CRISPR-Dx can be established as a mainstream tool for marine biomonitoring. A critical unmet challenge is the need to design, optimize, and experimentally validate CRISPR-Dx assays. Artificial intelligence has recently been presented as a potential approach to tackle this challenge. This perspective synthesizes recent advances in CRISPR-Dx and machine learning modeling approaches, showcasing CRISPR-Dx potential to progress as a rising molecular tool candidate for marine biomonitoring applications.
Affiliation(s)
- Benjamín Durán-Vinet
- Department of Anatomy, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
- Scientific and Technological Bioresource Nucleus (BIOREN-UFRO), Universidad de La Frontera, Temuco, Chile
| | - Karla Araya-Castro
- Scientific and Technological Bioresource Nucleus (BIOREN-UFRO), Universidad de La Frontera, Temuco, Chile
| | - Anastasija Zaiko
- Cawthron Institute, Nelson, New Zealand
- Institute of Marine Science, University of Auckland, Auckland, New Zealand
- Sequench Ltd, Nelson, New Zealand
| | - Xavier Pochon
- Cawthron Institute, Nelson, New Zealand
- Institute of Marine Science, University of Auckland, Auckland, New Zealand
| | - Susanna A. Wood
- Cawthron Institute, Nelson, New Zealand
| | - Jo-Ann L. Stanton
- Department of Anatomy, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
| | - Gert-Jan Jeunen
- Department of Anatomy, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
- Department of Marine Sciences, University of Otago, Dunedin, New Zealand
| | - Michelle Scriver
- Cawthron Institute, Nelson, New Zealand
- Institute of Marine Science, University of Auckland, Auckland, New Zealand
| | - Anya Kardailsky
- Department of Anatomy, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
- Department of Zoology, University of Otago, Dunedin, New Zealand
| | - Tzu-Chiao Chao
- Institute of Environmental Change and Society, Department of Biology, University of Regina, Regina, Canada
| | - Deependra K. Ban
- Keck Graduate Institute, The Claremont Colleges, Claremont, California, USA
| | - Maryam Moarefian
- Keck Graduate Institute, The Claremont Colleges, Claremont, California, USA
| | - Kiana Aran
- Keck Graduate Institute, The Claremont Colleges, Claremont, California, USA
- Cardea Bio Inc., San Diego, California, USA
- University of California, Berkeley, Berkeley, California, USA
| | - Neil J. Gemmell
- Department of Anatomy, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
| |
|
91
|
Hussen BM, Rasul MF, Abdullah SR, Hidayat HJ, Faraj GSH, Ali FA, Salihi A, Baniahmad A, Ghafouri-Fard S, Rahman M, Glassy MC, Branicki W, Taheri M. Targeting miRNA by CRISPR/Cas in cancer: advantages and challenges. Mil Med Res 2023; 10:32. [PMID: 37460924 PMCID: PMC10351202 DOI: 10.1186/s40779-023-00468-6] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 01/27/2023] [Accepted: 07/03/2023] [Indexed: 07/20/2023] Open
Abstract
Clustered regularly interspaced short palindromic repeats (CRISPR) has changed biomedical research and provided entirely new models to analyze every aspect of biomedical sciences during the last decade. In the study of cancer, the CRISPR/CRISPR-associated protein (Cas) system opens new avenues into previously unexplored questions about the noncoding genome, tumor heterogeneity, and precision medicine. CRISPR/Cas-based gene-editing technology now allows for the precise and permanent targeting of mutations and provides an opportunity to target small non-coding RNAs such as microRNAs (miRNAs). However, the development of effective and safe cancer gene editing therapy is highly dependent on proper design to be innocuous to normal cells and prevent introducing other abnormalities. This study aims to highlight the cutting-edge approaches in cancer-gene editing therapy based on the CRISPR/Cas technology to target miRNAs in cancer therapy. Furthermore, we highlight the potential challenges in CRISPR/Cas-mediated miRNA gene editing and offer advanced strategies to overcome them.
Affiliation(s)
- Bashdar Mahmud Hussen
- Department of Biomedical Sciences, Cihan University-Erbil, Erbil, Kurdistan Region 44001 Iraq
- Department of Clinical Analysis, College of Pharmacy, Hawler Medical University, Erbil, Kurdistan Region 44001 Iraq
| | - Mohammed Fatih Rasul
- Department of Pharmaceutical Basic Science, Faculty of Pharmacy, Tishk International University, Erbil, Kurdistan Region 44001 Iraq
| | - Snur Rasool Abdullah
- Medical Laboratory Science, Lebanese French University, Erbil, Kurdistan Region 44001 Iraq
| | - Hazha Jamal Hidayat
- Department of Biology, College of Education, Salahaddin University-Erbil, Erbil, Kurdistan Region 44001 Iraq
| | - Goran Sedeeq Hama Faraj
- Department of Medical Laboratory Science, Komar University of Science and Technology, Sulaymaniyah, 46001 Iraq
| | - Fattma Abodi Ali
- Department of Medical Microbiology, College of Health Sciences, Hawler Medical University, Erbil, Kurdistan Region 44001 Iraq
| | - Abbas Salihi
- Department of Biology, College of Science, Salahaddin University-Erbil, Erbil, Kurdistan Region 44001 Iraq
- Center of Research and Strategic Studies, Lebanese French University, Erbil, 44001 Iraq
| | - Aria Baniahmad
- Institute of Human Genetics, Jena University Hospital, 07747 Jena, Germany
| | - Soudeh Ghafouri-Fard
- Department of Medical Genetics, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, 374-37515 Iran
| | - Milladur Rahman
- Department of Clinical Sciences, Malmö, Section for Surgery, Lund University, 22100 Malmö, Sweden
| | - Mark C. Glassy
- Translational Neuro-Oncology Laboratory, San Diego (UCSD) Moores Cancer Center, University of California, San Diego, CA 94720 USA
| | - Wojciech Branicki
- Faculty of Biology, Institute of Zoology and Biomedical Research, Jagiellonian University, 31-007 Kraków, Poland
| | - Mohammad Taheri
- Institute of Human Genetics, Jena University Hospital, 07747 Jena, Germany
- Urology and Nephrology Research Center, Shahid Beheshti University of Medical Sciences, Tehran, 374-37515 Iran
| |
|
92
|
Wong F, de la Fuente-Nunez C, Collins JJ. Leveraging artificial intelligence in the fight against infectious diseases. Science 2023; 381:164-170. [PMID: 37440620 PMCID: PMC10663167 DOI: 10.1126/science.adh1114] [Citation(s) in RCA: 96] [Impact Index Per Article: 48.0] [Received: 04/01/2023] [Accepted: 06/05/2023] [Indexed: 07/15/2023]
Abstract
Despite advances in molecular biology, genetics, computation, and medicinal chemistry, infectious disease remains an ominous threat to public health. Addressing the challenges posed by pathogen outbreaks, pandemics, and antimicrobial resistance will require concerted interdisciplinary efforts. In conjunction with systems and synthetic biology, artificial intelligence (AI) is now leading to rapid progress, expanding anti-infective drug discovery, enhancing our understanding of infection biology, and accelerating the development of diagnostics. In this Review, we discuss approaches for detecting, treating, and understanding infectious diseases, underscoring the progress supported by AI in each case. We suggest future applications of AI and how it might be harnessed to help control infectious disease outbreaks and pandemics.
Affiliation(s)
- Felix Wong
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Institute for Medical Engineering & Science and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Cesar de la Fuente-Nunez
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA 19104, USA
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - James J. Collins
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Institute for Medical Engineering & Science and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
| |
|
93
|
Ghamsari R, Rosenbluh J, Menon AV, Lovell NH, Alinejad-Rokny H. Technological Convergence: Highlighting the Power of CRISPR Single-Cell Perturbation Toolkit for Functional Interrogation of Enhancers. Cancers (Basel) 2023; 15:3566. [PMID: 37509229 PMCID: PMC10377346 DOI: 10.3390/cancers15143566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 06/30/2023] [Accepted: 07/03/2023] [Indexed: 07/30/2023] Open
Abstract
Higher eukaryotic enhancers, as a major class of regulatory elements, play a crucial role in the regulation of gene expression. Over the last decade, the development of sequencing technologies has flooded researchers with transcriptome-phenotype data alongside emerging candidate regulatory elements. Since most methods can only provide hints about enhancer function, there have been attempts to develop experimental and computational approaches that can bridge the gap in the causal relationship between regulatory regions and phenotypes. The coupling of two state-of-the-art technologies, also referred to as crisprQTL, has emerged as a promising high-throughput toolkit for addressing this question. This review provides an overview of the importance of studying enhancers, the core molecular foundation of crisprQTL, and recent studies utilizing crisprQTL to interrogate enhancer-phenotype correlations. Additionally, we discuss computational methods currently employed for crisprQTL data analysis. We conclude by pointing out common challenges, making recommendations, and looking at future prospects, with the aim of providing researchers with an overview of crisprQTL as an important toolkit for studying enhancers.
Affiliation(s)
- Reza Ghamsari
- BioMedical Machine Learning Lab (BML), The Graduate School of Biomedical Engineering, UNSW Sydney, Sydney, NSW 2052, Australia
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
- Joseph Rosenbluh
- Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
- A Vipin Menon
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
- Nigel H. Lovell
- The Graduate School of Biomedical Engineering, UNSW Sydney, Sydney, NSW 2052, Australia
- Tyree Institute of Health Engineering (IHealthE), UNSW Sydney, Sydney, NSW 2052, Australia
- Hamid Alinejad-Rokny
- BioMedical Machine Learning Lab (BML), The Graduate School of Biomedical Engineering, UNSW Sydney, Sydney, NSW 2052, Australia
- UNSW Data Science Hub, UNSW Sydney, Sydney, NSW 2052, Australia
94
Lee M. Deep learning in CRISPR-Cas systems: a review of recent studies. Front Bioeng Biotechnol 2023; 11:1226182. [PMID: 37469443 PMCID: PMC10352112 DOI: 10.3389/fbioe.2023.1226182] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 05/24/2023] [Accepted: 06/22/2023] [Indexed: 07/21/2023] Open
Abstract
In genetic engineering, the revolutionary CRISPR-Cas system has proven to be a vital tool for precise genome editing. Simultaneously, the emergence and rapid evolution of deep learning methodologies has provided an impetus to the scientific exploration of genomic data. These concurrent advancements mandate regular investigation of the state-of-the-art, particularly given the pace of recent developments. This review focuses on the significant progress achieved during 2019-2023 in the utilization of deep learning for predicting guide RNA (gRNA) activity in the CRISPR-Cas system, a key element determining the effectiveness and specificity of genome editing procedures. In this paper, an analytical overview of contemporary research is provided, with emphasis placed on the amalgamation of artificial intelligence and genetic engineering. The importance of our review is underscored by the necessity to comprehend the rapidly evolving deep learning methodologies and their potential impact on the effectiveness of the CRISPR-Cas system. By analyzing recent literature, this review highlights the achievements and emerging trends in the integration of deep learning with the CRISPR-Cas systems, thus contributing to the future direction of this essential interdisciplinary research area.
95
Sherkatghanad Z, Abdar M, Charlier J, Makarenkov V. Using traditional machine learning and deep learning methods for on- and off-target prediction in CRISPR/Cas9: a review. Brief Bioinform 2023; 24:bbad131. [PMID: 37080758 PMCID: PMC10199778 DOI: 10.1093/bib/bbad131] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 11/02/2022] [Revised: 03/07/2023] [Accepted: 03/13/2023] [Indexed: 04/22/2023] Open
Abstract
CRISPR/Cas9 (Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated protein 9) is a popular and effective two-component technology used for targeted genetic manipulation. It is currently the most versatile and accurate method of gene and genome editing, which benefits from a large variety of practical applications. For example, in biomedicine, it has been used in research related to cancer, virus infections, pathogen detection, and genetic diseases. Current CRISPR/Cas9 research is based on data-driven models for on- and off-target prediction as a cleavage may occur at non-target sequence locations. Nowadays, conventional machine learning and deep learning methods are applied on a regular basis to accurately predict on-target knockout efficacy and off-target profile of given single-guide RNAs (sgRNAs). In this paper, we present an overview and a comparative analysis of traditional machine learning and deep learning models used in CRISPR/Cas9. We highlight the key research challenges and directions associated with target activity prediction. We discuss recent advances in the sgRNA-DNA sequence encoding used in state-of-the-art on- and off-target prediction models. Furthermore, we present the most popular deep learning neural network architectures used in CRISPR/Cas9 prediction models. Finally, we summarize the existing challenges and discuss possible future investigations in the field of on- and off-target prediction. Our paper provides valuable support for academic and industrial researchers interested in the application of machine learning methods in the field of CRISPR/Cas9 genome editing.
Affiliation(s)
- Zeinab Sherkatghanad
- Departement d’Informatique, Universite du Quebec a Montreal, H2X 3Y7, Montreal, QC, Canada
- Moloud Abdar
- Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, 3216, Geelong, VIC, Australia
- Jeremy Charlier
- Departement d’Informatique, Universite du Quebec a Montreal, H2X 3Y7, Montreal, QC, Canada
- Vladimir Makarenkov
- Departement d’Informatique, Universite du Quebec a Montreal, H2X 3Y7, Montreal, QC, Canada
96
Zhang H, Yan J, Lu Z, Zhou Y, Zhang Q, Cui T, Li Y, Chen H, Ma L. Deep sampling of gRNA in the human genome and deep-learning-informed prediction of gRNA activities. Cell Discov 2023; 9:48. [PMID: 37193681 DOI: 10.1038/s41421-023-00549-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/02/2022] [Accepted: 03/21/2023] [Indexed: 05/18/2023] Open
Abstract
Life science studies involving clustered regularly interspaced short palindromic repeat (CRISPR) editing generally apply the best-performing guide RNA (gRNA) for a gene of interest. Computational models are combined with massive experimental quantification on synthetic gRNA-target libraries to accurately predict gRNA activity and mutational patterns. However, the measurements are inconsistent between studies due to differences in the designs of the gRNA-target pair constructs, and there has not yet been an integrated investigation that concurrently focuses on multiple facets of gRNA capacity. In this study, we analyzed the DNA double-strand break (DSB)-induced repair outcomes and measured SpCas9/gRNA activities at both matched and mismatched locations using 926,476 gRNAs covering 19,111 protein-coding genes and 20,268 non-coding genes. We developed machine learning models to forecast the on-target cleavage efficiency (AIdit_ON), off-target cleavage specificity (AIdit_OFF), and mutational profiles (AIdit_DSB) of SpCas9/gRNA from a uniformly collected and processed dataset by deep sampling and massively quantifying gRNA capabilities in K562 cells. Each of these models exhibited superlative performance in predicting SpCas9/gRNA activities on independent datasets when benchmarked with previous models. A previously unknown parameter was also empirically determined regarding the "sweet spot" in the size of datasets used to establish an effective model to predict gRNA capabilities at a manageable experimental scale. In addition, we observed cell type-specific mutational profiles and were able to identify nucleotidylexotransferase as the key factor driving these outcomes. These massive datasets and deep learning algorithms have been implemented into the user-friendly web service http://crispr-aidit.com to evaluate and rank gRNAs for life science studies.
Affiliation(s)
- Heng Zhang
- Center for Genome Editing, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Institute of Biology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
- AIdit Therapeutics, Hangzhou, Zhejiang, China
- Jianfeng Yan
- Center for Genome Editing, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Institute of Biology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
- AIdit Therapeutics, Hangzhou, Zhejiang, China
- Zhike Lu
- Center for Genome Editing, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Institute of Biology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
- Yangfan Zhou
- Center for Genome Editing, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Institute of Biology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
- Yini Li
- Center for Genome Editing, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Institute of Biology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
- Hui Chen
- AIdit Therapeutics, Hangzhou, Zhejiang, China
- Lijia Ma
- Center for Genome Editing, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Institute of Biology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
97
Vora DS, Yadav S, Sundar D. Hybrid Multitask Learning Reveals Sequence Features Driving Specificity in the CRISPR/Cas9 System. Biomolecules 2023; 13:641. [PMID: 37189388 DOI: 10.3390/biom13040641] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/21/2023] [Revised: 03/27/2023] [Accepted: 03/28/2023] [Indexed: 04/05/2023] Open
Abstract
CRISPR/Cas9 technology is capable of precisely editing genomes and is at the heart of various scientific and medical advances in recent times. The advances in biomedical research are hindered because of the inadvertent burden on the genome when genome editors are employed—the off-target effects. Although experimental screens to detect off-targets have allowed understanding the activity of Cas9, that knowledge remains incomplete as the rules do not extrapolate well to new target sequences. Off-target prediction tools developed recently have increasingly relied on machine learning and deep learning techniques to reliably understand the complete threat of likely off-targets because the rules that drive Cas9 activity are not fully understood. In this study, we present a count-based as well as deep-learning-based approach to derive sequence features that are important in deciding on Cas9 activity at a sequence. There are two major challenges in off-target determination—the identification of a likely site of Cas9 activity and the prediction of the extent of Cas9 activity at that site. The hybrid multitask CNN–biLSTM model developed, named CRISP–RCNN, simultaneously predicts off-targets and the extent of activity on off-targets. Employing methods of integrated gradients and weighting kernels for feature importance approximation, analysis of nucleotide and position preference, and mismatch tolerance have been performed.
Affiliation(s)
- Dhvani Sandip Vora
- Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
- Shashank Yadav
- Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
- Durai Sundar
- Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
- Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
98
Naeem M, Alkhnbashi OS. Current Bioinformatics Tools to Optimize CRISPR/Cas9 Experiments to Reduce Off-Target Effects. Int J Mol Sci 2023; 24:6261. [PMID: 37047235 PMCID: PMC10094584 DOI: 10.3390/ijms24076261] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 12/30/2022] [Revised: 03/07/2023] [Accepted: 03/13/2023] [Indexed: 03/29/2023] Open
Abstract
The CRISPR-Cas system has evolved into a cutting-edge technology that has transformed the field of biological sciences through precise genetic manipulation. CRISPR/Cas9 nuclease is evolving into a revolutionizing method to edit any gene of any species with desirable outcomes. The swift advancement of CRISPR-Cas technology is reflected in an ever-expanding ecosystem of bioinformatics tools designed to make CRISPR/Cas9 experiments easier. To assist researchers with efficient guide RNA designs with fewer off-target effects, nuclease target site selection, and experimental validation, bioinformaticians have built and developed a comprehensive set of tools. In this article, we will review the various computational tools available for the assessment of off-target effects, as well as the quantification of nuclease activity and specificity, including web-based search tools and experimental methods, and we will describe how these tools can be optimized for gene knock-out (KO) and gene knock-in (KI) for model organisms. We also discuss future directions in precision genome editing and its applications, as well as challenges in target selection, particularly in predicting off-target effects.
99
Guo C, Ma X, Gao F, Guo Y. Off-target effects in CRISPR/Cas9 gene editing. Front Bioeng Biotechnol 2023; 11:1143157. [PMID: 36970624 PMCID: PMC10034092 DOI: 10.3389/fbioe.2023.1143157] [Citation(s) in RCA: 171] [Impact Index Per Article: 85.5] [Received: 01/12/2023] [Accepted: 02/28/2023] [Indexed: 03/11/2023] Open
Abstract
Gene editing refers to methods for making precise changes to a specific nucleic acid sequence. With the recent development of the clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 system, gene editing has become efficient, convenient and programmable, leading to promising translational studies and clinical trials for both genetic and non-genetic diseases. A major concern in the applications of the CRISPR/Cas9 system is its off-target effects, namely the deposition of unexpected, unwanted, or even adverse alterations to the genome. To date, many methods have been developed to nominate or detect the off-target sites of CRISPR/Cas9, which laid the basis for the successful upgrades of CRISPR/Cas9 derivatives with enhanced precision. In this review, we summarize these technological advancements and discuss the current challenges in the management of off-target effects for future gene therapy.
Affiliation(s)
- Congting Guo
- School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China
- Peking University Institute of Cardiovascular Sciences, Beijing, China
- Xiaoteng Ma
- Department of Cardiology, Beijing Anzhen Hospital, Capital Medical University, Beijing, China
- Fei Gao
- Department of Cardiology, Beijing Anzhen Hospital, Capital Medical University, Beijing, China
- *Correspondence: Fei Gao; Yuxuan Guo
- Yuxuan Guo
- School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China
- Peking University Institute of Cardiovascular Sciences, Beijing, China
- Ministry of Education Key Laboratory of Molecular Cardiovascular Science, Beijing, China
- Beijing Key Laboratory of Cardiovascular Receptors Research, Beijing, China
- *Correspondence: Fei Gao; Yuxuan Guo
100
Du Z, Huang T, Uversky VN, Li J. Predicting TF Proteins by Incorporating Evolution Information Through PSSM. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:1319-1326. [PMID: 35981062 DOI: 10.1109/tcbb.2022.3199758] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 05/04/2023]
Abstract
Transcription factors (TFs) are DNA binding proteins involved in the regulation of gene expression. They exist in all organisms and activate or repress transcription by binding to specific DNA sequences. Traditionally, TFs have been identified by experimental methods that are time-consuming and costly. In recent years, various computational methods have been developed to identify TFs and overcome these limitations. However, there is room for further improvement in the predictive accuracy of these tools. We report here a novel computational tool, TFnet, that provides accurate and comprehensive TF predictions from protein sequences. The accuracy of these predictions is substantially better than the results of the existing TF predictors and methods. In particular, it outperforms comparable methods significantly when sequence similarity to other known sequences in the database drops below 40%. Ablation tests reveal that the high predictive performance stems from innovative ways used in TFnet to derive sequence Position-Specific Scoring Matrix (PSSM) and encode inputs.