1
Chu HY, Peng J, Mou Y, Wong ASL. Quantifying Protein-Nucleic Acid Interactions for Engineering Useful CRISPR-Cas9 Genome-Editing Variants. Methods Mol Biol 2025; 2870:227-243. [PMID: 39543038 DOI: 10.1007/978-1-0716-4213-9_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2024]
Abstract
Numerous high-specificity Cas9 variants have been engineered for precision genome editing. These variants typically harbor multiple mutations designed to alter the Cas9-single guide RNA (sgRNA)-DNA complex interactions for reduced off-target cleavage. By dissecting the contributions of individual mutations, we attempt to derive principles for designing high-specificity Cas9 variants. Here, we computationally modeled the specificity harnessing mutations of the widely used Cas9 isolated from Streptococcus pyogenes (SpCas9) and investigated their individual mutational effects. We quantified the mutational effects in terms of energy and contact changes by comparing the wild-type and mutant structures. We found that these mutations disrupt the protein-protein or protein-DNA contacts within the Cas9-sgRNA-DNA complex. We also identified additional impacted amino acid sites via energy changes that constitute the structural microenvironment encompassing the focal mutation, giving insights into how the mutations contribute to the high-specificity phenotype of SpCas9. Our method outlines a strategy to evaluate mutational effects that can facilitate rational design for Cas9 optimization.
Affiliation(s)
- Hoi Yee Chu, Jiaxing Peng, Yuanbiao Mou, Alan S L Wong
  - Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
  - Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
2
Wang X, Cheng X, Li Z, Ma S, Zhang H, Chen Z, Yao Y, Li Z, Zhong C, Li Y, Zhang Y, Menon V, Chao L, Li W, Fei T. A comprehensive benchmark for multiple highly efficient base editors with broad targeting scope. bioRxiv 2024:2024.12.17.628899. [PMID: 39763781 PMCID: PMC11702641 DOI: 10.1101/2024.12.17.628899] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2025]
Abstract
As the toolbox of base editors (BEs) expands, selecting appropriate BE and guide RNA (gRNA) to achieve optimal editing efficiency and outcome for a given target becomes challenging. Here, we construct a set of 10 adenine and cytosine BEs with high activity and broad targeting scope, and comprehensively evaluate their editing profiles and properties head-to-head with 34,040 BE-gRNA-target combinations using genomically integrated long targets and tiling gRNA strategies. Interestingly, we observe widespread non-canonical protospacer adjacent motifs (PAMs) for these BEs. Using this large-scale benchmark data, we build a deep learning model, named BEEP (Base Editing Efficiency Predictor), for predicting the editing efficiency and outcome of these BEs. Guided by BEEP, we experimentally test and validate the installment of 3,558 disease-associated single nucleotide variants (SNVs) via BEs, including 20.1% of target sites that would be generally considered as "uneditable", due to the lack of canonical PAMs. We further predict candidate BE-gRNA-target combinations for modeling 1,752,651 ClinVar SNVs. We also identify several cancer-associated SNVs that drive the resistance to BRAF inhibitors in melanoma. These efforts benchmark the performance and illuminate the capabilities of multiple highly useful BEs for interrogating functional SNVs. A practical webserver (http://beep.weililab.org/) is freely accessible to guide the selection of optimal BEs and gRNAs for a given target.
Affiliation(s)
- Xiaofeng Wang, Zexu Li, Shixin Ma, Han Zhang, Zhisong Chen, Yingjia Yao, Zihan Li, Chunge Zhong, You Li, Yunhan Zhang, Teng Fei
  - Key Laboratory of Bioresource Research and Development of Liaoning Province, College of Life and Health Sciences, Northeastern University, Shenyang, 110819, China
  - National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Northeastern University, Shenyang, 110819, China
  - Key Laboratory of Data Analytics and Optimization for Smart Industry (Northeastern University), Ministry of Education, Shenyang, 110819, China
  - Foshan Graduate School of Innovation, Northeastern University, Foshan 528311, China
- Xiaolong Cheng, Vipin Menon, Lumen Chao, Wei Li
  - Center for Genetic Medicine Research, Children’s National Hospital, 111 Michigan Ave NW, Washington, DC, 20010, USA
  - Department of Genomics and Precision Medicine, George Washington University, 111 Michigan Ave NW, Washington, DC, 20010, USA
3
Chu HY, Fong JHC, Thean DGL, Zhou P, Fung FKC, Huang Y, Wong ASL. Accurate top protein variant discovery via low-N pick-and-validate machine learning. Cell Syst 2024; 15:193-203.e6. [PMID: 38340729 DOI: 10.1016/j.cels.2024.01.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/21/2023] [Revised: 10/11/2023] [Accepted: 01/18/2024] [Indexed: 02/12/2024]
Abstract
A strategy to obtain the greatest number of best-performing variants with the least amount of experimental effort over the vast combinatorial mutational landscape would have enormous utility in boosting resource producibility for protein engineering. Toward this goal, we present a simple and effective machine learning-based strategy that outperforms other state-of-the-art methods. Our strategy integrates zero-shot prediction and multi-round sampling to direct active learning via experimenting with only a few predicted top variants. We find that four rounds of low-N pick-and-validate sampling of 12 variants for machine learning yielded the best accuracy of up to 92.6% in selecting the true top 1% variants in combinatorial mutant libraries, whereas two rounds of 24 variants can also be used. We demonstrate our strategy in successfully discovering high-performance protein variants from diverse families including the CRISPR-based genome editors, supporting its generalizable application for solving protein engineering tasks. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Hoi Yee Chu, Peng Zhou, Frederic K C Fung, Alan S L Wong
  - Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
  - Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
- John H C Fong, Dawn G L Thean
  - Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Yuanhua Huang
  - School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
  - Department of Statistics and Actuarial Science, The University of Hong Kong, Pokfulam, Hong Kong SAR, China