1
Winter E, Emiliani F, Cook A, Abderrahim A, McKenna AH. BASELINE: A CRISPR Base Editing Platform for Mammalian-Scale Single-Cell Lineage Tracing. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2025:2025.03.19.644238. [PMID: 40166145 PMCID: PMC11957144 DOI: 10.1101/2025.03.19.644238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2025]
Abstract
A cell's fate is shaped by its inherited state, or lineage, and the ever-shifting context of its environment. CRISPR-based recording technologies are a promising solution to map the lineage of a developing system, yet challenges remain regarding single-cell recovery, engineering complexity, and scale. Here, we introduce BASELINE, which uses base editing to generate high-resolution lineage trees in conjunction with single-cell profiling. BASELINE uses the Cas12a adenine base editor to irreversibly edit nucleotides within 50 synthetic target sites, which are integrated multiple times into a cell's genome. We show that BASELINE accumulates lineage-specific marks over a wide range of biologically relevant intervals, recording more than 4300 bits of information in a model of pancreatic cancer, a 50X increase over existing technologies. Single-cell sequencing reveals high-fidelity capture of these recorders, recovering lineage reconstructions up to 40 cell divisions deep, within the estimated range of mammalian development. We expect BASELINE to apply to a wide range of lineage-tracing projects in development and disease, especially those in which cellular engineering makes small, more distributed systems challenging.
2
Saxe R, Stuart H, Marshall A, Abdullahi F, Chen Z, Emiliani F, McKenna A. Hierarchical Lineage Tracing Reveals Diverse Pathways of AML Treatment Resistance. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2025:2025.02.27.640600. [PMID: 40093111 PMCID: PMC11908168 DOI: 10.1101/2025.02.27.640600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2025]
Abstract
Cancer cells adapt to treatment, leading to the emergence of clones that are more aggressive and resistant to anti-cancer therapies. We have a limited understanding of the development of treatment resistance as we lack technologies to map the evolution of cancer under the selective pressure of treatment. To address this, we developed a hierarchical, dynamic lineage tracing method called FLARE (Following Lineage Adaptation and Resistance Evolution). We use this technique to track the progression of acute myeloid leukemia (AML) cell lines through exposure to Cytarabine (AraC), a front-line treatment in AML, in vitro and in vivo. We map distinct cellular lineages in murine and human AML cell lines predisposed to AraC persistence and/or resistance via the upregulation of cell adhesion and motility pathways. Additionally, we highlight the heritable expression of immunoproteasome 11S regulatory cap subunits as a potential mechanism aiding AML cell survival, proliferation, and immune escape in vivo. Finally, we validate the clinical relevance of these signatures in the TARGET-AML cohort, with a bisected response in blood and bone marrow. Our findings reveal a broad spectrum of resistance signatures attributed to significant cell transcriptional changes. To our knowledge, this is the first application of dynamic lineage tracing to unravel treatment response and resistance in cancer, and we expect FLARE to be a valuable tool in dissecting the evolution of resistance in a wide range of tumor types.
Affiliation(s)
- Rachel Saxe
  - Molecular and Systems Biology, Dartmouth College, Hanover, NH
  - Molecular and Cellular Biology Program, Dartmouth College, Hanover, NH
- Hannah Stuart
  - Molecular and Systems Biology, Dartmouth College, Hanover, NH
  - Quantitative Biomedical Science Program, Dartmouth College, Lebanon, NH
- Abigail Marshall
  - Molecular and Systems Biology, Dartmouth College, Hanover, NH
  - Molecular and Cellular Biology Program, Dartmouth College, Hanover, NH
- Fahiima Abdullahi
  - The Dartmouth MD-PhD Undergraduate Summer Fellowship Program, Lebanon, NH
- Zoë Chen
  - Dartmouth Cancer Center, Dartmouth College, Lebanon, NH
- Francesco Emiliani
  - Molecular and Systems Biology, Dartmouth College, Hanover, NH
  - Molecular and Cellular Biology Program, Dartmouth College, Hanover, NH
- Aaron McKenna
  - Molecular and Systems Biology, Dartmouth College, Hanover, NH
  - Dartmouth Cancer Center, Dartmouth College, Lebanon, NH
3
Ramírez Rojas AA, Brinkmann CK, Schindler D. Validation of Golden Gate Assemblies Using Highly Multiplexed Nanopore Amplicon Sequencing. Methods Mol Biol 2025; 2850:171-196. [PMID: 39363072 DOI: 10.1007/978-1-0716-4220-7_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2024]
Abstract
Golden Gate cloning has revolutionized synthetic biology. Its concept of modular, highly characterized libraries of parts that can be combined into higher order assemblies allows engineering principles to be applied to biological systems. The basic parts, typically stored in Level 0 plasmids, are sequence validated by the method of choice and can be combined into higher order assemblies on demand. Higher order assemblies are typically transcriptional units, and multiple transcriptional units can be assembled into multi-gene constructs. Higher order Golden Gate assembly based on defined and validated parts usually does not introduce sequence changes. Therefore, simple validation of the assemblies, e.g., by colony polymerase chain reaction (PCR) or restriction digest pattern analysis is sufficient. However, in many experimental setups, researchers do not use defined parts, but rather part libraries, resulting in assemblies of high combinatorial complexity where sequencing again becomes mandatory. Here, we present a detailed protocol for the use of a highly multiplexed dual barcode amplicon sequencing using the Nanopore sequencing platform for in-house sequence validation. The workflow, called DuBA.flow, is a start-to-finish procedure that provides all necessary steps from a single colony to the final easy-to-interpret sequencing report.
Affiliation(s)
- Daniel Schindler
  - Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
  - Center for Synthetic Microbiology (SYNMIKRO), Philipps-University Marburg, Marburg, Germany
4
Li W, Miller D, Liu X, Tosi L, Chkaiban L, Mei H, Hung PH, Parekkadan B, Sherlock G, Levy S. Arrayed in vivo barcoding for multiplexed sequence verification of plasmid DNA and demultiplexing of pooled libraries. Nucleic Acids Res 2024; 52:e47. [PMID: 38709890 PMCID: PMC11162764 DOI: 10.1093/nar/gkae332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 02/23/2024] [Accepted: 04/16/2024] [Indexed: 05/08/2024] Open
Abstract
Sequence verification of plasmid DNA is critical for many cloning and molecular biology workflows. To leverage high-throughput sequencing, several methods have been developed that add a unique DNA barcode to individual samples prior to pooling and sequencing. However, these methods require an individual plasmid extraction and/or in vitro barcoding reaction for each sample processed, limiting throughput and adding cost. Here, we develop an arrayed in vivo plasmid barcoding platform that enables pooled plasmid extraction and library preparation for Oxford Nanopore sequencing. This method has a high accuracy and recovery rate, and greatly increases throughput and reduces cost relative to other plasmid barcoding methods or Sanger sequencing. We use in vivo barcoding to sequence verify >45 000 plasmids and show that the method can be used to transform error-containing dispersed plasmid pools into sequence-perfect arrays or well-balanced pools. In vivo barcoding does not require any specialized equipment beyond a low-overhead Oxford Nanopore sequencer, enabling most labs to flexibly process hundreds to thousands of plasmids in parallel.
Affiliation(s)
- Weiyi Li
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Darach Miller
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Xianan Liu
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Lorenzo Tosi
  - Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
- Lamia Chkaiban
  - Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
- Han Mei
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Po-Hsiang Hung
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Biju Parekkadan
  - Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
- Gavin Sherlock
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Sasha F Levy
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
5
Nanda AS, Wu K, Irkliyenko I, Woo B, Ostrowski MS, Clugston AS, Sayles LC, Xu L, Satpathy AT, Nguyen HG, Alejandro Sweet-Cordero E, Goodarzi H, Kasinathan S, Ramani V. Direct transposition of native DNA for sensitive multimodal single-molecule sequencing. Nat Genet 2024; 56:1300-1309. [PMID: 38724748 PMCID: PMC11176058 DOI: 10.1038/s41588-024-01748-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Accepted: 04/08/2024] [Indexed: 05/23/2024]
Abstract
Concurrent readout of sequence and base modifications from long unamplified DNA templates by Pacific Biosciences of California (PacBio) single-molecule sequencing requires large amounts of input material. Here we adapt Tn5 transposition to introduce hairpin oligonucleotides and fragment (tagment) limiting quantities of DNA for generating PacBio-compatible circular molecules. We developed two methods that implement tagmentation and use 90-99% less input than current protocols: (1) single-molecule real-time sequencing by tagmentation (SMRT-Tag), which allows detection of genetic variation and CpG methylation; and (2) single-molecule adenine-methylated oligonucleosome sequencing assay by tagmentation (SAMOSA-Tag), which uses exogenous adenine methylation to add a third channel for probing chromatin accessibility. SMRT-Tag of 40 ng or more human DNA (approximately 7,000 cell equivalents) yielded data comparable to gold standard whole-genome and bisulfite sequencing. SAMOSA-Tag of 30,000-50,000 nuclei resolved single-fiber chromatin structure, CTCF binding and DNA methylation in patient-derived prostate cancer xenografts and uncovered metastasis-associated global epigenome disorganization. Tagmentation thus promises to enable sensitive, scalable and multimodal single-molecule genomics for diverse basic and clinical applications.
Affiliation(s)
- Arjun S Nanda
  - Gladstone Institute for Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
  - Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
- Ke Wu
  - Gladstone Institute for Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
- Iryna Irkliyenko
  - Gladstone Institute for Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
- Brian Woo
  - Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
  - Helen-Diller Cancer Center, San Francisco, CA, USA
- Megan S Ostrowski
  - Gladstone Institute for Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
- Andrew S Clugston
  - Helen-Diller Cancer Center, San Francisco, CA, USA
  - Department of Pediatrics, University of California, San Francisco, San Francisco, CA, USA
- Leanne C Sayles
  - Helen-Diller Cancer Center, San Francisco, CA, USA
  - Department of Pediatrics, University of California, San Francisco, San Francisco, CA, USA
- Lingru Xu
  - Helen-Diller Cancer Center, San Francisco, CA, USA
- Ansuman T Satpathy
  - Department of Pathology, Stanford University, Stanford, CA, USA
  - Parker Institute for Cancer Immunotherapy, San Francisco, CA, USA
  - Gladstone-University of California, San Francisco Institute for Genomic Immunology, Gladstone Institutes, San Francisco, CA, USA
- Hao G Nguyen
  - Helen-Diller Cancer Center, San Francisco, CA, USA
- E Alejandro Sweet-Cordero
  - Helen-Diller Cancer Center, San Francisco, CA, USA
  - Department of Pediatrics, University of California, San Francisco, San Francisco, CA, USA
- Hani Goodarzi
  - Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
  - Helen-Diller Cancer Center, San Francisco, CA, USA
  - Parker Institute for Cancer Immunotherapy, San Francisco, CA, USA
  - Bakar Computational Health Sciences Institute, San Francisco, CA, USA
- Sivakanthan Kasinathan
  - Gladstone-University of California, San Francisco Institute for Genomic Immunology, Gladstone Institutes, San Francisco, CA, USA
  - Division of Rheumatology, Department of Pediatrics, Stanford University, Stanford, CA, USA
- Vijay Ramani
  - Gladstone Institute for Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
  - Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
  - Helen-Diller Cancer Center, San Francisco, CA, USA
  - Bakar Computational Health Sciences Institute, San Francisco, CA, USA
6
McGuffie MJ, Barrick JE. Identifying widespread and recurrent variants of genetic parts to improve annotation of engineered DNA sequences. PLoS One 2024; 19:e0304164. [PMID: 38805426 PMCID: PMC11132462 DOI: 10.1371/journal.pone.0304164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Accepted: 05/07/2024] [Indexed: 05/30/2024] Open
Abstract
Engineered plasmids have been workhorses of recombinant DNA technology for nearly half a century. Plasmids are used to clone DNA sequences encoding new genetic parts and to reprogram cells by combining these parts in new ways. Historically, many genetic parts on plasmids were copied and reused without routinely checking their DNA sequences. With the widespread use of high-throughput DNA sequencing technologies, we now know that plasmids often contain variants of common genetic parts that differ slightly from their canonical sequences. Because the exact provenance of a genetic part on a particular plasmid is usually unknown, it is difficult to determine whether these differences arose due to mutations during plasmid construction and propagation or due to intentional editing by researchers. In either case, it is important to understand how the sequence changes alter the properties of the genetic part. We analyzed the sequences of over 50,000 engineered plasmids using depositor metadata and a metric inspired by the natural language processing field. We detected 217 uncatalogued genetic part variants that were especially widespread or were likely the result of convergent evolution or engineering. Several of these uncatalogued variants are known mutants of plasmid origins of replication or antibiotic resistance genes that are missing from current annotation databases. However, most are uncharacterized, and 3/5 of the plasmids we analyzed contained at least one of the uncatalogued variants. Our results include a list of genetic parts to prioritize for refining engineered plasmid annotation pipelines, highlight widespread variants of parts that warrant further investigation to see whether they have altered characteristics, and suggest cases where unintentional evolution of plasmid parts may be affecting the reliability and reproducibility of science.
Affiliation(s)
- Matthew J. McGuffie
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas, United States of America
- Jeffrey E. Barrick
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas, United States of America
7
Vegh P, Donovan S, Rosser S, Stracquadanio G, Fragkoudis R. Biofoundry-Scale DNA Assembly Validation Using Cost-Effective High-Throughput Long-Read Sequencing. ACS Synth Biol 2024; 13:683-686. [PMID: 38329009 PMCID: PMC10877595 DOI: 10.1021/acssynbio.3c00589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 01/16/2024] [Accepted: 01/16/2024] [Indexed: 02/09/2024]
Abstract
Biofoundries are automated high-throughput facilities specializing in the design, construction, and testing of engineered/synthetic DNA constructs (plasmids), often from genetic parts. A critical step of this process is assessing the fidelity of the assembled DNA construct to the desired design. Current methods utilized for this purpose are restriction digest or PCR followed by fragment analysis and sequencing. The Edinburgh Genome Foundry (EGF) has recently established a single-molecule sequencing quality control step using the Oxford Nanopore sequencing technology, along with a companion Nextflow pipeline and a Python package, to perform in-depth analysis and generate a detailed report. Our software enables researchers working with plasmids, including biofoundry scientists, to rapidly analyze and interpret sequencing data. In conclusion, we have created a laboratory and software protocol that validates assembled, cloned, or edited plasmids, using Nanopore long-reads, which can serve as a useful resource for the genetics, synthetic biology, and sequencing communities.
Affiliation(s)
- Peter Vegh
  - Edinburgh Genome Foundry, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, United Kingdom
- Sophie Donovan
  - Edinburgh Genome Foundry, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, United Kingdom
- Susan Rosser
  - Edinburgh Genome Foundry, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, United Kingdom
- Giovanni Stracquadanio
  - Edinburgh Genome Foundry, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, United Kingdom
- Rennos Fragkoudis
  - Edinburgh Genome Foundry, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, United Kingdom
  - Department of Biochemistry and Biotechnology, University of Thessaly, 41500 Larissa, Greece
8
Ramírez Rojas A, Brinkmann CK, Köbel TS, Schindler D. DuBA.flow - A Low-Cost, Long-Read Amplicon Sequencing Workflow for the Validation of Synthetic DNA Constructs. ACS Synth Biol 2024; 13:457-465. [PMID: 38295293 PMCID: PMC10877597 DOI: 10.1021/acssynbio.3c00522] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 08/25/2023] [Revised: 10/27/2023] [Accepted: 11/13/2023] [Indexed: 02/02/2024]
Abstract
Modern biological science, especially synthetic biology, relies heavily on the construction of DNA elements, often in the form of plasmids. Plasmids are used for a variety of applications, including the expression of proteins for subsequent purification, the expression of heterologous pathways for the production of valuable compounds, and the study of biological functions and mechanisms. For all applications, a critical step after the construction of a plasmid is its sequence validation. The traditional method for sequence determination is Sanger sequencing, which is limited to approximately 1000 bp per reaction. Here, we present a highly scalable in-house method for rapid validation of amplified DNA sequences using long-read Nanopore sequencing. We developed two-step amplicon and transposase strategies to provide maximum flexibility for dual barcode sequencing. We also provide an automated analysis pipeline to quickly and reliably analyze sequencing results and provide easy-to-interpret results for each sample. The user-friendly DuBA.flow start-to-finish pipeline is widely applicable. Furthermore, we show that construct validation using DuBA.flow can be performed by barcoded colony PCR amplicon sequencing, thus accelerating research.
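As a rough illustration of the dual barcode idea mentioned in the abstract, the sketch below assigns a read to a sample by the pair of barcodes found at its two ends. This is not the DuBA.flow pipeline: the barcode length, the sample sheet, the exact-match rule, and the convention that the reverse barcode appears as its reverse complement at the 3' end are all assumptions made for the example, and the names `assign_sample` and `BARCODE_LEN` are hypothetical.

```python
from typing import Dict, Optional, Tuple

BARCODE_LEN = 4  # hypothetical barcode length, chosen only for this example

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def assign_sample(
    read: str, sample_sheet: Dict[Tuple[str, str], str]
) -> Optional[str]:
    """Look up the sample for a read from its (forward, reverse) barcode pair.

    Assumptions: barcodes sit at the very ends of the read, matching is
    exact, and the reverse barcode is read as its reverse complement.
    Both strand orientations are tried, since long reads can come from
    either strand. Returns None when no barcode pair matches.
    """
    if len(read) < 2 * BARCODE_LEN:
        return None
    for oriented in (read, reverse_complement(read)):
        fwd = oriented[:BARCODE_LEN]
        rev = reverse_complement(oriented[-BARCODE_LEN:])
        sample = sample_sheet.get((fwd, rev))
        if sample is not None:
            return sample
    return None


# Hypothetical sample sheet: one forward barcode shared by two colonies,
# distinguished only by their reverse barcodes.
sample_sheet = {
    ("ACGT", "TTAG"): "colony_A1",
    ("ACGT", "GGCA"): "colony_A2",
}

read = "ACGT" + "CCCCGGGG" + reverse_complement("TTAG")
print(assign_sample(read, sample_sheet))
```

Pairing two barcodes means that n forward and m reverse barcodes can label up to n × m samples, which is the usual motivation for dual barcoding. A practical demultiplexer would also need to tolerate sequencing errors in the barcodes, which this exact-match sketch does not.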
Affiliation(s)
- Adán A. Ramírez Rojas
  - Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch-Str. 10, 35043 Marburg, Germany
- Cedric K. Brinkmann
  - Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch-Str. 10, 35043 Marburg, Germany
- Tania S. Köbel
  - Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch-Str. 10, 35043 Marburg, Germany
- Daniel Schindler
  - Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch-Str. 10, 35043 Marburg, Germany
  - Center for Synthetic Microbiology, Philipps-University Marburg, Karl-von-Frisch-Str. 14, 35032 Marburg, Germany
9
Li W, Miller D, Liu X, Tosi L, Chkaiban L, Mei H, Hung PH, Parekkadan B, Sherlock G, Levy SF. Arrayed in vivo barcoding for multiplexed sequence verification of plasmid DNA and demultiplexing of pooled libraries. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.13.562064. [PMID: 37873145 PMCID: PMC10592806 DOI: 10.1101/2023.10.13.562064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Sequence verification of plasmid DNA is critical for many cloning and molecular biology workflows. To leverage high-throughput sequencing, several methods have been developed that add a unique DNA barcode to individual samples prior to pooling and sequencing. However, these methods require an individual plasmid extraction and/or in vitro barcoding reaction for each sample processed, limiting throughput and adding cost. Here, we develop an arrayed in vivo plasmid barcoding platform that enables pooled plasmid extraction and library preparation for Oxford Nanopore sequencing. This method has a high accuracy and recovery rate, and greatly increases throughput and reduces cost relative to other plasmid barcoding methods or Sanger sequencing. We use in vivo barcoding to sequence verify >45,000 plasmids and show that the method can be used to transform error-containing dispersed plasmid pools into sequence-perfect arrays or well-balanced pools. In vivo barcoding does not require any specialized equipment beyond a low-overhead Oxford Nanopore sequencer, enabling most labs to flexibly process hundreds to thousands of plasmids in parallel.
Affiliation(s)
- Weiyi Li
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Darach Miller
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Xianan Liu
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Lorenzo Tosi
  - Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
- Lamia Chkaiban
  - Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
- Han Mei
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Po-Hsiang Hung
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Biju Parekkadan
  - Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
- Gavin Sherlock
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Sasha F Levy
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
  - Present Address: BacStitch DNA, Los Altos, CA, USA
10
Wei X, Penkauskas T, Reiner JE, Kennard C, Uline MJ, Wang Q, Li S, Aksimentiev A, Robertson JW, Liu C. Engineering Biological Nanopore Approaches toward Protein Sequencing. ACS NANO 2023; 17:16369-16395. [PMID: 37490313 PMCID: PMC10676712 DOI: 10.1021/acsnano.3c05628] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Indexed: 07/26/2023]
Abstract
Biotechnological innovations have vastly improved the capacity to perform large-scale protein studies, while the methods we have for identifying and quantifying individual proteins are still inadequate to perform protein sequencing at the single-molecule level. Nanopore-inspired systems devoted to understanding how single molecules behave have been extensively developed for applications in genome sequencing. These nanopore systems are emerging as prominent tools for protein identification, detection, and analysis, suggesting realistic prospects for novel protein sequencing. This review summarizes recent advances in biological nanopore sensors toward protein sequencing, from the identification of individual amino acids to the controlled translocation of peptides and proteins, with attention focused on device and algorithm development and the delineation of molecular mechanisms with the aid of simulations. Specifically, the review aims to offer recommendations for the advancement of nanopore-based protein sequencing from an engineering perspective, highlighting the need for collaborative efforts across multiple disciplines. These efforts should include chemical conjugation, protein engineering, molecular simulation, machine-learning-assisted identification, and electronic device fabrication to enable practical implementation in real-world scenarios.
Affiliation(s)
- Xiaojun Wei
  - Biomedical Engineering Program, University of South Carolina, Columbia, SC 29208, United States
  - Department of Chemical Engineering, University of South Carolina, Columbia, SC 29208, United States
- Tadas Penkauskas
  - Biophysics and Biomedical Measurement Group, Microsystems and Nanotechnology Division, National Institute of Standards and Technology, Gaithersburg, MD 20899, United States
  - School of Engineering, Brown University, Providence, RI 02912, United States
- Joseph E. Reiner
  - Department of Physics, Virginia Commonwealth University, Richmond, VA 23284, United States
- Celeste Kennard
  - Biomedical Engineering Program, University of South Carolina, Columbia, SC 29208, United States
- Mark J. Uline
  - Biomedical Engineering Program, University of South Carolina, Columbia, SC 29208, United States
  - Department of Chemical Engineering, University of South Carolina, Columbia, SC 29208, United States
- Qian Wang
  - Department of Chemistry and Biochemistry, University of South Carolina, Columbia, SC 29208, United States
- Sheng Li
  - School of Data Science, University of Virginia, Charlottesville, VA 22903, United States
- Aleksei Aksimentiev
  - Department of Physics and Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States
- Joseph W.F. Robertson
  - Biophysics and Biomedical Measurement Group, Microsystems and Nanotechnology Division, National Institute of Standards and Technology, Gaithersburg, MD 20899, United States
- Chang Liu
  - Biomedical Engineering Program, University of South Carolina, Columbia, SC 29208, United States
  - Department of Chemical Engineering, University of South Carolina, Columbia, SC 29208, United States
11
Mumm C, Drexel ML, McDonald TL, Diehl AG, Switzenberg JA, Boyle AP. Multiplexed long-read plasmid validation and analysis using OnRamp. Genome Res 2023; 33:741-749. [PMID: 37156622 PMCID: PMC10317119 DOI: 10.1101/gr.277369.122] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/29/2022] [Accepted: 05/03/2023] [Indexed: 05/10/2023]
Abstract
Recombinant plasmid vectors are versatile tools that have facilitated discoveries in molecular biology, genetics, proteomics, and many other fields. As the enzymatic and bacterial processes used to create recombinant DNA can introduce errors, sequence validation is an essential step in plasmid assembly. Sanger sequencing is the current standard for plasmid validation; however, this method is limited by an inability to sequence through complex secondary structure and lacks scalability when applied to full-plasmid sequencing of multiple plasmids owing to read-length limits. Although high-throughput sequencing does provide full-plasmid sequencing at scale, it is impractical and costly when used outside of library-scale validation. Here, we present Oxford nanopore-based rapid analysis of multiplexed plasmids (OnRamp), an alternative method for routine plasmid validation that combines the advantages of high-throughput sequencing's full-plasmid coverage and scalability with Sanger's affordability and accessibility by leveraging nanopore's long-read sequencing technology. We include customized wet-laboratory protocols for plasmid preparation along with a pipeline designed for analysis of read data obtained using these protocols. This analysis pipeline is deployed on the OnRamp web app, which generates alignments between actual and predicted plasmid sequences, quality scores, and read-level views. OnRamp is designed to be broadly accessible regardless of programming experience to facilitate more widespread adoption of long-read sequencing for routine plasmid validation. Here we describe the OnRamp protocols and pipeline and show our ability to obtain full sequences from pooled plasmids while detecting sequence variation even in regions of high secondary structure at less than half the cost of equivalent Sanger sequencing.
Affiliation(s)
- Camille Mumm
  - Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Melissa L Drexel
  - Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Torrin L McDonald
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Adam G Diehl
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Jessica A Switzenberg
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Alan P Boyle
  - Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
12
McGuffie MJ, Barrick JE. Identifying widespread and recurrent variants of genetic parts to improve annotation of engineered DNA sequences. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.04.10.536277. [PMID: 37090600 PMCID: PMC10120640 DOI: 10.1101/2023.04.10.536277] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
Engineered plasmids have been workhorses of recombinant DNA technology for nearly half a century. Plasmids are used to clone DNA sequences encoding new genetic parts and to reprogram cells by combining these parts in new ways. Historically, many genetic parts on plasmids were copied and reused without routinely checking their DNA sequences. With the widespread use of high-throughput DNA sequencing technologies, we now know that plasmids often contain variants of common genetic parts that differ slightly from their canonical sequences. Because the exact provenance of a genetic part on a particular plasmid is usually unknown, it is difficult to determine whether these differences arose due to mutations during plasmid construction and propagation or due to intentional editing by researchers. In either case, it is important to understand how the sequence changes alter the properties of the genetic part. We analyzed the sequences of over 50,000 engineered plasmids using depositor metadata and a metric inspired by the natural language processing field. We detected 217 uncatalogued genetic part variants that were especially widespread or were likely the result of convergent evolution or engineering. Several of these uncatalogued variants are known mutants of plasmid origins of replication or antibiotic resistance genes that are missing from current annotation databases. However, most are uncharacterized, and 3/5 of the plasmids we analyzed contained at least one of the uncatalogued variants. Our results include a list of genetic parts to prioritize for refining engineered plasmid annotation pipelines, highlight widespread variants of parts that warrant further investigation to see whether they have altered characteristics, and suggest cases where unintentional evolution of plasmid parts may be affecting the reliability and reproducibility of science.
Affiliation(s)
- Matthew J. McGuffie
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas, United States
- Jeffrey E. Barrick
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas, United States
|
13
|
Brown SD, Dreolini L, Wilson JF, Balasundaram M, Holt RA. Complete sequence verification of plasmid DNA using the Oxford Nanopore Technologies' MinION device. BMC Bioinformatics 2023; 24:116. [PMID: 36964503 PMCID: PMC10039527 DOI: 10.1186/s12859-023-05226-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 11/09/2022] [Accepted: 03/11/2023] [Indexed: 03/26/2023] Open
Abstract
BACKGROUND: Sequence verification is essential for plasmids used as critical reagents or therapeutic products. Typically, high-quality plasmid sequence is achieved through capillary-based Sanger sequencing, requiring customized sets of primers for each plasmid. This process can become expensive, particularly for applications where the validated sequence needs to be produced within a regulated and quality-controlled environment for downstream clinical research applications.
RESULTS: Here, we describe a cost-effective and accurate plasmid sequencing and consensus generation procedure using the Oxford Nanopore Technologies' MinION device as an alternative to capillary-based plasmid sequencing options. This procedure can verify the identity of a pure population of plasmid, either confirming it matches the known and expected sequence, or identifying mutations present in the plasmid if any exist. We use a full MinION flow cell per plasmid, maximizing available data and allowing for stringent quality filters. Pseudopairing reads for consensus base calling reduces read error rates from 5.3% to 0.53%, and our pileup consensus approach provides per-base counts and confidence scores, allowing for interpretation of the certainty of the resulting consensus sequences. For pure plasmid samples, we demonstrate 100% accuracy in the resulting consensus sequence, and the sensitivity to detect small mutations such as insertions, deletions, and single nucleotide variants. In test cases where the sequenced pool of plasmids contains subclonal templates, detection sensitivity is similar to that of traditional capillary sequencing.
CONCLUSIONS: Our pipeline can provide significant cost savings compared to outsourcing clinical-grade sequencing of plasmids, making generation of high-quality plasmid sequence for clinical sequence verification more accessible. While other long-read-based methods offer higher throughput and lower cost, our pipeline produces complete and accurate sequence verification for cases where absolute sequence accuracy is required.
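The idea of a pileup consensus with per-base counts and confidence scores, mentioned in the abstract above, can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the reads are already aligned into equal-length strings (with `-` marking a deletion), and the function name `pileup_consensus`, the `min_fraction` threshold, and the use of a simple majority fraction as the "confidence" are all hypothetical choices made for illustration.

```python
from collections import Counter


def pileup_consensus(aligned_reads, min_fraction=0.5):
    """Toy pileup consensus (illustrative only, not the published method).

    aligned_reads: equal-length strings already placed in reference
    coordinates; '-' marks a deletion in that read.
    min_fraction: hypothetical threshold below which 'N' is reported.
    """
    if not aligned_reads:
        raise ValueError("need at least one read")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be pre-aligned to the same length")

    consensus = []
    columns = []
    for pos in range(length):
        # Count every symbol observed at this column of the pileup.
        counts = Counter(read[pos] for read in aligned_reads)
        base, support = counts.most_common(1)[0]
        # Confidence here is simply the fraction of reads that agree.
        fraction = support / len(aligned_reads)
        consensus.append(base if fraction >= min_fraction else "N")
        columns.append({"counts": dict(counts), "confidence": fraction})
    return "".join(consensus), columns


reads = ["ACGT", "ACGT", "ACCT", "AC-T"]
sequence, columns = pileup_consensus(reads)
```

Inspecting `columns[2]` in this example shows how a position with disagreeing reads keeps its raw counts next to the called base, which is what lets a reader judge how certain each consensus position is.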
Affiliation(s)
- Scott D Brown
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, 675 W 10th Ave, Vancouver, BC, V5Z 1L3, Canada
- Lisa Dreolini
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, 675 W 10th Ave, Vancouver, BC, V5Z 1L3, Canada
- Jessica F Wilson
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, 675 W 10th Ave, Vancouver, BC, V5Z 1L3, Canada
- Miruna Balasundaram
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, 675 W 10th Ave, Vancouver, BC, V5Z 1L3, Canada
- Robert A Holt
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, 675 W 10th Ave, Vancouver, BC, V5Z 1L3, Canada
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, SSB8166 - 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
  - Department of Medical Genetics, University of British Columbia, C201 - 4500 Oak Street, 675 W 10th Ave, Vancouver, BC, V6H 3N1, Canada
|
14
|
Pasin F. Assembly of plant virus agroinfectious clones using biological material or DNA synthesis. STAR Protoc 2022; 3:101716. [PMID: 36149792 PMCID: PMC9519601 DOI: 10.1016/j.xpro.2022.101716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Revised: 07/29/2022] [Accepted: 08/26/2022] [Indexed: 01/26/2023] Open
Abstract
Infectious clone technology is universally applied for biological characterization and engineering of viruses. This protocol describes procedures that implement synthetic biology advances for streamlined assembly of virus infectious clones. Here, I detail homology-based cloning using biological material, as well as SynViP assembly using type IIS restriction enzymes and chemically synthesized DNA fragments. The assembled virus clones are based on compact T-DNA binary vectors of the pLX series and are delivered to host plants by Agrobacterium-mediated inoculation. For complete details on the use and execution of this protocol, please refer to Pasin et al. (2017, 2018) and Pasin (2021).
Affiliation(s)
- Fabio Pasin
  - Instituto de Biología Molecular y Celular de Plantas (IBMCP), Consejo Superior de Investigaciones Científicas - Universitat Politècnica de València (CSIC-UPV), 46011 Valencia, Spain
|