1
Fan Q, Zhao X, Li J, Liu R, Liu M, Feng Q, Long Y, Fu Y, Zhai J, Pan Q, Li Y. De novo non-canonical nanopore basecalling enables private communication using heavily-modified DNA data at single-molecule level. Nat Commun 2025; 16:4099. [PMID: 40316536 PMCID: PMC12048662 DOI: 10.1038/s41467-025-59357-2] [Received: 08/06/2024] [Accepted: 04/16/2025] [Indexed: 05/04/2025] Open
Abstract
Hiding messages in DNA molecules by employing chemical modifications has been suggested for private data storage and transmission at high information density. However, rapidly decoding these "molecular keys" with corresponding basecallers remains challenging. We present DeepSME, a nanopore sequencing and deep-learning based framework towards single-molecule encryption, demonstrated by using 5-hydroxymethylcytosine (5hmC) substitution for individual nucleotide recognition rather than sequential interactions. This non-natural, motif-insensitive methylation disrupts the ion current, resulting in a readout failure of 67.2%-100% and concealing the private information within the DNA. We further develop an alignment-free DeepSME basecaller as a key to reconstitute the digital information. Our three-stage training pipeline expands the k-mer size from 4^6 to 4^9, achieving over 92% precision and recall from scratch. DeepSME deciphers fully 5hmC-concealed text and images within 16× coverage depth with an F1-score of 86.4%, surpassing all state-of-the-art basecallers. Demonstrated on edge computing devices, DeepSME holds strong potential for DNA-based private communications and broader bioengineering and medical applications.
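The abstract reports precision, recall, and an F1-score. As a reminder of how these metrics relate, the F1-score is the harmonic mean of precision and recall. The short sketch below illustrates that standard definition only; the function name `f1_score` and the input values are illustrative assumptions, not figures or code from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative inputs only (not results from the paper):
# when precision and recall are equal, F1 equals that shared value.
example = f1_score(0.92, 0.92)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a high F1-score requires both precision and recall to be high.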
Affiliation(s)
- Qingyuan Fan
  - School of Microelectronics, MOE Engineering Research Center of Integrated Circuits for Next Generation Communications, Southern University of Science and Technology, Shenzhen, China
- Xuyang Zhao
  - School of Microelectronics, MOE Engineering Research Center of Integrated Circuits for Next Generation Communications, Southern University of Science and Technology, Shenzhen, China
- Junyao Li
  - School of Microelectronics, MOE Engineering Research Center of Integrated Circuits for Next Generation Communications, Southern University of Science and Technology, Shenzhen, China
- Ronghui Liu
  - School of Microelectronics, MOE Engineering Research Center of Integrated Circuits for Next Generation Communications, Southern University of Science and Technology, Shenzhen, China
- Ming Liu
  - School of Medicine, Southern University of Science and Technology, Shenzhen, China
- Qishun Feng
  - National Clinical Research Center for Infectious Diseases, Shenzhen Third People's Hospital, The Second Affiliated Hospital of Southern University of Science and Technology, Shenzhen, China
- Yanping Long
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
- Yang Fu
  - School of Medicine, Southern University of Science and Technology, Shenzhen, China
- Jixian Zhai
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
- Qing Pan
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
- Yi Li
  - School of Microelectronics, MOE Engineering Research Center of Integrated Circuits for Next Generation Communications, Southern University of Science and Technology, Shenzhen, China.
2
Lea-Smith DJ, Hassard F, Coulon F, Partridge N, Horsfall L, Parker KDJ, Smith RDJ, McCarthy RR, McKew B, Gutierrez T, Kumar V, Dotro G, Yang Z, Krasnogor N. Engineering biology applications for environmental solutions: potential and challenges. Nat Commun 2025; 16:3538. [PMID: 40229265 PMCID: PMC11997111 DOI: 10.1038/s41467-025-58492-0] [Received: 07/24/2024] [Accepted: 03/24/2025] [Indexed: 04/16/2025] Open
Abstract
Engineering biology applies synthetic biology to address global environmental challenges like bioremediation, biosequestration, pollutant monitoring, and resource recovery. This perspective outlines innovations in engineering biology, its integration with other technologies (e.g., nanotechnology, IoT, AI), and commercial ventures leveraging these advancements. We also discuss commercialisation and scaling challenges, biosafety and biosecurity considerations including biocontainment strategies, social and political dimensions, and governance issues that must be addressed for successful real-world implementation. Finally, we highlight future perspectives and propose strategies to overcome existing hurdles, aiming to accelerate the adoption of engineering biology for environmental solutions.
Affiliation(s)
- Natalio Krasnogor
  - GitLife Biotech Ltd, Newcastle upon Tyne, UK.
  - Newcastle University, Newcastle upon Tyne, UK.
3
Hernandez SI, Peccoud SJ, Berezin CT, Peccoud J. Self-documenting plasmids. Trends Biotechnol 2025:S0167-7799(25)00095-2. [PMID: 40340197 DOI: 10.1016/j.tibtech.2025.03.010] [Received: 12/31/2024] [Revised: 03/10/2025] [Accepted: 03/11/2025] [Indexed: 05/10/2025]
Abstract
Plasmids are the workhorse of biotechnology. These small DNA molecules are used to produce recombinant proteins and to engineer living organisms. They can be regarded as the blueprints of many biotechnology products. Therefore, it is critical to ensure that the sequences of these DNA molecules match their intended designs. Yet, plasmid verification remains challenging. To secure the exchange of plasmids in research and development workflows, we have developed self-documenting plasmids that encode information about themselves in their own DNA molecules. Users of self-documenting plasmids can retrieve critical information about the plasmid without prior knowledge of the plasmid identity. The insertion of documentation in the plasmid sequence does not preclude their propagation in bacteria or functional fluorescent protein expression in mammalian cells. This technology simplifies plasmid verification, hardens supply chains, and has the potential to transform the protection of intellectual property (IP) in the life sciences.
Affiliation(s)
- Sarah I Hernandez
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, CO, USA
- Samuel J Peccoud
  - GenoFAB, Fort Collins, CO, USA; Department of Electrical Engineering, Colorado State University, Fort Collins, CO, USA
- Casey-Tyler Berezin
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, CO, USA
- Jean Peccoud
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, CO, USA; GenoFAB, Fort Collins, CO, USA; Department of Computer Sciences, Colorado State University, Fort Collins, CO, USA; School of Biomedical Engineering, Colorado State University, Fort Collins, CO, USA; Department of Systems Engineering, Colorado State University, Fort Collins, CO, USA.
4
Hernandez SI, Peccoud SJ, Berezin CT, Peccoud J. Self-Documenting Plasmids. bioRxiv 2024:2024.10.29.620927. [PMID: 39554086 PMCID: PMC11565722 DOI: 10.1101/2024.10.29.620927] [Indexed: 11/19/2024]
Abstract
Plasmids are the workhorse of biotechnology. These small DNA molecules are used to produce recombinant proteins and to engineer living organisms. They can be regarded as the blueprints of many biotechnology products. It is, therefore, critical to ensure that the sequences of these DNA molecules match their intended designs. Yet, plasmid verification remains challenging. To secure the exchange of plasmids in research and development workflows, we have developed self-documenting plasmids that encode information about themselves in their own DNA molecules. Users of self-documenting plasmids can retrieve critical information about the plasmid without prior knowledge of the plasmid identity. The insertion of documentation in the plasmid sequence does not adversely affect their propagation in bacteria and does not compromise protein expression in mammalian cells. This technology simplifies plasmid verification, hardens supply chains, and has the potential to transform the protection of intellectual property in the life sciences.
5
Hernandez SI, Berezin CT, Miller KM, Peccoud SJ, Peccoud J. Sequencing Strategy to Ensure Accurate Plasmid Assembly. ACS Synth Biol 2024; 13:4099-4109. [PMID: 39508818 DOI: 10.1021/acssynbio.4c00539] [Indexed: 11/15/2024]
Abstract
Despite the wide use of plasmids in research and clinical production, the need to verify plasmid sequences is a bottleneck that is too often underestimated in the manufacturing process. Although sequencing platforms continue to improve, the method and assembly pipeline chosen still influence the final plasmid assembly sequence. Furthermore, few dedicated tools exist for plasmid assembly, especially for de novo assembly. Here, we evaluated short-read, long-read, and hybrid (both short and long reads) de novo assembly pipelines across three replicates of a 24-plasmid library. Consistent with previous characterizations of each sequencing technology, short-read assemblies had issues resolving GC-rich regions, and long-read assemblies commonly had small insertions and deletions, especially in repetitive regions. The hybrid approach facilitated the most accurate, consistent assembly generation and identified mutations relative to the reference sequence. Although Sanger sequencing can be used to verify specific regions, some GC-rich and repetitive regions were difficult to resolve using any method, suggesting that easily sequenced genetic parts should be prioritized in the design of new genetic constructs.
Affiliation(s)
- Sarah I Hernandez
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado 80523, United States of America
- Casey-Tyler Berezin
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado 80523, United States of America
- Katie M Miller
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado 80523, United States of America
- Samuel J Peccoud
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado 80523, United States of America
- Jean Peccoud
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado 80523, United States of America
6
Elgabry M, Johnson S. Cyber-biological convergence: a systematic review and future outlook. Front Bioeng Biotechnol 2024; 12:1456354. [PMID: 39380896 PMCID: PMC11458441 DOI: 10.3389/fbioe.2024.1456354] [Received: 06/28/2024] [Accepted: 09/05/2024] [Indexed: 10/10/2024] Open
Abstract
The introduction of the capability to "program" a biological system is referred to as engineered biology and can be compared to the introduction of the internet and the capability of programming a computer. Engineered biology is supported by a digital infrastructure that includes data, data storage, computer-dependent laboratory equipment, internet-connected communication networks, and supply chains. This connectivity is important. It can improve workflows and enhance productivity. At the same time, and unlike computer programs, biological systems introduce unique threats as they can self-assemble, self-repair, and self-replicate. The aim of this paper is to systematically review the cyber implications of engineered biology. This includes cyber-bio opportunities and threats as engineered biology continues to integrate into cyberspace. We used a systematic search methodology to review the academic literature, and supplemented this with a review of open-source materials and "grey" literature that is not disseminated by academic publishers. A comprehensive search of articles published from 2017 until the 21st of October 2022 found 52 studies that focus on the implications of engineered biology for cyberspace. The search was conducted using search engines that index over 60 databases: databases that specifically cover the information security and biology literatures, as well as the wider set of academic disciplines. Across these 52 articles, we identified a total of 7 cyber opportunities, including automated bio-foundries, and 4 cyber threats, such as Artificial Intelligence misuse and biological dataset targeting. We highlight the 4 main types of cyberbiosecurity solutions identified in the literature and we suggest a total of 9 policy recommendations that can be utilized by various entities, including governments, to ensure that cyberbiosecurity remains frontline in a growing bioeconomy.
Affiliation(s)
- Mariam Elgabry
  - DAWES Center for Future Crime at UCL, Jill Dando Institute for Security and Crime Science, London, United Kingdom
  - Bronic, London, United Kingdom
- Shane Johnson
  - DAWES Center for Future Crime at UCL, Jill Dando Institute for Security and Crime Science, London, United Kingdom
7
Tay AP, Didi K, Wickramarachchi A, Bauer DC, Wilson LOW, Maselko M. Synsor: a tool for alignment-free detection of engineered DNA sequences. Front Bioeng Biotechnol 2024; 12:1375626. [PMID: 39070163 PMCID: PMC11272466 DOI: 10.3389/fbioe.2024.1375626] [Received: 01/24/2024] [Accepted: 06/18/2024] [Indexed: 07/30/2024] Open
Abstract
DNA sequences of nearly any desired composition, length, and function can be synthesized to alter the biology of an organism for purposes ranging from the bioproduction of therapeutic compounds to invasive pest control. Yet despite offering many great benefits, engineered DNA poses a risk due to its possible misuse or abuse by malicious actors, or its unintentional introduction into the environment. Monitoring the presence of engineered DNA in biological or environmental systems is therefore crucial for routine and timely detection of emerging biological threats, and for improving public acceptance of genetic technologies. To address this, we developed Synsor, a tool for identifying engineered DNA sequences in high-throughput sequencing data. Synsor leverages the k-mer signature differences between naturally occurring and engineered DNA sequences and uses an artificial neural network to classify whether a DNA sequence is natural or engineered. By querying suspected sequences against the model, Synsor can identify sequences that are likely to have been engineered. Using natural plasmid and engineered vector sequences, we showed that Synsor identifies engineered DNA with >99% accuracy. We demonstrate how Synsor can be used to detect potential genetically engineered organisms and locate where engineered DNA is being introduced into the environment by analysing genomic and metagenomic data from yeast and wastewater samples, respectively. Synsor is therefore a powerful tool that will streamline the process of identifying engineered DNA in poorly characterized biological or environmental systems, thereby allowing for enhanced monitoring of emerging biological threats.
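To make the idea of a "k-mer signature" concrete, the following is a minimal sketch of one common way to turn a DNA sequence into a fixed-length feature vector that a classifier could consume. It is not Synsor's code: the function name `kmer_signature`, the default `k = 4`, and the choice of normalized overlapping counts are all assumptions made for illustration, and the neural-network classifier itself is not shown.

```python
from collections import Counter
from itertools import product


def kmer_signature(sequence: str, k: int = 4) -> list[float]:
    """Return the normalized k-mer frequency vector of a DNA sequence.

    Hypothetical helper for illustration; not Synsor's actual code.
    """
    seq = sequence.upper()
    # Count every overlapping window of length k made up only of A/C/G/T.
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    # Fixed lexicographic ordering of all 4**k possible k-mers.
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[kmer] / total for kmer in vocabulary]


# Example: a 2-mer signature has 4**2 = 16 entries that sum to 1.
signature = kmer_signature("ACGTACGT", k=2)
```

Because the vector has the same length for every input, no alignment against a reference is needed before classification, which is what makes this kind of representation alignment-free.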
Affiliation(s)
- Aidan P. Tay
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Sydney, NSW, Australia
  - Applied Biosciences, Faculty of Science and Engineering, Macquarie University, Sydney, NSW, Australia
- Kieran Didi
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Sydney, NSW, Australia
- Anuradha Wickramarachchi
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Sydney, NSW, Australia
- Denis C. Bauer
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Sydney, NSW, Australia
  - Applied Biosciences, Faculty of Science and Engineering, Macquarie University, Sydney, NSW, Australia
- Laurence O. W. Wilson
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Sydney, NSW, Australia
  - Applied Biosciences, Faculty of Science and Engineering, Macquarie University, Sydney, NSW, Australia
- Maciej Maselko
  - Applied Biosciences, Faculty of Science and Engineering, Macquarie University, Sydney, NSW, Australia
  - Health and Biosecurity, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Sydney, NSW, Australia
8
Hernandez SI, Berezin CT, Miller KM, Peccoud SJ, Peccoud J. Sequencing Strategy to Ensure Accurate Plasmid Assembly. bioRxiv 2024:2024.03.25.586694. [PMID: 38585828 PMCID: PMC10996661 DOI: 10.1101/2024.03.25.586694] [Indexed: 04/09/2024]
Abstract
Despite the wide use of plasmids in research and clinical production, the need to verify plasmid sequences is a bottleneck that is too often underestimated in the manufacturing process. Although sequencing platforms continue to improve, the method and assembly pipeline chosen still influence the final plasmid assembly sequence. Furthermore, few dedicated tools exist for plasmid assembly, especially for de novo assembly. Here, we evaluated short-read, long-read, and hybrid (both short and long reads) de novo assembly pipelines across three replicates of a 24-plasmid library. Consistent with previous characterizations of each sequencing technology, short-read assemblies had issues resolving GC-rich regions, and long-read assemblies commonly had small insertions and deletions, especially in repetitive regions. The hybrid approach facilitated the most accurate, consistent assembly generation and identified mutations relative to the reference sequence. Although Sanger sequencing can be used to verify specific regions, some GC-rich and repetitive regions were difficult to resolve using any method, suggesting that easily sequenced genetic parts should be prioritized in the design of new genetic constructs.
Affiliation(s)
- Sarah I. Hernandez
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, 80523, United States of America
- Casey-Tyler Berezin
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, 80523, United States of America
- Katie M. Miller
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, 80523, United States of America
- Samuel J. Peccoud
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, 80523, United States of America
- Jean Peccoud
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, 80523, United States of America