1
Liu J, Zhang WG, Rao ZM. Transcriptional regulator-based biosensors for biomanufacturing in Corynebacterium glutamicum. Microbiol Res 2025; 297:128169. [PMID: 40209574 DOI: 10.1016/j.micres.2025.128169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2025] [Revised: 03/10/2025] [Accepted: 04/02/2025] [Indexed: 04/12/2025]
Abstract
Intracellular biosensors based on transcriptional regulators have become essential instruments in biomanufacturing, extensively employed for the semi-quantitative assessment of intracellular metabolites, high-throughput screening of production strains, and the directed evolution of enzymes. Corynebacterium glutamicum serves as an industrial chassis for the production of amino acids and a variety of high-value-added chemicals. This paper discusses the varieties and modes of action of transcriptional regulators employed in the construction of intracellular biosensors in C. glutamicum. It also reviews the design principles and progress in the application of transcriptional regulator-based biosensors. Furthermore, measures designed to improve the efficacy of these biosensors are delineated. The challenges and future prospects of biosensors based on transcriptional regulators in practical applications are analyzed. This review seeks to offer theoretical direction for the systematic design and development of transcriptional regulator-based biosensors and to aid researchers in enhancing the growth and productivity of microbial production strains.
Affiliation(s)
- Jie Liu
  - School of Biological and Food Engineering, Anhui Polytechnic University, 18# Beijing Middle Road, WuHu 241000, PR China; The Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, 1800# Lihu Road, WuXi 214122, PR China.
- Wei-Guo Zhang
  - The Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, 1800# Lihu Road, WuXi 214122, PR China
- Zhi-Ming Rao
  - The Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, 1800# Lihu Road, WuXi 214122, PR China; National Engineering Laboratory for Cereal Fermentation Technology, School of Biotechnology, Jiangnan University, 1800# Lihu Road, WuXi 214122, PR China
2
Du Q, Poon MN, Zeng X, Zhang P, Wei Z, Wang H, Wang Y, Wei L, Wang X. Synthetic promoter design in Escherichia coli based on multinomial diffusion model. iScience 2024; 27:111207. [PMID: 39524356 PMCID: PMC11550136 DOI: 10.1016/j.isci.2024.111207] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2024] [Revised: 10/03/2024] [Accepted: 10/15/2024] [Indexed: 11/16/2024] Open
Abstract
Generative design of promoters has enhanced the efficiency of de novo creation of functional sequences. Although several deep generative models, including the variational autoencoder (VAE) and the Wasserstein generative adversarial network (WGAN), have been employed in biological sequence generation, these models might struggle with mode collapse and low sample diversity. In this study, we introduce the multinomial diffusion model (MDM) for promoter sequence design and propose a structured set of criteria for effectively comparing the performance of generative models. In silico experiments demonstrate that MDM outperforms existing generative AI approaches. MDM demonstrates superior performance in various computational evaluations, remains robust during the training process, and exhibits a strong ability to capture weak signals. In addition, we experimentally validated that the majority of our model-designed promoters have expression activities in vivo, indicating the practicality and potential of MDM for bioengineering.
Affiliation(s)
- Qixiu Du
  - Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Bioinformatics Division, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China
- May Nee Poon
  - Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Bioinformatics Division, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China
- Xiaocheng Zeng
  - Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Bioinformatics Division, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China
- Pengcheng Zhang
  - Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Bioinformatics Division, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China
- Zheng Wei
  - Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Bioinformatics Division, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China
- Haochen Wang
  - Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Bioinformatics Division, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China
- Ye Wang
  - Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Bioinformatics Division, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China
- Lei Wei
  - Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Bioinformatics Division, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China
- Xiaowo Wang
  - Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Bioinformatics Division, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China
3
Augustijn HE, Karapliafis D, Joosten KMM, Rigali S, van Wezel GP, Medema MH. LogoMotif: A Comprehensive Database of Transcription Factor Binding Site Profiles in Actinobacteria. J Mol Biol 2024; 436:168558. [PMID: 38580076 DOI: 10.1016/j.jmb.2024.168558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 03/28/2024] [Accepted: 03/30/2024] [Indexed: 04/07/2024]
Abstract
Actinobacteria undergo a complex multicellular life cycle and produce a wide range of specialized metabolites, including the majority of the antibiotics. These biological processes are controlled by intricate regulatory pathways, and to better understand how they are controlled we need to augment our insights into the transcription factor binding sites. Here, we present LogoMotif (https://logomotif.bioinformatics.nl), an open-source database for characterized and predicted transcription factor binding sites in Actinobacteria, along with their cognate position weight matrices and hidden Markov models. Genome-wide predictions of binding site locations in Streptomyces model organisms are supplied and visualized in interactive regulatory networks. In the web interface, users can freely access, download and investigate the underlying data. With this curated collection of actinobacterial regulatory interactions, LogoMotif serves as a basis for binding site predictions, thus providing users with clues on how to elicit the expression of genes of interest and guide genome mining efforts.
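The abstract above mentions position weight matrices as the representation behind binding site prediction. As a rough illustration of that general idea only, the sketch below builds a log-odds matrix from a few aligned sites and scans a sequence for the best-scoring window. The example sites, the pseudocount, the uniform background, and the function names `build_pwm` and `scan` are all assumptions made for this sketch; none of them come from LogoMotif.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned, equal-length sites."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        # log2 of (smoothed base frequency / background frequency) per base
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def scan(sequence, pwm):
    """Return (best_score, best_offset) over all windows of the sequence."""
    width = len(pwm)
    best = None
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if best is None or score > best[0]:
            best = (score, offset)
    return best

# Hypothetical aligned binding sites, invented for illustration only.
example_sites = ["TGTCAC", "TGTCAT", "TGTGAC", "TGACAC"]
pwm = build_pwm(example_sites)
score, offset = scan("AACCTGTCACGGA", pwm)
```

A real pipeline would also scan the reverse strand and apply a score threshold; both are omitted here to keep the sketch short.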
Affiliation(s)
- Hannah E Augustijn
  - Bioinformatics Group, Wageningen University, Wageningen, The Netherlands; Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
- Kristy M M Joosten
  - Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Sébastien Rigali
  - InBioS - Center for Protein Engineering, University of Liège, Institut de Chimie, B-4000 Liège, Belgium
- Gilles P van Wezel
  - Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands.
- Marnix H Medema
  - Bioinformatics Group, Wageningen University, Wageningen, The Netherlands; Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands.
4
Zuo W, Yin G, Zhang L, Zhang W, Xu R, Wang Y, Li J, Kang Z. Engineering artificial cross-species promoters with different transcriptional strengths. Synth Syst Biotechnol 2024; 10:49-57. [PMID: 39224149 PMCID: PMC11366860 DOI: 10.1016/j.synbio.2024.08.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2024] [Revised: 07/22/2024] [Accepted: 08/07/2024] [Indexed: 09/04/2024] Open
Abstract
As a fundamental tool in synthetic biology, promoters are pivotal in regulating gene expression, enabling precise genetic control and spurring innovation across diverse biotechnological applications. However, most advances in engineered genetic systems rely on host-specific regulation of the genetic portion. With the burgeoning diversity of synthetic biology chassis cells, there emerges a pressing necessity to broaden the universal promoter toolkit spectrum, ensuring adaptability across various microbial chassis cells for enhanced applicability and customization in the evolving landscape of synthetic biology. In this study, we analyzed and validated the primary structures of natural endogenous promoters from Escherichia coli, Bacillus subtilis, Corynebacterium glutamicum, Saccharomyces cerevisiae, and Pichia pastoris, and through strategic integration and rational modification of promoter motifs, we developed a series of cross-species promoters (Psh) with transcriptional activity in five strains (prokaryotic and eukaryotic). This series of cross-species promoters can significantly expand the synthetic biology promoter toolkit while providing a foundation and inspiration for standardized development of universal components. The combinatorial use of key elements from prokaryotic and eukaryotic promoters presented in this study represents a novel strategy that may offer new insights and methods for future advancements in promoter engineering.
Affiliation(s)
- Wenjie Zuo
  - The Science Center for Future Foods, Jiangnan University, Wuxi, 214122, China
  - The Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, 214122, China
- Guobin Yin
  - The Science Center for Future Foods, Jiangnan University, Wuxi, 214122, China
  - The Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, 214122, China
- Luyao Zhang
  - The Science Center for Future Foods, Jiangnan University, Wuxi, 214122, China
  - The Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, 214122, China
- Weijiao Zhang
  - The Science Center for Future Foods, Jiangnan University, Wuxi, 214122, China
  - The Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, 214122, China
- Ruirui Xu
  - The Science Center for Future Foods, Jiangnan University, Wuxi, 214122, China
  - The Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, 214122, China
- Yang Wang
  - The Science Center for Future Foods, Jiangnan University, Wuxi, 214122, China
  - The Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, 214122, China
- Jianghua Li
  - The Science Center for Future Foods, Jiangnan University, Wuxi, 214122, China
  - The Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, 214122, China
- Zhen Kang
  - The Science Center for Future Foods, Jiangnan University, Wuxi, 214122, China
  - The Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, 214122, China
5
Andreani V, South EJ, Dunlop MJ. Generating information-dense promoter sequences with optimal string packing. PLoS Comput Biol 2024; 20:e1012276. [PMID: 39047028 PMCID: PMC11268586 DOI: 10.1371/journal.pcbi.1012276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Accepted: 06/25/2024] [Indexed: 07/27/2024] Open
Abstract
Dense arrangements of binding sites within nucleotide sequences can collectively influence downstream transcription rates or initiate biomolecular interactions. For example, natural promoter regions can harbor many overlapping transcription factor binding sites that influence the rate of transcription initiation. Despite the prevalence of overlapping binding sites in nature, rapid design of nucleotide sequences with many overlapping sites remains a challenge. Here, we show that this is an NP-hard problem, coined here as the nucleotide String Packing Problem (SPP). We then introduce a computational technique that efficiently assembles sets of DNA-protein binding sites into dense, contiguous stretches of double-stranded DNA. For the efficient design of nucleotide sequences spanning hundreds of base pairs, we reduce the SPP to an Orienteering Problem with integer distances, and then leverage modern integer linear programming solvers. Our method optimally packs sets of 20-100 binding sites into dense nucleotide arrays of 50-300 base pairs in 0.05-10 seconds. Unlike approximation algorithms or meta-heuristics, our approach finds provably optimal solutions. We demonstrate how our method can generate large sets of diverse sequences suitable for library generation, where the frequency of binding site usage across the returned sequences can be controlled by modulating the objective function. As an example, we then show how adding additional constraints, like the inclusion of sequence elements with fixed positions, allows for the design of bacterial promoters. The nucleotide string packing approach we present can accelerate the design of sequences with complex DNA-protein interactions. When used in combination with synthesis and high-throughput screening, this design strategy could help interrogate how complex binding site arrangements impact either gene expression or biomolecular mechanisms in varied cellular contexts.
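To make the packing objective in the abstract above concrete, here is a toy sketch that searches every DNA string of a fixed length and keeps one containing the most binding sites. It is not the authors' method: they describe a reduction to an Orienteering Problem solved by integer linear programming, whereas this is plain exhaustive search that only works for very short sequences. The toy sites, the forward-strand-only matching, and the function names are assumptions made for illustration.

```python
from itertools import product

def sites_contained(sequence, sites):
    """Return the set of binding sites that occur as substrings of sequence."""
    return {site for site in sites if site in sequence}

def pack_exhaustively(sites, length):
    """Try every DNA string of the given length; keep one containing the most sites.

    Runtime grows as 4**length, so this is only usable for toy inputs.
    Only the forward strand is checked (a simplifying assumption).
    """
    best_seq, best_hits = None, set()
    for letters in product("ACGT", repeat=length):
        candidate = "".join(letters)
        hits = sites_contained(candidate, sites)
        if best_seq is None or len(hits) > len(best_hits):
            best_seq, best_hits = candidate, hits
    return best_seq, best_hits

# Hypothetical 4-nt "binding sites", invented for illustration only.
toy_sites = ["ACGT", "CGTA", "GTAC", "TTTT"]
best_seq, best_hits = pack_exhaustively(toy_sites, 6)
```

For these toy inputs the three overlapping sites fit into a single 6-mer, while `TTTT` cannot be added without lengthening the sequence. That trade-off between sequence length and site count is what makes the general problem hard, and is why the exhaustive approach does not scale to the 50-300 base pair designs the abstract describes.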
Affiliation(s)
- Virgile Andreani
  - Biomedical Engineering Department, Boston University, Boston, Massachusetts, United States of America
  - Biological Design Center, Boston University, Boston, Massachusetts, United States of America
- Eric J. South
  - Biological Design Center, Boston University, Boston, Massachusetts, United States of America
  - Molecular Biology, Cell Biology & Biochemistry Program, Boston University, Boston, Massachusetts, United States of America
- Mary J. Dunlop
  - Biomedical Engineering Department, Boston University, Boston, Massachusetts, United States of America
  - Biological Design Center, Boston University, Boston, Massachusetts, United States of America
  - Molecular Biology, Cell Biology & Biochemistry Program, Boston University, Boston, Massachusetts, United States of America
6
Ji CH, Je HW, Kim H, Kang HS. Promoter engineering of natural product biosynthetic gene clusters in actinomycetes: concepts and applications. Nat Prod Rep 2024; 41:672-699. [PMID: 38259139 DOI: 10.1039/d3np00049d] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 01/24/2024]
Abstract
Covering: 2011 to 2022. Low titers of natural products in laboratory culture or fermentation conditions have been one of the challenging issues in natural products research. Many natural product biosynthetic gene clusters (BGCs) are also transcriptionally silent in laboratory culture conditions, making it challenging to characterize the structures and activities of their metabolites. Promoter engineering offers a potential solution to this problem by providing tools for transcriptional activation or optimization of biosynthetic genes. In this review, we summarize the 10 years of progress in promoter engineering approaches in natural products research, focusing on the most metabolically talented group of bacteria, the actinomycetes.
Affiliation(s)
- Chang-Hun Ji
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul 05029, Korea.
- Hyun-Woo Je
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul 05029, Korea.
- Hiyoung Kim
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul 05029, Korea.
- Hahk-Soo Kang
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul 05029, Korea.
7
Goshisht MK. Machine Learning and Deep Learning in Synthetic Biology: Key Architectures, Applications, and Challenges. ACS Omega 2024; 9:9921-9945. [PMID: 38463314 PMCID: PMC10918679 DOI: 10.1021/acsomega.3c05913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 01/19/2024] [Accepted: 01/30/2024] [Indexed: 03/12/2024]
Abstract
Machine learning (ML), particularly deep learning (DL), has made rapid and substantial progress in synthetic biology in recent years. Biotechnological applications of biosystems, including pathways, enzymes, and whole cells, are being probed frequently with time. The intricacy and interconnectedness of biosystems make it challenging to design them with the desired properties. ML and DL have a synergy with synthetic biology. Synthetic biology can be employed to produce large data sets for training models (for instance, by utilizing DNA synthesis), and ML/DL models can be employed to inform design (for example, by generating new parts or advising unrivaled experiments to perform). This potential has recently been brought to light by research at the intersection of engineering biology and ML/DL through achievements like the design of novel biological components, best experimental design, automated analysis of microscopy data, protein structure prediction, and biomolecular implementations of ANNs (Artificial Neural Networks). I have divided this review into three sections. In the first section, I describe predictive potential and basics of ML along with myriad applications in synthetic biology, especially in engineering cells, activity of proteins, and metabolic pathways. In the second section, I describe fundamental DL architectures and their applications in synthetic biology. Finally, I describe different challenges causing hurdles in the progress of ML/DL and synthetic biology along with their solutions.
Affiliation(s)
- Manoj Kumar Goshisht
  - Department of Chemistry, Natural and Applied Sciences, University of Wisconsin-Green Bay, Green Bay, Wisconsin 54311-7001, United States
8
Wang H, Du Q, Wang Y, Xu H, Wei Z, Wang X. GPro: generative AI-empowered toolkit for promoter design. Bioinformatics 2024; 40:btae123. [PMID: 38429953 PMCID: PMC10937896 DOI: 10.1093/bioinformatics/btae123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 02/19/2024] [Accepted: 02/28/2024] [Indexed: 03/03/2024] Open
Abstract
MOTIVATION: Promoters with desirable properties are crucial in biotechnological applications. Generative AI (GenAI) has demonstrated potential in creating novel synthetic promoters with significantly enhanced functionality. However, these methods' reliance on various programming frameworks and specific task-oriented contexts limits their flexibility. Overcoming these limitations is essential for researchers to fully leverage the power of GenAI to design promoters for their tasks. RESULTS: Here, we introduce GPro (Generative AI-empowered toolkit for promoter design), a user-friendly toolkit that integrates a collection of cutting-edge GenAI-empowered approaches for promoter design. This toolkit provides a standardized pipeline covering essential promoter design processes, including training, optimization, and evaluation. Several detailed demos are provided to reproduce state-of-the-art promoter design pipelines. GPro's user-friendly interface makes it accessible to a wide range of users including non-AI experts. It also offers a variety of optional algorithms for each design process, and gives users the flexibility to compare methods and create customized pipelines. AVAILABILITY AND IMPLEMENTATION: GPro is released as open-source software under the MIT license. The source code for GPro is available on GitHub for Linux, macOS, and Windows: https://github.com/WangLabTHU/GPro, and is available for download via Zenodo repository at https://zenodo.org/doi/10.5281/zenodo.10681733.
Affiliation(s)
- Haochen Wang
  - Ministry of Education Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China
  - Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China
  - Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
  - Department of Automation, Tsinghua University, Beijing 100084, China
- Qixiu Du
  - Ministry of Education Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China
  - Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China
  - Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
  - Department of Automation, Tsinghua University, Beijing 100084, China
- Ye Wang
  - Ministry of Education Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China
  - Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China
  - Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
  - Department of Automation, Tsinghua University, Beijing 100084, China
- Hanwen Xu
  - Ministry of Education Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China
  - Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China
  - Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
  - Department of Automation, Tsinghua University, Beijing 100084, China
- Zheng Wei
  - Ministry of Education Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China
  - Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China
  - Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
  - Department of Automation, Tsinghua University, Beijing 100084, China
- Xiaowo Wang
  - Ministry of Education Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China
  - Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China
  - Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
  - Department of Automation, Tsinghua University, Beijing 100084, China
9
Deal C, De Wannemaeker L, De Mey M. Towards a rational approach to promoter engineering: understanding the complexity of transcription initiation in prokaryotes. FEMS Microbiol Rev 2024; 48:fuae004. [PMID: 38383636 PMCID: PMC10911233 DOI: 10.1093/femsre/fuae004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 01/29/2024] [Accepted: 02/20/2024] [Indexed: 02/23/2024] Open
Abstract
Promoter sequences are important genetic control elements. Through their interaction with RNA polymerase they determine transcription strength and specificity, thereby regulating the first step in gene expression. Consequently, they can be targeted as elements to control predictability and tuneability of a genetic circuit, which is essential in applications such as the development of robust microbial cell factories. This review considers the promoter elements implicated in the three stages of transcription initiation, detailing the complex interplay of sequence-specific interactions that are involved, and highlighting that DNA sequence features beyond the core promoter elements work in a combinatorial manner to determine transcriptional strength. In particular, we emphasize that, aside from promoter recognition, transcription initiation is also defined by the kinetics of open complex formation and promoter escape, which are also known to be highly sequence specific. Significantly, we focus on how insights into these interactions can be manipulated to lay the foundation for a more rational approach to promoter engineering.
Affiliation(s)
- Cara Deal
  - Centre for Synthetic Biology, Ghent University. Coupure Links 653, BE-9000 Ghent, Belgium
- Lien De Wannemaeker
  - Centre for Synthetic Biology, Ghent University. Coupure Links 653, BE-9000 Ghent, Belgium
- Marjan De Mey
  - Centre for Synthetic Biology, Ghent University. Coupure Links 653, BE-9000 Ghent, Belgium
10
Andreani V, South EJ, Dunlop MJ. Generating information-dense promoter sequences with optimal string packing. bioRxiv 2024:2023.11.01.565124. [PMID: 37961203 PMCID: PMC10635063 DOI: 10.1101/2023.11.01.565124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Dense arrangements of binding sites within nucleotide sequences can collectively influence downstream transcription rates or initiate biomolecular interactions. For example, natural promoter regions can harbor many overlapping transcription factor binding sites that influence the rate of transcription initiation. Despite the prevalence of overlapping binding sites in nature, rapid design of nucleotide sequences with many overlapping sites remains a challenge. Here, we show that this is an NP-hard problem, coined here as the nucleotide String Packing Problem (SPP). We then introduce a computational technique that efficiently assembles sets of DNA-protein binding sites into dense, contiguous stretches of double-stranded DNA. For the efficient design of nucleotide sequences spanning hundreds of base pairs, we reduce the SPP to an Orienteering Problem with integer distances, and then leverage modern integer linear programming solvers. Our method optimally packs libraries of 20-100 binding sites into dense nucleotide arrays of 50-300 base pairs in 0.05-10 seconds. Unlike approximation algorithms or meta-heuristics, our approach finds provably optimal solutions. We demonstrate how our method can generate large sets of diverse sequences suitable for library generation, where the frequency of binding site usage across the returned sequences can be controlled by modulating the objective function. As an example, we then show how adding additional constraints, like the inclusion of sequence elements with fixed positions, allows for the design of bacterial promoters. The nucleotide string packing approach we present can accelerate the design of sequences with complex DNA-protein interactions. When used in combination with synthesis and high-throughput screening, this design strategy could help interrogate how complex binding site arrangements impact either gene expression or biomolecular mechanisms in varied cellular contexts. 
Author Summary: The way protein binding sites are arranged on DNA can control the regulation and transcription of downstream genes. Areas with a high concentration of binding sites can enable complex interplay between transcription factors, a feature that is exploited by natural promoters. However, designing synthetic promoters that contain dense arrangements of binding sites is a challenge. The task involves overlapping many binding sites, each typically about 10 nucleotides long, within a constrained sequence area, which becomes increasingly difficult as sequence length decreases and binding site variety increases. We introduce an approach to design nucleotide sequences with optimally packed protein binding sites, which we call the nucleotide String Packing Problem (SPP). We show that the SPP can be solved efficiently using integer linear programming to identify the densest arrangements of binding sites for a specified sequence length. We show how adding additional constraints, like the inclusion of sequence elements with fixed positions, allows for the design of bacterial promoters. The presented approach enables the rapid design and study of nucleotide sequences with complex, dense binding site architectures.
11
Okay S. Fine-Tuning Gene Expression in Bacteria by Synthetic Promoters. Methods Mol Biol 2024; 2844:179-195. [PMID: 39068340 DOI: 10.1007/978-1-0716-4063-0_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
Promoters are key genetic elements in the initiation and regulation of gene expression. A limited number of natural promoters has been described for the control of gene expression in synthetic biology applications. Therefore, synthetic promoters have been developed to fine-tune the transcription for the desired amount of gene product. Mostly, synthetic promoters are characterized using promoter libraries that are constructed via mutagenesis of promoter sequences. The strength of promoters in the library is determined according to the expression of a reporter gene such as gfp encoding green fluorescent protein. Gene expression can be controlled using inducers. The majority of the studies on gram-negative bacteria are conducted using the expression system of the model organism Escherichia coli while that of the model organism Bacillus subtilis is mostly used in the studies on gram-positive bacteria. Additionally, synthetic promoters for the cyanobacteria, which are phototrophic microorganisms, are evaluated, especially using the model cyanobacterium Synechocystis sp. PCC 6803. Moreover, a variety of algorithms based on machine learning methods were developed to characterize the features of promoter elements. Some of these in silico models were verified using in vitro or in vivo experiments. Identification of novel synthetic promoters with improved features compared to natural ones contributes much to the synthetic biology approaches in terms of fine-tuning gene expression.
Affiliation(s)
- Sezer Okay
- Department of Vaccine Technology, Vaccine Institute, Hacettepe University, Ankara, Türkiye
12
Wang X, Xu K, Tan Y, Yu S, Zhao X, Zhou J. Deep Learning-Assisted Design of Novel Promoters in Escherichia coli. ADVANCED GENETICS (HOBOKEN, N.J.) 2023; 4:2300184. [PMID: 38099247 PMCID: PMC10716054 DOI: 10.1002/ggn2.202300184] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/13/2023] [Revised: 10/09/2023] [Indexed: 12/17/2023]
Abstract
Deep learning (DL) approaches can accurately recognize promoter regions and predict their strength. Here, the potential for controllably designing active Escherichia coli promoters is explored by combining multiple deep learning models. First, "DRSAdesign," which relies on a diffusion model to generate different types of novel promoters, is created, followed by predicting whether the generated promoters are real or fake and predicting their strength. Experimental validation showed that 45 out of 50 generated promoters are active with high diversity, but most promoters have relatively low activity. Next, "Ndesign," which relies on generating random sequences carrying functional -35 and -10 motifs of the sigma70 promoter, is introduced, and their strength is predicted using the designed DL model. The DL model is trained and validated using 200 and 50 generated promoters, and displays Pearson correlation coefficients of 0.49 and 0.43, respectively. Taking advantage of the DL models developed in this work, possible 6-mers are predicted as key functional motifs of the sigma70 promoter, suggesting that promoter recognition and strength prediction mainly rely on the accommodation of functional motifs. This work provides DL tools to design promoters and assess their functions, paving the way for DL-assisted metabolic engineering.
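The "Ndesign" idea of random sequences that carry fixed -35 and -10 motifs can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not give the motifs, spacer, or flank lengths the authors used, so the textbook sigma70 consensus hexamers (TTGACA and TATAAT), the 17-nucleotide spacer, and the flank lengths below are assumed values, and `make_promoter` is a hypothetical helper name.

```python
import random

# Assumed values: textbook sigma70 consensus hexamers, not taken from the paper.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def random_dna(n, rng):
    """Return n random nucleotides drawn uniformly from A, C, G, T."""
    return "".join(rng.choice("ACGT") for _ in range(n))


def make_promoter(rng, upstream=20, spacer=17, downstream=7):
    """Build a random sequence with the two motifs at fixed offsets.

    The upstream, spacer, and downstream lengths are illustrative defaults.
    """
    return (
        random_dna(upstream, rng)
        + MINUS_35
        + random_dna(spacer, rng)
        + MINUS_10
        + random_dna(downstream, rng)
    )


rng = random.Random(0)  # fixed seed so the library is reproducible
library = [make_promoter(rng) for _ in range(5)]
for promoter in library:
    print(promoter)
```

Each generated sequence keeps the motifs in place while everything around them varies, which is the kind of library a strength-prediction model could then be asked to rank.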
Affiliation(s)
- Xinglong Wang
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology and School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Kangjie Xu
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology and School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Yameng Tan
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology and School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Shangyang Yu
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology and School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Xinyi Zhao
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology and School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Jingwen Zhou
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology and School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
13
Parthiban S, Vijeesh T, Gayathri T, Shanmugaraj B, Sharma A, Sathishkumar R. Artificial intelligence-driven systems engineering for next-generation plant-derived biopharmaceuticals. FRONTIERS IN PLANT SCIENCE 2023; 14:1252166. [PMID: 38034587 PMCID: PMC10684705 DOI: 10.3389/fpls.2023.1252166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 10/17/2023] [Indexed: 12/02/2023]
Abstract
Recombinant biopharmaceuticals including antigens, antibodies, hormones, cytokines, single-chain variable fragments, and peptides have been used as vaccines, diagnostics and therapeutics. Plant molecular pharming is a robust platform that uses plants as an expression system to produce simple and complex recombinant biopharmaceuticals on a large scale. The plant system has several advantages over other host systems such as humanized expression, glycosylation, scalability, reduced risk of human or animal pathogenic contaminants, and rapid and cost-effective production. Despite many advantages, the expression of recombinant proteins in plant systems is hindered by some factors such as non-human post-translational modifications, protein misfolding, conformation changes and instability. Artificial intelligence (AI) plays a vital role in various fields of biotechnology, and in plant molecular pharming a significant increase in yield and stability can be achieved with an AI-based multi-approach intervention to overcome the hindrance factors. Current limitations of plant-based recombinant biopharmaceutical production can be circumvented with the aid of synthetic biology tools and AI algorithms in plant-based glycan engineering for protein folding, stability, viability, catalytic activity and organelle targeting. The AI models, including but not limited to neural networks, support vector machines, linear regression, Gaussian processes and regressor ensembles, work by predicting the training and experimental data sets to design and validate the protein structures, thereby optimizing properties such as thermostability, catalytic activity, antibody affinity, and protein folding. This review focuses on integrating systems engineering approaches and AI-based machine learning and deep learning algorithms in protein engineering and host engineering to augment protein production in plant systems to meet the ever-expanding therapeutics market.
Affiliation(s)
- Subramanian Parthiban
- Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
- Thandarvalli Vijeesh
- Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
- Thashanamoorthi Gayathri
- Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
- Balamurugan Shanmugaraj
- Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
- Ashutosh Sharma
- Tecnologico de Monterrey, School of Engineering and Sciences, Centre of Bioengineering, Queretaro, Mexico
- Ramalingam Sathishkumar
- Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
14
Zhang P, Wang H, Xu H, Wei L, Liu L, Hu Z, Wang X. Deep flanking sequence engineering for efficient promoter design using DeepSEED. Nat Commun 2023; 14:6309. [PMID: 37813854 PMCID: PMC10562447 DOI: 10.1038/s41467-023-41899-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 04/22/2023] [Accepted: 09/20/2023] [Indexed: 10/11/2023]
Abstract
Designing promoters with desirable properties is essential in synthetic biology. Human experts are skilled at identifying strong explicit patterns in small samples, while deep learning models excel at detecting implicit weak patterns in large datasets. Biologists have described the sequence patterns of promoters via transcription factor binding sites (TFBSs). However, the flanking sequences of cis-regulatory elements have long been overlooked and are often arbitrarily decided in promoter design. To address this limitation, we introduce DeepSEED, an AI-aided framework that efficiently designs synthetic promoters by combining expert knowledge with deep learning techniques. DeepSEED has demonstrated success in improving the properties of Escherichia coli constitutive, IPTG-inducible, and mammalian cell doxycycline (Dox)-inducible promoters. Furthermore, our results show that DeepSEED captures the implicit features in flanking sequences, such as k-mer frequencies and DNA shape features, which are crucial for determining promoter properties.
Affiliation(s)
- Pengcheng Zhang
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, China
- Haochen Wang
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, China
- Hanwen Xu
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, China
- Lei Wei
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, China
- Liyang Liu
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, China
- Zhirui Hu
- Center for Statistical Science, Tsinghua University, Beijing, China
- Xiaowo Wang
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, China.
15
Zhang XE, Liu C, Dai J, Yuan Y, Gao C, Feng Y, Wu B, Wei P, You C, Wang X, Si T. Enabling technology and core theory of synthetic biology. SCIENCE CHINA. LIFE SCIENCES 2023; 66:1742-1785. [PMID: 36753021 PMCID: PMC9907219 DOI: 10.1007/s11427-022-2214-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 08/08/2022] [Accepted: 10/04/2022] [Indexed: 02/09/2023]
Abstract
Synthetic biology provides a new paradigm for life science research ("build to learn") and opens the future journey of biotechnology ("build to use"). Here, we discuss advances of various principles and technologies in the mainstream of the enabling technology of synthetic biology, including synthesis and assembly of a genome, DNA storage, gene editing, molecular evolution and de novo design of function proteins, cell and gene circuit engineering, cell-free synthetic biology, artificial intelligence (AI)-aided synthetic biology, as well as biofoundries. We also introduce the concept of quantitative synthetic biology, which is guiding synthetic biology towards increased accuracy and predictability or the real rational design. We conclude that synthetic biology will establish its disciplinary system with the iterative development of enabling technologies and the maturity of the core theory.
Affiliation(s)
- Xian-En Zhang
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China.
- Chenli Liu
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
- Junbiao Dai
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
- Yingjin Yuan
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China.
- Caixia Gao
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China.
- Yan Feng
- State Key Laboratory of Microbial Metabolism, Shanghai Jiao Tong University, Shanghai, 200240, China.
- Bian Wu
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, 100101, China.
- Ping Wei
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
- Chun You
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China.
- Xiaowo Wang
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, 100084, China.
- Tong Si
- Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
- Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
16
Chen JP, Gong JS, Su C, Li H, Xu ZH, Shi JS. Improving the soluble expression of difficult-to-express proteins in prokaryotic expression system via protein engineering and synthetic biology strategies. Metab Eng 2023; 78:99-114. [PMID: 37244368 DOI: 10.1016/j.ymben.2023.05.007] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/09/2023] [Accepted: 05/23/2023] [Indexed: 05/29/2023]
Abstract
Solubility and folding stability are key concerns for difficult-to-express proteins (DEPs) restricted by amino acid sequences and superarchitecture, resolved by the precise distribution of amino acids and molecular interactions as well as the assistance of the expression system. Therefore, an increasing number of tools are available to achieve efficient expression of DEPs, including directed evolution, solubilization partners, chaperones, and affluent expression hosts, among others. Furthermore, genome editing tools, such as transposons and CRISPR Cas9/dCas9, have been developed and expanded to construct engineered expression hosts capable of efficient expression ability of soluble proteins. Accounting for the accumulated knowledge of the pivotal factors in the solubility and folding stability of proteins, this review focuses on advanced technologies and tools of protein engineering, protein quality control systems, and the redesign of expression platforms in prokaryotic expression systems, as well as advances of the cell-free expression technologies for membrane proteins production.
Affiliation(s)
- Jin-Ping Chen
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Life Sciences and Health Engineering, Jiangnan University, Wuxi, 214122, PR China; Yixing Institute of Food and Biotechnology Co., Ltd, Yixing, 214200, PR China
- Jin-Song Gong
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Life Sciences and Health Engineering, Jiangnan University, Wuxi, 214122, PR China; Yixing Institute of Food and Biotechnology Co., Ltd, Yixing, 214200, PR China.
- Chang Su
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Life Sciences and Health Engineering, Jiangnan University, Wuxi, 214122, PR China
- Heng Li
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Life Sciences and Health Engineering, Jiangnan University, Wuxi, 214122, PR China
- Zheng-Hong Xu
- National Engineering Research Center for Cereal Fermentation and Food Biomanufacturing, School of Biotechnology, Jiangnan University, Wuxi, 214122, PR China; Jiangsu Provincial Research Center for Bioactive Product Processing Technology, Jiangnan University, Wuxi, 214122, PR China; Yixing Institute of Food and Biotechnology Co., Ltd, Yixing, 214200, PR China
- Jin-Song Shi
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Life Sciences and Health Engineering, Jiangnan University, Wuxi, 214122, PR China; Yixing Institute of Food and Biotechnology Co., Ltd, Yixing, 214200, PR China
17
Lee H, Song ES, Lee YH, Park JY, Kuk MU, Kwon HW, Roh H, Park JT. A novel hybrid promoter capable of continuously producing proteins in high yield. Biochem Biophys Res Commun 2023; 650:103-108. [PMID: 36774687 DOI: 10.1016/j.bbrc.2023.02.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Accepted: 02/07/2023] [Indexed: 02/11/2023]
Abstract
The establishment of cell lines with a high protein production is the most crucial objective in the field of biopharmaceuticals. To this end, efforts have been made to increase transgene expression through promoter improvement, but the efficiency or stability of protein production was insufficient for use in commercial production. Here, we developed a novel strategy to increase the efficiency and stability of protein production by hybridizing a promoter that exhibits higher expression levels at the transient level with a promoter that exhibits higher stability at the stable level. Expression levels of transgenes by each promoter were measured at transient and stable levels for five single promoters: Rous sarcoma virus (RSV), cytomegalovirus (CMV), human phosphoglycerate kinase (hPGK), simian virus 40 (SV40), and zebrafish ubiquitin B (Ubb). The hPGK promoter enabled high-yield transgene expression at transient levels and the SV40 promoter enabled sustained expression at stable levels. Therefore, hPGK and SV40 promoters were selected as candidates for establishing hybrid promoters and two hybrid promoters were constructed; one hybrid promoter in which the SV40 promoter is added before the hPGK promoter (a.k.a. SKYI) and the other hybrid promoter in which the SV40 promoter is added after the hPGK promoter (a.k.a. SKYII). Of the two hybrid promoters, the hybrid promoter SKYII promoted high-yield transgene expression at both transient and stable levels compared to single hPGK and SV40. Together, our findings open new doors in the field of biopharmaceuticals by presenting a novel promoter platform that can be used for high-yield and sustained protein production.
Affiliation(s)
- Haneur Lee
- Division of Life Sciences, College of Life Sciences and Bioengineering, Incheon National University, Incheon, 22012, South Korea
- Eun Seon Song
- Division of Life Sciences, College of Life Sciences and Bioengineering, Incheon National University, Incheon, 22012, South Korea
- Yun Haeng Lee
- Division of Life Sciences, College of Life Sciences and Bioengineering, Incheon National University, Incheon, 22012, South Korea
- Ji Yun Park
- Division of Life Sciences, College of Life Sciences and Bioengineering, Incheon National University, Incheon, 22012, South Korea
- Myeong Uk Kuk
- Division of Life Sciences, College of Life Sciences and Bioengineering, Incheon National University, Incheon, 22012, South Korea
- Hyung Wook Kwon
- Division of Life Sciences, College of Life Sciences and Bioengineering, Incheon National University, Incheon, 22012, South Korea; Convergence Research Center for Insect Vectors, Incheon National University, Incheon, 22012, South Korea
- Hyungmin Roh
- Department of Chemical and Biological Engineering, Inha Technical College, Incheon, 22212, South Korea.
- Joon Tae Park
- Division of Life Sciences, College of Life Sciences and Bioengineering, Incheon National University, Incheon, 22012, South Korea; Convergence Research Center for Insect Vectors, Incheon National University, Incheon, 22012, South Korea.
18
Xu Z, Tian P. Rethinking Biosynthesis of Aclacinomycin A. Molecules 2023; 28:2761. [PMID: 36985733 PMCID: PMC10054333 DOI: 10.3390/molecules28062761] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/04/2023] [Revised: 03/01/2023] [Accepted: 03/06/2023] [Indexed: 03/22/2023]
Abstract
Aclacinomycin A (ACM-A) is an anthracycline antitumor agent widely used in clinical practice. The current industrial production of ACM-A relies primarily on chemical synthesis and microbial fermentation. However, chemical synthesis involves multiple reactions which give rise to high production costs and environmental pollution. Microbial fermentation is a sustainable strategy, yet the current fermentation yield is too low to satisfy market demand. Hence, strain improvement is highly desirable, and tremendous endeavors have been made to decipher biosynthesis pathways and modify key enzymes. In this review, we comprehensively describe the reported biosynthesis pathways, key enzymes, and, especially, catalytic mechanisms. In addition, we come up with strategies to uncover unknown enzymes and improve the activities of rate-limiting enzymes. Overall, this review aims to provide valuable insights for complete biosynthesis of ACM-A.
19
Abstract
This chapter outlines the myriad applications of machine learning (ML) in synthetic biology, specifically in engineering cell and protein activity, and metabolic pathways. Though by no means comprehensive, the chapter highlights several prominent computational tools applied in the field and their potential use cases. The examples detailed reinforce how ML algorithms can enhance synthetic biology research by providing data-driven insights into the behavior of living systems, even without detailed knowledge of their underlying mechanisms. By doing so, ML promises to increase the efficiency of research projects by modeling hypotheses in silico that can then be tested through experiments. While challenges related to training dataset generation and computational costs remain, ongoing improvements in ML tools are paving the way for smarter and more streamlined synthetic biology workflows that can be readily employed to address grand challenges across manufacturing, medicine, engineering, agriculture, and beyond.
Affiliation(s)
- Brendan Fu-Long Sieow
- NUS Synthetic Biology for Clinical and Technological Innovation (SynCTI), National University of Singapore, Singapore, Singapore
- Synthetic Biology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- NUS Graduate School for Integrative Sciences and Engineering Programme, National University of Singapore, Singapore, Singapore
- Ryan De Sotto
- NUS Synthetic Biology for Clinical and Technological Innovation (SynCTI), National University of Singapore, Singapore, Singapore
- Synthetic Biology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Zhi Ren Darren Seet
- NUS Synthetic Biology for Clinical and Technological Innovation (SynCTI), National University of Singapore, Singapore, Singapore
- Synthetic Biology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- In Young Hwang
- NUS Synthetic Biology for Clinical and Technological Innovation (SynCTI), National University of Singapore, Singapore, Singapore
- Synthetic Biology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Matthew Wook Chang
- NUS Synthetic Biology for Clinical and Technological Innovation (SynCTI), National University of Singapore, Singapore, Singapore.
- Synthetic Biology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
20
LaFleur TL, Hossain A, Salis HM. Automated model-predictive design of synthetic promoters to control transcriptional profiles in bacteria. Nat Commun 2022; 13:5159. [PMID: 36056029 PMCID: PMC9440211 DOI: 10.1038/s41467-022-32829-5] [Citation(s) in RCA: 85] [Impact Index Per Article: 28.3] [Received: 03/03/2022] [Accepted: 08/19/2022] [Indexed: 12/22/2022]
Abstract
Transcription rates are regulated by the interactions between RNA polymerase, sigma factor, and promoter DNA sequences in bacteria. However, it remains unclear how non-canonical sequence motifs collectively control transcription rates. Here, we combine massively parallel assays, biophysics, and machine learning to develop a 346-parameter model that predicts site-specific transcription initiation rates for any σ70 promoter sequence, validated across 22132 bacterial promoters with diverse sequences. We apply the model to predict genetic context effects, design σ70 promoters with desired transcription rates, and identify undesired promoters inside engineered genetic systems. The model provides a biophysical basis for understanding gene regulation in natural genetic systems and precise transcriptional control for engineering synthetic genetic systems.
Affiliation(s)
- Travis L LaFleur
- Department of Chemical Engineering, Pennsylvania State University, University Park, PA, 16801, USA
- Ayaan Hossain
- Bioinformatics and Genomics, Pennsylvania State University, University Park, PA, 16801, USA
- Howard M Salis
- Department of Chemical Engineering, Pennsylvania State University, University Park, PA, 16801, USA.
- Bioinformatics and Genomics, Pennsylvania State University, University Park, PA, 16801, USA.
- Department of Biological Engineering, Pennsylvania State University, University Park, PA, 16801, USA.
- Department of Biomedical Engineering, Pennsylvania State University, University Park, PA, 16801, USA.
21
Chen SY, Zhang Y, Li R, Wang B, Ye BC. De Novo Design of the ArsR Regulated Pars Promoter Enables a Highly Sensitive Whole-Cell Biosensor for Arsenic Contamination. Anal Chem 2022; 94:7210-7218. [PMID: 35537205 PMCID: PMC9134189 DOI: 10.1021/acs.analchem.2c00055] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Indexed: 11/29/2022]
Abstract
Whole-cell biosensors for arsenic contamination are typically designed based on natural bacterial sensing systems, which are often limited by their poor performance for precisely tuning the genetic response to environmental stimuli. Promoter design remains one of the most important approaches to address such issues. Here, we use the arsenic-responsive ArsR-Pars regulation system from Escherichia coli MG1655 as the sensing element and coupled gfp or lacZ as the reporter gene to construct the genetic circuit for characterizing the refactored promoters. We first analyzed the ArsR binding site and a library of RNA polymerase binding sites to mine potential promoter sequences. A set of tightly regulated Pars promoters by ArsR was designed by placing the ArsR binding sites into the promoter's core region, and a novel promoter with maximal repression efficiency and optimal fold change was obtained. The fluorescence sensor PlacV-ParsOC2 constructed with the optimized ParsOC2 promoter showed a fold change of up to 63.80-fold (with green fluorescence visible to the naked eye) at 9.38 ppb arsenic, and the limit of detection was as low as 0.24 ppb. Further, the optimized colorimetric sensor PlacV-ParsOC2-lacZ with a linear response between 0 and 5 ppb was used to perform colorimetric reactions in 24-well plates combined with a smartphone application for the quantification of the arsenic level in groundwater. This study offers a new approach to improve the performance of bacterial sensing promoters and will facilitate the on-site application of arsenic whole-cell biosensors.
Affiliation(s)
- Sheng-Yan Chen
- School of Chemistry and Chemical Engineering, Shihezi University, Shihezi 832003, China
- Yan Zhang
- School of Chemistry and Chemical Engineering, Shihezi University, Shihezi 832003, China
- Renjie Li
- School of Chemistry and Chemical Engineering, Shihezi University, Shihezi 832003, China
- Baojun Wang
- College of Chemical and Biological Engineering & ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou 311200, China; Research Center of Biological Computation, Zhejiang Laboratory, Hangzhou 311100, China; Centre for Synthetic and Systems Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3FF, United Kingdom
- Bang-Ce Ye
- School of Chemistry and Chemical Engineering, Shihezi University, Shihezi 832003, China; Institute of Engineering Biology and Health, Collaborative Innovation Center of Yangtze River Delta Region Green Pharmaceuticals, College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China; Lab of Biosystem and Microanalysis, State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai 200237, China. Tel/Fax: 0086-21-64252094
22
Wei PJ, Pang ZZ, Jiang LJ, Tan D, Su Y, Zheng CH. Promoter Prediction in Nannochloropsis Based on Densely Connected Convolutional Neural Networks. Methods 2022; 204:38-46. [DOI: 10.1016/j.ymeth.2022.03.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2022] [Revised: 03/03/2022] [Accepted: 03/28/2022] [Indexed: 10/18/2022]
23
Logel DY, Trofimova E, Jaschke PR. Codon-Restrained Method for Both Eliminating and Creating Intragenic Bacterial Promoters. ACS Synth Biol 2022; 11:689-699. [PMID: 35043622 DOI: 10.1021/acssynbio.1c00359] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/28/2022]
Abstract
Future applications of synthetic biology will require refactored genetic sequences devoid of internal regulatory elements within coding sequences. These regulatory elements include cryptic and intragenic promoters, which may constitute up to a third of the predicted Escherichia coli promoters. The promoter activity is dependent on the structural interaction of core bases with a σ factor. Rational engineering can be used to alter key promoter element nucleotides interacting with σ factors and eliminate downstream transcriptional activity. In this paper, we present codon-restrained promoter silencing (CORPSE), a system for removing intragenic promoters. CORPSE exploits the DNA-σ factor structural relationship to disrupt σ70 promoters embedded within gene coding sequences with a minimum of synonymous codon changes. Additionally, we present an inverted CORPSE system, iCORPSE, which can create highly active promoters within a gene sequence while not perturbing the function of the modified gene.
Affiliation(s)
- Dominic Y. Logel
- School of Natural Sciences, ARC Centre of Excellence in Synthetic Biology, Macquarie University, Sydney 2109, New South Wales, Australia
| | - Ellina Trofimova
- School of Natural Sciences, ARC Centre of Excellence in Synthetic Biology, Macquarie University, Sydney 2109, New South Wales, Australia
| | - Paul R. Jaschke
- School of Natural Sciences, ARC Centre of Excellence in Synthetic Biology, Macquarie University, Sydney 2109, New South Wales, Australia
|
24
|
Wang C, Zhang W, Tian R, Zhang J, Zhang L, Deng Z, Lv X, Li J, Liu L, Du G, Liu Y. Model‐driven design of synthetic N‐terminal coding sequences for regulating gene expression in yeast and bacteria. Biotechnol J 2022; 17:e2100655. [DOI: 10.1002/biot.202100655] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/04/2021] [Revised: 01/12/2022] [Accepted: 01/13/2022] [Indexed: 11/12/2022]
Affiliation(s)
- Chenyun Wang
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
- Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
| | - Wei Zhang
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
- Jiangsu Key Laboratory of Media Design and Software Technology, Wuxi 214122, China
| | - Rongzhen Tian
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
- Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
| | - Jianing Zhang
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
- Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
| | - Linpei Zhang
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
| | - Zhaohong Deng
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
- Jiangsu Key Laboratory of Media Design and Software Technology, Wuxi 214122, China
| | - Xueqin Lv
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
- Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
| | - Jianghua Li
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
- Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
| | - Long Liu
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
- Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
| | - Guocheng Du
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
- Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
| | - Yanfeng Liu
- Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
- Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, Wuxi 214122, China
- Qingdao Special Food Research Institute, Wuxi 214122, China
|
25
|
Mey F, Clauwaert J, Van Brempt M, Stock M, Maertens J, Waegeman W, De Mey M. ProD: A Tool for Predictive Design of Tailored Promoters in Escherichia coli. Methods Mol Biol 2022; 2516:51-59. [PMID: 35922621 DOI: 10.1007/978-1-0716-2413-5_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
A major goal in synthetic biology is the engineering of synthetic gene circuits with a predictable, controlled and designed outcome. This creates a need for building blocks that can modulate gene expression without interference with the native cell system. A tool allowing forward engineering of promoters with predictable transcription initiation frequency is still lacking. Promoter libraries specific for σ70 to ensure the orthogonality of gene expression were built in Escherichia coli and labeled using fluorescence-activated cell sorting to obtain high-throughput DNA sequencing data to train a convolutional neural network. We were able to confirm in vivo that the model is able to predict the promoter transcription initiation frequency (TIF) of new promoter sequences. Here, we provide an online tool for promoter design (ProD) in E. coli, which can be used to tailor output sequences of desired promoter TIF or predict the TIF of a custom sequence.
Affiliation(s)
- Friederike Mey
- Centre for Synthetic Biology (CSB), Department of Biotechnology, Ghent University, Ghent, Belgium
| | - Jim Clauwaert
- KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
| | - Maarten Van Brempt
- Centre for Synthetic Biology (CSB), Department of Biotechnology, Ghent University, Ghent, Belgium
| | - Michiel Stock
- KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
| | - Jo Maertens
- Centre for Synthetic Biology (CSB), Department of Biotechnology, Ghent University, Ghent, Belgium
| | - Willem Waegeman
- KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
| | - Marjan De Mey
- Centre for Synthetic Biology (CSB), Department of Biotechnology, Ghent University, Ghent, Belgium.
|
26
|
Wan X, Saltepe B, Yu L, Wang B. Programming living sensors for environment, health and biomanufacturing. Microb Biotechnol 2021; 14:2334-2342. [PMID: 33960658 PMCID: PMC8601174 DOI: 10.1111/1751-7915.13820] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/19/2021] [Revised: 04/05/2021] [Accepted: 04/11/2021] [Indexed: 01/10/2023] Open
Abstract
Synthetic biology offers new tools and capabilities of engineering cells with desired functions for example as new biosensing platforms leveraging engineered microbes. In the last two decades, bacterial cells have been programmed to sense and respond to various input cues for versatile purposes including environmental monitoring, disease diagnosis and adaptive biomanufacturing. Despite demonstrated proof-of-concept success in the laboratory, the real-world applications of microbial sensors have been restricted due to certain technical and societal limitations. Yet, most limitations can be addressed by new technological developments in synthetic biology such as circuit design, biocontainment and machine learning. Here, we summarize the latest advances in synthetic biology and discuss how they could accelerate the development, enhance the performance and address the present limitations of microbial sensors to facilitate their use in the field. We view that programmable living sensors are promising sensing platforms to achieve sustainable, affordable and easy-to-use on-site detection in diverse settings.
Affiliation(s)
- Xinyi Wan
- Centre for Synthetic and Systems Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3FF, UK
- Hangzhou Innovation Center, Zhejiang University, Hangzhou 311200, China
| | - Behide Saltepe
- Centre for Synthetic and Systems Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3FF, UK
| | - Luyang Yu
- The Provincial International Science and Technology Cooperation Base for Engineering Biology, International Campus, Zhejiang University, Haining 314400, China
- College of Life Sciences, Zhejiang University, Hangzhou 310058, China
| | - Baojun Wang
- Centre for Synthetic and Systems Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3FF, UK
- Hangzhou Innovation Center, Zhejiang University, Hangzhou 311200, China
- The Provincial International Science and Technology Cooperation Base for Engineering Biology, International Campus, Zhejiang University, Haining 314400, China
- College of Life Sciences, Zhejiang University, Hangzhou 310058, China
|
27
|
Munro LJ, Kell DB. Intelligent host engineering for metabolic flux optimisation in biotechnology. Biochem J 2021; 478:3685-3721. [PMID: 34673920 PMCID: PMC8589332 DOI: 10.1042/bcj20210535] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/25/2021] [Revised: 09/22/2021] [Accepted: 09/24/2021] [Indexed: 12/13/2022]
Abstract
Optimising the function of a protein of length N amino acids by directed evolution involves navigating a 'search space' of possible sequences of some 20^N. Optimising the expression levels of P proteins that materially affect host performance, each of which might also take 20 (logarithmically spaced) values, implies a similar search space of 20^P. In this combinatorial sense, then, the problems of directed protein evolution and of host engineering are broadly equivalent. In practice, however, they have different means for avoiding the inevitable difficulties of implementation. The spare capacity exhibited in metabolic networks implies that host engineering may admit substantial increases in flux to targets of interest. Thus, we rehearse the relevant issues for those wishing to understand and exploit those modern genome-wide host engineering tools and thinking that have been designed and developed to optimise fluxes towards desirable products in biotechnological processes, with a focus on microbial systems. The aim throughout is 'making such biology predictable'. Strategies have been aimed at both transcription and translation, especially for regulatory processes that can affect multiple targets. However, because there is a limit on how much protein a cell can produce, increasing kcat in selected targets may be a better strategy than increasing protein expression levels for optimal host engineering.
Affiliation(s)
- Lachlan J. Munro
- Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
| | - Douglas B. Kell
- Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St, Liverpool L69 7ZB, U.K
- Mellizyme Biotechnology Ltd, IC1, Liverpool Science Park, 131 Mount Pleasant, Liverpool L3 5TF, U.K
|
28
|
Mey F, Clauwaert J, Van Huffel K, Waegeman W, De Mey M. Improving the performance of machine learning models for biotechnology: The quest for deus ex machina. Biotechnol Adv 2021; 53:107858. [PMID: 34695560 DOI: 10.1016/j.biotechadv.2021.107858] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 05/28/2021] [Revised: 10/13/2021] [Accepted: 10/14/2021] [Indexed: 11/24/2022]
Abstract
Machine learning is becoming an integral part of the Design-Build-Test-Learn cycle in biotechnology. Machine learning models learn from collected datasets such as omics data and predict a defined outcome, which has led to both production improvements and predictive tools in the field. Robust prediction of the behavior of microbial cell factories and production processes not only greatly increases our understanding of the function of such systems, but also provides significant savings of development time. However, many pitfalls when modeling biological data - bad fit, noisy data, model instability, low data quantity and imbalances in the data - cause models to suffer in their performance. Here we provide an accessible, in-depth analysis on the problems created by these pitfalls, as well as means of their detection and mediation, with a focus on supervised learning. Assessing the state of the art, we show that, currently, in-depth analyses of model performance are often absent and must be improved. This review provides a toolbox for the analysis of model robustness and performance, and simultaneously proposes a standard for the community to facilitate future work. It is further accompanied by an interactive online tutorial on the discussed issues.
Affiliation(s)
- Friederike Mey
- Centre for Synthetic Biology (CSB), Department of Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Jim Clauwaert
- KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, 9000 Ghent, Belgium
| | - Kirsten Van Huffel
- Centre for Synthetic Biology (CSB), Department of Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Willem Waegeman
- KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, 9000 Ghent, Belgium
| | - Marjan De Mey
- Centre for Synthetic Biology (CSB), Department of Biotechnology, Ghent University, 9000 Ghent, Belgium.
|
29
|
Zrimec J, Buric F, Kokina M, Garcia V, Zelezniak A. Learning the Regulatory Code of Gene Expression. Front Mol Biosci 2021; 8:673363. [PMID: 34179082 PMCID: PMC8223075 DOI: 10.3389/fmolb.2021.673363] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/27/2021] [Accepted: 05/24/2021] [Indexed: 11/13/2022] Open
Abstract
Data-driven machine learning is the method of choice for predicting molecular phenotypes from nucleotide sequence, modeling gene expression events including protein-DNA binding, chromatin states as well as mRNA and protein levels. Deep neural networks automatically learn informative sequence representations and interpreting them enables us to improve our understanding of the regulatory code governing gene expression. Here, we review the latest developments that apply shallow or deep learning to quantify molecular phenotypes and decode the cis-regulatory grammar from prokaryotic and eukaryotic sequencing data. Our approach is to build from the ground up, first focusing on the initiating protein-DNA interactions, then specific coding and non-coding regions, and finally on advances that combine multiple parts of the gene and mRNA regulatory structures, achieving unprecedented performance. We thus provide a quantitative view of gene expression regulation from nucleotide sequence, concluding with an information-centric overview of the central dogma of molecular biology.
Affiliation(s)
- Jan Zrimec
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
| | - Filip Buric
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
| | - Mariia Kokina
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Victor Garcia
- School of Life Sciences and Facility Management, Zurich University of Applied Sciences, Wädenswil, Switzerland
| | - Aleksej Zelezniak
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Science for Life Laboratory, Stockholm, Sweden
|
30
|
Recent advances in tuning the expression and regulation of genes for constructing microbial cell factories. Biotechnol Adv 2021; 50:107767. [PMID: 33974979 DOI: 10.1016/j.biotechadv.2021.107767] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 12/01/2020] [Revised: 04/29/2021] [Accepted: 05/05/2021] [Indexed: 12/14/2022]
Abstract
To overcome environmental problems caused by the use of fossil resources, microbial cell factories have become a promising technique for the sustainable and eco-friendly development of valuable products from renewable resources. Constructing microbial cell factories with high titers, yields, and productivity requires a balance between growth and production; to this end, tuning gene expression and regulation is necessary to optimise and precisely control complicated metabolic fluxes. In this article, we review the current trends and advances in tuning gene expression and regulation and consider their engineering at each of the three stages of gene regulation: genomic, mRNA, and protein. In particular, the technological approaches utilised in a diverse range of genetic-engineering-based tools for the construction of microbial cell factories are reviewed and representative applications of these strategies are presented. Finally, the prospects for strategies and systems for tuning gene expression and regulation are discussed.
|
31
|
de Dios R, Santero E, Reyes-Ramírez F. Extracytoplasmic Function σ Factors as Tools for Coordinating Stress Responses. Int J Mol Sci 2021; 22:3900. [PMID: 33918849 PMCID: PMC8103513 DOI: 10.3390/ijms22083900] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/15/2021] [Revised: 04/05/2021] [Accepted: 04/07/2021] [Indexed: 01/03/2023] Open
Abstract
The ability of bacterial core RNA polymerase (RNAP) to interact with different σ factors, thereby forming a variety of holoenzymes with different specificities, represents a powerful tool to coordinately reprogram gene expression. Extracytoplasmic function σ factors (ECFs), which are the largest and most diverse family of alternative σ factors, frequently participate in stress responses. The classification of ECFs in 157 different groups according to their phylogenetic relationships and genomic context has revealed their diversity. Here, we have clustered 55 ECF groups with experimentally studied representatives into two broad classes of stress responses. The remaining 102 groups still lack any mechanistic or functional insight, representing a myriad of systems yet to explore. In this work, we review the main features of ECFs and discuss the different mechanisms controlling their production and activity, and how they lead to a functional stress response. Finally, we focus in more detail on two well-characterized ECFs, for which the mechanisms to detect and respond to stress are complex and completely different: Escherichia coli RpoE, which is the best characterized ECF and whose structural and functional studies have provided key insights into the transcription initiation by ECF-RNAP holoenzymes, and the ECF15-type EcfG, the master regulator of the general stress response in Alphaproteobacteria.
|