1
Bennett AR, Bojar D. Syntactic sugars: crafting a regular expression framework for glycan structures. Bioinform Adv 2024; 4:vbae059. [PMID: 38708029 PMCID: PMC11069104 DOI: 10.1093/bioadv/vbae059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Revised: 03/15/2024] [Accepted: 04/17/2024] [Indexed: 05/07/2024]
Abstract
Motivation: Structural analysis of glycans poses significant challenges in glycobiology due to their complex sequences. Research questions, such as analyzing the sequence content of the α1-6 branch in N-glycans, are biologically meaningful yet can be hard to automate. Results: Here, we introduce a regular expression system that is designed for glycans, feature-complete, and closely aligned with regular expression formatting. We use this to annotate glycan motifs of arbitrary complexity, perform differential expression analysis on designated sequence stretches, or elucidate branch-specific binding specificities of lectins in an automated manner. We are confident that glycan regular expressions will empower computational analyses of these sequences. Availability and implementation: Our regular expression framework for glycans is implemented in Python and is incorporated into the open-source glycowork package (version 1.1+). Code and documentation are available at https://github.com/BojarLab/glycowork/blob/master/glycowork/motif/regex.py.
Affiliation(s)
- Alexander R Bennett
- Department of Medical Biochemistry, Institute of Biomedicine, University of Gothenburg, 41390 Gothenburg, Sweden
- Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, 41390 Gothenburg, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden
2
Lundstrøm J, Thomès L, Bojar D. Protocol for constructing glycan biosynthetic networks using glycowork. STAR Protoc 2024; 5:102937. [PMID: 38630592 PMCID: PMC11036093 DOI: 10.1016/j.xpro.2024.102937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 01/09/2024] [Accepted: 02/19/2024] [Indexed: 04/19/2024] Open
Abstract
Glycans, present across all domains of life, comprise a wide range of monosaccharides assembled into complex, branching structures. Here, we present an in silico protocol to construct biosynthetic networks from a list of observed glycans using the Python package glycowork. We describe steps for data preparation, network construction, feature analysis, and data export. This protocol is implemented in Python using example data and can be adapted for use with customized datasets. For complete details on the use and execution of this protocol, please refer to Thomès et al. [1].
Affiliation(s)
- Jon Lundstrøm
- Department of Chemistry and Molecular Biology, University of Gothenburg, 41390 Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden.
- Luc Thomès
- University Lille, CHU Lille, ULR 7364 - RADEME - Maladies RAres du DÉveloppement embryonnaire et du Métabolisme, 59000 Lille, France
- Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, 41390 Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden.
3
Lundstrøm J, Bojar D. The evolving world of milk oligosaccharides: Biochemical diversity understood by computational advances. Carbohydr Res 2024; 537:109069. [PMID: 38402731 DOI: 10.1016/j.carres.2024.109069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 02/20/2024] [Accepted: 02/21/2024] [Indexed: 02/27/2024]
Abstract
Milk oligosaccharides, complex carbohydrates unique to mammalian milk, play crucial roles in infant nutrition and immune development. This review explores their biochemical diversity, tracing the evolutionary paths that have led to their variation across different species. We highlight the intersection of nutrition, biology, and chemistry in understanding these compounds. Additionally, we discuss the latest computational methods and analytical techniques that have revolutionized the study of milk oligosaccharides, offering insights into their structural complexity and functional roles. This brief but essential review aims not only to provide a deeper understanding of milk oligosaccharides but also to discuss the road toward their potential applications.
Affiliation(s)
- Jon Lundstrøm
- Department of Chemistry and Molecular Biology, University of Gothenburg, 41390, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390, Gothenburg, Sweden
- Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, 41390, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390, Gothenburg, Sweden.
4
Lundstrøm J, Gillon E, Chazalet V, Kerekes N, Di Maio A, Feizi T, Liu Y, Varrot A, Bojar D. Elucidating the glycan-binding specificity and structure of Cucumis melo agglutinin, a new R-type lectin. Beilstein J Org Chem 2024; 20:306-320. [PMID: 38410776 PMCID: PMC10896221 DOI: 10.3762/bjoc.20.31] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 02/09/2024] [Indexed: 02/28/2024] Open
Abstract
Plant lectins have garnered attention for their roles as laboratory probes and potential therapeutics. Here, we report the discovery and characterization of Cucumis melo agglutinin (CMA1), a new R-type lectin from melon. Our findings reveal CMA1's unique glycan-binding profile, mechanistically explained by its 3D structure, augmenting our understanding of R-type lectins. We expressed CMA1 recombinantly and assessed its binding specificity using multiple glycan arrays, covering 1,046 unique sequences. This resulted in a complex binding profile, strongly preferring C2-substituted, beta-linked galactose (both GalNAc and Fuca1-2Gal), which we contrasted with the established R-type lectin Ricinus communis agglutinin 1 (RCA1). We also report binding of specific glycosaminoglycan subtypes and a general enhancement of binding by sulfation. Further validation using agglutination, thermal shift assays, and surface plasmon resonance confirmed and quantified this binding specificity in solution. Finally, we solved the high-resolution structure of the CMA1 N-terminal domain using X-ray crystallography, supporting our functional findings at the molecular level. Our study provides a comprehensive understanding of CMA1, laying the groundwork for further exploration of its biological and therapeutic potential.
Affiliation(s)
- Jon Lundstrøm
- Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 7B, 413 90 Gothenburg, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 413 90 Gothenburg, Sweden
- Emilie Gillon
- Univ. Grenoble Alpes, CNRS, CERMAV, 601 Rue de la Chimie, 38610 Gières, France
- Valérie Chazalet
- Univ. Grenoble Alpes, CNRS, CERMAV, 601 Rue de la Chimie, 38610 Gières, France
- Nicole Kerekes
- Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 7B, 413 90 Gothenburg, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 413 90 Gothenburg, Sweden
- Antonio Di Maio
- Glycosciences Laboratory, Faculty of Medicine, Imperial College London, Du Cane Rd, London W12 0NN, United Kingdom
- Ten Feizi
- Glycosciences Laboratory, Faculty of Medicine, Imperial College London, Du Cane Rd, London W12 0NN, United Kingdom
- Yan Liu
- Glycosciences Laboratory, Faculty of Medicine, Imperial College London, Du Cane Rd, London W12 0NN, United Kingdom
- Annabelle Varrot
- Univ. Grenoble Alpes, CNRS, CERMAV, 601 Rue de la Chimie, 38610 Gières, France
- Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 7B, 413 90 Gothenburg, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 413 90 Gothenburg, Sweden
5
Lundstrøm J, Urban J, Thomès L, Bojar D. GlycoDraw: a python implementation for generating high-quality glycan figures. Glycobiology 2023; 33:927-934. [PMID: 37498172 PMCID: PMC10859633 DOI: 10.1093/glycob/cwad063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2023] [Revised: 07/14/2023] [Accepted: 07/26/2023] [Indexed: 07/28/2023] Open
Abstract
Glycans are essential to all scales of biology, with their intricate structures being crucial for their biological functions. The structural complexity of glycans is communicated through simplified and unified visual representations according to the Symbol Nomenclature for Glycans (SNFG) guidelines adopted by the community. Here, we introduce GlycoDraw, a Python-native implementation for high-throughput generation of high-quality, SNFG-compliant glycan figures with flexible display options. GlycoDraw is released as part of our glycan analysis ecosystem, glycowork, facilitating integration into existing workflows by enabling fully automated annotation of glycan-related figures and thus assisting the analysis of, e.g., differential abundance data or glycomics mass spectra.
Affiliation(s)
- Jon Lundstrøm
- Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
- James Urban
- Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
- Luc Thomès
- Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
- Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
6
Lundstrøm J, Urban J, Bojar D. Decoding glycomics with a suite of methods for differential expression analysis. Cell Rep Methods 2023; 3:100652. [PMID: 37992708 PMCID: PMC10753297 DOI: 10.1016/j.crmeth.2023.100652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 10/04/2023] [Accepted: 10/30/2023] [Indexed: 11/24/2023]
Abstract
Glycomics, the comprehensive profiling of all glycan structures in samples, is rapidly expanding to enable insights into physiology and disease mechanisms. However, glycan structure complexity and glycomics data interpretation present challenges, especially for differential expression analysis. Here, we present a framework for differential glycomics expression analysis. Our methodology encompasses specialized and domain-informed methods for data normalization and imputation, glycan motif extraction and quantification, differential expression analysis, motif enrichment analysis, time series analysis, and meta-analytic capabilities, synthesizing results across multiple studies. All methods are integrated into our open-source glycowork package, facilitating performant workflows and user-friendly access. We demonstrate these methods using dedicated simulations and glycomics datasets of N-, O-, lipid-linked, and free glycans. Differential expression tests here focus on human datasets and cancer vs. healthy tissue comparisons. Our rigorous approach allows for robust, reliable, and comprehensive differential expression analyses in glycomics, contributing to advancing glycomics research and its translation to clinical and diagnostic applications.
Affiliation(s)
- Jon Lundstrøm
- Department of Chemistry and Molecular Biology, University of Gothenburg, 41390 Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden
- James Urban
- Department of Chemistry and Molecular Biology, University of Gothenburg, 41390 Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden
- Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, 41390 Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden.
7
Jin C, Lundstrøm J, Korhonen E, Luis AS, Bojar D. Breast Milk Oligosaccharides Contain Immunomodulatory Glucuronic Acid and LacdiNAc. Mol Cell Proteomics 2023; 22:100635. [PMID: 37597722 PMCID: PMC10509713 DOI: 10.1016/j.mcpro.2023.100635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Revised: 07/31/2023] [Accepted: 08/16/2023] [Indexed: 08/21/2023] Open
Abstract
Breast milk is abundant with functionalized milk oligosaccharides (MOs) to nourish and protect the neonate. Yet we lack a comprehensive understanding of the repertoire and evolution of MOs across Mammalia. We report ∼400 MO-species associations (>100 novel structures) from milk glycomics of nine mostly understudied species: alpaca, beluga whale, black rhinoceros, bottlenose dolphin, impala, L'Hoest's monkey, pygmy hippopotamus, domestic sheep, and striped dolphin. This revealed the hitherto unknown existence of the LacdiNAc motif (GalNAcβ1-4GlcNAc) in MOs of all species except alpaca, sheep, and striped dolphin, indicating the widespread occurrence of this potentially antimicrobial motif in MOs. We also characterize glucuronic acid-containing MOs in the milk of impala, dolphins, sheep, and rhinoceros, previously only reported in cows. We demonstrate that these GlcA-MOs exhibit potent immunomodulatory effects. Our study extends the number of known MOs by >15%. Combined with >1900 curated MO-species associations, we characterize MO motif distributions, presenting an exhaustive overview of MO biodiversity.
Affiliation(s)
- Chunsheng Jin
- Proteomics Core Facility at Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Jon Lundstrøm
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Emma Korhonen
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Ana S Luis
- Department of Medical Biochemistry and Cell Biology, University of Gothenburg, Gothenburg, Sweden
- Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden.
8
Thomès L, Karlsson V, Lundstrøm J, Bojar D. Mammalian milk glycomes: Connecting the dots between evolutionary conservation and biosynthetic pathways. Cell Rep 2023; 42:112710. [PMID: 37379211 DOI: 10.1016/j.celrep.2023.112710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Revised: 05/09/2023] [Accepted: 06/12/2023] [Indexed: 06/30/2023] Open
Abstract
Milk oligosaccharides (MOs) are among the most abundant constituents of breast milk and are essential for health and development. Biosynthesized from monosaccharides into complex sequences, MOs differ considerably between taxonomic groups. Even human MO biosynthesis is insufficiently understood, hampering evolutionary and functional analyses. Using a comprehensive resource of all published MOs from >100 mammals, we develop a pipeline for generating and analyzing MO biosynthetic networks. We then use evolutionary relationships and inferred intermediates of these networks to discover (1) systematic glycome biases, (2) biosynthetic restrictions, such as reaction path preference, and (3) conserved biosynthetic modules. This allows us to prune and pinpoint biosynthetic pathways despite missing information. Machine learning and network analysis cluster species by their milk glycome, identifying characteristic sequence relationships and evolutionary gains/losses of motifs, MOs, and biosynthetic modules. These resources and analyses will advance our understanding of glycan biosynthesis and the evolution of breast milk.
Affiliation(s)
- Luc Thomès
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Viktoria Karlsson
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Jon Lundstrøm
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden.
9
Joeres R, Bojar D, Kalinina OV. GlyLES: Grammar-based Parsing of Glycans from IUPAC-condensed to SMILES. J Cheminform 2023; 15:37. [PMID: 36959676 PMCID: PMC10035253 DOI: 10.1186/s13321-023-00704-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 02/18/2023] [Indexed: 03/25/2023] Open
Abstract
Glycans are important polysaccharides on cellular surfaces that are bound to glycoproteins and glycolipids. These are one of the most common post-translational modifications of proteins in eukaryotic cells. They play important roles in protein folding, cell-cell interactions, and other extracellular processes. Changes in glycan structures may influence the course of different diseases, such as infections or cancer. Glycans are commonly represented using the IUPAC-condensed notation. IUPAC-condensed is a textual representation of glycans operating on the same topological level as the Symbol Nomenclature for Glycans (SNFG) that assigns colored, geometrical shapes to the main monomers. These symbols are then connected in tree-like structures, visualizing the glycan structure on a topological level. Yet for a representation on the atomic level, notations such as SMILES should be used. To our knowledge, there is no easy-to-use, general, open-source, and offline tool to convert the IUPAC-condensed notation to SMILES. Here, we present the open-access Python package GlyLES for the generalizable generation of SMILES representations out of IUPAC-condensed representations. GlyLES uses a grammar to read in the monomer tree from the IUPAC-condensed notation. From this tree, the tool can compute the atomic structures of each monomer based on their IUPAC-condensed descriptions. In the last step, it merges all monomers into the atomic structure of a glycan in the SMILES notation. GlyLES is the first package that allows conversion from the IUPAC-condensed notation of glycans to SMILES strings. This may have multiple applications, including straightforward visualization, substructure search, molecular modeling and docking, and a new featurization strategy for machine-learning algorithms. GlyLES is available at https://github.com/kalininalab/GlyLES.
Affiliation(s)
- Roman Joeres
- Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbruecken, Germany (GRID grid.7490.a; ISNI 0000 0001 2238 295X)
- Center for Bioinformatics, Saarland University, Saarbruecken, Germany (GRID grid.11749.3a; ISNI 0000 0001 2167 7588)
- Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden (GRID grid.8761.8; ISNI 0000 0000 9919 9582)
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden (GRID grid.8761.8; ISNI 0000 0000 9919 9582)
- Olga V. Kalinina
- Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbruecken, Germany (GRID grid.7490.a; ISNI 0000 0001 2238 295X)
- Center for Bioinformatics, Saarland University, Saarbruecken, Germany (GRID grid.11749.3a; ISNI 0000 0001 2167 7588)
- Faculty of Medicine, Saarland University, Homburg, Germany (GRID grid.11749.3a; ISNI 0000 0001 2167 7588)
10
Bojar D, Meche L, Meng G, Eng W, Smith DF, Cummings RD, Mahal LK. A Useful Guide to Lectin Binding: Machine-Learning Directed Annotation of 57 Unique Lectin Specificities. ACS Chem Biol 2022; 17:2993-3012. [PMID: 35084820 PMCID: PMC9679999 DOI: 10.1021/acschembio.1c00689] [Citation(s) in RCA: 87] [Impact Index Per Article: 43.5] [Indexed: 02/06/2023]
Abstract
Glycans are critical to every facet of biology and medicine, from viral infections to embryogenesis. Tools to study glycans are rapidly evolving; however, the majority of our knowledge is deeply dependent on binding by glycan binding proteins (e.g., lectins). The specificities of lectins, which are often naturally isolated proteins, have not been well-defined, making it difficult to leverage their full potential for glycan analysis. Herein, we use a combination of machine learning algorithms and expert annotation to define lectin specificity for this important probe set. Our analysis uses comprehensive glycan microarray analysis of commercially available lectins we obtained using version 5.0 of the Consortium for Functional Glycomics glycan microarray (CFGv5). This data set was made public in 2011. We report the creation of this data set and its use in large-scale evaluation of lectin-glycan binding behaviors. Our motif analysis was performed by integrating 68 manually defined glycan features with systematic probing of computational rules for significant binding motifs using mono- and disaccharides and linkages. Combining machine learning with manual annotation, we create a detailed interpretation of glycan-binding specificity for 57 unique lectins, categorized by their major binding motifs: mannose, complex-type N-glycan, O-glycan, fucose, sialic acid and sulfate, GlcNAc and chitin, Gal and LacNAc, and GalNAc. Our work provides fresh insights into the complex binding features of commercially available lectins in current use, providing a critical guide to these important reagents.
Affiliation(s)
- Daniel Bojar
- Department of Chemistry and Molecular Biology and Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden 405 30
- Lawrence Meche
- Biomedical Chemistry Institute, Department of Chemistry, New York University, 100 Washington Square East, Room 1001, New York, New York 10003, United States
- Guanmin Meng
- Department of Chemistry, University of Alberta, Edmonton, Canada, T6G 2G2
- William Eng
- Biomedical Chemistry Institute, Department of Chemistry, New York University, 100 Washington Square East, Room 1001, New York, New York 10003, United States
- David F. Smith
- Department of Biochemistry, Glycomics Center, School of Medicine, Emory University, Atlanta, Georgia 30322, United States
- Richard D. Cummings
- Department of Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts 02115, United States
- Lara K. Mahal
- Biomedical Chemistry Institute, Department of Chemistry, New York University, 100 Washington Square East, Room 1001, New York, New York 10003, United States
- Department of Chemistry, University of Alberta, Edmonton, Canada, T6G 2G2
11
Abstract
Artificial intelligence (AI) methods have been and are now being increasingly integrated in prediction software implemented in bioinformatics and its glycoscience branch known as glycoinformatics. AI techniques have evolved in the past decades, and their applications in glycoscience are not yet widespread. This limited use is partly explained by the peculiarities of glyco-data that are notoriously hard to produce and analyze. Nonetheless, as time goes on, the accumulation of glycomics, glycoproteomics, and glycan-binding data has reached a point where even the most recent deep learning methods can provide predictors with good performance. We discuss the historical development of the application of various AI methods in the broader field of glycoinformatics. A particular focus is placed on shining a light on challenges in glyco-data handling, contextualized by lessons learnt from related disciplines. Ending on the discussion of state-of-the-art deep learning approaches in glycoinformatics, we also envision the future of glycoinformatics, including developments that need to occur in order to truly unleash the capabilities of glycoscience in the systems biology era.
Affiliation(s)
- Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg 41390, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg 41390, Sweden
- Frederique Lisacek
- Proteome Informatics Group, Swiss Institute of Bioinformatics, CH-1227 Geneva, Switzerland
- Computer Science Department & Section of Biology, University of Geneva, route de Drize 7, CH-1227, Geneva, Switzerland
12
Qin R, Mahal LK, Bojar D. Deep learning explains the biology of branched glycans from single-cell sequencing data. iScience 2022; 25:105163. [PMID: 36217547 PMCID: PMC9547197 DOI: 10.1016/j.isci.2022.105163] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/03/2022] [Revised: 09/06/2022] [Accepted: 09/16/2022] [Indexed: 11/03/2022] Open
Abstract
Glycosylation is ubiquitous and often dysregulated in disease. However, the regulation and functional significance of various types of glycosylation at cellular levels is hard to unravel experimentally. Multi-omics, single-cell measurements such as SUGAR-seq, which quantifies transcriptomes and cell surface glycans, facilitate addressing this issue. Using SUGAR-seq data, we pioneered a deep learning model to predict the glycan phenotypes of cells (mouse T lymphocytes) from transcripts, with the example of predicting β1,6GlcNAc-branching across T cell subtypes (test set F1 score: 0.9351). Model interpretation via SHAP (SHapley Additive exPlanations) identified highly predictive genes, in part known to impact (i) branched glycan levels and (ii) the biology of branched glycans. These genes included physiologically relevant low-abundance genes that were not captured by conventional differential expression analysis. Our work shows that interpretable deep learning models are promising for uncovering novel functions and regulatory mechanisms of glycans from integrated transcriptomic and glycomic datasets.
Affiliation(s)
- Rui Qin
- Department of Chemistry, University of Alberta, Edmonton, AB T6G 2G2, Canada
- Lara K. Mahal
- Department of Chemistry, University of Alberta, Edmonton, AB T6G 2G2, Canada
- Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, 405 30 Gothenburg, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 405 30 Gothenburg, Sweden
13
Lundstrøm J, Bojar D. Structural insights into host-microbe glycointeractions. Curr Opin Struct Biol 2022; 73:102337. [PMID: 35182928 DOI: 10.1016/j.sbi.2022.102337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2021] [Revised: 10/30/2021] [Accepted: 01/14/2022] [Indexed: 11/03/2022]
Abstract
Despite their ubiquitous presence in biological systems, glycans have historically received less attention than they deserved. Investigations in recent years have featured important findings about the role of glycans in regulating the human gut microbiota. Here, we present a brief overview of current trends that shape future directions of computational and experimental research approaches and add to our understanding of host-microbe glycointeractions.
Affiliation(s)
- Jon Lundstrøm
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden. https://twitter.com/jonlundstrm
- Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden.
14
Lundstrøm J, Korhonen E, Lisacek F, Bojar D. LectinOracle: A Generalizable Deep Learning Model for Lectin-Glycan Binding Prediction. Adv Sci (Weinh) 2022; 9:e2103807. [PMID: 34862760 PMCID: PMC8728848 DOI: 10.1002/advs.202103807] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/30/2021] [Revised: 11/03/2021] [Indexed: 05/07/2023]
Abstract
Ranging from bacterial cell adhesion over viral cell entry to human innate immunity, glycan-binding proteins or lectins abound in nature. Widely used as staining and characterization reagents in cell biology and crucial for understanding the interactions in biological systems, lectins are a focal point of study in glycobiology. Yet the sheer breadth and depth of specificity for diverse oligosaccharide motifs has made studying lectins a largely piecemeal approach, with few options to generalize. Here, LectinOracle, a model combining transformer-based representations for proteins and graph convolutional neural networks for glycans to predict their interaction, is presented. Using a curated data set of 564,647 unique protein-glycan interactions, it is shown that LectinOracle predictions agree with literature-annotated specificities for a wide range of lectins. Using a range of specialized glycan arrays, it is shown that LectinOracle predictions generalize to new glycans and lectins, with qualitative and quantitative agreement with experimental data. It is further demonstrated that LectinOracle can be used to improve lectin classification, accelerate lectin directed evolution, predict epidemiological outcomes in the context of influenza virus, and analyze whole lectomes in host-microbe interactions. It is envisioned that the herein presented platform will advance both the study of lectins and their role in (glyco)biology.
Affiliation(s)
- Jon Lundstrøm
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg 41390, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg 41390, Sweden
| | - Emma Korhonen
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg 41390, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg 41390, Sweden
| | - Frédérique Lisacek
- Swiss Institute of Bioinformatics, Geneva 1227, Switzerland
- Computer Science Department, UniGe, Geneva 1227, Switzerland
- Section of Biology, UniGe, Geneva 1205, Switzerland
| | - Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg 41390, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg 41390, Sweden
| |
|
15
|
Thomès L, Burkholz R, Bojar D. Glycowork: A Python package for glycan data science and machine learning. Glycobiology 2021; 31:1240-1244. [PMID: 34192308 PMCID: PMC8600276 DOI: 10.1093/glycob/cwab067] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/23/2021] [Revised: 06/02/2021] [Accepted: 06/25/2021] [Indexed: 12/14/2022] Open
Abstract
While glycans are crucial for biological processes, existing analysis modalities make it difficult for researchers with limited computational background to include these diverse carbohydrates into workflows. Here, we present glycowork, an open-source Python package designed for glycan-related data science and machine learning by end users. Glycowork includes functions to, for instance, automatically annotate glycan motifs and analyze their distributions via heatmaps and statistical enrichment. We also provide visualization methods, routines to interact with stored databases, trained machine learning models and learned glycan representations. We envision that glycowork can extract further insights from glycan datasets and demonstrate this with workflows that analyze glycan motifs in various biological contexts. Glycowork can be freely accessed at https://github.com/BojarLab/glycowork/.
Affiliation(s)
- Luc Thomès
- Department of Chemistry and Molecular Biology and Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden
| | - Rebekka Burkholz
- Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA
| | - Daniel Bojar
- Department of Chemistry and Molecular Biology and Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden
| |
|
16
|
Abstract
The extraordinary diversity of glycans leads to large differences in the glycomes of different kingdoms of life. Yet, while most monosaccharides are solely found in certain taxonomic groups, there is a small set of monosaccharides with widespread distribution across nearly all domains of life. These general monosaccharides are particularly relevant for glycan motifs, as they can readily be used by commensals and pathogens to mimic host glycans or hijack existing glycan recognition systems. Among these, the monosaccharide fucose is especially interesting, as it frequently presents itself as a terminal monosaccharide, primed for interaction with proteins. Here, we analyze fucose-containing glycan motifs across all taxonomic kingdoms. Using a hereby presented large species-specific glycan dataset and a plethora of methods for glycan-focused bioinformatics and machine learning, we identify characteristic as well as shared fucose-containing glycan motifs for various taxonomic groups, demonstrating clear differences in fucose usage. Even within domains, fucose is used differentially based on an organism’s physiology and habitat. We particularly highlight differences in fucose-containing motifs between vertebrates and invertebrates. With the example of pathogenic and non-pathogenic Escherichia coli strains, we also demonstrate the importance of fucose-containing motifs in molecular mimicry and thereby pathogenic potential. We envision that this study will shed light on an important class of glycan motifs, with potential new insights into the role of fucosylated glycans in symbiosis, pathogenicity, and immunity.
Affiliation(s)
- Luc Thomès
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
| | - Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
| |
|
17
|
Burkholz R, Quackenbush J, Bojar D. Using graph convolutional neural networks to learn a representation for glycans. Cell Rep 2021; 35:109251. [PMID: 34133929 PMCID: PMC9208909 DOI: 10.1016/j.celrep.2021.109251] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/02/2021] [Revised: 05/05/2021] [Accepted: 05/24/2021] [Indexed: 02/06/2023] Open
Abstract
As the only nonlinear and the most diverse biological sequence, glycans offer substantial challenges for computational biology. These carbohydrates participate in nearly all biological processes—from protein folding to viral cell entry—yet are still not well understood. There are few computational methods to link glycan sequences to functions, and they do not fully leverage all available information about glycans. SweetNet is a graph convolutional neural network that uses graph representation learning to facilitate a computational understanding of glycobiology. SweetNet explicitly incorporates the nonlinear nature of glycans and establishes a framework to map any glycan sequence to a representation. We show that SweetNet outperforms other computational methods in predicting glycan properties on all reported tasks. More importantly, we show that glycan representations, learned by SweetNet, are predictive of organismal phenotypic and environmental properties. Finally, we use glycan-focused machine learning to predict viral glycan binding, which can be used to discover viral receptors. Burkholz et al. develop an analysis platform for glycans, using graph convolutional neural networks, that considers the branched nature of these carbohydrates. They demonstrate that glycan-focused machine learning can be employed for various purposes, such as to cluster species according to their glycomic similarity or to identify viral receptors.
Affiliation(s)
- Rebekka Burkholz
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
| | - John Quackenbush
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden.
| |
|
18
|
Strittmatter T, Egli S, Bertschi A, Plieninger R, Bojar D, Xie M, Fussenegger M. Gene switch for l-glucose-induced biopharmaceutical production in mammalian cells. Biotechnol Bioeng 2021; 118:2220-2233. [PMID: 33629358 DOI: 10.1002/bit.27730] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/18/2020] [Revised: 01/23/2021] [Accepted: 02/17/2021] [Indexed: 12/11/2022]
Abstract
In this study, we designed and built a gene switch that employs metabolically inert l-glucose to regulate transgene expression in mammalian cells via d-idonate-mediated control of the bacterial regulator LgnR. To this end, we engineered a metabolic cascade in mammalian cells to produce the inducer molecule d-idonate from its precursor l-glucose by ectopically expressing the Paracoccus species 43P-derived catabolic enzymes LgdA, LgnH, and LgnI. To obtain ON- and OFF-switches, we fused LgnR to the human transcriptional silencer domain Krüppel associated box (KRAB) and the viral trans-activator domain VP16, respectively. Thus, these artificial transcription factors KRAB-LgnR or VP16-LgnR modulated cognate promoters containing LgnR-specific binding sites in a d-idonate-dependent manner as a direct result of l-glucose metabolism. In a proof-of-concept experiment, we show that the switches can control production of the model biopharmaceutical rituximab in both transiently and stably transfected HEK-293T cells, as well as CHO-K1 cells. Rituximab production reached 5.9 µg/ml in stably transfected HEK-293T cells and 3.3 µg/ml in stably transfected CHO-K1 cells.
Affiliation(s)
- Tobias Strittmatter
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
| | - Sabina Egli
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
| | - Adrian Bertschi
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
| | - Richard Plieninger
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
| | - Daniel Bojar
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Cambridge, Massachusetts, USA
| | - Mingqi Xie
- Key Laboratory of Growth Regulation and Translational Research of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
| | - Martin Fussenegger
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; Faculty of Science, University of Basel, Mattenstrasse, Basel, Switzerland
| |
|
19
|
Bojar D, Powers RK, Camacho DM, Collins JJ. Deep-Learning Resources for Studying Glycan-Mediated Host-Microbe Interactions. Cell Host Microbe 2021; 29:132-144.e3. [DOI: 10.1016/j.chom.2020.10.004] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/29/2020] [Revised: 09/09/2020] [Accepted: 10/08/2020] [Indexed: 02/07/2023]
|
20
|
Affiliation(s)
- Daniel Bojar
- Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA; Department of Biological Engineering and Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| |
|
21
|
Bojar D, Fussenegger M. The Role of Protein Engineering in Biomedical Applications of Mammalian Synthetic Biology. Small 2020; 16:e1903093. [PMID: 31588687 DOI: 10.1002/smll.201903093] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/14/2019] [Revised: 09/05/2019] [Indexed: 06/10/2023]
Abstract
Engineered proteins with enhanced or altered functionality, generated for example by mutation or domain fusion, are at the core of nearly all synthetic biology endeavors in the context of precision medicine, also known as personalized medicine. From designer receptors sensing elevated blood markers to effectors rerouting signaling pathways to synthetic transcription factors and the customized therapeutics they regulate, engineered proteins play a crucial role at every step of novel therapeutic approaches using synthetic biology. Here, recent developments in protein engineering aided by advances in directed evolution, de novo design, and machine learning are discussed. Building on clinical successes already achieved with chimeric antigen receptor (CAR-) T cells and other cell-based therapies, these developments are expected to further enhance the capabilities of mammalian synthetic biology in biomedical and other applications.
Affiliation(s)
- Daniel Bojar
- ETH Zurich, Department of Biosystems Science and Engineering, Faculty of Life Science, University of Basel, Mattenstrasse 26, CH-4058, Basel, Switzerland
| | - Martin Fussenegger
- ETH Zurich, Department of Biosystems Science and Engineering, Faculty of Life Science, University of Basel, Mattenstrasse 26, CH-4058, Basel, Switzerland
| |
|
22
|
Bojar D. Synthetic bacterial stem cells and their multicellularity for synthetic biology and beyond. Synth Biol (Oxf) 2019; 4:ysz023. [PMID: 32995545 PMCID: PMC7445793 DOI: 10.1093/synbio/ysz023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
|
23
|
Bojar D. Speaking to nature: a deep learning representational model of proteins ushers in protein linguistics. Synth Biol (Oxf) 2019; 4:ysz013. [PMID: 32995538 PMCID: PMC7445760 DOI: 10.1093/synbio/ysz013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
|
24
|
Bojar D. Arch enemy no more: designing the first synthetic globular all-beta proteins with beta-arches. Synth Biol (Oxf) 2019; 4:ysz002. [PMID: 32995530 PMCID: PMC7445790 DOI: 10.1093/synbio/ysz002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
|
25
|
Bojar D. New adaptive laboratory evolution database highlights the need for consolidating directed evolution data. Synth Biol (Oxf) 2019; 4:ysz004. [PMID: 32995531 PMCID: PMC7445876 DOI: 10.1093/synbio/ysz004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
|
26
|
Bojar D, Fuhrer T, Fussenegger M. Purity by design: Reducing impurities in bioproduction by stimulus-controlled global translational downregulation of non-product proteins. Metab Eng 2018; 52:110-123. [PMID: 30468874 DOI: 10.1016/j.ymben.2018.11.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/30/2018] [Revised: 11/01/2018] [Accepted: 11/17/2018] [Indexed: 01/22/2023]
Abstract
Capitalizing on the ability of mammalian cells to conduct complex post-translational modifications, most protein therapeutics are currently produced in cell culture systems. Addition of a signal peptide to the product protein enables its accumulation in the cell culture supernatant, but separation of the product from endogenously secreted proteins remains costly and labor-intensive. We considered that global downregulation of translation of non-product proteins would be an efficient strategy to minimize downstream processing requirements. Therefore, taking advantage of the ability of mammalian protein kinase R (PKR) to switch off most cellular translation processes in response to infection by viruses, we fused a caffeine-inducible dimerization domain to the catalytic domain of PKR. Addition of caffeine to this construct results in homodimerization and activation of PKR, effectively rewiring rapid global translational downregulation to the addition of the stimulus in a dose-dependent manner. Then, to protect translation of the target therapeutic, we screened viral and cellular internal ribosomal entry sites (IRESes) known or suspected to be resistant to PKR-induced translational stress. After choosing the best-in-class Seneca valley virus (SVV) IRES, we additionally screened for IRES transactivation factors (ITAFs) as well as for supplementary small molecules to further boost the production titer of the product protein under conditions of global translational downregulation. Importantly, the residual global translation activity of roughly 10% under maximal downregulation is sufficient to maintain cellular viability during a production timeframe of at least five days. 
Standard industrially used adherent as well as suspension-adapted cell lines transfected with this synthetic biology-inspired Protein Kinase R-Enhanced Protein Production (PREPP) system could produce several medicinally relevant protein therapeutics, such as the blockbuster drug rituximab, in substantial quantities and with significantly higher purity than previous culture technologies. We believe incorporation of such purity-by-design technology in the production process will alleviate downstream processing bottlenecks in future biopharmaceutical manufacturing.
Affiliation(s)
- Daniel Bojar
- ETH Zurich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058 Basel, Switzerland
| | - Tobias Fuhrer
- ETH Zurich, Institute of Molecular Systems Biology, Auguste-Piccard-Hof 1, 8093 Zurich, Switzerland
| | - Martin Fussenegger
- ETH Zurich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058 Basel, Switzerland; Faculty of Life Science, University of Basel, Mattenstrasse 26, CH-4058 Basel, Switzerland.
| |
|
27
|
Affiliation(s)
- Daniel Bojar
- Dept. of Biosystems Science and Engineering; ETH Zurich; Basel Switzerland
| | - Martin Fussenegger
- Dept. of Biosystems Science and Engineering; ETH Zurich; Basel Switzerland
- Faculty of Science; University of Basel; Basel Switzerland
| |
|
28
|
Bojar D, Scheller L, Hamri GCE, Xie M, Fussenegger M. Caffeine-inducible gene switches controlling experimental diabetes. Nat Commun 2018; 9:2318. [PMID: 29921872 PMCID: PMC6008335 DOI: 10.1038/s41467-018-04744-1] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Received: 06/28/2017] [Accepted: 05/08/2018] [Indexed: 02/08/2023] Open
Abstract
Programming cellular behavior using trigger-inducible gene switches is integral to synthetic biology. Although significant progress has been achieved in trigger-induced transgene expression, side-effect-free remote control of transgenes continues to challenge cell-based therapies. Here, utilizing a caffeine-binding single-domain antibody we establish a caffeine-inducible protein dimerization system, enabling synthetic transcription factors and cell-surface receptors that enable transgene expression in response to physiologically relevant concentrations of caffeine generated by routine intake of beverages such as tea and coffee. Coffee containing different caffeine concentrations dose-dependently and reversibly controlled transgene expression by designer cells with this caffeine-stimulated advanced regulators (C-STAR) system. Type-2 diabetic mice implanted with microencapsulated, C-STAR-equipped cells for caffeine-sensitive expression of glucagon-like peptide 1 showed substantially improved glucose homeostasis after coffee consumption compared to untreated mice. Biopharmaceutical production control by caffeine, which is non-toxic, inexpensive and only present in specific beverages, is expected to improve patient compliance by integrating therapy with lifestyle.
Affiliation(s)
- Daniel Bojar
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058, Basel, Switzerland
| | - Leo Scheller
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058, Basel, Switzerland
| | - Ghislaine Charpin-El Hamri
- IUT, Département Génie Biologique, Institut Universitaire de Technologie, F-69622, Villeurbanne Cedex, France
| | - Mingqi Xie
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058, Basel, Switzerland
| | - Martin Fussenegger
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058, Basel, Switzerland; Faculty of Life Science, University of Basel, Mattenstrasse 26, CH-4058, Basel, Switzerland.
| |
|
29
|
Scheller L, Strittmatter T, Fuchs D, Bojar D, Fussenegger M. Generalized extracellular molecule sensor platform for programming cellular behavior. Nat Chem Biol 2018; 14:723-729. [DOI: 10.1038/s41589-018-0046-z] [Citation(s) in RCA: 72] [Impact Index Per Article: 12.0] [Received: 07/11/2017] [Accepted: 03/02/2018] [Indexed: 12/13/2022]
|
30
|
Kojima R, Bojar D, Rizzi G, Hamri GCE, El-Baba MD, Saxena P, Ausländer S, Tan KR, Fussenegger M. Designer exosomes produced by implanted cells intracerebrally deliver therapeutic cargo for Parkinson's disease treatment. Nat Commun 2018; 9:1305. [PMID: 29610454 PMCID: PMC5880805 DOI: 10.1038/s41467-018-03733-8] [Citation(s) in RCA: 399] [Impact Index Per Article: 66.5] [Received: 07/26/2017] [Accepted: 03/09/2018] [Indexed: 12/15/2022] Open
Abstract
Exosomes are cell-derived nanovesicles (50-150 nm), which mediate intercellular communication, and are candidate therapeutic agents. However, inefficiency of exosomal message transfer, such as mRNA, and lack of methods to create designer exosomes have hampered their development into therapeutic interventions. Here, we report a set of EXOsomal transfer into cells (EXOtic) devices that enable efficient, customizable production of designer exosomes in engineered mammalian cells. These genetically encoded devices in exosome producer cells enhance exosome production, specific mRNA packaging, and delivery of the mRNA into the cytosol of target cells, enabling efficient cell-to-cell communication without the need to concentrate exosomes. Further, engineered producer cells implanted in living mice could consistently deliver cargo mRNA to the brain. Therapeutic catalase mRNA delivery by designer exosomes attenuated neurotoxicity and neuroinflammation in in vitro and in vivo models of Parkinson's disease, indicating the potential usefulness of the EXOtic devices for RNA delivery-based therapeutic applications.
Affiliation(s)
- Ryosuke Kojima
- ETH Zürich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland
- Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan
| | - Daniel Bojar
- ETH Zürich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland
| | - Giorgio Rizzi
- Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056, Basel, Switzerland
| | - Ghislaine Charpin-El Hamri
- Département Génie Biologique, Institut Universitaire de Technologie (IUTA), F-69622, Villeurbanne Cedex, France
| | - Marie Daoud El-Baba
- Département Génie Biologique, Institut Universitaire de Technologie (IUTA), F-69622, Villeurbanne Cedex, France
| | - Pratik Saxena
- ETH Zürich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland
| | - Simon Ausländer
- ETH Zürich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland
| | - Kelly R Tan
- Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056, Basel, Switzerland
| | - Martin Fussenegger
- ETH Zürich, Department of Biosystems Science and Engineering, Mattenstrasse 26, 4058, Basel, Switzerland.
- Faculty of Life Science, University of Basel, Mattenstrasse 26, 4058, Basel, Switzerland.
| |
|
31
|
Abstract
Synthetic biology, the synthesis of engineering and biology, has rapidly matured and has dramatically increased the complexity of artificial gene circuits in recent years. The deployment of intricate synthetic gene circuits in mammalian cells requires the establishment of very precise and orthogonal control of transgene expression. In this chapter, we describe methods of modulating the expression of transgenes at the transcriptional level. Using cAMP-response element-binding protein (CREB)-dependent promoters as examples, a tool for the precise tuning of gene expression by using different core promoters and by varying the binding affinity of transcription factor operator sites is described.
Affiliation(s)
- Pratik Saxena
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, Basel, CH-4058, Switzerland
| | - Daniel Bojar
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, Basel, CH-4058, Switzerland
| | - Martin Fussenegger
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, Basel, CH-4058, Switzerland; Faculty of Science, University of Basel, Mattenstrasse 26, Basel, CH-4058, Switzerland.
| |
|
32
|
Bojar D, Martinez J, Santiago J, Rybin V, Bayliss R, Hothorn M. Crystal structures of the phosphorylated BRI1 kinase domain and implications for brassinosteroid signal initiation. Plant J 2014; 78:31-43. [PMID: 24461462 PMCID: PMC4260089 DOI: 10.1111/tpj.12445] [Citation(s) in RCA: 107] [Impact Index Per Article: 10.7] [Received: 12/01/2013] [Revised: 01/14/2014] [Accepted: 01/16/2014] [Indexed: 05/18/2023]
Abstract
Brassinosteroids, which control plant growth and development, are sensed by the membrane receptor kinase BRASSINOSTEROID INSENSITIVE 1 (BRI1). Brassinosteroid binding to the BRI1 leucine-rich repeat (LRR) domain induces heteromerisation with a SOMATIC EMBRYOGENESIS RECEPTOR KINASE (SERK)-family co-receptor. This process allows the cytoplasmic kinase domains of BRI1 and SERK to interact, trans-phosphorylate and activate each other. Here we report crystal structures of the BRI1 kinase domain in its activated form and in complex with nucleotides. BRI1 has structural features reminiscent of both serine/threonine and tyrosine kinases, providing insight into the evolution of dual-specificity kinases in plants. Phosphorylation of Thr1039, Ser1042 and Ser1044 causes formation of a catalytically competent activation loop. Mapping previously identified serine/threonine and tyrosine phosphorylation sites onto the structure, we analyse their contribution to brassinosteroid signaling. The location of known genetic missense alleles provides detailed insight into the BRI1 kinase mechanism, while our analyses are inconsistent with a previously reported guanylate cyclase activity. We identify a protein interaction surface on the C-terminal lobe of the kinase and demonstrate that the isolated BRI1, SERK2 and SERK3 cytoplasmic segments form homodimers in solution and have a weak tendency to heteromerise. We propose a model in which heterodimerisation of the BRI1 and SERK ectodomains brings their cytoplasmic kinase domains in a catalytically competent arrangement, an interaction that can be modulated by the BRI1 inhibitor protein BKI1.
Affiliation(s)
- Daniel Bojar
- Structural Plant Biology Lab, Friedrich Miescher Laboratory of the Max Planck Society, Spemannstrasse 39, 72076, Tuebingen, Germany
| | - Jacobo Martinez
- Structural Plant Biology Lab, Friedrich Miescher Laboratory of the Max Planck Society, Spemannstrasse 39, 72076, Tuebingen, Germany
| | - Julia Santiago
- Structural Plant Biology Lab, Friedrich Miescher Laboratory of the Max Planck Society, Spemannstrasse 39, 72076, Tuebingen, Germany
| | - Vladimir Rybin
- Protein Expression and Purification Core Facility, European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117, Heidelberg, Germany
| | - Richard Bayliss
- Department of Biochemistry, University of Leicester, Lancaster Road, Leicester, LE1 9HN, UK
| | - Michael Hothorn
- Structural Plant Biology Lab, Friedrich Miescher Laboratory of the Max Planck Society, Spemannstrasse 39, 72076, Tuebingen, Germany
- *For correspondence (e-mail )
| |
|