1
Akune-Taylor Y. Glyco you should know. Glycobiology 2025; 35:cwaf016. [PMID: 40111002 DOI: 10.1093/glycob/cwaf016] [Received: 03/16/2025] [Accepted: 03/17/2025] [Indexed: 03/22/2025] Open
Affiliation(s)
- Yukie Akune-Taylor
- Glycan and Life Systems Integration Center, Soka University, 1-236, Tangimachi, Hachioji City, Tokyo, 192-8577, Japan
- Metabolism Digestion and Reproduction, Faculty of Medicine, Imperial College London, London
2
Flevaris K, Kotidis P, Kontoravdi C. GlyCompute: towards the automated analysis of protein N-linked glycosylation kinetics via an open-source computational framework. Anal Bioanal Chem 2025; 417:957-972. [PMID: 39322800 PMCID: PMC11782420 DOI: 10.1007/s00216-024-05522-3] [Received: 05/30/2024] [Revised: 08/20/2024] [Accepted: 08/26/2024] [Indexed: 09/27/2024]
Abstract
Understanding the complex biosynthetic pathways of glycosylation is crucial for the expanding field of glycosciences. Computer-aided glycosylation analysis has greatly benefited in recent years from the development of tools found in web-based portals and open-source libraries. However, the in silico analysis of cellular glycosylation kinetics is underrepresented in current glycoscience-related tools and databases. This could be partly attributed to the limited accessibility of kinetic models developed using proprietary software and the difficulty in reliably parameterising such models. This work aims to address these challenges by proposing GlyCompute, an open-source framework demonstrating a novel, streamlined approach for the assembly, simulation, and parameterisation of kinetic models of protein N-linked glycosylation. Specifically, given one or more sets of experimentally observed N-glycan structures and their relative abundances, minimum representations of a glycosylation reaction network are generated. The topology of the resulting networks is then used to automatically assemble the material balances and kinetic mechanisms underpinning the mathematical model. To match the experimentally observed relative abundances, a sequential parameter estimation strategy using Bayesian inference is proposed, with stages determined automatically based on the underlying network topology. The proposed framework was tested on a case study involving the simultaneous fitting of the kinetic model to two protein N-linked glycoprofiles produced by the same CHO cell culture, showing good agreement with experimental observations. We envision that GlyCompute could help glycoscientists gain quantitative insights into the effect of enzyme kinetics and their perturbations on experimentally observed glycoprofiles in biomanufacturing and clinical settings.
Affiliation(s)
| | - Pavlos Kotidis
- Department of Chemical Engineering, Imperial, London, SW7 2AZ, UK
- Biopharm Process Research, GSK, Stevenage, UK
| | - Cleo Kontoravdi
- Department of Chemical Engineering, Imperial, London, SW7 2AZ, UK.
3
Urban J, Joeres R, Thomès L, Thomsson KA, Bojar D. Navigating the maze of mass spectra: a machine-learning guide to identifying diagnostic ions in O-glycan analysis. Anal Bioanal Chem 2025; 417:931-943. [PMID: 39180595 PMCID: PMC11782297 DOI: 10.1007/s00216-024-05500-9] [Received: 06/28/2024] [Revised: 08/06/2024] [Accepted: 08/09/2024] [Indexed: 08/26/2024]
Abstract
Structural details of oligosaccharides, or glycans, often carry biological relevance, which is why they are typically elucidated using tandem mass spectrometry. Common approaches to distinguish isomers rely on diagnostic glycan fragments for annotating topologies or linkages. Diagnostic fragments are often only known informally among practitioners or stem from individual studies, with unclear validity or generalizability, causing annotation heterogeneity and hampering new analysts. Drawing on a curated set of 237,000 O-glycomics spectra, we here present a rule-based machine learning workflow to uncover quantifiably valid and generalizable diagnostic fragments. This results in fragmentation rules to robustly distinguish common O-glycan isomers for reduced glycans in negative ion mode. We envision this resource to improve glycan annotation accuracy and concomitantly make annotations more transparent and homogeneous across analysts.
Affiliation(s)
- James Urban
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
| | - Roman Joeres
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Helmholtz Institute for Pharmaceutical Research Saarland, Helmholtz Centre for Infection Research, Saarbrücken, Germany
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
| | - Luc Thomès
- ULR 7364 - RADEME - Maladies RAres du DÉveloppement embryonnaire et du Métabolisme, CHU Lille, University Lille, 59000, Lille, France
| | - Kristina A Thomsson
- Proteomics Core Facility at Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden.
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden.
4
Bennett AR, Lundstrøm J, Chatterjee S, Thaysen-Andersen M, Bojar D. Compositional data analysis enables statistical rigor in comparative glycomics. Nat Commun 2025; 16:795. [PMID: 39824855 PMCID: PMC11748655 DOI: 10.1038/s41467-025-56249-3] [Received: 06/10/2024] [Accepted: 01/13/2025] [Indexed: 01/20/2025] Open
Abstract
Comparative glycomics data are compositional data, where measured glycans are parts of a whole, indicated by relative abundances. Applying traditional statistical analyses to these data often results in misleading conclusions, such as spurious "decreases" of glycans when other structures increase in abundance, or high false-positive rates for differential abundance. Our work introduces a compositional data analysis framework, tailored to comparative glycomics, to account for these data dependencies. We employ center log-ratio and additive log-ratio transformations, augmented with a scale uncertainty/information model, to introduce a statistically robust and sensitive data analysis pipeline. Applied to comparative glycomics datasets, including known glycan concentrations in defined mixtures, this approach controls false-positive rates and results in reproducible biological findings. Additionally, we present specialized analysis modalities: alpha- and beta-diversity analyze glycan distributions within and between samples, while cross-class glycan correlations shed light on previously undetected interdependencies. These approaches reveal insights into glycome variations that are critical to understanding roles of glycans in health and disease.
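The two transformations named in this abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names and the example abundances are hypothetical, the scale uncertainty/information model is omitted, and zero abundances are assumed to have been replaced beforehand (the logarithm is undefined at zero).

```python
import math

def clr(parts):
    """Center log-ratio: log of each part minus the mean of all logs
    (equivalently, log of each part over the geometric mean)."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def alr(parts, ref=-1):
    """Additive log-ratio: log of each part relative to one reference part.
    The reference part itself is dropped from the output."""
    ref_idx = ref % len(parts)
    log_ref = math.log(parts[ref_idx])
    return [math.log(p) - log_ref for i, p in enumerate(parts) if i != ref_idx]

# Hypothetical relative abundances of three glycans in one sample.
abundances = [0.5, 0.3, 0.2]
clr_values = clr(abundances)   # components sum to zero
alr_values = alr(abundances)   # two values, relative to the last glycan
```

Both transforms depend only on ratios between parts, so rescaling the whole sample (for example, from fractions to percentages) leaves the output unchanged.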
Affiliation(s)
- Alexander R Bennett
- Department of Medical Biochemistry, Institute of Biomedicine, University of Gothenburg, Gothenburg, Sweden
| | - Jon Lundstrøm
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
| | - Sayantani Chatterjee
- School of Natural Sciences, Faculty of Science and Engineering, Macquarie University, Sydney, NSW, Australia
| | - Morten Thaysen-Andersen
- School of Natural Sciences, Faculty of Science and Engineering, Macquarie University, Sydney, NSW, Australia
- Institute for Glyco-core Research (iGCORE), Nagoya University, Nagoya, Japan
| | - Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden.
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden.
5
Gheeraert A, Bailly T, Ren Y, Hamraoui A, Te J, Vander Meersche Y, Cretin G, Leon Foun Lin R, Gelly JC, Pérez S, Guyon F, Galochkina T. DIONYSUS: a database of protein-carbohydrate interfaces. Nucleic Acids Res 2025; 53:D387-D395. [PMID: 39436020 PMCID: PMC11701518 DOI: 10.1093/nar/gkae890] [Received: 07/03/2024] [Revised: 09/03/2024] [Accepted: 09/26/2024] [Indexed: 10/23/2024] Open
Abstract
Protein-carbohydrate interactions govern a wide variety of biological processes and play an essential role in the development of different diseases. Here, we present DIONYSUS, the first database of protein-carbohydrate interfaces annotated according to structural, chemical and functional properties of both proteins and carbohydrates. We provide exhaustive information on the nature of interactions, binding site composition, biological function and specific additional information retrieved from existing databases. The user can easily search the database using protein sequence and structure information or by carbohydrate binding site properties. Moreover, the user can compare a given interaction site with a representative subset of non-covalent protein-carbohydrate interactions to retrieve information on its potential function or specificity. DIONYSUS is therefore a source of valuable information for a deeper understanding of general protein-carbohydrate interaction patterns, for annotation of previously unannotated proteins, and for applications such as carbohydrate-based drug design. DIONYSUS is freely available at www.dsimb.inserm.fr/DIONYSUS/.
Affiliation(s)
- Aria Gheeraert
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, DSIMB, F-75015 Paris, France
| | - Thomas Bailly
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, DSIMB, F-75015 Paris, France
| | - Yani Ren
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, DSIMB, F-75015 Paris, France
- Université Paris-Saclay, INRAE, MetaGenoPolis, 78350 Jouy-en-Josas, France
| | - Ali Hamraoui
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, DSIMB, F-75015 Paris, France
- Institut de biologie de l’Ecole normale supérieure (IBENS), Ecole normale supérieure, CNRS, INSERM, PSL Universite Paris, 75005 Paris, France
| | - Julie Te
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, DSIMB, F-75015 Paris, France
| | - Yann Vander Meersche
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, DSIMB, F-75015 Paris, France
| | - Gabriel Cretin
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, DSIMB, F-75015 Paris, France
| | - Ravy Leon Foun Lin
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, DSIMB, F-75015 Paris, France
| | - Jean-Christophe Gelly
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, DSIMB, F-75015 Paris, France
| | - Serge Pérez
- Centre de Recherches sur les Macromolécules Végétales, University Grenoble Alpes, CNRS, UPR, 5301 Grenoble, France
| | - Frédéric Guyon
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, DSIMB, F-75015 Paris, France
| | - Tatiana Galochkina
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, DSIMB, F-75015 Paris, France
6
Ives CM, Singh O, D'Andrea S, Fogarty CA, Harbison AM, Satheesan A, Tropea B, Fadda E. Restoring protein glycosylation with GlycoShape. Nat Methods 2024; 21:2117-2127. [PMID: 39402214 PMCID: PMC11541215 DOI: 10.1038/s41592-024-02464-7] [Received: 01/26/2024] [Accepted: 09/12/2024] [Indexed: 11/08/2024]
Abstract
Despite ground-breaking innovations in experimental structural biology and protein structure prediction techniques, capturing the structure of the glycans that functionalize proteins remains a challenge. Here we introduce GlycoShape (https://glycoshape.org), an open-access glycan structure database and toolbox designed to restore glycoproteins to their native and functional form in seconds. The GlycoShape database counts over 500 unique glycans so far, covering the human glycome and augmented by elements from a wide range of organisms, obtained from 1 ms of cumulative sampling from molecular dynamics simulations. These structures can be linked to proteins with a robust algorithm named Re-Glyco, directly compatible with structural data in open-access repositories, such as the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) and AlphaFold Protein Structure Database, or with users' own structures. The quality, performance and broad applicability of GlycoShape are demonstrated by its ability to predict N-glycosylation occupancy, scoring a 93% agreement with experiment, based on screening all proteins in the PDB with a corresponding glycoproteomics profile, for a total of 4,259 N-glycosylation sequons.
Affiliation(s)
- Callum M Ives
- Department of Chemistry, Maynooth University, Maynooth, Ireland
| | - Ojas Singh
- Department of Chemistry, Maynooth University, Maynooth, Ireland
| | - Silvia D'Andrea
- Department of Chemistry, Maynooth University, Maynooth, Ireland
| | - Carl A Fogarty
- Department of Chemistry, Maynooth University, Maynooth, Ireland
| | - Elisa Fadda
- School of Biological Sciences, University of Southampton, Southampton, UK.
7
Porat J, Watkins CP, Jin C, Xie X, Tan X, Lebedenko CG, Hemberger H, Shin W, Chai P, Collins JJ, Garcia BA, Bojar D, Flynn RA. O-glycosylation contributes to mammalian glycoRNA biogenesis. bioRxiv 2024:2024.08.28.610074. [PMID: 39257776 PMCID: PMC11384000 DOI: 10.1101/2024.08.28.610074] [Indexed: 09/12/2024]
Abstract
There is an increasing appreciation for the role of cell surface glycans in modulating interactions with extracellular ligands and participating in intercellular communication. We recently reported the existence of sialoglycoRNAs, where mammalian small RNAs are covalently linked to N-glycans through the modified base acp3U and trafficked to the cell surface. However, little is currently known about the role for O-glycosylation, another major class of carbohydrate polymer modifications. Here, we use parallel genetic, enzymatic, and mass spectrometry approaches to demonstrate that O-linked glycan biosynthesis is responsible for the majority of sialoglycoRNA levels. By examining the O-glycans associated with RNA from cell lines and colon organoids we find known and previously unreported O-linked glycan structures. Further, we find that O-linked glycans released from small RNA from organoids derived from ulcerative colitis patients exhibit higher levels of sialylation than glycans from healthy organoids. Together, our work provides flexible tools to interrogate O-linked glycoRNAs (O-glycoRNA) and suggests that they may be modulated in human disease.
Affiliation(s)
- Jennifer Porat
- Stem Cell Program and Division of Hematology/Oncology, Boston Children’s Hospital, Boston, USA
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, USA
| | - Christopher P. Watkins
- Stem Cell Program and Division of Hematology/Oncology, Boston Children’s Hospital, Boston, USA
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, USA
| | - Chunsheng Jin
- Proteomics Core Facility at Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Xixuan Xie
- State Key Laboratory of Genetic Engineering, Greater Bay Area Institute of Precision Medicine (Guangzhou), School of Life Sciences and Institutes of Biomedical Sciences, Fudan University, Shanghai 200438, China
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, USA
| | - Xiao Tan
- Wyss Institute of Biologically Inspired Engineering, Harvard University, Boston, USA
- Division of Gastroenterology, Massachusetts General Hospital, 55 Fruit Street, Boston, MA 02114, USA
- Harvard Medical School, 25 Shattuck St., Boston, MA 02115, USA
- Institute for Medical Engineering & Science and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Charlotta G. Lebedenko
- Stem Cell Program and Division of Hematology/Oncology, Boston Children’s Hospital, Boston, USA
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, USA
| | - Helena Hemberger
- Stem Cell Program and Division of Hematology/Oncology, Boston Children’s Hospital, Boston, USA
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, USA
| | - Woojung Shin
- Wyss Institute of Biologically Inspired Engineering, Harvard University, Boston, USA
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
| | - Peiyuan Chai
- Stem Cell Program and Division of Hematology/Oncology, Boston Children’s Hospital, Boston, USA
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, USA
| | - James J. Collins
- Wyss Institute of Biologically Inspired Engineering, Harvard University, Boston, USA
- Institute for Medical Engineering & Science and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Benjamin A. Garcia
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, USA
| | - Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden
- Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
| | - Ryan A. Flynn
- Stem Cell Program and Division of Hematology/Oncology, Boston Children’s Hospital, Boston, USA
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, USA
- Harvard Stem Cell Institute, Harvard University, Cambridge, USA
8
Martinez K, Agirre J, Akune Y, Aoki-Kinoshita KF, Arighi C, Axelsen KB, Bolton E, Bordeleau E, Edwards NJ, Fadda E, Feizi T, Hayes C, Ives CM, Joshi HJ, Krishna Prasad K, Kossida S, Lisacek F, Liu Y, Lütteke T, Ma J, Malik A, Martin M, Mehta AY, Neelamegham S, Panneerselvam K, Ranzinger R, Ricard-Blum S, Sanou G, Shanker V, Thomas PD, Tiemeyer M, Urban J, Vita R, Vora J, Yamamoto Y, Mazumder R. Functional implications of glycans and their curation: insights from the workshop held at the 16th Annual International Biocuration Conference in Padua, Italy. Database (Oxford) 2024; 2024:baae073. [PMID: 39137905 PMCID: PMC11321244 DOI: 10.1093/database/baae073] [Received: 04/01/2024] [Revised: 06/24/2024] [Accepted: 07/10/2024] [Indexed: 08/15/2024]
Abstract
Dynamic changes in protein glycosylation impact human health and disease progression. However, current resources that capture disease and phenotype information focus primarily on the macromolecules within the central dogma of molecular biology (DNA, RNA, proteins). To gain a better understanding of organisms, there is a need to capture the functional impact of glycans and glycosylation on biological processes. A workshop titled "Functional impact of glycans and their curation" was held in conjunction with the 16th Annual International Biocuration Conference to discuss ongoing worldwide activities related to glycan function curation. This workshop brought together subject matter experts, tool developers, and biocurators from over 20 projects and bioinformatics resources. Participants discussed four key topics for each of their resources: (i) how they curate glycan function-related data from publications and other sources, (ii) what type of data they would like to acquire, (iii) what data they currently have, and (iv) what standards they use. Their answers contributed input that provided a comprehensive overview of state-of-the-art glycan function curation and annotations. This report summarizes the outcome of discussions, including potential solutions and areas where curators, data wranglers, and text mining experts can collaborate to address current gaps in glycan and glycosylation annotations, leveraging each other's work to improve their respective resources and encourage impactful data sharing among resources. Database URL: https://wiki.glygen.org/Glycan_Function_Workshop_2023.
Affiliation(s)
- Karina Martinez
- Department of Biochemistry & Molecular Medicine, The George Washington University School of Medicine and Health Sciences, 2300 I St. NW, Washington, DC 20052, United States
| | - Jon Agirre
- York Structural Biology Laboratory, Department of Chemistry, University of York, Wentworth Way, York YO10 5DD, United Kingdom
| | - Yukie Akune
- The Glycosciences Laboratory, Imperial College London, Hammersmith Campus, Du Cane Road, London W12 0NN, United Kingdom
| | - Kiyoko F Aoki-Kinoshita
- Glycan and Life Systems Integration Center (GaLSIC), Soka University, 1-236 Tangi-machi, Hachioji, Tokyo 192-8577, Japan
| | - Cecilia Arighi
- Department of Computer and Information Sciences, University of Delaware, 18 Amstel Ave, Newark, DE 19716, United States
| | - Kristian B Axelsen
- Swiss-Prot Group, Swiss Institute of Bioinformatics (SIB), CMU, 1 rue Michel Servet, Geneva 4 1211, Switzerland
| | - Evan Bolton
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, United States
| | - Emily Bordeleau
- Michael Smith Laboratories, The University of British Columbia, 2185 East Mall, Vancouver, British Columbia V6T 1Z4, Canada
| | - Nathan J Edwards
- Department of Biochemistry and Molecular & Cellular Biology, Georgetown University, 2115 Wisconsin Ave NW, Washington, DC 20007, United States
| | - Elisa Fadda
- Department of Chemistry and Hamilton Institute, Maynooth University, Kilcock Road, Maynooth, Co. Kildare W23 AH3Y, Ireland
| | - Ten Feizi
- The Glycosciences Laboratory, Imperial College London, Hammersmith Campus, Du Cane Road, London W12 0NN, United Kingdom
| | - Catherine Hayes
- Proteome Informatics Group, Swiss Institute of Bioinformatics (SIB), route de Drize 7, Geneva CH-1227, Switzerland
| | - Callum M Ives
- Department of Chemistry and Hamilton Institute, Maynooth University, Kilcock Road, Maynooth, Co. Kildare W23 AH3Y, Ireland
| | - Hiren J Joshi
- Copenhagen Center for Glycomics, Department of Cellular and Molecular Medicine, Faculty of Health Sciences, University of Copenhagen, Blegdamsvej 3, Copenhagen DK-2200, Denmark
| | - Khakurel Krishna Prasad
- ELI Beamlines Facility, The Extreme Light Infrastructure ERIC, Za Radnicí 835, Dolní Břežany 25241, Czech Republic
| | - Sofia Kossida
- IMGT, The International ImMunoGeneTics Information System, National Center for Scientific Research (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), 141 rue de la Cardonille, Montpellier 34 090, France
| | - Frederique Lisacek
- Proteome Informatics Group, Swiss Institute of Bioinformatics (SIB), route de Drize 7, Geneva CH-1227, Switzerland
| | - Yan Liu
- The Glycosciences Laboratory, Imperial College London, Hammersmith Campus, Du Cane Road, London W12 0NN, United Kingdom
| | - Thomas Lütteke
- Institute of Veterinary Physiology and Biochemistry, Justus-Liebig-University Gießen, Frankfurter Str. 100, Gießen 35392, Germany
| | - Junfeng Ma
- Department of Oncology, Lombardi Comprehensive Cancer Center, Georgetown University Medical Center, 3900 Reservoir Road NW, Washington, DC 20007, United States
| | - Adnan Malik
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Maria Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Akul Y Mehta
- Department of Surgery, Beth Israel Deaconess Medical Center, National Center for Functional Glycomics, Harvard Medical School, 330 Brookline Avenue, Boston, MA 02215, United States
| | - Sriram Neelamegham
- Departments of Chemical & Biological Engineering, Biomedical Engineering and Medicine, University at Buffalo, State University of New York, 906 Furnas Hall, Buffalo, NY 14260, United States
| | - Kalpana Panneerselvam
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - René Ranzinger
- Complex Carbohydrate Research Center, University of Georgia, 315 Riverbend Rd, Athens, GA 30602, United States
| | - Sylvie Ricard-Blum
- Institute of Molecular and Supramolecular Chemistry and Biochemistry (ICBMS), UMR 5246, University Lyon 1, CNRS, 43 Boulevard du 11 novembre 1918, Villeurbanne cedex F-69622, France
| | - Gaoussou Sanou
- IMGT, The International ImMunoGeneTics Information System, National Center for Scientific Research (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), 141 rue de la Cardonille, Montpellier 34 090, France
| | - Vijay Shanker
- Department of Computer and Information Sciences, University of Delaware, 18 Amstel Ave, Newark, DE 19716, United States
| | - Paul D Thomas
- Department of Population and Public Health Sciences, University of Southern California, 2001 N Soto Street, Los Angeles, CA 90032, United States
| | - Michael Tiemeyer
- Complex Carbohydrate Research Center, University of Georgia, 315 Riverbend Rd, Athens, GA 30602, United States
| | - James Urban
- Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 7 B, Gothenburg 41390, Sweden
| | - Randi Vita
- Immune Epitope Database and Analysis Project, La Jolla Institute for Allergy & Immunology, 9420 Athena Circle, La Jolla, CA 92037, United States
| | - Jeet Vora
- Department of Biochemistry & Molecular Medicine, The George Washington University School of Medicine and Health Sciences, 2300 I St. NW, Washington, DC 20052, United States
| | - Yasunori Yamamoto
- Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa, Chiba 277-0871, Japan
| | - Raja Mazumder
- Department of Biochemistry & Molecular Medicine, The George Washington University School of Medicine and Health Sciences, 2300 I St. NW, Washington, DC 20052, United States
9
Xu T, Wang YC, Ma J, Cui Y, Wang L. In silico discovery and anti-tumor bioactivities validation of an algal lectin from Kappaphycus alvarezii genome. Int J Biol Macromol 2024; 275:133311. [PMID: 38909728 DOI: 10.1016/j.ijbiomac.2024.133311] [Received: 02/01/2024] [Revised: 05/24/2024] [Accepted: 06/13/2024] [Indexed: 06/25/2024]
Abstract
Lectins are proteins that bind specifically and reversibly to carbohydrates, and some of them have significant anti-tumor activities. Compared with lectins from land plants, algal lectins have been studied far less, despite the high biodiversity of algae. However, canonical strategies based on chromatographic feature-oriented screening cannot satisfy the requirement for algal lectin discovery. In this study, 358 genomes of red algae and cyanobacteria were prospected for novel OAAH family lectins. Then 35 candidate lectins and 1843 of their simulated mutated forms were virtually screened based on predicted binding specificities to characteristic carbohydrates on cancer cells inferred by a deep learning model. A new lectin, named Siye, was discovered in the Kappaphycus alvarezii genome and further verified on different cancer cells. Without causing agglutination of erythrocytes, Siye showed significant cytotoxicity to four human cancer cell lines (IC50 values ranging from 0.11 to 3.95 μg/mL), including breast adenocarcinoma HCC1937, lung carcinoma A549, liver cancer HepG2 and promyelocytic leukemia HL60. The cytotoxicity was induced by promoting apoptosis through regulation of the caspase and p53 pathways within 24 h. This study demonstrates the feasibility and efficiency of genome mining guided by evolutionary theory and artificial intelligence in the discovery of algal lectins.
Affiliation(s)
- Tongli Xu
- Key Laboratory of Coastal Biology and Biological Resource Utilization, Yantai Institute of Coastal Zone Research, Chinese Academy of Sciences, Yantai 264003, China; Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao 266071, China
| | - Yin-Chu Wang
- Key Laboratory of Coastal Biology and Biological Resource Utilization, Yantai Institute of Coastal Zone Research, Chinese Academy of Sciences, Yantai 264003, China; Center for Ocean Mega-Science, Chinese Academy of Sciences, Qingdao 266071, China; National Basic Science Data Center, Beijing 100190, China.
| | - Jiahao Ma
- Hong Kong University of Science and Technology, Clear Water Bay, 999077, Hong Kong
| | - Yulin Cui
- Binzhou Medical University, Yantai 264003, China.
| | - Lu Wang
- School of Pharmacy, Yantai University, Yantai 264005, China.
10
Urban J, Jin C, Thomsson KA, Karlsson NG, Ives CM, Fadda E, Bojar D. Predicting glycan structure from tandem mass spectrometry via deep learning. Nat Methods 2024; 21:1206-1215. [PMID: 38951670 PMCID: PMC11239490 DOI: 10.1038/s41592-024-02314-6] [Received: 06/13/2023] [Accepted: 05/17/2024] [Indexed: 07/03/2024]
Abstract
Glycans constitute the most complicated post-translational modification, modulating protein activity in health and disease. However, structural annotation from tandem mass spectrometry (MS/MS) data is a bottleneck in glycomics, preventing high-throughput endeavors and relegating glycomics to a few experts. Trained on a newly curated set of 500,000 annotated MS/MS spectra, here we present CandyCrunch, a dilated residual neural network predicting glycan structure from raw liquid chromatography-MS/MS data in seconds (top-1 accuracy: 90.3%). We developed an open-access Python-based workflow of raw data conversion and prediction, followed by automated curation and fragment annotation, with predictions recapitulating and extending expert annotation. We demonstrate that this can be used for de novo annotation, diagnostic fragment identification and high-throughput glycomics. For maximum impact, this entire pipeline is tightly interlaced with our glycowork platform and can be easily tested at https://colab.research.google.com/github/BojarLab/CandyCrunch/blob/main/CandyCrunch.ipynb. We envision CandyCrunch to democratize structural glycomics and the elucidation of biological roles of glycans.
Affiliation(s)
- James Urban
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Chunsheng Jin
  - Proteomics Core Facility at Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Kristina A Thomsson
  - Proteomics Core Facility at Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Niclas G Karlsson
  - Section of Pharmacy, Department of Life Sciences and Health, Faculty of Health Sciences, Oslo Metropolitan University, Oslo, Norway
- Callum M Ives
  - Department of Chemistry and Hamilton Institute, Maynooth University, Maynooth, Ireland
- Elisa Fadda
  - School of Biological Sciences, University of Southampton, Southampton, UK
- Daniel Bojar
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden.
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden.
11
Akune-Taylor Y, Kon A, Aoki-Kinoshita KF. In silico simulation of glycosylation and related pathways. Anal Bioanal Chem 2024; 416:3687-3696. [PMID: 38748247 PMCID: PMC11180631 DOI: 10.1007/s00216-024-05331-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 04/30/2024] [Accepted: 05/02/2024] [Indexed: 06/18/2024]
Abstract
Glycans participate in a vast number of recognition systems in diverse organisms in health and in disease. However, glycans cannot be sequenced because there is no sequencer technology that can fully characterize them. There is no "template" for replicating glycans as there are for amino acids and nucleic acids. Instead, glycans are synthesized by a complicated orchestration of multitudes of glycosyltransferases and glycosidases. Thus glycans can vary greatly in structure, but they are not genetically reproducible and are usually isolated in minute amounts. To characterize (sequence) the glycome (defined as the glycans in a particular organism, tissue, cell, or protein), glycosylation pathway prediction using in silico methods based on glycogene expression data, and glycosylation simulations have been attempted. Since many of the mammalian glycogenes have been identified and cloned, it has become possible to predict the glycan biosynthesis pathway in these systems. By then incorporating systems biology and bioprocessing technologies to these pathway models, given the right enzymatic parameters including enzyme and substrate concentrations and kinetic reaction parameters, it is possible to predict the potentially synthesized glycans in the pathway. This review presents information on the data resources that are currently available to enable in silico simulations of glycosylation and related pathways. Then some of the software tools that have been developed in the past to simulate and analyze glycosylation pathways will be described, followed by a summary and vision for the future developments and research directions in this area.
Affiliation(s)
- Yukie Akune-Taylor
  - Glycan and Life Systems Integration Center, Soka University, Tokyo, Japan
- Akane Kon
  - Graduate School of Science and Engineering, Soka University, Tokyo, Japan
- Kiyoko F Aoki-Kinoshita
  - Glycan and Life Systems Integration Center, Soka University, Tokyo, Japan.
  - Graduate School of Science and Engineering, Soka University, Tokyo, Japan.
  - iGCORE, Nagoya University, Nagoya, Japan.
12
Lundstrøm J, Thomès L, Bojar D. Protocol for constructing glycan biosynthetic networks using glycowork. STAR Protoc 2024; 5:102937. [PMID: 38630592 PMCID: PMC11036093 DOI: 10.1016/j.xpro.2024.102937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 01/09/2024] [Accepted: 02/19/2024] [Indexed: 04/19/2024] Open
Abstract
Glycans, present across all domains of life, comprise a wide range of monosaccharides assembled into complex, branching structures. Here, we present an in silico protocol to construct biosynthetic networks from a list of observed glycans using the Python package glycowork. We describe steps for data preparation, network construction, feature analysis, and data export. This protocol is implemented in Python using example data and can be adapted for use with customized datasets. For complete details on the use and execution of this protocol, please refer to Thomès et al.¹
Affiliation(s)
- Jon Lundstrøm
  - Department of Chemistry and Molecular Biology, University of Gothenburg, 41390 Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden.
- Luc Thomès
  - University Lille, CHU Lille, ULR 7364 - RADEME - Maladies RAres du DÉveloppement embryonnaire et du Métabolisme, 59000 Lille, France
- Daniel Bojar
  - Department of Chemistry and Molecular Biology, University of Gothenburg, 41390 Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden.
13
Kellman BP, Mariethoz J, Zhang Y, Shaul S, Alteri M, Sandoval D, Jeffris M, Armingol E, Bao B, Lisacek F, Bojar D, Lewis NE. Decoding glycosylation potential from protein structure across human glycoproteins with a multi-view recurrent neural network. bioRxiv 2024:2024.05.15.594334. [PMID: 38798633 PMCID: PMC11118808 DOI: 10.1101/2024.05.15.594334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Glycosylation is described as a non-templated biosynthesis. Yet, the template-free premise is antithetical to the observation that different N-glycans are consistently placed at specific sites. It has been proposed that glycosite-proximal protein structures could constrain glycosylation and explain the observed microheterogeneity. Using site-specific glycosylation data, we trained a hybrid neural network to parse glycosites (recurrent neural network) and match them to feasible N-glycosylation events (graph neural network). From glycosite-flanking sequences, the algorithm predicts most human N-glycosylation events documented in the GlyConnect database and proposed structures corresponding to observed monosaccharide composition of the glycans at these sites. The algorithm also recapitulated glycosylation in Enhanced Aromatic Sequons, SARS-CoV-2 spike, and IgG3 variants, thus demonstrating the ability of the algorithm to predict both glycan structure and abundance. Thus, protein structure constrains glycosylation, and the neural network enables predictive in silico glycosylation of uncharacterized or novel protein sequences and genetic variants.
Affiliation(s)
- Benjamin P. Kellman
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
  - Augment Biologics, La Jolla, CA 92092
  - Ragon Institute of MGH, MIT, and Harvard, Cambridge, MA, USA
- Julien Mariethoz
  - Proteome Informatics Group, Swiss Institute of Bioinformatics, CH-1227 Geneva, Switzerland
- Yujie Zhang
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
- Sigal Shaul
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Mia Alteri
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Daniel Sandoval
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Mia Jeffris
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Erick Armingol
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Bokan Bao
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Frederique Lisacek
  - Proteome Informatics Group, Swiss Institute of Bioinformatics, CH-1227 Geneva, Switzerland
  - Computer Science Department & Section of Biology, University of Geneva, route de Drize 7, CH-1227, Geneva, Switzerland
- Daniel Bojar
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg 41390, Sweden
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg 41390, Sweden
- Nathan E. Lewis
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
  - Ragon Institute of MGH, MIT, and Harvard, Cambridge, MA, USA
14
Bennett AR, Bojar D. Syntactic sugars: crafting a regular expression framework for glycan structures. Bioinformatics Advances 2024; 4:vbae059. [PMID: 38708029 PMCID: PMC11069104 DOI: 10.1093/bioadv/vbae059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Revised: 03/15/2024] [Accepted: 04/17/2024] [Indexed: 05/07/2024]
Abstract
Motivation: Structural analysis of glycans poses significant challenges in glycobiology due to their complex sequences. Research questions such as analyzing the sequence content of the α1-6 branch in N-glycans, are biologically meaningful yet can be hard to automate. Results: Here, we introduce a regular expression system, designed for glycans, feature-complete, and closely aligned with regular expression formatting. We use this to annotate glycan motifs of arbitrary complexity, perform differential expression analysis on designated sequence stretches, or elucidate branch-specific binding specificities of lectins in an automated manner. We are confident that glycan regular expressions will empower computational analyses of these sequences. Availability and implementation: Our regular expression framework for glycans is implemented in Python and is incorporated into the open-source glycowork package (version 1.1+). Code and documentation are available at https://github.com/BojarLab/glycowork/blob/master/glycowork/motif/regex.py.
Affiliation(s)
- Alexander R Bennett
  - Department of Medical Biochemistry, Institute of Biomedicine, University of Gothenburg, 41390 Gothenburg, Sweden
- Daniel Bojar
  - Department of Chemistry and Molecular Biology, University of Gothenburg, 41390 Gothenburg, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden
15
Lundstrøm J, Gillon E, Chazalet V, Kerekes N, Di Maio A, Feizi T, Liu Y, Varrot A, Bojar D. Elucidating the glycan-binding specificity and structure of Cucumis melo agglutinin, a new R-type lectin. Beilstein J Org Chem 2024; 20:306-320. [PMID: 38410776 PMCID: PMC10896221 DOI: 10.3762/bjoc.20.31] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 02/09/2024] [Indexed: 02/28/2024] Open
Abstract
Plant lectins have garnered attention for their roles as laboratory probes and potential therapeutics. Here, we report the discovery and characterization of Cucumis melo agglutinin (CMA1), a new R-type lectin from melon. Our findings reveal CMA1's unique glycan-binding profile, mechanistically explained by its 3D structure, augmenting our understanding of R-type lectins. We expressed CMA1 recombinantly and assessed its binding specificity using multiple glycan arrays, covering 1,046 unique sequences. This resulted in a complex binding profile, strongly preferring C2-substituted, beta-linked galactose (both GalNAc and Fucα1-2Gal), which we contrasted with the established R-type lectin Ricinus communis agglutinin 1 (RCA1). We also report binding of specific glycosaminoglycan subtypes and a general enhancement of binding by sulfation. Further validation using agglutination, thermal shift assays, and surface plasmon resonance confirmed and quantified this binding specificity in solution. Finally, we solved the high-resolution structure of the CMA1 N-terminal domain using X-ray crystallography, supporting our functional findings at the molecular level. Our study provides a comprehensive understanding of CMA1, laying the groundwork for further exploration of its biological and therapeutic potential.
Affiliation(s)
- Jon Lundstrøm
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 7B, 413 90 Gothenburg, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 413 90 Gothenburg, Sweden
- Emilie Gillon
  - Univ. Grenoble Alpes, CNRS, CERMAV, 601 Rue de la Chimie, 38610 Gières, France
- Valérie Chazalet
  - Univ. Grenoble Alpes, CNRS, CERMAV, 601 Rue de la Chimie, 38610 Gières, France
- Nicole Kerekes
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 7B, 413 90 Gothenburg, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 413 90 Gothenburg, Sweden
- Antonio Di Maio
  - Glycosciences Laboratory, Faculty of Medicine, Imperial College London, Du Cane Rd, London W12 0NN, United Kingdom
- Ten Feizi
  - Glycosciences Laboratory, Faculty of Medicine, Imperial College London, Du Cane Rd, London W12 0NN, United Kingdom
- Yan Liu
  - Glycosciences Laboratory, Faculty of Medicine, Imperial College London, Du Cane Rd, London W12 0NN, United Kingdom
- Annabelle Varrot
  - Univ. Grenoble Alpes, CNRS, CERMAV, 601 Rue de la Chimie, 38610 Gières, France
- Daniel Bojar
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 7B, 413 90 Gothenburg, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 413 90 Gothenburg, Sweden
16
Lundstrøm J, Urban J, Thomès L, Bojar D. GlycoDraw: a python implementation for generating high-quality glycan figures. Glycobiology 2023; 33:927-934. [PMID: 37498172 PMCID: PMC10859633 DOI: 10.1093/glycob/cwad063] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 05/20/2023] [Revised: 07/14/2023] [Accepted: 07/26/2023] [Indexed: 07/28/2023] Open
Abstract
Glycans are essential to all scales of biology, with their intricate structures being crucial for their biological functions. The structural complexity of glycans is communicated through simplified and unified visual representations according to the Symbol Nomenclature for Glycans (SNFGs) guidelines adopted by the community. Here, we introduce GlycoDraw, a Python-native implementation for high-throughput generation of high-quality, SNFG-compliant glycan figures with flexible display options. GlycoDraw is released as part of our glycan analysis ecosystem, glycowork, facilitating integration into existing workflows by enabling fully automated annotation of glycan-related figures and thus assisting the analysis of e.g. differential abundance data or glycomics mass spectra.
Affiliation(s)
- Jon Lundstrøm
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
- James Urban
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
- Luc Thomès
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
- Daniel Bojar
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Medicinaregatan 9C, 41390 Gothenburg, Västra Götaland, Sweden
17
Lundstrøm J, Urban J, Bojar D. Decoding glycomics with a suite of methods for differential expression analysis. Cell Reports Methods 2023; 3:100652. [PMID: 37992708 PMCID: PMC10753297 DOI: 10.1016/j.crmeth.2023.100652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 10/04/2023] [Accepted: 10/30/2023] [Indexed: 11/24/2023]
Abstract
Glycomics, the comprehensive profiling of all glycan structures in samples, is rapidly expanding to enable insights into physiology and disease mechanisms. However, glycan structure complexity and glycomics data interpretation present challenges, especially for differential expression analysis. Here, we present a framework for differential glycomics expression analysis. Our methodology encompasses specialized and domain-informed methods for data normalization and imputation, glycan motif extraction and quantification, differential expression analysis, motif enrichment analysis, time series analysis, and meta-analytic capabilities, synthesizing results across multiple studies. All methods are integrated into our open-source glycowork package, facilitating performant workflows and user-friendly access. We demonstrate these methods using dedicated simulations and glycomics datasets of N-, O-, lipid-linked, and free glycans. Differential expression tests here focus on human datasets and cancer vs. healthy tissue comparisons. Our rigorous approach allows for robust, reliable, and comprehensive differential expression analyses in glycomics, contributing to advancing glycomics research and its translation to clinical and diagnostic applications.
Affiliation(s)
- Jon Lundstrøm
  - Department of Chemistry and Molecular Biology, University of Gothenburg, 41390 Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden
- James Urban
  - Department of Chemistry and Molecular Biology, University of Gothenburg, 41390 Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden
- Daniel Bojar
  - Department of Chemistry and Molecular Biology, University of Gothenburg, 41390 Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden.
18
Krishna Perumal P, Dong CD, Chauhan AS, Anisha GS, Kadri MS, Chen CW, Singhania RR, Patel AK. Advances in oligosaccharides production from algal sources and potential applications. Biotechnol Adv 2023; 67:108195. [PMID: 37315876 DOI: 10.1016/j.biotechadv.2023.108195] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 12/28/2022] [Revised: 06/02/2023] [Accepted: 06/05/2023] [Indexed: 06/16/2023]
Abstract
In recent years, algal-derived glycans and oligosaccharides have become increasingly important in health applications due to higher bioactivities than plant-derived oligosaccharides. The marine organisms have complex, and highly branched glycans and more reactive groups to elicit greater bioactivities. However, complex and large molecules have limited use in broad commercial applications due to dissolution limitations. In comparison to these, oligosaccharides show better solubility and retain their bioactivities, hence, offering better applications opportunity. Accordingly, efforts are being made to develop a cost-effective method for enzymatic extraction of oligosaccharides from algal polysaccharides and algal biomass. Yet detailed structural characterization of algal-derived glycans is required to produce and characterize the potential biomolecules for improved bioactivity and commercial applications. Some macroalgae and microalgae are being evaluated as in vivo biofactories for efficient clinical trials, which could be very helpful in understanding the therapeutic responses. This review discusses the recent advancements in the production of oligosaccharides from microalgae. It also discusses the bottlenecks of the oligosaccharides research, technological limitations, and probable solutions to these problems. Furthermore, it presents the emerging bioactivities of algal oligosaccharides and their promising potential for possible biotherapeutic application.
Affiliation(s)
- Pitchurajan Krishna Perumal
  - Institute of Aquatic Science and Technology, National Kaohsiung University of Science and Technology, Kaohsiung City 81157, Taiwan
- Cheng-Di Dong
  - Institute of Aquatic Science and Technology, National Kaohsiung University of Science and Technology, Kaohsiung City 81157, Taiwan; Sustainable Environment Research Centre, National Kaohsiung University of Science and Technology, Kaohsiung City 81157, Taiwan; Department of Marine Environmental Engineering, National Kaohsiung University of Science and Technology, Kaohsiung City, Taiwan
- Ajeet Singh Chauhan
  - Institute of Aquatic Science and Technology, National Kaohsiung University of Science and Technology, Kaohsiung City 81157, Taiwan
- Grace Sathyanesan Anisha
  - Post-Graduate and Research Department of Zoology, Government College for Women, Thiruvananthapuram 695014, Kerala, India
- Mohammad Sibtain Kadri
  - Department of Marine Biotechnology and Resources, National Sun Yat-Sen University, Kaohsiung City-804201, Taiwan
- Chiu-Wen Chen
  - Institute of Aquatic Science and Technology, National Kaohsiung University of Science and Technology, Kaohsiung City 81157, Taiwan; Sustainable Environment Research Centre, National Kaohsiung University of Science and Technology, Kaohsiung City 81157, Taiwan; Department of Marine Environmental Engineering, National Kaohsiung University of Science and Technology, Kaohsiung City, Taiwan
- Reeta Rani Singhania
  - Institute of Aquatic Science and Technology, National Kaohsiung University of Science and Technology, Kaohsiung City 81157, Taiwan; Centre for Energy and Environmental Sustainability, Lucknow 226 029, Uttar Pradesh, India
- Anil Kumar Patel
  - Institute of Aquatic Science and Technology, National Kaohsiung University of Science and Technology, Kaohsiung City 81157, Taiwan; Centre for Energy and Environmental Sustainability, Lucknow 226 029, Uttar Pradesh, India.
19
Jin C, Lundstrøm J, Korhonen E, Luis AS, Bojar D. Breast Milk Oligosaccharides Contain Immunomodulatory Glucuronic Acid and LacdiNAc. Mol Cell Proteomics 2023; 22:100635. [PMID: 37597722 PMCID: PMC10509713 DOI: 10.1016/j.mcpro.2023.100635] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/15/2023] [Revised: 07/31/2023] [Accepted: 08/16/2023] [Indexed: 08/21/2023] Open
Abstract
Breast milk is abundant with functionalized milk oligosaccharides (MOs) to nourish and protect the neonate. Yet we lack a comprehensive understanding of the repertoire and evolution of MOs across Mammalia. We report ∼400 MO-species associations (>100 novel structures) from milk glycomics of nine mostly understudied species: alpaca, beluga whale, black rhinoceros, bottlenose dolphin, impala, L'Hoest's monkey, pygmy hippopotamus, domestic sheep, and striped dolphin. This revealed the hitherto unknown existence of the LacdiNAc motif (GalNAcβ1-4GlcNAc) in MOs of all species except alpaca, sheep, and striped dolphin, indicating the widespread occurrence of this potentially antimicrobial motif in MOs. We also characterize glucuronic acid-containing MOs in the milk of impala, dolphins, sheep, and rhinoceros, previously only reported in cows. We demonstrate that these GlcA-MOs exhibit potent immunomodulatory effects. Our study extends the number of known MOs by >15%. Combined with >1900 curated MO-species associations, we characterize MO motif distributions, presenting an exhaustive overview of MO biodiversity.
Affiliation(s)
- Chunsheng Jin
  - Proteomics Core Facility at Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Jon Lundstrøm
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Emma Korhonen
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Ana S Luis
  - Department of Medical Biochemistry and Cell Biology, University of Gothenburg, Gothenburg, Sweden
- Daniel Bojar
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden.
20
Thomès L, Karlsson V, Lundstrøm J, Bojar D. Mammalian milk glycomes: Connecting the dots between evolutionary conservation and biosynthetic pathways. Cell Rep 2023; 42:112710. [PMID: 37379211 DOI: 10.1016/j.celrep.2023.112710] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/22/2023] [Revised: 05/09/2023] [Accepted: 06/12/2023] [Indexed: 06/30/2023] Open
Abstract
Milk oligosaccharides (MOs) are among the most abundant constituents of breast milk and are essential for health and development. Biosynthesized from monosaccharides into complex sequences, MOs differ considerably between taxonomic groups. Even human MO biosynthesis is insufficiently understood, hampering evolutionary and functional analyses. Using a comprehensive resource of all published MOs from >100 mammals, we develop a pipeline for generating and analyzing MO biosynthetic networks. We then use evolutionary relationships and inferred intermediates of these networks to discover (1) systematic glycome biases, (2) biosynthetic restrictions, such as reaction path preference, and (3) conserved biosynthetic modules. This allows us to prune and pinpoint biosynthetic pathways despite missing information. Machine learning and network analysis cluster species by their milk glycome, identifying characteristic sequence relationships and evolutionary gains/losses of motifs, MOs, and biosynthetic modules. These resources and analyses will advance our understanding of glycan biosynthesis and the evolution of breast milk.
Affiliation(s)
- Luc Thomès
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Viktoria Karlsson
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Jon Lundstrøm
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Daniel Bojar
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden.
21
Perez S, Makshakova O, Angulo J, Bedini E, Bisio A, de Paz JL, Fadda E, Guerrini M, Hricovini M, Hricovini M, Lisacek F, Nieto PM, Pagel K, Paiardi G, Richter R, Samsonov SA, Vivès RR, Nikitovic D, Ricard Blum S. Glycosaminoglycans: What Remains To Be Deciphered? JACS Au 2023; 3:628-656. [PMID: 37006755 PMCID: PMC10052243 DOI: 10.1021/jacsau.2c00569] [Citation(s) in RCA: 42] [Impact Index Per Article: 21.0] [Received: 10/15/2022] [Revised: 12/05/2022] [Accepted: 12/07/2022] [Indexed: 06/19/2023]
Abstract
Glycosaminoglycans (GAGs) are complex polysaccharides exhibiting a vast structural diversity and fulfilling various functions mediated by thousands of interactions in the extracellular matrix, at the cell surface, and within the cells where they have been detected in the nucleus. It is known that the chemical groups attached to GAGs and GAG conformations comprise "glycocodes" that are not yet fully deciphered. The molecular context also matters for GAG structures and functions, and the influence of the structure and functions of the proteoglycan core proteins on sulfated GAGs and vice versa warrants further investigation. The lack of dedicated bioinformatic tools for mining GAG data sets contributes to a partial characterization of the structural and functional landscape and interactions of GAGs. These pending issues will benefit from the development of new approaches reviewed here, namely (i) the synthesis of GAG oligosaccharides to build large and diverse GAG libraries, (ii) GAG analysis and sequencing by mass spectrometry (e.g., ion mobility-mass spectrometry), gas-phase infrared spectroscopy, recognition tunnelling nanopores, and molecular modeling to identify bioactive GAG sequences, biophysical methods to investigate binding interfaces, and to expand our knowledge and understanding of glycocodes governing GAG molecular recognition, and (iii) artificial intelligence for in-depth investigation of GAGomic data sets and their integration with proteomics.
Affiliation(s)
- Serge Perez
  - Centre de Recherches sur les Macromolécules Végétales, University of Grenoble-Alpes, Centre National de la Recherche Scientifique, Grenoble F-38041, France
- Olga Makshakova
  - FRC Kazan Scientific Center of Russian Academy of Sciences, Kazan Institute of Biochemistry and Biophysics, Kazan 420111, Russia
- Jesus Angulo
  - Instituto de Investigaciones Quimicas, CIC Cartuja, CSIC and Universidad de Sevilla, Sevilla, SP 41092, Spain
- Emiliano Bedini
  - Department of Chemical Sciences, University of Naples Federico II, Naples, I-80126, Italy
- Antonella Bisio
  - Istituto di Ricerche Chimiche e Biochimiche, G. Ronzoni, Milan I-20133, Italy
- Jose Luis de Paz
  - Instituto de Investigaciones Quimicas, CIC Cartuja, CSIC and Universidad de Sevilla, Sevilla, SP 41092, Spain
- Elisa Fadda
  - Department of Chemistry and Hamilton Institute, Maynooth University, Maynooth W23 F2H6, Ireland
- Marco Guerrini
  - Istituto di Ricerche Chimiche e Biochimiche, G. Ronzoni, Milan I-20133, Italy
- Michal Hricovini
  - Institute of Chemistry, Slovak Academy of Sciences, Bratislava SK-845 38, Slovakia
- Milos Hricovini
  - Institute of Chemistry, Slovak Academy of Sciences, Bratislava SK-845 38, Slovakia
- Frederique Lisacek
  - Computer Science Department & Section of Biology, University of Geneva & Swiss Institute of Bioinformatics, Geneva CH-1227, Switzerland
- Pedro M. Nieto
  - Instituto de Investigaciones Quimicas, CIC Cartuja, CSIC and Universidad de Sevilla, Sevilla, SP 41092, Spain
- Kevin Pagel
  - Institut für Chemie und Biochemie Organische Chemie, Freie Universität Berlin, Berlin 14195, Germany
- Giulia Paiardi
  - Molecular and Cellular Modeling Group, Heidelberg Institute for Theoretical Studies, Heidelberg University, Heidelberg 69118, Germany
- Ralf Richter
  - School of Biomedical Sciences, Faculty of Biological Sciences, School of Physics and Astronomy, Faculty of Engineering and Physical Sciences, Astbury Centre for Structural Molecular Biology and Bragg Centre for Materials Research, University of Leeds, Leeds LS2 9JT, United Kingdom
- Sergey A. Samsonov
  - Department of Theoretical Chemistry, Faculty of Chemistry, University of Gdansk, Gdansk 80-309, Poland
- Romain R. Vivès
  - Univ. Grenoble Alpes, CNRS, CEA, IBS, Grenoble F-38044, France
- Dragana Nikitovic
  - School of Histology-Embryology, Medical School, University of Crete, Heraklion 71003, Greece
- Sylvie Ricard Blum
  - University Claude Bernard Lyon 1, CNRS, INSA Lyon, CPE, Institute of Molecular and Supramolecular Chemistry and Biochemistry, UMR 5246, Villeurbanne F 69622 Cedex, France
|
22
|
Joeres R, Bojar D, Kalinina OV. GlyLES: Grammar-based Parsing of Glycans from IUPAC-condensed to SMILES. J Cheminform 2023; 15:37. [PMID: 36959676 PMCID: PMC10035253 DOI: 10.1186/s13321-023-00704-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 02/18/2023] [Indexed: 03/25/2023] Open
Abstract
Glycans are important polysaccharides on cellular surfaces that are bound to glycoproteins and glycolipids. These are one of the most common post-translational modifications of proteins in eukaryotic cells. They play important roles in protein folding, cell-cell interactions, and other extracellular processes. Changes in glycan structures may influence the course of different diseases, such as infections or cancer. Glycans are commonly represented using the IUPAC-condensed notation. IUPAC-condensed is a textual representation of glycans operating on the same topological level as the Symbol Nomenclature for Glycans (SNFG) that assigns colored, geometrical shapes to the main monomers. These symbols are then connected in tree-like structures, visualizing the glycan structure on a topological level. Yet for a representation on the atomic level, notations such as SMILES should be used. To our knowledge, there is no easy-to-use, general, open-source, and offline tool to convert the IUPAC-condensed notation to SMILES. Here, we present the open-access Python package GlyLES for the generalizable generation of SMILES representations out of IUPAC-condensed representations. GlyLES uses a grammar to read in the monomer tree from the IUPAC-condensed notation. From this tree, the tool can compute the atomic structures of each monomer based on their IUPAC-condensed descriptions. In the last step, it merges all monomers into the atomic structure of a glycan in the SMILES notation. GlyLES is the first package that allows conversion from the IUPAC-condensed notation of glycans to SMILES strings. This may have multiple applications, including straightforward visualization, substructure search, molecular modeling and docking, and a new featurization strategy for machine-learning algorithms. GlyLES is available at https://github.com/kalininalab/GlyLES .
Affiliation(s)
- Roman Joeres
  - Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbruecken, Germany
  - Center for Bioinformatics, Saarland University, Saarbruecken, Germany
- Daniel Bojar
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Olga V. Kalinina
  - Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbruecken, Germany
  - Center for Bioinformatics, Saarland University, Saarbruecken, Germany
  - Faculty of Medicine, Saarland University, Homburg, Germany
|
23
|
Li H, Chiang AWT, Lewis NE. Artificial intelligence in the analysis of glycosylation data. Biotechnol Adv 2022; 60:108008. [PMID: 35738510 PMCID: PMC11157671 DOI: 10.1016/j.biotechadv.2022.108008] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 01/08/2022] [Revised: 06/15/2022] [Accepted: 06/16/2022] [Indexed: 11/18/2022]
Abstract
Glycans are complex, yet ubiquitous across biological systems. They are involved in diverse essential organismal functions. Aberrant glycosylation may lead to disease development, such as cancer, autoimmune diseases, and inflammatory diseases. Glycans, both normal and aberrant, are synthesized using extensive glycosylation machinery, and understanding this machinery can provide invaluable insights for diagnosis, prognosis, and treatment of various diseases. Increasing amounts of glycomics data are being generated thanks to advances in glycoanalytics technologies, but to maximize the value of such data, innovations are needed for analyzing and interpreting large-scale glycomics data. Artificial intelligence (AI) provides a powerful analysis toolbox in many scientific fields, and here we review state-of-the-art AI approaches on glycosylation analysis. We further discuss how models can be analyzed to gain mechanistic insights into glycosylation machinery and how the machinery shapes glycans under different scenarios. Finally, we propose how to leverage the gained knowledge for developing predictive AI-based models of glycosylation. Thus, guiding future research of AI-based glycosylation model development will provide valuable insights into glycosylation and glycan machinery.
Affiliation(s)
- Haining Li
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
- Austin W T Chiang
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
- Nathan E Lewis
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
|
24
|
Abstract
Artificial intelligence (AI) methods have been and are now being increasingly integrated in prediction software implemented in bioinformatics and its glycoscience branch known as glycoinformatics. AI techniques have evolved in the past decades, and their applications in glycoscience are not yet widespread. This limited use is partly explained by the peculiarities of glyco-data that are notoriously hard to produce and analyze. Nonetheless, as time goes on, the accumulation of glycomics, glycoproteomics, and glycan-binding data has reached a point where even the most recent deep learning methods can provide predictors with good performance. We discuss the historical development of the application of various AI methods in the broader field of glycoinformatics. A particular focus is placed on shining a light on challenges in glyco-data handling, contextualized by lessons learnt from related disciplines. Ending on the discussion of state-of-the-art deep learning approaches in glycoinformatics, we also envision the future of glycoinformatics, including developments that need to occur in order to truly unleash the capabilities of glycoscience in the systems biology era.
Affiliation(s)
- Daniel Bojar
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg 41390, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg 41390, Sweden
- Frederique Lisacek
  - Proteome Informatics Group, Swiss Institute of Bioinformatics, CH-1227 Geneva, Switzerland
  - Computer Science Department & Section of Biology, University of Geneva, route de Drize 7, CH-1227, Geneva, Switzerland
|
25
|
Akmal MA, Hassan MA, Muhammad S, Khurshid KS, Mohamed A. An analytical study on the identification of N-linked glycosylation sites using machine learning model. PeerJ Comput Sci 2022; 8:e1069. [PMID: 36262138 PMCID: PMC9575850 DOI: 10.7717/peerj-cs.1069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Accepted: 07/25/2022] [Indexed: 06/16/2023]
Abstract
N-linked glycosylation is the most common type of glycosylation; it plays a significant role in identifying various diseases such as type I diabetes and cancer and helps in drug development. Most proteins cannot perform their biological and psychological functionalities without undergoing such modification. Therefore, it is essential to identify such sites by computational techniques because of experimental limitations. This study aims to analyze and synthesize the progress made in discovering N-linked sites using machine learning methods. It also explores the performance of currently available tools to predict such sites. Almost seventy research articles published in recognized journals of the N-linked glycosylation field have been shortlisted after a rigorous filtering process. The findings of the studies have been reported based on multiple aspects: publication channel, feature set construction method, training algorithm, and performance evaluation. Moreover, a literature survey has developed a taxonomy of N-linked sequence identification. Our study focuses on the performance evaluation criteria, and the importance of N-linked glycosylation motivates us to discover resources that use computational methods instead of the experimental method due to its limitations.
Affiliation(s)
- Muhammad Aizaz Akmal
  - Department of Computer Science, University of Engineering and Technology, KSK, Lahore, Punjab, Pakistan
- Muhammad Awais Hassan
  - Department of Computer Science, University of Engineering and Technology, Lahore, Punjab, Pakistan
- Shoaib Muhammad
  - Department of Computer Science, University of Engineering and Technology, Lahore, Punjab, Pakistan
- Khaldoon S. Khurshid
  - Department of Computer Science, University of Engineering and Technology, Lahore, Punjab, Pakistan
|
26
|
Flevaris K, Kontoravdi C. Immunoglobulin G N-glycan Biomarkers for Autoimmune Diseases: Current State and a Glycoinformatics Perspective. Int J Mol Sci 2022; 23:5180. [PMID: 35563570 PMCID: PMC9100869 DOI: 10.3390/ijms23095180] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 02/28/2022] [Revised: 05/02/2022] [Accepted: 05/04/2022] [Indexed: 02/04/2023] Open
Abstract
The effective treatment of autoimmune disorders can greatly benefit from disease-specific biomarkers that are functionally involved in immune system regulation and can be collected through minimally invasive procedures. In this regard, human serum IgG N-glycans are promising for uncovering disease predisposition and monitoring progression, and for the identification of specific molecular targets for advanced therapies. In particular, the IgG N-glycome in diseased tissues is considered to be disease-dependent; thus, specific glycan structures may be involved in the pathophysiology of autoimmune diseases. This study provides a critical overview of the literature on human IgG N-glycomics, with a focus on the identification of disease-specific glycan alterations. In order to expedite the establishment of clinically-relevant N-glycan biomarkers, the employment of advanced computational tools for the interpretation of clinical data and their relationship with the underlying molecular mechanisms may be critical. Glycoinformatics tools, including artificial intelligence and systems glycobiology approaches, are reviewed for their potential to provide insight into patient stratification and disease etiology. Challenges in the integration of such glycoinformatics approaches in N-glycan biomarker research are critically discussed.
Affiliation(s)
- Cleo Kontoravdi
  - Department of Chemical Engineering, Imperial College London, London SW7 2AZ, UK
|
27
|
Lundstrøm J, Korhonen E, Lisacek F, Bojar D. LectinOracle: A Generalizable Deep Learning Model for Lectin-Glycan Binding Prediction. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2022; 9:e2103807. [PMID: 34862760 PMCID: PMC8728848 DOI: 10.1002/advs.202103807] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 08/30/2021] [Revised: 11/03/2021] [Indexed: 05/07/2023]
Abstract
Ranging from bacterial cell adhesion over viral cell entry to human innate immunity, glycan-binding proteins or lectins abound in nature. Widely used as staining and characterization reagents in cell biology and crucial for understanding the interactions in biological systems, lectins are a focal point of study in glycobiology. Yet the sheer breadth and depth of specificity for diverse oligosaccharide motifs has made studying lectins a largely piecemeal approach, with few options to generalize. Here, LectinOracle, a model combining transformer-based representations for proteins and graph convolutional neural networks for glycans to predict their interaction, is presented. Using a curated data set of 564,647 unique protein-glycan interactions, it is shown that LectinOracle predictions agree with literature-annotated specificities for a wide range of lectins. Using a range of specialized glycan arrays, it is shown that LectinOracle predictions generalize to new glycans and lectins, with qualitative and quantitative agreement with experimental data. It is further demonstrated that LectinOracle can be used to improve lectin classification, accelerate lectin directed evolution, predict epidemiological outcomes in the context of influenza virus, and analyze whole lectomes in host-microbe interactions. It is envisioned that the herein presented platform will advance both the study of lectins and their role in (glyco)biology.
Affiliation(s)
- Jon Lundstrøm
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg 41390, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg 41390, Sweden
- Emma Korhonen
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg 41390, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg 41390, Sweden
- Frédérique Lisacek
  - Swiss Institute of Bioinformatics, Geneva 1227, Switzerland
  - Computer Science Department, UniGe, Geneva 1227, Switzerland
  - Section of Biology, UniGe, Geneva 1205, Switzerland
- Daniel Bojar
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg 41390, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg 41390, Sweden
|
28
|
Dealing with the Ambiguity of Glycan Substructure Search. MOLECULES (BASEL, SWITZERLAND) 2021; 27:molecules27010065. [PMID: 35011294 PMCID: PMC8746581 DOI: 10.3390/molecules27010065] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/02/2021] [Revised: 12/17/2021] [Accepted: 12/17/2021] [Indexed: 01/15/2023]
Abstract
The level of ambiguity in describing glycan structure has significantly increased with the upsurge of large-scale glycomics and glycoproteomics experiments. Consequently, an ontology-based model appears as an appropriate solution for navigating these data. However, navigation is not sufficient and the model should also enable advanced search and comparison. A new ontology with a tree logical structure is introduced to represent glycan structures irrespective of the precision of molecular details. The model heavily relies on the GlycoCT encoding of glycan structures. Its implementation in the GlySTreeM knowledge base was validated with GlyConnect data and benchmarked with the Glycowork library. GlySTreeM is shown to be fast, consistent, reliable and more flexible than existing solutions for matching parts of or whole glycan structures. The model is also well suited for painless future expansion.
|
29
|
Thomès L, Bojar D. The Role of Fucose-Containing Glycan Motifs Across Taxonomic Kingdoms. Front Mol Biosci 2021; 8:755577. [PMID: 34631801 PMCID: PMC8492980 DOI: 10.3389/fmolb.2021.755577] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/09/2021] [Accepted: 09/10/2021] [Indexed: 11/13/2022] Open
Abstract
The extraordinary diversity of glycans leads to large differences in the glycomes of different kingdoms of life. Yet, while most monosaccharides are solely found in certain taxonomic groups, there is a small set of monosaccharides with widespread distribution across nearly all domains of life. These general monosaccharides are particularly relevant for glycan motifs, as they can readily be used by commensals and pathogens to mimic host glycans or hijack existing glycan recognition systems. Among these, the monosaccharide fucose is especially interesting, as it frequently presents itself as a terminal monosaccharide, primed for interaction with proteins. Here, we analyze fucose-containing glycan motifs across all taxonomic kingdoms. Using a hereby presented large species-specific glycan dataset and a plethora of methods for glycan-focused bioinformatics and machine learning, we identify characteristic as well as shared fucose-containing glycan motifs for various taxonomic groups, demonstrating clear differences in fucose usage. Even within domains, fucose is used differentially based on an organism’s physiology and habitat. We particularly highlight differences in fucose-containing motifs between vertebrates and invertebrates. With the example of pathogenic and non-pathogenic Escherichia coli strains, we also demonstrate the importance of fucose-containing motifs in molecular mimicry and thereby pathogenic potential. We envision that this study will shed light on an important class of glycan motifs, with potential new insights into the role of fucosylated glycans in symbiosis, pathogenicity, and immunity.
Affiliation(s)
- Luc Thomès
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
- Daniel Bojar
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden
  - Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
|