1
Hayes C, Daponte V, Mariethoz J, Lisacek F. This is GlycoQL. Bioinformatics 2022; 38:ii162-ii167. [PMID: 36124803 DOI: 10.1093/bioinformatics/btac500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2022]
Abstract
MOTIVATION: We have previously designed and implemented a tree-based ontology to represent glycan structures with the aim of searching these structures with a glyco-driven syntax. This resulted in creating the GlySTreeM knowledge-base as a linchpin of the structural matching procedure and we now introduce a query language, called GlycoQL, for the actual implementation of a glycan structure search. RESULTS: The methodology is described and illustrated with a use-case focused on Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) spike protein glycosylation. We show how to enhance site annotation with federated queries involving UniProt and GlyConnect, our glycoprotein database. AVAILABILITY AND IMPLEMENTATION: https://glyconnect.expasy.org/glycoql/.
Affiliation(s)
- Catherine Hayes
  - Department of Computer Science, University of Geneva, Geneva 1227, Switzerland
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva 1211, Switzerland
- Vincenzo Daponte
  - Department of Computer Science, University of Geneva, Geneva 1227, Switzerland
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva 1211, Switzerland
- Julien Mariethoz
  - Department of Computer Science, University of Geneva, Geneva 1227, Switzerland
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva 1211, Switzerland
- Frederique Lisacek
  - Department of Computer Science, University of Geneva, Geneva 1227, Switzerland
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva 1211, Switzerland
  - Section of Biology, University of Geneva, Geneva 1211, Switzerland
2
Mariethoz J, Alocci D, Karlsson NG, Packer NH, Lisacek F. An Interactive View of Glycosylation. Methods Mol Biol 2022; 2370:41-65. [PMID: 34611864 DOI: 10.1007/978-1-0716-1685-7_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
The present chapter focuses on the interactive and explorative aspects of bioinformatics resources that have been recently released in glycobiology. The comparative analysis of data in a field where knowledge is scattered, incomplete, and disconnected from main biology requires efficient visualization, integration, and interactive tools that are currently only partially implemented. This overview highlights converging efforts toward building a consistent picture of protein glycosylation.
Affiliation(s)
- Julien Mariethoz
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
  - Computer Science Department, University of Geneva, Geneva, Switzerland
- Davide Alocci
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
- Niclas G Karlsson
  - Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Nicolle H Packer
  - Department of Molecular Sciences and ARC Centre of Excellence for Nanoscale Biophotonics, Macquarie University, Sydney, NSW, Australia
  - Institute for Glycomics, Griffith University, Gold Coast, QLD, Australia
- Frédérique Lisacek
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, University of Geneva, Geneva, Switzerland
3
Yamada I, Campbell MP, Edwards N, Castro LJ, Lisacek F, Mariethoz J, Ono T, Ranzinger R, Shinmachi D, Aoki-Kinoshita KF. Corrigendum to: The glycoconjugate ontology (GlycoCoO) for standardizing the annotation of glycoconjugate data and its application. Glycobiology 2021; 32:909. [PMID: 34379754 PMCID: PMC9487897 DOI: 10.1093/glycob/cwab065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2020] [Revised: 12/31/2020] [Accepted: 01/01/2021] [Indexed: 12/01/2022]
Affiliation(s)
- Issaku Yamada
  - Research Department, The Noguchi Institute, 1-9-7 Kaga, Itabashi, Tokyo 173-0003, Japan
- Matthew P Campbell
  - Institute for Glycomics, Griffith University at Gold Coast, Southport, QLD 4215, Australia
- Nathan Edwards
  - Department of Biochemistry, Molecular and Cellular Biology, Georgetown University Medical Center, Washington, D.C. 20007, USA
- Leyla Jael Castro
  - ZB MED Information Centre for Life Sciences, Gleueler Str. 60, 50931 Cologne, Germany
- Frederique Lisacek
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Computer Science Department, University of Geneva, route de Drize 7, CH-1227 Geneva, Switzerland, and also Section of Biology, University of Geneva, Geneva, Switzerland
- Julien Mariethoz
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 7 Route de Drize, 1227 Geneva, Switzerland
- Tamiko Ono
  - Faculty of Science and Engineering, Soka University, 1-236 Tangi-machi, Hachioji, Tokyo 192-8577, Japan
- Rene Ranzinger
  - Complex Carbohydrate Research Center, The University of Georgia, 315 Riverbend Rd, Athens, Georgia 30602, USA
- Daisuke Shinmachi
  - R&D Department, SparqLite LLC., 1615-22 Ishikawamachi, Hachioji, Tokyo 192-0032, Japan
- Kiyoko F Aoki-Kinoshita
  - Glycan & Life Science Integration Center (GaLSIC), Faculty of Science and Engineering, Soka University, 1-236 Tangi-machi, Hachioji, Tokyo 192-8577, Japan
4
Murugan AVM, Oliveira T, Alagesan K, Mojtahedinyazdi Y, Mariethoz J, Hayes C, Lisacek F, Karlsson N, Nothaft H, Szymanski C, Finlayson K, Merwe J, Richardson S, Kolarich D. Evolutionary Glycomics: A Comprehensive Study of Vertebrate Host Serum/Plasma Glycome Using Orthogonal Glycomics Techniques. FASEB J 2021. [DOI: 10.1096/fasebj.2021.35.s1.04538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
Affiliation(s)
- Julien Mariethoz
  - SIB Swiss Institute of Bioinformatics, Geneva
- Catherine Hayes
  - SIB Swiss Institute of Bioinformatics, Geneva
- Frédérique Lisacek
  - SIB Swiss Institute of Bioinformatics, Geneva
- Niclas Karlsson
  - Department of Medical Biochemistry, University of Gothenburg, Gothenburg
5
Yamada I, Campbell MP, Edwards N, Castro LJ, Lisacek F, Mariethoz J, Ono T, Ranzinger R, Shinmachi D, Aoki-Kinoshita KF. The glycoconjugate ontology (GlycoCoO) for standardizing the annotation of glycoconjugate data and its application. Glycobiology 2021; 31:741-750. [PMID: 33677548 PMCID: PMC8351504 DOI: 10.1093/glycob/cwab013] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/03/2020] [Revised: 12/31/2020] [Accepted: 01/01/2021] [Indexed: 01/19/2023]
Abstract
Recent years have seen great advances in the development of glycoproteomics protocols and methods, resulting in a sustained increase in the reporting of proteins, their attached glycans and glycosylation sites. However, only very few of these reports find their way into databases or data repositories. One of the major reasons is the absence of a digital standard to represent glycoproteins and the challenging annotations with glycans. Depending on the experimental method, such a standard must be able to represent glycans as complete structures or as compositions, store not just single glycans but also represent glycoforms on a specific glycosylation site, deal with partially missing site information if no site mapping was performed, and store abundances or ratios of glycans within a glycoform of a specific site. To support the above, we have developed the GlycoConjugate Ontology (GlycoCoO) as a standard semantic framework to describe and represent glycoproteomics data. GlycoCoO can be used to represent glycoproteomics data in triplestores and can serve as a basis for data exchange formats. The ontology, database providers and supporting documentation are available online (https://github.com/glycoinfo/GlycoCoO).
Affiliation(s)
- Issaku Yamada
  - Research Department, The Noguchi Institute, 1-9-7 Kaga, Itabashi, Tokyo 173-0003, Japan
- Matthew P Campbell
  - Institute for Glycomics, Griffith University at Gold Coast, Southport, QLD 4215, Australia
- Nathan Edwards
  - Department of Biochemistry, Molecular and Cellular Biology, Georgetown University Medical Center, Washington, D.C. 20007, USA
- Leyla Jael Castro
  - ZB MED Information Centre for Life Sciences, Gleueler Str. 60, 50931 Cologne, Germany
- Frederique Lisacek
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Computer Science Department, University of Geneva, route de Drize 7, CH-1227 Geneva, Switzerland, and also Section of Biology, University of Geneva, Geneva, Switzerland
- Julien Mariethoz
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 7 Route de Drize, 1227 Geneva, Switzerland
- Tamiko Ono
  - Faculty of Science and Engineering, Soka University, 1-236 Tangi-machi, Hachioji, Tokyo 192-8577, Japan
- Rene Ranzinger
  - Complex Carbohydrate Research Center, The University of Georgia, 315 Riverbend Rd, Athens, Georgia 30602, USA
- Daisuke Shinmachi
  - R&D Department, SparqLite LLC., 1615-22 Ishikawamachi, Hachioji, Tokyo 192-0032, Japan
- Kiyoko F Aoki-Kinoshita
  - Glycan & Life Science Integration Center (GaLSIC), Faculty of Science and Engineering, Soka University, 1-236 Tangi-machi, Hachioji, Tokyo 192-8577, Japan
6
Bonnardel F, Mariethoz J, Pérez S, Imberty A, Lisacek F. LectomeXplore, an update of UniLectin for the discovery of carbohydrate-binding proteins based on a new lectin classification. Nucleic Acids Res 2021; 49:D1548-D1554. [PMID: 33174598 PMCID: PMC7778903 DOI: 10.1093/nar/gkaa1019] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 09/15/2020] [Revised: 10/13/2020] [Accepted: 10/16/2020] [Indexed: 12/22/2022]
Abstract
Lectins are non-covalent glycan-binding proteins mediating cellular interactions but their annotation in newly sequenced organisms is lacking. The limited size of functional domains and the low level of sequence similarity challenge usual bioinformatics tools. The identification of lectin domains in proteomes requires the manual curation of sequence alignments based on structural folds. A new lectin classification is proposed. It is built on three levels: (i) 35 lectin domain folds, (ii) 109 classes of lectins sharing at least 20% sequence similarity and (iii) 350 families of lectins sharing at least 70% sequence similarity. This information is compiled in the UniLectin platform that includes the previously described UniLectin3D database of curated lectin 3D structures. Since its first release, UniLectin3D has been updated with 485 additional 3D structures. The database is now complemented by two additional modules: PropLec containing predicted β-propeller lectins and LectomeXplore including predicted lectins from sequences of the NCBI-nr and UniProt for every curated lectin class. UniLectin is accessible at https://www.unilectin.eu/.
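The three-level classification described in this abstract can be pictured with a small sketch. Only the two thresholds (20% for classes, 70% for families) come from the abstract; the function name, the fold labels, and the idea of feeding in a precomputed pairwise similarity are illustrative assumptions and do not describe the UniLectin implementation.

```python
# Thresholds quoted in the abstract (percent sequence similarity).
CLASS_THRESHOLD = 20.0
FAMILY_THRESHOLD = 70.0


def shared_level(fold_a, fold_b, similarity_percent):
    """Return the deepest level two lectin domains could share.

    Hypothetical helper: fold labels and the similarity value are assumed
    to be supplied by the caller (e.g. from a curated alignment).
    """
    if fold_a != fold_b:
        return "none"      # different folds: no shared level
    if similarity_percent >= FAMILY_THRESHOLD:
        return "family"    # at least 70% similarity
    if similarity_percent >= CLASS_THRESHOLD:
        return "class"     # at least 20% similarity
    return "fold"          # same fold only


# Example with made-up fold labels and similarity values.
examples = [
    ("fold-A", "fold-A", 85.0),
    ("fold-A", "fold-A", 35.0),
    ("fold-A", "fold-A", 12.0),
    ("fold-A", "fold-B", 90.0),
]
levels = [shared_level(a, b, s) for a, b, s in examples]
print(levels)
```

The sketch treats the levels as nested, so a pair that qualifies for the same family also sits in the same class and fold; that nesting is an assumption drawn from the wording of the abstract.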
Affiliation(s)
- François Bonnardel
  - Univ. Grenoble Alpes, CNRS, CERMAV, 38000 Grenoble, France
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, CH-1227 Geneva, Switzerland
  - Computer Science Department, University of Geneva, CH-1227 Geneva, Switzerland
- Julien Mariethoz
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, CH-1227 Geneva, Switzerland
  - Computer Science Department, University of Geneva, CH-1227 Geneva, Switzerland
  - Section of Biology, University of Geneva, CH-1205 Geneva, Switzerland
- Serge Pérez
  - Univ. Grenoble Alpes, CNRS, CERMAV, 38000 Grenoble, France
- Anne Imberty
  - Univ. Grenoble Alpes, CNRS, CERMAV, 38000 Grenoble, France
- Frédérique Lisacek
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, CH-1227 Geneva, Switzerland
  - Computer Science Department, University of Geneva, CH-1227 Geneva, Switzerland
  - Section of Biology, University of Geneva, CH-1205 Geneva, Switzerland
7
Robin T, Mariethoz J, Lisacek F. Examining and Fine-tuning the Selection of Glycan Compositions with GlyConnect Compozitor. Mol Cell Proteomics 2020; 19:1602-1618. [PMID: 32636234 PMCID: PMC8014996 DOI: 10.1074/mcp.ra120.002041] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/16/2020] [Revised: 07/01/2020] [Indexed: 01/22/2023]
Abstract
A key point in achieving accurate intact glycopeptide identification is the definition of the glycan composition file that is used to match experimental with theoretical masses by a glycoproteomics search engine. At present, these files are mainly built from searching the literature and/or querying data sources focused on posttranslational modifications. Most glycoproteomics search engines include a default composition file that is readily used when processing MS data. We introduce here a glycan composition visualizing and comparative tool associated with the GlyConnect database and called GlyConnect Compozitor. It offers a web interface through which the database can be queried to bring out contextual information relative to a set of glycan compositions. The tool takes advantage of compositions being related to one another through shared monosaccharide counts and outputs interactive graphs summarizing information searched in the database. These results provide a guide for selecting or deselecting compositions in a file in order to reflect the context of a study as closely as possible. They also confirm the consistency of a set of compositions based on the content of the GlyConnect database. As part of the tool collection of the Glycomics@ExPASy initiative, Compozitor is hosted at https://glyconnect.expasy.org/compozitor/ where it can be run as a web application. It is also directly accessible from the GlyConnect database.
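The idea of compositions "being related to one another through shared monosaccharide counts" can be illustrated with a short sketch. It assumes a `Name:count` text notation for compositions and links two compositions when they differ by exactly one monosaccharide; both the notation and the linking rule are assumptions made for illustration, and the function names are hypothetical rather than part of Compozitor.

```python
from itertools import combinations


def parse_composition(text):
    """Parse a string such as 'Hex:5 HexNAc:4' into {'Hex': 5, 'HexNAc': 4}."""
    counts = {}
    for token in text.split():
        name, _, number = token.partition(":")
        counts[name] = int(number)
    return counts


def distance(a, b):
    """Total difference in monosaccharide counts between two compositions."""
    names = set(a) | set(b)
    return sum(abs(a.get(n, 0) - b.get(n, 0)) for n in names)


def composition_graph(labels):
    """Return edges between compositions that differ by one monosaccharide."""
    parsed = {label: parse_composition(label) for label in labels}
    return [
        (left, right)
        for left, right in combinations(labels, 2)
        if distance(parsed[left], parsed[right]) == 1
    ]


# Illustrative input; these compositions are examples, not curated data.
compositions = [
    "Hex:5 HexNAc:4",
    "Hex:5 HexNAc:4 NeuAc:1",
    "Hex:5 HexNAc:4 NeuAc:2",
    "Hex:3 HexNAc:2",
]
edges = composition_graph(compositions)
for left, right in edges:
    print(left, "--", right)
```

A composition left without any edge, such as the last one in this input, is the kind of outlier a user might want to inspect before keeping it in a composition file.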
Affiliation(s)
- Thibault Robin
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, CMU, Geneva, Switzerland
  - Computer Science Dept., Faculty of Science, University of Geneva, Switzerland
  - CALIPHO Group, SIB Swiss Institute of Bioinformatics, CMU, Geneva, Switzerland
  - Microbiology and Molecular Medicine Dept., Faculty of Medicine, University of Geneva, Switzerland
- Julien Mariethoz
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, CMU, Geneva, Switzerland
  - Computer Science Dept., Faculty of Science, University of Geneva, Switzerland
- Frédérique Lisacek
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, CMU, Geneva, Switzerland
  - Computer Science Dept., Faculty of Science, University of Geneva, Switzerland
  - Section of Biology, Faculty of Science, University of Geneva, Switzerland
8
Bonnardel F, Mariethoz J, Salentin S, Robin X, Schroeder M, Perez S, Lisacek F, Imberty A. UniLectin3D, a database of carbohydrate binding proteins with curated information on 3D structures and interacting ligands. Nucleic Acids Res 2020; 47:D1236-D1244. [PMID: 30239928 PMCID: PMC6323968 DOI: 10.1093/nar/gky832] [Citation(s) in RCA: 66] [Impact Index Per Article: 16.5] [Received: 08/09/2018] [Accepted: 09/07/2018] [Indexed: 01/02/2023]
Abstract
Lectins, and related receptors such as adhesins and toxins, are glycan-binding proteins from all origins that decipher the glycocode, i.e. the structural information encoded in the conformation of complex carbohydrates present on the surface of all cells. Lectins are still poorly classified and annotated, but since their functions are based on ligand recognition, their 3D-structures provide a solid foundation for characterization. UniLectin3D is a curated database that classifies lectins on origin and fold, with cross-links to literature, other databases in glycosciences and functional data such as known specificity. The database provides detailed information on lectins, their bound glycan ligands, and features their interactions using the Protein–Ligand Interaction Profiler (PLIP) server. Special care was devoted to the description of the bound glycan ligands with the use of simple graphical representation and numerical format for cross-linking to other databases in glycoscience. We conceived the design of the database architecture and the navigation tools to account for all organisms, as well as to search for oligosaccharide epitopes complexed within specified binding sites. UniLectin3D is accessible at https://www.unilectin.eu/unilectin3D.
Affiliation(s)
- François Bonnardel
  - Univ. Grenoble Alpes, CNRS, CERMAV, 38000 Grenoble, France
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, CH-1227 Geneva, Switzerland
  - Department of Computer Science, University of Geneva, Route de Drize 7, CH-1227 Geneva, Switzerland
- Julien Mariethoz
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, CH-1227 Geneva, Switzerland
  - Department of Computer Science, University of Geneva, Route de Drize 7, CH-1227 Geneva, Switzerland
- Sebastian Salentin
  - Biotechnology Center (BIOTEC), TU Dresden, Tatzberg 47-49, 01307 Dresden, Germany
- Xavier Robin
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - Computational Structural Biology Group, SIB Swiss Institute of Bioinformatics, CH-4056 Basel, Switzerland
- Michael Schroeder
  - Biotechnology Center (BIOTEC), TU Dresden, Tatzberg 47-49, 01307 Dresden, Germany
- Serge Perez
  - Univ. Grenoble Alpes, CNRS, DPM, 38000 Grenoble, France
- Frédérique Lisacek
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, CH-1227 Geneva, Switzerland
  - Department of Computer Science, University of Geneva, Route de Drize 7, CH-1227 Geneva, Switzerland
  - Section of Biology, University of Geneva, CH-1205 Geneva, Switzerland
- Anne Imberty
  - Univ. Grenoble Alpes, CNRS, CERMAV, 38000 Grenoble, France
9
Rojas-Macias MA, Mariethoz J, Andersson P, Jin C, Venkatakrishnan V, Aoki NP, Shinmachi D, Ashwood C, Madunic K, Zhang T, Miller RL, Horlacher O, Struwe WB, Watanabe Y, Okuda S, Levander F, Kolarich D, Rudd PM, Wuhrer M, Kettner C, Packer NH, Aoki-Kinoshita KF, Lisacek F, Karlsson NG. Towards a standardized bioinformatics infrastructure for N- and O-glycomics. Nat Commun 2019; 10:3275. [PMID: 31332201 PMCID: PMC6796180 DOI: 10.1038/s41467-019-11131-x] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Received: 01/24/2019] [Accepted: 06/24/2019] [Indexed: 12/21/2022]
Abstract
The mass spectrometry (MS)-based analysis of free polysaccharides and glycans released from proteins, lipids and proteoglycans increasingly relies on databases and software. Here, we review progress in the bioinformatics analysis of protein-released N- and O-linked glycans (N- and O-glycomics) and propose an e-infrastructure to overcome current deficits in data and experimental transparency. This workflow enables the standardized submission of MS-based glycomics information into the public repository UniCarb-DR. It implements the MIRAGE (Minimum Requirement for A Glycomics Experiment) reporting guidelines, storage of unprocessed MS data in the GlycoPOST repository and glycan structure registration using the GlyTouCan registry, thereby supporting the development and extension of a glycan structure knowledgebase.
Affiliation(s)
- Miguel A Rojas-Macias
  - Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, 40530, Sweden
- Julien Mariethoz
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland
  - Computer Science Department, University of Geneva, Geneva, 1227, Switzerland
- Peter Andersson
  - Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, 40530, Sweden
- Chunsheng Jin
  - Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, 40530, Sweden
- Vignesh Venkatakrishnan
  - Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, 40530, Sweden
- Nobuyuki P Aoki
  - Soka University, Hachioji, 192-8577, Tokyo, Japan
  - SparqLite LLC., Hachioji, 192-0032, Tokyo, Japan
- Daisuke Shinmachi
  - Soka University, Hachioji, 192-8577, Tokyo, Japan
  - SparqLite LLC., Hachioji, 192-0032, Tokyo, Japan
- Christopher Ashwood
  - Department of Molecular Sciences, Macquarie University, Sydney, 2109, Australia
  - Department of Biochemistry, Medical College of Wisconsin, Milwaukee, WI, 53226, USA
- Tao Zhang
  - Leiden University Medical Center, Leiden, 2333ZA, Netherlands
- Rebecca L Miller
  - Copenhagen Centre for Glycomics, Department of Cellular and Molecular Medicine, University of Copenhagen, København, DK-2200, Denmark
- Oliver Horlacher
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland
- Weston B Struwe
  - Department of Chemistry, Chemistry Research Laboratory, University of Oxford, Oxford, OX1 3TA, UK
- Yu Watanabe
  - Graduate School of Medical and Dental Sciences, Niigata University, 950-2181, Niigata, Japan
- Shujiro Okuda
  - Graduate School of Medical and Dental Sciences, Niigata University, 950-2181, Niigata, Japan
- Fredrik Levander
  - National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Department of Immunotechnology, Lund University, Lund, 22387, Sweden
- Daniel Kolarich
  - Institute for Glycomics, Gold Coast Campus, Griffith University, Gold Coast, QLD, QLD 4222, Australia
  - ARC Centre for Nanoscale BioPhotonics, Macquarie University and Griffith University, North Ryde and Gold Coast, NSW and QLD, NSW 2109 and QLD 4222, Australia
- Pauline M Rudd
  - Bioprocessing Technology Institute, AStar, Singapore, 138668, Singapore
- Manfred Wuhrer
  - Leiden University Medical Center, Leiden, 2333ZA, Netherlands
- Nicolle H Packer
  - Department of Molecular Sciences, Macquarie University, Sydney, 2109, Australia
  - Institute for Glycomics, Gold Coast Campus, Griffith University, Gold Coast, QLD, QLD 4222, Australia
  - ARC Centre for Nanoscale BioPhotonics, Macquarie University and Griffith University, North Ryde and Gold Coast, NSW and QLD, NSW 2109 and QLD 4222, Australia
- Frédérique Lisacek
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland
  - Computer Science Department, University of Geneva, Geneva, 1227, Switzerland
  - Section of Biology, University of Geneva, Geneva, 1211, Switzerland
- Niclas G Karlsson
  - Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, 40530, Sweden
10
Clerc O, Mariethoz J, Rivet A, Lisacek F, Pérez S, Ricard-Blum S. A pipeline to translate glycosaminoglycan sequences into 3D models. Application to the exploration of glycosaminoglycan conformational space. Glycobiology 2019; 29:36-44. [PMID: 30239692 DOI: 10.1093/glycob/cwy084] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 06/13/2018] [Accepted: 09/16/2018] [Indexed: 12/11/2022]
Abstract
Mammalian glycosaminoglycans (GAGs) are linear complex polysaccharides comprising heparan sulfate, heparin, dermatan sulfate, chondroitin sulfate, keratan sulfate and hyaluronic acid. They bind to numerous proteins and these interactions mediate their biological activities. GAG-protein interaction data reported in the literature are curated mostly in the MatrixDB database (http://matrixdb.univ-lyon1.fr/). However, a standard nomenclature and a machine-readable format of GAGs together with bioinformatics tools for mining these interaction data are lacking. We report here the building of an automated pipeline to (i) standardize the format of GAG sequences interacting with proteins manually curated from the literature, (ii) translate them into the machine-readable GlycoCT format and into SNFG (Symbol Nomenclature for Glycans) images and (iii) convert their sequences into a format processed by a builder generating three-dimensional structures of polysaccharides based on a repertoire of conformations experimentally validated by data extracted from crystallized GAG-protein complexes. We have developed for this purpose a converter (the CT23D converter) to automatically translate the GlycoCT code of a GAG sequence into the input file required to construct a three-dimensional model.
Affiliation(s)
- Olivier Clerc
  - University Lyon, University Claude Bernard Lyon 1, CNRS, INSA Lyon, CPE, Institute of Molecular and Supramolecular Chemistry and Biochemistry, UMR 5246, Villeurbanne Cedex, France
- Julien Mariethoz
  - SIB Swiss Institute of Bioinformatics, Geneva 4, Switzerland
  - Department of Computer Science, University of Geneva, Geneva 4, Switzerland
- Alain Rivet
  - Centre de Recherches sur les MAcromolécules Végétales, UPR 5301 CNRS, University Grenoble Alpes, Grenoble, France
- Frédérique Lisacek
  - SIB Swiss Institute of Bioinformatics, Geneva 4, Switzerland
  - Department of Computer Science, University of Geneva, Geneva 4, Switzerland
  - Section of Biology, University of Geneva, Geneva 4, Switzerland
- Serge Pérez
  - Centre de Recherches sur les MAcromolécules Végétales, UPR 5301 CNRS, University Grenoble Alpes, Grenoble, France
- Sylvie Ricard-Blum
  - University Lyon, University Claude Bernard Lyon 1, CNRS, INSA Lyon, CPE, Institute of Molecular and Supramolecular Chemistry and Biochemistry, UMR 5246, Villeurbanne Cedex, France
11
Alocci D, Mariethoz J, Gastaldello A, Gasteiger E, Karlsson NG, Kolarich D, Packer NH, Lisacek F. GlyConnect: Glycoproteomics Goes Visual, Interactive, and Analytical. J Proteome Res 2018; 18:664-677. [DOI: 10.1021/acs.jproteome.8b00766] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Indexed: 12/29/2022]
Affiliation(s)
- Davide Alocci
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Rue Michel-Servet 1, CH-1211 Geneva, Switzerland
  - Computer Science Department, University of Geneva, CH-1227 Geneva, Switzerland
- Julien Mariethoz
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Rue Michel-Servet 1, CH-1211 Geneva, Switzerland
  - Computer Science Department, University of Geneva, CH-1227 Geneva, Switzerland
- Alessandra Gastaldello
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Rue Michel-Servet 1, CH-1211 Geneva, Switzerland
  - Computer Science Department, University of Geneva, CH-1227 Geneva, Switzerland
- Elisabeth Gasteiger
  - Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, CH-1211 Geneva, Switzerland
- Niclas G. Karlsson
  - Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, University of Gothenburg, 40530 Gothenburg, Sweden
- Daniel Kolarich
  - Institute for Glycomics, Griffith University, Southport, Queensland 4215, Australia
  - ARC Centre for Nanoscale BioPhotonics, Macquarie University and Griffith University, Sydney, New South Wales 2109, Australia
- Nicolle H. Packer
  - Institute for Glycomics, Griffith University, Southport, Queensland 4215, Australia
  - ARC Centre for Nanoscale BioPhotonics, Macquarie University and Griffith University, Sydney, New South Wales 2109, Australia
  - Department of Molecular Sciences, Macquarie University, Sydney, New South Wales 2109, Australia
- Frédérique Lisacek
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Rue Michel-Servet 1, CH-1211 Geneva, Switzerland
  - Computer Science Department, University of Geneva, CH-1227 Geneva, Switzerland
  - Section of Biology, University of Geneva, CH-1211 Geneva, Switzerland
12
Alocci D, Suchánková P, Costa R, Hory N, Mariethoz J, Vařeková RS, Toukach P, Lisacek F. SugarSketcher: Quick and Intuitive Online Glycan Drawing. Molecules 2018; 23:E3206. [PMID: 30563078 PMCID: PMC6320881 DOI: 10.3390/molecules23123206] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 09/14/2018] [Revised: 11/23/2018] [Accepted: 11/29/2018] [Indexed: 01/24/2023]
Abstract
SugarSketcher is an intuitive and fast JavaScript interface module for online drawing of glycan structures in the popular Symbol Nomenclature for Glycans (SNFG) notation and exporting them to various commonly used formats encoding carbohydrate sequences (e.g., GlycoCT) or quality images (e.g., SVG). It does not require a backend server or any specific browser plugins and can be integrated in any web glycoinformatics project. SugarSketcher allows both glycobiologists and non-expert users to draw glycans. The "quick mode" allows a newcomer to build up a glycan structure with only a limited knowledge of carbohydrate chemistry. The "normal mode" integrates advanced options which enable glycobiologists to tailor complex carbohydrate structures. The source code is freely available on GitHub and glycoinformaticians are encouraged to participate in the development process while users are invited to test a prototype available on the ExPASy website and send feedback.
Collapse
Affiliation(s)
- Davide Alocci
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland.
- Computer Science Department, University of Geneva, 1211 Geneva, Switzerland.
| | - Pavla Suchánková
- CEITEC⁻Central European Institute of Technology, Masaryk University Brno, 625 00 Brno-Bohunice, Czech Republic.
- National Centre for Biomolecular Research, Faculty of Science, 625 00 Brno-Bohunice, Czech Republic.
| | - Renaud Costa
- Polytech Nice Sophia, Campus SophiaTech, 06903 Sophia-Antipolis, France.
| | - Nicolas Hory
- Polytech Nice Sophia, Campus SophiaTech, 06903 Sophia-Antipolis, France.
| | - Julien Mariethoz
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland.
- Computer Science Department, University of Geneva, 1211 Geneva, Switzerland.
| | - Radka Svobodová Vařeková
- CEITEC-Central European Institute of Technology, Masaryk University Brno, 625 00 Brno-Bohunice, Czech Republic.
- National Centre for Biomolecular Research, Faculty of Science, 625 00 Brno-Bohunice, Czech Republic.
| | - Philip Toukach
- Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Laboratory of Carbohydrate Chemistry, 119991 Moscow, Russia.
| | - Frédérique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland.
- Computer Science Department, University of Geneva, 1211 Geneva, Switzerland.
- Section of Biology, University of Geneva, 1211 Geneva, Switzerland.
| |
|
13
|
Alocci D, Ghraichy M, Barletta E, Gastaldello A, Mariethoz J, Lisacek F. Understanding the glycome: an interactive view of glycosylation from glycocompositions to glycoepitopes. Glycobiology 2018. [PMID: 29518231 DOI: 10.1093/glycob/cwy019] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 02/06/2023] Open
Abstract
Nowadays, due to the advance of experimental techniques in glycomics, large collections of glycan profiles are regularly published. The rapid growth of available glycan data accentuates the lack of innovative tools for visualizing and exploring large amounts of information. Scientists resort to using general-purpose spreadsheet applications to create ad hoc data visualizations. Thus, results end up being encoded in publication images and text, while valuable curated data is stored in files as supplementary information. To tackle this problem, we have built an interactive pipeline composed of three tools: Glynsight, EpitopeXtractor and Glydin'. Glycan profile data can be imported in Glynsight, which generates a custom interactive glycan profile. Several profiles can be compared and glycan composition is integrated with structural data stored in databases. Glycan structures of interest can then be sent to EpitopeXtractor to perform a glycoepitope extraction. EpitopeXtractor results can be superimposed on the Glydin' glycoepitope network. The network visualization allows fast detection of clusters of glycoepitopes and discovery of potential new targets. Each of these tools is standalone or can be used in conjunction with the others, depending on the data and the specific interest of the user. All the tools composing this pipeline are part of the Glycomics@ExPASy initiative and are available at https://www.expasy.org/glycomics.
Affiliation(s)
- Davide Alocci
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 7 Route de Drize, 1227 Geneva, Switzerland.,Computer Science Department CUI, University of Geneva, 7 Route de Drize, 1227 Geneva, Switzerland
| | - Marie Ghraichy
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 7 Route de Drize, 1227 Geneva, Switzerland.,Division of Immunology, University Children's Hospital Zurich, Zurich, Switzerland
| | - Elena Barletta
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 7 Route de Drize, 1227 Geneva, Switzerland.,Computer Science Department CUI, University of Geneva, 7 Route de Drize, 1227 Geneva, Switzerland
| | - Alessandra Gastaldello
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 7 Route de Drize, 1227 Geneva, Switzerland.,Computer Science Department CUI, University of Geneva, 7 Route de Drize, 1227 Geneva, Switzerland
| | - Julien Mariethoz
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 7 Route de Drize, 1227 Geneva, Switzerland.,Computer Science Department CUI, University of Geneva, 7 Route de Drize, 1227 Geneva, Switzerland
| | - Frederique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 7 Route de Drize, 1227 Geneva, Switzerland.,Computer Science Department CUI, University of Geneva, 7 Route de Drize, 1227 Geneva, Switzerland.,Section of Biology, University of Geneva, Geneva, Switzerland
| |
|
14
|
Mariethoz J, Alocci D, Gastaldello A, Horlacher O, Gasteiger E, Rojas-Macias M, Karlsson NG, Packer NH, Lisacek F. Glycomics@ExPASy: Bridging the Gap. Mol Cell Proteomics 2018; 17:2164-2176. [PMID: 30097532 PMCID: PMC6210229 DOI: 10.1074/mcp.ra118.000799] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 04/12/2018] [Revised: 07/15/2018] [Indexed: 12/28/2022] Open
Abstract
Glycomics@ExPASy (https://www.expasy.org/glycomics) is the glycomics tab of ExPASy, the server of SIB Swiss Institute of Bioinformatics. It was created in 2016 to centralize web-based glycoinformatics resources developed within an international network of glycoscientists. The hosted collection currently includes mainly databases and tools created and maintained at SIB but also links to a range of reference resources popular in the glycomics community. The philosophy of our toolbox is that it should be {glycoscientist AND protein scientist}-friendly with the aim of (1) popularizing the use of bioinformatics in glycobiology and (2) emphasizing the relationship between glycobiology and protein-oriented bioinformatics resources. The scarcity of data bridging these two disciplines led us to design tools as interactive as possible based on database connectivity to facilitate data exploration and support hypothesis building. Glycomics@ExPASy was designed, and is developed, with a long-term vision in close collaboration with glycoscientists to meet as closely as possible the growing needs of the community for glycoinformatics.
Affiliation(s)
- Julien Mariethoz
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
- Computer Science Department, University of Geneva, Geneva, Switzerland
| | - Davide Alocci
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
- Computer Science Department, University of Geneva, Geneva, Switzerland
| | - Alessandra Gastaldello
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
- Computer Science Department, University of Geneva, Geneva, Switzerland
| | - Oliver Horlacher
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Elisabeth Gasteiger
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Miguel Rojas-Macias
- Glyco Inflammatory Group, Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Niclas G Karlsson
- Glyco Inflammatory Group, Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Nicolle H Packer
- Institute for Glycomics, Gold Coast Campus, Griffith University, Southport, QLD, Australia
- Biomolecular Discovery & Design Research Centre, Macquarie University, North Ryde, NSW, Australia
| | - Frédérique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
- Computer Science Department, University of Geneva, Geneva, Switzerland
- Section of Biology, University of Geneva, Geneva, Switzerland
| |
|
15
|
Abstract
Tandem mass spectrometry, when combined with liquid chromatography and applied to complex mixtures, produces large amounts of raw data, which needs to be analyzed to identify molecular structures. This technique is widely used, particularly in glycomics. Due to a lack of high throughput glycan sequencing software, glycan spectra are predominantly sequenced manually. A challenge for writing glycan-sequencing software is that there is no direct template that can be used to infer structures detectable in an organism. To help alleviate this bottleneck, we present Glycoforest 1.0, a partial de novo algorithm for sequencing glycan structures based on MS/MS spectra. Glycoforest was tested on two data sets (human gastric and salmon mucosa O-linked glycomes) for which MS/MS spectra were annotated manually. Glycoforest generated the human validated structure for 92% of test cases. The correct structure was found as the best scoring match for 70% and among the top 3 matches for 83% of test cases. In addition, the Glycoforest algorithm detected glycan structures from MS/MS spectra missing a manual annotation. In total 1532 MS/MS previously unannotated spectra were annotated by Glycoforest. A portion containing 521 spectra was manually checked confirming that Glycoforest annotated an additional 50 MS/MS spectra overlooked during manual annotation.
Affiliation(s)
- Oliver Horlacher
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland.,University of Geneva, Geneva, 1211, Switzerland
| | - Chunsheng Jin
- Glyco Inflammatory Group, Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, SE405 30, Sweden
| | - Davide Alocci
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland.,University of Geneva, Geneva, 1211, Switzerland
| | - Julien Mariethoz
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland.,University of Geneva, Geneva, 1211, Switzerland
| | - Markus Müller
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland.,University of Geneva, Geneva, 1211, Switzerland
| | - Niclas G Karlsson
- Glyco Inflammatory Group, Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, SE405 30, Sweden
| | - Frederique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland.,University of Geneva, Geneva, 1211, Switzerland
| |
|
16
|
Abstract
UniCarbKB (http://unicarbkb.org) is a comprehensive resource for mammalian glycoprotein and annotation data. In particular, the database provides information on the oligosaccharides characterized from a glycoprotein at either the global or site-specific level. This evidence is accumulated from a peer-reviewed and manually curated collection of information on oligosaccharides derived from membrane and secreted glycoproteins purified from biological fluids and/or tissues. This information is further supplemented with experimental method descriptions that summarize important sample preparation and analytical strategies. A new release of UniCarbKB is published every three months; each includes a collection of curated data and improvements to database functionality. In this Chapter, we outline the objectives of UniCarbKB, and describe a selection of step-by-step workflows for navigating the information available. We also provide a short description of web services available and future plans for improving data access. The information presented in this Chapter supplements content available in our knowledgebase including regular updates on interface improvements, new features, and revisions to the database content (http://confluence.unicarbkb.org).
Affiliation(s)
- Matthew P Campbell
- Department of Chemistry and Biomolecular Sciences, Research Drive, Building E8C, Macquarie University, North Ryde, Sydney, 2109, NSW, Australia
| | - Robyn A Peterson
- Department of Chemistry and Biomolecular Sciences, Research Drive, Building E8C, Macquarie University, North Ryde, Sydney, 2109, NSW, Australia
| | - Elisabeth Gasteiger
- Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Battelle - Building A7, Route de Drize, 1227 Carouge, Switzerland
| | - Julien Mariethoz
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Battelle - Building A7, Route de Drize, 1227 Carouge, Switzerland
| | - Frederique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Battelle - Building A7, Route de Drize, 1227 Carouge, Switzerland
- Computer Science Department, University of Geneva, Battelle - Building A7, Route de Drize, 1227 Carouge, Switzerland
| | - Nicolle H Packer
- Department of Chemistry and Biomolecular Sciences, Research Drive, Building E8C, Macquarie University, North Ryde, Sydney, 2109, NSW, Australia.
| |
|
17
|
Lisacek F, Mariethoz J, Alocci D, Rudd PM, Abrahams JL, Campbell MP, Packer NH, Ståhle J, Widmalm G, Mullen E, Adamczyk B, Rojas-Macias MA, Jin C, Karlsson NG. Databases and Associated Tools for Glycomics and Glycoproteomics. Methods Mol Biol 2017; 1503:235-264. [PMID: 27743371 DOI: 10.1007/978-1-4939-6493-2_18] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 12/03/2022]
Abstract
The access to biodatabases for glycomics and glycoproteomics has proven to be essential for current glycobiological research. This chapter presents available databases that are devoted to different aspects of glycobioinformatics. This includes oligosaccharide sequence databases, experimental databases, 3D structure databases (of both glycans and glycorelated proteins) and association of glycans with tissue, disease, and proteins. Specific search protocols are also provided using tools associated with experimental databases for converting primary glycoanalytical data to glycan structural information. In particular, researchers using glycoanalysis methods by U/HPLC (GlycoBase), MS (GlycoWorkbench, UniCarb-DB, GlycoDigest), and NMR (CASPER) will benefit from this chapter. In addition we also include information on how to utilize glycan structural information to query databases that associate glycans with proteins (UniCarbKB) and with interactions with pathogens (SugarBind).
Affiliation(s)
- Frederique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Julien Mariethoz
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Davide Alocci
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Pauline M Rudd
- NIBRT GlycoScience Group, NIBRT-The National Institute for Bioprocessing Research and Training, Fosters Avenue, Mount Merrion, Blackrock, Co. Dublin, Ireland
| | - Jodie L Abrahams
- Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW, Australia
| | - Matthew P Campbell
- Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW, Australia
| | - Nicolle H Packer
- Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW, Australia
| | - Jonas Ståhle
- Department of Organic Chemistry, Arrhenius Laboratory, Stockholm University, Stockholm, Sweden
| | - Göran Widmalm
- Department of Organic Chemistry, Arrhenius Laboratory, Stockholm University, Stockholm, Sweden
| | | | - Barbara Adamczyk
- NIBRT GlycoScience Group, NIBRT-The National Institute for Bioprocessing Research and Training, Fosters Avenue, Mount Merrion, Blackrock, Co. Dublin, Ireland
- Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Box 440, 405 30, Gothenburg, Sweden
| | - Miguel A Rojas-Macias
- Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Box 440, 405 30, Gothenburg, Sweden
| | - Chunsheng Jin
- Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Box 440, 405 30, Gothenburg, Sweden
| | - Niclas G Karlsson
- Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Box 440, 405 30, Gothenburg, Sweden.
| |
|
18
|
Affiliation(s)
- Alessandra Gastaldello
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 7 route de Drize, 1227 Geneva, Switzerland
- Computer Science Department CUI, University of Geneva, 1227 Geneva, Switzerland
| | - Davide Alocci
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 7 route de Drize, 1227 Geneva, Switzerland
- Computer Science Department CUI, University of Geneva, 1227 Geneva, Switzerland
| | - Jean-Luc Baeriswyl
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 7 route de Drize, 1227 Geneva, Switzerland
- Section of Biology, Faculty of Sciences, University of Geneva, 1211 Geneva, Switzerland
| | - Julien Mariethoz
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 7 route de Drize, 1227 Geneva, Switzerland
- Computer Science Department CUI, University of Geneva, 1227 Geneva, Switzerland
| | - Frederique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 7 route de Drize, 1227 Geneva, Switzerland
- Computer Science Department CUI, University of Geneva, 1227 Geneva, Switzerland
- Section of Biology, Faculty of Sciences, University of Geneva, 1211 Geneva, Switzerland
| |
|
19
|
Mariethoz J, Khatib K, Alocci D, Campbell MP, Karlsson NG, Packer NH, Mullen EH, Lisacek F. SugarBindDB, a resource of glycan-mediated host-pathogen interactions. Nucleic Acids Res 2016; 44:D1243-50. [PMID: 26578555 PMCID: PMC4702881 DOI: 10.1093/nar/gkv1247] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 08/27/2015] [Revised: 10/22/2015] [Accepted: 10/31/2015] [Indexed: 12/16/2022] Open
Abstract
The SugarBind Database (SugarBindDB) covers knowledge of glycan binding of human pathogen lectins and adhesins. It is a curated database; each glycan-protein binding pair is associated with at least one published reference. The core data element of SugarBindDB is a set of three inseparable components: the pathogenic agent, a lectin/adhesin and a glycan ligand. Each entity (agent, lectin or ligand) is described by a range of properties that are summarized in an entity-dedicated page. Several search, navigation and visualisation tools are implemented to investigate the functional role of glycans in pathogen binding. The database is cross-linked to protein and glycan-related resources such as UniProtKB and UniCarbKB. It is tightly bound to the latter via a substructure search tool that maps each ligand to full structures where it occurs. Thus, a glycan-lectin binding pair of SugarBindDB can lead to the identification of a glycan-mediated protein-protein interaction, that is, a lectin-glycoprotein interaction, via substructure search and the knowledge of site-specific glycosylation stored in UniCarbKB. SugarBindDB is accessible at: http://sugarbind.expasy.org.
Affiliation(s)
- Julien Mariethoz
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
| | | | - Davide Alocci
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
- Department of Computer Science, University of Geneva, Geneva, Switzerland
| | - Matthew P Campbell
- Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW, Australia
| | - Niclas G Karlsson
- University of Gothenburg, Sahlgrenska Academy, Institute of Biomedicine, Department of Medical Biochemistry and Cell Biology, Gothenburg, Sweden
| | - Nicolle H Packer
- Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW, Australia
| | | | - Frederique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
- Department of Computer Science, University of Geneva, Geneva, Switzerland
| |
|
20
|
Alocci D, Mariethoz J, Horlacher O, Bolleman JT, Campbell MP, Lisacek F. Property Graph vs RDF Triple Store: A Comparison on Glycan Substructure Search. PLoS One 2015; 10:e0144578. [PMID: 26656740 PMCID: PMC4684231 DOI: 10.1371/journal.pone.0144578] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 07/16/2015] [Accepted: 11/22/2015] [Indexed: 11/18/2022] Open
Abstract
Resource description framework (RDF) and Property Graph databases are emerging technologies that are used for storing graph-structured data. We compare these technologies through a molecular biology use case: glycan substructure search. Glycans are branched tree-like molecules composed of building blocks linked together by chemical bonds. The molecular structure of a glycan can be encoded into a directed acyclic graph where each node represents a building block and each edge serves as a chemical linkage between two building blocks. In this context, Graph databases are possible software solutions for storing glycan structures and Graph query languages, such as SPARQL and Cypher, can be used to perform a substructure search. Glycan substructure searching is an important feature for querying structure and experimental glycan databases and retrieving biologically meaningful data. This applies for example to identifying a region of the glycan recognised by a glycan binding protein (GBP). In this study, 19,404 glycan structures were selected from GlycomeDB (www.glycome-db.org) and modelled for storage in an RDF triple store and a Property Graph. We then performed two different sets of searches and compared the query response times and the results from both technologies to assess performance and accuracy. The two implementations produced the same results, but interestingly we noted a difference in the query response times. Qualitative measures such as portability were also used to define further criteria for choosing the technology adapted to solving glycan substructure search and other comparable issues.
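As an illustration of the encoding described in this abstract, the following is a minimal in-memory sketch of glycan substructure search in Python. It is not the SPARQL or Cypher implementation compared in the paper: the `Residue` class, the linkage label strings (e.g. `"b1-4"`) and the function names are all hypothetical, chosen only to show nodes as building blocks and edges as linkages.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Residue:
    """One building block (node); each child edge carries a linkage label."""
    name: str
    # (linkage label, child residue); the label format is an assumption.
    children: List[Tuple[str, "Residue"]] = field(default_factory=list)


def matches_at(pattern: Residue, node: Residue) -> bool:
    """True if `pattern` can be embedded in the glycan with its root on `node`."""
    if pattern.name != node.name:
        return False
    return _assign(pattern.children, node.children, set())


def _assign(p_children, n_children, used: Set[int]) -> bool:
    """Backtracking: map each pattern branch to a distinct branch of the node."""
    if not p_children:
        return True
    (p_link, p_child), rest = p_children[0], p_children[1:]
    for i, (n_link, n_child) in enumerate(n_children):
        if i in used or n_link != p_link:
            continue
        if matches_at(p_child, n_child) and _assign(rest, n_children, used | {i}):
            return True
    return False


def contains(glycan: Residue, pattern: Residue) -> bool:
    """True if `pattern` occurs anywhere in `glycan` (substructure search)."""
    if matches_at(pattern, glycan):
        return True
    return any(contains(child, pattern) for _, child in glycan.children)


# Illustrative data: a simplified N-glycan core, rooted at the reducing end.
core = Residue("GlcNAc", [("b1-4", Residue("GlcNAc", [("b1-4", Residue("Man", [
    ("a1-3", Residue("Man")),
    ("a1-6", Residue("Man")),
]))]))])

# Query: a mannose carrying both an a1-3 and an a1-6 mannose branch.
pattern = Residue("Man", [("a1-3", Residue("Man")), ("a1-6", Residue("Man"))])
absent = Residue("Man", [("a1-2", Residue("Man"))])

print(contains(core, pattern))  # True
print(contains(core, absent))   # False
```

A graph database would express the same pattern declaratively as a query and let the engine perform the matching; the sketch only makes the node and edge semantics concrete.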
Affiliation(s)
- Davide Alocci
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland
- Computer Science Department, University of Geneva, Geneva, 1227, Switzerland
| | - Julien Mariethoz
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland
| | - Oliver Horlacher
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland
- Computer Science Department, University of Geneva, Geneva, 1227, Switzerland
| | - Jerven T. Bolleman
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland
| | - Matthew P. Campbell
- Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, Australia
| | - Frederique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland
- Computer Science Department, University of Geneva, Geneva, 1227, Switzerland
| |
|
21
|
Horlacher O, Nikitin F, Alocci D, Mariethoz J, Müller M, Lisacek F. MzJava: An open source library for mass spectrometry data processing. J Proteomics 2015; 129:63-70. [PMID: 26141507 DOI: 10.1016/j.jprot.2015.06.013] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 05/02/2015] [Revised: 06/17/2015] [Accepted: 06/22/2015] [Indexed: 10/23/2022]
Abstract
Mass spectrometry (MS) is a widely used and evolving technique for the high-throughput identification of molecules in biological samples. The need for sharing and reuse of code among bioinformaticians working with MS data prompted the design and implementation of MzJava, an open-source Java Application Programming Interface (API) for MS related data processing. MzJava provides data structures and algorithms for representing and processing mass spectra and their associated biological molecules, such as metabolites, glycans and peptides. MzJava includes functionality to perform mass calculation, peak processing (e.g. centroiding, filtering, transforming), spectrum alignment and clustering, protein digestion, fragmentation of peptides and glycans as well as scoring functions for spectrum-spectrum and peptide/glycan-spectrum matches. For data import and export, MzJava implements readers and writers for commonly used data formats. For many classes, support for the Hadoop MapReduce (hadoop.apache.org) and Apache Spark (spark.apache.org) frameworks for cluster computing was implemented. The library has been developed applying best practices of software engineering. To ensure that MzJava contains code that is correct and easy to use, the library's API was carefully designed and thoroughly tested. MzJava is an open-source project distributed under the AGPL v3.0 licence. MzJava requires Java 1.7 or higher. Binaries, source code and documentation can be downloaded from http://mzjava.expasy.org and https://bitbucket.org/sib-pig/mzjava. This article is part of a Special Issue entitled: Computational Proteomics.
Affiliation(s)
- Oliver Horlacher
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva 1211, Switzerland; Centre Universitaire de Bioinformatique, University of Geneva, Geneva 1211, Switzerland
| | - Frederic Nikitin
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva 1211, Switzerland
| | - Davide Alocci
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva 1211, Switzerland; Centre Universitaire de Bioinformatique, University of Geneva, Geneva 1211, Switzerland
| | - Julien Mariethoz
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva 1211, Switzerland
| | - Markus Müller
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva 1211, Switzerland; Centre Universitaire de Bioinformatique, University of Geneva, Geneva 1211, Switzerland.
| | - Frederique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva 1211, Switzerland; Centre Universitaire de Bioinformatique, University of Geneva, Geneva 1211, Switzerland.
| |
|
22
|
Gotz L, Abrahams JL, Mariethoz J, Rudd PM, Karlsson NG, Packer NH, Campbell MP, Lisacek F. GlycoDigest: a tool for the targeted use of exoglycosidase digestions in glycan structure determination. Bioinformatics 2014; 30:3131-3. [PMID: 25015990 DOI: 10.1093/bioinformatics/btu425] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 11/14/2022] Open
Abstract
Sequencing oligosaccharides by exoglycosidases, either sequentially or in an array format, is a powerful tool to unambiguously determine the structure of complex N- and O-linked glycans. Here, we introduce GlycoDigest, a tool that simulates exoglycosidase digestion, based on controlled rules acquired from expert knowledge and experimental evidence available in GlycoBase. The tool allows the targeted design of glycosidase enzyme mixtures by allowing researchers to model the action of exoglycosidases, thereby validating and improving the efficiency and accuracy of glycan analysis. AVAILABILITY AND IMPLEMENTATION http://www.glycodigest.org.
Affiliation(s)
- Lou Gotz
- Proteome Informatics Group, Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland, Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW 2109, Australia, National Institute for Bioprocessing Research and Training, GlycoScience Group, Dublin, Ireland, Department of Medical Biochemistry and Cell Biology, University of Gothenburg, 40530 Gothenburg, Sweden and Section of Biology, University of Geneva, 1211 Geneva, Switzerland
| | - Jodie L Abrahams
- Proteome Informatics Group, Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland, Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW 2109, Australia, National Institute for Bioprocessing Research and Training, GlycoScience Group, Dublin, Ireland, Department of Medical Biochemistry and Cell Biology, University of Gothenburg, 40530 Gothenburg, Sweden and Section of Biology, University of Geneva, 1211 Geneva, Switzerland
| | - Julien Mariethoz
- Proteome Informatics Group, Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland, Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW 2109, Australia, National Institute for Bioprocessing Research and Training, GlycoScience Group, Dublin, Ireland, Department of Medical Biochemistry and Cell Biology, University of Gothenburg, 40530 Gothenburg, Sweden and Section of Biology, University of Geneva, 1211 Geneva, Switzerland
| | - Pauline M Rudd
- Proteome Informatics Group, Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland, Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW 2109, Australia, National Institute for Bioprocessing Research and Training, GlycoScience Group, Dublin, Ireland, Department of Medical Biochemistry and Cell Biology, University of Gothenburg, 40530 Gothenburg, Sweden and Section of Biology, University of Geneva, 1211 Geneva, Switzerland
| | - Niclas G Karlsson
- Proteome Informatics Group, Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland, Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW 2109, Australia, National Institute for Bioprocessing Research and Training, GlycoScience Group, Dublin, Ireland, Department of Medical Biochemistry and Cell Biology, University of Gothenburg, 40530 Gothenburg, Sweden and Section of Biology, University of Geneva, 1211 Geneva, Switzerland
| | - Nicolle H Packer
- Proteome Informatics Group, Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland, Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW 2109, Australia, National Institute for Bioprocessing Research and Training, GlycoScience Group, Dublin, Ireland, Department of Medical Biochemistry and Cell Biology, University of Gothenburg, 40530 Gothenburg, Sweden and Section of Biology, University of Geneva, 1211 Geneva, Switzerland
| | - Matthew P Campbell
- Proteome Informatics Group, Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland, Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW 2109, Australia, National Institute for Bioprocessing Research and Training, GlycoScience Group, Dublin, Ireland, Department of Medical Biochemistry and Cell Biology, University of Gothenburg, 40530 Gothenburg, Sweden and Section of Biology, University of Geneva, 1211 Geneva, Switzerland
| | - Frederique Lisacek
- Proteome Informatics Group, Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland, Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW 2109, Australia, National Institute for Bioprocessing Research and Training, GlycoScience Group, Dublin, Ireland, Department of Medical Biochemistry and Cell Biology, University of Gothenburg, 40530 Gothenburg, Sweden and Section of Biology, University of Geneva, 1211 Geneva, Switzerland
|
23
|
Campbell MP, Ranzinger R, Lütteke T, Mariethoz J, Hayes CA, Zhang J, Akune Y, Aoki-Kinoshita KF, Damerell D, Carta G, York WS, Haslam SM, Narimatsu H, Rudd PM, Karlsson NG, Packer NH, Lisacek F. Toolboxes for a standardised and systematic study of glycans. BMC Bioinformatics 2014; 15 Suppl 1:S9. [PMID: 24564482 PMCID: PMC4016020 DOI: 10.1186/1471-2105-15-s1-s9] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Indexed: 01/12/2023]
Abstract
Background Recent progress in method development for characterising the branched structures of complex carbohydrates has now enabled higher throughput technology. Automation of structure analysis then calls for software development since adding meaning to large data collections in reasonable time requires corresponding bioinformatics methods and tools. Current glycobioinformatics resources do cover information on the structure and function of glycans, their interaction with proteins or their enzymatic synthesis. However, this information is partial, scattered and often difficult to find for non-glycobiologists. Methods Following our diagnosis of the causes of the slow development of glycobioinformatics, we review the "objective" difficulties encountered in defining adequate formats for representing complex entities and developing efficient analysis software. Results Various solutions already implemented and strategies defined to bridge glycobiology with different fields and integrate the heterogeneous glyco-related information are presented. Conclusions Despite the initial stage of our integrative efforts, this paper highlights the rapid expansion of glycomics, the validity of existing resources and the bright future of glycobioinformatics.
|
24
|
Campbell MP, Peterson R, Mariethoz J, Gasteiger E, Akune Y, Aoki-Kinoshita KF, Lisacek F, Packer NH. UniCarbKB: building a knowledge platform for glycoproteomics. Nucleic Acids Res 2013; 42:D215-21. [PMID: 24234447 PMCID: PMC3964942 DOI: 10.1093/nar/gkt1128] [Citation(s) in RCA: 129] [Impact Index Per Article: 11.7] [Indexed: 11/16/2022]
Abstract
The UniCarb KnowledgeBase (UniCarbKB; http://unicarbkb.org) offers public access to a growing, curated database of information on the glycan structures of glycoproteins. UniCarbKB is an international effort that aims to further our understanding of structures, pathways and networks involved in glycosylation and glyco-mediated processes by integrating structural, experimental and functional glycoscience information. This initiative builds upon the success of the glycan structure database GlycoSuiteDB, together with the informatic standards introduced by EUROCarbDB, to provide a high-quality and updated resource to support glycomics and glycoproteomics research. UniCarbKB provides comprehensive information concerning glycan structures, and published glycoprotein information including global and site-specific attachment information. For the first release over 890 references, 3740 glycan structure entries and 400 glycoproteins have been curated. Further, 598 protein glycosylation sites have been annotated with experimentally confirmed glycan structures from the literature. Among these are 35 glycoproteins, 502 structures and 60 publications previously not included in GlycoSuiteDB. This article provides an update on the transformation of GlycoSuiteDB (featured in previous NAR Database issues and hosted by ExPASy since 2009) to UniCarbKB and its integration with UniProtKB and GlycoMod. Here, we introduce a refactored database, supported by substantial new curated data collections and intuitive user-interfaces that improve database searching.
Affiliation(s)
- Matthew P Campbell
- Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW 2109, Australia, Proteome Informatics Group, Swiss Institute of Bioinformatics, Geneva, Switzerland, Swiss-Prot Group, Swiss Institute of Bioinformatics, Geneva, Switzerland, Department of Bioinformatics, Faculty of Engineering, Soka University, 1-236 Tangi-machi, Hachioji, Tokyo, Japan and Section of Biology, Faculty of Sciences, University of Geneva, Switzerland
|