1
Kazantsev K, Toukach P. Remediation of the NMR data of natural glycans. Int J Biol Macromol 2024; 282:137042. [PMID: 39521218 DOI: 10.1016/j.ijbiomac.2024.137042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2024] [Revised: 09/05/2024] [Accepted: 10/27/2024] [Indexed: 11/16/2024]
Abstract
Primary structure elucidation in glycobiology is strongly affected by published structure-reporting NMR signals, especially on the 13C nucleus. The glycan NMR simulation accuracy and machine learning outcome depend on the quality of the NMR signal assignment in glycan databases. Within our work on improving the data quality in the Carbohydrate Structure Database (CSDB), we have applied a systematic search for inconsistencies in the published NMR data. The search was based on a bulk comparison between the experimental and simulated 13C NMR chemical shifts and manual analysis of the mismatches. On the basis of this analysis, CSDB was remediated by marking and correcting the NMR errors found in 272 structure elucidation reports published over the past 40 years.
Affiliation(s)
- Kirill Kazantsev
- Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, 119991 Moscow, Russia
- Philip Toukach
- Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, 119991 Moscow, Russia; National Research University Higher School of Economics, Faculty of Chemistry, Vavilova 7, 117312 Moscow, Russia.
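The bulk comparison described in the abstract of entry 1 (experimental versus simulated 13C chemical shifts, followed by manual review of the mismatches) can be illustrated with a minimal sketch. The `ShiftPair` type, the 2.0 ppm threshold, and the sample shifts are assumptions made for illustration; they are not taken from the paper or from CSDB.

```python
from dataclasses import dataclass


@dataclass
class ShiftPair:
    atom: str            # atom label, e.g. "C1" (hypothetical labelling)
    experimental: float  # published 13C chemical shift, ppm
    simulated: float     # simulated 13C chemical shift, ppm


def flag_mismatches(pairs, threshold_ppm=2.0):
    """Return (atom, deviation) for every pair whose absolute deviation
    exceeds the threshold. The threshold value is an assumption."""
    flagged = []
    for pair in pairs:
        deviation = pair.experimental - pair.simulated
        if abs(deviation) > threshold_ppm:
            flagged.append((pair.atom, round(deviation, 2)))
    return flagged


def rmsd(pairs):
    """Root-mean-square deviation over all assigned signals of one record."""
    squared = [(p.experimental - p.simulated) ** 2 for p in pairs]
    return (sum(squared) / len(squared)) ** 0.5


# Illustrative record: the C3 and C6 signals look swapped in the assignment.
record = [
    ShiftPair("C1", 103.5, 103.1),
    ShiftPair("C2", 74.0, 73.8),
    ShiftPair("C3", 61.9, 76.4),
    ShiftPair("C6", 76.6, 61.7),
]

for atom, deviation in flag_mismatches(record):
    print(f"{atom}: deviation {deviation:+.2f} ppm, queue for manual review")
```

A pair of large deviations with opposite signs, as in this made-up record, is the kind of pattern a curator might inspect for swapped assignments; the final decision stays manual, as the abstract states.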
2
Guan Y, Zhao S, Fu C, Zhang J, Yang F, Luo J, Dai L, Li X, Schlüter H, Wang J, Xu C. nQuant Enables Precise Quantitative N-Glycomics. Anal Chem 2024; 96:15531-15539. [PMID: 39302767 DOI: 10.1021/acs.analchem.4c01153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2024]
Abstract
N-glycosylation is a highly heterogeneous post-translational modification that modulates protein function. Defects in N-glycosylation are directly linked to various human diseases. Despite the importance of quantifying N-glycans with high precision, existing glycoinformatics tools are limited. Here, we developed nQuant, a glycoinformatics tool that enables label-free and isotopic labeling quantification of N-glycomics data obtained via LC-MS/MS, ensuring a low false quantitation rate. Using the label-free quantification module, we profiled the N-glycans released from purified glycoproteins and HEK293 cells as well as the dynamic changes of N-glycosylation during mouse corpus callosum development. Through the isotopic labeling quantification module, we revealed the dynamic changes of N-glycans in acute promyelocytic leukemia cells after all-trans retinoic acid treatment. Taken together, we demonstrate that nQuant enables fast and precise quantitative N-glycomics.
Affiliation(s)
- Yudong Guan
- Department of Critical Care Medicine, Guangdong Provincial Clinical Research Center for Geriatrics, Shenzhen Clinical Research Center for Geriatrics, Shenzhen People's Hospital, The First Affiliated Hospital, Southern University of Science and Technology, Shenzhen, Guangdong 518020, China
- Shanshan Zhao
- Section Mass Spectrometry and Proteomics, Center for Diagnostics, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- Chunjin Fu
- State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, Artemisinin Research Center, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Junzhe Zhang
- State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, Artemisinin Research Center, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Fan Yang
- Translational Neurodegeneration Section "Albrecht-Kossel", Department of Neurology, University Medical Center Rostock, Rostock 18147, Germany
- Jiankai Luo
- Translational Neurodegeneration Section "Albrecht-Kossel", Department of Neurology, University Medical Center Rostock, Rostock 18147, Germany
- Lingyun Dai
- Department of Critical Care Medicine, Guangdong Provincial Clinical Research Center for Geriatrics, Shenzhen Clinical Research Center for Geriatrics, Shenzhen People's Hospital, The First Affiliated Hospital, Southern University of Science and Technology, Shenzhen, Guangdong 518020, China
- Xihai Li
- College of Integrative Medicine, Laboratory of Pathophysiology, Key Laboratory of Integrative Medicine on Chronic Diseases, Fujian University of Traditional Chinese Medicine, Fuzhou, Fujian 350122, China
- Hartmut Schlüter
- Section Mass Spectrometry and Proteomics, Center for Diagnostics, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- Jigang Wang
- Department of Critical Care Medicine, Guangdong Provincial Clinical Research Center for Geriatrics, Shenzhen Clinical Research Center for Geriatrics, Shenzhen People's Hospital, The First Affiliated Hospital, Southern University of Science and Technology, Shenzhen, Guangdong 518020, China
- State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, Artemisinin Research Center, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
- State Key Laboratory of Antiviral Drugs, School of Pharmacy, Henan University, Kaifeng, Henan 475004, China
- Chengchao Xu
- Department of Critical Care Medicine, Guangdong Provincial Clinical Research Center for Geriatrics, Shenzhen Clinical Research Center for Geriatrics, Shenzhen People's Hospital, The First Affiliated Hospital, Southern University of Science and Technology, Shenzhen, Guangdong 518020, China
- State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, Artemisinin Research Center, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
- College of Integrative Medicine, Laboratory of Pathophysiology, Key Laboratory of Integrative Medicine on Chronic Diseases, Fujian University of Traditional Chinese Medicine, Fuzhou, Fujian 350122, China
3
Toukach PV. Supplementing the Carbohydrate Structure Database with glycoepitopes. Glycobiology 2023; 33:528-531. [PMID: 37306951 DOI: 10.1093/glycob/cwad043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 05/10/2023] [Accepted: 05/27/2023] [Indexed: 06/13/2023] Open
Abstract
Carbohydrate structures in the Carbohydrate Structure Database have been referenced to glycoepitopes from the Immune Epitope Database allowing users to explore the glycan structures and contained epitopes. Starting with an epitope, one can figure out the glycans from other organisms that share the same structural determinant, and retrieve the associated taxonomical, medical, and other data. This database mapping demonstrates the advantages of the integration of immunological and glycomic databases.
Affiliation(s)
- Philip V Toukach
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Laboratory of carbohydrate chemistry and biocides, Leninsky pr. 47, Moscow 119991, Russia
4
Toukach PV, Shirkovskaya AI. Carbohydrate Structure Database and Other Glycan Databases as an Important Element of Glycoinformatics. Russian Journal of Bioorganic Chemistry 2022. [DOI: 10.1134/s1068162022030190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
5
Toukach PV, Egorova KS. Source files of the Carbohydrate Structure Database: the way to sophisticated analysis of natural glycans. Sci Data 2022; 9:131. [PMID: 35354826 PMCID: PMC8968703 DOI: 10.1038/s41597-022-01186-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/21/2021] [Accepted: 02/03/2022] [Indexed: 11/18/2022] Open
Abstract
The Carbohydrate Structure Database (CSDB, http://csdb.glycoscience.ru/) is a free curated repository storing various data on glycans of bacterial, fungal and plant origins. Currently, it maintains close-to-full coverage of bacterial and fungal carbohydrates up to the year 2020. The CSDB web interface provides free access to the database content and dedicated tools. Still, the number of these tools and the types of the corresponding analyses are limited, whereas the database itself contains data that can be used in a broader scope of analytical studies. In this paper, we present CSDB source data files and a self-contained SQL dump, and exemplify their possible application in glycan-related studies. By using CSDB in an SQL format, the user can gain access to, for example, the chain length distribution or charge distribution in a given set of glycans defined according to specific structural, taxonomic, or other parameters, whereas the source text dump files can be imported into any dedicated database with a specific internal architecture differing from that of CSDB.
Affiliation(s)
- Philip V Toukach
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, Moscow, 119991, Russia.
- Ksenia S Egorova
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, Moscow, 119991, Russia.
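The abstract of entry 5 mentions querying an SQL copy of the database for a chain length distribution within a taxonomically defined set of glycans. The sketch below shows what such a query could look like using an in-memory SQLite database. The `glycans` table, its columns, and the sample rows are hypothetical and do not reflect the actual CSDB schema or data.

```python
import sqlite3

# Hypothetical, simplified schema; the real CSDB dump is organised differently.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE glycans ("
    " id INTEGER PRIMARY KEY,"
    " taxon_domain TEXT NOT NULL,"
    " residue_count INTEGER NOT NULL)"
)

# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO glycans (taxon_domain, residue_count) VALUES (?, ?)",
    [
        ("bacteria", 3),
        ("bacteria", 5),
        ("bacteria", 5),
        ("fungi", 2),
        ("bacteria", 4),
        ("fungi", 2),
    ],
)


def chain_length_distribution(connection, domain):
    """Map each chain length (residue count) to the number of glycans
    of that length within one taxonomic domain."""
    rows = connection.execute(
        "SELECT residue_count, COUNT(*) FROM glycans"
        " WHERE taxon_domain = ?"
        " GROUP BY residue_count"
        " ORDER BY residue_count",
        (domain,),
    ).fetchall()
    return dict(rows)


print(chain_length_distribution(conn, "bacteria"))
```

The same `GROUP BY` pattern would apply to a charge distribution if a charge column were available; the filter clause is where structural or taxonomic constraints would go.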
6
Scherbinina SI, Frank M, Toukach PV. Carbohydrate structure database (CSDB) oligosaccharide conformation tool. Glycobiology 2022; 32:460-468. [PMID: 35275211 DOI: 10.1093/glycob/cwac011] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/11/2021] [Revised: 02/17/2022] [Accepted: 03/04/2022] [Indexed: 11/13/2022] Open
Abstract
Population analysis in terms of glycosidic torsion angles is frequently used to reveal preferred conformers of glycans. However, due to high structural diversity and flexibility of carbohydrates, conformational characterization of complex glycans can be a challenging task. Herein we present a conformation module of oligosaccharide fragments occurring in natural glycan structures developed on the platform of the Carbohydrate Structure Database (CSDB). Currently, this module deposits free energy surface and conformer abundance maps plotted as a function of glycosidic torsions for 194 inter-residue bonds. Data are automatically and continuously derived from explicit-solvent molecular dynamics (MD) simulations. The module was also supplemented with high-temperature MD data of saccharides (2403 maps) provided by GlycoMapsDB (hosted by GLYCOSCIENCES.de project). Conformational data defined by up to four torsional degrees of freedom can be freely explored using a web interface of the module available at http://csdb.glycoscience.ru/database/core/search_conf.html.
Affiliation(s)
- S I Scherbinina
- Higher Chemical College, D. Mendeleev University of Chemical Technology of Russia, Miusskaya Square 9, 125047 Moscow, Russia
- M Frank
- Biognos AB, Box 8963, 40274 Göteborg, Sweden
- P V Toukach
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, 119991 Moscow, Russia
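The abstract of entry 6 refers to free energy surfaces and conformer abundance maps plotted over glycosidic torsions. One common way to relate the two is Boltzmann inversion of the bin populations, ΔG = -RT ln(p / p_max). Whether the CSDB module computes its maps exactly this way is an assumption here, and the torsion bins and counts below are invented for illustration.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def free_energy_from_counts(counts, temperature=300.0):
    """Convert conformer counts per (phi, psi) bin into relative free
    energies in kcal/mol by Boltzmann inversion.

    The most populated bin is the zero of the scale; bins that were
    never visited get an infinite free energy.
    """
    max_count = max(counts.values())
    rt = R_KCAL * temperature
    surface = {}
    for torsion_bin, count in counts.items():
        if count > 0:
            surface[torsion_bin] = -rt * math.log(count / max_count)
        else:
            surface[torsion_bin] = math.inf
    return surface


# Invented (phi, psi) bins in degrees and frame counts from a trajectory.
counts = {(-60, 120): 800, (60, 180): 400, (180, 0): 0}
surface = free_energy_from_counts(counts)

for torsion_bin, energy in surface.items():
    print(torsion_bin, energy)
```

With more than two torsional degrees of freedom the dictionary keys simply become longer tuples; the inversion itself is unchanged.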
7
Egorova KS, Smirnova NS, Toukach PV. CSDB_GT, a curated glycosyltransferase database with close-to-full coverage on three most studied nonanimal species. Glycobiology 2020; 31:524-529. [PMID: 33242091 DOI: 10.1093/glycob/cwaa107] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/03/2020] [Revised: 11/13/2020] [Accepted: 11/18/2020] [Indexed: 11/13/2022] Open
Abstract
We report the accomplishment of the first stage of the development of a novel manually curated database on glycosyltransferase (GT) activities, CSDB_GT. CSDB_GT (http://csdb.glycoscience.ru/gt.html) has been supplemented with GT activities from Saccharomyces cerevisiae. Now it provides the close-to-complete coverage on experimentally confirmed GTs from the three most studied model organisms from the three kingdoms: plantae (Arabidopsis thaliana, ca. 930 activities), bacteria (Escherichia coli, ca. 820 activities) and fungi (S. cerevisiae, ca. 270 activities).
Affiliation(s)
- Ksenia S Egorova
- Laboratory of Metal-Complex and Nano-Scale Catalysts, N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, Moscow 119991, Russia
- Nadezhda S Smirnova
- Kurnakov Institute of General and Inorganic Chemistry, Russian Academy of Sciences, Leninsky prospect 31, Moscow 119991, Russia
- Philip V Toukach
- Laboratory of Carbohydrate Chemistry, N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, Moscow 119991, Russia
8
Toukach PV, Egorova KS. New Features of Carbohydrate Structure Database Notation (CSDB Linear), As Compared to Other Carbohydrate Notations. J Chem Inf Model 2019; 60:1276-1289. [DOI: 10.1021/acs.jcim.9b00744] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/30/2022]
Affiliation(s)
- Philip V. Toukach
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, Moscow, Russia 119991
- National Research University Higher School of Economics, Myasnitskaya 20, Moscow, Russia 101000
- Ksenia S. Egorova
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, Moscow, Russia 119991
9
Gaudin T, Lu H, Fayet G, Berthauld-Drelich A, Rotureau P, Pourceau G, Wadouachi A, Van Hecke E, Nesterenko A, Pezron I. Impact of the chemical structure on amphiphilic properties of sugar-based surfactants: A literature overview. Adv Colloid Interface Sci 2019; 270:87-100. [PMID: 31200263 DOI: 10.1016/j.cis.2019.06.003] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 03/09/2019] [Revised: 05/20/2019] [Accepted: 06/03/2019] [Indexed: 01/26/2023]
Abstract
In this review, structure-property trends are systematically analyzed for four amphiphilic properties of sugar-based surfactants: critical micelle concentration (CMC), its associated surface tension (γCMC), efficiency (pC20) and Krafft temperature (TK). First, the impact on amphiphilic properties of the alkyl chain size and the presence of branching and/or unsaturation is investigated. Then, various polar head parameters are explored, such as the degree of polymerization of the sugar unit (mono- or oligosaccharides), the chemical nature of the linker and the sugar configuration. Some systematic comparisons between ethoxylated surfactants and sugar-based surfactants are also carried out. While some structural trends with the impact of alkyl chain length or the polar head size are now well understood, this analysis points out that systematic studies of more specific effects of alkyl chain (e.g. branching, unsaturation, presence of rings, position on the polar head) and polar head (e.g. linker, anomeric configuration, internal stereochemistry, cyclic vs. acyclic sugar residues) were scarcer or not available to date. This work encourages the use of these structural trends in the perspective of developing new bio-based surfactants and their consideration in predictive models. It also highlights the need of further experimental tests to fill remaining gaps notably to explore some specific structural features (such as the introduction of rings in the alkyl chain or the position of the alkyl chain on the polar head) and towards applicative properties (like foaming capacity or wettability).
10
Egorova KS, Toukach PV. Glykoinformatik: Brücken zwischen isolierten Inseln im Datenmeer. Angew Chem Int Ed Engl 2018. [DOI: 10.1002/ange.201803576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Ksenia S. Egorova
- Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky Prospect 47, Moscow 119991, Russia
- Philip V. Toukach
- Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky Prospect 47, Moscow 119991, Russia
11
Egorova KS, Toukach PV. Glycoinformatics: Bridging Isolated Islands in the Sea of Data. Angew Chem Int Ed Engl 2018; 57:14986-14990. [DOI: 10.1002/anie.201803576] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 04/25/2018] [Indexed: 11/07/2022]
Affiliation(s)
- Ksenia S. Egorova
- Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky Prospect 47, Moscow 119991, Russia
- Philip V. Toukach
- Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky Prospect 47, Moscow 119991, Russia
12
Hähnke VD, Kim S, Bolton EE. PubChem chemical structure standardization. J Cheminform 2018; 10:36. [PMID: 30097821 PMCID: PMC6086778 DOI: 10.1186/s13321-018-0293-8] [Citation(s) in RCA: 75] [Impact Index Per Article: 10.7] [Received: 04/18/2018] [Accepted: 08/01/2018] [Indexed: 11/15/2022] Open
Abstract
BACKGROUND: PubChem is a chemical information repository, consisting of three primary databases: Substance, Compound, and BioAssay. When individual data contributors submit chemical substance descriptions to Substance, the unique chemical structures are extracted and stored into Compound through an automated process called structure standardization. The present study describes the PubChem standardization approaches and analyzes them for their success rates, reasons that cause structures to be rejected, and modifications applied to structures during the standardization process. Furthermore, the PubChem standardization is compared to the structure normalization of the IUPAC International Chemical Identifier (InChI) software, as manifested by conversion of the InChI back into a chemical structure.
RESULTS: The observed rejection rate for substances processed by PubChem standardization was 0.36%, which is predominantly attributed to structures with invalid atom valences that cannot be readily corrected without additional information from contributors. Of all structures that pass standardization, 44% are modified in the process, reducing the count of unique structures from 53,574,724 in Substance to 45,808,881 in Compound as identified by de-aromatized canonical isomeric SMILES. Even though the processing time is very low on average (only 0.4% of structures have individual standardization time above 0.1 s), total standardization time is completely dominated by edge cases: 90% of the time to standardize all structures in PubChem Substance is spent on the 2.05% of structures with the highest individual standardization time. It is worth noting that 60% of the structures obtained from PubChem structure standardization are not identical to the chemical structure resulting from the InChI (primarily due to preferences for a different tautomeric form).
CONCLUSIONS: Standardization of chemical structures is complicated by the diversity of chemical information and its representation approaches. The PubChem standardization is an effective and efficient tool to account for molecular diversity and to eliminate invalid/incomplete structures. Further development will concentrate on improved tautomer consideration and an expanded stereocenter definition. Modifications are difficult to thoroughly validate, with slight changes often affecting many thousands of structures and various edge cases. The PubChem structure standardization service is accessible as a public resource (https://pubchem.ncbi.nlm.nih.gov/standardize), and via programmatic interfaces.
Affiliation(s)
- Volker D. Hähnke
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD 20894 USA
- Present Address: European Patent Office, Patentlaan 2, 2288 EE Rijswijk, The Netherlands
- Sunghwan Kim
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD 20894 USA
- Evan E. Bolton
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD 20894 USA
13
Harvey DJ. Analysis of carbohydrates and glycoconjugates by matrix-assisted laser desorption/ionization mass spectrometry: An update for 2011-2012. Mass Spectrometry Reviews 2017; 36:255-422. [PMID: 26270629 DOI: 10.1002/mas.21471] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 06/19/2014] [Accepted: 01/15/2015] [Indexed: 06/04/2023]
Abstract
This review is the seventh update of the original article published in 1999 on the application of MALDI mass spectrometry to the analysis of carbohydrates and glycoconjugates and brings coverage of the literature to the end of 2012. General aspects such as theory of the MALDI process, matrices, derivatization, MALDI imaging, and fragmentation are covered in the first part of the review and applications to various structural types constitute the remainder. The main groups of compounds are oligo- and polysaccharides, glycoproteins, glycolipids, glycosides, and biopharmaceuticals. Much of this material is presented in tabular form. Also discussed are medical and industrial applications of the technique, studies of enzyme reactions, and applications to chemical synthesis. © 2015 Wiley Periodicals, Inc. Mass Spec Rev 36:255-422, 2017.
Affiliation(s)
- David J Harvey
- Department of Biochemistry, Oxford Glycobiology Institute, University of Oxford, Oxford, OX1 3QU, UK
14
Egorova KS, Toukach PV. CSDB_GT: a new curated database on glycosyltransferases. Glycobiology 2016; 27:285-290. [PMID: 28011601 DOI: 10.1093/glycob/cww137] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 08/02/2016] [Revised: 10/11/2016] [Accepted: 11/07/2016] [Indexed: 01/09/2023] Open
Abstract
Glycosyltransferases (GTs) are carbohydrate-active enzymes (CAZy) involved in the synthesis of natural glycan structures. Applications of CAZy are in high demand in biotechnology and pharmaceutics. However, they are hindered by the lack of high-quality and comprehensive repositories of the research data accumulated so far. In this paper, we describe a new curated Carbohydrate Structure Glycosyltransferase Database (CSDB_GT). Currently, CSDB_GT provides ca. 780 activities exhibited by GTs, as well as several other CAZy, found in Arabidopsis thaliana and described in ca. 180 publications. It covers most published data on A. thaliana GTs with evidenced functions. CSDB_GT is linked to the Carbohydrate Structure Database (CSDB), which stores data on archaeal, bacterial, fungal and plant glycans. The CSDB_GT data are supported by experimental evidence and can be traced to original publications. CSDB_GT is freely available at http://csdb.glycoscience.ru/gt.html.
Affiliation(s)
- Ksenia S Egorova
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, Moscow, Russia
- Philip V Toukach
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, Moscow, Russia
15
|
|
16
Toukach PV, Egorova KS. Carbohydrate structure database merged from bacterial, archaeal, plant and fungal parts. Nucleic Acids Res 2015; 44:D1229-36. [PMID: 26286194 PMCID: PMC4702937 DOI: 10.1093/nar/gkv840] [Citation(s) in RCA: 156] [Impact Index Per Article: 15.6] [Received: 06/15/2015] [Accepted: 08/07/2015] [Indexed: 12/31/2022] Open
Abstract
The Carbohydrate Structure Databases (CSDBs, http://csdb.glycoscience.ru) store structural, bibliographic, taxonomic, NMR spectroscopic, and other data on natural carbohydrates and their derivatives published in the scientific literature. The CSDB project was launched in 2005 for bacterial saccharides (as BCSDB). Currently, it includes two parts, the Bacterial CSDB and the Plant&Fungal CSDB. In March 2015, these databases were merged into a single CSDB. The combined CSDB includes information on bacterial and archaeal glycans and derivatives (the coverage is close to complete), as well as on plant and fungal glycans and glycoconjugates (almost all structures published up to 1998). CSDB is regularly updated via manual expert annotation of original publications. Both newly annotated data and data imported from other databases are manually curated. The CSDB data are exportable in a number of modern formats, such as GlycoRDF. CSDB provides additional services for simulation of 1H, 13C and 2D NMR spectra of saccharides, NMR-based structure prediction, glycan-based taxon clustering and others.
Affiliation(s)
- Philip V Toukach
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Moscow 119991, Russia
- Ksenia S Egorova
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Moscow 119991, Russia
17
Toukach PV, Egorova KS. Bacterial, plant, and fungal carbohydrate structure databases: daily usage. Methods Mol Biol 2015; 1273:55-85. [PMID: 25753703 DOI: 10.1007/978-1-4939-2343-4_5] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 06/04/2023]
Abstract
Natural carbohydrates play important roles in living systems and therefore are used as diagnostic and therapeutic targets. The main goal of glycomics is systematization of carbohydrates and elucidation of their role in human health and disease. The amount of information on natural carbohydrates accumulates rapidly, but scientists still lack databases and computer-assisted tools needed for orientation in the glycomic information space. Therefore, freely available, regularly updated, and cross-linked databases are in demand. Bacterial Carbohydrate Structure Database (Bacterial CSDB) was developed for provision of structural, bibliographic, taxonomic, NMR spectroscopic, and other related information on bacterial and archaeal carbohydrate structures. Its main features are (1) coverage above 90%, (2) high data consistency (above 90% of error-free records), and (3) presence of manually verified bibliographic, NMR spectroscopic, and taxonomic annotations. Recently, CSDB has been expanded to cover carbohydrates of plant and fungal origin. The achievement of full coverage in the plant and fungal domains is expected in the future. CSDB is freely available on the Internet as a web service at http://csdb.glycoscience.ru. This chapter aims at showing how to use CSDB in your daily scientific practice.
Affiliation(s)
- Philip V Toukach
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky Prospekt 47, Moscow, 119991, Russia
18
Eavenson M, Kochut KJ, Miller JA, Ranzinger R, Tiemeyer M, Aoki K, York WS. Qrator: a web-based curation tool for glycan structures. Glycobiology 2014; 25:66-73. [PMID: 25165068 DOI: 10.1093/glycob/cwu090] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/21/2022] Open
Abstract
Most currently available glycan structure databases use their own proprietary structure representation schema and contain numerous annotation errors. These cause problems when glycan databases are used for the annotation or mining of data generated in the laboratory. Due to the complexity of glycan structures, curating these databases is often a tedious and labor-intensive process. However, rigorously validating glycan structures can be made easier with a curation workflow that incorporates a structure-matching algorithm that compares candidate glycans to a canonical tree that embodies structural features consistent with established mechanisms for the biosynthesis of a particular class of glycans. To this end, we have implemented Qrator, a web-based application that uses a combination of external literature and database references, user annotations and canonical trees to assist and guide researchers in making informed decisions while curating glycans. Using this application, we have started the curation of large numbers of N-glycans, O-glycans and glycosphingolipids. Our curation workflow allows creating and extending canonical trees for these classes of glycans, which have subsequently been used to improve the curation workflow.
Affiliation(s)
- René Ranzinger
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602-7404, USA
- Michael Tiemeyer
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602-7404, USA
- Kazuhiro Aoki
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602-7404, USA
- William S York
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602-7404, USA
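The abstract of entry 18 describes validating candidate glycans against a canonical tree that embodies the structural features allowed for a glycan class. The sketch below illustrates that idea in its simplest form: a candidate passes only if every residue and linkage it contains also appears at the same position in the canonical tree. The tuple-and-dict representation, the linkage strings, and the function name are assumptions for illustration and are not Qrator's actual data model or algorithm.

```python
# A tree node is (residue_name, {linkage: child_node}).
# Illustrative canonical tree holding only the N-glycan core:
# two GlcNAc residues, a trimannosyl branch point, optional core fucose.
N_GLYCAN_CORE = (
    "GlcNAc",
    {
        "a1-6": ("Fuc", {}),
        "b1-4": (
            "GlcNAc",
            {
                "b1-4": (
                    "Man",
                    {
                        "a1-3": ("Man", {}),
                        "a1-6": ("Man", {}),
                    },
                )
            },
        ),
    },
)


def matches_canonical(candidate, canonical):
    """Return True if the candidate is a sub-tree of the canonical tree,
    starting from the root (reducing-end) residue."""
    candidate_residue, candidate_children = candidate
    canonical_residue, canonical_children = canonical
    if candidate_residue != canonical_residue:
        return False
    for linkage, child in candidate_children.items():
        if linkage not in canonical_children:
            return False  # linkage not allowed at this position
        if not matches_canonical(child, canonical_children[linkage]):
            return False
    return True


good = ("GlcNAc", {"b1-4": ("GlcNAc", {"b1-4": ("Man", {"a1-3": ("Man", {})})})})
bad = ("GlcNAc", {"b1-4": ("GlcNAc", {"b1-4": ("Man", {"a1-2": ("Man", {})})})})

print(matches_canonical(good, N_GLYCAN_CORE))
print(matches_canonical(bad, N_GLYCAN_CORE))
```

A candidate that fails the check is not necessarily wrong; in a curation workflow of the kind the abstract describes, it would be a case for a curator to review, possibly leading to an extension of the canonical tree.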
19
Egorova KS, Toukach PV. Expansion of coverage of Carbohydrate Structure Database (CSDB). Carbohydr Res 2013; 389:112-4. [PMID: 24680503 DOI: 10.1016/j.carres.2013.10.009] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 08/28/2013] [Revised: 10/12/2013] [Accepted: 10/14/2013] [Indexed: 10/26/2022]
Abstract
The Bacterial Carbohydrate Structure Database (BCSDB), which has been maintained since 2005, was expanded to cover glycans from plants and fungi. The current coverage of plant and fungal glycans includes several thousand CarbBank records, as well as data published before 1996 but not deposited in CarbBank. Prior to deposition, the data were verified against the original publications and supplemented with additional information, such as NMR spectra. Both the Bacterial and Plant and Fungal Carbohydrate Structure Databases are freely available at http://csdb.glycoscience.ru.
Affiliation(s)
- K S Egorova
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, Moscow 119991, Russia.
- P V Toukach
- N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky prospect 47, Moscow 119991, Russia.
20
Aoki-Kinoshita KF. Using databases and web resources for glycomics research. Mol Cell Proteomics 2013; 12:1036-45. [PMID: 23325765 DOI: 10.1074/mcp.r112.026252] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Indexed: 01/05/2023] Open
Abstract
Many databases of carbohydrate structures and related information can be found on the World Wide Web. This review covers the major carbohydrate databases that have potential utility for glycoscientists and researchers entering the glycosciences. The first half provides a brief overview of carbohydrate databases and web resources (including a history of carbohydrate databases and carbohydrate notations used in these databases), and the second half provides a guide that can be used as an index to determine which resources provide the data of most interest to the user.
Affiliation(s)
- Kiyoko F Aoki-Kinoshita
- Department of Bioinformatics, Faculty of Engineering, Soka University, 1-236 Tangi-machi, Hachioji, Tokyo, Japan.