1
Fujita A, Aoki-Kinoshita KF. Development of a novel monosaccharide substitution matrix for improved comparison of glycan structures. Carbohydr Res 2022; 511:108496. [DOI: 10.1016/j.carres.2021.108496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2021] [Revised: 12/13/2021] [Accepted: 12/28/2021] [Indexed: 11/02/2022]
2
Akiyoshi S, Iwata M, Berenger F, Yamanishi Y. Omics-based Identification of Glycan Structures as Biomarkers for a Variety of Diseases. Mol Inform 2019; 39:e1900112. [PMID: 31622036 DOI: 10.1002/minf.201900112] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 08/20/2019] [Accepted: 09/24/2019] [Indexed: 12/11/2022]
Abstract
Glycans play important roles in cell communication, protein interaction, and immunity, and structural changes in glycans are associated with the regulation of a range of biological pathways involved in disease. However, our understanding of the detailed relationships between specific diseases and glycans is very limited. In this study, we proposed an omics-based method to investigate the correlations between glycans and a wide range of human diseases. We analyzed the gene expression patterns of glycogenes (glycosyltransferases and glycosidases) for 79 different diseases. A biological pathway-based glycogene signature was constructed to identify the alteration in glycan biosynthesis and the associated glycan structures for each disease state. The degradation of N-glycan and keratan sulfate, for example, may promote the growth or metastasis of multiple types of cancer, including endometrial, gastric, and nasopharyngeal. Our results also revealed that commonalities between diseases can be interpreted using glycogene expression patterns, as well as the associated glycan structure patterns at the level of the affected pathway. The proposed method is expected to be useful for understanding the relationships between glycans, glycogenes, and disease and identifying disease-specific glycan biomarkers.
Affiliation(s)
- Sayaka Akiyoshi
  - Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka, Fukuoka, 812-8582, Japan
- Michio Iwata
  - Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan
- Francois Berenger
  - Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan
- Yoshihiro Yamanishi
  - Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan
3
Hosoda M, Takahashi Y, Shiota M, Shinmachi D, Inomoto R, Higashimoto S, Aoki-Kinoshita KF. MCAW-DB: A glycan profile database capturing the ambiguity of glycan recognition patterns. Carbohydr Res 2018; 464:44-56. [PMID: 29859376 DOI: 10.1016/j.carres.2018.05.003] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 12/02/2017] [Revised: 05/08/2018] [Accepted: 05/08/2018] [Indexed: 01/17/2023]
Abstract
Glycan-binding protein (GBP) interaction experiments, such as glycan microarrays, are often used to understand glycan recognition patterns. However, the ambiguity of glycan array experimental data often makes it difficult to identify discrete GBP binding patterns. It is known that lectins, for example, are non-specific in their binding affinities; the same lectin can bind to different monosaccharides or even different glycan structures. In bioinformatics, several tools to mine the data generated from these sorts of experiments have been developed. These tools take a library of predefined motifs, which are commonly found glycan patterns such as sialyl-Lewis X, and attempt to identify the motif(s) that are specific to the GBP being analyzed. In our previous work, as opposed to using predefined motifs, we developed the Multiple Carbohydrate Alignment with Weights (MCAW) tool to visualize the state of the glycans being recognized by the GBP under analysis. We previously reported on the effectiveness of our tool and algorithm by analyzing several glycan array datasets from the Consortium of Functional Glycomics (CFG). In this work, we report on our analysis of 1081 data sets which we collected from the CFG, the results of which we have made publicly and freely available as a database called MCAW-DB. We introduce this database and its usage, and describe several analysis results. We show how MCAW-DB can be used to analyze glycan-binding patterns of GBPs amidst their ambiguity. For example, the visualization of glycan-binding patterns in MCAW-DB shows how they correlate with the concentrations of the samples used in the array experiments. Using MCAW-DB, the patterns of glycans found to bind to various GBPs are visualized, indicating the binding "environment" of the glycans. Thus, the ambiguity of glycan recognition is numerically represented, along with the patterns of monosaccharides surrounding the binding region.
The profiles in MCAW-DB could potentially be used as predictors of affinity of unknown or novel glycans to particular GBPs by comparing how well they match the existing profiles for those GBPs. Moreover, as the glycan profiles of diseased tissues become available, glycan alignments could also be used to identify glycan biomarkers unique to that tissue. Databases of these alignments may be of great use for drug discovery.
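As a rough illustration of how a weighted profile might be used to score a new glycan, the sketch below builds a per-position frequency table from glycans that are assumed to be already aligned, then scores a candidate against it. The position labels, the weights, the gap symbol `"-"`, and the function names `build_profile` and `match_score` are all hypothetical; the tree alignment step that MCAW performs is not reproduced here.

```python
from collections import defaultdict

def build_profile(aligned, weights):
    """Build a weighted frequency profile.

    aligned: list of dicts mapping an alignment position to a monosaccharide.
             A position missing from a dict is treated as a gap ("-").
    weights: one non-negative weight per glycan (e.g. a binding signal).
    """
    positions = sorted({pos for glycan in aligned for pos in glycan})
    total = sum(weights)
    profile = {}
    for pos in positions:
        freq = defaultdict(float)
        for glycan, w in zip(aligned, weights):
            freq[glycan.get(pos, "-")] += w / total
        profile[pos] = dict(freq)
    return profile

def match_score(profile, glycan):
    """Mean profile frequency of the residues (or gaps) seen in `glycan`."""
    hits = (profile[pos].get(glycan.get(pos, "-"), 0.0) for pos in profile)
    return sum(hits) / len(profile)

# Hypothetical, pre-aligned input: three glycans over positions root/a/b.
aligned = [
    {"root": "GlcNAc", "a": "Gal", "b": "Neu5Ac"},
    {"root": "GlcNAc", "a": "Gal"},
    {"root": "GlcNAc", "a": "Man"},
]
weights = [2.0, 1.0, 1.0]

profile = build_profile(aligned, weights)
score = match_score(profile, {"root": "GlcNAc", "a": "Gal"})
```

A higher score means the candidate agrees with the more heavily weighted residues at each position; how such a score relates to measured affinity is not something this sketch can establish.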
Affiliation(s)
- Masae Hosoda
  - Department of Bioinformatics, Graduate School of Engineering, Soka University, Tokyo, 192-8577, Japan
- Yushi Takahashi
  - Department of Bioinformatics, Graduate School of Engineering, Soka University, Tokyo, 192-8577, Japan
- Masaaki Shiota
  - Department of Science and Engineering for Sustainable Innovation, Faculty of Science and Engineering, Soka University, Tokyo, 192-8577, Japan
- Daisuke Shinmachi
  - Department of Science and Engineering for Sustainable Innovation, Faculty of Science and Engineering, Soka University, Tokyo, 192-8577, Japan
- Renji Inomoto
  - Department of Bioinformatics, Graduate School of Engineering, Soka University, Tokyo, 192-8577, Japan
- Shinichi Higashimoto
  - Department of Bioinformatics, Graduate School of Engineering, Soka University, Tokyo, 192-8577, Japan
- Kiyoko F Aoki-Kinoshita
  - Department of Bioinformatics, Graduate School of Engineering, Soka University, Tokyo, 192-8577, Japan; Department of Science and Engineering for Sustainable Innovation, Faculty of Science and Engineering, Soka University, Tokyo, 192-8577, Japan.
4
Hosoda M, Akune Y, Aoki-Kinoshita KF. Development and application of an algorithm to compute weighted multiple glycan alignments. Bioinformatics 2017; 33:1317-1323. [PMID: 28093404 PMCID: PMC5408794 DOI: 10.1093/bioinformatics/btw827] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 09/15/2016] [Revised: 12/22/2016] [Accepted: 01/10/2017] [Indexed: 11/13/2022]
Abstract
Motivation: A glycan consists of monosaccharides linked by glycosidic bonds, has branches and forms complex molecular structures. Databases have been developed to store large amounts of glycan-binding experiments, including glycan arrays with glycan-binding proteins. However, there are few bioinformatics techniques to analyze large amounts of data for glycans because there are few tools that can handle the complexity of glycan structures. Thus, we have developed the MCAW (Multiple Carbohydrate Alignment with Weights) tool that can align multiple glycan structures, to aid in the understanding of their function as binding recognition molecules. Results: We have described in detail the first algorithm to perform multiple glycan alignments by modeling glycans as trees. To test our tool, we prepared several data sets, and as a result, we found that the glycan motif could be successfully aligned without any prior knowledge applied to the tool, and the known recognition binding sites of glycans could be aligned at a high rate amongst all our datasets tested. We thus claim that our tool is able to find meaningful glycan recognition and binding patterns using data obtained by glycan-binding experiments. The development and availability of an effective multiple glycan alignment tool opens possibilities for many other glycoinformatics analyses, making this work a big step towards furthering glycomics analysis. Availability and Implementation: http://www.rings.t.soka.ac.jp. Contact: kkiyoko@soka.ac.jp. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Masae Hosoda
  - Department of Bioinformatics, Graduate School of Engineering, Soka University, Tokyo, Japan
- Yukie Akune
  - Department of Bioinformatics, Graduate School of Engineering, Soka University, Tokyo, Japan
5
Lee HS, Jo S, Mukherjee S, Park SJ, Skolnick J, Lee J, Im W. GS-align for glycan structure alignment and similarity measurement. Bioinformatics 2015; 31:2653-9. [PMID: 25857669 DOI: 10.1093/bioinformatics/btv202] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 01/08/2015] [Accepted: 04/03/2015] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Glycans play critical roles in many biological processes, and their structural diversity is key for specific protein-glycan recognition. Comparative structural studies of biological molecules provide useful insight into their biological relationships. However, most computational tools are designed for protein structure, and despite their importance, there is no currently available tool for comparing glycan structures in a sequence order- and size-independent manner. RESULTS: A novel method, GS-align, is developed for glycan structure alignment and similarity measurement. GS-align generates possible alignments between two glycan structures through iterative maximum clique search and fragment superposition. The optimal alignment is then determined by the maximum structural similarity score, GS-score, which is size-independent. Benchmark tests against the Protein Data Bank (PDB) N-linked glycan library and PDB homologous/non-homologous N-glycoprotein sets indicate that GS-align is a robust computational tool to align glycan structures and quantify their structural similarity. GS-align is also applied to template-based glycan structure prediction and monosaccharide substitution matrix generation to illustrate its utility. AVAILABILITY AND IMPLEMENTATION: http://www.glycanstructure.org/gsalign. CONTACT: wonpil@ku.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hui Sun Lee
  - Department of Molecular Biosciences and Center for Computational Biology, University of Kansas, Lawrence, KS 66047, USA
- Sunhwan Jo
  - Department of Molecular Biosciences and Center for Computational Biology, University of Kansas, Lawrence, KS 66047, USA
- Srayanta Mukherjee
  - Department of Biochemistry and Molecular Biology, University of Kansas Medical Center, Kansas City, KS 66160, USA
- Sang-Jun Park
  - School of Computational Sciences and Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul 130-722, Korea
- Jeffrey Skolnick
  - Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, GA 30076, USA
- Jooyoung Lee
  - School of Computational Sciences and Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul 130-722, Korea
- Wonpil Im
  - Department of Molecular Biosciences and Center for Computational Biology, University of Kansas, Lawrence, KS 66047, USA
6
Fujita A, Hosoda M, Tsuchiya S, Akune Y, Aoki-Kinoshita KF. Trends and Future Perspectives for Glycoinformatics. TRENDS GLYCOSCI GLYC 2014. [DOI: 10.4052/tigg.26.89] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/20/2022]
7
Abstract
Carbohydrate libraries printed in glycan microarray format have had a great impact on the high-throughput analysis of the specificity of a wide range of mammalian, plant, and bacterial lectins. Chemical and chemo-enzymatic synthesis allows the construction of diverse glycan libraries but requires substantial effort and resources. To leverage the synthetic effort, the ideal library would be a minimal subset of all structures that provides optimal diversity. Therefore, a measure of library diversity is needed. To this end, we developed a linear representation of glycans using standard chemoinformatic tools. This representation was applied to measure pairwise similarity and consequently the diversity of glycan libraries in a single value. The diversities of four existing sialoside glycan arrays were compared. More diverse arrays containing fewer glycans are proposed. This algorithm can be applied to diverse aspects of library design, from target structure selection to the choice of building blocks for their synthesis.
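One way to picture a single-value diversity measure built from pairwise similarity is sketched below. It assumes each glycan has already been reduced to a set of features and uses Tanimoto similarity with diversity defined as one minus the mean pairwise similarity. The feature strings, the function names, and this particular definition of diversity are illustrative assumptions, not the representation or formula used in the cited work.

```python
from itertools import combinations

def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def library_diversity(library: list) -> float:
    """Assumed definition: 1 minus the mean pairwise Tanimoto similarity."""
    pairs = list(combinations(library, 2))
    if not pairs:
        return 0.0
    mean_similarity = sum(tanimoto(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_similarity

# Hypothetical feature sets (disaccharide fragments) for three sialosides.
library = [
    frozenset({"Neu5Ac a2-3 Gal", "Gal b1-4 GlcNAc"}),
    frozenset({"Neu5Ac a2-6 Gal", "Gal b1-4 GlcNAc"}),
    frozenset({"Neu5Ac a2-3 Gal", "Gal b1-3 GalNAc"}),
]

diversity = library_diversity(library)
```

With a measure of this kind, two candidate arrays can be compared by a single number, and a glycan whose removal barely lowers the value is a candidate for omission.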
Affiliation(s)
- Christoph Rademacher
  - Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA.
8
Multiple Tree Alignment with Weights Applied to Carbohydrates to Extract Binding Recognition Patterns. PATTERN RECOGNITION IN BIOINFORMATICS 2012. [DOI: 10.1007/978-3-642-34123-6_5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 02/06/2023]
9
Akune Y, Hosoda M, Kaiya S, Shinmachi D, Aoki-Kinoshita KF. The RINGS Resource for Glycome Informatics Analysis and Data Mining on the Web. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2010; 14:475-86. [DOI: 10.1089/omi.2009.0129] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Indexed: 11/13/2022]
Affiliation(s)
- Yukie Akune
  - Department of Bioinformatics, Faculty of Engineering, Soka University, 1-236 Tangi-cho, Hachioji, Tokyo 192-8577, Japan
- Masae Hosoda
  - Department of Bioinformatics, Faculty of Engineering, Soka University, 1-236 Tangi-cho, Hachioji, Tokyo 192-8577, Japan
- Sakiko Kaiya
  - Department of Bioinformatics, Faculty of Engineering, Soka University, 1-236 Tangi-cho, Hachioji, Tokyo 192-8577, Japan
- Daisuke Shinmachi
  - Department of Bioinformatics, Faculty of Engineering, Soka University, 1-236 Tangi-cho, Hachioji, Tokyo 192-8577, Japan
- Kiyoko F. Aoki-Kinoshita
  - Department of Bioinformatics, Faculty of Engineering, Soka University, 1-236 Tangi-cho, Hachioji, Tokyo 192-8577, Japan
10
Frank M, Schloissnig S. Bioinformatics and molecular modeling in glycobiology. Cell Mol Life Sci 2010; 67:2749-72. [PMID: 20364395 PMCID: PMC2912727 DOI: 10.1007/s00018-010-0352-4] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.9] [Received: 12/23/2009] [Revised: 03/08/2010] [Accepted: 03/11/2010] [Indexed: 12/11/2022]
Abstract
The field of glycobiology is concerned with the study of the structure, properties, and biological functions of the family of biomolecules called carbohydrates. Bioinformatics for glycobiology is a particularly challenging field, because carbohydrates exhibit a high structural diversity and their chains are often branched. Significant improvements in experimental analytical methods over recent years have led to a tremendous increase in the amount of carbohydrate structure data generated. Consequently, the availability of databases and tools to store, retrieve and analyze these data in an efficient way is of fundamental importance to progress in glycobiology. In this review, the various graphical representations and sequence formats of carbohydrates are introduced, and an overview of newly developed databases, the latest developments in sequence alignment and data mining, and tools to support experimental glycan analysis are presented. Finally, the field of structural glycoinformatics and molecular modeling of carbohydrates, glycoproteins, and protein-carbohydrate interaction are reviewed.
Affiliation(s)
- Martin Frank
  - Molecular Structure Analysis Core Facility-W160, Deutsches Krebsforschungszentrum (German Cancer Research Centre), 69120 Heidelberg, Germany.
11
Li L, Ching WK, Yamaguchi T, Aoki-Kinoshita KF. A weighted q-gram method for glycan structure classification. BMC Bioinformatics 2010; 11 Suppl 1:S33. [PMID: 20122206 PMCID: PMC3009505 DOI: 10.1186/1471-2105-11-s1-s33] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/24/2022]
Abstract
Background: Glycobiology pertains to the study of carbohydrate sugar chains, or glycans, in a particular cell or organism. Many computational approaches have been proposed for analyzing these complex glycan structures, which are chains of monosaccharides. The monosaccharides are linked to one another by glycosidic bonds, which can take on a variety of conformations, thus forming branches and resulting in complex tree structures. The q-gram method is one of these recent methods used to understand glycan function based on the classification of their tree structures. This q-gram method assumes that for a certain q, different q-grams share no similarity among themselves. That is, if two structures have completely different components, then they are completely different. However, from a biological standpoint, this is not the case. In this paper, we propose a weighted q-gram method to measure the similarity among glycans by incorporating the similarity of the geometric structures, monosaccharides and glycosidic bonds among q-grams. In contrast to the traditional q-gram method, our weighted q-gram method admits similarity among q-grams for a certain q. Thus our new kernels for glycan structure were developed and then applied in SVMs to classify glycans. Results: Two glycan datasets were used to compare the weighted q-gram method and the original q-gram method. The results show that the incorporation of q-gram similarity improves the classification performance for all of the important glycan classes tested. Conclusion: The results in this paper indicate that similarity among q-grams obtained from geometric structure, monosaccharides and glycosidic linkage contributes to the glycan function classification. This is a big step towards the understanding of glycan function based on their complex structures.
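The contrast between exact and weighted q-gram matching can be sketched as follows. This minimal version treats a glycan as a rooted tree of monosaccharide labels, takes q-grams to be downward paths of q residues, and scores a pair of q-grams by the product of per-residue similarities. The similarity table `SIM`, the tree encoding, and the function names are assumptions made for illustration; the cited method also accounts for geometric structure and glycosidic bonds, which are omitted here.

```python
from collections import Counter
from math import prod

# A tree node is (label, [children]); leaves have an empty child list.

def qgrams(tree, q):
    """Count all downward paths containing exactly q residues."""
    grams = Counter()

    def paths_from(node, length):
        label, children = node
        if length == 1:
            return [(label,)]
        return [(label,) + p for c in children for p in paths_from(c, length - 1)]

    def visit(node):
        for path in paths_from(node, q):
            grams[path] += 1
        for child in node[1]:
            visit(child)

    visit(tree)
    return grams

# Hypothetical residue similarities; unlisted distinct pairs score 0.
SIM = {frozenset({"Gal", "Glc"}): 0.5}

def residue_sim(a, b, weighted):
    if a == b:
        return 1.0
    return SIM.get(frozenset({a, b}), 0.0) if weighted else 0.0

def qgram_kernel(t1, t2, q, weighted=False):
    """Sum over q-gram pairs of count product times per-position similarity."""
    g1, g2 = qgrams(t1, q), qgrams(t2, q)
    return sum(
        c1 * c2 * prod(residue_sim(a, b, weighted) for a, b in zip(p1, p2))
        for p1, c1 in g1.items()
        for p2, c2 in g2.items()
    )

t1 = ("Man", [("GlcNAc", [("Gal", [])])])
t2 = ("Man", [("GlcNAc", [("Glc", [])])])

exact = qgram_kernel(t1, t2, q=2)                    # only Man-GlcNAc matches
weighted = qgram_kernel(t1, t2, q=2, weighted=True)  # GlcNAc-Gal vs GlcNAc-Glc adds 0.5
```

In the exact variant the two terminal q-grams contribute nothing because they differ in one residue; the weighted variant gives them partial credit, which is the behaviour the abstract argues for.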
Affiliation(s)
- Limin Li
  - Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Pokfulam Road, Hong Kong.
12
13
Herget S, Toukach PV, Ranzinger R, Hull WE, Knirel YA, von der Lieth CW. Statistical analysis of the Bacterial Carbohydrate Structure Data Base (BCSDB): characteristics and diversity of bacterial carbohydrates in comparison with mammalian glycans. BMC STRUCTURAL BIOLOGY 2008; 8:35. [PMID: 18694500 PMCID: PMC2543016 DOI: 10.1186/1472-6807-8-35] [Citation(s) in RCA: 106] [Impact Index Per Article: 6.2] [Received: 02/26/2008] [Accepted: 08/11/2008] [Indexed: 11/24/2022]
Abstract
Background: There are considerable differences between bacterial and mammalian glycans. In contrast to most eukaryotic carbohydrates, bacterial glycans are often composed of repeating units with diverse functions ranging from structural reinforcement to adhesion, colonization and camouflage. Since bacterial glycans are typically displayed at the cell surface, they can interact with the environment and, therefore, have significant biomedical importance. Results: The sequence characteristics of glycans (monosaccharide composition, modifications, and linkage patterns) for the higher bacterial taxonomic classes have been examined and compared with the data for mammals, with both similarities and unique features becoming evident. Compared to mammalian glycans, the bacterial glycans deposited in the current databases have a more than ten-fold greater diversity at the monosaccharide level, and the disaccharide pattern space is approximately nine times larger. Specific bacterial subclasses exhibit characteristic glycans which can be distinguished on the basis of distinctive structural features or sequence properties. Conclusion: For the first time a systematic database analysis of the bacterial glycome has been performed. This study summarizes the current knowledge of bacterial glycan architecture and diversity and reveals putative targets for the rational design and development of therapeutic intervention strategies by comparing bacterial and mammalian glycans.
Affiliation(s)
- Stephan Herget
  - Core Facility: Molecular Structure Analysis (W160), German Cancer Research Center, Heidelberg, Germany.
14
15
Yusufi FNK, Park W, Lee MM, Lee DY. An alpha-numeric code for representing N-linked glycan structures in secreted glycoproteins. Bioprocess Biosyst Eng 2008; 32:97-107. [PMID: 18458952 DOI: 10.1007/s00449-008-0226-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 01/17/2008] [Accepted: 04/16/2008] [Indexed: 11/25/2022]
Abstract
Advances in high-throughput techniques have led to the creation of increasing amounts of glycome data. The storage and analysis of this data would benefit greatly from a compact notation for describing glycan structures that can be easily stored and interpreted by computers. Towards this end, we propose a fixed-length alpha-numeric code for representing N-linked glycan structures commonly found in secreted glycoproteins from mammalian cell cultures. This code, GlycoDigit, employs a pre-assigned alpha-numeric index to represent the monosaccharides attached in different branches to the core glycan structure. The present branch-centric representation allows us to visualize the structure while the numerical nature of the code makes it machine readable. In addition, a difference operator can be defined to quantitatively differentiate between glycan structures for further analysis. The usefulness and applicability of GlycoDigit were demonstrated by constructing and visualizing an N-linked glycosylation network.
Affiliation(s)
- Faraaz Noor Khan Yusufi
  - Bioprocessing Technology Institute, Biomedical Sciences Institutes, Agency for Science, Technology and Research (A*STAR), Singapore
16
Abstract
MOTIVATION: Glycans are covalent assemblies of sugar that play crucial roles in many cellular processes. Recently, comprehensive data about the structure and function of glycans have been accumulated, so the need for methods and algorithms to analyze these data is growing fast. RESULTS: This article presents novel methods for classifying glycans and detecting discriminative glycan motifs with support vector machines (SVM). We propose a new class of tree kernels to measure the similarity between glycans. These kernels are based on the comparison of tree substructures, and take into account several glycan features such as the sugar type, the sugar bound type or layer depth. The proposed methods are tested on their ability to classify human glycans into four blood components: leukemia cells, erythrocytes, plasma and serum. They are shown to outperform a previously published method. We also applied a feature selection approach to extract glycan motifs which are characteristic of each blood component. We confirmed that some leukemia-specific glycan motifs detected by our method corresponded to several results in the literature. AVAILABILITY: Software is available upon request. SUPPLEMENTARY INFORMATION: Datasets are available at the following website: http://web.kuicr.kyoto-u.ac.jp/supp/yoshi/glycankernel/
Affiliation(s)
- Yoshihiro Yamanishi
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan.
17
Li J, Wang W. Detailed assessment of homology detection using different substitution matrices. CHINESE SCIENCE BULLETIN-CHINESE 2006. [DOI: 10.1007/s11434-006-1538-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
18
von der Lieth CW, Lütteke T, Frank M. The role of informatics in glycobiology research with special emphasis on automatic interpretation of MS spectra. Biochim Biophys Acta Gen Subj 2005; 1760:568-77. [PMID: 16459020 DOI: 10.1016/j.bbagen.2005.12.004] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.2] [Received: 10/31/2005] [Revised: 12/01/2005] [Accepted: 12/01/2005] [Indexed: 12/17/2022]
Abstract
This paper reviews the current status of bioinformatics applications and databases in glycobiology, which are based on bioinformatics approaches as well as informatics for glycobiology where an explicit encoding of glycan structures is required. The availability of the complete sequence of the human genome has considerably accelerated the systematic identification of previously unidentified glycogenes in many areas of glycobiology using well-established bioinformatics tools. Although there has been an immense development of new glyco-related data collections as well as informatics tools, and several efforts have been started to cross-link and reference the various data deposited in distributed databases, informatics for glycobiology and glycomics is still poorly developed compared to the genomics and proteomics area. The development of algorithms for the automatic interpretation of MS spectra (currently a severe bottleneck, which hampers the rapid and reliable interpretation of MS data in high-throughput glycomics projects) is reviewed. A comprehensive list of web resources is given. Several lines of progression are discussed. There is an urgent need for the development of decentralised input facilities of experimentally determined glycan structures. Simultaneously, agreements of standards for the structural description of glycans as well as formats for the related data have to be established. The integration of glycomics with genomics/proteomics has to increase.
Affiliation(s)
- Claus-W von der Lieth
  - German Cancer Research Center, Spectroscopic Department (B090), Molecular Modelling, Heidelberg, Germany.
19
Lütteke T, Bohne-Lang A, Loss A, Goetz T, Frank M, von der Lieth CW. GLYCOSCIENCES.de: an Internet portal to support glycomics and glycobiology research. Glycobiology 2005; 16:71R-81R. [PMID: 16239495 DOI: 10.1093/glycob/cwj049] [Citation(s) in RCA: 176] [Impact Index Per Article: 8.8] [Indexed: 11/14/2022]
Abstract
The development of glycan-related databases and bioinformatics applications is lagging considerably behind the wealth of available data and software tools in genomics and proteomics. Because the encoding of glycan structures is more complex, most of the bioinformatics approaches cannot be applied to glycan structures. No standard procedures exist where glycan structures found in various species, organs, tissues or cells can be routinely deposited. In this article the concepts of the GLYCOSCIENCES.de portal are described. It is demonstrated how an efficient structure-based cross-linking of various glycan-related data originating from different resources can be accomplished using a single user interface. The structure-oriented retrieval options (exact structure, substructure, motif, composition and sugar components) are discussed. The types of available data (references, composition, spatial structures, nuclear magnetic resonance (NMR) shifts (experimental and estimated), theoretically calculated fragments and Protein Data Bank (PDB) entries) are exemplified for Man(3). The free availability and unrestricted use of glycan-related data is an absolute prerequisite to efficiently share distributed resources. Additionally, there is an urgent need to agree to a generally accepted exchange format as well as to a common software interface. An open access repository for glyco-related experimental data will ensure that the loss of primary data is considerably reduced.
Affiliation(s)
- Thomas Lütteke
  - Spectroscopic Department(B9090), German Cancer Research Center, Molecular Modelling, Im Neuenheimer Feld 280, D-69120 Heidelberg, Germany
20
Pérez S, Mulloy B. Prospects for glycoinformatics. Curr Opin Struct Biol 2005; 15:517-24. [PMID: 16143513 DOI: 10.1016/j.sbi.2005.08.005] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Received: 07/11/2005] [Revised: 07/11/2005] [Accepted: 08/24/2005] [Indexed: 10/25/2022]
Abstract
High-throughput and automated techniques (mass spectrometry in particular) allow increasingly rapid structural analysis of complex glycans. Information concerning the primary structure (composition, sequence and linkages), three-dimensional structure (including dynamics) and interactions of glycans is now available in sufficient quantity to justify the maintenance of databases and search facilities. Several such resources (both commercial and open access) are now available as web tools. To derive the full value of glycan databases, it will be necessary to develop a universally accepted machine-readable structural representation of glycans.
Affiliation(s)
- Serge Pérez
  - Centres de Recherches sur les Macromolécules Végétales, CNRS, BP 53, 38041 Grenoble, France.
21
Hashimoto K, Goto S, Kawano S, Aoki-Kinoshita KF, Ueda N, Hamajima M, Kawasaki T, Kanehisa M. KEGG as a glycome informatics resource. Glycobiology 2005; 16:63R-70R. [PMID: 16014746 DOI: 10.1093/glycob/cwj010] [Citation(s) in RCA: 179] [Impact Index Per Article: 9.0] [Indexed: 11/12/2022]
Abstract
Bioinformatics approaches to carbohydrate research have recently begun using large amounts of protein and carbohydrate data. In this field called glycome informatics, the foremost necessity is a comprehensive resource for genome-scale bioinformatics analysis of glycan data. Although the accumulation of experimental data may be useful as a reference of biological and biochemical information on carbohydrates, this is insufficient for bioinformatics analysis. Thus, we have developed a glycome informatics resource (http://www.genome.jp/kegg/glycan/) in KEGG (Kyoto Encyclopedia of Genes and Genomes), an integrated knowledge base of protein networks, genomic information, and chemical information. This review describes three noteworthy features: (1) GLYCAN, a database of carbohydrate structures; (2) glycan-related pathways; and (3) Composite Structure Map (CSM), a map illustrating all possible variations of carbohydrate structures within organisms. GLYCAN includes two useful tools: an intuitive drawing tool called KegDraw, and an efficient glycan search and alignment tool called KEGG Carbohydrate Matcher (KCaM). KEGG's glycan biosynthesis and metabolism pathways, integrating carbohydrate structures, proteins, and reactions, are also a pivotal resource. CSM is constructed as a bridge between carbohydrate functions and structures. CSM is able to display, for example, expression data of glycosyltransferases in a compact manner. In all the KEGG resources, various objects including KEGG pathways, chemical compounds, as well as carbohydrate structures are commonly represented as graphs, which are widely studied and utilized in the computer science field.
Affiliation(s)
- Kosuke Hashimoto
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan