1
Kodama Y, Mashima J, Kosuge T, Kaminuma E, Ogasawara O, Okubo K, Nakamura Y, Takagi T. DNA Data Bank of Japan: 30th anniversary. Nucleic Acids Res 2018; 46:D30-D35. [PMID: 29040613 PMCID: PMC5753283 DOI: 10.1093/nar/gkx926] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 09/15/2017] [Accepted: 10/02/2017] [Indexed: 11/17/2022]
Abstract
The DNA Data Bank of Japan (DDBJ) Center (http://www.ddbj.nig.ac.jp) has been providing public data services for 30 years since 1987. We are collecting nucleotide sequence data and associated biological information from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC), in collaboration with the US National Center for Biotechnology Information and the European Bioinformatics Institute. The DDBJ Center also services the Japanese Genotype-phenotype Archive (JGA) with the National Bioscience Database Center to collect genotype and phenotype data of human individuals. Here, we outline our database activities for INSDC and JGA over the past year, and introduce submission, retrieval and analysis services running on our supercomputer system and their recent developments. Furthermore, we highlight our responses to the amended Japanese rules for the protection of personal information and the launch of the DDBJ Group Cloud service for sharing pre-publication data among research groups.
Affiliation(s)
- Yuichi Kodama
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Jun Mashima
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Takehide Kosuge
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Eli Kaminuma
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Osamu Ogasawara
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Kousaku Okubo
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Yasukazu Nakamura
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Toshihisa Takagi
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan; National Bioscience Database Center, Japan Science and Technology Agency, Tokyo 102-8666, Japan
2
Stevens H. Globalizing Genomics: The Origins of the International Nucleotide Sequence Database Collaboration. Journal of the History of Biology 2018; 51:657-691. [PMID: 28986915 DOI: 10.1007/s10739-017-9490-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 06/07/2023]
Abstract
Genomics is increasingly considered a global enterprise: the fact that biological information can flow rapidly around the planet is taken to be important to what genomics is and what it can achieve. However, the large-scale international circulation of nucleotide sequence information did not begin with the Human Genome Project. Efforts to formalize and institutionalize the circulation of sequence information emerged concurrently with the development of centralized facilities for collecting that information. That is, the very first databases built for collecting and sharing DNA sequence information were, from their outset, international collaborative enterprises. This paper describes the origins of the International Nucleotide Sequence Database Collaboration between GenBank in the United States, the European Molecular Biology Laboratory Databank, and the DNA Data Bank of Japan. The technical and social groundwork for the international exchange of nucleotide sequences created the conditions of possibility for imagining nucleotide sequences (and subsequently genomes) as "global" objects. The "transnationalism" of nucleotide sequences was critical to their ontology: what DNA sequences came to be during the Human Genome Project was deeply influenced by international exchange.
Affiliation(s)
- Hallam Stevens
- School of Humanities and Social Sciences, Nanyang Technological University, 14 Nanyang Drive #05-07, Singapore, 637332, Singapore.
3
Mashima J, Kodama Y, Fujisawa T, Katayama T, Okuda Y, Kaminuma E, Ogasawara O, Okubo K, Nakamura Y, Takagi T. DNA Data Bank of Japan. Nucleic Acids Res 2016; 45:D25-D31. [PMID: 27924010 PMCID: PMC5210514 DOI: 10.1093/nar/gkw1001] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Received: 09/16/2016] [Revised: 10/13/2016] [Accepted: 10/15/2016] [Indexed: 12/27/2022]
Abstract
The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) has been providing public data services for thirty years (since 1987). We are collecting nucleotide sequence data from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC, http://www.insdc.org), in collaboration with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI). The DDBJ Center also services the Japanese Genotype-phenotype Archive (JGA) with the National Bioscience Database Center to collect human-subject data from Japanese researchers. Here, we report our database activities for INSDC and JGA over the past year, and introduce retrieval and analytical services running on our supercomputer system and their recent modifications. Furthermore, with the Database Center for Life Science, the DDBJ Center is improving semantic web technologies to integrate and share biological data, and to provide an RDF version of the sequence data.
Affiliation(s)
- Jun Mashima
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Yuichi Kodama
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Takatomo Fujisawa
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Yoshihiro Okuda
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Eli Kaminuma
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Osamu Ogasawara
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Kousaku Okubo
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Yasukazu Nakamura
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Toshihisa Takagi
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan; National Bioscience Database Center, Japan Science and Technology Agency, Tokyo 102-8666, Japan
4
Tanizawa Y, Fujisawa T, Kaminuma E, Nakamura Y, Arita M. DFAST and DAGA: web-based integrated genome annotation tools and resources. Bioscience of Microbiota, Food and Health 2016; 35:173-184. [PMID: 27867804 PMCID: PMC5107635 DOI: 10.12938/bmfh.16-003] [Citation(s) in RCA: 171] [Impact Index Per Article: 21.4] [Received: 02/15/2016] [Accepted: 06/27/2016] [Indexed: 12/15/2022]
Abstract
Quality assurance and correct taxonomic affiliation of data submitted to public sequence databases have been an everlasting problem. The DDBJ Fast Annotation and Submission Tool (DFAST) is a newly developed genome annotation pipeline with quality and taxonomy assessment tools. To enable annotation of ready-to-submit quality, we also constructed curated reference protein databases tailored for lactic acid bacteria. DFAST was developed so that all the procedures required for DDBJ submission could be done seamlessly online. The online workspace would be especially useful for users not familiar with bioinformatics skills. In addition, we have developed a genome repository, DFAST Archive of Genome Annotation (DAGA), which currently includes 1,421 genomes covering 179 species and 18 subspecies of two genera, Lactobacillus and Pediococcus, obtained from both DDBJ/ENA/GenBank and Sequence Read Archive (SRA). All the genomes deposited in DAGA were annotated consistently and assessed using DFAST. To assess the taxonomic position based on genomic sequence information, we used the average nucleotide identity (ANI), which showed high discriminative power to determine whether two given genomes belong to the same species. We corrected mislabeled or misidentified genomes in the public database and deposited the curated information in DAGA. The repository will improve the accessibility and reusability of genome resources for lactic acid bacteria. By exploiting the data deposited in DAGA, we found intraspecific subgroups in Lactobacillus gasseri and Lactobacillus jensenii, whose variation between subgroups is larger than the well-accepted ANI threshold of 95% to differentiate species. DFAST and DAGA are freely accessible at https://dfast.nig.ac.jp.
Affiliation(s)
- Yasuhiro Tanizawa
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan; Center for Information Biology, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8540, Japan
- Takatomo Fujisawa
- Center for Information Biology, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8540, Japan
- Eli Kaminuma
- Center for Information Biology, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8540, Japan
- Yasukazu Nakamura
- Center for Information Biology, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8540, Japan
- Masanori Arita
- Center for Information Biology, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8540, Japan; RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi, Yokohama, Kanagawa 230-0045, Japan
5
Mashima J, Kodama Y, Kosuge T, Fujisawa T, Katayama T, Nagasaki H, Okuda Y, Kaminuma E, Ogasawara O, Okubo K, Nakamura Y, Takagi T. DNA data bank of Japan (DDBJ) progress report. Nucleic Acids Res 2015; 44:D51-7. [PMID: 26578571 PMCID: PMC4702806 DOI: 10.1093/nar/gkv1105] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Received: 09/18/2015] [Accepted: 10/09/2015] [Indexed: 01/07/2023]
Abstract
The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. The contents of the DDBJ databases are shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). Since 2013, the DDBJ Center has been operating the Japanese Genotype-phenotype Archive (JGA) in collaboration with the National Bioscience Database Center (NBDC) in Japan. In addition, the DDBJ Center develops semantic web technologies for data integration and sharing in collaboration with the Database Center for Life Science (DBCLS) in Japan. This paper briefly reports on the activities of the DDBJ Center over the past year including submissions to databases and improvements in our services for data retrieval, analysis, and integration.
Affiliation(s)
- Jun Mashima
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Yuichi Kodama
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Takehide Kosuge
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Takatomo Fujisawa
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Hideki Nagasaki
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Yoshihiro Okuda
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Eli Kaminuma
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Osamu Ogasawara
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Kousaku Okubo
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Yasukazu Nakamura
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Toshihisa Takagi
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan; National Bioscience Database Center, Japan Science and Technology Agency, Tokyo 102-8666, Japan
6
Tateno Y, Imanishi T, Miyazaki S, Fukami-Kobayashi K, Saitou N, Sugawara H, Gojobori T. DNA Data Bank of Japan (DDBJ) for genome scale research in life science. Nucleic Acids Res 2002; 30:27-30. [PMID: 11752245 PMCID: PMC99140 DOI: 10.1093/nar/30.1.27] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.2] [Indexed: 11/13/2022]
Abstract
The DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) has made an effort to collect as much data as possible mainly from Japanese researchers. The increase rates of the data we collected, annotated and released to the public in the past year are 43% for the number of entries and 52% for the number of bases. The increase rates are accelerated even after the human genome was sequenced, because sequencing technology has been remarkably advanced and simplified, and research in life science has been shifted from the gene scale to the genome scale. In addition, we have developed the Genome Information Broker (GIB, http://gib.genes.nig.ac.jp) that now includes more than 50 complete microbial genome and Arabidopsis genome data. We have also developed a database of the human genome, the Human Genomics Studio (HGS, http://studio.nig.ac.jp). HGS provides one with a set of sequences being as continuous as possible in any one of the 24 chromosomes. Both GIB and HGS have been updated incorporating newly available data and retrieval tools.
Affiliation(s)
- Y Tateno
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Yata, Mishima 411-8540, Japan
7
Shafer RW, Jung DR, Betts BJ. Human immunodeficiency virus type 1 reverse transcriptase and protease mutation search engine for queries. Nat Med 2000; 6:1290-2. [PMID: 11062545 PMCID: PMC2582445 DOI: 10.1038/81407] [Citation(s) in RCA: 65] [Impact Index Per Article: 2.7] [Indexed: 11/09/2022]
Affiliation(s)
- R W Shafer
- Division of Infectious Diseases, School of Medicine, Stanford University, Stanford, California 94305, USA.
8
Ruiz M, Giudicelli V, Ginestoux C, Stoehr P, Robinson J, Bodmer J, Marsh SG, Bontrop R, Lemaitre M, Lefranc G, Chaume D, Lefranc MP. IMGT, the international ImMunoGeneTics database. Nucleic Acids Res 2000; 28:219-21. [PMID: 10592230 PMCID: PMC102442 DOI: 10.1093/nar/28.1.219] [Citation(s) in RCA: 104] [Impact Index Per Article: 4.3] [Received: 09/29/1999] [Revised: 10/02/1999] [Accepted: 10/13/1999] [Indexed: 11/15/2022]
Abstract
IMGT, the international ImMunoGeneTics database (http://imgt.cines.fr:8104), is a high-quality integrated database specialising in Immunoglobulins (Ig), T cell Receptors (TcR) and Major Histocompatibility Complex (MHC) molecules of all vertebrate species, created in 1989 by Marie-Paule Lefranc, Université Montpellier II, CNRS, Montpellier, France (lefranc@ligm.igh.cnrs.fr). At present, IMGT includes two databases: IMGT/LIGM-DB, a comprehensive database of Ig and TcR from human and other vertebrates, with translation for fully annotated sequences, and IMGT/HLA-DB, a database of the human MHC referred to as HLA (Human Leucocyte Antigens). The IMGT server provides a common access to expertized genomic, proteomic, structural and polymorphic data of Ig and TcR molecules of all vertebrates. By its high quality and its easy data distribution, IMGT has important implications in medical research (repertoire in autoimmune diseases, AIDS, leukemias, lymphomas), therapeutic approaches (antibody engineering), genome diversity and genome evolution studies. IMGT is freely available at http://imgt.cines.fr:8104. The IMGT Index is provided at the IMGT Marie-Paule page (http://imgt.cines.fr:8104/textes/IMGTindex.html).
Affiliation(s)
- M Ruiz
- Laboratoire d'ImmunoGénétique Moléculaire, LIGM, UPR CNRS 1142 IGH, 141 rue de la Cardonille, 34396 Montpellier Cedex 5, France
9
Abstract
The GenBank(R) sequence database incorporates publicly available DNA sequences of >55 000 different organisms, primarily through direct submission of sequence data from individual laboratories and large-scale sequencing projects. Most submissions are made using the BankIt (Web) or Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Data exchange with the EMBL Data Library and the DNA Data Bank of Japan helps ensure comprehensive worldwide coverage. GenBank data is accessible through NCBI's integrated retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping and protein structure information, plus the biomedical literature via PubMed. Sequence similarity searching is provided by the BLAST family of programs. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. NCBI also offers a wide range of WWW retrieval and analysis services based on GenBank data. The GenBank database and related resources are freely accessible via the NCBI home page at http://www.ncbi.nlm.nih.gov
Affiliation(s)
- D A Benson
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA.
10
Maidak BL, Cole JR, Lilburn TG, Parker CT, Saxman PR, Stredwick JM, Garrity GM, Li B, Olsen GJ, Pramanik S, Schmidt TM, Tiedje JM. The RDP (Ribosomal Database Project) continues. Nucleic Acids Res 2000; 28:173-4. [PMID: 10592216 PMCID: PMC102428 DOI: 10.1093/nar/28.1.173] [Citation(s) in RCA: 362] [Impact Index Per Article: 15.1] [Received: 10/04/1999] [Accepted: 10/06/1999] [Indexed: 11/14/2022]
Abstract
The Ribosomal Database Project (RDP-II), previously described by Maidak et al., continued during the past year to add new rRNA sequences to the aligned data and to improve the analysis commands. Release 7.1 (September 17, 1999) included more than 10 700 small subunit rRNA sequences. More than 850 type strain sequences were identified and added to the prokaryotic alignment, bringing the total number of type sequences to 3324 representing 2460 different species. Availability of an RDP-II mirror site in Japan is also near completion. RDP-II provides aligned and annotated rRNA sequences, derived phylogenetic trees and taxonomic hierarchies, and analysis services through its WWW server (http://rdp.cme.msu.edu/). Analysis services include rRNA probe checking, approximate phylogenetic placement of user sequences, screening user sequences for possible chimeric rRNA sequences, automated alignment, production of similarity matrices and services to plan and analyze terminal restriction fragment length polymorphism (T-RFLP) experiments.
Affiliation(s)
- B L Maidak
- Center for Microbial Ecology, 540 Plant and Soil Sciences Building, Michigan State University, East Lansing, MI 48824-1325, USA.
11
Blake JA, Eppig JT, Richardson JE, Davisson MT. The Mouse Genome Database (MGD): expanding genetic and genomic resources for the laboratory mouse. The Mouse Genome Database Group. Nucleic Acids Res 2000; 28:108-11. [PMID: 10592195 PMCID: PMC102449 DOI: 10.1093/nar/28.1.108] [Citation(s) in RCA: 83] [Impact Index Per Article: 3.5] [Received: 10/01/1999] [Accepted: 10/07/1999] [Indexed: 11/14/2022]
Abstract
The Mouse Genome Database (MGD) is a comprehensive public database of mouse genomic, genetic and phenotypic information (http://www.informatics.jax.org). This community database provides information about genes, serves as a mapping resource of the mouse genome, details mammalian orthologs, integrates experimental data, represents standardized mouse nomenclature for genes and alleles, incorporates links to other genomic resources such as sequence data, and includes a variety of additional information about the laboratory mouse. MGD scientists and annotators work cooperatively with the research community to provide an integrated, consensus view of the mouse genome while also providing experimental data including data conflicting with the consensus representation. Recent improvements focus on the representation of phenotypic information and the enhancement of gene and allele descriptions.
Affiliation(s)
- J A Blake
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA.
12
Tateno Y, Miyazaki S, Ota M, Sugawara H, Gojobori T. DNA data bank of Japan (DDBJ) in collaboration with mass sequencing teams. Nucleic Acids Res 2000; 28:24-6. [PMID: 10592172 PMCID: PMC102400 DOI: 10.1093/nar/28.1.24] [Citation(s) in RCA: 44] [Impact Index Per Article: 1.8] [Indexed: 11/13/2022]
Abstract
We at DDBJ (http://www.ddbj.nig.ac.jp) process and publicise the massive amounts of data submitted mainly by Japanese genome projects and sequencing teams. It is emphasised that the collaboration between data producing teams and the data bank is crucial in carrying out these processes smoothly. The amount of data submitted in 1999 is so large that it alone exceeds the total amount submitted in the preceding 10 years. To cope with this situation, we have developed tools not only for processing such massive amounts of data but also for efficiently retrieving data on demand.
Affiliation(s)
- Y Tateno
- Center for Information Biology, National Institute of Genetics, Yata, Mishima 411-8540, Japan.
13
Perrière G, Bessières P, Labedan B. EMGLib: the enhanced microbial genomes library (update 2000). Nucleic Acids Res 2000; 28:68-71. [PMID: 10592183 PMCID: PMC102414 DOI: 10.1093/nar/28.1.68] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Received: 09/29/1999] [Accepted: 10/04/1999] [Indexed: 11/13/2022]
Abstract
As the number of complete microbial genomes publicly available is still growing, the problem of annotation quality in these very large sequences remains unsolved. Indeed, the number of annotations associated with complete genomes is usually lower than those of the shorter entries encountered in the repository collections. Moreover, classical sequence database management systems have difficulties in handling entries of such size. In this context, the Enhanced Microbial Genomes Library (EMGLib) was developed to try to alleviate these problems. This library contains all the complete genomes from prokaryotes (bacteria and archaea) already sequenced and the yeast genome in GenBank format. The annotations are improved by the introduction of data on codon usage, gene orientation on the chromosome and gene families. It is possible to access EMGLib through two database systems set up on WWW servers: the PBIL server at http://pbil.univ-lyon1.fr/emglib.html and the MICADO server at http://locus.jouy.inra.fr/micado
Affiliation(s)
- G Perrière
- Laboratoire de Biométrie et Biologie Evolutive, Université Claude Bernard, Lyon 1, 43 boulevard du 11 Novembre 1918, 69622 Villeurbanne Cedex, France.
14
Harger C, Chen G, Farmer A, Huang W, Inman J, Kiphart D, Schilkey F, Skupski MP, Weller J. The genome sequence DataBase. Nucleic Acids Res 2000; 28:31-2. [PMID: 10592174 PMCID: PMC102463 DOI: 10.1093/nar/28.1.31] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 10/07/1999] [Revised: 10/13/1999] [Accepted: 10/13/1999] [Indexed: 11/13/2022]
Abstract
The Genome Sequence DataBase (GSDB) is a database of publicly available nucleotide sequences and their associated biological and bibliographic information. Several notable changes have occurred in the past year: GSDB stopped accepting data submissions from researchers; ownership of data submitted to GSDB was transferred to GenBank; sequence analysis capabilities were expanded to include Smith-Waterman and Frame Search; and Sequence Viewer became available to Mac users. The content of GSDB remains up-to-date because publicly available data is acquired from the International Nucleotide Sequence Database Collaboration databases (IC) on a nightly basis. This allows GSDB to continue providing researchers with the ability to analyze, query and retrieve nucleotide sequences in the database. GSDB and its related tools are freely accessible from the URL: http://www.ncgr.org
Affiliation(s)
- C Harger
- National Center for Genome Resources, 1800 Old Pecos Trail, Suite A, Santa Fe, NM 87505, USA.
15
Ringwald M, Eppig JT, Kadin JA, Richardson JE. GXD: a Gene Expression Database for the laboratory mouse: current status and recent enhancements. The Gene Expression Database group. Nucleic Acids Res 2000; 28:115-9. [PMID: 10592197 PMCID: PMC102464 DOI: 10.1093/nar/28.1.115] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.1] [Received: 10/08/1999] [Accepted: 10/13/1999] [Indexed: 11/14/2022]
Abstract
The Gene Expression Database (GXD) is a community resource of gene expression information for the laboratory mouse. The database is designed as an open-ended system that can integrate different types of expression data. New expression data are made available on a daily basis. Thus, GXD provides increasingly complete information about what transcripts and proteins are produced by what genes; where, when and in what amounts these gene products are expressed; and how their expression varies in different mouse strains and mutants. GXD is integrated with the Mouse Genome Database (MGD). Continuously refined interconnections with sequence databases and with databases from other species place the gene expression information in the larger biological and analytical context. GXD is accessible through the Mouse Genome Informatics Web site at http://www.informatics.jax.org/ or directly at http://www.informatics.jax.org/menus/expression_menu.shtml
Affiliation(s)
- M Ringwald
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA.
16
Abstract
Xylanases are classified into two families, numbered F/10 and G/11 according to the similarity of amino acid sequences of their catalytic domain (Henrissat, B., Bairoch, A., 1993. New families in the classification of glycosyl hydrolases based on amino acid sequence similarities. Biochem. J. 293, 781-788). Three-dimensional structure of the catalytic domain of the family F/10 xylanase was reported (White, A., Withers, S.G., Gilkes, N.R., Rose, D.R., 1994. Crystal structure of the catalytic domain of the beta-1,4-glycanase Cex from Cellulomonas fimi. Biochemistry 33, 12546-12552). The domain was decomposed into 22 modules by centripetal profiles (Go, M., Nosaka, M., 1987. Protein architecture and the origin of introns. Cold Spring Harbor Symp. Quant. Biol. 52, 915-924; Noguti, T., Sakakibara, H., Go, M., 1993. Localization of hydrogen-bonds within modules in barnase. Proteins 16, 357-363). A module is a contiguous polypeptide segment of amino acid residues having a compact conformation within a globular domain. Collected 31 intron sites of the family F/10 xylanase genes from fungus were found to be correlated to module boundaries with considerable statistical force (p values <0.001). The relationship between the intron locations and protein structures provides supporting evidence for the ancient origin of introns, because such a relationship cannot be expected by random insertion of introns into eukaryotic genes, but it rather suggests pre-existence of introns in the ancestral genes of prokaryotes and eukaryotes. A phylogenetic tree of the fungal and bacterial xylanase sequences made two clusters; one includes both the bacterial and fungal genes, but the other consists of only fungal genes. The mixed cluster of bacterial genes without introns and the fungal genes with introns further supports the ancient origin of introns. 
Comparison of the conserved base sequences of introns indicates that sliding of a splice site occurred in Aspergillus kawachii gene by one base from the ancestral position. Substrate-binding sites of xylanase are localized on eight modules, and introns are found at both termini of six out of these functional modules. This result suggests that introns might play a functional role in shuffling the exons encoding the substrate-binding modules.
Affiliation(s)
- Y Sato
- Division of Biological Science, Graduate School of Science, Nagoya University, Japan