1
Stevens H. Globalizing Genomics: The Origins of the International Nucleotide Sequence Database Collaboration. Journal of the History of Biology 2018; 51:657-691. [PMID: 28986915 DOI: 10.1007/s10739-017-9490-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 06/07/2023]
Abstract
Genomics is increasingly considered a global enterprise - the fact that biological information can flow rapidly around the planet is taken to be important to what genomics is and what it can achieve. However, the large-scale international circulation of nucleotide sequence information did not begin with the Human Genome Project. Efforts to formalize and institutionalize the circulation of sequence information emerged concurrently with the development of centralized facilities for collecting that information. That is, the very first databases built for collecting and sharing DNA sequence information were, from their outset, international collaborative enterprises. This paper describes the origins of the International Nucleotide Sequence Database Collaboration between GenBank in the United States, the European Molecular Biology Laboratory Databank, and the DNA Database of Japan. The technical and social groundwork for the international exchange of nucleotide sequences created the conditions of possibility for imagining nucleotide sequences (and subsequently genomes) as "global" objects. The "transnationalism" of nucleotide sequences was critical to their ontology - what DNA sequences came to be during the Human Genome Project was deeply influenced by international exchange.
Affiliation(s)
- Hallam Stevens
- School of Humanities and Social Sciences, Nanyang Technological University, 14 Nanyang Drive #05-07, Singapore, 637332, Singapore.
2
Kamal MS, Sarowar MG, Dey N, Ashour AS, Ripon SH, Panigrahi BK, Tavares JMRS. Self-organizing mapping based swarm intelligence for secondary and tertiary proteins classification. Int J Mach Learn Cyb 2017. [DOI: 10.1007/s13042-017-0710-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 01/13/2023]
3
Liu H, Zhu R, Lv J, He H, Yang L, Huang Z, Su J, Zhang Y, Yu S, Wu Q. DevMouse, the mouse developmental methylome database and analysis tools. Database: The Journal of Biological Databases and Curation 2014; 2014:bat084. [PMID: 24408217 PMCID: PMC3885893 DOI: 10.1093/database/bat084] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 12/15/2022]
Abstract
DNA methylation undergoes dynamic changes during mouse development and plays crucial roles in embryogenesis, cell-lineage determination and genomic imprinting. Bisulfite sequencing enables profiling of mouse developmental methylomes on an unprecedented scale; however, integrating and mining these data are challenges for experimental biologists. Therefore, we developed DevMouse, which focuses on the efficient storage of DNA methylomes in temporal order and quantitative analysis of methylation dynamics during mouse development. The latest release of DevMouse incorporates 32 normalized and temporally ordered methylomes across 15 developmental stages and related genome information. A flexible query engine is developed for acquisition of methylation profiles for genes, microRNAs, long non-coding RNAs and genomic intervals of interest across selected developmental stages. To facilitate in-depth mining of these profiles, DevMouse offers online analysis tools for the quantification of methylation variation, identification of differentially methylated genes, hierarchical clustering, gene function annotation and enrichment. Moreover, a configurable MethyBrowser is provided to view the base-resolution methylomes under a genomic context. In brief, DevMouse hosts comprehensive mouse developmental methylome data and provides online tools to explore the relationships of DNA methylation and development. Database URL: http://www.devmouse.org/
Affiliation(s)
- Hongbo Liu
- Department of Developmental Biology, School of Life Science and Technology, State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin 150001, China, Department of Computational Systems Biology, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China, Department of Food Science, School of Food Science and Engineering, Harbin Institute of Technology, Harbin 150001, China and Department of Respiratory Medicine, the First Affiliated Hospital of Harbin Medical University, Harbin 150001, China
4
Chen WH, Lu YW, Lai F, Chien YH, Hwu WL. Integrating human genome database into electronic health record with sequence alignment and compression mechanism. J Med Syst 2011; 36:2587-97. [PMID: 21559844 DOI: 10.1007/s10916-011-9731-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 03/12/2011] [Accepted: 05/02/2011] [Indexed: 11/24/2022]
Abstract
With the initial completion of the Human Genome Project, the post-genomic era is beginning. Although the human genome map has been decoded, the roles played by individual sequence segments are not yet fully understood. Meanwhile, with the rapid expansion of sequence information, data compilation and data storage are becoming increasingly important. In this paper, a "Human genome database system" is designed and implemented at National Taiwan University Hospital (NTUH). Through this system, doctors can store and manage experimental sequence data. The system's main contribution is that it integrates sequence alignment and data compression modules. By embedding the NCBI alignment program blastall [1], it automatically aligns uploaded sequences and searches for the corresponding genomic positions. In addition, the system encodes the differences between sequences and compresses them effectively, reducing the demand for storage space with a compression ratio of 12.28. It also offers a variety of query methods: users can quickly retrieve data of interest by entering keywords such as specimen number, GI and sequence position. The electronic health record (EHR) in the Health Information System (HIS) of NTUH is also integrated into this system, so doctors can use this information to investigate the relationship between diseases and genes. With this system, a personal genetic healthcare environment can be established in the future.
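The idea of storing only the differences between sequences can be illustrated with a minimal Python sketch. It assumes the sample has already been aligned to a reference of equal length and differs only by substitutions; the function names `encode_diff` and `decode_diff` are hypothetical, and the abstract does not describe the actual NTUH encoding, so this is not the authors' method.

```python
from typing import List, Tuple

# A diff is a list of (zero-based position, replacement base) pairs.
Diff = List[Tuple[int, str]]


def encode_diff(reference: str, sample: str) -> Diff:
    """Record only the positions where the sample differs from the reference.

    Assumption: both strings are pre-aligned and of equal length, so
    insertions and deletions are out of scope for this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sketch only handles equal-length, pre-aligned sequences")
    return [(i, b) for i, (a, b) in enumerate(zip(reference, sample)) if a != b]


def decode_diff(reference: str, diff: Diff) -> str:
    """Rebuild the sample by applying the stored substitutions to the reference."""
    bases = list(reference)
    for position, base in diff:
        bases[position] = base
    return "".join(bases)


reference = "ACGTACGTACGT"
sample = "ACGTTCGTACGA"
diff = encode_diff(reference, sample)
restored = decode_diff(reference, diff)
print(diff, restored == sample)
```

When a sample is nearly identical to the reference, the diff is much shorter than the sequence itself, which is the intuition behind difference-based compression.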
5
Affiliation(s)
- Dmitrij Frishman
- Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, 85350 Freising, Germany
6
Identification phénotypique et moléculaire des bactéries appartenant au genre Nocardia [Phenotypic and molecular identification of bacteria belonging to the genus Nocardia]. ACTA ACUST UNITED AC 2007. [DOI: 10.1016/s1773-035x(07)80129-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/22/2022]
7
Dhingra V, Gupta M, Andacht T, Fu ZF. New frontiers in proteomics research: A perspective. Int J Pharm 2005; 299:1-18. [PMID: 15979831 DOI: 10.1016/j.ijpharm.2005.04.010] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.3] [Received: 12/01/2004] [Revised: 03/01/2005] [Accepted: 04/04/2005] [Indexed: 12/12/2022]
Abstract
Substantial advances have been made in the fundamental understanding of human biology, ranging from DNA structure to identification of diseases associated with genetic abnormalities. Genome sequence information is becoming available in unprecedented amounts. The absence of a direct functional correlation between gene transcripts and their corresponding proteins, however, represents a significant roadblock for improving the efficiency of biological discoveries. The success of proteomics depends on the ability to identify and analyze protein products in a cell or tissue, and this is reliant on the application of several key technologies. Proteomics is in its exponential growth phase. Two-dimensional electrophoresis complemented with mass spectrometry provides a global view of the state of the proteins from the sample. Protein identification is a requirement to understand their functional diversity. Subtle differences in protein structure and function can contribute to the complexity and diversity of life. This review focuses on the progress and the applications of proteomics science with special reference to integration of the evolving technologies involved to address biological questions.
Affiliation(s)
- Vikas Dhingra
- Department of Pathology, University of Georgia, Athens, GA 30602, USA.
8
Burren OS, Healy BC, Lam AC, Schuilenburg H, Dolman GE, Everett VH, Laneri D, Nutland S, Rance HE, Payne F, Smyth D, Lowe C, Barratt BJ, Twells RCJ, Rainbow DB, Wicker LS, Todd JA, Walker NM, Smink LJ. Development of an integrated genome informatics, data management and workflow infrastructure: a toolbox for the study of complex disease genetics. Hum Genomics 2005; 1:98-109. [PMID: 15601538 PMCID: PMC3525068 DOI: 10.1186/1479-7364-1-2-98] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022] Open
Abstract
The genetic dissection of complex disease remains a significant challenge. Sample-tracking and the recording, processing and storage of high-throughput laboratory data with public domain data, require integration of databases, genome informatics and genetic analyses in an easily updated and scaleable format. To find genes involved in multifactorial diseases such as type 1 diabetes (T1D), chromosome regions are defined based on functional candidate gene content, linkage information from humans and animal model mapping information. For each region, genomic information is extracted from Ensembl, converted and loaded into ACeDB for manual gene annotation. Homology information is examined using ACeDB tools and the gene structure verified. Manually curated genes are extracted from ACeDB and read into the feature database, which holds relevant local genomic feature data and an audit trail of laboratory investigations. Public domain information, manually curated genes, polymorphisms, primers, linkage and association analyses, with links to our genotyping database, are shown in Gbrowse. This system scales to include genetic, statistical, quality control (QC) and biological data such as expression analyses of RNA or protein, all linked from a genomics integrative display. Our system is applicable to any genetic study of complex disease, of either large or small scale.
MESH Headings
- Animals
- Chromosome Mapping
- Chromosomes, Human
- Computational Biology
- Database Management Systems
- Databases, Factual
- Diabetes Mellitus, Type 1/genetics
- Disease Models, Animal
- Genetic Diseases, Inborn/genetics
- Genetic Linkage
- Genome
- Genome, Human
- Humans
- Informatics/methods
- Information Storage and Retrieval
- Information Systems
- Models, Biological
- Models, Genetic
- Polymorphism, Single Nucleotide
- Quality Control
- Sequence Analysis, DNA
Affiliation(s)
- Oliver S Burren, Barry C Healy, Alex C Lam, Helen Schuilenburg, Geoffrey E Dolman, Vincent H Everett, Davide Laneri, Sarah Nutland, Helen E Rance, Felicity Payne, Deborah Smyth, Chris Lowe, Bryan J Barratt, Rebecca CJ Twells, Daniel B Rainbow, Linda S Wicker, John A Todd, Neil M Walker, Luc J Smink
- Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/MRC Building, Addenbrooke's Hospital, Cambridge, CB2 2XY, UK
9
Lenffer J, Lai P, El Mejaber W, Khan AM, Koh JLY, Tan PTJ, Seah SH, Brusic V. CysView: protein classification based on cysteine pairing patterns. Nucleic Acids Res 2004; 32:W350-5. [PMID: 15215409 PMCID: PMC441613 DOI: 10.1093/nar/gkh475] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022] Open
Abstract
CysView is a web-based application tool that identifies and classifies proteins according to their disulfide connectivity patterns. It accepts a dataset of annotated protein sequences in various formats and returns a graphical representation of cysteine pairing patterns. CysView displays cysteine patterns for those records in the data with disulfide annotations. It allows the viewing of records grouped by connectivity patterns. CysView's utility as an analysis tool was demonstrated by the rapid and correct classification of scorpion toxin entries from GenPept on the basis of their disulfide pairing patterns. It has proved useful for rapid detection of irrelevant and partial records, or those with incomplete annotations. CysView can be used to support distant homology between proteins. CysView is publicly available at http://research.i2r.a-star.edu.sg/CysView/.
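Grouping records by disulfide connectivity pattern can be sketched in a few lines of Python. This is an illustrative assumption about how such a pattern might be normalised, not CysView's implementation: bonded cysteine positions are ranked in sequence order, so that proteins of different lengths with the same pairing topology share one label. The record names and positions are made up, and free (unbonded) cysteines are ignored here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Bond = Tuple[int, int]  # residue positions of the two bonded cysteines


def connectivity_pattern(bonds: Iterable[Bond]) -> str:
    """Convert bonded residue positions into a rank-based label such as '1-4,2-5,3-6'."""
    bonds = list(bonds)
    positions = sorted({p for bond in bonds for p in bond})
    rank = {p: i + 1 for i, p in enumerate(positions)}  # 1-based cysteine order
    pairs = sorted((min(rank[a], rank[b]), max(rank[a], rank[b])) for a, b in bonds)
    return ",".join(f"{a}-{b}" for a, b in pairs)


def group_by_pattern(records: Dict[str, List[Bond]]) -> Dict[str, List[str]]:
    """Group record names by their connectivity pattern."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, bonds in records.items():
        groups[connectivity_pattern(bonds)].append(name)
    return dict(groups)


# Hypothetical annotated records (names and positions are illustrative only).
records = {
    "toxin_A": [(12, 63), (16, 36), (22, 46), (26, 48)],
    "toxin_B": [(10, 60), (14, 34), (20, 44), (24, 46)],
    "toxin_C": [(3, 20), (8, 25), (12, 30)],
}
groups = group_by_pattern(records)
print(groups)
```

Records with missing or partial disulfide annotations would fall into their own small groups under this scheme, which is one way such entries could be spotted.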
Affiliation(s)
- Johann Lenffer
- Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613 Singapore
10
Horng JT, Lin FM, Lin JH, Huang HD, Liu BJ. Database of repetitive elements in complete genomes and data mining using transcription factor binding sites. IEEE Transactions on Information Technology in Biomedicine: A Publication of the IEEE Engineering in Medicine and Biology Society 2003; 7:93-100. [PMID: 12834164 DOI: 10.1109/titb.2003.811878] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 11/06/2022]
Abstract
Approximately 43% of the human genome is occupied by repetitive elements, and the proportion is even higher in rice, at around 51%. The analysis presented here indicates that repetitive elements in complete genomes may have been very important in evolutionary genomics. In this study, a database, called the Repeat Sequence Database, is first designed and implemented to store complete and comprehensive repetitive sequences. See http://rsdb.csie.ncu.edu.tw for more information. The database contains direct, inverted and palindromic repetitive sequences, and each repetitive sequence has a variable length ranging from seven to many hundreds of nucleotides. The repetitive sequences in the database are explored using a mathematical algorithm to mine rules on how combinations of individual binding sites are distributed among repetitive sequences in the database. Combinations of transcription factor binding sites in the repetitive sequences are obtained and then data mining techniques are applied to mine association rules from these combinations. The discovered associations are further pruned to remove insignificant associations and obtain a set of associations. The mined association rules facilitate efforts to identify gene classes regulated by similar mechanisms and accurately predict regulatory elements. Experiments are performed on several genomes including C. elegans, human chromosome 22, and yeast.
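As a rough illustration of mining association rules from binding-site combinations, the Python sketch below computes support and confidence for pairwise rules and prunes those under given thresholds. It is a generic support/confidence calculation, not the algorithm used in the paper; the function name `pair_rules`, the thresholds, and the transcription factor sets are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Set, Tuple

Rule = Tuple[str, str, float, float]  # (antecedent, consequent, support, confidence)


def pair_rules(transactions: Iterable[Set[str]],
               min_support: float,
               min_confidence: float) -> List[Rule]:
    """Mine rules of the form A -> B from sets of binding sites.

    Each transaction is the set of transcription factor binding sites
    found in one repetitive sequence.
    """
    transactions = list(transactions)
    n = len(transactions)
    item_counts: Counter = Counter()
    pair_counts: Counter = Counter()
    for sites in transactions:
        items = sorted(set(sites))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules: List[Rule] = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # prune infrequent combinations
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical binding-site sets, one per repetitive sequence.
repeats = [
    {"SP1", "AP1"},
    {"SP1", "AP1", "GATA1"},
    {"SP1"},
    {"AP1", "GATA1"},
]
rules = pair_rules(repeats, min_support=0.5, min_confidence=0.6)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Raising `min_support` or `min_confidence` prunes more rules, which mirrors the pruning step mentioned in the abstract in spirit only.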
Affiliation(s)
- Jorng-Tzong Horng
- Department of Computer Science and Information Engineering, National Central University, Jung-li City 320, Taiwan, ROC.
11
Lefranc MP. IMGT databases, web resources and tools for immunoglobulin and T cell receptor sequence analysis, http://imgt.cines.fr. Leukemia 2003; 17:260-6. [PMID: 12529691 DOI: 10.1038/sj.leu.2402637] [Citation(s) in RCA: 74] [Impact Index Per Article: 3.5] [Received: 04/24/2002] [Accepted: 05/06/2002] [Indexed: 11/08/2022]
Abstract
IMGT, the international ImMunoGeneTics database® (http://imgt.cines.fr), is a high-quality integrated information system specializing in immunoglobulins (IG), T cell receptors (TR) and major histocompatibility complex (MHC) of human and other vertebrates, created in 1989, by LIGM, at the Université Montpellier II, CNRS, Montpellier, France. IMGT provides a common access to standardized data which include nucleotide and protein sequences, oligonucleotide primers, gene maps, genetic polymorphisms, specificities, 2D and 3D structures. IMGT includes several databases (IMGT/LIGM-DB, IMGT/3Dstructure-DB, IMGT/HLA-DB), Web resources ('IMGT Marie-Paule page') and interactive tools (IMGT/V-QUEST, IMGT/JunctionAnalysis). IMGT expertly annotated data and tools described in this paper are particularly useful for the analysis of the IG and TR rearrangements in leukemia, lymphoma and myeloma, and in translocations involving the antigen receptor loci. IMGT is freely available at http://imgt.cines.fr.
Affiliation(s)
- M-P Lefranc
- Laboratoire d'ImmunoGénétique Moléculaire, LIGM, Université Montpellier II, UPR CNRS 1142, Institut de Génétique Humaine, IGH, Montpellier, France
12
Stoesser G, Baker W, van den Broek A, Garcia-Pastor M, Kanz C, Kulikova T, Leinonen R, Lin Q, Lombard V, Lopez R, Mancuso R, Nardone F, Stoehr P, Tuli MA, Tzouvara K, Vaughan R. The EMBL Nucleotide Sequence Database: major new developments. Nucleic Acids Res 2003; 31:17-22. [PMID: 12519939 PMCID: PMC165468 DOI: 10.1093/nar/gkg021] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.6] [Indexed: 11/13/2022] Open
Abstract
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) incorporates, organizes and distributes nucleotide sequences from all available public sources. The database is located and maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK. In an international collaboration with DDBJ (Japan) and GenBank (USA), data are exchanged amongst the collaborating databases on a daily basis to achieve optimal synchronization. Webin is the preferred web-based submission system for individual submitters, while automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via FTP, Email and World Wide Web interfaces. EBI's Sequence Retrieval System (SRS) integrates and links the main nucleotide and protein databases plus many other specialized molecular biology databases. For sequence similarity searching, a variety of tools (e.g. Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT. All resources can be accessed via the EBI home page at http://www.ebi.ac.uk.
Affiliation(s)
- Guenter Stoesser
- EMBL Outstation, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
13
Lefranc MP. IMGT, the international ImMunoGeneTics database: a high-quality information system for comparative immunogenetics and immunology. Developmental and Comparative Immunology 2002; 26:697-705. [PMID: 12206833 DOI: 10.1016/s0145-305x(02)00026-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.8] [Indexed: 05/23/2023]
Abstract
IMGT, the international ImMunoGeneTics database (http://imgt.cines.fr), is a high quality integrated information system specializing in Immunoglobulins (IG), T cell Receptors (TR) and Major Histocompatibility Complex (MHC) of human and other vertebrates, created in 1989, by LIGM, at the Université Montpellier II, CNRS, Montpellier, France. IMGT provides a common access to standardized data, which include nucleotide and protein sequences, oligonucleotide primers, gene maps, genetic polymorphisms, specificities, 2D and 3D structures. IMGT includes several databases (IMGT/LIGM-DB, IMGT/HLA-DB, IMGT/3Dstructure-DB), Web resources ('IMGT Marie-Paule page') which comprise IMGT Scientific Chart, IMGT Repertoire, IMGT Bloc-notes, IMGT Education, IMGT Aide-mémoire and IMGT Index, and interactive tools (IMGT/V-QUEST, IMGT/JunctionAnalysis). These expertly annotated data on the genome, proteome, genetics and structure of the IG, TR and MHC are of high value for comparative genome evolution studies of the adaptive immune response.
Affiliation(s)
- Marie-Paule Lefranc
- Laboratoire d'ImmunoGénétique Moléculaire, LIGM, Université Montpellier II, UPR CNRS 1142, IGH, 141 rue de la Cardonille, 34396 Montpellier Cedex 5, France.
14
Shirai T, Matsui Y, Shionyu-Mitsuyama C, Yamane T, Kamiya H, Ishii C, Ogawa T, Muramoto K. Crystal structure of a conger eel galectin (congerin II) at 1.45 Å resolution: implication for the accelerated evolution of a new ligand-binding site following gene duplication. J Mol Biol 2002; 321:879-89. [PMID: 12206768 DOI: 10.1016/s0022-2836(02)00700-3] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.3] [Indexed: 11/23/2022]
Abstract
The crystal structure of congerin II, a galectin family lectin from conger eel, was determined at 1.45 Å resolution. The previously determined structure of its isoform, congerin I, had revealed a fold evolution via strand swap; however, the structure of congerin II described here resembles other prototype galectins. A comparison of the two congerin genes with that of several other galectins suggests accelerated evolution of both congerin genes following gene duplication. The presence of a Mes (2-[N-morpholino]ethanesulfonic acid) molecule near the carbohydrate-binding site in the crystal structure points to the possibility of an additional binding site in congerin II. The binding site consists of a group of residues that had been replaced following gene duplication, suggesting that the binding site was built under selective pressure. Congerin II may be a protein specialized for biological defense with an affinity for target carbohydrates on parasites' cell surface.
Affiliation(s)
- Tsuyoshi Shirai
- Department of Biotechnology and Biomaterial Chemistry, Graduate School of Engineering, Nagoya University, Chikusa-Ku, Japan.
15
16
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Rapp BA, Wheeler DL. GenBank. Nucleic Acids Res 2002; 30:17-20. [PMID: 11752243 PMCID: PMC99127 DOI: 10.1093/nar/30.1.17] [Citation(s) in RCA: 355] [Impact Index Per Article: 16.1] [Received: 09/20/2001] [Accepted: 10/10/2001] [Indexed: 11/14/2022] Open
Abstract
The GenBank sequence database incorporates publicly available DNA sequences of more than 105 000 different organisms, primarily through direct submission of sequence data from individual laboratories and large-scale sequencing projects. Most submissions are made using the BankIt (web) or Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Data exchange with the EMBL Data Library and the DNA Data Bank of Japan helps ensure comprehensive worldwide coverage. GenBank data is accessible through NCBI's integrated retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical literature via PubMed. Sequence similarity searching is provided by the BLAST family of programs. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. NCBI also offers a wide range of World Wide Web retrieval and analysis services based on GenBank data. The GenBank database and related resources are freely accessible via the NCBI home page at http://www.ncbi.nlm.nih.gov.
Affiliation(s)
- Dennis A Benson
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA.
17
Abstract
HOWDY is an integrated database system for accessing and analyzing human genomic information (http://www-alis.tokyo.jst.go.jp/HOWDY/). HOWDY stores information about relationships between genetic objects and the data extracted from a number of databases. HOWDY consists of an Internet accessible user interface that allows thorough searching of the human genomic databases using the gene symbols and their aliases. It also permits flexible editing of the sequence data. The database can be searched using simple words and the search can be restricted to a specific cytogenetic location. Linear maps displaying markers and genes on contig sequences are available, from which an object can be chosen. Any search starting point identifies all the information matching the query. HOWDY provides a convenient search environment of human genomic data for scientists unsure which database is most appropriate for their search.
Affiliation(s)
- Mika Hirakawa
- Bioinformatics Division, Advanced Databases Department, Japan Science and Technology Corporation (JST), 5-3 Yonban-cho, Chiyoda-ku, Tokyo 102-0081, Japan.
18
Srinivasan KN, Gopalakrishnakone P, Tan PT, Chew KC, Cheng B, Kini RM, Koh JL, Seah SH, Brusic V. SCORPION, a molecular database of scorpion toxins. Toxicon 2002; 40:23-31. [PMID: 11602275 DOI: 10.1016/s0041-0101(01)00182-9] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.5] [Indexed: 11/24/2022]
Abstract
Increasing interest in the studies of toxins and the requirements for better structural and functional annotations have created a need for improved data management in the field of toxins. The molecular database, SCORPION, contains more than 200 entries of fully referenced scorpion toxin data including primary sequences, three-dimensional structures, structural and functional annotations of scorpion toxins along with relevant literature references. SCORPION has a set of search tools that allow users to extract data and perform specific queries. These entries have been compiled from public databases and literature, cleaned of errors and enriched with additional structural and functional information. The grouping of scorpion toxins provides a basis for extending and clarifying the existing structural and functional classifications. The bioinformatics modules in SCORPION facilitate analyses aimed at classification of scorpion toxins and identification of sequence patterns associated with specific structural or functional properties of scorpion toxins. The SCORPION database is accessible via the Internet at sdmc.krdl.org.sg:8080/scorpion.
Affiliation(s)
- K N Srinivasan
- Venom and Toxin Research Programme, Faculty of Medicine, National University of Singapore, 4-Medical Drive, 117597, Singapore
|
19
|
Stoesser G, Baker W, van den Broek A, Camon E, Garcia-Pastor M, Kanz C, Kulikova T, Leinonen R, Lin Q, Lombard V, Lopez R, Redaschi N, Stoehr P, Tuli MA, Tzouvara K, Vaughan R. The EMBL Nucleotide Sequence Database. Nucleic Acids Res 2002; 30:21-6. [PMID: 11752244 PMCID: PMC99098 DOI: 10.1093/nar/30.1.21] [Citation(s) in RCA: 120] [Impact Index Per Article: 5.5] [Indexed: 11/12/2022] Open
Abstract
The EMBL Nucleotide Sequence Database (aka EMBL-Bank; http://www.ebi.ac.uk/embl/) incorporates, organises and distributes nucleotide sequences from all available public sources. EMBL-Bank is located and maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK. In an international collaboration with DDBJ (Japan) and GenBank (USA), data are exchanged amongst the collaborating databases on a daily basis. Major contributors to the EMBL database are individual scientists and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via FTP, email and World Wide Web interfaces. EBI's Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many other specialized databases. For sequence similarity searching, a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT. All resources can be accessed via the EBI home page at http://www.ebi.ac.uk.
Affiliation(s)
- Guenter Stoesser
- EMBL Outstation, The European Bioinformatics Institute (EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
20
|
Hirakawa M, Tanaka T, Hashimoto Y, Kuroda M, Takagi T, Nakamura Y. JSNP: a database of common gene variations in the Japanese population. Nucleic Acids Res 2002; 30:158-62. [PMID: 11752280 PMCID: PMC99126 DOI: 10.1093/nar/30.1.158] [Citation(s) in RCA: 203] [Impact Index Per Article: 9.2] [Indexed: 11/12/2022] Open
Abstract
JSNP is a repository of Japanese Single Nucleotide Polymorphism (SNP) data, begun in 2000 and developed through the Prime Minister's Millennium Project. The aim of this undertaking is to identify and collate up to 150 000 SNPs from the Japanese population, located in genes or in adjacent regions that might influence the coding sequence of the genes. The project has been carried out by a collaboration between the Human Genome Center (HGC) in the Institute of Medical Science (IMS) at the University of Tokyo and the Japan Science and Technology Corporation (JST). JSNP serves as both a storage site for the Japanese SNPs obtained from the ongoing project and as a facility for public dissemination to allow researchers access to high quality SNP data. A primary motivation of the project is the construction of a basic data set to identify relationships between polymorphisms and common diseases or the reaction to drugs. As such, emphasis has been placed on the identification of SNPs that lie in candidate regions which may affect phenotype but which would not necessarily directly cause disease. Unrestricted access to JSNP and any associated files is available at http://snp.ims.u-tokyo.ac.jp/.
Affiliation(s)
- Mika Hirakawa
- Bioinformatics Division, Japan Science and Technology Corporation (JST), 5-3 Yonban-cho, Chiyoda-ku, Tokyo 102-0081, Japan
|
21
|
Varotto C, Richly E, Salamini F, Leister D. GST-PRIME: a genome-wide primer design software for the generation of gene sequence tags. Nucleic Acids Res 2001; 29:4373-7. [PMID: 11691924 PMCID: PMC60177 DOI: 10.1093/nar/29.21.4373] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022] Open
Abstract
The availability of sequenced genomes has generated a need for experimental approaches that allow the simultaneous analysis of large, or even complete, sets of genes. To facilitate such analyses, we have developed GST-PRIME, a software package for retrieving and assembling gene sequences, even from complex genomes, using the NCBI public database, and then designing sets of primer pairs for use in gene amplification. Primers were designed by the program for the direct amplification of gene sequence tags (GSTs) from either genomic DNA or cDNA. Test runs of GST-PRIME on 2000 randomly selected Arabidopsis and Drosophila genes demonstrate that 93 and 88% of resulting GSTs, respectively, fulfilled imposed length criteria. GST-PRIME primer pairs were tested on a set of 1900 Arabidopsis genes coding for chloroplast-targeted proteins: 95% of the primer pairs used in PCRs with genomic DNA generated the correct amplicons. GST-PRIME can thus be reliably used for large-scale or specific amplification of intron-containing genes of multicellular eukaryotes.
Affiliation(s)
- C Varotto
- Zentrum zur Identifikation von Genfunktionen durch Insertionsmutagenese bei Arabidopsis thaliana (ZIGIA), Max-Planck-Institut für Züchtungsforschung, Carl-von-Linné Weg 10, 50829 Köln, Germany
|
22
|
Le Novère N, Changeux JP. The Ligand Gated Ion Channel database: an example of a sequence database in neuroscience. Philos Trans R Soc Lond B Biol Sci 2001; 356:1121-30. [PMID: 11545694 PMCID: PMC1088506 DOI: 10.1098/rstb.2001.0903] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022] Open
Abstract
Multiple comparisons of receptor sequences, or receptor subunit sequences, have proved to be an invaluable tool in modern pharmacological investigations. Although of outstanding importance, general sequence databases suffer from several imperfections due to their size and their non-specificity. Room therefore exists for expert-maintained databases of restricted focus, where knowledge of the research field helps to filter the huge amount of data generated. Accordingly, neuroscientists have designed databases covering several types of proteins, in particular receptors for neurotransmitters. Ligand-gated ion channels are oligomeric transmembrane proteins involved in the fast response to neurotransmitters. All these receptors are formed by the assembly of homologous subunits, and an unexpected wealth of genes coding for these subunits has been revealed during the last two decades. The Ligand Gated Ion Channel database (LGICdb) has been developed to handle this growing body of information. The database aims to provide only one entry for each gene, containing annotated nucleic acid and protein sequences.
Affiliation(s)
- N Le Novère
- Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK
|
23
|
Bertone P, Kluger Y, Lan N, Zheng D, Christendat D, Yee A, Edwards AM, Arrowsmith CH, Montelione GT, Gerstein M. SPINE: an integrated tracking database and data mining approach for identifying feasible targets in high-throughput structural proteomics. Nucleic Acids Res 2001; 29:2884-98. [PMID: 11433035 PMCID: PMC55760 DOI: 10.1093/nar/29.13.2884] [Citation(s) in RCA: 95] [Impact Index Per Article: 4.1] [Indexed: 11/14/2022] Open
Abstract
High-throughput structural proteomics is expected to generate considerable amounts of data on the progress of structure determination for many proteins. For each protein this includes information about cloning, expression, purification, biophysical characterization and structure determination via NMR spectroscopy or X-ray crystallography. It will be essential to develop specifications and ontologies for standardizing this information to make it amenable to retrospective analysis. To this end we created the SPINE database and analysis system for the Northeast Structural Genomics Consortium. SPINE, which is available at bioinfo.mbb.yale.edu/nesg or nesg.org, is specifically designed to enable distributed scientific collaboration via the Internet. It was designed not just as an information repository but as an active vehicle to standardize proteomics data in a form that would enable systematic data mining. The system features an intuitive user interface for interactive retrieval and modification of expression construct data, query forms designed to track global project progress and external links to many other resources. Currently the database contains experimental data on 985 constructs, of which 740 are drawn from Methanobacterium thermoautotrophicum, 123 from Saccharomyces cerevisiae, 93 from Caenorhabditis elegans and the remainder from other organisms. We developed a comprehensive set of data mining features for each protein, including several related to experimental progress (e.g. expression level, solubility and crystallization) and 42 based on the underlying protein sequence (e.g. amino acid composition, secondary structure and occurrence of low complexity regions). We demonstrate in detail the application of a particular machine learning approach, decision trees, to the tasks of predicting a protein's solubility and propensity to crystallize based on sequence features. 
We are able to extract a number of key rules from our trees, in particular that soluble proteins tend to have significantly more acidic residues and fewer hydrophobic stretches than insoluble ones. One of the characteristics of proteomics data sets, currently and in the foreseeable future, is their intermediate size (approximately 500-5000 data points). This creates a number of issues in relation to error estimation. Initially we estimate the overall error in our trees based on standard cross-validation. However, this leaves out a significant fraction of the data in model construction and does not give error estimates on individual rules. Therefore, we present alternative methods to estimate the error in particular rules.
Affiliation(s)
- P Bertone
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06520, USA
|
24
|
Abstract
The EMBL Outstation-European Bioinformatics Institute (EBI) is a center for research and services in bioinformatics. It serves researchers in molecular biology, genetics, medicine, and agriculture from academia, and the agricultural, biotechnology, chemical, and pharmaceutical industries. The Institute manages and makes available databases of biological data including nucleic acid, protein sequences, and macromolecular structures. It provides to this community bioinformatics services relevant to molecular biology free of charge over the Internet. Some of these databases and services are described in this review.
Affiliation(s)
- P Rodriguez-Tomé
- EMBL Outstation, Hinxton, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
|
25
|
Affiliation(s)
- L B.M. Ellis
- University of Minnesota, 55455, Minneapolis, MN, USA
|
26
|
Xie J, Huang J, Shi X, Liu C. Analysis of the characteristic sequence of intein and revision of its motifs. CHINESE SCIENCE BULLETIN-CHINESE 2001. [DOI: 10.1007/bf03187217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
27
|
Stoesser G, Baker W, van den Broek A, Camon E, Garcia-Pastor M, Kanz C, Kulikova T, Lombard V, Lopez R, Parkinson H, Redaschi N, Sterk P, Stoehr P, Tuli MA. The EMBL nucleotide sequence database. Nucleic Acids Res 2001; 29:17-21. [PMID: 11125039 PMCID: PMC29766 DOI: 10.1093/nar/29.1.17] [Citation(s) in RCA: 76] [Impact Index Per Article: 3.3] [Indexed: 11/13/2022] Open
Abstract
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via ftp, email and World Wide Web interfaces. EBI's Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many specialized databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT.
Affiliation(s)
- G Stoesser
- EMBL Outstation-The European Bioinformatics Institute (EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
|
28
|
Abstract
GOBASE (http://megasun.bch.umontreal.ca/gobase/) is a network-accessible biological database, which is unique in bringing together diverse biological data on organelles with taxonomically broad coverage, and in furnishing data that have been exhaustively verified and completed by experts. So far, we have focused on mitochondrial data: GOBASE contains all published nucleotide and protein sequences encoded by mitochondrial genomes, selected RNA secondary structures of mitochondria-encoded molecules, genetic maps of completely sequenced genomes, taxonomic information for all species whose sequences are present in the database and organismal descriptions of key protistan eukaryotes. All of these data have been integrated and organized in a formal database structure to allow sophisticated biological queries using terms that are inherent in biological concepts. Most importantly, data have been validated, completed, corrected and standardized, a prerequisite of meaningful analysis. In addition, where critical data are lacking, such as genetic maps and RNA secondary structures, they are generated by the GOBASE team and collaborators, and added to the database. The database is implemented in a relational database management system, but features an object-oriented view of the biological data through a Web/Genera-generated World Wide Web interface. Finally, we have developed software for database curation (i.e. data updates, validation and correction), which will be described in some detail in this paper.
Affiliation(s)
- N Shimko
- Program in Evolutionary Biology, Canadian Institute for Advanced Research, Département de Biochimie, Université de Montréal, 2900 Boulevard Edouard-Montpetit, Montréal, Québec, H3T 1J4, Canada
|
29
|
Maidak BL, Cole JR, Lilburn TG, Parker CT, Saxman PR, Farris RJ, Garrity GM, Olsen GJ, Schmidt TM, Tiedje JM. The RDP-II (Ribosomal Database Project). Nucleic Acids Res 2001; 29:173-4. [PMID: 11125082 PMCID: PMC29785 DOI: 10.1093/nar/29.1.173] [Citation(s) in RCA: 959] [Impact Index Per Article: 41.7] [Indexed: 11/13/2022] Open
Abstract
The Ribosomal Database Project (RDP-II), previously described by Maidak et al. [Nucleic Acids Res. (2000), 28, 173-174], continued during the past year to add new rRNA sequences to the aligned data and to improve the analysis commands. Release 8.0 (June 1, 2000) consisted of 16 277 aligned prokaryotic small subunit (SSU) rRNA sequences while the number of eukaryotic and mitochondrial SSU rRNA sequences in aligned form remained at 2055 and 1503, respectively. The number of prokaryotic SSU rRNA sequences more than doubled from the previous release 14 months earlier, and approximately 75% are longer than 899 bp. An RDP-II mirror site in Japan is now available (http://wdcm.nig.ac.jp/RDP/html/index.html). RDP-II provides aligned and annotated rRNA sequences, derived phylogenetic trees and taxonomic hierarchies, and analysis services through its WWW server (http://rdp.cme.msu.edu/). Analysis services include rRNA probe checking, approximate phylogenetic placement of user sequences, screening user sequences for possible chimeric rRNA sequences, automated alignment, production of similarity matrices and services to plan and analyze terminal restriction fragment polymorphism experiments. The RDP-II email address for questions and comments has been changed from curator@cme.msu.edu to rdpstaff@msu.edu.
Affiliation(s)
- B L Maidak
- Center for Microbial Ecology, 540 Plant and Soil Sciences Building, Michigan State University, East Lansing, MI 48824-1325, USA
|
30
|
Watanabe K, Nelson J, Harayama S, Kasai H. ICB database: the gyrB database for identification and classification of bacteria. Nucleic Acids Res 2001; 29:344-5. [PMID: 11125132 PMCID: PMC29849 DOI: 10.1093/nar/29.1.344] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.2] [Indexed: 11/14/2022] Open
Abstract
The Identification and Classification of Bacteria (ICB) database (http://www.mbio.co.jp/icb) contains currently available information about the DNA gyrase subunit B (gyrB) gene in bacteria. The database is designed to provide the scientific community with a reference point for using gyrB as an evolutionary and taxonomic marker. Nucleic and amino acid sequence data are currently available for over 850 strains, along with alignments at several different taxonomic levels and an exhaustive review of primer selection and background information.
Affiliation(s)
- K Watanabe
- Marine Biotechnology Institute, Kamaishi Laboratories 3-75-1 Heita, Kamaishi, Iwate 026-0001, Japan
|
31
|
Ringwald M, Eppig JT, Begley DA, Corradi JP, McCright IJ, Hayamizu TF, Hill DP, Kadin JA, Richardson JE. The Mouse Gene Expression Database (GXD). Nucleic Acids Res 2001; 29:98-101. [PMID: 11125060 PMCID: PMC29814 DOI: 10.1093/nar/29.1.98] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.2] [Indexed: 11/14/2022] Open
Abstract
The Gene Expression Database (GXD) is a community resource of gene expression information for the laboratory mouse. By combining the different types of expression data, GXD aims to provide increasingly complete information about the expression profiles of genes in different mouse strains and mutants, thus enabling valuable insights into the molecular networks that underlie normal development and disease. GXD is integrated with the Mouse Genome Database (MGD). Extensive interconnections with sequence databases and with databases from other species, and the development and use of shared controlled vocabularies extend GXD's utility for the analysis of gene expression information. GXD is accessible through the Mouse Genome Informatics web site at http://www.informatics.jax.org/ or directly at http://www.informatics.jax.org/menus/expression_menu.shtml.
Affiliation(s)
- M Ringwald
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA.
|
32
|
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000. [PMID: 10802651 DOI: 10.1038/75556.gene] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 05/10/2023]
Abstract
Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.
Affiliation(s)
- M Ashburner
- Department of Genetics, Stanford University School of Medicine, California, USA.
|
33
|
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000; 25:25-9. [PMID: 10802651 PMCID: PMC3037419 DOI: 10.1038/75556] [Citation(s) in RCA: 26247] [Impact Index Per Article: 1093.6] [Indexed: 11/10/2022]
Abstract
Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.
Affiliation(s)
- M Ashburner
- Department of Genetics, Stanford University School of Medicine, California, USA.
|
34
|
Abstract
Over recent years databases have become an extremely important resource for biomedical research. Immunology research is increasingly dependent on access to extensive biological databases to extract existing information, plan experiments, and analyse experimental results. This review describes 15 immunological databases that have appeared over the last 30 years. In addition, important issues regarding database design and the potential for misuse of information contained within these databases are discussed. Access pointers are provided for the major immunological databases and also for a number of other immunological resources accessible over the World Wide Web (WWW).
Affiliation(s)
- V Brusic
- BIC/KRDL Kent Ridge Digital Labs, 21 Heng Mui Keng Terrace, Singapore, Singapore.
|