1
Hu JC, Sherlock G, Siegele DA, Aleksander SA, Ball CA, Demeter J, Gouni S, Holland TA, Karp PD, Lewis JE, Liles NM, McIntosh BK, Mi H, Muruganujan A, Wymore F, Thomas PD, Altman T. PortEco: a resource for exploring bacterial biology through high-throughput data and analysis tools. Nucleic Acids Res 2013; 42:D677-84. [PMID: 24285306 PMCID: PMC3965092 DOI: 10.1093/nar/gkt1203] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 01/17/2023] Open
Abstract
PortEco (http://porteco.org) aims to collect, curate and provide data and analysis tools to support basic biological research in Escherichia coli (and eventually other bacterial systems). PortEco is implemented as a ‘virtual’ model organism database that provides a single unified interface to the user, while integrating information from a variety of sources. The main focus of PortEco is to enable broad use of the growing number of high-throughput experiments available for E. coli, and to leverage community annotation through the EcoliWiki and GONUTS systems. Currently, PortEco includes curated data from hundreds of genome-wide RNA expression studies, from high-throughput phenotyping of single-gene knockouts under hundreds of annotated conditions, from chromatin immunoprecipitation experiments for tens of different DNA-binding factors and from ribosome profiling experiments that yield insights into protein expression. Conditions have been annotated with a consistent vocabulary, and data have been consistently normalized to enable users to find, compare and interpret relevant experiments. PortEco includes tools for data analysis, including clustering, enrichment analysis and exploration via genome browsers. PortEco search and data analysis tools are extensively linked to the curated gene, metabolic pathway and regulation content at its sister site, EcoCyc.
Affiliation(s)
- James C Hu
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX 77843, USA, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Department of Biology, Texas A&M University, College Station, TX, 77843, USA, Artificial Intelligence Center, SRI International, Menlo Park, CA 94025, USA and Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90089, USA

2
Rasmussen JP, Feldman JL, Reddy SS, Priess JR. Cell interactions and patterned intercalations shape and link epithelial tubes in C. elegans. PLoS Genet 2013; 9:e1003772. [PMID: 24039608 PMCID: PMC3764189 DOI: 10.1371/journal.pgen.1003772] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 05/13/2013] [Accepted: 07/19/2013] [Indexed: 01/15/2023] Open
Abstract
Many animal organs are composed largely or entirely of polarized epithelial tubes, and the formation of complex organ systems, such as the digestive or vascular systems, requires that separate tubes link with a common polarity. The Caenorhabditis elegans digestive tract consists primarily of three interconnected tubes—the pharynx, valve, and intestine—and provides a simple model for understanding the cellular and molecular mechanisms used to form and connect epithelial tubes. Here, we use live imaging and 3D reconstructions of developing cells to examine tube formation. The three tubes develop from a pharynx/valve primordium and a separate intestine primordium. Cells in the pharynx/valve primordium polarize and become wedge-shaped, transforming the primordium into a cylindrical cyst centered on the future lumenal axis. For continuity of the digestive tract, valve cells must have the same, radial axis of apicobasal polarity as adjacent intestinal cells. We show that intestinal cells contribute to valve cell polarity by restricting the distribution of a polarizing cue, laminin. After developing apicobasal polarity, many pharyngeal and valve cells appear to explore their neighborhoods through lateral, actin-rich lamellipodia. For a subset of cells, these lamellipodia precede more extensive intercalations that create the valve. Formation of the valve tube begins when two valve cells become embedded at the left-right boundary of the intestinal primordium. Other valve cells organize symmetrically around these two cells, and wrap partially or completely around the orthogonal, lumenal axis, thus extruding a small valve tube from the larger cyst. We show that the transcription factors DIE-1 and EGL-43/EVI1 regulate cell intercalations and cell fates during valve formation, and that the Notch pathway is required to establish the proper boundary between the pharyngeal and valve tubes. 
Tubes composed of epithelial cells are universal building blocks of animal organs, and complex organs typically contain multiple interconnected tubes, such as in the digestive tract or vascular system. The nematode Caenorhabditis elegans provides a simple genetic system to study how tubes form and link. Understanding these events provides insight into basic biology, and can inform engineering strategies for building or repairing cellular tubes. A small tube called the valve connects the two major tubular organs of the nematode digestive tract, the pharynx and intestine. The pharynx and valve form from the same primordium, while the intestine forms from a separate primordium. Cells in each primordium polarize around a central axis, and valve formation involves connecting these axes. Using live imaging, we show that valve cells initially resemble other pharyngeal cells, but undergo additional and extensive intercalations around the lumenal axis, effectively squeezing a small tube from the larger primordium. Valve cells develop the same polarity axis as intestinal cells, and we show that this depends on interactions with the intestinal cells. We show that valve formation involves dynamic changes in the localization of adhesive proteins, and identify transcription factors that play a role in valve cell specification and intercalation.
Affiliation(s)
- Jeffrey P. Rasmussen
  - Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
  - Howard Hughes Medical Institute, Chevy Chase, Maryland, United States of America
  - Molecular and Cellular Biology Program, University of Washington, Seattle, Washington, United States of America
- Jessica L. Feldman
  - Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
  - Howard Hughes Medical Institute, Chevy Chase, Maryland, United States of America
- Sowmya Somashekar Reddy
  - Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
  - Howard Hughes Medical Institute, Chevy Chase, Maryland, United States of America
- James R. Priess
  - Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
  - Howard Hughes Medical Institute, Chevy Chase, Maryland, United States of America
  - Department of Biology, University of Washington, Seattle, Washington, United States of America

3
Snyder EE, Walts B, Pérusse L, Chagnon YC, Weisnagel SJ, Rankinen T, Bouchard C. The Human Obesity Gene Map: The 2003 Update. Obes Res 2012; 12:369-439. [PMID: 15044658 DOI: 10.1038/oby.2004.47] [Citation(s) in RCA: 207] [Impact Index Per Article: 17.3] [Indexed: 12/23/2022]
Abstract
This is the tenth update of the human obesity gene map, incorporating published results up to the end of October 2003 and continuing the previous format. Evidence from single-gene mutation obesity cases, Mendelian disorders exhibiting obesity as a clinical feature, quantitative trait loci (QTLs) from human genome-wide scans and animal crossbreeding experiments, and association and linkage studies with candidate genes and other markers is reviewed. Transgenic and knockout murine models relevant to obesity are also incorporated (N = 55). As of October 2003, 41 Mendelian syndromes relevant to human obesity have been mapped to a genomic region, and causal genes or strong candidates have been identified for most of these syndromes. QTLs reported from animal models currently number 183. There are 208 human QTLs for obesity phenotypes from genome-wide scans and candidate regions in targeted studies. A total of 35 genomic regions harbor QTLs replicated among two to five studies. Attempts to relate DNA sequence variation in specific genes to obesity phenotypes continue to grow, with 272 studies reporting positive associations with 90 candidate genes. Fifteen such candidate genes are supported by at least five positive studies. The obesity gene map shows putative loci on all chromosomes except Y. Overall, more than 430 genes, markers, and chromosomal regions have been associated or linked with human obesity phenotypes. The electronic version of the map with links to useful sites can be found at http://obesitygene.pbrc.edu.
Affiliation(s)
- Eric E Snyder
  - Human Genomics Laboratory, Pennington Biomedical Research Center, Louisiana State University, Baton Rouge, Louisiana 70808-4124, USA

4

5
Yook K, Harris TW, Bieri T, Cabunoc A, Chan J, Chen WJ, Davis P, de la Cruz N, Duong A, Fang R, Ganesan U, Grove C, Howe K, Kadam S, Kishore R, Lee R, Li Y, Muller HM, Nakamura C, Nash B, Ozersky P, Paulini M, Raciti D, Rangarajan A, Schindelman G, Shi X, Schwarz EM, Ann Tuli M, Van Auken K, Wang D, Wang X, Williams G, Hodgkin J, Berriman M, Durbin R, Kersey P, Spieth J, Stein L, Sternberg PW. WormBase 2012: more genomes, more data, new website. Nucleic Acids Res 2011; 40:D735-41. [PMID: 22067452 PMCID: PMC3245152 DOI: 10.1093/nar/gkr954] [Citation(s) in RCA: 164] [Impact Index Per Article: 12.6] [Indexed: 01/05/2023] Open
Abstract
Since its release in 2000, WormBase (http://www.wormbase.org) has grown from a small resource focusing on a single species and serving a dedicated research community, to one now spanning 15 species essential to the broader biomedical and agricultural research fields. To enhance the rate of curation, we have automated the identification of key data in the scientific literature and use similar methodology for data extraction. To ease access to the data, we are collaborating with journals to link entities in research publications to their report pages at WormBase. To facilitate discovery, we have added new views of the data, integrated large-scale datasets and expanded descriptions of models for human disease. Finally, we have introduced a dramatic overhaul of the WormBase website for public beta testing. Designed to balance complexity and usability, the new site is species-agnostic, highly customizable, and interactive. Casual users and developers alike will be able to leverage the public RESTful application programming interface (API) to generate custom data mining solutions and extensions to the site. We report on the growth of our database and on our work in keeping pace with the growing demand for data, efforts to anticipate the requirements of users and new collaborations with the larger science community.
Affiliation(s)
- Karen Yook
  - Division of Biology 156-29, California Institute of Technology, Pasadena, CA 91125, USA.

6
Chautard E, Ballut L, Thierry-Mieg N, Ricard-Blum S. MatrixDB, a database focused on extracellular protein-protein and protein-carbohydrate interactions. Bioinformatics 2009; 25:690-1. [PMID: 19147664 PMCID: PMC2647840 DOI: 10.1093/bioinformatics/btp025] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.5] [Indexed: 11/14/2022]
Abstract
Summary: MatrixDB (http://matrixdb.ibcp.fr) is a database reporting mammalian protein-protein and protein-carbohydrate interactions involving extracellular molecules. It takes into account the full interaction repertoire of the extracellular matrix involving full-length molecules, fragments and multimers. The current version of MatrixDB contains 1972 interactions corresponding to 4412 experiments and involving 259 extracellular biomolecules. Availability: MatrixDB is freely available at http://matrixdb.ibcp.fr
Affiliation(s)
- Emilie Chautard
  - UMR 5086 CNRS-Université Lyon 1, 7 passage du Vercors, 69367 Lyon Cedex 07, France

7
Hunt-Newbury R, Viveiros R, Johnsen R, Mah A, Anastas D, Fang L, Halfnight E, Lee D, Lin J, Lorch A, McKay S, Okada HM, Pan J, Schulz AK, Tu D, Wong K, Zhao Z, Alexeyenko A, Burglin T, Sonnhammer E, Schnabel R, Jones SJ, Marra MA, Baillie DL, Moerman DG. High-throughput in vivo analysis of gene expression in Caenorhabditis elegans. PLoS Biol 2007; 5:e237. [PMID: 17850180 PMCID: PMC1971126 DOI: 10.1371/journal.pbio.0050237] [Citation(s) in RCA: 289] [Impact Index Per Article: 17.0] [Received: 01/17/2007] [Accepted: 07/05/2007] [Indexed: 11/18/2022] Open
Abstract
Using DNA sequences 5' to open reading frames, we have constructed green fluorescent protein (GFP) fusions and generated spatial and temporal tissue expression profiles for 1,886 specific genes in the nematode Caenorhabditis elegans. This effort encompasses about 10% of all genes identified in this organism. GFP-expressing wild-type animals were analyzed at each stage of development from embryo to adult. We have identified 5' DNA regions regulating expression at all developmental stages and in 38 different cell and tissue types in this organism. Among the regulatory regions identified are sequences that regulate expression in all cells, in specific tissues, in combinations of tissues, and in single cells. Most of the genes we have examined in C. elegans have human orthologs. All the images and expression pattern data generated by this project are available at WormAtlas (http://gfpweb.aecom.yu.edu/index) and through WormBase (http://www.wormbase.org).
Affiliation(s)
- Rebecca Hunt-Newbury
  - Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Ryan Viveiros
  - Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Robert Johnsen
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Allan Mah
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Dina Anastas
  - Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Lily Fang
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Erin Halfnight
  - Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- David Lee
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- John Lin
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Adam Lorch
  - Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Sheldon McKay
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, Canada
- H. Mark Okada
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, Canada
- Jie Pan
  - Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Ana K Schulz
  - Institut für Genetik, Technische Universität Braunschweig, Braunschweig, Germany
- Domena Tu
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Kim Wong
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, Canada
- Z Zhao
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Andrey Alexeyenko
  - Center for Genomics and Bioinformatics, Karolinska Institutet, Stockholm, Sweden
- Thomas Burglin
  - Department of Biosciences, Karolinska Institutet, Huddinge, Sweden
- Eric Sonnhammer
  - Center for Genomics and Bioinformatics, Karolinska Institutet, Stockholm, Sweden
- Ralf Schnabel
  - Institut für Genetik, Technische Universität Braunschweig, Braunschweig, Germany
- Steven J Jones
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, Canada
- Marco A Marra
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, Canada
- David L Baillie
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Donald G Moerman
  - Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
  - * To whom correspondence should be addressed.

8
Mangone M, Macmenamin P, Zegar C, Piano F, Gunsalus KC. UTRome.org: a platform for 3'UTR biology in C. elegans. Nucleic Acids Res 2007; 36:D57-62. [PMID: 17986455 PMCID: PMC2238901 DOI: 10.1093/nar/gkm946] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Indexed: 01/15/2023] Open
Abstract
Three-prime untranslated regions (3′UTRs) are widely recognized as important post-transcriptional regulatory regions of mRNAs. RNA-binding proteins and small non-coding RNAs such as microRNAs (miRNAs) bind to functional elements within 3′UTRs to influence mRNA stability, translation and localization. These interactions play many important roles in development, metabolism and disease. However, even in the most well-annotated metazoan genomes, 3′UTRs and their functional elements are not well defined. Comprehensive and accurate genome-wide annotation of 3′UTRs and their functional elements is thus critical. We have developed an open-access database, available at http://www.UTRome.org, to provide a rich and comprehensive resource for 3′UTR biology in the well-characterized, experimentally tractable model system Caenorhabditis elegans. UTRome.org combines data from public repositories and a large-scale effort we are undertaking to characterize 3′UTRs and their functional elements in C. elegans, including 3′UTR sequences, graphical displays, predicted and validated functional elements, secondary structure predictions and detailed data from our cloning pipeline. UTRome.org will grow substantially over time to encompass individual 3′UTR isoforms for the majority of genes, new and revised functional elements, and in vivo data on 3′UTR function as they become available. The UTRome database thus represents a powerful tool to better understand the biology of 3′UTRs.
Affiliation(s)
- Marco Mangone
  - Department of Biology and Center for Genomics and Systems Biology, New York University, 100 Washington Square East, New York, NY 10003, USA

9
Bieri T, Blasiar D, Ozersky P, Antoshechkin I, Bastiani C, Canaran P, Chan J, Chen N, Chen WJ, Davis P, Fiedler TJ, Girard L, Han M, Harris TW, Kishore R, Lee R, McKay S, Müller HM, Nakamura C, Petcherski A, Rangarajan A, Rogers A, Schindelman G, Schwarz EM, Spooner W, Tuli MA, Van Auken K, Wang D, Wang X, Williams G, Durbin R, Stein LD, Sternberg PW, Spieth J. WormBase: new content and better access. Nucleic Acids Res 2006; 35:D506-10. [PMID: 17099234 PMCID: PMC1669750 DOI: 10.1093/nar/gkl818] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.1] [Indexed: 11/12/2022] Open
Abstract
WormBase (http://wormbase.org), a model organism database for Caenorhabditis elegans and other related nematodes, continues to evolve and expand. Over the past year WormBase has added new data on C. elegans, including data on classical genetics, cell biology and functional genomics; expanded the annotation of closely related nematodes with a new genome browser for Caenorhabditis remanei; and deployed new hardware for stronger performance. Several existing datasets, including phenotype descriptions and RNAi experiments, have seen a large increase in new content. New datasets such as the C. remanei draft assembly and annotations, the Vancouver Fosmid library and TEC-RED 5' end sites are now available as well. Accessing and searching WormBase has become more dependable and flexible via multiple mirror sites and indexing through Google.
Affiliation(s)
- Tamberlyn Bieri
  - Genome Sequencing Center, Washington University School of Medicine, St Louis, MO 63108, USA.

10
Schwarz EM, Sternberg PW. Searching WormBase for information about Caenorhabditis elegans. Curr Protoc Bioinformatics 2006; Chapter 1:Unit 1.8. [PMID: 18428757 DOI: 10.1002/0471250953.bi0108s14] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/26/2023]
Abstract
WormBase is the major public biological database for the nematode Caenorhabditis elegans. It is meant to be useful to any biologist who wants to use C. elegans, whatever his or her specialty. WormBase contains information about the genomic sequence of C. elegans, its genes and their products, and its higher-level traits such as gene expression patterns and neuronal connectivity. WormBase also contains genomic sequences and gene structures of C. briggsae and C. remanei, two closely related worms. These data are interconnected, so that a search beginning with one object (such as a gene) can be directed to related objects of a different type (e.g., the DNA sequence of the gene or the cells in which the gene is active). One can also perform searches for complex data sets. The WormBase developers group actively invites suggestions for improvements from the database users. WormBase's source code and underlying database are freely available for local installation and modification.
Affiliation(s)
- Erich M Schwarz
  - California Institute of Technology, Pasadena, California, USA

11
Nègre V, Hôtelier T, Volkoff AN, Gimenez S, Cousserans F, Mita K, Sabau X, Rocher J, López-Ferber M, d'Alençon E, Audant P, Sabourault C, Bidegainberry V, Hilliou F, Fournier P. SPODOBASE: an EST database for the lepidopteran crop pest Spodoptera. BMC Bioinformatics 2006; 7:322. [PMID: 16796757 PMCID: PMC1539033 DOI: 10.1186/1471-2105-7-322] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.6] [Received: 12/21/2005] [Accepted: 06/23/2006] [Indexed: 11/10/2022] Open
Abstract
Background: The lepidopteran Spodoptera frugiperda is a pest that causes widespread economic damage to a variety of crop plants. It is also well known for its Sf9 cell line, which is used for the production of numerous heterologous proteins. Species of the Spodoptera genus are used as models for pesticide resistance and to study virus-host interactions. A genomic approach is now a critical step for further developments in the biology and pathology of these insects, and the results of EST sequencing efforts need to be structured into databases providing an integrated set of tools and information. Description: The ESTs from five independent cDNA libraries, prepared from three different S. frugiperda tissues (hemocytes, midgut and fat body) and from the Sf9 cell line, are deposited in the database. These tissues were chosen because of their importance in biological processes such as immune response, development and plant/insect interaction. So far, SPODOBASE contains 29,325 ESTs, which are cleaned and clustered into non-redundant sets (2294 clusters and 6103 singletons). SPODOBASE is constructed in such a way that other ESTs from S. frugiperda or other species may be added. Users can retrieve information using text searches, pre-formatted queries, a query assistant or BLAST searches. Annotation is provided against NCBI, UNIPROT or Bombyx mori EST databases, and with the GO-Slim vocabulary. Conclusion: The SPODOBASE database provides integrated access to expressed sequence tags (ESTs) from the lepidopteran insect Spodoptera frugiperda. It is a publicly available structured database of insect pest sequences which will allow identification of a number of genes and comprehensive cloning of gene families of interest for the scientific community. SPODOBASE is available from URL:
Affiliation(s)
- Vincent Nègre
  - Unité Informatique de Centre, INRA-AgroM, 2 place Viala, 34060 Montpellier Cedex 2, France
  - EMI 0229 INSERM, CRLC Val d'Aurelle, 34298 Montpellier Cedex 5, France
- Thierry Hôtelier
  - Unité Informatique de Centre, INRA-AgroM, 2 place Viala, 34060 Montpellier Cedex 2, France
- Anne-Nathalie Volkoff
  - Unité Biologie Intégrative et Virologie des Insectes, UMR1231, Université UMII, Bât. 24, cc101, place Eugène Bataillon, 34095 Montpellier Cedex 5, France
- Sylvie Gimenez
  - Unité Biologie Intégrative et Virologie des Insectes, UMR1231, Université UMII, Bât. 24, cc101, place Eugène Bataillon, 34095 Montpellier Cedex 5, France
- François Cousserans
  - Unité Biologie Intégrative et Virologie des Insectes, UMR1231, Université UMII, Bât. 24, cc101, place Eugène Bataillon, 34095 Montpellier Cedex 5, France
- Kazuei Mita
  - Insect Genome Laboratory, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
- Xavier Sabau
  - Unité Polymorphisme d'Intérêt Agronomique, Dép. AMIS, CIRAD, TA40/03, avenue d'Agropolis, 34398 Montpellier Cedex 5, France
- Janick Rocher
  - Unité Biologie Intégrative et Virologie des Insectes, UMR1231, Université UMII, Bât. 24, cc101, place Eugène Bataillon, 34095 Montpellier Cedex 5, France
  - Ecole des Mines, Départ. LGEI, 6 av. Clavières, 30319 Alès Cedex, France
- Miguel López-Ferber
  - Unité Biologie Intégrative et Virologie des Insectes, UMR1231, Université UMII, Bât. 24, cc101, place Eugène Bataillon, 34095 Montpellier Cedex 5, France
- Emmanuelle d'Alençon
  - Unité Biologie Intégrative et Virologie des Insectes, UMR1231, Université UMII, Bât. 24, cc101, place Eugène Bataillon, 34095 Montpellier Cedex 5, France
- Pascaline Audant
  - Unité Résistance des Organismes aux Stress Environnementaux, UMR1112, INRA, 400 route des Chappes, BP167, 06903 Sophia-Antipolis Cedex, France
- Cécile Sabourault
  - Unité Résistance des Organismes aux Stress Environnementaux, UMR1112, INRA, 400 route des Chappes, BP167, 06903 Sophia-Antipolis Cedex, France
- Vincent Bidegainberry
  - Unité Résistance des Organismes aux Stress Environnementaux, UMR1112, INRA, 400 route des Chappes, BP167, 06903 Sophia-Antipolis Cedex, France
- Frédérique Hilliou
  - Unité Résistance des Organismes aux Stress Environnementaux, UMR1112, INRA, 400 route des Chappes, BP167, 06903 Sophia-Antipolis Cedex, France
- Philippe Fournier
  - Unité Biologie Intégrative et Virologie des Insectes, UMR1231, Université UMII, Bât. 24, cc101, place Eugène Bataillon, 34095 Montpellier Cedex 5, France

12
Abstract
Genomic medicine aims to revolutionize health care by applying our growing understanding of the molecular basis of disease. Research in this arena is data intensive, which means data sets are large and highly heterogeneous. To create knowledge from data, researchers must integrate these large and diverse data sets. This presents daunting informatic challenges such as representation of data that is suitable for computational inference (knowledge representation), and linking heterogeneous data sets (data integration). Fortunately, many of these challenges can be classified as data integration problems, and technologies exist in the area of data integration that may be applied to these challenges. In this paper, we discuss the opportunities of genomic medicine as well as identify the informatics challenges in this domain. We also review concepts and methodologies in the field of data integration. These data integration concepts and methodologies are then aligned with informatics challenges in genomic medicine and presented as potential solutions. We conclude this paper with challenges still not addressed in genomic medicine and gaps that remain in data integration research to facilitate genomic medicine.

13
Schwarz EM, Antoshechkin I, Bastiani C, Bieri T, Blasiar D, Canaran P, Chan J, Chen N, Chen WJ, Davis P, Fiedler TJ, Girard L, Harris TW, Kenny EE, Kishore R, Lawson D, Lee R, Müller HM, Nakamura C, Ozersky P, Petcherski A, Rogers A, Spooner W, Tuli MA, Van Auken K, Wang D, Durbin R, Spieth J, Stein LD, Sternberg PW. WormBase: better software, richer content. Nucleic Acids Res 2006; 34:D475-8. [PMID: 16381915 PMCID: PMC1347424 DOI: 10.1093/nar/gkj061] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.6] [Indexed: 11/12/2022] Open
Abstract
WormBase (http://wormbase.org), the public database for genomics and biology of Caenorhabditis elegans, has been restructured for stronger performance and expanded for richer biological content. Performance was improved by accelerating the loading of central data pages such as the omnibus Gene page, by rationalizing internal data structures and software for greater portability, and by making the Genome Browser highly customizable in how it views and exports genomic subsequences. Arbitrarily complex, user-specified queries are now possible through Textpresso (for all available literature) and through WormMart (for most genomic data). Biological content was enriched by reconciling all available cDNA and expressed sequence tag data with gene predictions, clarifying single nucleotide polymorphism and RNAi sites, and summarizing known functions for most genes studied in this organism.
Collapse
Affiliation(s)
- Erich M Schwarz
- Division of Biology, 156-29 California Institute of Technology, Pasadena, CA, 91125, USA.
|
14
|
Chen N, Harris TW, Antoshechkin I, Bastiani C, Bieri T, Blasiar D, Bradnam K, Canaran P, Chan J, Chen CK, Chen WJ, Cunningham F, Davis P, Kenny E, Kishore R, Lawson D, Lee R, Muller HM, Nakamura C, Pai S, Ozersky P, Petcherski A, Rogers A, Sabo A, Schwarz EM, Van Auken K, Wang Q, Durbin R, Spieth J, Sternberg PW, Stein LD. WormBase: a comprehensive data resource for Caenorhabditis biology and genomics. Nucleic Acids Res 2005; 33:D383-9. [PMID: 15608221 PMCID: PMC540020 DOI: 10.1093/nar/gki066] [Citation(s) in RCA: 126] [Impact Index Per Article: 6.6] [Indexed: 11/13/2022] Open
Abstract
WormBase (http://www.wormbase.org), the model organism database for information about Caenorhabditis elegans and related nematodes, continues to expand in breadth and depth. Over the past year, WormBase has added multiple large-scale datasets including SAGE, interactome, 3D protein structure datasets and NCBI KOGs. To accommodate this growth, the International WormBase Consortium has improved the user interface by adding new features to aid in navigation, visualization of large-scale datasets, advanced searching and data mining. Internally, we have restructured the database models to rationalize the representation of genes and to prepare the system to accept the genome sequences of three additional Caenorhabditis species over the coming year.
Affiliation(s)
- Nansheng Chen
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA.
|
15
|
Wang K, Tarczy-Hornoch P, Shaker R, Mork P, Brinkley JF. BioMediator data integration: beyond genomics to neuroscience data. AMIA Annu Symp Proc 2005; 2005:779-83. [PMID: 16779146 PMCID: PMC1560529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2023]
Abstract
The BioMediator system developed at the University of Washington (UW) provides a theoretical and practical foundation for data integration across diverse biomedical research domains and various data types. In this paper we demonstrate the generalizability of its architecture through its application to the UW Human Brain Project (HBP) for understanding language organization in the brain. First, we describe the system architecture and the characteristics of the four data sources developed by the UW HBP. Second, we present the process of developing the application prototype for HBP neuroscience researchers posing queries across these semantically and syntactically heterogeneous neurophysiologic data sources. Finally, we discuss the benefits and potential limitations of the BioMediator system as a general data integration solution for different user groups in genomic and neuroscience research domains.
Affiliation(s)
- K Wang
- Dept. of Medical Education & Biomedical Informatics, University of Washington, Seattle WA, USA
|
16
|
Chen N, Lawson D, Bradnam K, Harris TW, Stein LD. WormBase as an integrated platform for the C. elegans ORFeome. Genome Res 2004; 14:2155-61. [PMID: 15489338 PMCID: PMC528932 DOI: 10.1101/gr.2521304] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 11/24/2022]
Abstract
The ORFeome project has validated and corrected a large number of predicted gene models in the nematode C. elegans, and has provided an enormous resource for proteome-scale studies. To make the resource useful to the research and teaching community, it needs to be integrated with other large-scale data sets, including the C. elegans genome, cell lineage, neurological wiring diagram, transcriptome, and gene expression map. This integration is also critical because the ORFeome data sets, like other 'omics' data sets, have significant false-positive and false-negative rates, and comparison to related data is necessary to make confidence judgments in any given data point. WormBase, the central data repository for information about C. elegans and related nematodes, provides such a platform for integration. In this report, we will describe how C. elegans ORFeome data are deposited in the database, how they are used to correct gene models, how they are integrated and displayed in the context of other data sets at the WormBase Web site, and how WormBase establishes connection with the reagent-based resources at the ORFeome project Web site.
Affiliation(s)
- Nansheng Chen
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA.
|
17
|
McKay SJ, Johnsen R, Khattra J, Asano J, Baillie DL, Chan S, Dube N, Fang L, Goszczynski B, Ha E, Halfnight E, Hollebakken R, Huang P, Hung K, Jensen V, Jones SJM, Kai H, Li D, Mah A, Marra M, McGhee J, Newbury R, Pouzyrev A, Riddle DL, Sonnhammer E, Tian H, Tu D, Tyson JR, Vatcher G, Warner A, Wong K, Zhao Z, Moerman DG. Gene expression profiling of cells, tissues, and developmental stages of the nematode C. elegans. Cold Spring Harb Symp Quant Biol 2004; 68:159-69. [PMID: 15338614 DOI: 10.1101/sqb.2003.68.159] [Citation(s) in RCA: 239] [Impact Index Per Article: 12.0] [Indexed: 11/24/2022]
Affiliation(s)
- S J McKay
- Genome Sciences Centre, BC Cancer Agency, Vancouver, B.C., Canada, V6T 1Z4
|
18
|
Schwarz EM, Sternberg PW. Searching WormBase for Information About Caenorhabditis elegans. Curr Protoc Bioinformatics 2004. [DOI: 10.1002/0471250953.bi0108s6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/22/2023]
|
19
|
Abstract
The molecular anatomy of the vertebrate embryo was systematically analysed through gene expression during early development of the Xenopus frog using whole-mount in situ hybridization. Expression patterns are documented and assembled into the database Axeldb (http://www.dkfz-heidelberg.de/abt0135/axeldb.htm). Synexpression groups, representing genes with shared, complex expression patterns that predict molecular pathways involved in patterning and differentiation, have been identified. These sets of co-regulated genes show a striking similarity with operons, and may be a key determinant facilitating evolutionary change leading to animal diversity.
Affiliation(s)
- Nicolas Pollet
- Laboratoire de transgenèse et génétique des amphibiens, CNRS UMR 8080, IBAIC Bât. 447, université Paris-Sud, 91405 Orsay Cedex, France.
|
20
|
Birney E, Andrews TD, Bevan P, Caccamo M, Chen Y, Clarke L, Coates G, Cuff J, Curwen V, Cutts T, Down T, Eyras E, Fernandez-Suarez XM, Gane P, Gibbins B, Gilbert J, Hammond M, Hotz HR, Iyer V, Jekosch K, Kahari A, Kasprzyk A, Keefe D, Keenan S, Lehvaslaiho H, McVicker G, Melsopp C, Meidl P, Mongin E, Pettett R, Potter S, Proctor G, Rae M, Searle S, Slater G, Smedley D, Smith J, Spooner W, Stabenau A, Stalker J, Storey R, Ureta-Vidal A, Woodwark KC, Cameron G, Durbin R, Cox A, Hubbard T, Clamp M. An overview of Ensembl. Genome Res 2004; 14:925-8. [PMID: 15078858 PMCID: PMC479121 DOI: 10.1101/gr.1860604] [Citation(s) in RCA: 305] [Impact Index Per Article: 15.3] [Indexed: 01/21/2023]
Abstract
Ensembl (http://www.ensembl.org/) is a bioinformatics project to organize biological information around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of individual genomes, and of the synteny and orthology relationships between them. It is also a framework for integration of any biological data that can be mapped onto features derived from the genomic sequence. Ensembl is available as an interactive Web site, a set of flat files, and as a complete, portable open source software system for handling genomes. All data are provided without restriction, and code is freely available. Ensembl's aims are to continue to "widen" this biological integration to include other model organisms relevant to understanding human biology as they become available; to "deepen" this integration to provide an ever more seamless linkage between equivalent components in different species; and to provide further classification of functional elements in the genome that have been previously elusive.
Affiliation(s)
- Ewan Birney
- EMBL European Bioinformatics Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
|
21
|
Gunsalus KC, Yueh WC, MacMenamin P, Piano F. RNAiDB and PhenoBlast: web tools for genome-wide phenotypic mapping projects. Nucleic Acids Res 2004; 32:D406-10. [PMID: 14681444 PMCID: PMC308844 DOI: 10.1093/nar/gkh110] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.5] [Indexed: 01/18/2023] Open
Abstract
RNA interference (RNAi) is being used in large-scale genomic studies as a rapid way to obtain in vivo functional information associated with specific genes. How best to archive and mine the complex data derived from these studies provides a series of challenges associated with both the methods used to elicit the RNAi response and the functional data gathered. RNAiDB (RNAi Database; http://www.rnai.org) has been created for the archival, distribution and analysis of phenotypic data from large-scale RNAi analyses in Caenorhabditis elegans. The database contains a compendium of publicly available data and provides information on experimental methods and phenotypic results, including raw data in the form of images and streaming time-lapse movies. Phenotypic summaries together with graphical displays of RNAi to gene mappings allow quick intuitive comparison of results from different RNAi assays and visualization of the gene product(s) potentially inhibited by each RNAi experiment based on multiple sequence analysis methods. RNAiDB can be searched using combinatorial queries and using the novel tool PhenoBlast, which ranks genes according to their overall phenotypic similarity. RNAiDB could serve as a model database for distributing and navigating in vivo functional information from large-scale systematic phenotypic analyses in different organisms.
Affiliation(s)
- Kristin C Gunsalus
- Center for Comparative Functional Genomics, Department of Biology, New York University, 1009 Silver Building, 100 Washington Square E., New York, NY 10003, USA.
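
The phenotypic-similarity ranking mentioned in the RNAiDB abstract above can be illustrated with a minimal sketch. This is not PhenoBlast's actual scoring scheme, which the abstract does not specify; it assumes each gene carries a set of observed phenotype labels and substitutes Jaccard similarity as a stand-in measure. The gene names, phenotype labels, and the function names `jaccard` and `rank_by_phenotype` are all hypothetical.

```python
# Toy phenotype profiles: gene -> set of observed RNAi phenotype labels.
# All names and labels here are invented for illustration.
PROFILES = {
    "gene-1": {"embryonic lethal", "spindle defect"},
    "gene-2": {"embryonic lethal", "spindle defect", "sterile"},
    "gene-3": {"sterile"},
    "gene-4": {"embryonic lethal"},
}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_phenotype(query, profiles):
    """Rank every other gene by phenotype similarity to the query gene.

    Returns (gene, score) pairs sorted by descending score; ties are
    broken alphabetically by gene name so the order is deterministic.
    """
    query_profile = profiles[query]
    scores = [
        (gene, jaccard(query_profile, profile))
        for gene, profile in profiles.items()
        if gene != query
    ]
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    for gene, score in rank_by_phenotype("gene-1", PROFILES):
        print(f"{gene}\t{score:.2f}")
```

A set-based measure is only one possible choice; a real tool might weight phenotypes by frequency or compare time-resolved scores instead.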
|
22
|
Ruiz M, Rouard M, Raboin LM, Lartaud M, Lagoda P, Courtois B. TropGENE-DB, a multi-tropical crop information system. Nucleic Acids Res 2004; 32:D364-7. [PMID: 14681435 PMCID: PMC308839 DOI: 10.1093/nar/gkh105] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022] Open
Abstract
TropGENE-DB is a crop information system created to store genetic, molecular and phenotypic data of the numerous yet poorly documented tropical crop species. The most common data stored in TropGENE-DB are information on genetic resources (agro-morphological data, parentages, allelic diversity), molecular markers, genetic maps, results of quantitative trait loci analyses, data from physical mapping, sequences, genes, as well as the corresponding references. TropGENE-DB is organized on a crop basis with currently three running modules (sugarcane, cocoa and banana), with plans to create additional modules for rice, cotton, oil palm, coconut, rubber tree, pineapple, taro, yam and sorghum. The TropGENE-DB information system is accessible for consultation via the internet at http://tropgenedb.cirad.fr. Specific web consultation interfaces have been designed to allow quick consultations as well as complex queries.
Affiliation(s)
- Manuel Ruiz
- CIRAD, Biotrop, TA 40/03, Avenue Agropolis, 34398 Montpellier Cedex 5, France.
|
23
|
Srinivasan J, Otto GW, Kahlow U, Geisler R, Sommer RJ. AppaDB: an AcedB database for the nematode satellite organism Pristionchus pacificus. Nucleic Acids Res 2004; 32:D421-2. [PMID: 14681447 PMCID: PMC308791 DOI: 10.1093/nar/gkh057] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/14/2022] Open
Abstract
Pristionchus pacificus is a free-living nematode of the Diplogastridae family and was recently developed as a satellite system in evolutionary developmental biology. AppaDB, a P. pacificus database, was created (http://appadb.eb.tuebingen.mpg.de) to integrate the genomic data of P. pacificus, comprising the physical map, genetic linkage map, EST and BAC end sequence and hybridization data. This developing database serves as a repository to search and find any information regarding physical contigs or genetic markers required for mapping of mutants. Additionally, it provides a platform for the Caenorhabditis elegans community to compare nematode genetic data in an evolutionary perspective.
Affiliation(s)
- Jagan Srinivasan
- Abteilung für Evolutionsbiologie, Max-Planck Institut für Entwicklungsbiologie, Spemannstrasse 35-37, 72076 Tübingen, Germany
|
24
|
May BP, Liu H, Vollbrecht E, Senior L, Rabinowicz PD, Roh D, Pan X, Stein L, Freeling M, Alexander D, Martienssen R. Maize-targeted mutagenesis: A knockout resource for maize. Proc Natl Acad Sci U S A 2003; 100:11541-6. [PMID: 12954979 PMCID: PMC208794 DOI: 10.1073/pnas.1831119100] [Citation(s) in RCA: 93] [Impact Index Per Article: 4.4] [Received: 02/25/2003] [Indexed: 11/18/2022] Open
Abstract
We describe an efficient system for site-selected transposon mutagenesis in maize. A total of 43,776 F1 plants were generated by using Robertson's Mutator (Mu) pollen parents and self-pollinated to establish a library of transposon-mutagenized seed. The frequency of new seed mutants was between 10⁻⁴ and 10⁻⁵ per F1 plant. As a service to the maize community, maize-targeted mutagenesis selects insertions in genes of interest from this library by using the PCR. Pedigree, knockout, sequence, phenotype, and other information is stored in a powerful interactive database (maize-targeted mutagenesis database) that enables analysis of the entire population and the handling of knockout requests. By inhibiting Mu activity in most F1 plants, we sought to reduce somatic insertions that may cause false positives selected from pooled tissue. By monitoring the remaining Mu activity in the F2, however, we demonstrate that seed phenotypes depend on it, and false positives occur in lines that appear to lack it. We conclude that more than half of all mutations arising in this population are suppressed on losing Mu activity. These results have implications for epigenetic models of inbreeding and for functional genomics.
Affiliation(s)
- Bruce P May
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
|
25
|
Abstract
An essential step in Serial Analysis of Gene Expression (SAGE) is tag mapping, which refers to the unambiguous determination of the gene represented by a SAGE tag. Current resources for tag mapping are incomplete, and thus do not allow assessment of the efficacy of SAGE in transcript identification. A method of tag mapping is described here and applied to the Drosophila melanogaster and Caenorhabditis elegans genomes, which permits detailed SAGE assessment and provides tag-mapping resources that were unavailable previously for these organisms. In our method, a conceptual transcriptome is constructed using genomic sequence and annotation by extending predicted coding regions to include UTRs on the basis of EST and cDNA alignments, UTR length distributions, and polyadenylation signals. Analysis of extracted tags suggests that, using the standard SAGE procedure, expression of 8% of D. melanogaster and 15% of C. elegans genes cannot be detected unambiguously by SAGE due to shared sequence or lack of NlaIII-anchoring enzyme sites. Both increasing tag length by 2-3 bp and using Sau3A instead of NlaIII as the anchoring enzyme increase the potential for transcript detection. This work identifies and quantifies genes not amenable to SAGE analysis, in addition to providing tag-to-gene mappings for two model organisms.
Affiliation(s)
- Erin D Pleasance
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver V5Z 4E6, Canada
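
The tag-mapping idea in the SAGE abstract above can be sketched in a few lines. The sketch assumes the conventional SAGE setup, in which the tag is the 10 bases immediately 3' of the 3'-most anchoring-enzyme site (CATG for NlaIII, GATC for Sau3A), and it works on toy sequences with one transcript per gene rather than a real conceptual transcriptome. The sequences, gene names, and the functions `extract_tag` and `classify_genes` are hypothetical and are not the authors' pipeline.

```python
from collections import defaultdict

NLAIII_SITE = "CATG"  # recognition site of the NlaIII anchoring enzyme
SAU3A_SITE = "GATC"   # recognition site of Sau3A

# Hypothetical toy transcripts (5' to 3'), one per gene.
TOY_TRANSCRIPTS = {
    "geneA": "AAACATGACGTACGTTAGGG",
    "geneB": "TTTCATGACGTACGTTACCC",
    "geneC": "GGGGATCTTTTTTTTTTAAA",  # contains no CATG site
}


def extract_tag(transcript, site=NLAIII_SITE, tag_len=10):
    """Return the tag_len bases after the 3'-most site, or None.

    None means the transcript has no anchoring site, or the site lies
    too close to the 3' end to yield a full-length tag.
    """
    seq = transcript.upper()
    pos = seq.rfind(site)
    if pos == -1:
        return None
    start = pos + len(site)
    tag = seq[start:start + tag_len]
    return tag if len(tag) == tag_len else None


def classify_genes(transcripts, site=NLAIII_SITE, tag_len=10):
    """Split genes into unambiguous, ambiguous (shared tag) and no_tag."""
    tag_to_genes = defaultdict(set)
    result = {"unambiguous": set(), "ambiguous": set(), "no_tag": set()}
    for gene, seq in transcripts.items():
        tag = extract_tag(seq, site, tag_len)
        if tag is None:
            result["no_tag"].add(gene)
        else:
            tag_to_genes[tag].add(gene)
    for genes in tag_to_genes.values():
        key = "unambiguous" if len(genes) == 1 else "ambiguous"
        result[key] |= genes
    return result


if __name__ == "__main__":
    print(classify_genes(TOY_TRANSCRIPTS))
    print(classify_genes(TOY_TRANSCRIPTS, tag_len=12))
```

In this toy data, geneA and geneB share a 10-base tag and geneC has no NlaIII site; lengthening the tag to 12 bases separates geneA from geneB, which mirrors the abstract's point that longer tags improve detection.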
|
26
|
Shockey JM, Fulda MS, Browse J. Arabidopsis contains a large superfamily of acyl-activating enzymes. Phylogenetic and biochemical analysis reveals a new class of acyl-coenzyme A synthetases. Plant Physiol 2003; 132:1065-76. [PMID: 12805634 PMCID: PMC167044 DOI: 10.1104/pp.103.020552] [Citation(s) in RCA: 130] [Impact Index Per Article: 6.2] [Indexed: 05/17/2023]
Abstract
Acyl-activating enzymes are a diverse group of proteins that catalyze the activation of many different carboxylic acids, primarily through the formation of a thioester bond. This group of enzymes is found in all living organisms and includes the acyl-coenzyme A synthetases, 4-coumarate:coenzyme A ligases, luciferases, and non-ribosomal peptide synthetases. The members of this superfamily share little overall sequence identity, but do contain a 12-amino acid motif common to all enzymes that activate their acid substrates using ATP via an enzyme-bound adenylate intermediate. Arabidopsis possesses an acyl-activating enzyme superfamily containing 63 different genes. In addition to the genes that had been characterized previously, 14 new cDNA clones were isolated as part of this work. The protein sequences were compared phylogenetically and grouped into seven distinct categories. At least four of these categories are plant specific. The tissue-specific expression profiles of some of the genes of unknown function were analyzed and shown to be complex, with a high degree of overlap. Most of the plant-specific genes represent uncharacterized aspects of carboxylic acid metabolism. One such group contains members whose enzymes activate short- and medium-chain fatty acids. Altogether, the results presented here describe the largest acyl-activating enzyme family present in any organism thus far studied at the genomic level and clearly indicate that carboxylic acid activation metabolism in plants is much more complex than previously thought.
Affiliation(s)
- Jay M Shockey
- Institute of Biological Chemistry, Washington State University, Pullman 99164-6340, USA
|
27
|
Thorisson GA, Stein LD. The SNP Consortium website: past, present and future. Nucleic Acids Res 2003; 31:124-7. [PMID: 12519964 PMCID: PMC165499 DOI: 10.1093/nar/gkg052] [Citation(s) in RCA: 122] [Impact Index Per Article: 5.8] [Received: 08/22/2002] [Accepted: 09/11/2002] [Indexed: 01/20/2023] Open
Abstract
The SNP Consortium website (http://snp.cshl.org) has undergone many changes since its initial conception three years ago. The database back end has been changed from the venerable ACeDB to the more scalable MySQL engine. Users can access the data via gene or single nucleotide polymorphism (SNP) keyword searches and browse or dump SNP data to text files. A graphical genome browsing interface shows SNPs mapped onto the genome assembly in the context of externally available gene predictions and other features. SNP allele frequency and genotype data are available via FTP download and on individual SNP report web pages. SNP linkage maps are available for download and for browsing in a comparative map viewer. All software components of the data coordinating center (DCC) website (http://snp.cshl.org) are open source.
|
28
|
Watanabe J, Sasaki M, Suzuki Y, Sugano S. Analysis of transcriptomes of human malaria parasite Plasmodium falciparum using full-length enriched library: identification of novel genes and diverse transcription start sites of messenger RNAs. Gene 2002; 291:105-13. [PMID: 12095684 DOI: 10.1016/s0378-1119(02)00552-8] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.4] [Indexed: 11/23/2022]
Abstract
Now that sequencing of the complete genome of the human malaria parasite Plasmodium falciparum is underway, analysis of complementary DNAs (cDNAs) is growing in importance. We constructed a full-length-enriched cDNA library from erythrocytic stage P. falciparum using the 'oligo-capping' method (Nucleic Acids Res. 29 (2001) 70). In this report we describe the novel genes identified using this library and a detailed characterization of the transcriptional start sites of the knob-associated histidine-rich protein gene. Contrary to the previous report, we conclude that all the transcripts of Plasmodium genes have diverse start sites. Sequence comparisons between the cDNAs and the complete sequences of chromosomes 2 identified three novel genes that had been missed by computational predictions. Moreover, analysis of transcriptional start sites revealed that the average length of the 5' untranslated region was 346 nt, which is much longer than that in humans. The transcriptional start sites of all the genes studied were far more diverse than those of human genes. These observations may reflect unique mechanism(s) of gene expression in this organism, which has an extremely AT-rich genome.
Affiliation(s)
- Junichi Watanabe
- Department of Parasitology, Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Minatoku, Tokyo 108-8639, Japan.
|
29
|
Abstract
The advent of whole-genome data resources--not only sequence but also other genome-scale data collections such as gene expression, protein interaction, and genetic variation--is having two marked, complementary effects on the relatively new discipline of bioinformatics. First, the veritable flood of data is creating a need and demand for new tools for dealing adequately with the deluge, and, second, the unprecedented extent, diversity, and impending completeness of the data sets are creating opportunities for new approaches to discovery based on computational methods.
Affiliation(s)
- D B Searls
- Bioinformatics Department, SmithKline Beecham Pharmaceuticals, King of Prussia, Pennsylvania 19406, USA.
|
30
|
Martin SL, Blackmon BP, Rajagopalan R, Houfek TD, Sceeles RG, Denn SO, Mitchell TK, Brown DE, Wing RA, Dean RA. MagnaportheDB: a federated solution for integrating physical and genetic map data with BAC end derived sequences for the rice blast fungus Magnaporthe grisea. Nucleic Acids Res 2002; 30:121-4. [PMID: 11752272 PMCID: PMC99159 DOI: 10.1093/nar/30.1.121] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022] Open
Abstract
We have created a federated database for genome studies of Magnaporthe grisea, the causal agent of rice blast disease, by integrating end sequence data from BAC clones, genetic marker data and BAC contig assembly data. A library of 9216 BAC clones providing >25-fold coverage of the entire genome was end sequenced and fingerprinted by HindIII digestion. The Image/FPC software package was then used to generate an assembly of 188 contigs covering >95% of the genome. The database contains the results of this assembly integrated with hybridization data of genetic markers to the BAC library. AceDB was used for the core database engine and a MySQL relational database, populated with numerical representations of BAC clones within FPC contigs, was used to create appropriately scaled images. The database is being used to facilitate sequencing efforts. The database also gives researchers mapping known genes or other sequences of interest rapid and easy access to the fundamental organization of the M. grisea genome. This database, MagnaportheDB, can be accessed on the web at http://www.cals.ncsu.edu/fungal_genomics/mgdatabase/int.htm.
Affiliation(s)
- Stanton L Martin
- Fungal Genomics Laboratory, North Carolina State University, 840 Main Campus Drive, Suite 1200, Raleigh, NC 27606, USA
|
31
|
Nakata K, Takai-Igarashi T, Nakano T, Kaminuma T. An Integrated Receptor Database (IRDB). Data Sci J 2002. [DOI: 10.2481/dsj.1.271] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/20/2022] Open
|
32
|
Wendl MC, Korf I, Chinwalla AT, Hillier LW. Automated processing of raw DNA sequence data. IEEE Eng Med Biol Mag 2001; 20:41-8. [PMID: 11494768 DOI: 10.1109/51.940044] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 11/10/2022]
Affiliation(s)
- M C Wendl
- Genome Sequencing Center, Washington University, St. Louis, USA.
|
33
|
Bossinger O, Klebes A, Segbert C, Theres C, Knust E. Zonula adherens formation in Caenorhabditis elegans requires dlg-1, the homologue of the Drosophila gene discs large. Dev Biol 2001; 230:29-42. [PMID: 11161560 DOI: 10.1006/dbio.2000.0113] [Citation(s) in RCA: 132] [Impact Index Per Article: 5.7] [Indexed: 11/22/2022]
Abstract
The correct assembly of junction components, such as E-cadherin and beta-catenin, into the zonula adherens is fundamental for the function of epithelia, both in flies and in vertebrates. In C. elegans, however, the cadherin-catenin system is not essential for general adhesion, raising the question as to the genetic basis controlling junction morphogenesis in nematodes. Here we show that dlg-1, the C. elegans homologue of the Drosophila tumour-suppressor gene discs-large, plays a crucial role in epithelial development. DLG-1 is restricted to adherens junctions of all embryonic epithelia, which contrasts with the localisation of the Drosophila and vertebrate homologues in septate and tight junctions, respectively. Proper localisation of DLG-1 requires the basolateral LET-413 protein, but is independent of the cadherin-catenin system. Embryos in which dlg-1 activity was eliminated by RNA-mediated interference fail to form a continuous belt of junction-associated antigens and arrest development. Loss of dlg-1 activity differentially affects localisation of proteins normally enriched apically to the zonula adherens. While the distribution of an atypical protein kinase C (PKC-3) and other cytoplasmic proteins (PAR-3, PAR-6) is not affected in dlg-1 (RNAi) embryos, the transmembrane protein encoded by crb-1, the C. elegans homologue of Drosophila crumbs, is no longer concentrated in this domain. In contrast to Drosophila, however, crb-1 and a second crb-like gene are not essential for epithelial development in C. elegans. Together the data indicate that several aspects of the spatial organisation of epithelial cells and its genetic control differ between flies, worms, and vertebrates, while others are conserved. The molecular nature of DLG-1 makes it a likely candidate to participate in the organisation of a protein scaffold that controls the assembly of junction components into the zonula adherens.
Affiliation(s)
- O Bossinger
- Institut für Genetik, Heinrich-Heine Universität Düsseldorf, Universitätsstr. 1, 40225 Düsseldorf, Germany.
|
34
|
Stein L, Sternberg P, Durbin R, Thierry-Mieg J, Spieth J. WormBase: network access to the genome and biology of Caenorhabditis elegans. Nucleic Acids Res 2001; 29:82-6. [PMID: 11125056 PMCID: PMC29781 DOI: 10.1093/nar/29.1.82] [Citation(s) in RCA: 226] [Impact Index Per Article: 9.8] [Indexed: 11/12/2022] Open
Abstract
WormBase (http://www.wormbase.org) is a web-based resource for the Caenorhabditis elegans genome and its biology. It builds upon the existing ACeDB database of the C. elegans genome by providing data curation services, a significantly expanded range of subject areas and a user-friendly front end.
Affiliation(s)
- L Stein
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA.
|
35
|
Piano F, Schetter AJ, Mangone M, Stein L, Kemphues KJ. RNAi analysis of genes expressed in the ovary of Caenorhabditis elegans. Curr Biol 2000; 10:1619-22. [PMID: 11137018 DOI: 10.1016/s0960-9822(00)00869-1] [Citation(s) in RCA: 165] [Impact Index Per Article: 6.9] [Indexed: 11/29/2022]
Abstract
As a step towards comprehensive functional analysis of genomes, systematic gene knockout projects have been initiated in several organisms [1]. In metazoans like C. elegans, however, maternal contribution can mask the effects of gene knockouts on embryogenesis. RNA interference (RNAi) provides an alternative rapid approach to obtain loss-of-function information that can also reveal embryonic roles for the genes targeted [2,3]. We have used RNAi to analyze a random set of ovarian transcripts and have identified 81 genes with essential roles in embryogenesis. Surprisingly, none of them maps to the X chromosome. Of these 81 genes, 68 showed defects before the eight-cell stage and could be grouped into ten phenotypic classes. To archive and distribute these data we have developed a database system directly linked to the C. elegans database (WormBase). We conclude that screening cDNA libraries by RNAi is an efficient way of obtaining in vivo function for a large group of genes. Furthermore, this approach is directly applicable to other organisms sensitive to RNAi and whose genomes have not yet been sequenced.
Affiliation(s)
- F Piano
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA.
|
36
|
Reese MG, Hartzell G, Harris NL, Ohler U, Abril JF, Lewis SE. Genome annotation assessment in Drosophila melanogaster. Genome Res 2000; 10:483-501. [PMID: 10779488 PMCID: PMC310877 DOI: 10.1101/gr.10.4.483] [Citation(s) in RCA: 125] [Impact Index Per Article: 5.2] [Received: 02/09/2000] [Accepted: 02/29/2000] [Indexed: 11/24/2022]
Abstract
Computational methods for automated genome annotation are critical to our community's ability to make full use of the large volume of genomic sequence being generated and released. To explore the accuracy of these automated feature prediction tools in the genomes of higher organisms, we evaluated their performance on a large, well-characterized sequence contig from the Adh region of Drosophila melanogaster. This experiment, known as the Genome Annotation Assessment Project (GASP), was launched in May 1999. Twelve groups, applying state-of-the-art tools, contributed predictions for features including gene structure, protein homologies, promoter sites, and repeat elements. We evaluated these predictions using two standards, one based on previously unreleased high-quality full-length cDNA sequences and a second based on the set of annotations generated as part of an in-depth study of the region by a group of Drosophila experts. Although these standard sets only approximate the unknown distribution of features in this region, we believe that when taken in context the results of an evaluation based on them are meaningful. The results were presented as a tutorial at the conference on Intelligent Systems in Molecular Biology (ISMB-99) in August 1999. Over 95% of the coding nucleotides in the region were correctly identified by the majority of the gene finders, and the correct intron/exon structures were predicted for >40% of the genes. Homology-based annotation techniques recognized and associated functions with almost half of the genes in the region; the remainder were only identified by the ab initio techniques. This experiment also presents the first assessment of promoter prediction techniques for a significant number of genes in a large contiguous region. We discovered that the promoter predictors' high false-positive rates make their predictions difficult to use. 
Integrating gene finding and cDNA/EST alignments with promoter predictions decreases the number of false-positive classifications but discovers less than one-third of the promoters in the region. We believe that by establishing standards for evaluating genomic annotations and by assessing the performance of existing automated genome annotation tools, this experiment establishes a baseline that contributes to the value of ongoing large-scale annotation projects and should guide further research in genome informatics.
Affiliation(s)
- M G Reese
- Berkeley Drosophila Genome Project, Department of Molecular and Cell Biology, University of California, Berkeley 94720-3200, USA.
|
37
|
Pollet N, Schmidt HA, Gawantka V, Vingron M, Niehrs C. Axeldb: a Xenopus laevis database focusing on gene expression. Nucleic Acids Res 2000; 28:139-40. [PMID: 10592204 PMCID: PMC102398 DOI: 10.1093/nar/28.1.139] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.8] [Received: 09/08/1999] [Revised: 09/22/1999] [Accepted: 10/04/1999] [Indexed: 11/13/2022] Open
Abstract
Axeldb is a database storing and integrating gene expression patterns and DNA sequences identified in a large-scale in situ hybridization study in Xenopus laevis embryos. The data are organised in a format appropriate for comprehensive analysis, and enable comparison of images of expression pattern for any given set of genes. Information on literature, cDNA clones and their availability, nucleotide sequences, expression pattern and accompanying pictures are available. Current developments are aimed toward the interconnection with other databases and the integration of data from the literature. Axeldb is implemented using an ACEDB database system, and available through the web at http://www.dkfz-heidelberg.de/abt0135/axeldb.htm
Affiliation(s)
- N Pollet
- Department of Molecular Embryology, Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, D-69120 Heidelberg, Germany.
|
38
|
Affiliation(s)
- L D Stein
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA.
|
39
|
|