1
Sousa A, Rocha S, Vieira J, Reboiro-Jato M, López-Fernández H, Vieira CP. On the identification of potential novel therapeutic targets for spinocerebellar ataxia type 1 (SCA1) neurodegenerative disease using EvoPPI3. J Integr Bioinform 2023; 20:jib-2022-0056. [PMID: 36848492 PMCID: PMC10561075 DOI: 10.1515/jib-2022-0056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2022] [Accepted: 11/26/2022] [Indexed: 03/01/2023]
Abstract
EvoPPI (http://evoppi.i3s.up.pt), a meta-database for protein-protein interactions (PPI), has been upgraded (EvoPPI3) to accept new types of data, namely, PPI from patients, cell lines, and animal models, as well as data from gene modifier experiments, for nine neurodegenerative polyglutamine (polyQ) diseases caused by an abnormal expansion of the polyQ tract. The integration of the different types of data allows users to easily compare them, as here shown for Ataxin-1, the polyQ protein involved in spinocerebellar ataxia type 1 (SCA1) disease. Using all available datasets and the data here obtained for Drosophila melanogaster wt and exp Ataxin-1 mutants (also available at EvoPPI3), we show that, in humans, the Ataxin-1 network is much larger than previously thought (380 interactors), with at least 909 interactors. The functional profiling of the newly identified interactors is similar to the ones already reported in the main PPI databases. 16 out of 909 interactors are putative novel SCA1 therapeutic targets, and all but one are already being studied in the context of this disease. The 16 proteins are mainly involved in binding and catalytic activity (mainly kinase activity), functional features already thought to be important in the SCA1 disease.
Affiliation(s)
- André Sousa
  - Instituto de Investigação e Inovação em Saúde (I3S), Universidade do Porto, Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
- Sara Rocha
  - Instituto de Investigação e Inovação em Saúde (I3S), Universidade do Porto, Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
- Jorge Vieira
  - Instituto de Investigação e Inovação em Saúde (I3S), Universidade do Porto, Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
  - Instituto de Biologia Molecular e Celular (IBMC), Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
- Miguel Reboiro-Jato
  - Department of Computer Science, CINBIO, Universidade de Vigo, ESEI – Escuela Superior de Ingeniería Informática, 32004 Ourense, Spain
  - SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, 36213 Vigo, Spain
- Hugo López-Fernández
  - Department of Computer Science, CINBIO, Universidade de Vigo, ESEI – Escuela Superior de Ingeniería Informática, 32004 Ourense, Spain
  - SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, 36213 Vigo, Spain
- Cristina P. Vieira
  - Instituto de Investigação e Inovação em Saúde (I3S), Universidade do Porto, Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
  - Instituto de Biologia Molecular e Celular (IBMC), Rua Alfredo Allen, 208, 4200-135 Porto, Portugal
2
Wainberg M, Merico D, Keller MC, Fauman EB, Tripathy SJ. Predicting causal genes from psychiatric genome-wide association studies using high-level etiological knowledge. Mol Psychiatry 2022; 27:3095-3106. [PMID: 35411039 DOI: 10.1038/s41380-022-01542-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/01/2021] [Revised: 03/08/2022] [Accepted: 03/21/2022] [Indexed: 12/24/2022]
Abstract
Genome-wide association studies have discovered hundreds of genomic loci associated with psychiatric traits, but the causal genes underlying these associations are often unclear, a research gap that has hindered clinical translation. Here, we present a Psychiatric Omnilocus Prioritization Score (PsyOPS) derived from just three binary features encapsulating high-level assumptions about psychiatric disease etiology - namely, that causal psychiatric disease genes are likely to be mutationally constrained, be specifically expressed in the brain, and overlap with known neurodevelopmental disease genes. To our knowledge, PsyOPS is the first method specifically tailored to prioritizing causal genes at psychiatric GWAS loci. We show that, despite its extreme simplicity, PsyOPS achieves state-of-the-art performance at this task, comparable to a prior domain-agnostic approach relying on tens of thousands of features. Genes prioritized by PsyOPS are substantially more likely than other genes at the same loci to have convergent evidence of direct regulation by the GWAS variant according to both DNA looping assays and expression or splicing quantitative trait locus (QTL) maps. We provide examples of genes hundreds of kilobases away from the lead variant, like GABBR1 for schizophrenia, that are prioritized by all three of PsyOPS, DNA looping and QTLs. Our results underscore the power of incorporating high-level knowledge of trait etiology into causal gene prediction at GWAS loci, and comprise a resource for researchers interested in experimentally characterizing psychiatric gene candidates.
Affiliation(s)
- Michael Wainberg
  - Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, ON, Canada
- Daniele Merico
  - Deep Genomics Inc, Toronto, ON, Canada
  - The Centre for Applied Genomics (TCAG), The Hospital for Sick Children, Toronto, ON, Canada
- Matthew C Keller
  - Department of Psychology and Neuroscience, University of Colorado, Boulder, CO, USA
  - Institute for Behavioral Genetics, University of Colorado, Boulder, CO, USA
- Eric B Fauman
  - Internal Medicine Research Unit, Pfizer Worldwide Research, Development and Medical, Cambridge, MA, USA
- Shreejoy J Tripathy
  - Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, ON, Canada
  - Institute of Medical Sciences, University of Toronto, Toronto, ON, Canada
  - Department of Psychiatry, University of Toronto, Toronto, ON, Canada
  - Department of Physiology, University of Toronto, Toronto, ON, Canada
3
Brayton CF. Laboratory Codes in Nomenclature and Scientific Communication (Advancing Organism Nomenclature in Scientific Communication to Improve Research Reporting and Reproducibility). ILAR J 2021; 62:295-309. [PMID: 36528817 DOI: 10.1093/ilar/ilac016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 08/23/2022] [Indexed: 12/23/2022]
Abstract
Laboratory registration codes, also known as laboratory codes or lab codes, are a key element in standardized laboratory animal and genetic nomenclature. As such they are critical to accurate scientific communication and to research reproducibility and integrity. The original committee on Mouse Genetic Nomenclature published nomenclature conventions for mice genetics in 1940, and then conventions for inbred strains in 1952. Unique designations were needed, and have been in use since the 1950s, for the sources of animals and substrains, for the laboratories that identified new alleles or mutations, and then for developers of transgenes and induced mutations. Current laboratory codes are typically a 2- to 4-letter acronym for an institution or an investigator. Unique codes are assigned from the International Laboratory Code Registry, which was developed and is maintained by ILAR in the National Academies (National Academies of Sciences Engineering and Medicine and previously National Academy of Sciences). As a resource for the global research community, the registry has been online since 1997. Since 2003 mouse and rat genetic and strain nomenclature rules have been reviewed and updated annually as a joint effort of the International Committee on Standardized Genetic Nomenclature for Mice and the Rat Genome and Nomenclature Committee. The current nomenclature conventions (particularly conventions for non-inbred animals) are applicable beyond rodents, although not widely adopted. Ongoing recognition, since at least the 1930s, of the research relevance of genetic backgrounds and origins of animals, and of spontaneous and induced genetic variants speaks to the need for broader application of standardized nomenclature for animals in research, particularly given the increasing numbers and complexities of genetically modified swine, nonhuman primates, fish, and other species.
Affiliation(s)
- Cory F Brayton
  - Johns Hopkins Medicine, Molecular and Comparative Pathobiology, Baltimore, Maryland, USA
4
5
Germ-Free Swiss Webster Mice on a High-Fat Diet Develop Obesity, Hyperglycemia, and Dyslipidemia. Microorganisms 2020; 8:microorganisms8040520. [PMID: 32260528 PMCID: PMC7232377 DOI: 10.3390/microorganisms8040520] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 02/08/2020] [Revised: 03/23/2020] [Accepted: 04/03/2020] [Indexed: 12/14/2022]
Abstract
A calorie-dense diet is a well-established risk factor for obesity and metabolic syndrome (MetS), whereas the role of the intestinal microbiota (IMB) in the development of diet-induced obesity (DIO) is not completely understood. To test the hypothesis that Swiss Webster (Tac:SW) mice can develop characteristics of DIO and MetS in the absence of the IMB, we fed conventional (CV) and germ-free (GF) male Tac:SW mice either a low-fat diet (LFD; 10% fat derived calories) or a high-fat diet (HFD; 60% fat derived calories) for 10 weeks. The HFD increased feed conversion and body weight in GF mice independent of the increase associated with the microbiota in CV mice. In contrast to CV mice, GF mice did not decrease feed intake on the HFD and possessed heavier fat pads. The HFD caused hyperglycemia, hyperinsulinemia, and impaired glucose absorption in GF mice independent of the increase associated with the microbiota in CV mice. A HFD also elevated plasma LDL-cholesterol and increased hepatic triacylglycerol, free fatty acids, and ceramides in all mice, whereas hypertriglyceridemia and increased hepatic medium and long-chain acylcarnitines were only observed in CV mice. Therefore, GF male Tac:SW mice developed several detrimental effects of obesity and MetS from a high-fat, calorie dense diet.
6
Dickson PE, Roy TA, McNaughton KA, Wilcox TD, Kumar P, Chesler EJ. Systems genetics of sensation seeking. Genes, Brain and Behavior 2018; 18:e12519. [PMID: 30221471 PMCID: PMC6399063 DOI: 10.1111/gbb.12519] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 05/20/2018] [Revised: 09/09/2018] [Accepted: 09/11/2018] [Indexed: 02/06/2023]
Abstract
Sensation seeking is a multifaceted, heritable trait which predicts the development of substance use and abuse in humans; similar phenomena have been observed in rodents. Genetic correlations among sensation seeking and substance use indicate shared biological mechanisms, but the genes and networks underlying these relationships remain elusive. Here, we used a systems genetics approach in the BXD recombinant inbred mouse panel to identify shared genetic mechanisms underlying substance use and preference for sensory stimuli, an intermediate phenotype of sensation seeking. Using the operant sensation seeking (OSS) paradigm, we quantified preference for sensory stimuli in 120 male and 127 female mice from 62 BXD strains and the C57BL/6J and DBA/2J founder strains. We used relative preference for the active and inactive levers to dissociate preference for sensory stimuli from locomotion and exploration phenotypes. We identified genomic regions on chromosome 4 (155.236‐155.742 Mb) and chromosome 13 (72.969‐89.423 Mb) associated with distinct behavioral components of OSS. Using publicly available behavioral data and mRNA expression data from brain regions involved in reward processing, we identified (a) genes within these behavioral QTL exhibiting genome‐wide significant cis‐eQTL and (b) genetic correlations among OSS phenotypes, ethanol phenotypes and mRNA expression. From these analyses, we nominated positional candidates for behavioral QTL associated with distinct OSS phenotypes including Gnb1 and Mef2c. Genetic covariation of Gnb1 expression, preference for sensory stimuli and multiple ethanol phenotypes suggest that heritable variation in Gnb1 expression in reward circuitry partially underlies the widely reported relationship between sensation seeking and substance use.
Affiliation(s)
- Price E. Dickson
  - Center for Systems Neurogenetics of Addiction, The Jackson Laboratory, Bar Harbor, Maine
- Tyler A. Roy
  - Center for Systems Neurogenetics of Addiction, The Jackson Laboratory, Bar Harbor, Maine
- Troy D. Wilcox
  - Center for Systems Neurogenetics of Addiction, The Jackson Laboratory, Bar Harbor, Maine
- Padam Kumar
  - Center for Systems Neurogenetics of Addiction, The Jackson Laboratory, Bar Harbor, Maine
- Elissa J. Chesler
  - Center for Systems Neurogenetics of Addiction, The Jackson Laboratory, Bar Harbor, Maine
7
Singh NK, Ernst M, Liebscher V, Fuellen G, Taher L. Revealing complex function, process and pathway interactions with high-throughput expression and biological annotation data. Molecular BioSystems 2016; 12:3196-208. [PMID: 27507577 DOI: 10.1039/c6mb00280c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
The biological relationships both between and within the functions, processes and pathways that operate within complex biological systems are only poorly characterized, making the interpretation of large scale gene expression datasets extremely challenging. Here, we present an approach that integrates gene expression and biological annotation data to identify and describe the interactions between biological functions, processes and pathways that govern a phenotype of interest. The product is a global, interconnected network, not of genes but of functions, processes and pathways, that represents the biological relationships within the system. We validated our approach on two high-throughput expression datasets describing organismal and organ development. Our findings are well supported by the available literature, confirming that developmental processes and apoptosis play key roles in cell differentiation. Furthermore, our results suggest that processes related to pluripotency and lineage commitment, which are known to be critical for development, interact mainly indirectly, through genes implicated in more general biological processes. Moreover, we provide evidence that supports the relevance of cell spatial organization in the developing liver for proper liver function. Our strategy can be viewed as an abstraction that is useful to interpret high-throughput data and devise further experiments.
Affiliation(s)
- Nitesh Kumar Singh
  - Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center, Ernst-Heydemann-Str. 8, 18057 Rostock, Germany
8
Dolan ME, Baldarelli RM, Bello SM, Ni L, McAndrews MS, Bult CJ, Kadin JA, Richardson JE, Ringwald M, Eppig JT, Blake JA. Orthology for comparative genomics in the mouse genome database. Mamm Genome 2015. [PMID: 26223881 PMCID: PMC4534493 DOI: 10.1007/s00335-015-9588-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 10/27/2022]
Abstract
The mouse genome database (MGD) is the model organism database component of the mouse genome informatics system at The Jackson Laboratory. MGD is the international data resource for the laboratory mouse and facilitates the use of mice in the study of human health and disease. Since its beginnings, MGD has included comparative genomics data with a particular focus on human-mouse orthology, an essential component of the use of mouse as a model organism. Over the past 25 years, novel algorithms and addition of orthologs from other model organisms have enriched comparative genomics in MGD data, extending the use of orthology data to support the laboratory mouse as a model of human biology. Here, we describe current comparative data in MGD and review the history and refinement of orthology representation in this resource.
Affiliation(s)
- Mary E Dolan
  - The Jackson Laboratory, Bar Harbor, ME, 04609, USA
9
Liu H, Zhu R, Lv J, He H, Yang L, Huang Z, Su J, Zhang Y, Yu S, Wu Q. DevMouse, the mouse developmental methylome database and analysis tools. Database: The Journal of Biological Databases and Curation 2014; 2014:bat084. [PMID: 24408217 PMCID: PMC3885893 DOI: 10.1093/database/bat084] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/15/2022]
Abstract
DNA methylation undergoes dynamic changes during mouse development and plays crucial roles in embryogenesis, cell-lineage determination and genomic imprinting. Bisulfite sequencing enables profiling of mouse developmental methylomes on an unprecedented scale; however, integrating and mining these data are challenges for experimental biologists. Therefore, we developed DevMouse, which focuses on the efficient storage of DNA methylomes in temporal order and quantitative analysis of methylation dynamics during mouse development. The latest release of DevMouse incorporates 32 normalized and temporally ordered methylomes across 15 developmental stages and related genome information. A flexible query engine is developed for acquisition of methylation profiles for genes, microRNAs, long non-coding RNAs and genomic intervals of interest across selected developmental stages. To facilitate in-depth mining of these profiles, DevMouse offers online analysis tools for the quantification of methylation variation, identification of differentially methylated genes, hierarchical clustering, gene function annotation and enrichment. Moreover, a configurable MethyBrowser is provided to view the base-resolution methylomes under a genomic context. In brief, DevMouse hosts comprehensive mouse developmental methylome data and provides online tools to explore the relationships of DNA methylation and development. Database URL: http://www.devmouse.org/
Affiliation(s)
- Hongbo Liu
  - Department of Developmental Biology, School of Life Science and Technology, State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin 150001, China, Department of Computational Systems Biology, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China, Department of Food Science, School of Food Science and Engineering, Harbin Institute of Technology, Harbin 150001, China and Department of Respiratory Medicine, the First Affiliated Hospital of Harbin Medical University, Harbin 150001, China
10
Oberlin AT, Jurkovic DA, Balish MF, Friedberg I. Biological database of images and genomes: tools for community annotations linking image and genomic information. Database: The Journal of Biological Databases and Curation 2013; 2013:bat016. [PMID: 23550062 PMCID: PMC3708683 DOI: 10.1093/database/bat016] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/24/2023]
Abstract
Genomic data and biomedical imaging data are undergoing exponential growth. However, our understanding of the phenotype-genotype connection linking the two types of data is lagging behind. While there are many types of software that enable the manipulation and analysis of image data and genomic data as separate entities, there is no framework established for linking the two. We present a generic set of software tools, BioDIG, that allows linking of image data to genomic data. BioDIG tools can be applied to a wide range of research problems that require linking images to genomes. BioDIG features the following: rapid construction of web-based workbenches, community-based annotation, user management and web services. By using BioDIG to create websites, researchers and curators can rapidly annotate a large number of images with genomic information. Here we present the BioDIG software tools that include an image module, a genome module and a user management module. We also introduce a BioDIG-based website, MyDIG, which is being used to annotate images of mycoplasmas.
Affiliation(s)
- Andrew T Oberlin
  - Department of Computer Science and Software Engineering, Miami University, Oxford, OH 45056, USA
11
Ackert-Bicknell C, Paigen B, Korstanje R. Recalculation of 23 mouse HDL QTL datasets improves accuracy and allows for better candidate gene analysis. J Lipid Res 2013; 54:984-94. [PMID: 23393305 DOI: 10.1194/jlr.m033035] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/28/2022]
Abstract
In the past 15 years, the quantitative trait locus (QTL) mapping approach has been applied to crosses between different inbred mouse strains to identify genetic loci associated with plasma HDL cholesterol levels. Although successful, a disadvantage of this method is low mapping resolution, as often several hundred candidate genes fall within the confidence interval for each locus. Methods have been developed to narrow these loci by combining the data from the different crosses, but they rely on the accurate mapping of the QTL and the treatment of the data in a consistent manner. We collected 23 raw datasets used for the mapping of previously published HDL QTL and reanalyzed the data from each cross using a consistent method and the latest mouse genetic map. By utilizing this approach, we identified novel QTL and QTL that were mapped to the wrong part of chromosomes. Our new HDL QTL map allows for reliable combining of QTL data and candidate gene analysis, which we demonstrate by identifying Grin3a and Etv6, as candidate genes for QTL on chromosomes 4 and 6, respectively. In addition, we were able to narrow a QTL on Chr 19 to five candidates.
12
Ackert-Bicknell CL, Karasik D, Li Q, Smith RV, Hsu YH, Churchill GA, Paigen BJ, Tsaih SW. Mouse BMD quantitative trait loci show improved concordance with human genome-wide association loci when recalculated on a new, common mouse genetic map. J Bone Miner Res 2010; 25:1808-20. [PMID: 20200990 PMCID: PMC3153351 DOI: 10.1002/jbmr.72] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Indexed: 11/26/2022]
Abstract
Bone mineral density (BMD) is a heritable trait, and in mice, over 100 quantitative trait loci (QTLs) have been reported, but candidate genes have been identified for only a small percentage. Persistent errors in the mouse genetic map have negatively affected QTL localization, spurring the development of a new, corrected map. In this study, QTLs for BMD were remapped in 11 archival mouse data sets using this new genetic map. Since these QTLs all were mapped in a comparable way, direct comparisons of QTLs for concordance would be valid. We then compared human genome-wide association study (GWAS) BMD loci with the mouse QTLs. We found that 26 of the 28 human GWAS loci examined were located within the confidence interval of a mouse QTL. Furthermore, 14 of the GWAS loci mapped to within 3 cM of a mouse QTL peak. Lastly, we demonstrated that these newly remapped mouse QTLs can substantiate a candidate gene for a human GWAS locus, for which the peak single-nucleotide polymorphism (SNP) fell in an intergenic region. Specifically, we suggest that MEF2C (human chromosome 5, mouse chromosome 13) should be considered a candidate gene for the genetic regulation of BMD. In conclusion, use of the new mouse genetic map has improved the localization of mouse BMD QTLs, and these remapped QTLs show high concordance with human GWAS loci. We believe that this is an opportune time for a renewed effort by the genetics community to identify the causal variants regulating BMD using a synergistic mouse-human approach.
13
The ANISEED database: digital representation, formalization, and elucidation of a chordate developmental program. Genome Res 2010; 20:1459-68. [PMID: 20647237 DOI: 10.1101/gr.108175.110] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.0] [Indexed: 01/01/2023]
Abstract
Developmental biology aims to understand how the dynamics of embryonic shapes and organ functions are encoded in linear DNA molecules. Thanks to recent progress in genomics and imaging technologies, systemic approaches are now used in parallel with small-scale studies to establish links between genomic information and phenotypes, often described at the subcellular level. Current model organism databases, however, do not integrate heterogeneous data sets at different scales into a global view of the developmental program. Here, we present a novel, generic digital system, NISEED, and its implementation, ANISEED, to ascidians, which are invertebrate chordates suitable for developmental systems biology approaches. ANISEED hosts an unprecedented combination of anatomical and molecular data on ascidian development. This includes the first detailed anatomical ontologies for these embryos, and quantitative geometrical descriptions of developing cells obtained from reconstructed three-dimensional (3D) embryos up to the gastrula stages. Fully annotated gene model sets are linked to 30,000 high-resolution spatial gene expression patterns in wild-type and experimentally manipulated conditions and to 528 experimentally validated cis-regulatory regions imported from specialized databases or extracted from 160 literature articles. This highly structured data set can be explored via a Developmental Browser, a Genome Browser, and a 3D Virtual Embryo module. We show how integration of heterogeneous data in ANISEED can provide a system-level understanding of the developmental program through the automatic inference of gene regulatory interactions, the identification of inducing signals, and the discovery and explanation of novel asymmetric divisions.
14
Buchkremer S, Hendel J, Krupp M, Weinmann A, Schlamp K, Maass T, Staib F, Galle PR, Teufel A. Library of molecular associations: curating the complex molecular basis of liver diseases. BMC Genomics 2010; 11:189. [PMID: 20302666 PMCID: PMC2851601 DOI: 10.1186/1471-2164-11-189] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 08/20/2009] [Accepted: 03/20/2010] [Indexed: 01/17/2023]
Abstract
Background: Systems biology approaches offer novel insights into the development of chronic liver diseases. Current genomic databases supporting systems biology analyses are mostly based on microarray data. Although these data often cover genome wide expression, the validity of single microarray experiments remains questionable. However, for systems biology approaches addressing the interactions of molecular networks, comprehensive but also highly validated data are necessary. Results: We have therefore generated the first comprehensive database for published molecular associations in human liver diseases. It is based on PubMed published abstracts and aimed to close the gap between genome wide coverage of low validity from microarray data and individual highly validated data from PubMed. After an initial text mining process, the extracted abstracts were all manually validated to confirm content and potential genetic associations and may therefore be highly trusted. All data were stored in a publicly available database, Library of Molecular Associations (http://www.medicalgenomics.org/databases/loma/news), currently holding approximately 1260 confirmed molecular associations for chronic liver diseases such as HCC, CCC, liver fibrosis, NASH/fatty liver disease, AIH, PBC, and PSC. We furthermore transformed these data into a powerful resource for molecular liver research by connecting them to multiple biomedical information resources. Conclusion: Together, this database is the first available database providing a comprehensive view and analysis options for published molecular associations on multiple liver diseases.
Affiliation(s)
- Stefan Buchkremer
  - Department of Medicine I, Johannes Gutenberg University, Mainz, Germany
15
Beisvag V, Jünge FKR, Bergum H, Jølsum L, Lydersen S, Günther CC, Ramampiaro H, Langaas M, Sandvik AK, Lægreid A. GeneTools--application for functional annotation and statistical hypothesis testing. BMC Bioinformatics 2006; 7:470. [PMID: 17062145 PMCID: PMC1630634 DOI: 10.1186/1471-2105-7-470] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.6] [Received: 07/19/2006] [Accepted: 10/24/2006] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Modern biology has shifted from "one gene" approaches to methods for genomic-scale analysis like microarray technology, which allow simultaneous measurement of thousands of genes. This has created a need for tools facilitating interpretation of biological data in "batch" mode. However, such tools often leave the investigator with large volumes of apparently unorganized information. To meet this interpretation challenge, gene-set, or cluster, testing has become a popular analytical tool. Many gene-set testing methods and software packages are now available, most of which use a variety of statistical tests to assess the genes in a set for biological information. However, the field is still evolving, and there is a great need for "integrated" solutions. RESULTS: GeneTools is a web-service providing access to a database that brings together information from a broad range of resources. The annotation data are updated weekly, guaranteeing that users get the most recently available data. Data submitted by the user are stored in the database, where they can easily be updated, shared between users and exported in various formats. GeneTools provides three different tools: i) NMC Annotation Tool, which offers annotations from several databases like UniGene, Entrez Gene, SwissProt and GeneOntology, in both single- and batch search mode. ii) GO Annotator Tool, where users can add new gene ontology (GO) annotations to genes of interest. These user defined GO annotations can be used in further analysis or exported for public distribution. iii) eGOn, a tool for visualization and statistical hypothesis testing of GO category representation. As the first GO tool, eGOn supports hypothesis testing for three different situations (master-target situation, mutually exclusive target-target situation and intersecting target-target situation). An important additional function is an evidence-code filter that allows users to select the GO annotations for the analysis.
CONCLUSION: GeneTools is the first "all in one" annotation tool, providing users with a rapid extraction of highly relevant gene annotation data for e.g. thousands of genes or clones at once. It allows a user to define and archive new GO annotations and it supports hypothesis testing related to GO category representations. GeneTools is freely available through www.genetools.no.
Affiliation(s)
- Vidar Beisvag
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
- Frode KR Jünge
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
- Hallgeir Bergum
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
- Lars Jølsum
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
- Stian Lydersen
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
- Clara-Cecilie Günther
- Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim, Norway
- Heri Ramampiaro
- Department of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, Norway
- Mette Langaas
- Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim, Norway
- Arne K Sandvik
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
- Department of Medicine, St. Olav's University Hospital, Trondheim, Norway
- Astrid Lægreid
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
|
16
|
Schimmer BP, Cordova M, Cheng H, Tsao A, Goryachev AB, Schimmer AD, Morris Q. Global profiles of gene expression induced by adrenocorticotropin in Y1 mouse adrenal cells. Endocrinology 2006; 147:2357-67. [PMID: 16484322 DOI: 10.1210/en.2005-1526] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Indexed: 12/29/2022]
Abstract
ACTH regulates the steroidogenic capacity, size, and structural integrity of the adrenal cortex through a series of actions involving changes in gene expression; however, only a limited number of ACTH-regulated genes have been identified, and these only partly account for the global effects of ACTH on the adrenal cortex. In this study, a National Institute on Aging 15K mouse cDNA microarray was used to identify genome-wide changes in gene expression after treatment of Y1 mouse adrenocortical cells with ACTH. ACTH affected the levels of 1275 annotated transcripts, of which 46% were up-regulated. The up-regulated transcripts were enriched for functions associated with steroid biosynthesis and metabolism; the down-regulated transcripts were enriched for functions associated with cell proliferation, nuclear transport and RNA processing, including alternative splicing. A total of 133 different transcripts, i.e. only 10% of the ACTH-affected transcripts, were represented in the categories above; most of these had not been described as ACTH-regulated previously. The contributions of protein kinase A and protein kinase C to these genome-wide effects of ACTH were evaluated in microarray experiments after treatment of Y1 cells and derivative protein kinase A-defective mutants with pharmacological probes of each pathway. Protein kinase A-dependent signaling accounted for 56% of the ACTH effect; protein kinase C-dependent signaling accounted for an additional 6%. These results indicate that ACTH affects the expression profile of Y1 adrenal cells principally through cAMP- and protein kinase A-dependent signaling. The large number of transcripts affected by ACTH anticipates a broader range of actions than previously appreciated.
Affiliation(s)
- Bernard P Schimmer
- Banting and Best Department of Medical Research, University of Toronto, Ontario, Canada.
|
17
|
Masseroli M, Martucci D, Pinciroli F. GFINDer: Genome Function INtegrated Discoverer through dynamic annotation, statistical analysis, and mining. Nucleic Acids Res 2004; 32:W293-300. [PMID: 15215397 PMCID: PMC441570 DOI: 10.1093/nar/gkh432] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.0] [Indexed: 11/13/2022]
Abstract
Statistical and clustering analyses of gene expression results from high-density microarray experiments produce lists of hundreds of genes regulated differentially, or with particular expression profiles, in the conditions under study. Independent of the microarray platforms and analysis methods used, these lists must be biologically interpreted to gain a better knowledge of the patho-physiological phenomena involved. To this end, numerous biological annotations are available within heterogeneous and widely distributed databases. Although several tools have been developed for annotating lists of genes, most of them do not give methods for evaluating the relevance of the annotations provided, or for estimating the functional bias introduced by the gene set on the array used to identify the gene list considered. We developed Genome Function INtegrated Discoverer (GFINDer), a web server able to automatically provide large-scale lists of user-classified genes with functional profiles biologically characterizing the different gene classes in the list. GFINDer automatically retrieves annotations of several functional categories from different sources, identifies the categories enriched in each class of a user-classified gene list and calculates statistical significance values for each category. Moreover, GFINDer enables the functional classification of genes according to mined functional categories and the statistical analysis of the classifications obtained, aiding better interpretation of microarray experiment results. GFINDer is available online at http://www.medinfopoli.polimi.it/GFINDer/.
Affiliation(s)
- Marco Masseroli
- Bioengineering Department, Politecnico di Milano, I-20133 Milano, Italy.
|
18
|
Zhang HG, Hsu HC, Yang PA, Yang X, Wu Q, Liu Z, Yi N, Mountz JD. Identification of multiple genetic loci that regulate adenovirus gene therapy. Gene Ther 2003; 11:4-14. [PMID: 14681692 DOI: 10.1038/sj.gt.3302136] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
Abstract
A key aspect of the immune response to adenovirus (Ad) gene therapy is the generation of a cytotoxic T-cell (CTL) response. To better understand the genetic network underlying these events, 20 strains of C57BL/6 x DBA/2 (BXD) recombinant inbred (RI) mice were administered AdLacZ and analyzed at days 7, 21, 30, and 50 for liver beta-galactosidase (LacZ) expression and CTL response. Serum levels of interferon gamma (IFN-gamma), tumor necrosis factor-alpha (TNF-alpha), and interleukin-6 (IL-6) were analyzed at different times after AdLacZ. There was a distinct strain-dependent expression of LacZ, which was strongly correlated with the CTL response. Among the five BXD RI strains that exhibited significantly prolonged LacZ expression, four also exhibited a marked defect in the production of Ad-specific CTL. There was a strong correlation between the serum levels of IFN-gamma, TNF-alpha, and IL-6, but cytokine responses were not significantly correlated with LacZ expression or the CTL response. Quantitative trait loci regulating LacZ on day 30 were found on chromosome (Chr) 19 (33 cM) and Chr 15 (42.8 cM). Cytotoxicity mapped to Chr 7 (41.0 and 57.4-65.2 cM), Chr 15 (61.7 cM), and Chr X (27.8 cM). IFN-gamma production mapped to Chr 18 (22, 27, and 32 cM) and Chr 11 (64.0 cM). TNF-alpha and IL-6 production mapped to Chr 6 (91.5 cM), Chr 9 (42.0 cM), and Chr 8 (52 and 73.0 cM). These results indicate that different strains of mice exhibit different pathways for effective clearance of AdLacZ depending on genetic polymorphisms and interactions at multiple genetic loci.
Affiliation(s)
- H-G Zhang
- Department of Medicine, Division of Clinical Immunology and Rheumatology, The University of Alabama at Birmingham, Birmingham, AL 35294, USA
|
19
|
Blake JA, Eppig JT, Richardson JE, Bult CJ, Kadin JA. The Mouse Genome Database (MGD): integration nexus for the laboratory mouse. Nucleic Acids Res 2001; 29:91-4. [PMID: 11125058 PMCID: PMC29788 DOI: 10.1093/nar/29.1.91] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.3] [Indexed: 11/15/2022]
Abstract
The Mouse Genome Database (MGD) is the community database resource for the laboratory mouse, a key model organism for interpreting the human genome and for understanding human biology and disease (http://www.informatics.jax.org). MGD provides standard nomenclature and consensus map positions for mouse genes and genetic markers; it provides a curated set of mammalian homology records, user-defined chromosomal maps, experimental data sets and the definitive mouse 'gene to sequence' reference set for the research community. The integration and standardization of these data sets facilitates the transition between mouse DNA sequence, gene and phenotype annotations. A recent focus on allele and phenotype representations enhances the ability of MGD to organize and present data for exploring the relationship between genotype and phenotype. This link between the genome and the biology of the mouse is especially important as phenotype information grows from large mutagenesis projects and genotype information grows from large-scale sequencing projects.
Affiliation(s)
- J A Blake
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609 USA.
|
20
|
Roth W, Deussing J, Botchkarev VA, Pauly-Evers M, Saftig P, Hafner A, Schmidt P, Schmahl W, Scherer J, Anton-Lamprecht I, Von Figura K, Paus R, Peters C. Cathepsin L deficiency as molecular defect of furless: hyperproliferation of keratinocytes and pertubation of hair follicle cycling. FASEB J 2000; 14:2075-86. [PMID: 11023992 DOI: 10.1096/fj.99-0970com] [Citation(s) in RCA: 259] [Impact Index Per Article: 10.4] [Indexed: 01/08/2023]
Abstract
Lysosomal cysteine proteinases of the papain family are involved in lysosomal bulk proteolysis, major histocompatibility complex class II mediated antigen presentation, prohormone processing, and extracellular matrix remodeling. Cathepsin L (CTSL) is a ubiquitously expressed major representative of the papain-like family of cysteine proteinases. To investigate CTSL in vivo functions, the gene was inactivated by gene targeting in embryonic stem cells. CTSL-deficient mice develop periodic hair loss and epidermal hyperplasia, acanthosis, and hyperkeratosis. The hair loss is due to alterations of hair follicle morphogenesis and cycling, dilatation of hair follicle canals, and disturbed club hair formation. Hyperproliferation of hair follicle epithelial cells and basal epidermal keratinocytes (both of ectodermal origin) are the primary characteristics underlying the mutant phenotype. Pathological inflammatory responses have been excluded as a putative cause of the skin and hair disorder. The phenotype of CTSL-deficient mice is reminiscent of the spontaneous mouse mutant furless (fs). Analyses of the ctsl gene of fs mice revealed a G149R mutation inactivating the proteinase activity. CTSL is the first lysosomal proteinase shown to be essential for epidermal homeostasis and regular hair follicle morphogenesis and cycling.
Affiliation(s)
- W Roth
- Institut für Molekulare Medizin und Zellforschung, Albert Ludwigs Universität Freiburg, 79106 Freiburg, Germany
|
21
|
Nocentini G, Bartoli A, Ronchetti S, Giunchi L, Cupelli A, Delfino D, Migliorati G, Riccardi C. Gene structure and chromosomal assignment of mouse GITR, a member of the tumor necrosis factor/nerve growth factor receptor family. DNA Cell Biol 2000; 19:205-17. [PMID: 10798444 DOI: 10.1089/104454900314474] [Citation(s) in RCA: 23] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022]
Abstract
GITR is a type I transmembrane protein that belongs to the tumor necrosis factor/nerve growth factor receptor (TNF/NGFR) family. This receptor is preferentially expressed in activated T lymphocytes and may function as a signaling molecule during T-cell development. In the present study, we examined the genomic organization of the entire mouse GITR (mGITR) gene. The gene spans a 2543-bp region and consists of five exons (with a length ranging from 88 bp to 395 bp) and four introns (67 bp to 778 bp). In agreement with GITR expression in activated T cells, consensus elements for transcription factors involved in T-cell development and activation were identified in the 5' flanking region, including a consensus element for NF-kappaB. Two highly significant binding sites for MyoD and one binding site for myogenin were also found, suggesting involvement of GITR in muscle development. The mGITR gene contains 17 transcription initiation sites distributed over a 76-bp region, all used with the same frequency. We localized mGITR to murine chromosome 4 (E region), where four other TNF/NGFR members localize, including m4-1BB and mOX40. These results further indicate that GITR shares several features with OX40, 4-1BB, and CD27, suggesting the existence of a new subfamily of the TNFR family, as also confirmed by the similarity of their cytoplasmic domains.
Affiliation(s)
- G Nocentini
- Department of Clinical and Experimental Medicine, Perugia University Medical School, Italy
|
22
|
Pasteris NG, Nagata K, Hall A, Gorski JL. Isolation, characterization, and mapping of the mouse Fgd3 gene, a new Faciogenital Dysplasia (FGD1; Aarskog Syndrome) gene homologue. Gene 2000; 242:237-47. [PMID: 10721717 DOI: 10.1016/s0378-1119(99)00518-1] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.3] [Indexed: 10/18/2022]
Abstract
FGD1 gene mutations result in faciogenital dysplasia (FGDY, Aarskog syndrome), an X-linked developmental disorder that adversely affects the formation of multiple skeletal structures. FGD1 encodes a guanine nucleotide exchange factor (GEF) that specifically activates the Rho GTPase Cdc42. By way of Cdc42, FGD1 regulates the actin cytoskeleton and activates the c-Jun N-terminal kinase signaling cascade to regulate cell growth and differentiation. Previous work shows that FGD1 is the founding member of a family of related genes including the mouse Fgd2 gene and the rat Frabin gene. Here, we report on the isolation, characterization, and mapping of the mouse Fgd3 gene, a novel member of the FGD1 gene family. Fgd3 cDNA encodes a 733-amino-acid protein with a predicted mass of 81 kDa. Fgd3 and FGD1 share a high degree of sequence identity that spans >560 contiguous amino acid residues. Like FGD1, Fgd3 contains adjacent RhoGEF and pleckstrin homology (PH) domains, a second carboxy-terminal PH domain, and a distinctive FYVE domain. Together, these domains appear to form a canonical core structure for FGD1 family members. In addition, compared to other FGD1 family members, Fgd3 contains different structural regions that may be involved in distinct signaling interactions. Microinjection studies show that Fgd3 stimulates fibroblasts to form filopodia, actin microspikes formed upon the stimulation of Cdc42. Fgd3 transcripts are present in several diverse tissues and during mouse embryogenesis, suggesting a developmentally regulated pattern of expression and a potential role in embryonic development. Genetic linkage and radiation hybrid mapping data show that Fgd3 and the human FGD3 ortholog map to syntenic regions of murine chromosome 13 and human chromosome 9q22, respectively. We conclude that Fgd3 is a novel member of the FGD1 family of RhoGEF proteins.
MESH Headings
- 3T3 Cells
- Abnormalities, Multiple/genetics
- Amino Acid Sequence
- Animals
- Base Sequence
- Blotting, Northern
- Chromosomes/genetics
- Chromosomes, Human, Pair 9/genetics
- DNA, Complementary/chemistry
- DNA, Complementary/genetics
- DNA, Complementary/isolation & purification
- Facial Bones/abnormalities
- Gene Expression Regulation, Developmental
- Guanine Nucleotide Exchange Factors/genetics
- Guanine Nucleotide Exchange Factors/physiology
- Humans
- Male
- Mice
- Mice, Inbred C57BL
- Molecular Sequence Data
- Muridae
- Proteins/genetics
- RNA, Messenger/genetics
- RNA, Messenger/metabolism
- Rho Guanine Nucleotide Exchange Factors
- Sequence Alignment
- Sequence Analysis, DNA
- Sequence Homology, Amino Acid
- Tissue Distribution
- Urogenital Abnormalities/genetics
- cdc42 GTP-Binding Protein/metabolism
Affiliation(s)
- N G Pasteris
- Department of Pediatrics, University of Michigan Medical School, Ann Arbor 48109-0688, USA
|
23
|
Blake JA, Eppig JT, Richardson JE, Davisson MT. The Mouse Genome Database (MGD): expanding genetic and genomic resources for the laboratory mouse. The Mouse Genome Database Group. Nucleic Acids Res 2000; 28:108-11. [PMID: 10592195 PMCID: PMC102449 DOI: 10.1093/nar/28.1.108] [Citation(s) in RCA: 83] [Impact Index Per Article: 3.3] [Received: 10/01/1999] [Accepted: 10/07/1999] [Indexed: 11/14/2022]
Abstract
The Mouse Genome Database (MGD) is a comprehensive public database of mouse genomic, genetic and phenotypic information (http://www.informatics.jax.org). This community database provides information about genes, serves as a mapping resource of the mouse genome, details mammalian orthologs, integrates experimental data, represents standardized mouse nomenclature for genes and alleles, incorporates links to other genomic resources such as sequence data, and includes a variety of additional information about the laboratory mouse. MGD scientists and annotators work cooperatively with the research community to provide an integrated, consensus view of the mouse genome while also providing experimental data including data conflicting with the consensus representation. Recent improvements focus on the representation of phenotypic information and the enhancement of gene and allele descriptions.
Affiliation(s)
- J A Blake
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA.
|
24
|
Tisljar K, Deussing J, Peters C. Cathepsin J, a novel murine cysteine protease of the papain family with a placenta-restricted expression. FEBS Lett 1999; 459:299-304. [PMID: 10526153 DOI: 10.1016/s0014-5793(99)01263-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.7] [Indexed: 10/27/2022]
Abstract
A novel mouse cysteine protease of the papain family was identified by searching the dbEST database. A 1.28 kb full-length cDNA was obtained which contains an open reading frame of 999 nucleotides and encodes a predicted polypeptide of 333 amino acids. The deduced polypeptide exhibits features characteristic of cysteine proteases of the papain type including the highly conserved residues of the catalytic triad, and was hence named cathepsin J. Cathepsin J represents the murine homologue of a previously described rat cathepsin L-related protein. Mature cathepsin J shows 59.3% identity to mouse cathepsin L and contains the characteristic ER(F/W)NIN motif within the propeptide indicating that this protease belongs to the subgroup of cathepsin L-like cysteine proteases. Northern blot analysis of various tissues revealed a placenta-restricted expression. This expression pattern may suggest a role of cathepsin J in embryo implantation and/or placental function. Ctsj was mapped to mouse chromosome 13 in the vicinity of cathepsin L suggesting that cathepsin J may have arisen by gene duplication from cathepsin L or a common ancestral gene.
Affiliation(s)
- K Tisljar
- Medizinische Molekularbiologie, Abteilung Hämatologie-Onkologie, Klinikum der Albert-Ludwigs-Universität Freiburg, Hugstetter Strasse 55, 79106, Freiburg, Germany
|
25
|
Pasteris NG, Gorski JL. Isolation, characterization, and mapping of the mouse and human Fgd2 genes, faciogenital dysplasia (FGD1; Aarskog syndrome) gene homologues. Genomics 1999; 60:57-66. [PMID: 10458911 DOI: 10.1006/geno.1999.5903] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.8] [Indexed: 11/22/2022]
Abstract
FGD1 encodes a guanine nucleotide exchange factor (GEF) that specifically activates the Rho GTPase Cdc42. FGD1 gene mutations result in faciogenital dysplasia (FGDY, Aarskog syndrome), an X-linked developmental disorder that adversely affects the formation of multiple skeletal structures. Database searches show that the Caenorhabditis elegans genome contains an FGD1 homologue. Since C. elegans genes often have multiple vertebrate homologues, we hypothesized the existence of multiple mammalian FGD1-related sequences. Here we report the use of degenerate PCR to isolate and characterize the mouse and human Fgd2 genes, new members of the FGD1 gene family. Fgd2 cDNA encodes a 727-amino-acid protein with a predicted mass of 82 kDa. Fgd2 and FGD1 share a high degree of sequence identity that spans >560 contiguous amino acid residues. Fgd2, like FGD1, contains adjacent RhoGEF and PH domains, a second carboxy-terminal PH domain, and a distinctive FYVE domain. Genomic PCR studies indicate some degree of conserved gene structure between Fgd2 and FGD1. Fgd2 transcripts are present in several diverse tissues and during mouse embryogenesis, suggesting a role in embryonic development. Genetic linkage and radiation hybrid mapping data show that Fgd2 and the human FGD2 ortholog map to syntenic regions of murine chromosome 17 and human chromosome 6p21.2, respectively. The observation that all FGD1 gene family members contain equivalent signaling domains and a conserved structural organization strongly suggests that these signaling domains form a canonical core structure for members of the FGD1 family of RhoGEF proteins.
MESH Headings
- Abnormalities, Multiple/genetics
- Amino Acid Sequence
- Animals
- Base Sequence
- Blotting, Northern
- Chromosome Mapping
- Chromosomes/genetics
- Chromosomes, Human, Pair 6/genetics
- Cloning, Molecular
- DNA Primers
- DNA, Complementary/chemistry
- DNA, Complementary/genetics
- DNA, Complementary/isolation & purification
- Facial Bones/abnormalities
- Facial Bones/metabolism
- GTP-Binding Proteins/genetics
- Guanine Nucleotide Exchange Factors
- Humans
- Mice
- Mice, Inbred C57BL
- Mice, Inbred Strains
- Molecular Sequence Data
- Muridae
- Polymerase Chain Reaction
- Proteins/genetics
- RNA, Messenger/genetics
- RNA, Messenger/metabolism
- Sequence Alignment
- Sequence Analysis, DNA
- Sequence Homology, Amino Acid
- Tissue Distribution
- Urogenital Abnormalities/genetics
Affiliation(s)
- N G Pasteris
- Department of Human Genetics, University of Michigan Medical Center, Ann Arbor, Michigan, 48109-0688, USA
|
26
|
Bult CJ, Krupke DM, Eppig JT. Electronic access to mouse tumor data: the Mouse Tumor Biology Database (MTB) project. Nucleic Acids Res 1999; 27:99-105. [PMID: 9847151 PMCID: PMC148106 DOI: 10.1093/nar/27.1.99] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022]
Abstract
The Mouse Tumor Biology (MTB) Database supports the use of the mouse as a model system of hereditary and induced cancers by providing electronic access to: (i) tumor names and classifications, (ii) tumor incidence and latency data in different strains of mice, (iii) tumor pathology reports and images, (iv) information on genetic factors associated with tumors and tumor development, and (v) references (published and unpublished data). This resource has been designed to aid researchers in such areas as choosing experimental models, reviewing patterns of mutations in specific cancers, and identifying genes that are commonly mutated across a spectrum of cancers. MTB also provides hypertext links to related on-line resources and databases. MTB is accessible via the World Wide Web at http://tumor.informatics.jax.org. User support for MTB is available by e-mail at mgi-help@informatics.jax.org.
Affiliation(s)
- C J Bult
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA.
|
27
|
Blake JA, Richardson JE, Davisson MT, Eppig JT. The Mouse Genome Database (MGD): genetic and genomic information about the laboratory mouse. The Mouse Genome Database Group. Nucleic Acids Res 1999; 27:95-8. [PMID: 9847150 PMCID: PMC148105 DOI: 10.1093/nar/27.1.95] [Citation(s) in RCA: 46] [Impact Index Per Article: 1.8] [Indexed: 11/13/2022]
Abstract
The Mouse Genome Database (MGD) focuses on the integration of mapping, homology, polymorphism and molecular data about the laboratory mouse. Detailed descriptions of genes including their chromosomal location, gene function, disease associations, mutant phenotypes, molecular polymorphisms and links to representative sequences including ESTs are integrated within MGD. The association of information from experiment to gene to genome requires careful coordination and implementation of standardized vocabularies, unique nomenclature constructions, and detailed information derived from multiple sources. This information is linked to other public databases that focus on additional information such as expression patterns, sequences, bibliographic details and large mapping panel data. Scientists participate in the curation of MGD data by generating the Chromosome Committee Reports, consulting on gene family nomenclature revisions, and providing descriptions of mouse strain characteristics and of new mutant phenotypes. MGD is accessible at http://www.informatics.jax.org
Affiliation(s)
- J A Blake
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA.
|
28
|
Périer RC, Junier T, Bonnard C, Bucher P. The Eukaryotic Promoter Database (EPD): recent developments. Nucleic Acids Res 1999; 27:307-9. [PMID: 9847211 PMCID: PMC148166 DOI: 10.1093/nar/27.1.307] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.1] [Indexed: 11/12/2022]
Abstract
The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of eukaryotic POL II promoters, for which the transcription start site has been determined experimentally. Access to promoter sequences is provided by pointers to positions in nucleotide sequence entries. The annotation part of an entry includes description of the initiation site mapping data, cross-references to other databases, and bibliographic references. EPD is structured in a way that facilitates dynamic extraction of biologically meaningful promoter subsets for comparative sequence analysis. Recent efforts have focused on exhaustive cross-referencing to the EMBL nucleotide sequence database, and on the improvement of the WWW-based user interfaces and data retrieval mechanisms. EPD can be accessed at http://www.epd.isb-sib.ch
Affiliation(s)
- R C Périer
- Swiss Institute of Bioinformatics & Swiss Institute for Experimental Cancer Research, Ch. des Boveresses 155, 1066-Epalinges s/Lausanne, Switzerland
|
29
|
Eppig JT, Blake JA, Davisson MT, Richardson JE. Informatics for mouse genetics and genome mapping. Methods 1998; 14:179-90. [PMID: 9571075 DOI: 10.1006/meth.1997.0576] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
Abstract
Bioinformatics has become an essential part of biological research. The rapid pace of technology development and the ability to carry out biological experimentation in large scale require computerized systems for data management, analysis, and display. Experimentation with the mouse, a major model organism of the Human Genome Initiative, has intensified the need for bioinformatics tools for mouse mapping and genome analysis. This article describes the Mouse Genome Database in the United States, a primary resource for mouse genomic data, as well as resources at the Mammalian Genetics Unit in the United Kingdom and the Animal Genome Database of Japan. Internet addresses are provided for major genetic and physical mapping resources, major genome data sites, and resources of molecular information.
Affiliation(s)
- J T Eppig
- Jackson Laboratory, Bar Harbor, Maine 04609, USA
|
30
|
Abstract
Biological sequence databases are currently being re-engineered to make them more efficient and easier to use. This re-engineering is also providing an infrastructure to make it easier to interrogate and integrate data from different sources. The net result of this effort should be a great improvement in the power and availability of bioinformatics resources to the general biology community.
Affiliation(s)
- P G Baker
- School of Biological Sciences, University of Manchester, UK.
|
31
|
Blake JA, Eppig JT, Richardson JE, Davisson MT. The Mouse Genome Database (MGD): a community resource. Status and enhancements. The Mouse Genome Informatics Group. Nucleic Acids Res 1998; 26:130-7. [PMID: 9399817 PMCID: PMC147182 DOI: 10.1093/nar/26.1.130] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.8] [Indexed: 02/05/2023]
Abstract
The Mouse Genome Database (MGD) is a comprehensive community database that integrates genetic, genomic and phenotypic information about the laboratory mouse. MGD provides detailed information about genes and genetic markers, elemental data from mapping experiments, descriptions of molecular segments including ESTs, probes, and cDNA clones, homology information between mouse and many other mammalian genomes, and phenotypic descriptions of gene mutations, gene function and mouse strains. All data are supported by citations. Interactive graphical displays of cytogenetic, genetic and physical maps are available. User support is provided through dedicated staff, bulletin boards, and user documentation. MGD can be accessed at http://www.informatics.jax.org
Affiliation(s)
- J A Blake
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA.
32
Bairoch A, Apweiler R. The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1998. Nucleic Acids Res 1998; 26:38-42. [PMID: 9399796 PMCID: PMC147215 DOI: 10.1093/nar/26.1.38] [Citation(s) in RCA: 177] [Impact Index Per Article: 6.6] [Indexed: 02/05/2023] Open
Abstract
SWISS-PROT (http://www.expasy.ch/) is a curated protein sequence database which strives to provide a high level of annotations (such as the description of the function of a protein, its domains structure, post-translational modifications, variants, etc.), a minimal level of redundancy and high level of integration with other databases. Recent developments of the database include: an increase in the number and scope of model organisms; cross-references to two additional databases; a variety of new documentation files and improvements to TrEMBL, a computer annotated supplement to SWISS-PROT. TrEMBL consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except the CDS already included in SWISS-PROT.
Affiliation(s)
- A Bairoch
- Department of Medical Biochemistry, University of Geneva, 1 rue Michel Servet, 1211 Geneva 4, Switzerland.
33
Benton D. Integrated access to genomic and other bioinformation: an essential ingredient of the drug discovery process. SAR and QSAR in Environmental Research 1998; 8:121-155. [PMID: 9522473 DOI: 10.1080/10629369808039138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2023]
Abstract
Due to the high rate of data production and the need of researchers to have rapid access to new data, public databases have become the major medium through which genome mapping and sequencing data as well as macromolecular structural data are published. There are now more than 250 databases of biomolecular, structural, genetic, or phenotypic data, many of which are doubling in size annually. These databases, many of which were created and are maintained by experimentalists for their own research use, provide valuable collections of organized, validated data. However, the very number and diversity of databases now make efficient data resource discovery as important as effective data resource use. Existing autonomous biological databases contain related data which are more valuable when interconnected than when isolated. Political and scientific realities dictate that these databases will be built by different teams, in different locations, for different purposes, and using different data models and supporting DBMSs. As a consequence, connecting the related data they contain is not straightforward. Experience with existing biological databases indicates that it is possible to form useful queries across these databases, but that doing so usually requires expertise in the semantic structure of each source database. Advancing to the next level of integration among biological information resources poses significant technical and sociological challenges.
Affiliation(s)
- D Benton
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892-6050, USA
34
Burczak JD, Wilkinson FE, Robbins DJ. Impact of genomics on diagnostic medicine. Drug Dev Res 1997. [DOI: 10.1002/(sici)1098-2299(199707/08)41:3/4<193::aid-ddr9>3.0.co;2-g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
35
36
Apweiler R, Junker V, Gateau A, O'Donovan C, Lang F, Bairoch A. New developments in linking of biological databases and computer-generation of annotation: SWISS-PROT and its computer-annotated supplement TREMBL. Bioinformatics 1996. [DOI: 10.1007/bfb0033202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022] Open