1. Disease Ontology: improving and unifying disease annotations across species. Dis Model Mech 2018; 11:dmm.032839. [PMID: 29590633 PMCID: PMC5897730 DOI: 10.1242/dmm.032839] [Received: 10/31/2017] [Accepted: 02/08/2018]
Abstract
Model organisms are vital to uncovering the mechanisms of human disease and developing new therapeutic tools. Researchers collecting and integrating relevant model organism and/or human data often apply disparate terminologies (vocabularies and ontologies), making comparisons and inferences difficult. A unified disease ontology is required that connects data annotated using diverse disease terminologies, and in which the terminology relationships are continuously maintained. The Mouse Genome Database (MGD, http://www.informatics.jax.org), Rat Genome Database (RGD, http://rgd.mcw.edu) and Disease Ontology (DO, http://www.disease-ontology.org) projects are collaborating to augment DO, aligning and incorporating disease terms used by MGD and RGD, and improving DO as a tool for unifying disease annotations across species. Coordinated assessment of MGD's and RGD's disease term annotations identified new terms that enhance DO's representation of human diseases. Expansion of DO term content and cross-references to clinical vocabularies (e.g. OMIM, ORDO, MeSH) has enriched the DO's domain coverage and utility for annotating many types of data generated from experimental and clinical investigations. Extending the anatomy-based DO classification structure of disease improves accessibility of terms and facilitates application of DO for computational research. A consistent representation of disease associations across data types from cellular to whole organism, generated from clinical and model organism studies, will promote the integration, mining and comparative analysis of these data. The coordinated enrichment of the DO and adoption of DO by MGD and RGD demonstrate DO's usability across human data, MGD, RGD and the rest of the model organism database community. Summary: Analyzing diverse disease data requires a comprehensive, robust disease ontology to integrate annotations and retrieve accurate, interpretable results. MGD, RGD and DO are working in collaboration to achieve this goal.
2. Mouse Genome Informatics (MGI) Resource: Genetic, Genomic, and Biological Knowledgebase for the Laboratory Mouse. ILAR J 2017; 58:17-41. [PMID: 28838066 PMCID: PMC5886341 DOI: 10.1093/ilar/ilx013] [Received: 07/25/2016] [Revised: 03/14/2017] [Accepted: 03/28/2017]
Abstract
The Mouse Genome Informatics (MGI) Resource supports basic, translational, and computational research by providing high-quality, integrated data on the genetics, genomics, and biology of the laboratory mouse. MGI serves a strategic role for the scientific community in facilitating biomedical, experimental, and computational studies investigating the genetics and processes of diseases and enabling the development and testing of new disease models and therapeutic interventions. This review describes the convergence of a growing body of genetic and biological data with advances in computer technology in the late 1980s, including the World Wide Web, that together launched the beginnings of MGI. MGI develops and maintains a gold-standard resource that reflects the current state of knowledge, provides semantic and contextual data integration that fosters hypothesis testing, continually develops new and improved tools for searching and analysis, and partners with the scientific community to ensure research data needs are met. Here we describe one slice of MGI relating to the development of community-wide large-scale mutagenesis and phenotyping projects and introduce ways to access and use these MGI data. References and links to additional MGI aspects are provided.
3. Abstract 2804: Identifying therapeutically relevant mouse and patient-derived xenograft (PDX) models of human cancer using the mouse tumor biology database (MTB) data resource. Cancer Res 2017. [DOI: 10.1158/1538-7445.am2017-2804]
Abstract
The laboratory mouse is the foremost model organism for interrogating the genetic and molecular basis of human cancer and is a powerful platform for identifying therapeutically effective targets for prevention and treatment of cancer. Research using genetically engineered mouse models (GEMMs) has led to important advances in our understanding of the genetic basis of cancer susceptibility, the function of tumor suppressors and oncogenes, and therapy responses in preclinical and co-clinical studies. Patient-derived xenograft (PDX) models are an increasingly important model system for in vivo studies of human cancer. These models are created by implanting patient tumors into immunodeficient or humanized mouse hosts and are a powerful translational research platform for preclinical and co-clinical studies. The number of GEMM and PDX mouse models increases significantly every year, and the diverse cancer-related data about human cancer models tend to be distributed in ways that make it difficult for researchers to integrate and interpret the information to find the most relevant model for their research. The Mouse Tumor Biology database (http://tumor.informatics.jax.org) is an expertly curated resource for information and data about genetically defined mouse strains and PDX models of human cancer. MTB provides query tools to enable integrated searches and visualization of these varied data, thus facilitating the assessment of novel mouse models of human cancer and potential preventative and therapeutic treatments. Enforcement of controlled vocabularies and standard gene, allele and strain nomenclature within MTB facilitates precise and comprehensive queries of MTB for pertinent mouse models. MTB contains data from spontaneous or endogenously induced tumors from genetically defined mice including tumor classification, incidence, Quantitative Trait Loci, pathology reports, images and genetic changes in the tumor (somatic) and background strain (germline) genomes. The PDX resource enables queries based on tumor type, cancer diagnosis and genomic properties of the engrafted tumors. Information in MTB is obtained from curation of peer-reviewed scientific publications and direct data submissions from individual investigators and large-scale programs. New features in MTB include the Faceted Tumor Search Form and a Reported Mouse Models table linking the most common fatal human cancers to reported equivalent mouse models. MTB contains over 77,000 Tumor Frequencies and over 2,200 Pathology Reports with over 6,600 images from over 4,200 references. MTB provides access to detailed clinical, pathological, expression and genomics data from over 400 PDX models. Information in MTB is integrated with cancer models data from other bioinformatics resources including PathBase, the Gene Expression Omnibus and ArrayExpress. MTB is supported by NCI grant CA089713.
Citation Format: Dale A. Begley, Debra M. Krupke, Steven B. Neuhauser, Joel E. Richardson, John P. Sundberg, Janan T. Eppig, Carol J. Bult. Identifying therapeutically relevant mouse and patient-derived xenograft (PDX) models of human cancer using the mouse tumor biology database (MTB) data resource [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2017; 2017 Apr 1-5; Washington, DC. Philadelphia (PA): AACR; Cancer Res 2017;77(13 Suppl):Abstract nr 2804. doi:10.1158/1538-7445.AM2017-2804
4. Mouse Genome Informatics (MGI): Resources for Mining Mouse Genetic, Genomic, and Biological Data in Support of Primary and Translational Research. Methods Mol Biol 2017; 1488:47-73. [PMID: 27933520 DOI: 10.1007/978-1-4939-6427-7_3]
Abstract
The Mouse Genome Informatics (MGI) resource (www.informatics.jax.org) has existed for over 25 years, and over this time its data content, informatics infrastructure, and user interfaces and tools have undergone dramatic changes (Eppig et al., Mamm Genome 26:272-284, 2015). Change has been driven by scientific methodological advances, rapid improvements in computational software, growth in computer hardware capacity, and the ongoing collaborative nature of the mouse genomics community in building resources and sharing data. Here we present an overview of the current data content of MGI, describe its general organization, and provide examples using simple and complex searches, and tools for mining and retrieving sets of data.
5. Mouse Genome Database (MGD)-2017: community knowledge resource for the laboratory mouse. Nucleic Acids Res 2016; 45:D723-D729. [PMID: 27899570 PMCID: PMC5210536 DOI: 10.1093/nar/gkw1040] [Received: 09/22/2016] [Accepted: 10/28/2016]
Abstract
The Mouse Genome Database (MGD: http://www.informatics.jax.org) is the primary community data resource for the laboratory mouse. It provides a highly integrated and highly curated system offering a comprehensive view of current knowledge about mouse genes, genetic markers and genomic features as well as the associations of those features with sequence, phenotypes, functional and comparative information, and their relationships to human diseases. MGD continues to enhance access to these data, to extend the scope of data content and visualizations, and to provide infrastructure and user support that ensures effective and efficient use of MGD in the advancement of scientific knowledge. Here, we report on recent enhancements made to the resource and new features.
6. The mouse Gene Expression Database (GXD): 2017 update. Nucleic Acids Res 2016; 45:D730-D736. [PMID: 27899677 PMCID: PMC5210556 DOI: 10.1093/nar/gkw1073] [Received: 09/14/2016] [Revised: 10/21/2016] [Accepted: 10/28/2016]
Abstract
The Gene Expression Database (GXD; www.informatics.jax.org/expression.shtml) is an extensive and well-curated community resource of mouse developmental expression information. Through curation of the scientific literature and by collaborations with large-scale expression projects, GXD collects and integrates data from RNA in situ hybridization, immunohistochemistry, RT-PCR, northern blot and western blot experiments. Expression data from both wild-type and mutant mice are included. The expression data are combined with genetic and phenotypic data in Mouse Genome Informatics (MGI) and made readily accessible to many types of database searches. At present, GXD includes over 1.5 million expression results and more than 300 000 images, all annotated with detailed and standardized metadata. Since our last report in 2014, we have added a large amount of data, we have enhanced data and database infrastructure, and we have implemented many new search and display features. Interface enhancements include: a new Mouse Developmental Anatomy Browser; interactive tissue-by-developmental stage and tissue-by-gene matrix views; capabilities to filter and sort expression data summaries; a batch search utility; gene-based expression overviews; and links to expression data from other species.
7. Abstract 631: The mouse tumor biology database (MTB): An integrated data resource for mouse and patient derived xenograft (PDX) models of human cancer. Cancer Res 2016. [DOI: 10.1158/1538-7445.am2016-631]
Abstract
The laboratory mouse is the premier model organism for understanding the genetic basis of human cancer and is a powerful platform for investigating novel targets for therapeutic intervention. Research using genetically engineered mouse models has led to key insights into the genetics of cancer susceptibility, the function of tumor suppressors and oncogenes, and therapy responses in pre-clinical and co-clinical studies. Patient-derived xenograft (PDX) models are another model system for in vivo cancer studies. PDX models are created by implanting patient tumors into immunodeficient or humanized mouse hosts. PDX models are a powerful translational research platform for pre-clinical and co-clinical studies. The number of mouse models and the volume and heterogeneity of data related to the characterization of these models have increased dramatically in recent years, so that searching these data in an integrated way and identifying relevant models have become significant barriers to their effective use. The Mouse Tumor Biology database (MTB) (http://tumor.informatics.jax.org) provides online query tools to facilitate cohesive searches and visualization of these varied data, thus enabling the identification of novel mouse models of human cancer and potential therapeutic treatments.
The Mouse Tumor Biology database is an expertly curated resource for information and data about genetically modified mouse strains and PDX models of human cancer. Enforcement of standard gene and strain nomenclature and use of controlled vocabularies within MTB enables complete and accurate searching of the published literature for relevant mouse models. MTB contains data from spontaneous or endogenously induced tumors from genetically defined mice including tumor classification, incidence and latency, tumor associated QTLs, pathology reports, images and genetic changes in the tumor (somatic) and background strain (germline) genomes. The PDX resource enables searches based on tumor type, cancer diagnosis, and genomic properties of the engrafted tumors. Information in MTB is obtained from curation of peer-reviewed scientific publications and from direct data submissions from individual investigators and large-scale programs. MTB contains over 71,000 Tumor Frequencies, and over 2,080 Pathology Reports with over 5,800 images from over 3,600 references. MTB also provides access to detailed clinical, pathological, expression and genomics data from over 450 PDX models. Information in MTB is integrated with cancer models data from other bioinformatics resources including PathBase, the Gene Expression Omnibus (GEO), and ArrayExpress. MTB is supported by NCI grant CA089713.
Citation Format: Dale A. Begley, Debbie M. Krupke, Steven B. Neuhauser, Joel E. Richardson, John P. Sundberg, Janan T. Eppig, Carol J. Bult. The mouse tumor biology database (MTB): An integrated data resource for mouse and patient derived xenograft (PDX) models of human cancer. [abstract]. In: Proceedings of the 107th Annual Meeting of the American Association for Cancer Research; 2016 Apr 16-20; New Orleans, LA. Philadelphia (PA): AACR; Cancer Res 2016;76(14 Suppl):Abstract nr 631.
8. Inferring gene-to-phenotype and gene-to-disease relationships at Mouse Genome Informatics: challenges and solutions. J Biomed Semantics 2016. [PMCID: PMC5143442 DOI: 10.1186/s13326-016-0054-4]
Abstract
Background: Inferring gene-to-phenotype and gene-to-human disease model relationships from annotated mouse phenotypes and disease associations is critical when researching gene function and identifying candidate disease genes. Filtering the various kinds of genotypes to determine which phenotypes are caused by a mutation in a particular gene can be a laborious and time-consuming process. Methods: At Mouse Genome Informatics (MGI, www.informatics.jax.org), we have developed a gene annotation derivation algorithm that computes gene-to-phenotype and gene-to-disease annotations from our existing corpus of annotations to genotypes. This algorithm differentiates between simple genotypes with causative mutations in a single gene and more complex genotypes where mutations in multiple genes may contribute to the phenotype. As part of the process, alleles functioning as tools (e.g., reporters, recombinases) are filtered out. Results: Using this algorithm, derived gene-to-phenotype and gene-to-disease annotations were created for 16,000 and 2,100 mouse markers, respectively, starting from over 57,900 and 4,800 genotypes with at least one phenotype and disease annotation, respectively. Conclusions: Implementation of this algorithm provides consistent and accurate gene annotations across MGI and saves substantial time relative to manual annotation by curators.
9.
Abstract
The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the primary community model organism database for the laboratory mouse and serves as the source for key biological reference data related to mouse genes, gene functions, phenotypes and disease models with a strong emphasis on the relationship of these data to human biology and disease. As the cost of genome-scale sequencing continues to decrease and new technologies for genome editing become widely adopted, the laboratory mouse is more important than ever as a model system for understanding the biological significance of human genetic variation and for advancing the basic research needed to support the emergence of genome-guided precision medicine. Recent enhancements to MGD include new graphical summaries of biological annotations for mouse genes, support for mobile access to the database, tools to support the annotation and analysis of sets of genes, and expanded support for comparative biology through the expansion of homology data.
10. miRNA Nomenclature: A View Incorporating Genetic Origins, Biosynthetic Pathways, and Sequence Variants. Trends Genet 2015; 31:613-626. [PMID: 26453491 DOI: 10.1016/j.tig.2015.09.002] [Received: 05/20/2015] [Revised: 08/10/2015] [Accepted: 09/04/2015]
Abstract
High-throughput sequencing of miRNAs has revealed the diversity and variability of mature and functional short noncoding RNAs, including their genomic origins, biogenesis pathways, sequence variability, and newly identified products such as miRNA-offset RNAs (moRs). Here we review known cases of alternative mature miRNA-like RNA fragments and propose a revised definition of miRNAs to encompass this diversity. We then review nomenclature guidelines for miRNAs and propose to extend nomenclature conventions to align with those for protein-coding genes established by international consortia. Finally, we suggest a system to encompass the full complexity of sequence variations (i.e., isomiRs) in the analysis of small RNA sequencing experiments.
11. The International Mouse Strain Resource (IMSR): cataloging worldwide mouse and ES cell line resources. Mamm Genome 2015; 26:448-55. [PMID: 26373861 PMCID: PMC4602064 DOI: 10.1007/s00335-015-9600-0] [Received: 05/15/2015] [Accepted: 08/10/2015]
Abstract
The availability of and access to quality genetically defined, health-status known mouse resources is critical for biomedical research. By ensuring that mice used in research experiments are biologically, genetically, and health-status equivalent, we enable knowledge transfer, hypothesis building based on multiple data streams, and experimental reproducibility based on common mouse resources (reagents). Major repositories for mouse resources have developed over time and each has significant unique resources to offer. Here we (a) describe The International Mouse Strain Resource that offers users a combined catalog of worldwide mouse resources (live, cryopreserved, embryonic stem cells), with direct access to repository sites holding resources of interest and (b) discuss the commitment to nomenclature standards among resources that remain a challenge in unifying mouse resource catalogs.
12. Finding mouse models of human lymphomas and leukemia's using the Jackson laboratory mouse tumor biology database. Exp Mol Pathol 2015; 99:533-6. [PMID: 26302176 DOI: 10.1016/j.yexmp.2015.07.004] [Received: 07/06/2015] [Accepted: 07/07/2015]
Abstract
Many mouse models have been created to study hematopoietic cancer types. There are over thirty hematopoietic tumor types and subtypes, both human and mouse, with various origins, characteristics and clinical prognoses. Determining the specific type of hematopoietic lesion produced in a mouse model and identifying mouse models that correspond to the human subtypes of these lesions has been a continuing challenge for the scientific community. The Mouse Tumor Biology Database (MTB; http://tumor.informatics.jax.org) is designed to facilitate use of mouse models of human cancer by providing detailed histopathologic and molecular information on lymphoma subtypes, including expertly annotated, online, whole-slide scans, and by providing a repository for storing and querying information on specific lymphoma models.
13.
Abstract
Since its inception in 1989, the mission of the Mouse Genome Informatics (MGI) resource has been to integrate genetic, genomic, and biological data about the laboratory mouse to facilitate the study of human health and disease. This mission is ever more feasible as the revolution in genetics knowledge, the ability to sequence genomes, and the ability to specifically manipulate mammalian genomes are now at our fingertips. Through major paradigm shifts in biological research and computer technologies, MGI has adapted and evolved to become an integral part of the larger global bioinformatics infrastructure and honed its ability to provide authoritative reference datasets used and incorporated by many other established bioinformatics resources. Here, we review some of the major changes in research approaches over the last quarter century, how these changes are reflected in the MGI resource you use today, and what may be around the next corner.
14.
Abstract
The mouse genome database (MGD) is the model organism database component of the mouse genome informatics system at The Jackson Laboratory. MGD is the international data resource for the laboratory mouse and facilitates the use of mice in the study of human health and disease. Since its beginnings, MGD has included comparative genomics data with a particular focus on human-mouse orthology, an essential component of the use of mouse as a model organism. Over the past 25 years, novel algorithms and addition of orthologs from other model organisms have enriched comparative genomics in MGD data, extending the use of orthology data to support the laboratory mouse as a model of human biology. Here, we describe current comparative data in MGD and review the history and refinement of orthology representation in this resource.
15. Mouse Genome Database: From sequence to phenotypes and disease models. Genesis 2015; 53:458-73. [PMID: 26150326 PMCID: PMC4545690 DOI: 10.1002/dvg.22874] [Received: 05/06/2015] [Revised: 06/30/2015] [Accepted: 07/02/2015]
Abstract
The Mouse Genome Database (MGD, www.informatics.jax.org) is the international scientific database for genetic, genomic, and biological data on the laboratory mouse to support the research requirements of the biomedical community. To accomplish this goal, MGD provides broad data coverage, serves as the authoritative standard for mouse nomenclature for genes, mutants, and strains, and curates and integrates many types of data from literature and electronic sources. Among the key data sets MGD supports are: the complete catalog of mouse genes and genome features, comparative homology data for mouse and vertebrate genes, the authoritative set of Gene Ontology (GO) annotations for mouse gene functions, a comprehensive catalog of mouse mutations and their phenotypes, and a curated compendium of mouse models of human diseases. Here, we describe the data acquisition process, specifics about MGD's key data areas, methods to access and query MGD data, and outreach and user help facilities.
16. Allele, phenotype and disease data at Mouse Genome Informatics: improving access and analysis. Mamm Genome 2015; 26:285-94. [PMID: 26162703 PMCID: PMC4534497 DOI: 10.1007/s00335-015-9582-y] [Received: 05/15/2015] [Accepted: 06/23/2015]
Abstract
A core part of the Mouse Genome Informatics (MGI) resource is the collection of mouse mutations and the annotation of phenotypes and diseases displayed by mice carrying these mutations. These data are integrated with the rest of the data in MGI and exported to numerous other resources. The use of mouse phenotype data to drive translational research into human disease has expanded rapidly with the improvements in sequencing technology. MGI has implemented many improvements in allele and phenotype data annotation, search, and display to facilitate access to these data through multiple avenues. For example, the description of alleles has been modified to include more detailed categories of allele attributes. This allows improved discrimination between mutation types. Further, connections have been created between mutations involving multiple genes and each of the genes overlapping the mutation. This allows users to readily find all mutations affecting a gene and see all genes affected by a mutation. In a similar manner, the genes expressed by transgenic or knock-in alleles are now connected to these alleles. The advanced search forms and public reports have been updated to take advantage of these improvements. These search forms and reports are used by an expanding number of researchers to identify novel human disease genes and mouse models of human disease.
17. The mouse gene expression database: New features and how to use them effectively. Genesis 2015; 53:510-22. [PMID: 26045019 DOI: 10.1002/dvg.22864] [Received: 04/08/2015] [Revised: 05/29/2015] [Accepted: 06/01/2015]
Abstract
The Gene Expression Database (GXD) is an extensive and freely available community resource of mouse developmental expression data. GXD curates and integrates expression data from the literature, via electronic data submissions, and by collaborations with large-scale projects. As an integral component of the Mouse Genome Informatics Resource, GXD combines expression data with genetic, functional, phenotypic, and disease-related data, and provides tools for the research community to search for and analyze expression data in this larger context. Recent enhancements include: an interactive browser to navigate the mouse developmental anatomy and find expression data for specific anatomical structures; the capability to search for expression data of genes located in specific genomic regions, supporting the identification of disease candidate genes; a summary displaying all the expression images that meet specified search criteria; interactive matrix views that provide overviews of spatio-temporal expression patterns (Tissue × Stage Matrix) and enable the comparison of expression patterns between genes (Tissue × Gene Matrix); data zoom and filter utilities to iteratively refine summary displays and data sets; and gene-based links to expression data from other model organisms, such as chicken, Xenopus, and zebrafish, fostering comparative expression analysis for species that are highly relevant for developmental research.
18.
Abstract
The Gene Expression Database (GXD) is an extensive, easily searchable, and freely available database of mouse gene expression information (www.informatics.jax.org/expression.shtml). GXD was developed to foster progress toward understanding the molecular basis of human development and disease. GXD contains information about when and where genes are expressed in different tissues in the mouse, especially during the embryonic period. GXD collects different types of expression data from wild-type and mutant mice, including RNA in situ hybridization, immunohistochemistry, RT-PCR, and northern and western blot results. The GXD curators read the scientific literature and enter the expression data from those papers into the database. GXD also acquires expression data directly from researchers, including groups doing large-scale expression studies. GXD currently contains nearly 1.5 million expression results for over 13,900 genes. In addition, it has over 265,000 images of expression data, allowing users to retrieve the primary data and interpret it themselves. By being an integral part of the larger Mouse Genome Informatics (MGI) resource, GXD’s expression data are combined with other genetic, functional, phenotypic, and disease-oriented data. This allows GXD to provide tools for researchers to evaluate expression data in the larger context, search by a wide variety of biologically and biomedically relevant parameters, and discover new data connections to help in the design of new experiments. Thus, GXD can provide researchers with critical insights into the functions of genes and the molecular mechanisms of development, differentiation, and disease.
19. Expanding the mammalian phenotype ontology to support automated exchange of high throughput mouse phenotyping data generated by large-scale mouse knockout screens. J Biomed Semantics 2015; 6:11. [PMID: 25825651 PMCID: PMC4378007 DOI: 10.1186/s13326-015-0009-1] [Received: 11/03/2014] [Accepted: 03/03/2015]
Abstract
BACKGROUND A vast array of data is about to emerge from the large-scale, high-throughput mouse knockout phenotyping projects worldwide. It is critical that this information is captured in a standardized manner, made accessible, and fully integrated with other phenotype data sets for comprehensive querying and analysis across all phenotype data types. The volume of data generated by the high-throughput phenotyping screens is expected to grow exponentially; thus, automated methods and standards to exchange phenotype data are required. RESULTS The IMPC (International Mouse Phenotyping Consortium) is using the Mammalian Phenotype (MP) ontology in the automated annotation of phenodeviant data from high-throughput phenotyping screens. A total of 287 new terms, together with hierarchy revisions, were added in multiple branches of the MP ontology to accurately describe the results generated by these high-throughput screens. CONCLUSIONS Because these large-scale phenotyping data sets will be reported using the MP as the common data standard for annotation and data exchange, automated importation of these data to MGI (Mouse Genome Informatics) and other resources is possible without curatorial effort. Maximum biomedical value of these mutant mice will come from integrating primary high-throughput phenotyping data with secondary, comprehensive phenotypic analyses combined with published phenotype details on these and related mutants at MGI and other resources.
|
20
|
Global genetic analysis in mice unveils central role for cilia in congenital heart disease. Nature 2015; 521:520-4. [PMID: 25807483 DOI: 10.1038/nature14269] [Citation(s) in RCA: 297] [Impact Index Per Article: 33.0] [Received: 09/09/2014] [Accepted: 01/26/2015] [Indexed: 01/20/2023]
Abstract
Congenital heart disease (CHD) is the most prevalent birth defect, affecting nearly 1% of live births; the incidence of CHD is up to tenfold higher in human fetuses. A genetic contribution is strongly suggested by the association of CHD with chromosome abnormalities and high recurrence risk. Here we report findings from a recessive forward genetic screen in fetal mice, showing that cilia and cilia-transduced cell signalling have important roles in the pathogenesis of CHD. The cilium is an evolutionarily conserved organelle projecting from the cell surface with essential roles in diverse cellular processes. Using echocardiography, we ultrasound scanned 87,355 chemically mutagenized C57BL/6J fetal mice and recovered 218 CHD mouse models. Whole-exome sequencing identified 91 recessive CHD mutations in 61 genes. This included 34 cilia-related genes, 16 genes involved in cilia-transduced cell signalling, and 10 genes regulating vesicular trafficking, a pathway important for ciliogenesis and cell signalling. Surprisingly, many CHD genes encoded interacting proteins, suggesting that an interactome protein network may provide a larger genomic context for CHD pathogenesis. These findings provide novel insights into the potential Mendelian genetic contribution to CHD in the fetal population, a segment of the human population not well studied. We note that the pathways identified show overlap with CHD candidate genes recovered in CHD patients, suggesting that they may have relevance to the more complex genetics of CHD overall. These CHD mouse models and >8,000 incidental mutations have been sperm archived, creating a rich public resource for human disease modelling.
|
21
|
Abstract A05: The Mouse Tumor Biology (MTB) Database: An electronic tool for identifying and evaluating mouse and PDX models of human cancer. Mol Cancer Res 2014. [DOI: 10.1158/1557-3125.modorg-a05] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The increasing number and diversity of available mouse models of human cancer and their growing importance in scientific research have resulted in an enormous increase in the amount and types of data generated from these models. These models represent powerful tools for studying biological and genetic mechanisms of cancer and for translation into potential clinical therapeutics. However, the amount of available data makes it challenging to identify and evaluate specific models and data important for an individual laboratory's research. Placing these data in their proper genetic context is crucial to understanding the biochemical and molecular mechanisms of initiation, progression, and metastasis of different cancers. In addition, the ability of the immunodeficient NOD.Cg-Prkdcscid Il2rgtm1Wjl/SzJ (NSG) mouse strain to host patient-derived xenograft (PDX) models further increases the number of available models and the data produced from them.
The Mouse Tumor Biology (MTB) Database provides access to data from mouse and PDX models of human tumors and the tools to analyze these data, facilitating the discovery and evaluation of novel mouse and PDX models of human cancers (http://tumor.informatics.jax.org). MTB includes data on endogenously arising tumors (both spontaneous and induced) in genetically defined mice (inbred, hybrid, mutant, and genetically engineered mice) and information from PDX models of human tumors, and provides freely available web access to these data. MTB integrates data from peer-reviewed literature, laboratories studying mouse models of human cancer, production mouse colonies at The Jackson Laboratory (JAX), colonies of aging mice from the Jackson Aging Center, and PDX data from the Jackson Laboratory Patient-Derived Xenograft resource. MTB also incorporates data from PathBase, and mouse gene expression data sets from NCBI's Gene Expression Omnibus (GEO) and the ArrayExpress database. Data include tumor classification, incidence and latency, tumor-associated quantitative trait loci (QTL), pathology reports, images and genetic changes in tumors (somatic) and background strain (germline). Data type-specific query forms (tumor, genetic, etc.) allow detailed searches. MTB can also be searched using human gene symbols for orthologous mouse genes and associated data. Pathology images are submitted by the scientific community, from primary literature (with publisher permission), and from JAX colonies. MTB also includes immunohistochemistry data on over 500 antibodies with accompanying images of positive control samples and links to the respective vendors. MTB encourages direct submission of mouse tumor data and images from the cancer research community and has developed a web-based system to facilitate submission of data. Standard nomenclature, controlled vocabularies and literature citations facilitate data integration and robust searches.
MTB is integrated with the Mouse Genome Informatics resource (MGI, http://www.informatics.jax.org) and provides links to other related online resources such as the Mouse Phenome Database (MPD), the Biology of the Mammary Gland Web Site, and the NCI Mouse Repository. MTB is supported by NCI grant CA089713.
Citation Format: Dale A. Begley, Debra M. Krupke, Steven B. Neuhauser, Joel E. Richardson, John P. Sundberg, Carol J. Bult, Janan T. Eppig. The Mouse Tumor Biology (MTB) Database: An electronic tool for identifying and evaluating mouse and PDX models of human cancer. [abstract]. In: Proceedings of the AACR Special Conference: The Translational Impact of Model Organisms in Cancer; Nov 5-8, 2013; San Diego, CA. Philadelphia (PA): AACR; Mol Cancer Res 2014;12(11 Suppl):Abstract nr A05.
|
22
|
The Mouse Genome Database (MGD): facilitating mouse as a model for human biology and disease. Nucleic Acids Res 2014; 43:D726-36. [PMID: 25348401 PMCID: PMC4384027 DOI: 10.1093/nar/gku967] [Citation(s) in RCA: 293] [Impact Index Per Article: 29.3] [Indexed: 02/01/2023] Open
Abstract
The Mouse Genome Database (MGD, http://www.informatics.jax.org) serves the international biomedical research community as the central resource for integrated genomic, genetic and biological data on the laboratory mouse. To facilitate use of mouse as a model in translational studies, MGD maintains a core of high-quality curated data and integrates experimentally and computationally generated data sets. MGD maintains a unified catalog of genes and genome features, including functional RNAs, QTL and phenotypic loci. MGD curates and provides functional and phenotype annotations for mouse genes using the Gene Ontology and Mammalian Phenotype Ontology. MGD integrates phenotype data and associates mouse genotypes to human diseases, providing critical mouse–human relationships and access to repositories holding mouse models. MGD is the authoritative source of nomenclature for genes, genome features, alleles and strains following guidelines of the International Committee on Standardized Genetic Nomenclature for Mice. A new addition to MGD, the Human–Mouse: Disease Connection, allows users to explore gene–phenotype–disease relationships between human and mouse. MGD has also updated search paradigms for phenotypic allele attributes, incorporated incidental mutation data, added a module for display and exploration of genes and microRNA interactions and adopted the JBrowse genome browser. MGD resources are freely available to the scientific community.
|
23
|
Abstract
The Mouse Tumor Biology (MTB; http://tumor.informatics.jax.org) database is a unique online compendium of mouse models for human cancer. MTB provides online access to expertly curated information on diverse mouse models for human cancer and interfaces for searching and visualizing data associated with these models. The information in MTB is designed to facilitate the selection of strains for cancer research and is a platform for mining data on tumor development and patterns of metastases. MTB curators acquire data through manual curation of peer-reviewed scientific literature and from direct submissions by researchers. Data in MTB are also obtained from other bioinformatics resources including PathBase, the Gene Expression Omnibus and ArrayExpress. Recent enhancements to MTB improve the association between mouse models and human genes commonly mutated in a variety of cancers as identified in large-scale cancer genomics studies, provide new interfaces for exploring regions of the mouse genome associated with cancer phenotypes and incorporate data and information related to Patient-Derived Xenograft models of human cancers.
|
24
|
Identifying mouse models for skin cancer using the Mouse Tumor Biology Database. Exp Dermatol 2014; 23:761-3. [PMID: 25040013 PMCID: PMC4183210 DOI: 10.1111/exd.12512] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Accepted: 07/16/2014] [Indexed: 11/29/2022]
Abstract
In recent years, the scientific community has generated an ever-increasing amount of data from a growing number of animal models of human cancers. Much of these data come from genetically engineered mouse models. Identifying appropriate models for skin cancer and related relevant genetic data sets from an expanding pool of widely disseminated data can be a daunting task. The Mouse Tumor Biology Database (MTB) provides an electronic archive, search and analysis system that can be used to identify dermatological mouse models of cancer, retrieve model-specific data and analyse these data. In this report, we detail MTB's contents and capabilities, together with instructions on how to use MTB to search for skin-related tumor models and associated data.
|
25
|
|
26
|
The Mouse Genome Database: integration of and access to knowledge about the laboratory mouse. Nucleic Acids Res 2013; 42:D810-7. [PMID: 24285300 PMCID: PMC3964950 DOI: 10.1093/nar/gkt1225] [Citation(s) in RCA: 176] [Impact Index Per Article: 16.0] [Indexed: 01/08/2023] Open
Abstract
The Mouse Genome Database (MGD) (http://www.informatics.jax.org) is the community model organism database resource for the laboratory mouse, a premier animal model for the study of genetic and genomic systems relevant to human biology and disease. MGD maintains a comprehensive catalog of genes, functional RNAs and other genome features as well as heritable phenotypes and quantitative trait loci. The genome feature catalog is generated by the integration of computational and manual genome annotations generated by NCBI, Ensembl and Vega/HAVANA. MGD curates and maintains the comprehensive listing of functional annotations for mouse genes using the Gene Ontology, and MGD curates and integrates comprehensive phenotype annotations including associations of mouse models with human diseases. Recent improvements include integration of the latest mouse genome build (GRCm38), improved access to comparative and functional annotations for mouse genes with expanded representation of comparative vertebrate genomes and new loads of phenotype data from high-throughput phenotyping projects. All MGD resources are freely available to the research community.
|
27
|
The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data. Nucleic Acids Res 2013; 42:D966-74. [PMID: 24217912 PMCID: PMC3965098 DOI: 10.1093/nar/gkt1026] [Citation(s) in RCA: 514] [Impact Index Per Article: 46.7] [Indexed: 12/21/2022] Open
Abstract
The Human Phenotype Ontology (HPO) project, available at http://www.human-phenotype-ontology.org, provides a structured, comprehensive and well-defined set of 10,088 classes (terms) describing human phenotypic abnormalities and 13,326 subclass relations between the HPO classes. In addition we have developed logical definitions for 46% of all HPO classes using terms from ontologies for anatomy, cell types, function, embryology, pathology and other domains. This allows interoperability with several resources, especially those containing phenotype information on model organisms such as mouse and zebrafish. Here we describe the updated HPO database, which provides annotations of 7,278 human hereditary syndromes listed in OMIM, Orphanet and DECIPHER to classes of the HPO. Various meta-attributes such as frequency, references and negations are associated with each annotation. Several large-scale projects worldwide utilize the HPO for describing phenotype information in their datasets. We have therefore generated equivalence mappings to other phenotype vocabularies such as LDDB, Orphanet, MedDRA, UMLS and phenoDB, allowing integration of existing datasets and interoperability with multiple biomedical resources. We have created various ways to access the HPO database content using flat files, a MySQL database, and Web-based tools. All data and documentation on the HPO project can be found online.
|
28
|
Abstract
The Gene Expression Database (GXD; http://www.informatics.jax.org/expression.shtml) is an extensive and well-curated community resource of mouse developmental expression information. GXD collects different types of expression data from studies of wild-type and mutant mice, covering all developmental stages and including data from RNA in situ hybridization, immunohistochemistry, RT-PCR, northern blot and western blot experiments. The data are acquired from the scientific literature and from researchers, including groups doing large-scale expression studies. Integration with the other data in Mouse Genome Informatics (MGI) and interconnections with other databases place GXD's gene expression information in the larger biological and biomedical context. Since the last report, the utility of GXD has been greatly enhanced by the addition of new data and by the implementation of more powerful and versatile search and display features. Web interface enhancements include the capability to search for expression data for genes associated with specific phenotypes and/or human diseases; new, more interactive data summaries; easy downloading of data; direct searches of expression images via associated metadata; and new displays that combine image data and their associated annotations. At present, GXD includes >1.4 million expression results and 250,000 images that are accessible to our search tools.
|
29
|
Abstract C18: Interpreting the cancer genome using the Mouse Tumor Biology (MTB) Database. Cancer Res 2013. [DOI: 10.1158/1538-7445.fbcr13-c18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
For over a century cancer biologists have utilized the laboratory mouse in the study of cancer biology. Many mouse models recapitulate the genetic abnormalities observed in human cancers. The Mouse Tumor Biology Database (MTB; http://tumor.informatics.jax.org) provides multiple tools to explore and interpret the cancer genome. These tools include the ability to search MTB using mouse or human gene symbols, a mouse cancer quantitative trait loci (QTL) viewer, links to gene expression data for a variety of mouse models of human cancer, and access to public genomic data from the emerging Jackson Laboratory (JAX; http://www.jax.org) Patient-Derived Xenograft (PDX) program.
MTB was established in 1997 and provides freely available access to data on spontaneous and induced tumors from genetically defined mice (inbred, hybrid, mutant, and genetically engineered strains). These data include standardized tumor names and classifications, pathology reports and images, mouse genes and QTL associated with cancers, genomic and cytogenetic changes occurring in the tumor, strain names, tumor frequency and latency, and literature citations. Although the primary source for data represented in MTB is peer-reviewed scientific literature, an increasing amount of data is derived from large systematic programs (e.g. JAX Aging Resource, PDX Program) and laboratory or consortia contributions. MTB includes annotated histopathology images and cytogenetic assay images for mouse tumors where these data are available. MTB encourages direct submission of mouse tumor data and images from the cancer research community and provides investigators with a web-accessible tool for image submission and annotation.
MTB is supported by NCI grant CA089713.
Citation Format: Debra M. Krupke, Dale A. Begley, Steven B. Neuhauser, Joel E. Richardson, John P. Sundberg, Carol J. Bult, Janan T. Eppig. Interpreting the cancer genome using the Mouse Tumor Biology (MTB) Database. [abstract]. In: Proceedings of the Third AACR International Conference on Frontiers in Basic Cancer Research; Sep 18-22, 2013; National Harbor, MD. Philadelphia (PA): AACR; Cancer Res 2013;73(19 Suppl):Abstract nr C18.
|
30
|
The Vertebrate Trait Ontology: a controlled vocabulary for the annotation of trait data across species. J Biomed Semantics 2013; 4:13. [PMID: 23937709 PMCID: PMC3851175 DOI: 10.1186/2041-1480-4-13] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 04/30/2013] [Accepted: 07/05/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The use of ontologies to standardize biological data and facilitate comparisons among datasets has steadily grown as the complexity and amount of available data have increased. Despite the numerous ontologies available, one area currently lacking a robust ontology is the description of vertebrate traits. A trait is defined as any measurable or observable characteristic pertaining to an organism or any of its substructures. While there are several ontologies to describe entities and processes in phenotypes, diseases, and clinical measurements, one has not been developed for vertebrate traits; the Vertebrate Trait Ontology (VT) was created to fill this void. DESCRIPTION Significant inconsistencies in trait nomenclature exist in the literature, and additional difficulties arise when trait data are compared across species. The VT is a unified trait vocabulary created to aid in the transfer of data within and between species and to facilitate investigation of the genetic basis of traits. Trait information provides a valuable link between the measurements that are used to assess the trait, the phenotypes related to the traits, and the diseases associated with one or more phenotypes. Because multiple clinical and morphological measurements are often used to assess a single trait, and a single measurement can be used to assess multiple physiological processes, providing investigators with standardized annotations for trait data will allow them to investigate connections among these data types. CONCLUSIONS The annotation of genomic data with ontology terms provides unique opportunities for data mining and analysis. Links between data in disparate databases can be identified and explored, a strategy that is particularly useful for cross-species comparisons or in situations involving inconsistent terminology. The VT provides a common basis for the description of traits in multiple vertebrate species. 
It is being used in the Rat Genome Database and Animal QTL Database for annotation of QTL data for rat, cattle, chicken, swine, sheep, and rainbow trout, and in the Mouse Phenome Database to annotate strain characterization data. In these databases, data are also cross-referenced to applicable terms from other ontologies, providing additional avenues for data mining and analysis. The ontology is available at http://bioportal.bioontology.org/ontologies/50138.
|
31
|
Abstract
The laboratory mouse is the premier animal model for studying human biology because all life stages can be accessed experimentally, a completely sequenced reference genome is publicly available and there exists a myriad of genomic tools for comparative and experimental research. In the current era of genome scale, data-driven biomedical research, the integration of genetic, genomic and biological data is essential for realizing the full potential of the mouse as an experimental model. The Mouse Genome Database (MGD; http://www.informatics.jax.org), the community model organism database for the laboratory mouse, is designed to facilitate the use of the laboratory mouse as a model system for understanding human biology and disease. To achieve this goal, MGD integrates genetic and genomic data related to the functional and phenotypic characterization of mouse genes and alleles and serves as a comprehensive catalog for mouse models of human disease. Recent enhancements to MGD include the addition of human ortholog details to mouse Gene Detail pages, the inclusion of microRNA knockouts in MGD’s catalog of alleles and phenotypes, the addition of video clips to phenotype images, access to genotype and phenotype data associated with quantitative trait loci (QTL) and improvements to the layout and display of Gene Ontology annotations.
|
32
|
Erratum to: Beyond knockouts: cre resources for conditional mutagenesis. Mamm Genome 2012. [DOI: 10.1007/s00335-012-9434-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
|
33
|
The mammalian gene function resource: the International Knockout Mouse Consortium. Mamm Genome 2012; 23:580-6. [PMID: 22968824 PMCID: PMC3463800 DOI: 10.1007/s00335-012-9422-2] [Citation(s) in RCA: 234] [Impact Index Per Article: 19.5] [Received: 04/06/2012] [Accepted: 07/20/2012] [Indexed: 11/16/2022]
Abstract
In 2007, the International Knockout Mouse Consortium (IKMC) made the ambitious promise to generate mutations in virtually every protein-coding gene of the mouse genome in a concerted worldwide action. Now, 5 years later, the IKMC members have developed high-throughput gene trapping and, in particular, gene-targeting pipelines and generated more than 17,400 mutant murine embryonic stem (ES) cell clones and more than 1,700 mutant mouse strains, most of them conditional. A common IKMC web portal (www.knockoutmouse.org) has been established, allowing easy access to this unparalleled biological resource. The IKMC materials considerably enhance functional gene annotation of the mammalian genome and will have a major impact on future biomedical research.
|
34
|
The Mammalian Phenotype Ontology as a unifying standard for experimental and high-throughput phenotyping data. Mamm Genome 2012; 23:653-68. [PMID: 22961259 PMCID: PMC3463787 DOI: 10.1007/s00335-012-9421-3] [Citation(s) in RCA: 126] [Impact Index Per Article: 10.5] [Received: 05/24/2012] [Accepted: 07/24/2012] [Indexed: 01/16/2023]
Abstract
The Mammalian Phenotype Ontology (MP) is a structured vocabulary for describing mammalian phenotypes and serves as a critical tool for efficient annotation and comprehensive retrieval of phenotype data. Importantly, the ontology contains broad and specific terms, facilitating annotation of data from initial observations or screens and detailed data from subsequent experimental research. Using the ontology structure, data are retrieved inclusively, i.e., data annotated to chosen terms and to terms subordinate in the hierarchy. Thus, searching for "abnormal craniofacial morphology" also returns annotations to "megacephaly" and "microcephaly," more specific terms in the hierarchy path. The development and refinement of the MP is ongoing, with new terms and modifications to its organization undergoing continuous assessment as users and expert reviewers propose expansions and revisions. A wealth of phenotype data on mouse mutations and variants annotated to the MP already exists in the Mouse Genome Informatics database. These data, along with data curated to the MP by many mouse mutagenesis programs and mouse repositories, provide a platform for comparative analyses and correlative discoveries. The MP provides a standard underpinning to mouse phenotype descriptions for existing and future experimental and large-scale phenotyping projects. In this review we describe the MP as it presently exists, its application to phenotype annotations, the relationship of the MP to other ontologies, and the integration of the MP within large-scale phenotyping projects. Finally, we discuss future application of the MP in providing standard descriptors of the phenotype pipeline test results from the International Mouse Phenotyping Consortium projects.
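The inclusive retrieval described in this abstract can be illustrated with a minimal Python sketch. It is not MGI's implementation. The three term names are taken from the abstract, but the direct parent-child links are a simplification of the real MP hierarchy, and the root label, the "unrelated phenotype term" entry, and the genotype labels are hypothetical placeholders.

```python
from collections import defaultdict

# Toy is_a edges (child -> parents). Simplified and illustrative only;
# the real MP places intermediate terms between these nodes.
IS_A = {
    "megacephaly": ["abnormal craniofacial morphology"],
    "microcephaly": ["abnormal craniofacial morphology"],
    "abnormal craniofacial morphology": ["mammalian phenotype"],
    "unrelated phenotype term": ["mammalian phenotype"],  # placeholder
}

# Hypothetical annotations: genotype label -> directly annotated terms.
ANNOTATIONS = {
    "genotype-A": ["megacephaly"],
    "genotype-B": ["microcephaly"],
    "genotype-C": ["abnormal craniofacial morphology"],
    "genotype-D": ["unrelated phenotype term"],
}


def descendants(term, is_a):
    """Return every term below `term` in the is_a hierarchy."""
    # Invert child -> parents into parent -> children.
    children = defaultdict(set)
    for child, parents in is_a.items():
        for parent in parents:
            children[parent].add(child)
    # Depth-first walk; `seen` guards against revisiting shared subterms.
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def inclusive_search(term, is_a, annotations):
    """Return subjects annotated to `term` or to any of its descendants."""
    wanted = descendants(term, is_a) | {term}
    return sorted(
        subject
        for subject, terms in annotations.items()
        if wanted.intersection(terms)
    )


if __name__ == "__main__":
    print(inclusive_search("abnormal craniofacial morphology", IS_A, ANNOTATIONS))
    print(inclusive_search("megacephaly", IS_A, ANNOTATIONS))
```

In this toy data set, a query for the broad term also picks up the genotypes annotated only to the two more specific terms, while a query for "megacephaly" returns just the genotype annotated to it.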
|
35
|
Disease model curation improvements at Mouse Genome Informatics. Database: The Journal of Biological Databases and Curation 2012; 2012:bar063. [PMID: 22434831 PMCID: PMC3308153 DOI: 10.1093/database/bar063] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/15/2022]
Abstract
Optimal curation of human diseases requires an ontology or structured vocabulary that contains terms familiar to end users, is robust enough to support multiple levels of annotation granularity, is limited to disease terms and is stable enough to avoid extensive reannotation following updates. At Mouse Genome Informatics (MGI), we currently use disease terms from Online Mendelian Inheritance in Man (OMIM) to curate mouse models of human disease. While OMIM provides highly detailed disease records that are familiar to many in the medical community, it lacks structure to support multilevel annotation. To improve disease annotation at MGI, we evaluated the merged Medical Subject Headings (MeSH) and OMIM disease vocabulary created by the Comparative Toxicogenomics Database (CTD) project. Overlaying MeSH onto OMIM provides hierarchical access to broad disease terms, a feature missing from the OMIM. We created an extended version of the vocabulary to meet the genetic disease-specific curation needs at MGI. Here we describe our evaluation of the CTD application, the extensions made by MGI and discuss the strengths and weaknesses of this approach. Database URL:http://www.informatics.jax.org/
|
36
|
The Mouse Genome Database (MGD): comprehensive resource for genetics and genomics of the laboratory mouse. Nucleic Acids Res 2011; 40:D881-6. [PMID: 22075990 PMCID: PMC3245042 DOI: 10.1093/nar/gkr974] [Citation(s) in RCA: 216] [Impact Index Per Article: 16.6] [Indexed: 02/06/2023] Open
Abstract
The Mouse Genome Database (MGD, http://www.informatics.jax.org) is the international community resource for integrated genetic, genomic and biological data about the laboratory mouse. Data in MGD are obtained through loads from major data providers and experimental consortia, electronic submissions from laboratories and from the biomedical literature. MGD maintains a comprehensive, unified, non-redundant catalog of mouse genome features generated by distilling gene predictions from NCBI, Ensembl and VEGA. MGD serves as the authoritative source for the nomenclature of mouse genes, mutations, alleles and strains. MGD is the primary source for evidence-supported functional annotations for mouse genes and gene products using the Gene Ontology (GO). MGD provides full annotation of phenotypes and human disease associations for mouse models (genotypes) using terms from the Mammalian Phenotype Ontology and disease names from the Online Mendelian Inheritance in Man (OMIM) resource. MGD is freely accessible online through our website, where users can browse and search interactively, access data in bulk using Batch Query or BioMart, download data files or use our web services Application Programming Interface (API). Improvements to MGD include expanded genome feature classifications, inclusion of new mutant allele sets and phenotype associations and extensions of GO to include new relationships and a new stream of annotations via phylogenetic-based approaches.
|
37
|
Establishing a Cre Tool Resource for Conditional Mutagenesis, www.creportal.org. Biol Reprod 2011. [DOI: 10.1093/biolreprod/85.s1.26] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
|
38
|
The Mouse Tumor Biology Database (MTB): a central electronic resource for locating and integrating mouse tumor pathology data. Vet Pathol 2011; 49:218-23. [PMID: 21282667 DOI: 10.1177/0300985810395726] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 11/17/2022]
Abstract
The Mouse Tumor Biology Database (MTB) is designed to provide an electronic data storage, search, and analysis system for information on mouse models of human cancer. The MTB includes data on tumor frequency and latency, strain, germ line, and somatic genetics, pathologic notations, and photomicrographs. The MTB collects data from the primary literature, other public databases, and direct submissions from the scientific community. The MTB is a community resource that provides integrated access to mouse tumor data from different scientific research areas and facilitates integration of molecular, genetic, and pathologic data. Current status of MTB, search capabilities, data types, and future enhancements are described in this article.
|
39
|
Towards BioDBcore: a community-defined information specification for biological databases. Database: The Journal of Biological Databases and Curation 2011; 2011:baq027. [PMID: 21205783 PMCID: PMC3017395 DOI: 10.1093/database/baq027] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 12/05/2022]
Abstract
The present article proposes the adoption of a community-defined, uniform, generic description of the core attributes of biological databases, BioDBCore. The goals of these attributes are to provide a general overview of the database landscape, to encourage consistency and interoperability between resources, and to promote the use of semantic and syntactic standards. BioDBCore will make it easier for users to evaluate the scope and relevance of available resources. This new resource will increase the collective impact of the information present in biological databases.
|
40
|
Mouse mutants and phenotypes: accessing information for the study of mammalian gene function. Methods 2010; 53:405-10. [PMID: 21185380 DOI: 10.1016/j.ymeth.2010.12.024] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 09/30/2010] [Revised: 12/01/2010] [Accepted: 12/17/2010] [Indexed: 02/02/2023] Open
Abstract
Recent advances in high-throughput gene targeting and conditional mutagenesis are creating new and powerful resources to study the in vivo function of mammalian genes using the mouse as an experimental model. Mutant ES cells and mice are being generated at a rapid rate to study the molecular and phenotypic consequences of genetic mutations, and to correlate these study results with human disease conditions. Likewise, classical genetics approaches to identify mutations in the mouse genome that cause specific phenotypes have become more effective. Here, we describe methods to quickly obtain information on what mutant ES cells and mice are available, including recombinase driver lines for the generation of conditional mutants. Further, we describe means to access genetic and phenotypic data that identify mouse models for specific human diseases.
|
41
|
Abstract
The present article proposes the adoption of a community-defined, uniform, generic description of the core attributes of biological databases, BioDBCore. The goals of these attributes are to provide a general overview of the database landscape, to encourage consistency and interoperability between resources and to promote the use of semantic and syntactic standards. BioDBCore will make it easier for users to evaluate the scope and relevance of available resources. This new resource will increase the collective impact of the information present in biological databases.
|
42
|
Abstract
The Gene Expression Database (GXD) is a community resource of mouse developmental expression information. GXD integrates different types of expression data at the transcript and protein level and captures expression information from many different mouse strains and mutants. GXD places these data in the larger biological context through integration with other Mouse Genome Informatics (MGI) resources and interconnections with many other databases. Web-based query forms support simple or complex searches that take advantage of all these integrated data. The data in GXD are obtained from the literature, from individual laboratories, and from large-scale data providers. All data are annotated and reviewed by GXD curators. Since the last report, the GXD data content has increased significantly, the interface and data displays have been improved, new querying capabilities were implemented, and links to other expression resources were added. GXD is available through the MGI web site (www.informatics.jax.org), or directly at www.informatics.jax.org/expression.shtml.
|
43
|
The Mouse Genome Database (MGD): premier model organism resource for mammalian genomics and genetics. Nucleic Acids Res 2010; 39:D842-8. [PMID: 21051359 PMCID: PMC3013640 DOI: 10.1093/nar/gkq1008] [Citation(s) in RCA: 197] [Impact Index Per Article: 14.1] [Indexed: 12/13/2022] Open
Abstract
The Mouse Genome Database (MGD) is the community model organism database for the laboratory mouse and the authoritative source for phenotype and functional annotations of mouse genes. MGD includes a complete catalog of mouse genes and genome features with integrated access to genetic, genomic and phenotypic information, all serving to further the use of the mouse as a model system for studying human biology and disease. MGD is a major component of the Mouse Genome Informatics (MGI, http://www.informatics.jax.org/) resource. MGD contains standardized descriptions of mouse phenotypes, associations between mouse models and human genetic diseases, extensive integration of DNA and protein sequence data, and normalized representation of genome and genome variant information. Data are obtained and integrated via manual curation of the biomedical literature, direct contributions from individual investigators and downloads from major informatics resource centers. MGD collaborates with the bioinformatics community on the development and use of biomedical ontologies such as the Gene Ontology (GO) and the Mammalian Phenotype (MP) Ontology. Major improvements to the Mouse Genome Database include a comprehensive update of genetic maps, implementation of new classification terms for genome features, development of a recombinase (cre) portal and inclusion of all alleles generated by the International Knockout Mouse Consortium (IKMC).
|
44
|
A New Tool for Reproductive Biology: Exploring Recombinases for Conditional Mutagenesis. Biol Reprod 2010. [DOI: 10.1093/biolreprod/83.s1.376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
|
45
|
The IKMC web portal: a central point of entry to data and resources from the International Knockout Mouse Consortium. Nucleic Acids Res 2010; 39:D849-55. [PMID: 20929875 PMCID: PMC3013768 DOI: 10.1093/nar/gkq879] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.6] [Indexed: 01/17/2023] Open
Abstract
The International Knockout Mouse Consortium (IKMC) aims to mutate all protein-coding genes in the mouse using a combination of gene targeting and gene trapping in mouse embryonic stem (ES) cells and to make the generated resources readily available to the research community. The IKMC database and web portal (www.knockoutmouse.org) serves as the central public web site for IKMC data and facilitates the coordination and prioritization of work within the consortium. Researchers can access up-to-date information on IKMC knockout vectors, ES cells and mice for specific genes, and follow links to the respective repositories from which corresponding IKMC products can be ordered. Researchers can also use the web site to nominate genes for targeting, or to indicate that targeting of a gene should receive high priority. The IKMC database provides data to, and features extensive interconnections with, other community databases.
|
46
|
The mammalian phenotype ontology: enabling robust annotation and comparative analysis. Wiley Interdisciplinary Reviews: Systems Biology and Medicine 2009; 1:390-399. [PMID: 20052305 PMCID: PMC2801442 DOI: 10.1002/wsbm.44] [Citation(s) in RCA: 217] [Impact Index Per Article: 14.5] [Indexed: 11/08/2022]
Abstract
The mouse has long been an important model for the study of human genetic disease. Through the application of genetic engineering and mutagenesis techniques, the number of unique mutant mouse models and the amount of phenotypic data describing them are growing exponentially. Describing phenotypes of mutant mice in a computationally useful manner that will facilitate data mining is a major challenge for bioinformatics. Here we describe a tool, the Mammalian Phenotype Ontology (MP), for classifying and organizing phenotypic information related to the mouse and other mammalian species. The MP Ontology has been applied to mouse phenotype descriptions in the Mouse Genome Informatics Database (MGI, http://www.informatics.jax.org/), the Rat Genome Database (RGD, http://rgd.mcw.edu), the Online Mendelian Inheritance in Animals (OMIA, http://omia.angis.org.au/) and elsewhere. Use of this ontology allows comparisons of data from diverse sources, can facilitate comparisons across mammalian species, assists in identifying appropriate experimental disease models, and aids in the discovery of candidate disease genes and molecular signaling pathways.
|
47
|
Abstract
The Mouse Genome Database (MGD) is a major component of the Mouse Genome Informatics (MGI, http://www.informatics.jax.org/) database resource and serves as the primary community model organism database for the laboratory mouse. MGD is the authoritative source for mouse gene, allele and strain nomenclature and for phenotype and functional annotations of mouse genes. MGD contains comprehensive data and information related to mouse genes and their functions, standardized descriptions of mouse phenotypes, extensive integration of DNA and protein sequence data, and normalized representation of genome and genome variant information, including comparative data on mammalian genes. Data for MGD are obtained from diverse sources including manual curation of the biomedical literature and direct contributions from individual investigators' laboratories and major informatics resource centers, such as Ensembl, UniProt and NCBI. MGD collaborates with the bioinformatics community on the development and use of biomedical ontologies such as the Gene Ontology and the Mammalian Phenotype Ontology. Recent improvements in MGD described here include integration of mouse gene trap allele and sequence data, integration of gene targeting information from the International Knockout Mouse Consortium, deployment of an MGI Biomart, and enhancements to our batch query capability for customized data access and retrieval.
|
48
|
One Source for Comprehensive Data for Mouse Biology: The Mouse Genome Informatics (MGI) Resource. Biol Reprod 2009. [DOI: 10.1093/biolreprod/81.s1.220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
|
49
|
Abstract
The Mouse Genome Database (MGD, http://www.informatics.jax.org/) integrates genetic, genomic and phenotypic information about the laboratory mouse, a primary animal model for studying human biology and disease. Information in MGD is obtained from diverse sources, including the scientific literature and external databases, such as EntrezGene, UniProt and GenBank. In addition to its extensive collection of phenotypic allele information for mouse genes that is curated from the published biomedical literature and researcher submission, MGI includes a comprehensive representation of mouse genes including sequence, functional (GO) and comparative information. MGD provides a data mining platform that enables the development of translational research hypotheses based on comparative genotype, phenotype and functional analyses. MGI can be accessed by a variety of methods including web-based search forms, a genome sequence browser and downloadable database reports. Programmatic access is available using web services. Recent improvements in MGD described here include the unified mouse gene catalog for NCBI Build 37 of the reference genome assembly, and improved representation of mouse mutants and phenotypes.
|
50
|
Abstract
The laboratory mouse has long been an important tool in the study of the biology and genetics of human cancer. With the advent of genetic engineering techniques, DNA microarray analyses, tissue arrays and other large-scale, high-throughput data generating methods, the amount of data available for mouse models of cancer is growing exponentially. Tools to integrate, locate and visualize these data are crucial to aid researchers in their investigations. The Mouse Tumor Biology database (http://tumor.informatics.jax.org) seeks to address that need.
|