1
Belizário JE. The humankind genome: from genetic diversity to the origin of human diseases. Genome 2014; 56:705-16. [PMID: 24433206 DOI: 10.1139/gen-2013-0125] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 02/06/2023]
Abstract
Genome-wide association studies have failed to establish common variant risk for the majority of common human diseases. The underlying reasons for this failure are explained by recent studies of resequencing and comparison of over 1200 human genomes and 10 000 exomes, together with the delineation of DNA methylation patterns (epigenome) and full characterization of coding and noncoding RNAs (transcriptome) being transcribed. These studies have provided the most comprehensive catalogues of functional elements and genetic variants that are now available for global integrative analysis and experimental validation in prospective cohort studies. With these datasets, researchers will have unparalleled opportunities for the alignment, mining, and testing of hypotheses for the roles of specific genetic variants, including copy number variations, single nucleotide polymorphisms, and indels as the cause of specific phenotypes and diseases. Through the use of next-generation sequencing technologies for genotyping and standardized ontological annotation to systematically analyze the effects of genomic variation on humans and model organism phenotypes, we will be able to find candidate genes and new clues for disease's etiology and treatment. This article describes essential concepts in genetics and genomic technologies as well as the emerging computational framework to comprehensively search websites and platforms available for the analysis and interpretation of genomic data.
Affiliation(s)
- Jose E Belizário
- Departamento de Farmacologia, Instituto de Ciências Biomédicas da Universidade de São Paulo, Avenida Lineu Prestes, 1524 CEP 05508-900, São Paulo, SP, Brazil
2
Hancock JM. Commentary on Shimoyama et al. (2012): three ontologies to define phenotype measurement data. Front Genet 2014; 5:93. [PMID: 24795755 PMCID: PMC4006037 DOI: 10.3389/fgene.2014.00093] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/28/2014] [Accepted: 04/03/2014] [Indexed: 01/17/2023] [Open Access]
Affiliation(s)
- John M Hancock
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
3
Hancock JM. Editorial: biological ontologies and semantic biology. Front Genet 2014; 5:18. [PMID: 24550936 PMCID: PMC3912459 DOI: 10.3389/fgene.2014.00018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/09/2014] [Accepted: 01/21/2014] [Indexed: 01/22/2023] [Open Access]
Affiliation(s)
- John M Hancock
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
4
Köhler S, Doelken SC, Ruef BJ, Bauer S, Washington N, Westerfield M, Gkoutos G, Schofield P, Smedley D, Lewis SE, Robinson PN, Mungall CJ. Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research. F1000Res 2013; 2:30. [PMID: 24358873 DOI: 10.12688/f1000research.2-30.v1] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.8] [Accepted: 01/22/2013] [Indexed: 12/30/2022] [Open Access]
Abstract
Phenotype analyses, e.g. investigating metabolic processes, tissue formation, or organism behavior, are an important element of most biological and medical research activities. Biomedical researchers are making increased use of ontological standards and methods to capture the results of such analyses, with one focus being the comparison and analysis of phenotype information between species. We have generated a cross-species phenotype ontology for human, mouse and zebrafish that contains classes from the Human Phenotype Ontology, Mammalian Phenotype Ontology, and generated classes for zebrafish phenotypes. We also provide up-to-date annotation data connecting human genes to phenotype classes from the generated ontology. We have included the data generation pipeline into our continuous integration system ensuring stable and up-to-date releases. This article describes the data generation process and is intended to help interested researchers access both the phenotype annotation data and the associated cross-species phenotype ontology. The resource described here can be used in sophisticated semantic similarity and gene set enrichment analyses for phenotype data across species. The stable releases of this resource can be obtained from http://purl.obolibrary.org/obo/hp/uberpheno/.
Affiliation(s)
- Sebastian Köhler
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, 13353, Germany; Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Berlin, 13353, Germany
- Sandra C Doelken
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, 13353, Germany
- Barbara J Ruef
- ZFIN, Institute of Neuroscience, University of Oregon, Eugene OR, 97403-5291, USA
- Sebastian Bauer
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, 13353, Germany
- Monte Westerfield
- ZFIN, Institute of Neuroscience, University of Oregon, Eugene OR, 97403-5291, USA
- George Gkoutos
- Department of Computer Science, University of Aberystwyth, Aberystwyth, SY23 2AX, UK
- Paul Schofield
- Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, UK
- Damian Smedley
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK
- Suzanna E Lewis
- Lawrence Berkeley National Laboratory, Berkeley CA, 94720, USA
- Peter N Robinson
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, 13353, Germany; Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Berlin, 13353, Germany; Max Planck Institute for Molecular Genetics, Berlin, 14195, Germany
5
Köhler S, Doelken SC, Ruef BJ, Bauer S, Washington N, Westerfield M, Gkoutos G, Schofield P, Smedley D, Lewis SE, Robinson PN, Mungall CJ. Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research. F1000Res 2013; 2:30. [PMID: 24358873 PMCID: PMC3799545 DOI: 10.12688/f1000research.2-30.v2] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Accepted: 01/20/2014] [Indexed: 12/11/2022] [Open Access]
Abstract
Phenotype analyses, e.g. investigating metabolic processes, tissue formation, or organism behavior, are an important element of most biological and medical research activities. Biomedical researchers are making increased use of ontological standards and methods to capture the results of such analyses, with one focus being the comparison and analysis of phenotype information between species. We have generated a cross-species phenotype ontology for human, mouse and zebrafish that contains classes from the Human Phenotype Ontology, Mammalian Phenotype Ontology, and generated classes for zebrafish phenotypes. We also provide up-to-date annotation data connecting human genes to phenotype classes from the generated ontology. We have included the data generation pipeline into our continuous integration system ensuring stable and up-to-date releases. This article describes the data generation process and is intended to help interested researchers access both the phenotype annotation data and the associated cross-species phenotype ontology. The resource described here can be used in sophisticated semantic similarity and gene set enrichment analyses for phenotype data across species. The stable releases of this resource can be obtained from
http://purl.obolibrary.org/obo/hp/uberpheno/.
Affiliation(s)
- Sebastian Köhler
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, 13353, Germany; Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Berlin, 13353, Germany
- Sandra C Doelken
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, 13353, Germany
- Barbara J Ruef
- ZFIN, Institute of Neuroscience, University of Oregon, Eugene OR, 97403-5291, USA
- Sebastian Bauer
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, 13353, Germany
- Monte Westerfield
- ZFIN, Institute of Neuroscience, University of Oregon, Eugene OR, 97403-5291, USA
- George Gkoutos
- Department of Computer Science, University of Aberystwyth, Aberystwyth, SY23 2AX, UK
- Paul Schofield
- Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, UK
- Damian Smedley
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK
- Suzanna E Lewis
- Lawrence Berkeley National Laboratory, Berkeley CA, 94720, USA
- Peter N Robinson
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, 13353, Germany; Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Berlin, 13353, Germany; Max Planck Institute for Molecular Genetics, Berlin, 14195, Germany
6
Beck T, Free RC, Thorisson GA, Brookes AJ. Semantically enabling a genome-wide association study database. J Biomed Semantics 2012; 3:9. [PMID: 23244533 PMCID: PMC3579732 DOI: 10.1186/2041-1480-3-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 04/03/2012] [Accepted: 08/22/2012] [Indexed: 01/03/2023] [Open Access]
Abstract
Background: The amount of data generated from genome-wide association studies (GWAS) has grown rapidly, but considerations for GWAS phenotype data reuse and interchange have not kept pace. This impacts on the work of GWAS Central – a free and open access resource for the advanced querying and comparison of summary-level genetic association data. The benefits of employing ontologies for standardising and structuring data are widely accepted. The complex spectrum of observed human phenotypes (and traits), and the requirement for cross-species phenotype comparisons, calls for reflection on the most appropriate solution for the organisation of human phenotype data. The Semantic Web provides standards for the possibility of further integration of GWAS data and the ability to contribute to the web of Linked Data.
Results: A pragmatic consideration when applying phenotype ontologies to GWAS data is the ability to retrieve all data, at the most granular level possible, from querying a single ontology graph. We found the Medical Subject Headings (MeSH) terminology suitable for describing all traits (diseases and medical signs and symptoms) at various levels of granularity and the Human Phenotype Ontology (HPO) most suitable for describing phenotypic abnormalities (medical signs and symptoms) at the most granular level. Diseases within MeSH are mapped to HPO to infer the phenotypic abnormalities associated with diseases. Building on the rich semantic phenotype annotation layer, we are able to make cross-species phenotype comparisons and publish a core subset of GWAS data as RDF nanopublications.
Conclusions: We present a methodology for applying phenotype annotations to a comprehensive genome-wide association dataset and for ensuring compatibility with the Semantic Web. The annotations are used to assist with cross-species genotype and phenotype comparisons. However, further processing and deconstructions of terms may be required to facilitate automatic phenotype comparisons. The provision of GWAS nanopublications enables a new dimension for exploring GWAS data, by way of intrinsic links to related data resources within the Linked Data web. The value of such annotation and integration will grow as more biomedical resources adopt the standards of the Semantic Web.
Affiliation(s)
- Tim Beck
- Department of Genetics, University of Leicester, University Road, Leicester, UK.
7
Gkoutos GV, Hoehndorf R. Ontology-based cross-species integration and analysis of Saccharomyces cerevisiae phenotypes. J Biomed Semantics 2012; 3 Suppl 2:S6. [PMID: 23046642 PMCID: PMC3448529 DOI: 10.1186/2041-1480-3-s2-s6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/27/2023] [Open Access]
Abstract
Ontologies are widely used in the biomedical community for annotation and integration of databases. Formal definitions can relate classes from different ontologies and thereby integrate data across different levels of granularity, domains and species. We have applied this methodology to the Ascomycete Phenotype Ontology (APO), enabling the reuse of various orthogonal ontologies and we have converted the phenotype associated data found in the SGD following our proposed patterns. We have integrated the resulting data in the cross-species phenotype network PhenomeNET, and we make both the cross-species integration of yeast phenotypes and a similarity-based comparison of yeast phenotypes across species available in the PhenomeBrowser. Furthermore, we utilize our definitions and the yeast phenotype annotations to suggest novel functional annotations of gene products in yeast.
Affiliation(s)
- Georgios V Gkoutos
- Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK.
8
Vik JO, Gjuvsland AB, Li L, Tøndel K, Niederer S, Smith NP, Hunter PJ, Omholt SW. Genotype-Phenotype Map Characteristics of an In silico Heart Cell. Front Physiol 2011; 2:106. [PMID: 22232604 PMCID: PMC3246639 DOI: 10.3389/fphys.2011.00106] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 09/08/2011] [Accepted: 12/05/2011] [Indexed: 11/22/2022] [Open Access]
Abstract
Understanding the causal chain from genotypic to phenotypic variation is a tremendous challenge with huge implications for personalized medicine. Here we argue that linking computational physiology to genetic concepts, methodology, and data provides a new framework for this endeavor. We exemplify this causally cohesive genotype–phenotype (cGP) modeling approach using a detailed mathematical model of a heart cell. In silico genetic variation is mapped to parametric variation, which propagates through the physiological model to generate multivariate phenotypes for the action potential and calcium transient under regular pacing, and ion currents under voltage clamping. The resulting genotype-to-phenotype map is characterized using standard quantitative genetic methods and novel applications of high-dimensional data analysis. These analyses reveal many well-known genetic phenomena like intralocus dominance, interlocus epistasis, and varying degrees of phenotypic correlation. In particular, we observe penetrance features such as the masking/release of genetic variation, so that without any change in the regulatory anatomy of the model, traits may appear monogenic, oligogenic, or polygenic depending on which genotypic variation is actually present in the data. The results suggest that a cGP modeling approach may pave the way for a computational physiological genomics capable of generating biological insight about the genotype–phenotype relation in ways that statistical-genetic approaches cannot.
Affiliation(s)
- Jon Olav Vik
- Department of Mathematical Sciences and Technology, Centre for Integrative Genetics, Norwegian University of Life Sciences, Ås, Norway
9
Schulz MH, Köhler S, Bauer S, Robinson PN. Exact score distribution computation for ontological similarity searches. BMC Bioinformatics 2011; 12:441. [PMID: 22078312 PMCID: PMC3240574 DOI: 10.1186/1471-2105-12-441] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 05/20/2011] [Accepted: 11/12/2011] [Indexed: 11/10/2022] [Open Access]
Abstract
BACKGROUND: Semantic similarity searches in ontologies are an important component of many bioinformatic algorithms, e.g., finding functionally related proteins with the Gene Ontology or phenotypically similar diseases with the Human Phenotype Ontology (HPO). We have recently shown that the performance of semantic similarity searches can be improved by ranking results according to the probability of obtaining a given score at random rather than by the scores themselves. However, to date, there are no algorithms for computing the exact distribution of semantic similarity scores, which is necessary for computing the exact P-value of a given score.
RESULTS: In this paper we consider the exact computation of score distributions for similarity searches in ontologies, and introduce a simple null hypothesis which can be used to compute a P-value for the statistical significance of similarity scores. We concentrate on measures based on Resnik's definition of ontological similarity. A new algorithm is proposed that collapses subgraphs of the ontology graph and thereby allows fast score distribution computation. The new algorithm is several orders of magnitude faster than the naive approach, as we demonstrate by computing score distributions for similarity searches in the HPO. It is shown that exact P-value calculation improves clinical diagnosis using the HPO compared to approaches based on sampling.
CONCLUSIONS: The new algorithm enables for the first time exact P-value calculation via exact score distribution computation for ontology similarity searches. The approach is applicable to any ontology for which the annotation-propagation rule holds and can improve any bioinformatic method that makes only use of the raw similarity scores. The algorithm was implemented in Java, supports any ontology in OBO format, and is available for non-commercial and academic usage under: https://compbio.charite.de/svn/hpo/trunk/src/tools/significance/
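To make the terms in this abstract concrete, the following is a minimal sketch of Resnik similarity and of a P-value obtained by naive exhaustive enumeration. It is not the authors' subgraph-collapsing algorithm or their Java implementation. The toy ontology, the annotations, the averaging score, and the null model (every query of the same size over non-root terms is equally likely) are all assumptions chosen for illustration.

```python
import math
from itertools import combinations

# Hypothetical toy ontology (child -> parents); not taken from the HPO.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
    "B1": ["B"],
}

# Hypothetical annotations: item -> directly annotated terms.
ANNOTATIONS = {
    "d1": {"A1"},
    "d2": {"A2"},
    "d3": {"B1"},
    "d4": {"A1", "B1"},
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log(fraction of items annotated to t or one of its descendants).

    Annotations are propagated to all ancestors (annotation-propagation rule).
    Assumes every term ends up with at least one annotated item.
    """
    counts = {t: 0 for t in PARENTS}
    for terms in ANNOTATIONS.values():
        propagated = set().union(*(ancestors(t) for t in terms))
        for t in propagated:
            counts[t] += 1
    n = len(ANNOTATIONS)
    return {t: -math.log(c / n) for t, c in counts.items()}


IC = information_content()


def resnik(t1, t2):
    """Resnik similarity: IC of the most informative common ancestor."""
    return max(IC[t] for t in ancestors(t1) & ancestors(t2))


def score(query, target):
    """Average, over query terms, of the best Resnik match among target terms."""
    return sum(max(resnik(q, t) for t in target) for q in query) / len(query)


def exact_p_value(query, target):
    """Naive exact P-value: enumerate every query of the same size (assumed null)."""
    observed = score(query, target)
    terms = [t for t in PARENTS if t != "root"]
    scores = [score(q, target) for q in combinations(terms, len(query))]
    # Small tolerance guards against floating-point summation order effects.
    return sum(s >= observed - 1e-12 for s in scores) / len(scores)
```

Exhaustive enumeration grows combinatorially with ontology and query size, which is the cost that a faster exact method would need to avoid.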
Affiliation(s)
- Marcel H Schulz
- Max Planck Institute for Molecular Genetics, Ihnestr. 73, 14195 Berlin, Germany
- Ray and Stephanie Lane Center for Computational Biology, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, 15213 Pennsylvania, USA
- Sebastian Köhler
- Institute for Medical Genetics and Human Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
- Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
- Sebastian Bauer
- Institute for Medical Genetics and Human Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
- Peter N Robinson
- Max Planck Institute for Molecular Genetics, Ihnestr. 73, 14195 Berlin, Germany
- Institute for Medical Genetics and Human Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
- Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
10
Köhler S, Bauer S, Mungall CJ, Carletti G, Smith CL, Schofield P, Gkoutos GV, Robinson PN. Improving ontologies by automatic reasoning and evaluation of logical definitions. BMC Bioinformatics 2011; 12:418. [PMID: 22032770 PMCID: PMC3224779 DOI: 10.1186/1471-2105-12-418] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Received: 06/06/2011] [Accepted: 10/27/2011] [Indexed: 01/16/2023] [Open Access]
Abstract
Background: Ontologies are widely used to represent knowledge in biomedicine. Systematic approaches for detecting errors and disagreements are needed for large ontologies with hundreds or thousands of terms and semantic relationships. A recent approach of defining terms using logical definitions is now increasingly being adopted as a method for quality control as well as for facilitating interoperability and data integration.
Results: We show how automated reasoning over logical definitions of ontology terms can be used to improve ontology structure. We provide the Java software package GULO (Getting an Understanding of LOgical definitions), which allows fast and easy evaluation for any kind of logically decomposed ontology by generating a composite OWL ontology from appropriate subsets of the referenced ontologies and comparing the inferred relationships with the relationships asserted in the target ontology. As a case study we show how to use GULO to evaluate the logical definitions that have been developed for the Mammalian Phenotype Ontology (MPO).
Conclusions: Logical definitions of terms from biomedical ontologies represent an important resource for error and disagreement detection. GULO gives ontology curators a fast and simple tool for validation of their work.
Affiliation(s)
- Sebastian Köhler
- Institute for Medical Genetics and Human Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany.
11
Espinosa O, Hancock JM. A gene-phenotype network for the laboratory mouse and its implications for systematic phenotyping. PLoS One 2011; 6:e19693. [PMID: 21625554 PMCID: PMC3098258 DOI: 10.1371/journal.pone.0019693] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 01/14/2011] [Accepted: 04/11/2011] [Indexed: 01/22/2023] [Open Access]
Abstract
The laboratory mouse is the pre-eminent model organism for the dissection of human disease pathways. With the advent of a comprehensive panel of gene knockouts, projects to characterise the phenotypes of all knockout lines are being initiated. The range of genotype-phenotype associations can be represented using the Mammalian Phenotype ontology. Using publicly available data annotated with this ontology we have constructed gene and phenotype networks representing these associations. These networks show a scale-free, hierarchical and modular character and community structure. They also exhibit enrichment for gene coexpression, protein-protein interactions and Gene Ontology annotation similarity. Close association between gene communities and some high-level ontology terms suggests that systematic phenotyping can provide a direct insight into underlying pathways. However some phenotypes are distributed more diffusely across gene networks, likely reflecting the pleiotropic roles of many genes. Phenotype communities show a many-to-many relationship to human disease communities, but stronger overlap at more granular levels of description. This may suggest that systematic phenotyping projects should aim for high granularity annotations to maximise their relevance to human disease.
Affiliation(s)
- Octavio Espinosa
- Bioinformatics Group, MRC Mammalian Genetics Unit, Harwell, Oxfordshire, United Kingdom
- Department of Biochemistry, University of Oxford, Oxford, United Kingdom
- John M. Hancock
- Bioinformatics Group, MRC Mammalian Genetics Unit, Harwell, Oxfordshire, United Kingdom
12
Masuya H, Makita Y, Kobayashi N, Nishikata K, Yoshida Y, Mochizuki Y, Doi K, Takatsuki T, Waki K, Tanaka N, Ishii M, Matsushima A, Takahashi S, Hijikata A, Kozaki K, Furuichi T, Kawaji H, Wakana S, Nakamura Y, Yoshiki A, Murata T, Fukami-Kobayashi K, Mohan S, Ohara O, Hayashizaki Y, Mizoguchi R, Obata Y, Toyoda T. The RIKEN integrated database of mammals. Nucleic Acids Res 2010; 39:D861-70. [PMID: 21076152 PMCID: PMC3013680 DOI: 10.1093/nar/gkq1078] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 12/15/2022] [Open Access]
Abstract
The RIKEN integrated database of mammals (http://scinets.org/db/mammal) is the official undertaking to integrate its mammalian databases produced from multiple large-scale programs that have been promoted by the institute. The database integrates not only RIKEN's original databases, such as FANTOM, the ENU mutagenesis program, the RIKEN Cerebellar Development Transcriptome Database and the Bioresource Database, but also imported data from public databases, such as Ensembl, MGI and biomedical ontologies. Our integrated database has been implemented on the infrastructure of publication medium for databases, termed SciNetS/SciNeS, or the Scientists' Networking System, where the data and metadata are structured as a semantic web and are downloadable in various standardized formats. The top-level ontology-based implementation of mammal-related data directly integrates the representative knowledge and individual data records in existing databases to ensure advanced cross-database searches and reduced unevenness of the data management operations. Through the development of this database, we propose a novel methodology for the development of standardized comprehensive management of heterogeneous data sets in multiple databases to improve the sustainability, accessibility, utility and publicity of the data of biomedical information.
13
Schofield PN, Gruenberger M, Sundberg JP. Pathbase and the MPATH ontology. Community resources for mouse histopathology. Vet Pathol 2010; 47:1016-20. [PMID: 20587689 PMCID: PMC3038412 DOI: 10.1177/0300985810374845] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Indexed: 11/15/2022]
Abstract
Pathbase, the database of mouse histopathology images, was developed as a resource to provide free access to representative images of lesions in background and mutant strains of laboratory mice. When utilized with diagnostic workups or phenotyping of mutant mice, it can provide a "virtual second opinion" for those working without access to groups of experienced pathologists. This is a community resource, and it facilitates the sharing of expertise and data among members of the pathology community worldwide. MPATH-the mouse pathology ontology-was developed alongside Pathbase for the annotation of images and now represents an important resource for the coding of diagnoses, permitting sophisticated data retrieval and computational analysis of mouse phenotypes. In this article, the structure and use of MPATH is discussed, along with current and future challenges for the coding of mutant mouse phenotypes.
Affiliation(s)
- P N Schofield
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
14
Abstract
A standardized, controlled vocabulary allows phenotypic information to be described in an unambiguous fashion in medical publications and databases. The Human Phenotype Ontology (HPO) is being developed in an effort to provide such a vocabulary. The use of an ontology to capture phenotypic information allows the use of computational algorithms that exploit semantic similarity between related phenotypic abnormalities to define phenotypic similarity metrics, which can be used to perform database searches for clinical diagnostics or as a basis for incorporating the human phenome into large-scale computational analysis of gene expression patterns and other cellular phenomena associated with human disease. The HPO is freely available at http://www.human-phenotype-ontology.org.
Affiliation(s)
- P N Robinson
- Institute for Medical Genetics, Augustenburger Platz 1, 13353 Berlin, Germany.
15
Bodenreider O, Burgun A. A framework for comparing phenotype annotations of orthologous genes. Stud Health Technol Inform 2010; 160:1309-13. [PMID: 20841896 PMCID: PMC4300101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2023]
Abstract
OBJECTIVES: Animal models are a key resource for the investigation of human diseases. In contrast to functional annotation, phenotype annotation is less standard, and comparing phenotypes across species remains challenging. The objective of this paper is to propose a framework for comparing phenotype annotations of orthologous genes based on the Medical Subject Headings (MeSH) indexing of biomedical articles in which these genes are discussed.
METHODS: 17,769 pairs of orthologous genes (mouse and human) are downloaded from the Mouse Genome Informatics (MGI) system and linked to biomedical articles through Entrez Gene. MeSH index terms corresponding to diseases are extracted from Medline.
RESULTS: 11,111 pairs of genes exhibited at least one phenotype annotation for each gene in the pair. Among these, 81% have at least one phenotype annotation in common, 80% have at least one annotation specific to the human gene and 84% have at least one annotation specific to the mouse gene. Four disease categories represent 54% of all phenotype annotations.
CONCLUSIONS: This framework supports the curation of phenotype annotation and the generation of research hypotheses based on comparative studies.
Affiliation(s)
- Anita Burgun
- Division INSERM U936, School of Medicine, University of Rennes 1, IFR 140, Rennes, France
16
Morgan H, Beck T, Blake A, Gates H, Adams N, Debouzy G, Leblanc S, Lengger C, Maier H, Melvin D, Meziane H, Richardson D, Wells S, White J, Wood J, de Angelis MH, Brown SDM, Hancock JM, Mallon AM. EuroPhenome: a repository for high-throughput mouse phenotyping data. Nucleic Acids Res 2009; 38:D577-85. [PMID: 19933761 PMCID: PMC2808931 DOI: 10.1093/nar/gkp1007] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.6] [Indexed: 12/17/2022] [Open Access]
Abstract
The broad aim of biomedical science in the postgenomic era is to link genomic and phenotype information to allow deeper understanding of the processes leading from genomic changes to altered phenotype and disease. The EuroPhenome project (http://www.EuroPhenome.org) is a comprehensive resource for raw and annotated high-throughput phenotyping data arising from projects such as EUMODIC. EUMODIC is gathering data from the EMPReSSslim pipeline (http://www.empress.har.mrc.ac.uk/) which is performed on inbred mouse strains and knock-out lines arising from the EUCOMM project. The EuroPhenome interface allows the user to access the data via the phenotype or genotype. It also allows the user to access the data in a variety of ways, including graphical display, statistical analysis and access to the raw data via web services. The raw phenotyping data captured in EuroPhenome is annotated by an annotation pipeline which automatically identifies statistically different mutants from the appropriate baseline and assigns ontology terms for that specific test. Mutant phenotypes can be quickly identified using two EuroPhenome tools: PhenoMap, a graphical representation of statistically relevant phenotypes, and mining for a mutant using ontology terms. To assist with data definition and cross-database comparisons, phenotype data is annotated using combinations of terms from biological ontologies.
Affiliation(s)
- Hugh Morgan
- MRC Harwell, Mammalian Genetics Unit, MRC Harwell, Mary Lyon Centre, Harwell Science and Innovation Campus, Oxfordshire OX11 0RD, UK