1
Porto DS, Dahdul WM, Lapp H, Balhoff JP, Vision TJ, Mabee PM, Uyeda J. Assessing Bayesian Phylogenetic Information Content of Morphological Data Using Knowledge from Anatomy Ontologies. Syst Biol 2022; 71:1290-1306. [PMID: 35285502 PMCID: PMC9558846 DOI: 10.1093/sysbio/syac022] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/03/2021] [Revised: 02/09/2022] [Accepted: 03/05/2022] [Indexed: 11/18/2022] Open
Abstract
Morphology remains a primary source of phylogenetic information for many groups of organisms, and the only one for most fossil taxa. Organismal anatomy is not a collection of randomly assembled and independent “parts”, but instead a set of dependent and hierarchically nested entities resulting from ontogeny and phylogeny. How do we make sense of these dependent and at times redundant characters? One promising approach is using ontologies—structured controlled vocabularies that summarize knowledge about different properties of anatomical entities, including developmental and structural dependencies. Here, we assess whether evolutionary patterns can explain the proximity of ontology-annotated characters within an ontology. To do so, we measure phylogenetic information across characters and evaluate if it matches the hierarchical structure given by ontological knowledge—in much the same way as across-species diversity structure is given by phylogeny. We implement an approach to evaluate the Bayesian phylogenetic information (BPI) content and phylogenetic dissonance among ontology-annotated anatomical data subsets. We applied this to data sets representing two disparate animal groups: bees (Hexapoda: Hymenoptera: Apoidea, 209 chars) and characiform fishes (Actinopterygii: Ostariophysi: Characiformes, 463 chars). For bees, we find that BPI is not substantially explained by anatomy since dissonance is often high among morphologically related anatomical entities. For fishes, we find substantial information for two clusters of anatomical entities instantiating concepts from the jaws and branchial arch bones, but among-subset information decreases and dissonance increases substantially moving to higher-level subsets in the ontology. We further applied our approach to address particular evolutionary hypotheses with an example of morphological evolution in miniature fishes. 
While we show that phylogenetic information does match ontology structure for some anatomical entities, additional relationships and processes, such as convergence, likely play a substantial role in explaining BPI and dissonance, and merit future investigation. Our work demonstrates how complex morphological data sets can be interrogated with ontologies by allowing one to access how information is spread hierarchically across anatomical concepts, how congruent this information is, and what sorts of processes may play a role in explaining it: phylogeny, development, or convergence. [Apidae; Bayesian phylogenetic information; Ostariophysi; Phenoscape; phylogenetic dissonance; semantic similarity.]
Affiliation(s)
- Diego S Porto
  - Department of Biological Sciences, Virginia Polytechnic Institute and State University, 926 West Campus Drive, Blacksburg, VA 24061, USA
- Wasila M Dahdul
  - UCI Libraries, University of California, Irvine, Irvine, CA 92623, USA
  - Department of Biology, University of South Dakota, 414 East Clark Street, Vermillion, SD 57069, USA
- Hilmar Lapp
  - Center for Genomic and Computational Biology, Duke University, 101 Science Drive, Durham, NC 27708, USA
- James P Balhoff
  - Renaissance Computing Institute, University of North Carolina, 100 Europa Drive, Suite 540, Chapel Hill, NC 27517, USA
- Todd J Vision
  - Department of Biology and School of Information and Library Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Paula M Mabee
  - Department of Biology, University of South Dakota, 414 East Clark Street, Vermillion, SD 57069, USA
  - Battelle, National Ecological Observatory Network, Boulder, CO 80301, USA
- Josef Uyeda
  - Department of Biological Sciences, Virginia Polytechnic Institute and State University, 926 West Campus Drive, Blacksburg, VA 24061, USA
2
Sokoloff DD, Remizowa MV. The use of plant ontologies in comparative and evolutionary studies should be flexible. American Journal of Botany 2021; 108:909-911. [PMID: 34157126 DOI: 10.1002/ajb2.1692] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/06/2021] [Accepted: 03/10/2021] [Indexed: 06/13/2023]
Affiliation(s)
- Dmitry D Sokoloff
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, 119234, Russia
- Margarita V Remizowa
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, 119234, Russia
3
Mabee PM, Balhoff JP, Dahdul WM, Lapp H, Mungall CJ, Vision TJ. A Logical Model of Homology for Comparative Biology. Syst Biol 2020; 69:345-362. [PMID: 31596473 PMCID: PMC7672696 DOI: 10.1093/sysbio/syz067] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/09/2019] [Revised: 09/20/2019] [Accepted: 09/26/2019] [Indexed: 01/09/2023] Open
Abstract
There is a growing body of research on the evolution of anatomy in a wide variety of organisms. Discoveries in this field could be greatly accelerated by computational methods and resources that enable these findings to be compared across different studies and different organisms and linked with the genes responsible for anatomical modifications. Homology is a key concept in comparative anatomy; two important types are historical homology (the similarity of organisms due to common ancestry) and serial homology (the similarity of repeated structures within an organism). We explored how to most effectively represent historical and serial homology across anatomical structures to facilitate computational reasoning. We assembled a collection of homology assertions from the literature with a set of taxon phenotypes for the skeletal elements of vertebrate fins and limbs from the Phenoscape Knowledgebase. Using seven competency questions, we evaluated the reasoning ramifications of two logical models: the Reciprocal Existential Axioms (REA) homology model and the Ancestral Value Axioms (AVA) homology model. The AVA model returned all user-expected results in addition to the search term and any of its subclasses. The AVA model also returns any superclass of the query term in which a homology relationship has been asserted. The REA model returned the user-expected results for five out of seven queries. We identify some challenges of implementing complete homology queries due to limitations of OWL reasoning. This work lays the foundation for homology reasoning to be incorporated into other ontology-based tools, such as those that enable synthetic supermatrix construction and candidate gene discovery. [Homology; ontology; anatomy; morphology; evolution; knowledgebase; phenoscape.].
Affiliation(s)
- Paula M Mabee
  - Department of Biology, University of South Dakota, 414 East Clark Street, Vermillion, SD 57069, USA
- James P Balhoff
  - Renaissance Computing Institute, University of North Carolina, 100 Europa Drive, Suite 540, Chapel Hill, NC 27517, USA
- Wasila M Dahdul
  - Department of Biology, University of South Dakota, 414 East Clark Street, Vermillion, SD 57069, USA
- Hilmar Lapp
  - Center for Genomic and Computational Biology, Duke University, 101 Science Drive, Durham, NC 27708, USA
- Christopher J Mungall
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Todd J Vision
  - Department of Biology and School of Information and Library Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3280, USA
4
Eliason CM, Edwards SV, Clarke JA. phenotools: An R package for visualizing and analysing phenomic datasets. Methods Ecol Evol 2019. [DOI: 10.1111/2041-210x.13217] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
Affiliation(s)
- Chad M. Eliason
  - Department of Geological Sciences, University of Texas Austin, Austin, Texas
  - Grainger Bioinformatics Center, Field Museum of Natural History, Chicago, Illinois
- Scott V. Edwards
  - Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts
- Julia A. Clarke
  - Department of Geological Sciences, University of Texas Austin, Austin, Texas
5
Woodward KJ, Stampalia J, Vanyai H, Rijhumal H, Potts K, Taylor F, Peverall J, Grumball T, Sivamoorthy S, Alinejad-Rokny H, Wray J, Whitehouse A, Nagarajan L, Scurlock J, Afchani S, Edwards M, Murch A, Beilby J, Baynam G, Kiraly-Borri C, McKenzie F, Heng JIT. Atypical nested 22q11.2 duplications between LCR22B and LCR22D are associated with neurodevelopmental phenotypes including autism spectrum disorder with incomplete penetrance. Mol Genet Genomic Med 2019; 7:e00507. [PMID: 30614210 PMCID: PMC6393688 DOI: 10.1002/mgg3.507] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 12/11/2017] [Revised: 09/18/2018] [Accepted: 10/10/2018] [Indexed: 12/04/2022] Open
Abstract
Background: Chromosome 22q11.2 is susceptible to genomic rearrangements and the most frequently reported involve deletions and duplications between low copy repeats LCR22A to LCR22D. Atypical nested deletions and duplications are rarer and can provide a valuable opportunity to investigate the dosage effects of a smaller subset of genes within the 22q11.2 genomic disorder region. Methods: We describe thirteen individuals from six families, each with atypical nested duplications within the central 22q11.2 region between LCR22B and LCR22D. We then compared the molecular and clinical data for patients from this study and the few reported atypical duplication cases, to the cases with larger typical duplications between LCR22A and LCR22D. Further, we analyzed genes within the nested region to identify candidates highly enriched in human brain tissues. Results: We observed that atypical nested duplications are heterogeneous in size, often familial, and associated with incomplete penetrance and highly variable clinical expressivity. We found that the nested atypical duplications are a possible risk factor for neurodevelopmental phenotypes, particularly for autism spectrum disorder (ASD), speech and language delay, and behavioral abnormalities. In addition, we analyzed genes within the nested region between LCR22B and LCR22D to identify nine genes (ZNF74, KLHL22, MED15, PI4KA, SERPIND1, CRKL, AIFM3, SLC7A4, and BCRP2) with enriched expression in the nervous system, each with unique spatiotemporal patterns in fetal and adult brain tissues. Interestingly, PI4KA is prominently expressed in the brain, and this gene is included either partially or completely in all of our subjects. Conclusion: Our findings confirm variable expressivity and incomplete penetrance for atypical nested 22q11.2 duplications and identify genes such as PI4KA to be directly relevant to brain development and disorder.
We conclude that further work is needed to elucidate the basis of variable neurodevelopmental phenotypes and to exclude the presence of a second disorder. Our findings contribute to the genotype–phenotype data for atypical nested 22q11.2 duplications, with implications for genetic counseling.
Affiliation(s)
- Karen J Woodward
  - Diagnostic Genomics, PathWest Laboratory Medicine, Perth, Western Australia, Australia
  - School of Biomedical Sciences, University of Western Australia, Perth, Western Australia, Australia
- Julie Stampalia
  - Diagnostic Genomics, PathWest Laboratory Medicine, Perth, Western Australia, Australia
- Hannah Vanyai
  - The Harry Perkins Institute of Medical Research, QEII Medical Centre, Nedlands, Western Australia, Australia
  - Centre for Medical Research, University of Western Australia, Nedlands, Western Australia, Australia
- Hashika Rijhumal
  - Diagnostic Genomics, PathWest Laboratory Medicine, Perth, Western Australia, Australia
- Kim Potts
  - Diagnostic Genomics, PathWest Laboratory Medicine, Perth, Western Australia, Australia
- Fiona Taylor
  - Diagnostic Genomics, PathWest Laboratory Medicine, Perth, Western Australia, Australia
- Joanne Peverall
  - Diagnostic Genomics, PathWest Laboratory Medicine, Perth, Western Australia, Australia
- Tanya Grumball
  - Diagnostic Genomics, PathWest Laboratory Medicine, Perth, Western Australia, Australia
- Soruba Sivamoorthy
  - Diagnostic Genomics, PathWest Laboratory Medicine, Perth, Western Australia, Australia
- Hamid Alinejad-Rokny
  - The Harry Perkins Institute of Medical Research, QEII Medical Centre, Nedlands, Western Australia, Australia
  - Centre for Medical Research, University of Western Australia, Nedlands, Western Australia, Australia
- John Wray
  - Telethon Kids Institute, University of Western Australia, Perth, Western Australia, Australia
- Andrew Whitehouse
  - Telethon Kids Institute, University of Western Australia, Perth, Western Australia, Australia
- Lakshmi Nagarajan
  - Children's Neuroscience Service, Princess Margaret Hospital, Subiaco, Western Australia, Australia
  - School of Paediatrics and Child Health, University of Western Australia, Perth, Western Australia, Australia
- Sabine Afchani
  - State Child Development Centre, West Perth, Western Australia, Australia
  - Lockridge Child Development Centre, Lockridge, Western Australia, Australia
- Matthew Edwards
  - School of Medicine, Western Sydney University, Penrith South DC, New South Wales, Australia
- Ashleigh Murch
  - Diagnostic Genomics, PathWest Laboratory Medicine, Perth, Western Australia, Australia
  - School of Biomedical Sciences, University of Western Australia, Perth, Western Australia, Australia
- John Beilby
  - Diagnostic Genomics, PathWest Laboratory Medicine, Perth, Western Australia, Australia
  - School of Biomedical Sciences, University of Western Australia, Perth, Western Australia, Australia
- Gareth Baynam
  - Genetic Services of Western Australia, Perth, Western Australia, Australia
  - Department of Health, Office of Population Health Genomics, Public Health and Clinical Services Division, Perth, Western Australia, Australia
  - Institute for Immunology and Infectious Diseases, Murdoch University, Perth, Western Australia, Australia
  - Western Australian Register of Developmental Anomalies, Perth, Western Australia, Australia
  - Spatial Sciences, Science and Engineering, Curtin University, Perth, Western Australia, Australia
  - Telethon Kids Institute, University of Western Australia, Perth, Western Australia, Australia
  - School of Paediatrics and Child Health, University of Western Australia, Perth, Western Australia, Australia
- Cathy Kiraly-Borri
  - Genetic Services of Western Australia, Perth, Western Australia, Australia
  - Children's Neuroscience Service, Princess Margaret Hospital, Subiaco, Western Australia, Australia
- Fiona McKenzie
  - Genetic Services of Western Australia, Perth, Western Australia, Australia
  - School of Paediatrics and Child Health, University of Western Australia, Perth, Western Australia, Australia
- Julian I T Heng
  - Curtin Health Innovation Research Institute and Sarich Neuroscience Institute, Curtin University, Crawley, Western Australia, Australia
  - The Harry Perkins Institute of Medical Research, QEII Medical Centre, Nedlands, Western Australia, Australia
  - Centre for Medical Research, University of Western Australia, Nedlands, Western Australia, Australia
6
Ali NM, Khan HA, Then AYH, Ving Ching C, Gaur M, Dhillon SK. Fish Ontology framework for taxonomy-based fish recognition. PeerJ 2017; 5:e3811. [PMID: 28929028 PMCID: PMC5602685 DOI: 10.7717/peerj.3811] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 06/02/2017] [Accepted: 08/25/2017] [Indexed: 11/20/2022] Open
Abstract
Life science ontologies play an important role in Semantic Web. Given the diversity in fish species and the associated wealth of information, it is imperative to develop an ontology capable of linking and integrating this information in an automated fashion. As such, we introduce the Fish Ontology (FO), an automated classification architecture of existing fish taxa which provides taxonomic information on unknown fish based on metadata restrictions. It is designed to support knowledge discovery, provide semantic annotation of fish and fisheries resources, data integration, and information retrieval. Automated classification for unknown specimens is a unique feature that currently does not appear to exist in other known ontologies. Examples of automated classification for major groups of fish are demonstrated, showing the inferred information by introducing several restrictions at the species or specimen level. The current version of FO has 1,830 classes, includes widely used fisheries terminology, and models major aspects of fish taxonomy, grouping, and character. With more than 30,000 known fish species globally, the FO will be an indispensable tool for fish scientists and other interested users.
Affiliation(s)
- Najib M. Ali
  - Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
- Haris A. Khan
  - Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
- Amy Y-Hui Then
  - Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
- Chong Ving Ching
  - Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
- Manas Gaur
  - Wright State University, Kno.e.sis Center, Dayton, OH, United States of America
- Sarinder Kaur Dhillon
  - Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
7
8
Abstract
The overarching goal of the Gene Ontology (GO) Consortium is to provide researchers in biology and biomedicine with all current functional information concerning genes and the cellular context under which these occur. When the GO was started in the 1990s, surprisingly little attention had been given to how functional information about genes was to be uniformly captured, structured in a computable form, and made accessible to biologists. Because knowledge of gene, protein, ncRNA, and molecular complex roles is continuously accumulating and changing, the GO needed to be a dynamic resource, accurately tracking ongoing research results over time. Here I describe the progress that has been made over the years towards this goal, and the work that still remains to be done, to make the Gene Ontology (GO) Consortium realize its goal of offering the most comprehensive and up-to-date resource for information on gene function.
Affiliation(s)
- Suzanna E Lewis
  - Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA.
9
Diehl AD, Meehan TF, Bradford YM, Brush MH, Dahdul WM, Dougall DS, He Y, Osumi-Sutherland D, Ruttenberg A, Sarntivijai S, Van Slyke CE, Vasilevsky NA, Haendel MA, Blake JA, Mungall CJ. The Cell Ontology 2016: enhanced content, modularization, and ontology interoperability. J Biomed Semantics 2016; 7:44. [PMID: 27377652 PMCID: PMC4932724 DOI: 10.1186/s13326-016-0088-7] [Citation(s) in RCA: 145] [Impact Index Per Article: 18.1] [Received: 04/15/2016] [Accepted: 06/23/2016] [Indexed: 12/04/2022] Open
Abstract
BACKGROUND: The Cell Ontology (CL) is an OBO Foundry candidate ontology covering the domain of canonical, natural biological cell types. Since its inception in 2005, the CL has undergone multiple rounds of revision and expansion, most notably in its representation of hematopoietic cells. For in vivo cells, the CL focuses on vertebrates but provides general classes that can be used for other metazoans, which can be subtyped in species-specific ontologies. CONSTRUCTION AND CONTENT: Recent work on the CL has focused on extending the representation of various cell types, and developing new modules in the CL itself, and in related ontologies in coordination with the CL. For example, the Kidney and Urinary Pathway Ontology was used as a template to populate the CL with additional cell types. In addition, subtypes of the class 'cell in vitro' have received improved definitions and labels to provide for modularity with the representation of cells in the Cell Line Ontology and Reagent Ontology. Recent changes in the ontology development methodology for CL include a switch from OBO to OWL for the primary encoding of the ontology, and an increasing reliance on logical definitions for improved reasoning. UTILITY AND DISCUSSION: The CL is now mandated as a metadata standard for large functional genomics and transcriptomics projects, and is used extensively for annotation, querying, and analyses of cell type specific data in sequencing consortia such as FANTOM5 and ENCODE, as well as for the NIAID ImmPort database and the Cell Image Library. The CL is also a vital component used in the modular construction of other biomedical ontologies; for example, the Gene Ontology and the cross-species anatomy ontology, Uberon, use CL to support the consistent representation of cell types across different levels of anatomical granularity, such as tissues and organs.
CONCLUSIONS: The ongoing improvements to the CL make it a valuable resource to both the OBO Foundry community and the wider scientific community, and we continue to experience increased interest in the CL both among developers and within the user community.
Affiliation(s)
- Alexander D. Diehl
  - Department of Neurology, University at Buffalo School of Medicine and Biomedical Sciences, Buffalo, NY 14203 USA
- Terrence F. Meehan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD UK
- Yvonne M. Bradford
  - ZFIN, the Zebrafish Model Organism Database, 5291 University of Oregon, Eugene, OR 97403 USA
- Matthew H. Brush
  - Ontology Development Group, Library, Oregon Health and Science University, Portland, Oregon 97239 USA
- Wasila M. Dahdul
  - Department of Biology, University of South Dakota, Vermillion, SD 57069 USA
  - National Evolutionary Synthesis Center, Durham, NC 27705 USA
- David S. Dougall
  - Southwestern Medical Center, University of Texas, Dallas, TX 75235 USA
- Yongqun He
  - Unit for Laboratory Animal Medicine, University of Michigan Medical School, Ann Arbor, MI 48109 USA
- David Osumi-Sutherland
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD UK
- Alan Ruttenberg
  - Oral Diagnostics Sciences, University at Buffalo School of Dental Medicine, Buffalo, NY 14210 USA
- Sirarat Sarntivijai
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD UK
- Ceri E. Van Slyke
  - ZFIN, the Zebrafish Model Organism Database, 5291 University of Oregon, Eugene, OR 97403 USA
- Nicole A. Vasilevsky
  - Ontology Development Group, Library, Oregon Health and Science University, Portland, Oregon 97239 USA
- Melissa A. Haendel
  - Ontology Development Group, Library, Oregon Health and Science University, Portland, Oregon 97239 USA
10
Dececchi TA, Mabee PM, Blackburn DC. Data Sources for Trait Databases: Comparing the Phenomic Content of Monographs and Evolutionary Matrices. PLoS One 2016; 11:e0155680. [PMID: 27191170 PMCID: PMC4871461 DOI: 10.1371/journal.pone.0155680] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 01/27/2016] [Accepted: 05/03/2016] [Indexed: 01/17/2023] Open
Abstract
Databases of organismal traits that aggregate information from one or multiple sources can be leveraged for large-scale analyses in biology. Yet the differences among these data streams and how well they capture trait diversity have never been explored. We present the first analysis of the differences between phenotypes captured in free text of descriptive publications ('monographs') and those used in phylogenetic analyses ('matrices'). We focus our analysis on osteological phenotypes of the limbs of four extinct vertebrate taxa critical to our understanding of the fin-to-limb transition. We find that there is low overlap between the anatomical entities used in these two sources of phenotype data, indicating that phenotypes represented in matrices are not simply a subset of those found in monographic descriptions. Perhaps as expected, compared to characters found in matrices, phenotypes in monographs tend to emphasize descriptive and positional morphology, be somewhat more complex, and relate to fewer additional taxa. While based on a small set of focal taxa, these qualitative and quantitative data suggest that either source of phenotypes alone will result in incomplete knowledge of variation for a given taxon. As a broader community develops to use and expand databases characterizing organismal trait diversity, it is important to recognize the limitations of the data sources and develop strategies to more fully characterize variation both within species and across the tree of life.
Affiliation(s)
- T. Alex Dececchi
  - Department of Biology, University of South Dakota, Vermillion, South Dakota, United States of America
- Paula M. Mabee
  - Department of Biology, University of South Dakota, Vermillion, South Dakota, United States of America
- David C. Blackburn
  - Florida Museum of Natural History, University of Florida, Gainesville, Florida, United States of America
11
Cheung KH, Keerthikumar S, Roncaglia P, Subramanian SL, Roth ME, Samuel M, Anand S, Gangoda L, Gould S, Alexander R, Galas D, Gerstein MB, Hill AF, Kitchen RR, Lötvall J, Patel T, Procaccini DC, Quesenberry P, Rozowsky J, Raffai RL, Shypitsyna A, Su AI, Théry C, Vickers K, Wauben MHM, Mathivanan S, Milosavljevic A, Laurent LC. Extending gene ontology in the context of extracellular RNA and vesicle communication. J Biomed Semantics 2016; 7:19. [PMID: 27076901 PMCID: PMC4830068 DOI: 10.1186/s13326-016-0061-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 12/17/2015] [Accepted: 04/04/2016] [Indexed: 12/31/2022] Open
Abstract
Background: To address the lack of standard terminology to describe extracellular RNA (exRNA) data/metadata, we have launched an inter-community effort to extend the Gene Ontology (GO) with subcellular structure concepts relevant to the exRNA domain. By extending GO in this manner, the exRNA data/metadata will be more easily annotated and queried because it will be based on a shared set of terms and relationships relevant to extracellular research. Methods: By following a consensus-building process, we have worked with several academic societies/consortia, including ERCC, ISEV, and ASEMV, to identify and approve a set of exRNA and extracellular vesicle-related terms and relationships that have been incorporated into GO. In addition, we have initiated an ongoing process of extractions of gene product annotations associated with these terms from Vesiclepedia and ExoCarta, conversion of the extracted annotations to Gene Association File (GAF) format for batch submission to GO, and curation of the submitted annotations by the GO Consortium. As a use case, we have incorporated some of the GO terms into annotations of samples from the exRNA Atlas and implemented a faceted search interface based on such annotations. Results: We have added 7 new terms and modified 9 existing terms (along with their synonyms and relationships) to GO. Additionally, 18,695 unique coding gene products (mRNAs and proteins) and 963 unique non-coding gene products (ncRNAs) which are associated with the terms: “extracellular vesicle”, “extracellular exosome”, “apoptotic body”, and “microvesicle” were extracted from ExoCarta and Vesiclepedia. These annotations are currently being processed for submission to GO. Conclusions: As an inter-community effort, we have made a substantial update to GO in the exRNA context. We have also demonstrated the utility of some of the new GO terms for sample annotation and metadata search.
Affiliation(s)
- Kei-Hoi Cheung
  - Department of Emergency Medicine, Yale Center for Medical Informatics, Yale University School of Medicine, New Haven, CT USA
  - VA Connecticut Healthcare System, West Haven, CT USA
  - Extracellular RNA Communication Consortium (ERCC)
- Shivakumar Keerthikumar
  - Department of Biochemistry and Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC 3086 Australia
  - Extracellular RNA Communication Consortium (ERCC)
- Paola Roncaglia
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
  - Gene Ontology Consortium (GOC)
- Sai Lakshmi Subramanian
  - Bioinformatics Research Laboratory, Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX USA
  - Extracellular RNA Communication Consortium (ERCC)
- Matthew E Roth
  - Bioinformatics Research Laboratory, Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX USA
  - Extracellular RNA Communication Consortium (ERCC)
- Monisha Samuel
  - Department of Biochemistry and Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC 3086 Australia
- Sushma Anand
  - Department of Biochemistry and Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC 3086 Australia
- Lahiru Gangoda
  - Department of Biochemistry and Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC 3086 Australia
- Stephen Gould
  - Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, MD USA
  - Extracellular RNA Communication Consortium (ERCC)
  - American Society for Exosomes and Microvesicles (ASEMV)
- Roger Alexander
  - Pacific Northwest Diabetes Research Institute, Seattle, WA USA
  - Extracellular RNA Communication Consortium (ERCC)
- David Galas
  - Pacific Northwest Diabetes Research Institute, Seattle, WA USA
  - Extracellular RNA Communication Consortium (ERCC)
- Mark B Gerstein
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT USA
  - Department of Computer Science, Yale University, New Haven, CT USA
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT USA
  - Extracellular RNA Communication Consortium (ERCC)
- Andrew F Hill
  - Department of Biochemistry and Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC 3086 Australia
  - International Society for Extracellular Vesicles (ISEV)
- Robert R Kitchen
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT USA
  - Extracellular RNA Communication Consortium (ERCC)
- Jan Lötvall
  - University of Gothenburg, Gothenburg, Sweden
  - International Society for Extracellular Vesicles (ISEV)
- Tushar Patel
  - Mayo Clinic, Jacksonville, FL USA
  - Extracellular RNA Communication Consortium (ERCC)
- Dena C Procaccini
  - Division of Neuroscience and Behavior, National Institute on Drug Abuse (NIDA), Rockville, MD USA
  - Extracellular RNA Communication Consortium (ERCC)
- Peter Quesenberry
  - University Medicine Comprehensive Cancer Center, Providence, RI USA
  - Extracellular RNA Communication Consortium (ERCC)
  - International Society for Extracellular Vesicles (ISEV)
- Joel Rozowsky
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT USA
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT USA
  - Extracellular RNA Communication Consortium (ERCC)
- Robert L Raffai
  - Department of Surgery, University of California San Francisco and VA Medical Center, San Francisco, CA USA
  - Extracellular RNA Communication Consortium (ERCC)
- Aleksandra Shypitsyna
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
  - Gene Ontology Consortium (GOC)
- Andrew I Su
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA USA
  - Extracellular RNA Communication Consortium (ERCC)
- Clotilde Théry
  - Institut Curie, PSL Research University, INSERM U932, Paris, France
  - International Society for Extracellular Vesicles (ISEV)
| | - Kasey Vickers
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN USA ; Extracellular RNA Communication Consortium (ERCC), ᅟ, ᅟ
| | - Marca H M Wauben
- Department of Biochemistry & Cell Biology, Utrecht University, Utrecht, Netherlands ; International Society for Extracellular Vesicles (ISEV), ᅟ, ᅟ
| | - Suresh Mathivanan
- Department of Biochemistry and Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC 3086 Australia ; Extracellular RNA Communication Consortium (ERCC), ᅟ, ᅟ ; International Society for Extracellular Vesicles (ISEV), ᅟ, ᅟ
| | - Aleksandar Milosavljevic
- Bioinformatics Research Laboratory, Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX USA ; Extracellular RNA Communication Consortium (ERCC), ᅟ, ᅟ
| | - Louise C Laurent
- Department of Reproductive Medicine, University of California, San Diego, La Jolla, CA USA ; Extracellular RNA Communication Consortium (ERCC), ᅟ, ᅟ
|
12
|
Druzinsky RE, Balhoff JP, Crompton AW, Done J, German RZ, Haendel MA, Herrel A, Herring SW, Lapp H, Mabee PM, Muller HM, Mungall CJ, Sternberg PW, Van Auken K, Vinyard CJ, Williams SH, Wall CE. Muscle Logic: New Knowledge Resource for Anatomy Enables Comprehensive Searches of the Literature on the Feeding Muscles of Mammals. PLoS One 2016; 11:e0149102. [PMID: 26870952 PMCID: PMC4752357 DOI: 10.1371/journal.pone.0149102] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/14/2015] [Accepted: 01/27/2016] [Indexed: 01/27/2023] Open
Abstract
Background: In recent years large bibliographic databases have made much of the published literature of biology available for searches. However, the capabilities of the search engines integrated into these databases for text-based bibliographic searches are limited. To enable searches that deliver the results expected by comparative anatomists, an underlying logical structure known as an ontology is required. Development and Testing of the Ontology: Here we present the Mammalian Feeding Muscle Ontology (MFMO), a multi-species ontology focused on anatomical structures that participate in feeding and other oral/pharyngeal behaviors. A unique feature of the MFMO is that a simple, computable definition of each muscle, which includes its attachments and innervation, is true across mammals. This construction mirrors the logical foundation of comparative anatomy and permits searches using language familiar to biologists. Further, it provides a template for muscles that will be useful in extending any anatomy ontology. The MFMO is developed to support the Feeding Experiments End-User Database Project (FEED, https://feedexp.org/), a publicly-available, online repository for physiological data collected from in vivo studies of feeding (e.g., mastication, biting, swallowing) in mammals. Currently the MFMO is integrated into FEED and also into two literature-specific implementations of Textpresso, a text-mining system that facilitates powerful searches of a corpus of scientific publications. We evaluate the MFMO by asking questions that test the ability of the ontology to return appropriate answers (competency questions). We compare the results of queries of the MFMO to results from similar searches in PubMed and Google Scholar. Results and Significance: Our tests demonstrate that the MFMO is competent to answer queries formed in the common language of comparative anatomy, but PubMed and Google Scholar are not. Overall, our results show that by incorporating anatomical ontologies into searches, an expanded and anatomically comprehensive set of results can be obtained. The broader scientific and publishing communities should consider taking up the challenge of semantically enabled search capabilities.
Affiliation(s)
- Robert E. Druzinsky
- Department of Oral Biology, University of Illinois at Chicago, Chicago, Illinois, United States of America
| | - James P. Balhoff
- RTI International, Research Triangle Park, North Carolina, United States of America
| | - Alfred W. Crompton
- Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
| | - James Done
- Division of Biology and Biological Engineering, M/C 156–29, California Institute of Technology, Pasadena, California, United States of America
| | - Rebecca Z. German
- Department of Anatomy and Neurobiology, Northeast Ohio Medical University, Rootstown, Ohio, United States of America
| | - Melissa A. Haendel
- Oregon Health and Science University, Portland, Oregon, United States of America
| | - Anthony Herrel
- Département d’Ecologie et de Gestion de la Biodiversité, Museum National d’Histoire Naturelle, Paris, France
| | - Susan W. Herring
- University of Washington, Department of Orthodontics, Seattle, Washington, United States of America
| | - Hilmar Lapp
- National Evolutionary Synthesis Center, Durham, North Carolina, United States of America
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina, United States of America
| | - Paula M. Mabee
- Department of Biology, University of South Dakota, Vermillion, South Dakota, United States of America
| | - Hans-Michael Muller
- Division of Biology and Biological Engineering, M/C 156–29, California Institute of Technology, Pasadena, California, United States of America
| | - Christopher J. Mungall
- Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Paul W. Sternberg
- Division of Biology and Biological Engineering, M/C 156–29, California Institute of Technology, Pasadena, California, United States of America
- Howard Hughes Medical Institute, M/C 156–29, California Institute of Technology, Pasadena, California, United States of America
| | - Kimberly Van Auken
- Division of Biology and Biological Engineering, M/C 156–29, California Institute of Technology, Pasadena, California, United States of America
| | - Christopher J. Vinyard
- Department of Anatomy and Neurobiology, Northeast Ohio Medical University, Rootstown, Ohio, United States of America
| | - Susan H. Williams
- Department of Biomedical Sciences, Ohio University Heritage College of Osteopathic Medicine, Athens, Ohio, United States of America
| | - Christine E. Wall
- Department of Evolutionary Anthropology, Duke University, Durham, North Carolina, United States of America
|
13
|
Thessen AE, Bunker DE, Buttigieg PL, Cooper LD, Dahdul WM, Domisch S, Franz NM, Jaiswal P, Lawrence-Dill CJ, Midford PE, Mungall CJ, Ramírez MJ, Specht CD, Vogt L, Vos RA, Walls RL, White JW, Zhang G, Deans AR, Huala E, Lewis SE, Mabee PM. Emerging semantics to link phenotype and environment. PeerJ 2015; 3:e1470. [PMID: 26713234 PMCID: PMC4690371 DOI: 10.7717/peerj.1470] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 07/15/2015] [Accepted: 11/12/2015] [Indexed: 11/20/2022] Open
Abstract
Understanding the interplay between environmental conditions and phenotypes is a fundamental goal of biology. Unfortunately, data that include observations on phenotype and environment are highly heterogeneous and thus difficult to find and integrate. One approach that is likely to improve the status quo involves the use of ontologies to standardize and link data about phenotypes and environments. Specifying and linking data through ontologies will allow researchers to increase the scope and flexibility of large-scale analyses aided by modern computing methods. Investments in this area would advance diverse fields such as ecology, phylogenetics, and conservation biology. While several biological ontologies are well-developed, using them to link phenotypes and environments is rare because of gaps in ontological coverage and limits to interoperability among ontologies and disciplines. In this manuscript, we present (1) use cases from diverse disciplines to illustrate questions that could be answered more efficiently using a robust linkage between phenotypes and environments, (2) two proof-of-concept analyses that show the value of linking phenotypes to environments in fishes and amphibians, and (3) two proposed example data models for linking phenotypes and environments using the extensible observation ontology (OBOE) and the Biological Collections Ontology (BCO); these provide a starting point for the development of a data model linking phenotypes and environments.
Affiliation(s)
- Anne E. Thessen
- Ronin Institute for Independent Scholarship, Montclair, NJ, United States
- The Data Detektiv, Waltham, MA, United States
| | - Daniel E. Bunker
- Department of Biological Sciences, New Jersey Institute of Technology, Newark, NJ, United States
| | - Pier Luigi Buttigieg
- HGF-MPG Group for Deep Sea Ecology and Technology, Alfred-Wegener-Institut, Helmholtz-Zentrum für Polar-und Meeresforschung, Bremerhaven, Germany
| | - Laurel D. Cooper
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, United States
| | - Wasila M. Dahdul
- Department of Biology, University of South Dakota, Vermillion, SD, United States
| | - Sami Domisch
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, United States
| | - Nico M. Franz
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, United States
| | - Carolyn J. Lawrence-Dill
- Departments of Genetics, Development and Cell Biology and Agronomy, Iowa State University, Ames, IA, United States
| | | | | | - Martín J. Ramírez
- Division of Arachnology, Museo Argentino de Ciencias Naturales–CONICET, Buenos Aires, Argentina
| | - Chelsea D. Specht
- Departments of Plant and Microbial Biology & Integrative Biology, University of California, Berkeley, CA, United States
| | - Lars Vogt
- Institut für Evolutionsbiologie und Ökologie, Universität Bonn, Bonn, Germany
| | | | - Ramona L. Walls
- iPlant Collaborative, University of Arizona, Tucson, AZ, United States
| | - Jeffrey W. White
- US Arid Land Agricultural Research Center, United States Department of Agriculture—ARS, Maricopa, AZ, United States
| | - Guanyang Zhang
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
| | - Andrew R. Deans
- Department of Entomology, Pennsylvania State University, University Park, PA, United States
| | - Eva Huala
- Phoenix Bioinformatics, Redwood City, CA, United States
| | - Suzanna E. Lewis
- Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA, United States
| | - Paula M. Mabee
- Department of Biology, University of South Dakota, Vermillion, SD, United States
|
14
|
Hilton EJ, Schnell NK, Konstantinidis P. When Tradition Meets Technology: Systematic Morphology of Fishes in the Early 21st Century. COPEIA 2015. [DOI: 10.1643/ci-14-178] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 11/24/2022]
|
15
|
Dececchi TA, Balhoff JP, Lapp H, Mabee PM. Toward Synthesizing Our Knowledge of Morphology: Using Ontologies and Machine Reasoning to Extract Presence/Absence Evolutionary Phenotypes across Studies. Syst Biol 2015; 64:936-52. [PMID: 26018570 PMCID: PMC4604830 DOI: 10.1093/sysbio/syv031] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 11/24/2014] [Accepted: 05/20/2015] [Indexed: 02/02/2023] Open
Abstract
The reality of larger and larger molecular databases and the need to integrate data scalably have presented a major challenge for the use of phenotypic data. Morphology is currently primarily described in discrete publications, entrenched in noncomputer readable text, and requires enormous investments of time and resources to integrate across large numbers of taxa and studies. Here we present a new methodology, using ontology-based reasoning systems working with the Phenoscape Knowledgebase (KB; kb.phenoscape.org), to automatically integrate large amounts of evolutionary character state descriptions into a synthetic character matrix of neomorphic (presence/absence) data. Using the KB, which includes more than 55 studies of sarcopterygian taxa, we generated a synthetic supermatrix of 639 variable characters scored for 1051 taxa, resulting in over 145,000 populated cells. Of these characters, over 76% were made variable through the addition of inferred presence/absence states derived by machine reasoning over the formal semantics of the source ontologies. Inferred data reduced the missing data in the variable character-subset from 98.5% to 78.2%. Machine reasoning also enables the isolation of conflicts in the data, that is, cells where both presence and absence are indicated; reports regarding conflicting data provenance can be generated automatically. Further, reasoning enables quantification and new visualizations of the data, here for example, allowing identification of character space that has been undersampled across the fin-to-limb transition. The approach and methods demonstrated here to compute synthetic presence/absence supermatrices are applicable to any taxonomic and phenotypic slice across the tree of life, providing the data are semantically annotated. 
Because such data can also be linked to model organism genetics through computational scoring of phenotypic similarity, they open a rich set of future research questions into phenotype-to-genome relationships.
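The inference step described in this abstract can be illustrated with a minimal sketch. It assumes a toy `part_of` hierarchy and two simple rules (a present part implies a present whole; an absent whole implies absent parts). The term names, the `PART_OF` table, and the function names are hypothetical and are not taken from the Phenoscape Knowledgebase or its reasoner, which works over full ontology semantics.

```python
from collections import defaultdict

# Hypothetical toy hierarchy (part -> whole); not drawn from any real ontology file.
PART_OF = {
    "digit": "autopod",
    "autopod": "limb",
    "fin ray": "fin",
}

def ancestors(term, part_of=PART_OF):
    """Return every structure that `term` is transitively part of."""
    out = []
    while term in part_of:
        term = part_of[term]
        out.append(term)
    return out

def infer_presence_absence(asserted, part_of=PART_OF):
    """Propagate asserted states through the part_of hierarchy.

    `asserted` maps (taxon, structure) -> "present" or "absent".
    Presence is propagated up to wholes, absence down to parts.
    Each cell keeps a set of states so conflicts stay visible.
    """
    states = defaultdict(set)
    for (taxon, structure), state in asserted.items():
        states[(taxon, structure)].add(state)
        if state == "present":
            for whole in ancestors(structure, part_of):
                states[(taxon, whole)].add("present")
        elif state == "absent":
            for part in part_of:
                if structure in ancestors(part, part_of):
                    states[(taxon, part)].add("absent")
    return dict(states)

def conflicts(states):
    """Cells where both presence and absence are indicated."""
    return sorted(cell for cell, s in states.items() if s == {"present", "absent"})

# Illustrative input only; the taxa and scorings are made up.
asserted = {
    ("Taxon A", "digit"): "present",
    ("Taxon B", "limb"): "absent",
    ("Taxon B", "digit"): "present",
}
inferred = infer_presence_absence(asserted)
```

In this toy input, Taxon A gains inferred presence for the autopod and limb, while Taxon B's contradictory scorings surface as conflicting cells via `conflicts(inferred)`, which mirrors the kind of conflict report the abstract mentions.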
Affiliation(s)
| | - James P Balhoff
- National Evolutionary Synthesis Center, Durham, NC 27705, USA; University of North Carolina, Chapel Hill, NC 27599, USA
| | - Hilmar Lapp
- National Evolutionary Synthesis Center, Durham, NC 27705, USA; Center for Genomics and Computational Biology, Duke University, Durham, NC 27708, USA
| | - Paula M Mabee
- Department of Biology, University of South Dakota, Vermillion, SD 57069, USA;
|
16
|
Enault S, Muñoz DN, Silva WTAF, Borday-Birraux V, Bonade M, Oulion S, Ventéo S, Marcellini S, Debiais-Thibaud M. Molecular footprinting of skeletal tissues in the catshark Scyliorhinus canicula and the clawed frog Xenopus tropicalis identifies conserved and derived features of vertebrate calcification. Front Genet 2015; 6:283. [PMID: 26442101 PMCID: PMC4584932 DOI: 10.3389/fgene.2015.00283] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 04/30/2015] [Accepted: 08/24/2015] [Indexed: 12/22/2022] Open
Abstract
Understanding the evolutionary emergence and subsequent diversification of the vertebrate skeleton requires a comprehensive view of the diverse skeletal cell types found in distinct developmental contexts, tissues, and species. To date, our knowledge of the molecular nature of the shark calcified extracellular matrix, and its relationships with osteichthyan skeletal tissues, remains scarce. Here, based on specific combinations of expression patterns of the Col1a1, Col1a2, and Col2a1 fibrillar collagen genes, we compare the molecular footprint of endoskeletal elements from the chondrichthyan Scyliorhinus canicula and the tetrapod Xenopus tropicalis. We find that, depending on the anatomical location, Scyliorhinus skeletal calcification is associated with cell types expressing different subsets of fibrillar collagen genes, such as high levels of Col1a1 and Col1a2 in the neural arches, high levels of Col2a1 in the tesserae, or associated with a drastic Col2a1 downregulation in the centrum. We detect low Col2a1 levels in Xenopus osteoblasts, thereby revealing that the osteoblastic expression of this gene was significantly reduced in the tetrapod lineage. Finally, we uncover a striking parallel, from a molecular and histological perspective, between the vertebral cartilage calcification of both species and discuss the evolutionary origin of endochondral ossification.
Affiliation(s)
- Sébastien Enault
- Institut des Sciences de l'Evolution de Montpellier, UMR5554, Université Montpellier, Centre National de la Recherche Scientifique, IRD, EPHE, Montpellier, France
| | - David N Muñoz
- Laboratory of Development and Evolution, Department of Cell Biology, Faculty of Biological Sciences, Universidad de Concepción, Concepción, Chile
| | - Willian T A F Silva
- Institut des Sciences de l'Evolution de Montpellier, UMR5554, Université Montpellier, Centre National de la Recherche Scientifique, IRD, EPHE, Montpellier, France
| | - Véronique Borday-Birraux
- Laboratoire EGCE UMR Centre National de la Recherche Scientifique 9191, IRD247, Université Paris Sud, Gif-sur-Yvette, France; Université Paris Diderot, Sorbonne Paris Cité, Paris, France
| | - Morgane Bonade
- Laboratoire EGCE UMR Centre National de la Recherche Scientifique 9191, IRD247, Université Paris Sud, Gif-sur-Yvette, France
| | - Silvan Oulion
- Institut des Sciences de l'Evolution de Montpellier, UMR5554, Université Montpellier, Centre National de la Recherche Scientifique, IRD, EPHE, Montpellier, France
| | - Stéphanie Ventéo
- Institute for Neurosciences of Montpellier, Institut National de la Santé et de la Recherche Médicale U1051, Montpellier, France
| | - Sylvain Marcellini
- Laboratory of Development and Evolution, Department of Cell Biology, Faculty of Biological Sciences, Universidad de Concepción, Concepción, Chile
| | - Mélanie Debiais-Thibaud
- Institut des Sciences de l'Evolution de Montpellier, UMR5554, Université Montpellier, Centre National de la Recherche Scientifique, IRD, EPHE, Montpellier, France
|
17
|
Manda P, Balhoff JP, Lapp H, Mabee P, Vision TJ. Using the phenoscape knowledgebase to relate genetic perturbations to phenotypic evolution. Genesis 2015. [PMID: 26220875 DOI: 10.1002/dvg.22878] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 01/03/2023]
Abstract
The abundance of phenotypic diversity among species can enrich our knowledge of development and genetics beyond the limits of variation that can be observed in model organisms. The Phenoscape Knowledgebase (KB) is designed to enable exploration and discovery of phenotypic variation among species. Because phenotypes in the KB are annotated using standard ontologies, evolutionary phenotypes can be compared with phenotypes from genetic perturbations in model organisms. To illustrate the power of this approach, we review the use of the KB to find taxa showing evolutionary variation similar to that of a query gene. Matches are made between the full set of phenotypes described for a gene and an evolutionary profile, the latter of which is defined as the set of phenotypes that are variable among the daughters of any node on the taxonomic tree. Phenoscape's semantic similarity interface allows the user to assess the statistical significance of each match and flags matches that may only result from differences in annotation coverage between genetic and evolutionary studies. Tools such as this will help meet the challenge of relating the growing volume of genetic knowledge in model organisms to the diversity of phenotypes in nature. The Phenoscape KB is available at http://kb.phenoscape.org.
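As a rough illustration of the matching idea in this abstract, the sketch below builds an evolutionary profile for one node and scores it against a gene's phenotype set. Two assumptions are made for the example: "variable among the daughters" is read as "observed in at least one daughter but not in all", and plain Jaccard overlap stands in for the semantic similarity measure, which is not the scoring Phenoscape actually uses. All names and phenotype labels are hypothetical.

```python
def evolutionary_profile(daughter_phenotypes):
    """Phenotypes that vary among a node's daughters.

    `daughter_phenotypes` maps daughter taxon -> iterable of phenotype labels.
    A phenotype counts as variable if some daughter shows it but not all do.
    """
    sets = [set(p) for p in daughter_phenotypes.values()]
    if not sets:
        return set()
    return set.union(*sets) - set.intersection(*sets)

def jaccard(a, b):
    """Set overlap in [0, 1]; a stand-in for a real semantic similarity score."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Illustrative data only; labels are invented for the example.
daughters = {
    "species 1": {"fin ray absent", "jaw reduced"},
    "species 2": {"jaw reduced"},
    "species 3": {"jaw reduced", "scale shape altered"},
}
gene_phenotypes = {"fin ray absent", "eye size decreased"}

profile = evolutionary_profile(daughters)
score = jaccard(gene_phenotypes, profile)
```

Here "jaw reduced" is shared by every daughter, so it drops out of the profile, and the gene matches the node only through "fin ray absent". A real implementation would compare ontology terms through their shared ancestors rather than by exact label.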
Affiliation(s)
- Prashanti Manda
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina; US National Evolutionary Synthesis Center, Durham, North Carolina
| | - James P Balhoff
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina; US National Evolutionary Synthesis Center, Durham, North Carolina
| | - Hilmar Lapp
- US National Evolutionary Synthesis Center, Durham, North Carolina; Center for Genomic and Computational Biology, Duke University, Durham, North Carolina
| | - Paula Mabee
- Department of Biology, University of South Dakota, Vermillion, South Dakota
| | - Todd J Vision
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina; US National Evolutionary Synthesis Center, Durham, North Carolina
|
18
|
Brunt LH, Norton JL, Bright JA, Rayfield EJ, Hammond CL. Finite element modelling predicts changes in joint shape and cell behaviour due to loss of muscle strain in jaw development. J Biomech 2015; 48:3112-22. [PMID: 26253758 PMCID: PMC4601018 DOI: 10.1016/j.jbiomech.2015.07.017] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 01/21/2015] [Revised: 07/15/2015] [Accepted: 07/18/2015] [Indexed: 11/30/2022]
Abstract
Abnormal joint morphogenesis is linked to clinical conditions such as Developmental Dysplasia of the Hip (DDH) and to osteoarthritis (OA). Muscle activity is known to be important during the developmental process of joint morphogenesis. However, less is known about how this mechanical stimulus affects the behaviour of joint cells to generate altered morphology. Using zebrafish, in which we can image all joint musculoskeletal tissues at high resolution, we show that removal of muscle activity through anaesthetisation or genetic manipulation causes a change to the shape of the joint between the Meckel's cartilage and Palatoquadrate (the jaw joint), such that the joint develops asymmetrically leading to an overlap of the cartilage elements on the medial side which inhibits normal joint function. We identify the time during which muscle activity is critical to produce a normal joint. Using Finite Element Analysis (FEA), to model the strains exerted by muscle on the skeletal elements, we identify that minimum principal strains are located at the medial region of the joint and interzone during mouth opening. Then, by studying the cells immediately proximal to the joint, we demonstrate that biomechanical strain regulates cell orientation within the developing joint, such that when muscle-induced strain is removed, cells on the medial side of the joint notably change their orientation. Together, these data show that biomechanical forces are required to establish symmetry in the joint during development.
Affiliation(s)
- Lucy H Brunt
- Schools of Physiology and Pharmacology and of Biochemistry, University of Bristol, BS8 1TD Bristol, United Kingdom
| | - Joanna L Norton
- Schools of Physiology and Pharmacology and of Biochemistry, University of Bristol, BS8 1TD Bristol, United Kingdom
| | - Jen A Bright
- School of Earth Sciences, University of Bristol, BS8 1RJ Bristol, United Kingdom
| | - Emily J Rayfield
- School of Earth Sciences, University of Bristol, BS8 1RJ Bristol, United Kingdom
| | - Chrissy L Hammond
- Schools of Physiology and Pharmacology and of Biochemistry, University of Bristol, BS8 1TD Bristol, United Kingdom.
|
19
|
Dahdul W, Dececchi TA, Ibrahim N, Lapp H, Mabee P. Moving the mountain: analysis of the effort required to transform comparative anatomy into computable anatomy. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2015; 2015:bav040. [PMID: 25972520 PMCID: PMC4429748 DOI: 10.1093/database/bav040] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 01/05/2015] [Accepted: 04/05/2015] [Indexed: 11/28/2022]
Abstract
The diverse phenotypes of living organisms have been described for centuries, and though they may be digitized, they are not readily available in a computable form. Using over 100 morphological studies, the Phenoscape project has demonstrated that by annotating characters with community ontology terms, links between novel species anatomy and the genes that may underlie them can be made. But given the enormity of the legacy literature, how can this largely unexploited wealth of descriptive data be rendered amenable to large-scale computation? To identify the bottlenecks, we quantified the time involved in the major aspects of phenotype curation as we annotated characters from the vertebrate phylogenetic systematics literature. This involves attaching fully computable logical expressions consisting of ontology terms to the descriptions in character-by-taxon matrices. The workflow consists of: (i) data preparation, (ii) phenotype annotation, (iii) ontology development and (iv) curation team discussions and software development feedback. Our results showed that the completion of this work required two person-years by a team of two post-docs, a lead data curator, and students. Manual data preparation required close to 13% of the effort. This part in particular could be reduced substantially with better community data practices, such as depositing fully populated matrices in public repositories. Phenotype annotation required ∼40% of the effort. We are working to make this more efficient with Natural Language Processing tools. Ontology development (40%), however, remains a highly manual task requiring domain (anatomical) expertise and use of specialized software. The large overhead required for data preparation and ontology development contributed to a low annotation rate of approximately two characters per hour, compared with 14 characters per hour when activity was restricted to character annotation. 
Unlocking the potential of the vast stores of morphological descriptions requires better tools for efficiently processing natural language, and better community practices towards a born-digital morphology. Database URL: http://kb.phenoscape.org
Affiliation(s)
- Wasila Dahdul
- Department of Biology, University of South Dakota, Vermillion, SD, USA, Department of Organismal Biology and Anatomy, University of Chicago, Chicago, IL, USA and National Evolutionary Synthesis Center, Durham, NC, USA
| | - T Alexander Dececchi
- Department of Biology, University of South Dakota, Vermillion, SD, USA, Department of Organismal Biology and Anatomy, University of Chicago, Chicago, IL, USA and National Evolutionary Synthesis Center, Durham, NC, USA
| | - Nizar Ibrahim
- Department of Biology, University of South Dakota, Vermillion, SD, USA, Department of Organismal Biology and Anatomy, University of Chicago, Chicago, IL, USA and National Evolutionary Synthesis Center, Durham, NC, USA
| | - Hilmar Lapp
- Department of Biology, University of South Dakota, Vermillion, SD, USA, Department of Organismal Biology and Anatomy, University of Chicago, Chicago, IL, USA and National Evolutionary Synthesis Center, Durham, NC, USA
| | - Paula Mabee
- Department of Biology, University of South Dakota, Vermillion, SD, USA, Department of Organismal Biology and Anatomy, University of Chicago, Chicago, IL, USA and National Evolutionary Synthesis Center, Durham, NC, USA
|
20
|
Describing the breakbone fever: IDODEN, an ontology for dengue fever. PLoS Negl Trop Dis 2015; 9:e0003479. [PMID: 25646954 PMCID: PMC4315569 DOI: 10.1371/journal.pntd.0003479] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 09/18/2014] [Accepted: 12/15/2014] [Indexed: 01/25/2023] Open
Abstract
Background: Ontologies represent powerful tools in information technology because they enhance interoperability and facilitate, among other things, the construction of optimized search engines. To address the need to expand the toolbox available for the control and prevention of vector-borne diseases we embarked on the construction of specific ontologies. We present here IDODEN, an ontology that describes dengue fever, one of the globally most important diseases that are transmitted by mosquitoes. Methodology/Principal Findings: We constructed IDODEN using open source software, and modeled it on IDOMAL, the malaria ontology developed previously. IDODEN covers all aspects of dengue fever, such as disease biology, epidemiology and clinical features. Moreover, it covers all facets of dengue entomology. IDODEN, which is freely available, can now be used for the annotation of dengue-related data and, in addition to its use for modeling, it can be utilized for the construction of other dedicated IT tools such as decision support systems. Conclusions/Significance: The availability of the dengue ontology will enable databases hosting dengue-associated data and decision-support systems for that disease to perform most efficiently and to link their own data to those stored in other independent repositories, in an architecture- and software-independent manner. The need for the construction of a dengue ontology arose through the fact that the incidence of dengue fever is on the rise across the world; the number of cases may be three to four times higher than the 100 million estimated by the WHO and a vaccine is still not available in spite of the significant efforts undertaken. Thus, control of dengue fever still relies mostly on controlling its mosquito vectors. Large amounts of entomological, epidemiological and clinical data are generated; these need to be efficiently organized in order to further our comprehension of the disease and its control.
IDODEN aims to cover the different aspects and intricacies of dengue fever and syndromes caused by dengue virus(es). It contains more than 5000 terms describing epidemiological data, vaccine development, clinical features, the disease course, and more. We show here that it can be a helpful tool for researchers and that, in addition to allowing sophisticated search strategies, it is also useful for tasks such as modeling.
21
Dahdul WM, Cui H, Mabee PM, Mungall CJ, Osumi-Sutherland D, Walls RL, Haendel MA. Nose to tail, roots to shoots: spatial descriptors for phenotypic diversity in the Biological Spatial Ontology. J Biomed Semantics 2014; 5:34. [PMID: 25140222 PMCID: PMC4137724 DOI: 10.1186/2041-1480-5-34] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 07/02/2013] [Accepted: 06/16/2014] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND Spatial terminology is used in anatomy to indicate precise, relative positions of structures in an organism. While these terms are often standardized within specific fields of biology, they can differ dramatically across taxa. Such differences in usage can impair our ability to unambiguously refer to anatomical position when comparing anatomy or phenotypes across species. We developed the Biological Spatial Ontology (BSPO) to standardize the description of spatial and topological relationships across taxa to enable the discovery of comparable phenotypes. RESULTS BSPO currently contains 146 classes and 58 relations representing anatomical axes, gradients, regions, planes, sides, and surfaces. These concepts can be used at multiple biological scales and in a diversity of taxa, including plants, animals and fungi. The BSPO is used to provide a source of anatomical location descriptors for logically defining anatomical entity classes in anatomy ontologies. Spatial reasoning is further enhanced in anatomy ontologies by integrating spatial relations such as dorsal_to into class descriptions (e.g., 'dorsolateral placode' dorsal_to some 'epibranchial placode'). CONCLUSIONS The BSPO is currently used by projects that require standardized anatomical descriptors for phenotype annotation and ontology integration across a diversity of taxa. Anatomical location classes are also useful for describing phenotypic differences, such as morphological variation in position of structures resulting from evolution within and across species.
Affiliation(s)
- Wasila M Dahdul
  - Department of Biology, University of South Dakota, Vermillion, SD, USA
  - National Evolutionary Synthesis Center, Durham, NC, USA
- Hong Cui
  - School of Information Resource and Library Science, University of Arizona, Tucson, AZ, USA
- Paula M Mabee
  - Department of Biology, University of South Dakota, Vermillion, SD, USA
- Ramona L Walls
  - The iPlant Collaborative, Bio5 Institute, University of Arizona, Tucson, AZ, USA
- Melissa A Haendel
  - Library and Department of Medical Informatics & Epidemiology, Oregon Health & Science University, Portland, OR, USA
22
Smith CM, Finger JH, Kadin JA, Richardson JE, Ringwald M. The gene expression database for mouse development (GXD): putting developmental expression information at your fingertips. Dev Dyn 2014; 243:1176-86. [PMID: 24958384 DOI: 10.1002/dvdy.24155] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 03/12/2014] [Revised: 05/16/2014] [Accepted: 06/17/2014] [Indexed: 12/15/2022] Open
Abstract
Because molecular mechanisms of development are extraordinarily complex, the understanding of these processes requires the integration of pertinent research data. Using the Gene Expression Database for Mouse Development (GXD) as an example, we illustrate the progress made toward this goal, and discuss relevant issues that apply to developmental databases and developmental research in general. Since its first release in 1998, GXD has served the scientific community by integrating multiple types of expression data from publications and electronic submissions and by making these data freely and widely available. Focusing on endogenous gene expression in wild-type and mutant mice and covering data from RNA in situ hybridization, in situ reporter (knock-in), immunohistochemistry, reverse transcriptase-polymerase chain reaction, Northern blot, and Western blot experiments, the database has grown tremendously over the years in terms of data content and search utilities. Currently, GXD includes over 1.4 million annotated expression results and over 260,000 images. All these data and images are readily accessible to many types of database searches. Here we describe the data and search tools of GXD; explain how to use the database most effectively; discuss how we acquire, curate, and integrate developmental expression information; and describe how the research community can help in this process.
23
Haendel MA, Balhoff JP, Bastian FB, Blackburn DC, Blake JA, Bradford Y, Comte A, Dahdul WM, Dececchi TA, Druzinsky RE, Hayamizu TF, Ibrahim N, Lewis SE, Mabee PM, Niknejad A, Robinson-Rechavi M, Sereno PC, Mungall CJ. Unification of multi-species vertebrate anatomy ontologies for comparative biology in Uberon. J Biomed Semantics 2014; 5:21. [PMID: 25009735 PMCID: PMC4089931 DOI: 10.1186/2041-1480-5-21] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.8] [Received: 07/10/2013] [Accepted: 03/25/2014] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND Elucidating disease and developmental dysfunction requires understanding variation in phenotype. Single-species model organism anatomy ontologies (ssAOs) have been established to represent this variation. Multi-species anatomy ontologies (msAOs; vertebrate skeletal, vertebrate homologous, teleost, amphibian AOs) have been developed to represent 'natural' phenotypic variation across species. Our aim has been to integrate ssAOs and msAOs for various purposes, including establishing links between phenotypic variation and candidate genes. RESULTS Previously, msAOs contained a mixture of unique and overlapping content. This hampered integration and coordination due to the need to maintain cross-references or inter-ontology equivalence axioms to the ssAOs, or to perform large-scale obsolescence and modular import. Here we present the unification of anatomy ontologies into Uberon, a single ontology resource that enables interoperability among disparate data and research groups. As a consequence, independent development of TAO, VSAO, AAO, and vHOG has been discontinued. CONCLUSIONS The newly broadened Uberon ontology is a unified cross-taxon resource for metazoans (animals) that has been substantially expanded to include a broad diversity of vertebrate anatomical structures, permitting reasoning across anatomical variation in extinct and extant taxa. Uberon is a core resource that supports single- and cross-species queries for candidate genes using annotations for phenotypes from the systematics, biodiversity, medical, and model organism communities, while also providing entities for logical definitions in the Cell and Gene Ontologies. The ontology release files associated with the ontology merge described in this manuscript are available at: http://purl.obolibrary.org/obo/uberon/releases/2013-02-21/ Current ontology release files are always available at: http://purl.obolibrary.org/obo/uberon/releases/
Affiliation(s)
- Melissa A Haendel
  - Department of Medical Informatics & Epidemiology, Oregon Health & Science University, Portland, OR, USA
- James P Balhoff
  - Department of Biology, University of North Carolina, Chapel Hill, NC 27599-3280, USA
  - National Evolutionary Synthesis Center, Durham, NC, USA
- Frederic B Bastian
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- David C Blackburn
  - Department of Vertebrate Zoology and Anthropology, California Academy of Sciences, San Francisco, CA 94118, USA
- Yvonne Bradford
  - The Zebrafish Model Organism Database, University of Oregon, Eugene, OR 97403, USA
- Aurelie Comte
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Wasila M Dahdul
  - National Evolutionary Synthesis Center, Durham, NC, USA
  - Department of Biology, University of South Dakota, Vermillion, SD 57069, USA
- Thomas A Dececchi
  - Department of Biology, University of South Dakota, Vermillion, SD 57069, USA
- Robert E Druzinsky
  - Department of Oral Biology, University of Illinois-Chicago, Chicago, IL 60612, USA
- Nizar Ibrahim
  - Department of Organismal Biology and Anatomy, University of Chicago, Chicago, IL 60637, USA
- Suzanna E Lewis
  - Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, Berkeley, CA 94720, USA
- Paula M Mabee
  - Department of Biology, University of South Dakota, Vermillion, SD 57069, USA
- Anne Niknejad
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marc Robinson-Rechavi
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Paul C Sereno
  - Department of Organismal Biology and Anatomy, University of Chicago, Chicago, IL 60637, USA
24
Ramírez MJ, Michalik P. Calculating structural complexity in phylogenies using ancestral ontologies. Cladistics 2014; 30:635-649. [DOI: 10.1111/cla.12075] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Accepted: 02/20/2014] [Indexed: 01/29/2023] Open
Affiliation(s)
- Martín J. Ramírez
  - Museo Argentino de Ciencias Naturales “Bernardino Rivadavia” - CONICET; Av. Angel Gallardo 470 C1405DJR Buenos Aires Argentina
- Peter Michalik
  - Zoologisches Institut und Museum; Ernst-Moritz-Arndt-Universität; J.-S.-Bach-Str. 11/12 D-17489 Greifswald Germany
25
Van Slyke CE, Bradford YM, Westerfield M, Haendel MA. The zebrafish anatomy and stage ontologies: representing the anatomy and development of Danio rerio. J Biomed Semantics 2014; 5:12. [PMID: 24568621 PMCID: PMC3944782 DOI: 10.1186/2041-1480-5-12] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 06/19/2013] [Accepted: 02/07/2014] [Indexed: 01/07/2023] Open
Abstract
Background The Zebrafish Anatomy Ontology (ZFA) is an OBO Foundry ontology that is used in conjunction with the Zebrafish Stage Ontology (ZFS) to describe the gross and cellular anatomy and development of the zebrafish, Danio rerio, from single cell zygote to adult. The zebrafish model organism database (ZFIN) uses the ZFA and ZFS to annotate phenotype and gene expression data from the primary literature and from contributed data sets. Results The ZFA models anatomy and development with a subclass hierarchy, a partonomy, and a developmental hierarchy and with relationships to the ZFS that define the stages during which each anatomical entity exists. The ZFA and ZFS are developed utilizing OBO Foundry principles to ensure orthogonality, accessibility, and interoperability. The ZFA has 2860 classes representing a diversity of anatomical structures from different anatomical systems and from different stages of development. Conclusions The ZFA describes zebrafish anatomy and development semantically for the purposes of annotating gene expression and anatomical phenotypes. The ontology and the data have been used by other resources to perform cross-species queries of gene expression and phenotype data, providing insights into genetic relationships, morphological evolution, and models of human disease.
26
Morris RA, Dou L, Hanken J, Kelly M, Lowery DB, Ludäscher B, Macklin JA, Morris PJ. Semantic annotation of mutable data. PLoS One 2013; 8:e76093. [PMID: 24223697 PMCID: PMC3817185 DOI: 10.1371/journal.pone.0076093] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 04/04/2012] [Accepted: 08/22/2013] [Indexed: 11/18/2022] Open
Abstract
Electronic annotation of scientific data is very similar to annotation of documents. Both types of annotation amplify the original object, add related knowledge to it, and dispute or support assertions in it. In each case, annotation is a framework for discourse about the original object, and, in each case, an annotation needs to clearly identify its scope and its own terminology. However, electronic annotation of data differs from annotation of documents: the content of the annotations, including expectations and supporting evidence, is more often shared among members of networks. Any consequent actions taken by the holders of the annotated data could be shared as well. But even those current annotation systems that admit data as their subject often make it difficult or impossible to annotate at fine-enough granularity to use the results in this way for data quality control. We address these kinds of issues by offering simple extensions to an existing annotation ontology and describe how the results support an interest-based distribution of annotations. We are using the result to design and deploy a platform that supports annotation services overlaid on networks of distributed data, with particular application to data quality control. Our initial instance supports a set of natural science collection metadata services. An important application is the support for data quality control and provision of missing data. A previous proof of concept demonstrated such use based on data annotations modeled with XML-Schema.
Affiliation(s)
- Robert A. Morris
  - Harvard University Herbaria, Cambridge, Massachusetts, United States of America
  - Computer Science Department, University of Massachusetts, Boston, Massachusetts, United States of America
- Lei Dou
  - UC Davis Genome Center, University of California, Davis, California, United States of America
- James Hanken
  - Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts, United States of America
- Maureen Kelly
  - Harvard University Herbaria, Cambridge, Massachusetts, United States of America
- David B. Lowery
  - Harvard University Herbaria, Cambridge, Massachusetts, United States of America
  - Computer Science Department, University of Massachusetts, Boston, Massachusetts, United States of America
- Bertram Ludäscher
  - UC Davis Genome Center, University of California, Davis, California, United States of America
- Paul J. Morris
  - Harvard University Herbaria, Cambridge, Massachusetts, United States of America
  - Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts, United States of America
27
Segerdell E, Ponferrada VG, James-Zorn C, Burns KA, Fortriede JD, Dahdul WM, Vize PD, Zorn AM. Enhanced XAO: the ontology of Xenopus anatomy and development underpins more accurate annotation of gene expression and queries on Xenbase. J Biomed Semantics 2013; 4:31. [PMID: 24139024 PMCID: PMC3816597 DOI: 10.1186/2041-1480-4-31] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 06/22/2013] [Accepted: 10/11/2013] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND The African clawed frogs Xenopus laevis and Xenopus tropicalis are prominent animal model organisms. Xenopus research contributes to the understanding of genetic, developmental and molecular mechanisms underlying human disease. The Xenopus Anatomy Ontology (XAO) reflects the anatomy and embryological development of Xenopus. The XAO provides consistent terminology that can be applied to anatomical feature descriptions along with a set of relationships that indicate how each anatomical entity is related to others in the embryo, tadpole, or adult frog. The XAO is integral to the functionality of Xenbase (http://www.xenbase.org), the Xenopus model organism database. RESULTS We significantly expanded the XAO in the last five years by adding 612 anatomical terms, 2934 relationships between them, 640 synonyms, and 547 ontology cross-references. Each term now has a definition, so database users and curators can be certain they are selecting the correct term when specifying an anatomical entity. With developmental timing information now asserted for every anatomical term, the ontology provides internal checks that ensure high-quality gene expression and phenotype data annotation. The XAO, now with 1313 defined anatomical and developmental stage terms, has been integrated with Xenbase expression and anatomy term searches and it enables links between various data types including images, clones, and publications. Improvements to the XAO structure and anatomical definitions have also enhanced cross-references to anatomy ontologies of other model organisms and humans, providing a bridge between Xenopus data and other vertebrates. The ontology is free and open to all users. CONCLUSIONS The expanded and improved XAO allows enhanced capture of Xenopus research data and aids mechanisms for performing complex retrieval and analysis of gene expression, phenotypes, and antibodies through text-matching and manual curation. 
Its comprehensive references to ontologies across taxa help integrate these data for human disease modeling.
Affiliation(s)
- Erik Segerdell
  - Knight Cancer Institute, Oregon Health & Science University, Portland, OR, USA
- Virgilio G Ponferrada
  - Division of Developmental Biology, Cincinnati Children’s Research Foundation, Cincinnati, OH, USA
- Christina James-Zorn
  - Division of Developmental Biology, Cincinnati Children’s Research Foundation, Cincinnati, OH, USA
- Kevin A Burns
  - Division of Developmental Biology, Cincinnati Children’s Research Foundation, Cincinnati, OH, USA
- Joshua D Fortriede
  - Division of Developmental Biology, Cincinnati Children’s Research Foundation, Cincinnati, OH, USA
- Wasila M Dahdul
  - Department of Biology, University of South Dakota, Vermillion, SD, USA
  - National Evolutionary Synthesis Center, Durham, NC, USA
- Peter D Vize
  - Department of Biological Science, University of Calgary, Calgary, AB, Canada
- Aaron M Zorn
  - Division of Developmental Biology, Cincinnati Children’s Research Foundation, Cincinnati, OH, USA
28
Druzinsky R, Mungall C, Haendel M, Lapp H, Mabee P. What is an anatomy ontology? Anat Rec (Hoboken) 2013; 296:1797-9. [DOI: 10.1002/ar.22805] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/07/2013] [Revised: 06/18/2013] [Accepted: 08/05/2013] [Indexed: 11/08/2022]
Affiliation(s)
- Robert Druzinsky
  - Department of Oral Biology; College of Dentistry, University of Illinois; Chicago
- Christopher Mungall
  - Department of Genome Dynamics; Lawrence Berkeley Laboratory; Berkeley California
- Melissa Haendel
  - Department of Medical Informatics and Epidemiology; Oregon Health & Science University; Portland Oregon
- Hilmar Lapp
  - National Evolutionary Synthesis Center (NESCent); Durham North Carolina
- Paula Mabee
  - Department of Biology; University of South Dakota; Vermillion South Dakota
29
Brinkley JF, Borromeo C, Clarkson M, Cox TC, Cunningham MJ, Detwiler LT, Heike CL, Hochheiser H, Mejino JLV, Travillian RS, Shapiro LG. The ontology of craniofacial development and malformation for translational craniofacial research. Am J Med Genet C Semin Med Genet 2013; 163C:232-45. [PMID: 24124010 DOI: 10.1002/ajmg.c.31377] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 01/04/2023]
Abstract
We introduce the Ontology of Craniofacial Development and Malformation (OCDM) as a mechanism for representing knowledge about craniofacial development and malformation, and for using that knowledge to facilitate integrating craniofacial data obtained via multiple techniques from multiple labs and at multiple levels of granularity. The OCDM is a project of the NIDCR-sponsored FaceBase Consortium, whose goal is to promote and enable research into the genetic and epigenetic causes of specific craniofacial abnormalities through the provision of publicly accessible, integrated craniofacial data. However, the OCDM should be usable for integrating any web-accessible craniofacial data, not just those data available through FaceBase. The OCDM is based on the Foundational Model of Anatomy (FMA), our comprehensive ontology of canonical human adult anatomy, and includes modules to represent adult and developmental craniofacial anatomy in both human and mouse, mappings between homologous structures in human and mouse, and associated malformations. We describe these modules, as well as prototype uses of the OCDM for integrating craniofacial data. By using the terms from the OCDM to annotate data, and by combining queries over the ontology with those over annotated data, it becomes possible to create "intelligent" queries that can, for example, find gene expression data obtained from mouse structures that are precursors to homologous human structures involved in malformations such as cleft lip. We suggest that the OCDM can be useful not only for integrating craniofacial data, but also for expressing new knowledge gained from analyzing the integrated data.