1
|
Cardoso SD, Pruski C, Da Silveira M. Supporting biomedical ontology evolution by identifying outdated concepts and the required type of change. J Biomed Inform 2018; 87:1-11. [DOI: 10.1016/j.jbi.2018.08.013] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/14/2018] [Revised: 08/16/2018] [Accepted: 08/28/2018] [Indexed: 11/16/2022]
|
2
|
Medina-Moreira J, Lagos-Ortiz K, Luna-Aveiga H, Apolinario-Arzube O, Salas-Zárate MDP, Valencia-García R. Knowledge Acquisition Through Ontologies from Medical Natural Language Texts. JOURNAL OF INFORMATION TECHNOLOGY RESEARCH 2017. [DOI: 10.4018/jitr.2017100104] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/09/2022]
Abstract
Ontologies are used to represent knowledge, and they have become very important in the Semantic Web era. Ontologies evolve continuously during their life cycle to adapt to new requirements and needs, especially in the biomedical field, where the number of ontologies and their complexity have increased in recent years. In addition, a vast amount of clinical knowledge resides in natural language texts. For these reasons, building and maintaining biomedical ontologies from natural language texts is a relevant and challenging issue. To provide a general solution and to minimize experts' participation during the ontology enrichment process, this work proposes a methodology for extracting terms and relations from natural language texts. The framework is based on linguistic and statistical methods and semantic role labeling technologies. It was validated in the domain of diabetes, where it obtained encouraging results, with F-measures of 82.1% and 79.9% for concepts and relations, respectively.
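For readers unfamiliar with the metric, the F-measure reported above is conventionally the harmonic mean of precision and recall. The abstract does not say which variant was used, so the balanced form below is an assumption, with P denoting precision and R denoting recall:

```latex
F = \frac{2 \, P \, R}{P + R}
```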
|
3
|
Groß A, Pruski C, Rahm E. Evolution of biomedical ontologies and mappings: Overview of recent approaches. Comput Struct Biotechnol J 2016; 14:333-40. [PMID: 27642503 PMCID: PMC5018063 DOI: 10.1016/j.csbj.2016.08.002] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 05/02/2016] [Revised: 08/19/2016] [Accepted: 08/23/2016] [Indexed: 11/16/2022] Open
Abstract
Biomedical ontologies are heavily used to annotate data, and different ontologies are often interlinked by ontology mappings. These ontology-based mappings and annotations are used in many applications and analysis tasks. Since biomedical ontologies are continuously updated, dependent artifacts can become outdated and need to undergo evolution as well. Hence, there is a need for largely automated approaches to keep ontology-based mappings up to date in the presence of evolving ontologies. In this article, we survey current approaches and novel directions in the context of ontology and mapping evolution. We discuss requirements for mapping adaptation and provide a comprehensive overview of existing approaches. We further identify open challenges and outline ideas for future developments.
Affiliation(s)
- Anika Groß
- Institute of Computer Science, Universität Leipzig, P.O. Box 100920, 04009 Leipzig, Germany
- Cédric Pruski
- Luxembourg Institute of Science and Technology, 5 Avenue des Hauts-Fourneaux, L-4362 Esch-sur-Alzette, Luxembourg
- Erhard Rahm
- Institute of Computer Science, Universität Leipzig, P.O. Box 100920, 04009 Leipzig, Germany
|
4
|
Da Silveira M, Dos Reis JC, Pruski C. Management of Dynamic Biomedical Terminologies: Current Status and Future Challenges. Yearb Med Inform 2015; 10:125-33. [PMID: 26293859 DOI: 10.15265/iy-2015-002] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 11/24/2022] Open
Abstract
OBJECTIVES Controlled terminologies and their dependent artefacts provide a consensual understanding of a domain while reducing ambiguities and enabling reasoning. However, the evolution of a domain's knowledge directly impacts these terminologies and generates inconsistencies in the underlying biomedical information systems. In this article, we review existing work addressing the dynamic aspect of terminologies as well as their effects on mappings and semantic annotations. METHODS We investigate approaches related to the identification, characterization and propagation of changes in terminologies, mappings and semantic annotations including techniques to update their content. RESULTS AND CONCLUSION Based on the explored issues and existing methods, we outline open research challenges requiring investigation in the near future.
Affiliation(s)
- M Da Silveira
- Dr. Marcos Da Silveira, Luxembourg Institute of Science and Technology (LIST), 5, avenue des Hauts-Fourneaux, 4362 Esch/Alzette, Luxembourg
|
5
|
Analysis and Prediction of User Editing Patterns in Ontology Development Projects. JOURNAL ON DATA SEMANTICS 2015; 4:117-132. [PMID: 26052350 DOI: 10.1007/s13740-014-0047-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/24/2022]
Abstract
The development of real-world ontologies is a complex undertaking, commonly involving a group of domain experts with different expertise who work together in a collaborative setting. These ontologies are usually large and have complex structures. To assist in the authoring process, ontology tools are key to making the editing process as streamlined as possible. Being able to predict confidently what users are likely to do next as they edit an ontology will enable us to focus and structure the user interface accordingly and to facilitate more efficient interaction and information discovery. In this paper, we use data mining, specifically association rule mining, to investigate whether we can predict the next editing operation that a user will make based on the change history. We simulated and evaluated continuous prediction across time using a sliding window model. We used association rule mining to generate patterns from the ontology change logs in the training window and tested these patterns on logs in the adjacent testing window. We also evaluated the impact of different training and testing window sizes on prediction accuracy. Finally, we evaluated prediction accuracy across different user groups and different ontologies. Our results indicate that we can indeed predict the next editing operation a user is likely to make. We will use the discovered editing patterns to develop a recommendation module for our editing tools and to design user interface components that better fit users' editing behavior.
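The sliding-window evaluation described in this abstract can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the authors' implementation: the change log is assumed to be a plain list of operation names, rules are restricted to the form "previous operation -> next operation", and the function names and thresholds (`mine_rules`, `sliding_window_accuracy`, `min_support`, `min_confidence`) are hypothetical.

```python
from collections import Counter


def mine_rules(ops, min_support=2, min_confidence=0.5):
    """Mine 'previous operation -> next operation' rules from one training window."""
    pair_counts = Counter(zip(ops, ops[1:]))
    antecedent_counts = Counter(ops[:-1])
    rules = {}
    for (prev, nxt), count in pair_counts.items():
        confidence = count / antecedent_counts[prev]
        if count >= min_support and confidence >= min_confidence:
            # Keep only the most confident consequent for each antecedent.
            if prev not in rules or confidence > rules[prev][1]:
                rules[prev] = (nxt, confidence)
    return rules


def sliding_window_accuracy(log, train_size, test_size):
    """Train on one window, test on the adjacent one, then slide forward."""
    accuracies = []
    start = 0
    while start + train_size + test_size <= len(log):
        train = log[start:start + train_size]
        test = log[start + train_size:start + train_size + test_size]
        rules = mine_rules(train)
        predicted = correct = 0
        for prev, actual in zip(test, test[1:]):
            if prev in rules:
                predicted += 1
                correct += rules[prev][0] == actual
        # None marks a window in which no rule fired.
        accuracies.append(correct / predicted if predicted else None)
        start += test_size
    return accuracies
```

Varying `train_size` and `test_size` in such a sketch corresponds to the window-size experiments mentioned in the abstract; a full association rule miner would also consider multi-item antecedents.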
|
6
|
Christen V, Hartung M, Groß A. Region Evolution eXplorer - A tool for discovering evolution trends in ontology regions. J Biomed Semantics 2015; 6:26. [PMID: 26034559 PMCID: PMC4450457 DOI: 10.1186/s13326-015-0020-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/30/2014] [Accepted: 04/17/2015] [Indexed: 11/30/2022] Open
Abstract
Background: A large number of life science ontologies have been developed to support different application scenarios such as gene annotation or functional analysis. The continuous accumulation of new insights and knowledge affects specific portions of ontologies and thus leads to their adaptation. Therefore, it is valuable to study which ontology parts have been extensively modified or have remained unchanged. Users can monitor the evolution of an ontology to improve its further development or apply the knowledge in their applications. Results: Here we present REX (Region Evolution eXplorer), a web-based system for exploring the evolution of ontology parts (regions). REX provides an analysis platform for currently about 1,000 versions of 16 well-known life science ontologies. Interactive workflows allow an explorative analysis of changing ontology regions and can be used to study evolution trends over long-term periods. Conclusion: REX is a web application providing an interactive and user-friendly interface to identify (un)stable regions in large life science ontologies. It is available at http://www.izbi.de/rex.
Affiliation(s)
- Victor Christen
- Department of Computer Science, Universität Leipzig, Augustusplatz 10, Leipzig, Germany
- Michael Hartung
- Department of Computer Science, Universität Leipzig, Augustusplatz 10, Leipzig, Germany; Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstr. 16-18, Leipzig, Germany
- Anika Groß
- Department of Computer Science, Universität Leipzig, Augustusplatz 10, Leipzig, Germany; Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstr. 16-18, Leipzig, Germany
|
7
|
Tao S, Cui L, Zhu W, Sun M, Bodenreider O, Zhang GQ. Mining Relation Reversals in the Evolution of SNOMED CT Using MapReduce. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2015; 2015:46-50. [PMID: 26306232 PMCID: PMC4525241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Relation reversals in ontological systems refer to patterns in which a path from concept A to concept B in one version becomes a path with the positions of A and B switched in another version. We present a scalable approach, using cloud computing, to systematically extract all hierarchical relation reversals among 8 SNOMED CT versions from 2009 to 2014. Taking advantage of our MapReduce algorithms for computing transitive closure and large-scale set operations, we found 48 reversals through 28 pairwise comparisons of the 8 versions, which cover all possible scenarios, in 18 minutes using a 30-node local cloud. Except for one, all such reversals occurred in three sub-hierarchies: Body Structure, Clinical Finding, and Procedure. Two (2) reversal pairs involved an uncoupling of the pair before the is-a coupling was reversed. Twelve (12) reversal pairs involved paths of length two, and none (0) involved paths longer than two. Such reversals not only represent areas of potential need for additional modeling work, but are also important for identifying and handling cycles for comparative visualization of ontological evolution.
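The idea of detecting reversals through transitive closure and set operations can be sketched on a single machine in Python. This is only an illustration under assumptions: each version is assumed to be a list of `(child, parent)` is-a edges, the function names are hypothetical, and the paper's MapReduce algorithms are not reproduced here.

```python
from itertools import combinations


def transitive_closure(edges):
    """Return every (descendant, ancestor) pair reachable over is-a edges."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)
    closure = set()
    for start in parents:
        stack = list(parents[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue  # already expanded; also guards against cycles
            seen.add(node)
            closure.add((start, node))
            stack.extend(parents.get(node, ()))
    return closure


def reversals(edges_old, edges_new):
    """Pairs where A is below B in the old version but B is below A in the new one."""
    old = transitive_closure(edges_old)
    new = transitive_closure(edges_new)
    return {(a, b) for (a, b) in old if a != b and (b, a) in new}


def version_pairs(versions):
    """All unordered pairs of versions to compare."""
    return list(combinations(versions, 2))
```

For 8 versions, `version_pairs` yields 8 × 7 / 2 = 28 pairs, which matches the number of pairwise comparisons reported in the abstract.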
Affiliation(s)
- Shiqiang Tao
- Department of EECS, Case Western Reserve University, Cleveland, OH, USA; Division of Medical Informatics, Case Western Reserve University, Cleveland, OH, USA
- Licong Cui
- Department of EECS, Case Western Reserve University, Cleveland, OH, USA
- Wei Zhu
- Department of EECS, Case Western Reserve University, Cleveland, OH, USA
- Mengmeng Sun
- Department of EECS, Case Western Reserve University, Cleveland, OH, USA
- Guo-Qiang Zhang
- Department of EECS, Case Western Reserve University, Cleveland, OH, USA; Division of Medical Informatics, Case Western Reserve University, Cleveland, OH, USA
|
8
|
Measuring the evolution of ontology complexity: the gene ontology case study. PLoS One 2013; 8:e75993. [PMID: 24146805 PMCID: PMC3795689 DOI: 10.1371/journal.pone.0075993] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 06/27/2013] [Accepted: 08/20/2013] [Indexed: 01/09/2023] Open
Abstract
Ontologies support automatic sharing, combination and analysis of life sciences data. They undergo regular curation and enrichment. We studied the impact of an ontology's evolution on its structural complexity. As a case study we used the sixty monthly releases between January 2008 and December 2012 of the Gene Ontology and its three independent branches, i.e. biological processes (BP), cellular components (CC) and molecular functions (MF). For each case, we measured complexity by computing metrics related to the size, the node connectivity and the hierarchical structure. The number of classes and relations increased monotonically for each branch, with different growth rates. BP and CC had similar connectivity, superior to that of MF. Connectivity increased monotonically for BP, decreased for CC and remained stable for MF, with a marked increase for the three branches in November and December 2012. Hierarchy-related measures showed that CC and MF had similar proportions of leaves, average depths and average heights. BP had a lower proportion of leaves, and a higher average depth and average height. For BP and MF, the late 2012 increase of connectivity resulted in an increase of the average depth and average height and a decrease of the proportion of leaves, indicating that a major enrichment effort of the intermediate-level hierarchy occurred. The variation of the number of classes and relations in an ontology does not provide enough information about the evolution of its complexity. However, connectivity and hierarchy-related metrics revealed different patterns of values as well as of evolution for the three branches of the Gene Ontology. CC was similar to BP in terms of connectivity, and similar to MF in terms of hierarchy. Overall, BP complexity increased, CC was refined with the addition of leaves providing a finer level of annotations but decreasing slightly its complexity, and MF complexity remained stable.
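Size, connectivity and hierarchy metrics of the kind listed in this abstract can be sketched in a few lines of Python. The definitions below are assumptions made for illustration, since the abstract does not give the exact formulas: connectivity is taken as relations per class, depth as the shortest is-a path to a root, and the ontology is assumed to be an acyclic mapping from each class to its direct parents. The function name `complexity_metrics` is hypothetical.

```python
from functools import lru_cache


def complexity_metrics(is_a):
    """is_a maps each class to the set of its direct parents (empty for roots)."""
    classes = set(is_a)
    for parents in is_a.values():
        classes |= set(parents)
    # A class is a leaf if no other class names it as a parent.
    has_child = {p for parents in is_a.values() for p in parents}
    n_relations = sum(len(parents) for parents in is_a.values())

    @lru_cache(maxsize=None)
    def depth(cls):
        # Assumed definition: shortest is-a path from the class to a root.
        parents = is_a.get(cls, ())
        return 0 if not parents else 1 + min(depth(p) for p in parents)

    n = len(classes)
    return {
        "classes": n,
        "relations": n_relations,
        "relations_per_class": n_relations / n,
        "leaf_proportion": len(classes - has_child) / n,
        "average_depth": sum(depth(c) for c in classes) / n,
    }
```

Computing such a dictionary for each release and comparing the values over time would give the kind of per-branch trend the abstract describes; for an ontology with very deep hierarchies the recursive `depth` helper would need an iterative rewrite.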
|
9
|
Pesquita C, Couto FM. Predicting the extension of biomedical ontologies. PLoS Comput Biol 2012; 8:e1002630. [PMID: 23028267 PMCID: PMC3441454 DOI: 10.1371/journal.pcbi.1002630] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 12/20/2011] [Accepted: 06/11/2012] [Indexed: 11/25/2022] Open
Abstract
Developing and extending a biomedical ontology is a very demanding task that can never be considered complete given our ever-evolving understanding of the life sciences. Extension in particular can benefit from the automation of some of its steps, thus releasing experts to focus on harder tasks. Here we present a strategy to support the automation of change capturing within ontology extension, where the need for new concepts or relations is identified. Our strategy is based on predicting areas of an ontology that will undergo extension in a future version by applying supervised learning over features of previous ontology versions. We used the Gene Ontology as our test bed and obtained encouraging results, with average F-measure reaching 0.79 for a subset of biological process terms. Our strategy was also able to outperform state-of-the-art change-capturing methods. In addition, we have identified several issues concerning prediction of ontology evolution, and have delineated a general framework for ontology extension prediction. Our strategy can be applied to any biomedical ontology with versioning, to help focus either manual or semi-automated extension methods on areas of the ontology that need extension.
Biomedical knowledge is complex and in constant evolution and growth, making it difficult for researchers to keep up with novel discoveries. Ontologies have become essential to help with this issue since they provide a standardized format to describe knowledge that facilitates its storing, sharing and computational analysis. However, the effort to keep a biomedical ontology up to date is a demanding and costly task involving several experts. Much of this effort is dedicated to the addition of new elements to extend the ontology to cover new areas of knowledge.
We have developed an automated methodology to identify areas of the ontology that need extension based on past versions of the ontology as well as external data such as references in scientific literature and ontology usage. This can be a valuable help to semi-automated ontology extension systems, since they can focus on the subdomains of the identified ontology areas, thus reducing the amount of information to process, which in turn releases ontology developers to focus on more complex ontology evolution tasks. By contributing to a faster rate of ontology evolution, we hope to positively impact ontology-based applications such as natural language processing, computer reasoning, information integration or semantic querying of heterogeneous data.
Affiliation(s)
- Catia Pesquita
- Faculty of Sciences, University of Lisboa, Lisboa, Portugal.
|
10
|
COnto-Diff: generation of complex evolution mappings for life science ontologies. J Biomed Inform 2012; 46:15-32. [PMID: 22580476 DOI: 10.1016/j.jbi.2012.04.009] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Received: 07/25/2011] [Revised: 04/05/2012] [Accepted: 04/07/2012] [Indexed: 11/20/2022]
Abstract
Life science ontologies evolve frequently to meet new requirements or to better reflect the current domain knowledge. The development and adaptation of large and complex ontologies is typically performed collaboratively by several curators. To effectively manage the evolution of ontologies, it is essential to identify the difference (Diff) between ontology versions. Such a Diff supports the synchronization of changes in collaborative curation, the adaptation of dependent data such as annotations, and ontology version management. We propose a novel approach, COnto-Diff, to determine an expressive and invertible diff evolution mapping between given versions of an ontology. Our approach first matches the ontology versions and determines an initial evolution mapping consisting of basic change operations (insert/update/delete). To semantically enrich the evolution mapping, we adopt a rule-based approach to transform the basic change operations into a smaller set of more complex change operations, such as merge, split, or changes of entire subgraphs. The proposed algorithm is customizable in different ways to meet the requirements of diverse ontologies and application scenarios. We evaluate the proposed approach for large life science ontologies, including the Gene Ontology and the NCI Thesaurus, and compare it with PromptDiff. We further show how the Diff results can be used for version management and annotation migration in collaborative curation.
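The two-stage idea in this abstract (basic change operations first, then rule-based aggregation into complex operations) can be illustrated with a toy Python sketch. It is not COnto-Diff itself: versions are assumed to be dictionaries from concept ID to label, only a single "merge" rule is shown, and the `replaced_by` mapping (deleted concept ID to its successor) and both function names are hypothetical.

```python
def basic_diff(old, new):
    """Basic change operations between two versions (concept ID -> label)."""
    ops = []
    for cid in sorted(new.keys() - old.keys()):
        ops.append(("insert", cid))
    for cid in sorted(old.keys() - new.keys()):
        ops.append(("delete", cid))
    for cid in sorted(old.keys() & new.keys()):
        if old[cid] != new[cid]:
            ops.append(("update", cid))
    return ops


def aggregate_merges(ops, replaced_by):
    """Rule: two or more deleted concepts with the same successor form one merge."""
    by_target = {}
    for kind, cid in ops:
        target = replaced_by.get(cid) if kind == "delete" else None
        if target is not None:
            by_target.setdefault(target, []).append(cid)
    merged = set()
    complex_ops = []
    for target, sources in sorted(by_target.items()):
        if len(sources) >= 2:
            complex_ops.append(("merge", tuple(sources), target))
            merged.update(sources)
    # Deletes absorbed by a merge are dropped; all other basic operations remain.
    remaining = [op for op in ops if not (op[0] == "delete" and op[1] in merged)]
    return complex_ops + remaining
```

In this toy form the aggregated result is smaller than the basic diff whenever a merge fires, which mirrors the abstract's point that complex operations give a more compact and more meaningful description of the change.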
|
11
|
Hartung M, Gross A, Rahm E. CODEX: exploration of semantic changes between ontology versions. Bioinformatics 2012; 28:895-6. [DOI: 10.1093/bioinformatics/bts029] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022] Open
|
12
|
Kirsten T, Gross A, Hartung M, Rahm E. GOMMA: a component-based infrastructure for managing and analyzing life science ontologies and their evolution. J Biomed Semantics 2011; 2:6. [PMID: 21914205 PMCID: PMC3198872 DOI: 10.1186/2041-1480-2-6] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Received: 12/17/2010] [Accepted: 09/13/2011] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND Ontologies are increasingly used to structure and semantically describe entities of domains, such as genes and proteins in life sciences. Their increasing size and the high frequency of updates, resulting in a large set of ontology versions, necessitate efficient management and analysis of this data. RESULTS We present GOMMA, a generic infrastructure for managing and analyzing life science ontologies and their evolution. GOMMA utilizes a generic repository to uniformly and efficiently manage ontology versions and different kinds of mappings. Furthermore, it provides components for ontology matching and for determining evolutionary ontology changes. These components are used by analysis tools, such as the Ontology Evolution Explorer (OnEX) and the detection of unstable ontology regions. We introduce the component-based infrastructure and show analysis results for selected components and life science applications. GOMMA is available at http://dbs.uni-leipzig.de/GOMMA. CONCLUSIONS GOMMA provides a comprehensive and scalable infrastructure to manage large life science ontologies and analyze their evolution. Key functions include a generic storage of ontology versions and mappings, and support for ontology matching and determining ontology changes. The supported features for analyzing ontology changes help assess their impact on ontology-dependent applications such as term enrichment. GOMMA complements OnEX by providing functionalities to manage various versions of mappings between two ontologies and allows combining different match approaches.
Affiliation(s)
- Toralf Kirsten
- Interdisciplinary Centre for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany.
|
13
|
Hartung M, Gross A, Kirsten T, Rahm E. Discovering Evolving Regions in Life Science Ontologies. LECTURE NOTES IN COMPUTER SCIENCE 2010. [DOI: 10.1007/978-3-642-15120-0_3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/26/2023]
|