1. Vos RA, Katayama T, Mishima H, Kawano S, Kawashima S, Kim JD, Moriya Y, Tokimatsu T, Yamaguchi A, Yamamoto Y, Wu H, Amstutz P, Antezana E, Aoki NP, Arakawa K, Bolleman JT, Bolton E, Bonnal RJP, Bono H, Burger K, Chiba H, Cohen KB, Deutsch EW, Fernández-Breis JT, Fu G, Fujisawa T, Fukushima A, García A, Goto N, Groza T, Hercus C, Hoehndorf R, Itaya K, Juty N, Kawashima T, Kim JH, Kinjo AR, Kotera M, Kozaki K, Kumagai S, Kushida T, Lütteke T, Matsubara M, Miyamoto J, Mohsen A, Mori H, Naito Y, Nakazato T, Nguyen-Xuan J, Nishida K, Nishida N, Nishide H, Ogishima S, Ohta T, Okuda S, Paten B, Perret JL, Prathipati P, Prins P, Queralt-Rosinach N, Shinmachi D, Suzuki S, Tabata T, Takatsuki T, Taylor K, Thompson M, Uchiyama I, Vieira B, Wei CH, Wilkinson M, Yamada I, Yamanaka R, Yoshitake K, Yoshizawa AC, Dumontier M, Kosaki K, Takagi T. BioHackathon 2015: Semantics of data for life sciences and reproducible research. F1000Res 2020; 9:136. [PMID: 32308977 PMCID: PMC7141167 DOI: 10.12688/f1000research.18236.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Accepted: 02/05/2020] [Indexed: 01/08/2023] [Open access]
Abstract
We report on the activities of the 2015 edition of the BioHackathon, an annual event that brings together researchers and developers from around the world to develop tools and technologies that promote the reusability of biological data. We discuss issues surrounding the representation, publication, integration, mining and reuse of biological data and metadata across a wide range of biomedical data types of relevance for the life sciences, including chemistry, genotypes and phenotypes, orthology and phylogeny, proteomics, genomics, glycomics, and metabolomics. We describe our progress to address ongoing challenges to the reusability and reproducibility of research results, and identify outstanding issues that continue to impede the progress of bioinformatics research. We share our perspective on the state of the art, continued challenges, and goals for future research and development for the life sciences Semantic Web.
Affiliation(s)
- Rutger A. Vos
  - Institute of Biology Leiden, Leiden University, Leiden, The Netherlands
  - Naturalis Biodiversity Center, Leiden, The Netherlands
- Hiroyuki Mishima
  - Department of Human Genetics, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Shin Kawano
  - Database Center for Life Science, Tokyo, Japan
- Yuki Moriya
  - Database Center for Life Science, Tokyo, Japan
- Hongyan Wu
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Erick Antezana
  - Department of Biology, Norwegian University of Science and Technology, Trondheim, Norway
- Nobuyuki P. Aoki
  - Faculty of Science and Engineering, SOKA University, Tokyo, Japan
- Kazuharu Arakawa
  - Institute for Advanced Biosciences, Keio University, Tokyo, Japan
- Jerven T. Bolleman
  - SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Lausanne, Switzerland
- Evan Bolton
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, USA
- Raoul J. P. Bonnal
  - Istituto Nazionale Genetica Molecolare, Romeo ed Enrica Invernizzi, Milan, Italy
- Kees Burger
  - Dutch Techcentre for Life Sciences, Utrecht, The Netherlands
- Hirokazu Chiba
  - National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Japan
- Kevin B. Cohen
  - Computational Bioscience Program, University of Colorado School of Medicine, Denver, USA
  - Université Paris-Saclay, LIMSI, CNRS, Paris, France
- Gang Fu
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, USA
- Naohisa Goto
  - Research Institute for Microbial Diseases, Osaka University, Osaka, Japan
- Tudor Groza
  - St Vincent's Clinical School, Faculty of Medicine, University of New South Wales, Darlinghurst, Australia
  - Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, Australia
- Colin Hercus
  - Novocraft Technologies Sdn. Bhd., Selangor, Malaysia
- Robert Hoehndorf
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Kotone Itaya
  - Institute for Advanced Biosciences, Keio University, Tokyo, Japan
- Nick Juty
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Jee-Hyub Kim
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Akira R. Kinjo
  - Institute for Protein Research, Osaka University, Osaka, Japan
- Masaaki Kotera
  - School of Life Science and Technology, Tokyo Institute of Technology, Tokyo, Japan
- Kouji Kozaki
  - The Institute of Scientific and Industrial Research, Osaka University, Osaka, Japan
- Tatsuya Kushida
  - National Bioscience Database Center, Japan Science and Technology Agency, Tokyo, Japan
- Thomas Lütteke
  - Institute of Veterinary Physiology and Biochemistry, Justus-Liebig University Giessen, Giessen, Germany
  - Gesellschaft für innovative Personalwirtschaftssysteme mbH (GIP GmbH), Offenbach, Germany
- Attayeb Mohsen
  - National Institutes of Biomedical Innovation, Health and Nutrition, Osaka, Japan
- Hiroshi Mori
  - Center for Information Biology, National Institute of Genetics, Mishima, Japan
- Yuki Naito
  - Database Center for Life Science, Tokyo, Japan
- Naoki Nishida
  - Department of Systems Science, Osaka University, Osaka, Japan
- Hiroyo Nishide
  - National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Japan
- Soichi Ogishima
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan
- Tazro Ohta
  - Database Center for Life Science, Tokyo, Japan
- Shujiro Okuda
  - Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Benedict Paten
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, USA
- Philip Prathipati
  - National Institutes of Biomedical Innovation, Health and Nutrition, Osaka, Japan
- Pjotr Prins
  - University Medical Center Utrecht, Utrecht, The Netherlands
  - University of Tennessee Health Science Center, Memphis, USA
- Núria Queralt-Rosinach
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Shinya Suzuki
  - School of Life Science and Technology, Tokyo Institute of Technology, Tokyo, Japan
- Tsuyosi Tabata
  - Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
- Kieron Taylor
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Mark Thompson
  - Leiden University Medical Center, Leiden, The Netherlands
- Ikuo Uchiyama
  - National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Japan
- Bruno Vieira
  - WurmLab, School of Biological & Chemical Sciences, Queen Mary University of London, London, UK
- Chih-Hsuan Wei
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, USA
- Mark Wilkinson
  - Escuela Técnica Superior de Ingeniería Agronómica, Alimentaria y de Biosistemas, Universidad Politécnica de Madrid, Madrid, Spain
- Kazutoshi Yoshitake
  - Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan
- Michel Dumontier
  - Institute of Data Science, Maastricht University, Maastricht, The Netherlands
- Kenjiro Kosaki
  - Center for Medical Genetics, Keio University School of Medicine, Tokyo, Japan
- Toshihisa Takagi
  - National Bioscience Database Center, Japan Science and Technology Agency, Tokyo, Japan
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan

2. Alobaidi M, Malik KM, Hussain M. Automated ontology generation framework powered by linked biomedical ontologies for disease-drug domain. Computer Methods and Programs in Biomedicine 2018; 165:117-128. [PMID: 30337066 DOI: 10.1016/j.cmpb.2018.08.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 12/31/2017] [Revised: 07/31/2018] [Accepted: 08/14/2018] [Indexed: 06/08/2023]
Abstract
OBJECTIVE AND BACKGROUND The exponential growth of unstructured data in the biomedical literature and in Electronic Health Records (EHRs) requires powerful novel technologies and architectures to unlock the information hidden in that data. The success of smart healthcare applications such as clinical decision support systems, disease diagnosis systems, and healthcare management systems depends on knowledge that machines can interpret and use to infer new knowledge. In this regard, ontological data models are expected to play a vital role in organizing, integrating, and making informative inferences from the knowledge implicit in unstructured data, and in representing the resulting knowledge in a form that machines can understand. However, constructing such models is challenging because they demand intensive labor, domain experts, and ontology engineers. Such requirements limit the scale and scope of ontological data models. We present a framework that reduces the time needed to build ontologies and helps achieve machine interoperability.
METHODS Empowered by linked biomedical ontologies, our proposed novel Automated Ontology Generation Framework consists of five major modules: a) Text Processing using a compute-on-demand approach; b) Medical Semantic Annotation using N-gram, ontology linking, and classification algorithms; c) Relation Extraction using a graph method and Syntactic Patterns; d) Semantic Enrichment using RDF mining; and e) a Domain Inference Engine to build the formal ontology.
RESULTS Quantitative evaluations show 84.78% recall, 53.35% precision, and 67.70% F-measure for disease-drug concept identification; 85.51% recall, 69.61% precision, and 76.74% F-measure for taxonomic relation extraction; and 77.20% recall, 40.10% precision, and 52.78% F-measure for biomedical non-taxonomic relation extraction.
CONCLUSION We present an automated ontology generation framework that is empowered by Linked Biomedical Ontologies. This framework integrates various natural language processing, semantic enrichment, syntactic pattern, and graph algorithm based techniques. Moreover, it shows that using Linked Biomedical Ontologies enables a promising solution to the problem of automating the process of disease-drug ontology generation.
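As a reading aid for the relation-extraction figures quoted above, the following minimal sketch shows how an F-measure is commonly derived from precision and recall. It assumes the balanced F1 form (the harmonic mean); the abstract does not state which weighting the authors used, and the function name `f_measure` is hypothetical.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta).

    beta = 1 gives the balanced F1 score. Using F1 here is an assumption;
    the abstract does not say which variant was reported.
    """
    if precision <= 0.0 and recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall percentages as quoted in the abstract.
taxonomic = f_measure(precision=69.61, recall=85.51)
non_taxonomic = f_measure(precision=40.10, recall=77.20)

print(f"taxonomic relation extraction:     F = {taxonomic:.2f}%")
print(f"non-taxonomic relation extraction: F = {non_taxonomic:.2f}%")
```

Under the F1 assumption, both values agree with the reported relation-extraction F-measures to within 0.01 percentage points.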
Affiliation(s)
- Mazen Alobaidi
  - Department of Computer Science and Engineering, Oakland University, Rochester, MI, USA
- Khalid Mahmood Malik
  - Department of Computer Science and Engineering, Oakland University, Rochester, MI, USA
- Maqbool Hussain
  - Department of Software, College of Electronics and Information Engineering, Sejong University, Seoul, South Korea

3. Jacobson M, Sedeño-Cortés AE, Pavlidis P. Monitoring changes in the Gene Ontology and their impact on genomic data analysis. Gigascience 2018; 7:5069393. [PMID: 30107399 PMCID: PMC6113503 DOI: 10.1093/gigascience/giy103] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/05/2018] [Revised: 07/27/2018] [Accepted: 08/06/2018] [Indexed: 01/01/2023] [Open access]
Abstract
Background The Gene Ontology (GO) is one of the most widely used resources in molecular and cellular biology, largely through the use of "enrichment analysis." To facilitate informed use of GO, we present GOtrack (https://gotrack.msl.ubc.ca), which provides access to historical records and trends in the GO and GO annotations.
Findings GOtrack gives users access to gene- and term-level information on annotations for nine model organisms as well as an interactive tool that measures the stability of enrichment results over time for user-provided "hit lists" of genes. To document the effects of GO evolution on enrichment, we analyzed more than 2,500 published hit lists of human genes (most older than 9 years); 53% of hit lists were considered to yield significantly stable enrichment results.
Conclusions Because stability is far from assured for any individual hit list, GOtrack can lead to more informed and cautious application of GO to genomics research.
Affiliation(s)
- Matthew Jacobson
  - Michael Smith Laboratories, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4
  - Department of Psychiatry, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4
- Adriana Estela Sedeño-Cortés
  - Graduate Program in Bioinformatics, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4
- Paul Pavlidis
  - Michael Smith Laboratories, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4
  - Department of Psychiatry, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4

4. Tomczak A, Mortensen JM, Winnenburg R, Liu C, Alessi DT, Swamy V, Vallania F, Lofgren S, Haynes W, Shah NH, Musen MA, Khatri P. Interpretation of biological experiments changes with evolution of the Gene Ontology and its annotations. Sci Rep 2018; 8:5115. [PMID: 29572502 PMCID: PMC5865181 DOI: 10.1038/s41598-018-23395-2] [Citation(s) in RCA: 78] [Impact Index Per Article: 11.1] [Received: 11/08/2017] [Accepted: 03/12/2018] [Indexed: 12/12/2022] [Open access]
Abstract
Gene Ontology (GO) enrichment analysis is ubiquitously used for interpreting high-throughput molecular data and generating hypotheses about the biological phenomena underlying experiments. However, the two building blocks of this analysis, the ontology and the annotations, evolve rapidly. We used gene signatures derived from 104 disease analyses to systematically evaluate how enrichment analysis results were affected by evolution of the GO over a decade. We found low consistency between enrichment analysis results obtained with early and more recent GO versions. Furthermore, there continues to be a strong annotation bias in the GO annotations, where 58% of the annotations are for 16% of the human genes. Our analysis suggests that GO evolution may have affected the interpretation and possibly reproducibility of experiments over time. Hence, researchers must exercise caution when interpreting GO enrichment analyses and should reexamine previous analyses with the most recent GO version.
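To make concrete why enrichment results can shift between GO releases, here is a minimal sketch of a one-sided hypergeometric enrichment test, a common choice for this kind of analysis. It is not taken from the paper: the function name `enrichment_p_value` and all counts are hypothetical, chosen only to show that the same hit list can yield a different p-value when the set of genes annotated to a term changes.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, hits: int, overlap: int) -> float:
    """One-sided hypergeometric tail P(X >= overlap).

    population: genes in the background set
    annotated:  background genes annotated to the GO term
    hits:       genes in the hit list
    overlap:    hit-list genes annotated to the GO term
    """
    upper = min(annotated, hits)
    total = comb(population, hits)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, hits - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts for one term in two annotation releases.
# The hit list is unchanged; only the term's annotations differ.
p_old = enrichment_p_value(population=1000, annotated=20, hits=50, overlap=5)
p_new = enrichment_p_value(population=1000, annotated=60, hits=50, overlap=6)

print(f"older release: p = {p_old:.3g}")
print(f"newer release: p = {p_new:.3g}")
```

In this toy setting, the term gains annotations faster than the hit list gains overlapping genes, so its apparent enrichment weakens even though the experiment itself has not changed.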
Affiliation(s)
- Aurelie Tomczak
  - Stanford Institute for Immunity, Transplantation and Infection (ITI), Stanford University, Stanford, CA, 94305, USA
  - Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA
- Jonathan M Mortensen
  - Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA
- Rainer Winnenburg
  - Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA
- Charles Liu
  - Stanford Institute for Immunity, Transplantation and Infection (ITI), Stanford University, Stanford, CA, 94305, USA
- Dominique T Alessi
  - Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA
- Varsha Swamy
  - Stanford Institute for Immunity, Transplantation and Infection (ITI), Stanford University, Stanford, CA, 94305, USA
- Francesco Vallania
  - Stanford Institute for Immunity, Transplantation and Infection (ITI), Stanford University, Stanford, CA, 94305, USA
- Shane Lofgren
  - Stanford Institute for Immunity, Transplantation and Infection (ITI), Stanford University, Stanford, CA, 94305, USA
- Winston Haynes
  - Stanford Institute for Immunity, Transplantation and Infection (ITI), Stanford University, Stanford, CA, 94305, USA
- Nigam H Shah
  - Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA
- Mark A Musen
  - Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA
- Purvesh Khatri
  - Stanford Institute for Immunity, Transplantation and Infection (ITI), Stanford University, Stanford, CA, 94305, USA
  - Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA

5. Yu G, Lu C, Wang J. NoGOA: predicting noisy GO annotations using evidences and sparse representation. BMC Bioinformatics 2017; 18:350. [PMID: 28732468 PMCID: PMC5521088 DOI: 10.1186/s12859-017-1764-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 05/02/2017] [Accepted: 07/14/2017] [Indexed: 01/11/2023] [Open access]
Abstract
BACKGROUND Gene Ontology (GO) is a community effort to represent functional features of gene products. GO annotations (GOA) provide functional associations between GO terms and gene products. Owing to limited resources, only a small portion of annotations are manually checked by curators, and the others are electronically inferred. Although quality control techniques have been applied to ensure the quality of annotations, the community consistently reports that there are still a considerable number of noisy (or incorrect) annotations. Given the wide application of annotations, identifying noisy annotations is an important but seldom studied open problem.
RESULTS We introduce a novel approach called NoGOA to predict noisy annotations. NoGOA first applies sparse representation on the gene-term association matrix to reduce the impact of noisy annotations, and takes advantage of sparse representation coefficients to measure the semantic similarity between genes. It then preliminarily predicts noisy annotations of a gene based on aggregated votes from the semantic neighborhood genes of that gene. Next, NoGOA estimates the ratio of noisy annotations for each evidence code based on direct annotations in GOA files archived at different times, weights entries of the association matrix via the estimated ratios, and propagates weights to ancestors of direct annotations using the GO hierarchy. Finally, it integrates the evidence-weighted association matrix and the aggregated votes to predict noisy annotations. Experiments on archived GOA files of six model species (H. sapiens, A. thaliana, S. cerevisiae, G. gallus, B. taurus and M. musculus) demonstrate that NoGOA achieves significantly better results than other related methods and that removing noisy annotations improves the performance of gene function prediction.
CONCLUSIONS The comparative study justifies the effectiveness of integrating evidence codes with sparse representation for predicting noisy GO annotations.
Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NoGOA .
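The weighting-and-propagation step described in the abstract can be illustrated with a small sketch. This is not the NoGOA implementation: the rule "weight = 1 minus the estimated noise ratio", the "keep the maximum" propagation rule, the noise ratios, and the toy term identifiers T1 to T5 are all assumptions made for illustration.

```python
def ancestors(term, parents):
    """Return all ancestors of `term` in a DAG given as child -> list of parents."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def propagate(direct_weights, parents):
    """Copy each direct annotation weight to every ancestor, keeping the maximum.

    The max rule is an assumption for this sketch, not the paper's rule.
    """
    weights = dict(direct_weights)
    for term, weight in direct_weights.items():
        for anc in ancestors(term, parents):
            weights[anc] = max(weights.get(anc, 0.0), weight)
    return weights


# Hypothetical toy hierarchy: T1 is the root; T4 has two parents.
parents = {"T2": ["T1"], "T3": ["T1"], "T4": ["T2", "T3"], "T5": ["T3"]}

# Hypothetical per-evidence-code noise ratios (not values from the paper).
noise_ratio = {"IDA": 0.05, "IEA": 0.30}

# Direct annotations of one gene as (term, evidence code) pairs.
direct_annotations = [("T4", "IDA"), ("T5", "IEA")]
direct = {term: 1.0 - noise_ratio[code] for term, code in direct_annotations}

weights = propagate(direct, parents)
for term in sorted(weights):
    print(term, round(weights[term], 2))
```

Because T3 is an ancestor of both annotated terms, it receives the larger of the two weights, which shows how an electronically inferred annotation can be dominated by a better-supported one higher in the hierarchy.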
Affiliation(s)
- Guoxian Yu
  - College of Computer and Information Sciences, Southwest University, Chongqing, China
- Chang Lu
  - College of Computer and Information Sciences, Southwest University, Chongqing, China
- Jun Wang
  - College of Computer and Information Sciences, Southwest University, Chongqing, China

6.
Abstract
The Gene Ontology (GO) is a formidable resource, but there are several considerations about it that are essential to understand the data and interpret it correctly. The GO is sufficiently simple that it can be used without deep understanding of its structure or how it is developed, which is both a strength and a weakness. In this chapter, we discuss some common misinterpretations of the ontology and the annotations. A better understanding of the pitfalls and the biases in the GO should help users make the most of this very rich resource. We also review some of the misconceptions and misleading assumptions commonly made about GO, including the effect of data incompleteness, the importance of annotation qualifiers, and the transitivity or lack thereof associated with different ontology relations. We also discuss several biases that can confound aggregate analyses such as gene enrichment analyses. For each of these pitfalls and biases, we suggest remedies and best practices.
Affiliation(s)
- Pascale Gaudet
  - CALIPHO group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel-Servet, 1211, Geneva 4, Switzerland
  - Department of Human Protein Sciences, Faculty of Medicine, University of Geneva, 1211, Geneva, Switzerland
- Christophe Dessimoz
  - Department of Genetics, Evolution & Environment, University College London, Gower St, London, WC1E 6BT, UK
  - Swiss Institute of Bioinformatics, Biophore Building, 1015, Lausanne, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Street Biophore, 1015, Lausanne, Switzerland
  - Center of Integrative Genomics, University of Lausanne, Biophore, 1015, Lausanne, Switzerland
  - Department of Computer Science, University College London, Gower St, WC1E 6BT, London, UK

7. Gordon CL, Weng C. Combining expert knowledge and knowledge automatically acquired from electronic data sources for continued ontology evaluation and improvement. J Biomed Inform 2015. [PMID: 26212414 DOI: 10.1016/j.jbi.2015.07.014] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 01/09/2023]
Abstract
INTRODUCTION A common bottleneck during ontology evaluation is knowledge acquisition from domain experts for gold standard creation. This paper contributes a novel semi-automated method for evaluating the concept coverage and accuracy of biomedical ontologies by complementing expert knowledge with knowledge automatically extracted from clinical practice guidelines and electronic health records, which minimizes reliance on expensive domain expertise for gold standard generation.
METHODS We developed a bacterial clinical infectious diseases ontology (BCIDO) to assist clinical infectious disease treatment decision support. Using a semi-automated method, we integrated diverse knowledge sources, including publicly available infectious disease guidelines from international repositories, electronic health records, and expert-generated infectious disease case scenarios, to generate a compendium of infectious disease knowledge and used it to evaluate the accuracy and coverage of BCIDO.
RESULTS BCIDO has three classes (i.e., infectious disease, antibiotic, bacteria) containing 593 distinct concepts and 2345 distinct concept relationships. Our semi-automated method generated an ID knowledge compendium consisting of 637 concepts and 1554 concept relationships. Overall, BCIDO covered 79% (504/637) of the concepts and 89% (1378/1554) of the concept relationships in the ID compendium. BCIDO coverage of ID compendium concepts was 92% (121/131) for antibiotic, 80% (205/257) for infectious disease, and 72% (178/249) for bacteria. The low coverage of bacterial concepts in BCIDO was due to a difference in concept granularity between BCIDO and infectious disease guidelines. Guidelines and expert-generated scenarios were the richest sources of ID concepts and relationships, while patient records provided relatively fewer concepts and relationships.
CONCLUSIONS Our semi-automated method was cost-effective for generating a useful knowledge compendium with minimal reliance on domain experts. This method can be useful for continued development and evaluation of biomedical ontologies for better accuracy and coverage.
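The coverage figures in the abstract above are ratios of compendium items found in BCIDO to all compendium items. The sketch below recomputes them from the quoted counts; the dictionary layout and the helper name `coverage_pct` are illustrative assumptions, not part of the authors' method.

```python
def coverage_pct(covered: int, total: int) -> float:
    """Percentage of compendium items that are also present in the ontology."""
    return 100.0 * covered / total


# (covered, total) counts as quoted in the abstract.
per_class = {
    "antibiotic": (121, 131),
    "infectious disease": (205, 257),
    "bacteria": (178, 249),
}
overall_concepts = (504, 637)
overall_relationships = (1378, 1554)

for name, (covered, total) in per_class.items():
    print(f"{name}: {coverage_pct(covered, total):.1f}%")
print(f"all concepts: {coverage_pct(*overall_concepts):.1f}%")
print(f"all relationships: {coverage_pct(*overall_relationships):.1f}%")

# Consistency check: per-class counts should add up to the overall concept counts.
covered_sum = sum(covered for covered, _ in per_class.values())
total_sum = sum(total for _, total in per_class.values())
print(covered_sum, total_sum)
```

The per-class counts sum to the overall concept counts (504 of 637). With plain rounding the bacteria ratio comes out at about 71.5%, slightly below the 72% quoted, so the published figure may reflect a different rounding convention.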
Affiliation(s)
- Claire L Gordon
  - Department of Medicine, Columbia University Medical Center, 630 West 168th Street, New York, USA
  - Department of Biomedical Informatics, Columbia University Medical Center, 622 West 168th Street, New York, NY 10032, USA
  - Department of Medicine, University of Melbourne, Melbourne, VIC 3010, Australia
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University Medical Center, 622 West 168th Street, New York, NY 10032, USA

8. Sedeño-Cortés AE, Pavlidis P. Pitfalls in the application of gene-set analysis to genetics studies. Trends Genet 2015; 30:513-4. [PMID: 25459301 DOI: 10.1016/j.tig.2014.10.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 10/01/2014] [Accepted: 10/01/2014] [Indexed: 11/25/2022]
Abstract
Gene-set analysis (GSA) (‘enrichment’) is a popular approach for the interpretation of genome-wide association studies (GWASs). GSA is most commonly applied to the analysis of transcriptomes, but from the outset it has been considered useful for any study that provides rankings or ‘hit lists’ of genes. The recent review by Mooney et al. [1] is a valuable resource for geneticists wishing to apply GSA to the output of GWASs. Here we describe some additional points of practical importance if the methods are to be applied and interpreted soundly.

9. Fan WW, Chen B, Selvaraj G, Wu FX. Discovering biological patterns from short time-series gene expression profiles with integrating PPI data. Neurocomputing 2014; 145:3-13. [DOI: 10.1016/j.neucom.2014.02.068] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/22/2022]

10. Hastings J, Brass A, Caine C, Jay C, Stevens R. Evaluating the Emotion Ontology through use in the self-reporting of emotional responses at an academic conference. J Biomed Semantics 2014; 5:38. [PMID: 25937879 PMCID: PMC4417517 DOI: 10.1186/2041-1480-5-38] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 12/10/2013] [Accepted: 08/14/2014] [Indexed: 01/21/2023] [Open access]
Abstract
Background We evaluate the application of the Emotion Ontology (EM) to the task of self-reporting of emotional experience in the context of audience response to academic presentations at the International Conference on Biomedical Ontology (ICBO). Ontology evaluation is regarded as a difficult task. Types of ontology evaluation range from gauging adherence to some philosophical principles, following some engineering method, to assessing fitness for purpose. The Emotion Ontology (EM) represents emotions and all related affective phenomena, and should enable self-reporting or articulation of emotional states and responses; how do we know if this is the case? Here we use the EM ‘in the wild’ in order to evaluate the EM’s ability to capture people’s self-reported emotional responses to a situation through use of the vocabulary provided by the EM.
Results To achieve this evaluation we developed a tool, EmOntoTag, in which audience members were able to capture their self-reported emotional responses to scientific presentations using the vocabulary offered by the EM. We furthermore asked participants using the tool to rate the appropriateness of an EM vocabulary term for capturing their self-assessed emotional response. Participants were also able to suggest improvements to the EM using a free-text feedback facility. Here, we present the data captured and analyse the EM’s fitness for purpose in reporting emotional responses to conference talks.
Conclusions Based on our analysis of this data set, our primary finding is that the audience are able to articulate their emotional response to a talk via the EM, and reporting via the EM ontology is able to draw distinctions between the audience’s response to a speaker and between the speakers (or talks) themselves. Thus we can conclude that the vocabulary provided at the leaves of the EM is fit for purpose in this setting. We additionally obtained interesting observations from the experiment as a whole, such as that the majority of emotions captured had positive valence, and the free-form feedback supplied new terms for the EM.
Availability EmOntoTag can be seen at http://www.bioontology.ch/emontotag; source code can be downloaded from http://emotion-ontology.googlecode.com/svn/trunk/apps/emontotag/ and the ontology is available at http://purl.obolibrary.org/obo/MFOEM.owl.
Electronic supplementary material The online version of this article (doi:10.1186/2041-1480-5-38) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Janna Hastings
  - Cheminformatics and Metabolism, EMBL - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD UK
  - Swiss Center for Affective Sciences, University of Geneva, Geneva, Switzerland
- Andy Brass
  - School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL UK
  - Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester, M13 9PL UK
- Colin Caine
  - School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL UK
- Caroline Jay
  - School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL UK
- Robert Stevens
  - School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL UK

11. Soldatova LN, Sansone SA, Dumontier M, Shah NH. Selected papers from the 15th Annual Bio-Ontologies Special Interest Group Meeting. J Biomed Semantics 2013; 4 Suppl 1:I1. [PMID: 23735191 PMCID: PMC3633002 DOI: 10.1186/2041-1480-4-s1-i1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] [Open access]
Abstract
Over the past 15 years, the Bio-Ontologies SIG at ISMB has provided a forum for discussion of the latest and most innovative research in bio-ontology development, its applications to biomedicine and, more generally, the organisation, presentation and dissemination of knowledge in biomedicine and the life sciences. The seven papers and the commentary selected for this supplement span a wide range of topics including: web-based querying over multiple ontologies, integration of data, annotating patent records, NCBO Web services, ontology developments for probabilistic reasoning and for physiological processes, and analysis of the progress of annotation and structural GO changes.