1
|
Getting More Out of Clinical Documentation: Can Clinical Dashboards Yield Clinically Useful Information? Administration and Policy in Mental Health and Mental Health Services Research 2024; 51:268-285. [PMID: 38261119 DOI: 10.1007/s10488-023-01329-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/29/2023] [Indexed: 01/24/2024]
Abstract
This study investigated coded data retrieved from clinical dashboards, which are decision-support tools that include a graphical display of clinical progress and clinical activities. Data were extracted from clinical dashboards representing 256 youth (M age = 11.9) from 128 practitioners who were trained in the Managing and Adapting Practice (MAP) system (Chorpita & Daleiden, 2014, Structuring the collaboration of science and service in pursuit of a shared vision, 43(2):323-338; Chorpita & Daleiden, 2018, Coordinated strategic action: Aspiring to wisdom in mental health service systems, 25(4):e12264) in 55 agencies across 5 regional mental health systems. Practitioners labeled up to 35 fields (i.e., descriptions of clinical activities), with the options of drawing from a controlled vocabulary or writing in a client-specific activity. Practitioners then noted when certain activities occurred during the episode of care. Fields from the extracted data were coded and reliability was assessed for Field Type, Practice Element Type, Target Area, and Audience (e.g., Caregiver Psychoeducation: Anxiety would be coded as Field Type = Practice Element; Practice Element Type = Psychoeducation; Target Area = Anxiety; Audience = Caregiver). Coders demonstrated moderate to almost perfect interrater reliability. On average, practitioners recorded two activities per session, and clients had 10 unique activities across all their sessions. Results from multilevel models showed that clinical activity characteristics and sessions accounted for the most variance in the occurrence, recurrence, and co-occurrence of clinical activities, with relatively less variance accounted for by practitioners, clients, and regional systems. Findings are consistent with patterns of practice reported in other studies and suggest that clinical dashboards may be a useful source of clinical information.
More generally, the use of a controlled vocabulary for clinical activities appears to increase the retrievability and actionability of healthcare information and thus sets the stage for advancing the utility of clinical documentation.
|
2
|
Development of a 3-Step theory of suicide ontology to facilitate 3ST factor extraction from clinical progress notes. J Biomed Inform 2024; 150:104582. [PMID: 38160758 DOI: 10.1016/j.jbi.2023.104582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 11/21/2023] [Accepted: 12/22/2023] [Indexed: 01/03/2024]
Abstract
OBJECTIVE Suicide risk prediction algorithms at the Veterans Health Administration (VHA) do not include predictors based on the 3-Step Theory of suicide (3ST), which builds on hopelessness, psychological pain, connectedness, and capacity for suicide. These four factors are not available from structured fields in VHA electronic health records, but they are found in unstructured clinical text. An ontology and controlled vocabulary that map psychosocial and behavioral terms to these factors do not exist. The objectives of this study were 1) to develop an ontology with a controlled vocabulary of terms that map onto classes that represent the 3ST factors as identified within electronic clinical progress notes, and 2) to determine the accuracy of automated extractions based on terms in the controlled vocabulary. METHODS A team of four annotators performed linguistic annotation of 30,000 clinical progress notes from 231 Veterans in VHA electronic health records who attempted suicide or who died by suicide for terms relating to the 3ST factors. Annotation involved manually assigning a label to words or phrases that indicated presence or absence of the factor (polarity). These words and phrases were entered into a controlled vocabulary that was then used by our computational system to tag 14 million clinical progress notes from Veterans who attempted or died by suicide after 2013. Tagged text was extracted and machine-labelled for presence or absence of the 3ST factors. Accuracy of these machine labels was determined for 1000 randomly selected extractions for each factor against a ground truth created by our annotators. RESULTS Linguistic annotation identified 8486 terms that related to 33 subclasses across the four factors and polarities. Precision of machine-labelled extractions ranged from 0.73 to 1.00 for most factor-polarity combinations, whereas recall was somewhat lower (0.65-0.91).
CONCLUSION The ontology that was developed consists of classes that represent each of the four 3ST factors, subclasses, relationships, and terms that map onto those classes which are stored in a controlled vocabulary (https://bioportal.bioontology.org/ontologies/THREE-ST). The use case that we present shows how scores based on clinical notes tagged for terms in the controlled vocabulary capture meaningful change in the 3ST factors during weeks preceding a suicidal event.
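The tagging and evaluation steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' computational system: the vocabulary entries, the function names `tag_note` and `precision_recall`, and the sample note are all hypothetical and chosen only to show how terms mapped to a factor and a polarity might be matched in text and then scored against annotator labels.

```python
import re

# Hypothetical mini-vocabulary: term -> (3ST factor, polarity).
# These entries are illustrative only and are not taken from the published ontology.
VOCABULARY = {
    "feels hopeless": ("hopelessness", "present"),
    "hopeful about the future": ("hopelessness", "absent"),
    "lives alone": ("connectedness", "absent"),
    "supportive family": ("connectedness", "present"),
}


def tag_note(text):
    """Return (term, factor, polarity) for each vocabulary term found in the text."""
    lowered = text.lower()
    hits = []
    for term, (factor, polarity) in VOCABULARY.items():
        # Word boundaries avoid matching a term inside a longer word.
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            hits.append((term, factor, polarity))
    return hits


def precision_recall(predicted, truth):
    """Precision and recall of a set of machine labels against a ground-truth set."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical note text, invented for this example.
note = "Veteran feels hopeless but reports a supportive family."
tags = tag_note(note)
```

Plain string matching of this kind ignores negation and context, which is one plausible reason recall and precision would vary across factor-polarity combinations in a real system.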
|
3
|
Machine translation of standardised medical terminology using natural language processing: A scoping review. N Biotechnol 2023; 77:120-129. [PMID: 37652265 DOI: 10.1016/j.nbt.2023.08.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Revised: 08/01/2023] [Accepted: 08/28/2023] [Indexed: 09/02/2023]
Abstract
Standardised medical terminologies are used to ensure accurate and consistent communication of information and to facilitate data exchange. Currently, many terminologies are only available in English, which hinders international research and automated processing of medical data. Natural language processing (NLP) and Machine Translation (MT) methods can be used to automatically translate these terms. This scoping review examines the research on automated translation of standardised medical terminology. A search was performed in PubMed and Web of Science and results were screened for eligibility by title and abstract as well as full text screening. In addition to bibliographic data, the following data items were considered: 'terminology considered', 'terms considered', 'source language', 'target language', 'translation type', 'NLP technique', 'NLP system', 'machine translation system', 'data source' and 'translation quality'. The results showed that the most frequently translated terminology is SNOMED CT (39.1%), followed by MeSH (13%), ICD (13%) and UMLS (8.7%). The most common source language is English (55.9%), and the most common target language is German (41.2%). Translation methods are often based on Statistical Machine Translation (SMT) (41.7%) and, more recently, Neural Machine Translation (NMT) (30.6%), but can also be combined with various MT methods. Commercial translators such as Google Translate (36.4%) and automatic validation methods such as BLEU (22.2%) are frequently used tools for translation and subsequent validation.
|
4
|
Use of the clinical care classification in South Korean nursing practice: Challenges and opportunities. Int J Med Inform 2023; 170:104968. [PMID: 36603388 DOI: 10.1016/j.ijmedinf.2022.104968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2022] [Revised: 06/29/2022] [Accepted: 12/13/2022] [Indexed: 12/29/2022]
Abstract
BACKGROUND AND OBJECTIVES A government-driven standardization of nursing terminology including the Clinical Care Classification (CCC) was endorsed in South Korea in 2015, but the number of hospitals that have adopted this standard terminology remains unknown. This study aimed to determine the CCC awareness, adoption, and utilization statuses and their association with patient experience in South Korea. DESIGN, SETTING, AND PARTICIPANTS A nationwide telephone survey was conducted from January 13 to February 12, 2022 among 217 tertiary and secondary hospitals participating in the health information exchange network. The survey questionnaire included 22 items in 3 categories: current status of electronic nursing records, awareness and adoption of standard terminology, and open-ended questions regarding standard usage and dissemination. General characteristics and experience scores of the patients of the surveyed hospitals were collected from publicly available data sources. Data analysis was performed using descriptive statistics, t-test, and generalized linear regression. MAIN OUTCOMES AND MEASURES The rates of awareness and adoption of the CCC nursing terminology standard among hospitals were calculated, and the current status of electronic nursing records used in practice was examined. The relationships between CCC awareness, hospital characteristics, and patient experience of health services were also identified. RESULTS The survey response rate was 24.9% (54/217). Two out of three hospitals (68.5%) were aware of the CCC. These hospitals had 800 beds or more and higher scores for patient experience. CCC awareness was significantly related to increases in the overall scores for patient experience (t = 2.70, p = .0103), but not to the sub-score for nursing service (t = 1.23, p = .1594).
CONCLUSIONS With a high adoption rate of electronic medical record systems, two-thirds of hospitals reported awareness of the CCC but still lagged in adopting and using it in practice amid operational challenges. CCC awareness is potentially related to positive patient experience.
|
5
|
Patient safety classification, taxonomy and ontology systems: A systematic review on development and evaluation methodologies. J Biomed Inform 2022; 133:104150. [PMID: 35878822 DOI: 10.1016/j.jbi.2022.104150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2021] [Revised: 06/11/2022] [Accepted: 07/19/2022] [Indexed: 11/24/2022]
Abstract
INTRODUCTION Patient safety classifications/ontologies enable patient safety information systems to receive and analyze patient safety data to improve patient safety. Patient safety classifications/ontologies have been developed and evaluated using a variety of methods. The purpose of this review was to discuss and analyze the methodologies for developing and evaluating patient safety classifications/ontologies. METHODS Studies that developed or evaluated patient safety classifications, terminologies, taxonomies, or ontologies were searched through Google Scholar, Google search engines, National Center for Biomedical Ontology (NCBO) BioPortal, Open Biological and Biomedical Ontology (OBO) Foundry and World Health Organization (WHO) websites and Scopus, Web of Science, PubMed, and Science Direct. We updated our search on 30 February 2021 and included all studies published until the end of 2020. Studies that developed or evaluated classifications only for patient safety and provided information on how they were developed or evaluated were included. Systems that cover patient safety terms (such as ICD-10) but were not specifically developed for patient safety were excluded. The quality and the risk of bias of studies were not assessed because all methodologies and criteria were intended to be covered. In addition, we analyzed the data through descriptive narrative synthesis and compared and classified the development and evaluation methods and evaluation criteria according to available development and evaluation approaches for biomedical ontologies. RESULTS We identified 84 articles that met all of the inclusion criteria, resulting in 70 classifications/ontologies, nine of which were for the general medical domain. The largest numbers of papers were published in 2010 and 2011 (8 and 7 papers, respectively). The United States (50) and Australia (23) accounted for the most studies.
The most commonly used methods for developing classifications/ontologies included the use of existing systems (for expanding or mapping) (44) and qualitative analysis of event reports (39). The most common evaluation methods were coding or classifying some safety report samples (25), quantitative analysis of incidents based on the developed classification (24), and consensus among physicians (16). The most commonly applied evaluation criteria were reliability (27), content and face validity (9), comprehensiveness (6), usability (5), linguistic clarity (5), and impact (4). CONCLUSIONS Because each development and evaluation method has its own strengths and weaknesses, it is advised that more than one method for development or evaluation, as well as more than one evaluation criterion, should be used. To organize the processes of developing classifications/ontologies, well-established approaches such as Methontology are recommended. The most prevalent evaluation methods applied in this domain are well fitted to the biomedical ontology evaluation methods, but it is also advised to apply logic-based, rule-based, and natural language processing (NLP)-based evaluation approaches in combination with other evaluation approaches. This research can assist domain researchers in developing or evaluating domain ontologies using more complete methodologies. There is also a lack of reporting consistency in the literature, and the same methods or criteria were reported using different terminology.
|
6
|
Development and Implementation of a Standard Format for Clinical Laboratory Test Results. Am J Clin Pathol 2022; 158:409-415. [PMID: 35713605 DOI: 10.1093/ajcp/aqac067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2022] [Accepted: 05/04/2022] [Indexed: 11/13/2022] Open
Abstract
OBJECTIVES Surprisingly, laboratory results, the principal output of clinical laboratories, are not standardized. Thus, laboratories frequently report results with identical meaning in different formats. For example, laboratories report a positive pregnancy test as "+," "P," or "Positive." To assess the feasibility of a widespread implementation of a result standard, we (1) developed a standard result format for common laboratory tests and (2) implemented a feedback system for clinical laboratories to view their unstandardized results. METHODS In the largest integrated health care system in America, 130 facilities had the opportunity to collaboratively develop the standard. For 15 weeks, clinical laboratories received a weekly report of their unstandardized results. At the study's conclusion, laboratories were compared with themselves and their peers by metrics that reflected their unstandardized results. RESULTS We rereviewed 156 million test results and observed a 51% decline in the rate of unstandardized results. The number of facilities with fewer than 23 unstandardized results per 100,000 (Six Sigma σ > 5) increased by 58% (52 to 82 facilities; β = 1.79; P < .001). CONCLUSIONS This study demonstrated significant improvement in the standardization of clinical laboratory results in a relatively short time. The laboratory community should create and promulgate a standardized result format.
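The threshold quoted in this abstract (fewer than 23 unstandardized results per 100,000, described as Six Sigma σ > 5) can be cross-checked with a short calculation. The sketch below assumes the conventional Six Sigma 1.5-sigma shift; the abstract does not state which convention was used, and the function name `sigma_level` is illustrative.

```python
from statistics import NormalDist


def sigma_level(defects, opportunities, shift=1.5):
    """Convert a defect rate to a Six Sigma level.

    Assumes the conventional 1.5-sigma long-term shift; this is an
    assumption of the sketch, not something stated in the abstract.
    """
    yield_fraction = 1 - defects / opportunities
    # z-score of the defect-free fraction under a standard normal curve.
    return NormalDist().inv_cdf(yield_fraction) + shift


# 23 unstandardized results per 100,000, the cut-off quoted in the abstract.
level = sigma_level(23, 100_000)
```

Under that convention, 23 per 100,000 (230 per million) lands just above 5 sigma, which agrees with the abstract's description of facilities below this rate as exceeding σ = 5.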
|
7
|
The ECOTOXicology Knowledgebase: A Curated Database of Ecologically Relevant Toxicity Tests to Support Environmental Research and Risk Assessment. Environmental Toxicology and Chemistry 2022; 41:1520-1539. [PMID: 35262228 PMCID: PMC9408435 DOI: 10.1002/etc.5324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2021] [Revised: 10/25/2021] [Accepted: 02/28/2022] [Indexed: 05/19/2023]
Abstract
The need for assembled existing and new toxicity data has accelerated as the amount of chemicals introduced into commerce continues to grow and regulatory mandates require safety assessments for a greater number of chemicals. To address this evolving need, the ECOTOXicology Knowledgebase (ECOTOX) was developed starting in the 1980s and is currently the world's largest compilation of curated ecotoxicity data, providing support for assessments of chemical safety and ecological research through systematic and transparent literature review procedures. The recently released version of ECOTOX (Ver 5, www.epa.gov/ecotox) provides single-chemical ecotoxicity data for over 12,000 chemicals and ecological species with over one million test results from over 50,000 references. Presented is an overview of ECOTOX, detailing the literature review and data curation processes within the context of current systematic review practices and discussing how recent updates improve the accessibility and reusability of data to support the assessment, management, and research of environmental chemicals. Relevant and acceptable toxicity results are identified from studies in the scientific literature, with pertinent methodological details and results extracted following well-established controlled vocabularies and newly extracted toxicity data added quarterly to the public website. Release of ECOTOX, Ver 5, included an entirely redesigned user interface with enhanced data queries and retrieval options, visualizations to aid in data exploration, customizable outputs for export and use in external applications, and interoperability with chemical and toxicity databases and tools. This is a reliable source of curated ecological toxicity data for chemical assessments and research and continues to evolve with accessible and transparent state-of-the-art practices in literature data curation and increased interoperability to other relevant resources. Environ Toxicol Chem 2022;41:1520-1539. 
© 2022 SETAC. This article has been contributed to by US Government employees and their work is in the public domain in the USA.
|
8
|
Trends in… Controlled Vocabulary and Health Equity. Med Ref Serv Q 2022; 41:185-201. [PMID: 35511428 DOI: 10.1080/02763869.2022.2060638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Medical librarians collaborate with physicians and other healthcare professionals to improve the quality and accessibility of medical information, which includes assembling the best evidence to advance health equality through teaching and research. This column brings together brief cases highlighting the experiences and perspectives of medical librarians, educators, and healthcare professionals using their organizational, pedagogical, and information-analysis skills to advance health equality indexing.
|
9
|
Abstract
Molecular interaction databases aim to systematically capture and organize the experimental interaction information described in the scientific literature. These data can then be used to perform network analysis, to assign putative roles to uncharacterized proteins and to investigate their involvement in cellular pathways. This chapter gives a brief overview of publicly available molecular interaction databases and focuses on the members of the IMEx Consortium, on their curation policies and standard data formats. All of the goals achieved by IMEx databases over the last 15 years, the data types provided and the many different ways in which such data can be utilized by the research community, are described in detail. The IMEx databases curate molecular interaction data to the highest caliber, following a detailed curation model and supplying rich metadata by employing common curation rules and harmonized standards. The IMEx Consortium provides comprehensively annotated molecular interaction data integrated into a single, non-redundant, open access dataset.
|
10
|
Biomedical Ontologies to Guide AI Development in Radiology. J Digit Imaging 2021; 34:1331-1341. [PMID: 34724143 PMCID: PMC8669056 DOI: 10.1007/s10278-021-00527-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/29/2021] [Revised: 04/27/2021] [Accepted: 10/13/2021] [Indexed: 10/25/2022] Open
Abstract
The advent of deep learning has engendered renewed and rapidly growing interest in artificial intelligence (AI) in radiology to analyze images, manipulate textual reports, and plan interventions. Applications of deep learning and other AI approaches must be guided by sound medical knowledge to assure that they are developed successfully and that they address important problems in biomedical research or patient care. To date, AI has been applied to a limited number of real-world radiology applications. As AI systems become more pervasive and are applied more broadly, they will benefit from medical knowledge on a larger scale, such as that available through computer-based approaches. A key approach to represent computer-based knowledge in a particular domain is an ontology. As defined in informatics, an ontology defines a domain's terms through their relationships with other terms in the ontology. Those relationships, then, define the terms' semantics, or "meaning." Biomedical ontologies commonly define the relationships between terms and more general terms, and can express causal, part-whole, and anatomic relationships. Ontologies express knowledge in a form that is both human-readable and machine-computable. Some ontologies, such as RSNA's RadLex radiology lexicon, have been applied to applications in clinical practice and research, and may be familiar to many radiologists. This article describes how ontologies can support research and guide emerging applications of AI in radiology, including natural language processing, image-based machine learning, radiomics, and planning.
|
11
|
Map-Assisted Generation of Procedure and Intervention Encoding (Magpie): An Innovative Approach for ICD-10-PCS Coding. Stud Health Technol Inform 2019; 264:428-432. [PMID: 31437959 DOI: 10.3233/shti190257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
ICD-10-PCS coding is challenging because of the large number of codes, non-intuitive terms and paucity of the ICD-10-PCS index. We previously repurposed the richer ICD-9-CM procedure index for ICD-10-PCS coding. We have developed the MAGPIE tool based on the repurposed ICD-9-CM index with other lexical and mapping resources. MAGPIE helps the user to identify SNOMED CT and ICD-10-PCS codes for medical procedures. MAGPIE uses three innovative search approaches: cascading search (SNOMED CT to ICD-9-CM to ICD-10-PCS), hybrid lexical and map-assisted matching, and semantic filtering of ICD-10-PCS codes. Our evaluation showed that MAGPIE found the correct SNOMED CT code and ICD-10-PCS table in 70% and 85% of cases respectively, without any user intervention. MAGPIE is available online from the NLM website: magpie.nlm.nih.gov.
|
12
|
Building and Querying RDF/OWL Database of Semantically Annotated Nuclear Medicine Images. J Digit Imaging 2018; 30:4-10. [PMID: 27785632 DOI: 10.1007/s10278-016-9916-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 12/21/2022] Open
Abstract
As the use of positron emission tomography-computed tomography (PET-CT) has increased rapidly, there is a need to retrieve relevant medical images that can assist image interpretation. However, the images themselves lack the explicit information needed for query. We constructed a semantically structured database of nuclear medicine images using the Annotation and Image Markup (AIM) format and evaluated the ability of the AIM annotations to improve image search. We created AIM annotation templates specific to the nuclear medicine domain and used them to annotate 100 nuclear medicine PET-CT studies in AIM format using controlled vocabulary. We evaluated image retrieval from 20 specific clinical queries. As the gold standard, two nuclear medicine physicians manually retrieved the relevant images from the image database using free text search of radiology reports for the same queries. We compared query results with the manually retrieved results obtained by the physicians. The query performance indicated a 98% recall for simple queries and an 89% recall for complex queries. In total, the queries provided 95% (75 of 79 images) recall, 100% precision, and an F1 score of 0.97 for the 20 clinical queries. Three of the four images missed by the queries required reasoning for successful retrieval. Nuclear medicine images augmented using semantic annotations in AIM enabled high recall and precision for simple queries, helping physicians to retrieve the relevant images. Further study using a larger data set and the implementation of an inference engine may improve query results for more complex queries.
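The overall retrieval figures reported in this abstract can be cross-checked with the standard definitions of recall, precision, and F1. The sketch below assumes that 100% precision means no irrelevant images were retrieved, so the 75 relevant images found are also the only images retrieved; the variable names are illustrative.

```python
# Counts taken from the abstract: 75 of 79 relevant images were retrieved.
relevant_total = 79
true_positives = 75
# Assumption: 100% precision implies zero false positives.
false_positives = 0

recall = true_positives / relevant_total
precision = true_positives / (true_positives + false_positives)
# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"recall={recall:.2f} precision={precision:.2f} f1={f1:.2f}")
```

Rounded to two decimals, these values match the abstract's 95% recall, 100% precision, and F1 of 0.97.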
|
13
|
Representing and organizing information to describe the lived experience of health from a personal factors perspective in the light of the International Classification of Functioning, Disability and Health (ICF): a discussion paper. Disabil Rehabil 2018; 41:1727-1738. [PMID: 29509044 DOI: 10.1080/09638288.2018.1445302] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 10/17/2022]
Abstract
PURPOSE To discuss the representation and organization of information describing persons' lived experience of health from a personal factors perspective in the light of the International Classification of Functioning, Disability and Health, using spinal cord injury as a case in point for disability. METHODS The scientific literature was reviewed, discussion rounds conducted, and qualitative secondary analyses of data carried out using an iterative inductive-deductive approach. RESULTS Conceptual considerations are explicated that distinguish the personal factors perspective from other components of the International Classification of Functioning, Disability and Health. A representation structure is developed that organizes health-related concepts describing the internal context of functioning. Concepts are organized as individual facts, subjective experiences, and recurrent patterns of experience and behavior specifying 7 areas and 211 concept groups. CONCLUSIONS The article calls for further scientific debate on the perspective of personal factors in the light of the International Classification of Functioning, Disability and Health. A structure that organizes concepts in relation to a personal factors perspective can enhance the comprehensiveness, transparency and standardization of health information, and contribute to the empowerment of persons with disabilities. Implications for rehabilitation The present study collected data from scientific literature reviews, discussion rounds and qualitative secondary analyses in order to develop a representation and organization of information describing persons' lived experience of health from a personal factors perspective in the light of the International Classification of Functioning, Disability and Health. 
The following representation structure for health-related information from a personal factors perspective was developed: (i) individual facts (i.e., socio-demographical factors, position in the immediate social and physical context, personal history and biography), (ii) subjective experience (i.e., feelings, thoughts and beliefs, motives), and (iii) recurrent patterns of experience (i.e., feelings, thoughts and beliefs) and behavior. With this study, we aim to stimulate further scientific discussion about the personal factors component in the International Classification of Functioning, Disability and Health, including its application and subsequent validation for potential implementation into clinical practice.
|
14
|
Harmonising phenomics information for a better interoperability in the rare disease field. Eur J Med Genet 2018; 61:706-714. [PMID: 29425702 DOI: 10.1016/j.ejmg.2018.01.013] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 08/01/2017] [Revised: 11/30/2017] [Accepted: 01/27/2018] [Indexed: 01/30/2023]
Abstract
HIPBI-RD (Harmonising phenomics information for a better interoperability in the rare disease field) is a three-year project which started in 2016 funded via the E-Rare 3 ERA-NET program. This project builds on three resources largely adopted by the rare disease (RD) community: Orphanet, its ontology ORDO (the Orphanet Rare Disease Ontology), HPO (the Human Phenotype Ontology) as well as PhenoTips software for the capture and sharing of structured phenotypic data for RD patients. Our project is further supported by resources developed by the European Bioinformatics Institute and the Garvan Institute. HIPBI-RD aims to provide the community with an integrated, RD-specific bioinformatics ecosystem that will harmonise the way phenomics information is stored in databases and patient files worldwide, and thereby contribute to interoperability. This ecosystem will consist of a suite of tools and ontologies, optimized to work together, and made available through commonly used software repositories. The project workplan follows three main objectives: The HIPBI-RD ecosystem will contribute to the interpretation of variants identified through exome and full genome sequencing by harmonising the way phenotypic information is collected, thus improving diagnostics and delineation of RD. The ultimate goal of HIPBI-RD is to provide a resource that will contribute to bridging genome-scale biology and a disease-centered view on human pathobiology. Achievements in Year 1.
|
15
|
Abstract
Molecular interaction databases collect, organize, and enable the analysis of the increasing amounts of molecular interaction data being produced and published as we move towards a more complete understanding of the interactomes of key model organisms. The organization of these data in a structured format supports analyses such as the modeling of pairwise relationships between interactors into interaction networks and is a powerful tool for understanding the complex molecular machinery of the cell. This chapter gives an overview of the principal molecular interaction databases, in particular the IMEx databases, and their curation policies, use of standardized data formats and quality control rules. Special attention is given to the MIntAct project, in which IntAct and MINT joined forces to create a single resource to improve curation and software development efforts. This is exemplified as a model for the future of molecular interaction data collation and dissemination.
|
16
|
Querying EHRs with a Semantic and Entity-Oriented Query Language. Stud Health Technol Inform 2017; 235:121-125. [PMID: 28423767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
While the digitization of medical documents has greatly expanded during the past decade, health information retrieval has become a major challenge in addressing many issues in medical research. Information retrieval in electronic health records (EHRs) should also reduce the burden of manually retrieving information from paper or computer records. The aim of this article was to present the features of a semantic search engine implemented in EHRs. A flexible, scalable and entity-oriented query language tool is proposed. The program is designed to retrieve and visualize data and can support any Conceptual Data Model. The search engine deals with structured and unstructured data, both for a single patient from a caregiver's perspective and for a group of patients (e.g. epidemiology). Several types of queries were tested on a test database containing the EHRs of 2,000 anonymized patients (i.e. approximately 200,000 records). These queries were able to accurately handle symbolic, textual, numerical and chronological data.
|
17
|
Analysis of multi-dimensional contemporaneous EHR data to refine delirium assessments. Comput Biol Med 2016; 75:267-74. [PMID: 27340924 DOI: 10.1016/j.compbiomed.2016.06.013] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 04/27/2016] [Revised: 06/10/2016] [Accepted: 06/13/2016] [Indexed: 12/16/2022]
Abstract
Delirium is a potentially lethal condition of altered mental status, attention, and level of consciousness with an acute onset and fluctuating course. Its causes are multi-factorial, and its pathophysiology is not well understood; therefore clinical focus has been on prevention strategies and early detection. One patient evaluation technique in routine use is the Confusion Assessment Method (CAM): a relatively simple test resulting in 'positive', 'negative' or 'unable-to-assess' (UTA) ratings. Hartford Hospital nursing staff use the CAM regularly on all non-critical care units, and a high frequency of UTA was observed after reviewing several years of records. In addition, patients with UTA ratings displayed poor outcomes such as in-hospital mortality, longer lengths of stay, and discharge to acute and long term care facilities. We sought to better understand the use of UTA, especially outside of critical care environments, in order to improve delirium detection throughout the hospital. An unsupervised clustering approach was used with additional, concurrent assessment data available in the EHR to categorize patient visits with UTA CAMs. The results yielded insights into the most common situations in which the UTA rating was used (e.g. impaired verbal communication, dementia), suggesting potentially inappropriate ratings that could be refined with further evaluation and remedied with updated clinical training. Analysis of the patient clusters also suggested that unrecognized delirium may contribute to the poor outcomes associated with the use of UTA. This method of using temporally related high dimensional EHR data to illuminate a dynamic medical condition could have wider applicability.
|
18
|
A software tool for the input and management of phenotypic data using personal digital assistants and other mobile devices. PLANT METHODS 2015; 11:25. [PMID: 25866550 PMCID: PMC4393613 DOI: 10.1186/s13007-015-0069-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/12/2015] [Accepted: 03/19/2015] [Indexed: 05/29/2023]
Abstract
BACKGROUND Plant breeding and genetics demand fast, exact and reproducible phenotyping. Efficient statistical evaluation of phenotyping data requires standardised data storage ensuring long-term data availability while maintaining intellectual property rights. This is state of the art at phenomics centres, which, however, are unavailable for most scientists. For them we developed a simple and cost-efficient system, the Phenotyper, which employs mobile devices or personal digital assistants (PDA) for on-site data entry and open-source software for data management. RESULTS A graphical user interface (GUI) on a PDA replaces paper-based form sheets and data entry on a desktop. The user can define his phenotyping schemes in a web tool without in-depth knowledge of the system and thus adjust it more easily to new research aspects than in a classical laboratory information management system (LIMS). In the Phenotyper, schemes are built from controlled vocabulary gained from published ontologies. Vocabulary and schemes are stored in a database that also manages user access. From the web page, schemes are downloaded as extensible markup language (XML) files for transfer to the PDA and exchange between users. On the PDA, the GUI displays the schemes and stores data in comma-separated value format and XML format. After manual quality control, data are uploaded via a web page to an independently hosted results database, in which data are stored in an entity-attribute-value structure to provide maximum flexibility. Datasets are linked to the original and curated data files stored on a file server. The ownership stamp, project affiliation and date stamp of a dataset are used to regulate data access, which is restricted to data belonging to the user or to his projects and to data for which the embargo period has ended. Exporting standardised ASCII reports to a long-term data storage facility allows raw data to be searched, cited and used beyond the lifetime of the database. The Phenotyper is available to the scientific community for use and further development. CONCLUSIONS The Phenotyper provides a well-structured, but flexible data acquisition and management structure for mobile on-site measurements for efficient evaluation and shared use of data.
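The entity-attribute-value structure mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the Phenotyper's actual schema: the table name `observation`, the helper functions, and the example trait names are all hypothetical, and an in-memory SQLite database stands in for the results database. The point is only that new traits can be recorded without altering the table layout.

```python
import sqlite3


def make_eav_store():
    """Create an in-memory store with one row per (entity, attribute) pair.

    The schema is a hypothetical illustration, not the Phenotyper's own.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observation ("
        " entity TEXT NOT NULL,"
        " attribute TEXT NOT NULL,"
        " value TEXT NOT NULL,"
        " PRIMARY KEY (entity, attribute))"
    )
    return conn


def record(conn, entity, attribute, value):
    """Store one measurement; values are kept as text, as is typical for EAV."""
    conn.execute(
        "INSERT OR REPLACE INTO observation VALUES (?, ?, ?)",
        (entity, attribute, str(value)),
    )


def as_record(conn, entity):
    """Pivot all attribute rows of one entity back into a dictionary."""
    cursor = conn.execute(
        "SELECT attribute, value FROM observation"
        " WHERE entity = ? ORDER BY attribute",
        (entity,),
    )
    return dict(cursor.fetchall())


if __name__ == "__main__":
    store = make_eav_store()
    record(store, "plant-1", "height_cm", 42)
    record(store, "plant-1", "leaf_count", 7)
    print(as_record(store, "plant-1"))
```

The flexibility comes at a cost: every value is stored as text and a full record has to be reassembled at query time, which is the usual trade-off of this layout.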
|
19
|
Unsupervised mining of frequent tags for clinical eligibility text indexing. J Biomed Inform 2013; 46:1145-51. [PMID: 24036004 DOI: 10.1016/j.jbi.2013.08.012] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 02/27/2013] [Revised: 08/24/2013] [Accepted: 08/29/2013] [Indexed: 10/26/2022]
Abstract
Clinical text, such as clinical trial eligibility criteria, is largely underused in state-of-the-art medical search engines due to difficulties of accurate parsing. This paper proposes a novel methodology to derive a semantic index for clinical eligibility documents based on a controlled vocabulary of frequent tags, which are automatically mined from the text. We applied this method to eligibility criteria on ClinicalTrials.gov and report that frequent tags (1) define an effective and efficient index of clinical trials and (2) are unlikely to grow radically when the repository increases. We proposed to apply the semantic index to filter clinical trial search results and we concluded that frequent tags reduce the result space more efficiently than an uncontrolled set of UMLS concepts. Overall, unsupervised mining of frequent tags from clinical text leads to an effective semantic index for the clinical eligibility documents and promotes their computational reuse.
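The general idea of indexing documents by frequently occurring tags can be sketched in a few lines of Python. This is a simplified illustration and not the method of the paper: it assumes that a "tag" is any word or two-word phrase, that "frequent" means appearing in at least `min_df` documents, and it skips the stop-word filtering and UMLS mapping a real system would need. The trial identifiers and function names are hypothetical.

```python
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Return the set of lowercase word n-grams (n = 1..n_max) in a text."""
    tokens = text.lower().split()
    found = set()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            found.add(" ".join(tokens[i:i + n]))
    return found


def mine_frequent_tags(docs, min_df):
    """Keep n-grams that occur in at least min_df documents (assumed criterion)."""
    document_frequency = Counter()
    for text in docs.values():
        document_frequency.update(ngrams(text))
    return {tag for tag, count in document_frequency.items() if count >= min_df}


def build_index(docs, tags):
    """Map each frequent tag to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for tag in ngrams(text) & tags:
            index[tag].add(doc_id)
    return index


if __name__ == "__main__":
    criteria = {
        "T1": "type 2 diabetes",
        "T2": "history of type 2 diabetes",
        "T3": "pregnant women",
    }
    tags = mine_frequent_tags(criteria, min_df=2)
    index = build_index(criteria, tags)
    print(sorted(index["type 2"]))
```

Filtering search results then reduces to a set lookup on the chosen tag, which is why a small controlled tag vocabulary can narrow the result space quickly.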
|
20
|
Analysis of eligibility criteria representation in industry-standard clinical trial protocols. J Biomed Inform 2013; 46:805-13. [PMID: 23770150 DOI: 10.1016/j.jbi.2013.06.001] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 12/05/2012] [Revised: 05/30/2013] [Accepted: 06/03/2013] [Indexed: 10/26/2022]
Abstract
Previous research on standardization of eligibility criteria and its feasibility has traditionally been conducted on clinical trial protocols from ClinicalTrials.gov (CT). The portability and use of such standardization for full-text industry-standard protocols has not been studied in depth. Towards this end, in this study we first compare the representation characteristics and textual complexity of a set of Pfizer's internal full-text protocols to their corresponding entries in CT. Next, we identify clusters of similar criteria sentences from both full-text and CT protocols and outline methods for standardized representation of eligibility criteria. We also study the distribution of eligibility criteria in full-text and CT protocols with respect to pre-defined semantic classes used for eligibility criteria classification. We find that in comparison to full-text protocols, CT protocols are not only more condensed but also convey less information. We also find no correlation between the variations in word counts of the CT and full-text protocols. While we identify 65 and 103 clusters of inclusion and exclusion criteria from full-text protocols, our methods found only 36 and 63 corresponding clusters from CT protocols. For both the full-text and CT protocols we are able to identify 'templates' for standardized representations, with full-text standardization being the more challenging of the two. In our exploration of the semantic class distributions we find that the majority of the inclusion criteria from both full-text and CT protocols belong to the semantic class "Diagnostic and Lab Results" while "Disease, Sign or Symptom" forms the majority for exclusion criteria. Overall, we show that developing a template set of eligibility criteria for clinical trials, specifically in their full-text form, is feasible and could lead to more efficient clinical trial protocol design.
|
21
|
Implementation of a platform dedicated to the biomedical analysis terminologies management. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2011; 2011:1418-1427. [PMID: 22195205 PMCID: PMC3243140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
BACKGROUND AND OBJECTIVES Assistance Publique - Hôpitaux de Paris (AP-HP) is implementing a new laboratory management system (LMS) common to the 12 hospital groups. The first step in this process was to acquire a biological analysis dictionary. This dictionary is interfaced with the international nomenclature LOINC, and has been developed in collaboration with experts from all biological disciplines. In this paper we describe in three steps (modeling, data migration and integration/verification) the implementation of a platform for publishing and maintaining the AP-HP laboratory data dictionary (AnaBio). MATERIAL AND METHODS Due to data complexity and volume, setting up a platform dedicated to terminology management was a key requirement. This is an enhancement tackling identified weaknesses of the previous spreadsheet tool. Our core model allows interoperability regarding data exchange standards and dictionary evolution. RESULTS We completed our goals within one year. In addition, structuring data representation has led to a significant data quality improvement (impacting more than 10% of data). The platform is active in the institution's 21 hospitals, across 165 laboratories.
|