Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles

18
(from Reference Citation Analysis)

Article PDFs (9)

Cited by > 0 (14)

Searched Name

Knowledge bases

Newton AJH, Chartash D, Kleinstein SH, McDougal RA. A pipeline for the retrieval and extraction of domain-specific information with application to COVID-19 immune signatures. BMC Bioinformatics 2023;24:292. [PMID: 37474900 PMCID: PMC10357743 DOI: 10.1186/s12859-023-05397-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 06/23/2023] [Indexed: 07/22/2023] Open

Abstract

BACKGROUND

The accelerating pace of biomedical publication has made it impractical to manually, systematically identify papers containing specific information and extract this information. This is especially challenging when the information itself resides beyond titles or abstracts. For emerging science, with a limited set of known papers of interest and an incomplete information model, this is of pressing concern. A timely example in retrospect is the identification of immune signatures (coherent sets of biomarkers) driving differential SARS-CoV-2 infection outcomes.

IMPLEMENTATION

We built a classifier to identify papers containing domain-specific information from the document embeddings of the title and abstract. To train this classifier with limited data, we developed an iterative process leveraging pre-trained SPECTER document embeddings, SVM classifiers and web-enabled expert review to iteratively augment the training set. This training set was then used to create a classifier to identify papers containing domain-specific information. Finally, information was extracted from these papers through a semi-automated system that directly solicited the paper authors to respond via a web-based form.

RESULTS

We demonstrate a classifier that retrieves papers with human COVID-19 immune signatures with a positive predictive value of 86%. The type of immune signature (e.g., gene expression vs. other types of profiling) was also identified with a positive predictive value of 74%. Semi-automated queries to the corresponding authors of these publications requesting signature information achieved a 31% response rate.
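As a reading aid, positive predictive value (PPV) is the fraction of papers flagged by the classifier that truly contain the target information. The sketch below only illustrates the definition; the counts are hypothetical, chosen to reproduce the 86% figure, and are not taken from the paper.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = TP / (TP + FP): share of retrieved papers that are truly relevant."""
    retrieved = true_positives + false_positives
    if retrieved == 0:
        raise ValueError("no papers were retrieved")
    return true_positives / retrieved


# Hypothetical counts (not from the paper): 100 papers flagged, 86 confirmed relevant.
ppv = positive_predictive_value(true_positives=86, false_positives=14)
print(f"PPV = {ppv:.2f}")
```

PPV says nothing about relevant papers the classifier missed; that would require recall, which the abstract does not report.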

CONCLUSIONS

Our results demonstrate the efficacy of using an SVM classifier with document embeddings of the title and abstract to retrieve papers with domain-specific information, even when that information is rarely present in the abstract. Targeted author engagement based on classifier predictions offers a promising pathway to build a semi-structured representation of such information. Through this approach, partially automated literature mining can help rapidly create semi-structured knowledge repositories for automatic analysis of emerging health threats.

Lai TM, Zhai C, Ji H. KEBLM: Knowledge-Enhanced Biomedical Language Models. J Biomed Inform 2023;143:104392. [PMID: 37211194 DOI: 10.1016/j.jbi.2023.104392] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/14/2023] [Revised: 04/17/2023] [Accepted: 05/12/2023] [Indexed: 05/23/2023]

Abstract

Pretrained language models (PLMs) have demonstrated strong performance on many natural language processing (NLP) tasks. Despite their great success, these PLMs are typically pretrained only on unstructured free texts without leveraging existing structured knowledge bases that are readily available for many domains, especially scientific domains. As a result, these PLMs may not achieve satisfactory performance on knowledge-intensive tasks such as biomedical NLP. Comprehending a complex biomedical document without domain-specific knowledge is challenging, even for humans. Inspired by this observation, we propose a general framework for incorporating various types of domain knowledge from multiple sources into biomedical PLMs. We encode domain knowledge using lightweight adapter modules, bottleneck feed-forward networks that are inserted into different locations of a backbone PLM. For each knowledge source of interest, we pretrain an adapter module to capture the knowledge in a self-supervised way. We design a wide range of self-supervised objectives to accommodate diverse types of knowledge, ranging from entity relations to description sentences. Once a set of pretrained adapters is available, we employ fusion layers to combine the knowledge encoded within these adapters for downstream tasks. Each fusion layer is a parameterized mixer of the available trained adapters that can identify and activate the most useful adapters for a given input. Our method diverges from prior work by including a knowledge consolidation phase, during which we teach the fusion layers to effectively combine knowledge from both the original PLM and newly-acquired external knowledge using a large collection of unannotated texts. After the consolidation phase, the complete knowledge-enhanced model can be fine-tuned for any downstream task of interest to achieve optimal performance. 
Extensive experiments on many biomedical NLP datasets show that our proposed framework consistently improves the performance of the underlying PLMs on various downstream tasks such as natural language inference, question answering, and entity linking. These results demonstrate the benefits of using multiple sources of external knowledge to enhance PLMs and the effectiveness of the framework for incorporating knowledge into PLMs. While primarily focused on the biomedical domain in this work, our framework is highly adaptable and can be easily applied to other domains, such as the bioenergy sector.
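The adapter-and-fusion idea described in this abstract can be sketched in a few lines. This is a minimal illustration, not the KEBLM implementation: the residual connection, the ReLU nonlinearity, the softmax mixing and all names (`adapter`, `fuse`, the toy weights) are assumptions made for the example.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def adapter(hidden: Vector, down: Matrix, up: Matrix) -> Vector:
    """Bottleneck adapter: project down, apply ReLU, project up, add the input back."""
    bottleneck = [max(0.0, z) for z in matvec(down, hidden)]
    return [h + d for h, d in zip(hidden, matvec(up, bottleneck))]


def fuse(outputs: List[Vector], scores: List[float]) -> Vector:
    """Mix adapter outputs with softmax weights derived from per-adapter scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    width = len(outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, outputs)) for i in range(width)]


# Toy example: a 2-dimensional hidden state and two adapters with a 1-unit bottleneck.
hidden = [1.0, -2.0]
out_a = adapter(hidden, down=[[1.0, 0.0]], up=[[0.5], [0.5]])
out_b = adapter(hidden, down=[[0.0, 1.0]], up=[[0.5], [0.5]])
fused = fuse([out_a, out_b], scores=[0.0, 0.0])  # equal scores give equal weights
print(out_a, out_b, fused)
```

In a trained model the scores would be computed from the input, so that the fusion layer can favor the adapters most useful for that input; here they are fixed for simplicity.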

Pereira A, Almeida JR, Lopes RP, Oliveira JL. Querying semantic catalogues of biomedical databases. J Biomed Inform 2023;137:104272. [PMID: 36563828 DOI: 10.1016/j.jbi.2022.104272] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/10/2022] [Revised: 11/03/2022] [Accepted: 12/12/2022] [Indexed: 12/24/2022]

Abstract

BACKGROUND

Secondary use of health data is a valuable source of knowledge that boosts observational studies, leading to important discoveries in the medical and biomedical sciences. The fundamental guiding principle for performing a successful observational study is the research question and the approach in advance of executing a study. However, in multi-centre studies, finding suitable datasets to support the study is challenging, time-consuming, and sometimes impossible without a deep understanding of each dataset.

METHODS

We propose a strategy for retrieving biomedical datasets of interest that were semantically annotated, using an interface built by applying a methodology for transforming natural language questions into formal language queries. The advantages of creating biomedical semantic data are enhanced by using natural language interfaces to issue complex queries without manipulating a logical query language.

RESULTS

Our methodology was validated using Alzheimer's disease datasets published in a European platform for sharing and reusing biomedical data. We converted data to semantic information format using biomedical ontologies in everyday use in the biomedical community and published it as a FAIR endpoint. We have considered natural language questions of three types: single-concept questions, questions with exclusion criteria, and multi-concept questions. Finally, we analysed the performance of the question-answering module we used and its limitations. The source code is publicly available at https://bioinformatics-ua.github.io/BioKBQA/.

CONCLUSION

We propose a strategy for using information extracted from biomedical data and transformed into a semantic format using open biomedical ontologies. Our method uses natural language to formulate questions to be answered by this semantic data without the direct use of formal query languages.

Denton N, Mulberg AE, Molloy M, Charleston S, Fajgenbaum DC, Marsh ED, Howard P. Sharing is caring: a call for a new era of rare disease research and development. Orphanet J Rare Dis 2022;17:389. [PMID: 36303170 PMCID: PMC9612604 DOI: 10.1186/s13023-022-02529-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/16/2022] [Revised: 08/05/2022] [Accepted: 10/02/2022] [Indexed: 01/25/2023] Open

Abstract

Scientific advances in the understanding of the genetics and mechanisms of many rare diseases with previously unknown etiologies are inspiring optimism in the patient, clinical, and research communities and there is hope that disease-specific treatments are on the way. However, the rare disease community has reached a critical point in which its increasingly fragmented structure and operating models are threatening its ability to harness the full potential of advancing genomic and computational technologies. Changes are therefore needed to overcome these issues plaguing many rare diseases while also supporting economically viable therapy development. In "Data silos are undermining drug development and failing rare disease patients (Orphanet Journal of Rare Disease, Apr 2021)," we outlined many of the broad issues underpinning the increasingly fragmented and siloed nature of the rare disease space, as well as how the issues encountered by this community are representative of biomedical research more generally. Here, we propose several initiatives for key stakeholders - including regulators, private and public foundations, and research institutions - to reorient the rare disease ecosystem and its incentives in a way that we believe would cultivate and accelerate innovation. Specifically, we propose supporting non-proprietary patient registries, greater data standardization, global regulatory harmonization, and new business models that encourage data sharing and research collaboration as the default mode. Leadership needs to be integrated across sectors to drive meaningful change between patients, industry, sponsors, and academic medical centers. To transform the research and development landscape and unlock its vast healthcare, economic, and scientific potential for rare disease patients, a new model is ultimately the goal for all.

Tute E, Ganapathy N, Wulff A. A data driven learning approach for the assessment of data quality. BMC Med Inform Decis Mak 2021;21:302. [PMID: 34724930 PMCID: PMC8561935 DOI: 10.1186/s12911-021-01656-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/08/2021] [Accepted: 10/14/2021] [Indexed: 11/16/2022] Open

Abstract

Background

Data quality assessment is important but complex and task dependent. Identifying suitable measurement methods and reference ranges for assessing their results is challenging. Manually inspecting the measurement results and current data driven approaches for learning which results indicate data quality issues have considerable limitations, e.g. in identifying task dependent thresholds for measurement results that indicate data quality issues.

Objectives

To explore the applicability and potential benefits of a data driven approach to learn task dependent knowledge about suitable measurement methods and assessment of their results. Such knowledge could be useful for others to determine whether a local data stock is suitable for a given task.

Methods

We started by creating artificial data with previously defined data quality issues and applied a set of generic measurement methods to this data (e.g. a method to count the number of values in a certain variable or to compute the mean of the values). We trained decision trees on exported measurement methods’ results and corresponding outcome data (data that indicated the data’s suitability for a use case). For evaluation, we derived rules for potential measurement methods and reference values from the decision trees and compared these regarding their coverage of the true data quality issues artificially created in the dataset. Three researchers independently derived these rules: one with knowledge about the data quality issues present and two without.
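To make the idea of deriving a reference value from a tree concrete, the sketch below learns a single split (a one-level decision tree) on one measurement result. It is an illustration under stated assumptions, not the authors' pipeline: the function name `learn_threshold`, the rule direction, and the sample data are all hypothetical.

```python
from typing import List, Tuple


def learn_threshold(values: List[float], suitable: List[bool]) -> Tuple[float, int]:
    """Return (threshold, errors) for the rule 'value >= threshold means suitable'."""
    points = sorted(set(values))
    # Candidate thresholds are the midpoints between neighbouring distinct values.
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    if not candidates:
        raise ValueError("need at least two distinct measurement values")
    best_threshold, best_errors = candidates[0], len(values) + 1
    for t in candidates:
        # Count datasets the rule would misclassify at this threshold.
        errors = sum((v >= t) != ok for v, ok in zip(values, suitable))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold, best_errors


# Hypothetical measurement results: number of non-missing values per dataset,
# paired with whether each dataset was suitable for the use case.
counts = [120.0, 95.0, 40.0, 30.0, 110.0, 20.0]
usable = [True, True, False, False, True, False]
threshold, errors = learn_threshold(counts, usable)
print(f"rule: count >= {threshold} -> suitable ({errors} misclassified)")
```

A full decision tree repeats this split search recursively over several measurement methods, which is how one tree can yield both the choice of method and its reference value.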

Results

Our self-trained decision trees were able to indicate rules for 12 of 19 previously defined data quality issues. Learned knowledge about measurement methods and their assessment was complementary to manual interpretation of measurement methods’ results.

Conclusions

Our data driven approach derives sensible knowledge for task dependent data quality assessment and complements other current approaches. Based on labeled measurement methods’ results as training data, our approach successfully suggested applicable rules for checking data quality characteristics that determine whether a dataset is suitable for a given task.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12911-021-01656-x.

Steiner B, Saalfeld B, Elgert L, Haux R, Wolf KH. OnTARi: an ontology for factors influencing therapy adherence to rehabilitation. BMC Med Inform Decis Mak 2021;21:153. [PMID: 33975585 PMCID: PMC8111729 DOI: 10.1186/s12911-021-01512-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2021] [Accepted: 04/28/2021] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Adherence and motivation are key factors for successful treatment of patients with chronic diseases, especially in long-term care processes like rehabilitation. However, only a few patients achieve good treatment adherence. The causes are manifold. Adherence-influencing factors vary depending on indications, therapies, and individuals. Positive and negative effects are rarely confirmed or even contradictory. An ontology seems to be convenient to represent existing knowledge in this domain and to make it available for information retrieval.

METHODS

First, a manual data extraction of current knowledge in the domain of treatment adherence in rehabilitation was conducted. Data was retrieved from various sources, including basic literature, scientific publications, and health behavior models. Second, all adherence and motivation factors identified were formalized according to the ontology development methodology METHONTOLOGY. This comprises the specification, conceptualization, formalization, and implementation of the ontology "Ontology for factors influencing therapy adherence to rehabilitation" (OnTARi) in Protégé. A taxonomy-oriented evaluation was conducted by two domain experts.

RESULTS

OnTARi includes 281 classes implemented in ontology web language, ten object properties, 22 data properties, 1440 logical axioms, 244 individuals, and 1023 annotations. Six higher-level classes are differentiated: (1) Adherence, (2) AdherenceFactors, (3) AdherenceFactorCategory, (4) Rehabilitation, (5) RehabilitationForm, and (6) RehabilitationType. By means of the class AdherenceFactors 227 adherence factors, thereof 49 hard factors, are represented. Each factor involves a proper description, synonyms, possibly existing acronyms, and a German translation. OnTARi illustrates links between adherence factors through 160 influences-relations. Description logic queries implemented in Protégé allow multiple targeted requests, e.g., for the extraction of adherence factors in a specific rehabilitation area.

CONCLUSIONS

With OnTARi, a generic reference model was built to represent potential adherence and motivation factors and their interrelations in rehabilitation of patients with chronic diseases. In terms of information retrieval, this formalization can serve as a basis for implementation and adaptation of conventional rehabilitative measures, taking into account (patient-specific) adherence factors. OnTARi also enables the development of medical assistance systems to increase motivation and adherence in rehabilitation processes.

Denton N, Molloy M, Charleston S, Lipset C, Hirsch J, Mulberg AE, Howard P, Marsh ED. Data silos are undermining drug development and failing rare disease patients. Orphanet J Rare Dis 2021;16:161. [PMID: 33827602 PMCID: PMC8025897 DOI: 10.1186/s13023-021-01806-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 02/05/2021] [Accepted: 03/30/2021] [Indexed: 11/10/2022] Open

Abstract

Data silos are proliferating while research and development activity explodes following genetic and immunological advances for many clinically described disorders with previously unknown etiologies. The latter event has inspired optimism in the patient, clinical, and research communities that disease-specific treatments are on the way. However, we fear the tendency of various stakeholders to balkanize databases in proprietary formats, driven by current economic and academic incentives, will inevitably fragment the expanding knowledge base and undermine current and future research efforts to develop much-needed treatments. The proliferation of proprietary databases, compounded by a paucity of meaningful outcome measures and/or good natural history data, slows our ability to generate scalable solutions to benefit chronically underserved patient populations in ways that would translate to more common diseases. The current research and development landscape sets too many projects up for unnecessary failure, particularly in the rare disease sphere, and does a grave disservice to highly vulnerable patients. This system also encourages the collection of redundant data in uncoordinated parallel studies and registries to ultimately delay or deny potential treatments for ostensibly tractable diseases; it also promotes the waste of precious time, energy, and resources. Groups at the National Institutes of Health and Food and Drug Administration have started programs to address these issues. However, we and many others feel there should be significantly more discussion of how to coordinate and scale registry efforts. Such discourse aims to reduce needless complexity and duplication of efforts, as well as promote a pre-competitive knowledge ecosystem for rare disease drug development that cultivates and accelerates innovation.

Denton N, Molloy M, Charleston S, Lipset C, Hirsch J, Mulberg AE, Howard P, Marsh ED. Data silos are undermining drug development and failing rare disease patients. Orphanet J Rare Dis 2021. [PMID: 33827602 DOI: 10.1186/s13023-021-01806-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 04/30/2023] Open

Tute E, Scheffner I, Marschollek M. A method for interoperable knowledge-based data quality assessment. BMC Med Inform Decis Mak 2021;21:93. [PMID: 33750371 PMCID: PMC7942002 DOI: 10.1186/s12911-021-01458-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/28/2020] [Accepted: 02/26/2021] [Indexed: 11/10/2022] Open

Gerdesköld C, Toth-Pal E, Wårdh I, Nilsson GH, Nager A. Use of online knowledge base in primary health care and correlation to health care quality: an observational study. BMC Med Inform Decis Mak 2020;20:294. [PMID: 33198720 PMCID: PMC7670813 DOI: 10.1186/s12911-020-01313-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/09/2020] [Accepted: 10/30/2020] [Indexed: 11/10/2022] Open

Nydal R, Bennett G, Kuiper M, Lægreid A. Silencing trust: confidence and familiarity in re-engineering knowledge infrastructures. Med Health Care Philos 2020;23:471-484. [PMID: 32468194 PMCID: PMC7426298 DOI: 10.1007/s11019-020-09957-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/03/2023]

Silsand L, Severinsen GH, Pedersen R, Ellingsen G. Preconditions for Enabling Advanced Patient-Centered Decision Support on a National Knowledge Information Infrastructure. Stud Health Technol Inform 2019;264:1773-1774. [PMID: 31438337 DOI: 10.3233/shti190641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]

Zhu R, Han S, Su Y, Zhang C, Yu Q, Duan Z. The application of big data and the development of nursing science: A discussion paper. Int J Nurs Sci 2019;6:229-234. [PMID: 31406897 DOI: 10.1016/j.ijnss.2019.03.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 10/15/2018] [Revised: 01/20/2019] [Accepted: 03/04/2019] [Indexed: 11/23/2022] Open

Lenert MC, Walsh CG, Miller RA. Discovering hidden knowledge through auditing clinical diagnostic knowledge bases. J Biomed Inform 2018;84:75-81. [PMID: 29940263 DOI: 10.1016/j.jbi.2018.06.014] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/09/2018] [Revised: 06/19/2018] [Accepted: 06/21/2018] [Indexed: 11/21/2022]

Maiella S, Olry A, Hanauer M, Lanneau V, Lourghi H, Donadille B, Rodwell C, Köhler S, Seelow D, Jupp S, Parkinson H, Groza T, Brudno M, Robinson PN, Rath A. Harmonising phenomics information for a better interoperability in the rare disease field. Eur J Med Genet 2018;61:706-714. [PMID: 29425702 DOI: 10.1016/j.ejmg.2018.01.013] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 08/01/2017] [Revised: 11/30/2017] [Accepted: 01/27/2018] [Indexed: 01/30/2023]

Banos O, Bilal Amin M, Ali Khan W, Afzal M, Hussain M, Kang BH, Lee S. The Mining Minds digital health and wellness framework. Biomed Eng Online 2016;15 Suppl 1:76. [PMID: 27454608 PMCID: PMC4959395 DOI: 10.1186/s12938-016-0179-9] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Indexed: 11/17/2022] Open

Abstract

Background

The provision of health and wellness care is undergoing an enormous transformation. A key element of this revolution consists of prioritizing prevention and proactivity based on the analysis of people’s conduct and the empowerment of individuals in their self-management. Digital technologies are unquestionably destined to be the main engine of this change, with an increasing number of domain-specific applications and devices commercialized every year; however, there is an apparent lack of frameworks capable of orchestrating and intelligently leveraging all the data, information and knowledge generated through these systems.

Methods

This work presents Mining Minds, a novel framework that builds on the core ideas of the digital health and wellness paradigms to enable the provision of personalized support. Mining Minds embraces some of the most prominent digital technologies, ranging from Big Data and Cloud Computing to Wearables and Internet of Things, as well as modern concepts and methods, such as context-awareness, knowledge bases or analytics, to holistically and continuously investigate people’s lifestyles and provide a variety of smart coaching and support services.

Results

This paper comprehensively describes the efficient and rational combination and interoperation of these technologies and methods through Mining Minds, while meeting the essential requirements posed by a framework for personalized health and wellness support. Moreover, this work presents a realization of the key architectural components of Mining Minds, as well as various exemplary user applications and expert tools to illustrate some of the potential services supported by the proposed framework.

Conclusions

Mining Minds constitutes an innovative holistic means to inspect human behavior and provide personalized health and wellness support. The principles behind this framework uncover new research ideas and may serve as a reference for similar initiatives.

Llano MT, Colton S, Hepworth R, Gow J. Automated Fictional Ideation via Knowledge Base Manipulation. Cognit Comput 2016;8:153-174. [PMID: 27110296 PMCID: PMC4826667 DOI: 10.1007/s12559-015-9366-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/11/2014] [Accepted: 11/07/2015] [Indexed: 11/24/2022]

McCoy AB, Wright A, Rogith D, Fathiamini S, Ottenbacher AJ, Sittig DF. Development of a clinician reputation metric to identify appropriate problem-medication pairs in a crowdsourced knowledge base. J Biomed Inform 2013;48:66-72. [PMID: 24321170 DOI: 10.1016/j.jbi.2013.11.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/05/2013] [Revised: 10/23/2013] [Accepted: 11/29/2013] [Indexed: 02/08/2023]

Abstract

BACKGROUND

Correlation of data within electronic health records is necessary for implementation of various clinical decision support functions, including patient summarization. A key type of correlation is linking medications to clinical problems; while some databases of problem-medication links are available, they are not robust and depend on problems and medications being encoded in particular terminologies. Crowdsourcing represents one approach to generating robust knowledge bases across a variety of terminologies, but more sophisticated approaches are necessary to improve accuracy and reduce manual data review requirements.

OBJECTIVE

We sought to develop and evaluate a clinician reputation metric to facilitate the identification of appropriate problem-medication pairs through crowdsourcing without requiring extensive manual review.

APPROACH

We retrieved medications from our clinical data warehouse that had been prescribed and manually linked to one or more problems by clinicians during e-prescribing between June 1, 2010 and May 31, 2011. We identified measures likely to be associated with the percentage of accurate problem-medication links made by clinicians. Using logistic regression, we created a metric for identifying clinicians who had made greater than or equal to 95% appropriate links. We evaluated the accuracy of the approach by comparing links made by those physicians identified as having appropriate links to a previously manually validated subset of problem-medication pairs.

RESULTS

Of 867 clinicians who asserted a total of 237,748 problem-medication links during the study period, 125 had a reputation metric that predicted the percentage of appropriate links greater than or equal to 95%. These clinicians asserted a total of 2464 linked problem-medication pairs (983 distinct pairs). Compared to a previously validated set of problem-medication pairs, the reputation metric achieved a specificity of 99.5% and marginally improved the sensitivity of previously described knowledge bases.

CONCLUSION

A reputation metric may be a valuable measure for identifying high quality clinician-entered, crowdsourced data.
