1
Berman L, Ostchega Y, Giannini J, Anandan LP, Clark E, Spotnitz M, Sulieman L, Volynski M, Ramirez A. Application of a Data Quality Framework to Ductal Carcinoma In Situ Using Electronic Health Record Data From the All of Us Research Program. JCO Clin Cancer Inform 2024;8:e2400052. [PMID: 39178364] [DOI: 10.1200/cci.24.00052]
Abstract
PURPOSE The specific aims of this paper are to (1) develop and operationalize an electronic health record (EHR) data quality framework, (2) apply the dimensions of the framework to the phenotype and treatment pathways of ductal carcinoma in situ (DCIS) using All of Us Research Program data, and (3) propose and apply a checklist to evaluate the application of the framework. METHODS We developed a framework of five data quality dimensions (DQD; completeness, concordance, conformance, plausibility, and temporality). Participants signed a consent and Health Insurance Portability and Accountability Act authorization to share EHR data and responded to demographic questions in the Basics questionnaire. We evaluated the internal characteristics of the data and compared data with external benchmarks with descriptive and inferential statistics. We developed a DQD checklist to evaluate concept selection, internal verification, and external validity for each DQD. The Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) concept ID codes for DCIS were used to select a cohort of 2,209 females 18 years and older. RESULTS Using the proposed DQD checklist criteria, (1) concepts were selected and internally verified for conformance; (2) concepts were selected and internally verified for completeness; (3) concepts were selected, internally verified, and externally validated for concordance; (4) concepts were selected, internally verified, and externally validated for plausibility; and (5) concepts were selected, internally verified, and externally validated for temporality. CONCLUSION This assessment and evaluation provided insights into data quality for the DCIS phenotype using EHR data from the All of Us Research Program. The review demonstrates that salient clinical measures can be selected, applied, and operationalized within a conceptual framework and evaluated for fitness for use by applying a proposed checklist.
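As a rough illustration of the kind of concept-driven cohort selection this abstract describes, the following Python sketch filters OMOP CDM-style condition records against a concept-ID set and applies the sex and age criteria. The concept ID, record layouts, and reference date are illustrative assumptions, not the study's actual code or values.

```python
# Hypothetical sketch: selecting a cohort from OMOP CDM-style records
# using a set of condition concept IDs. The concept ID used in the
# example and the record layouts are illustrative only.
from datetime import date

def select_cohort(condition_rows, person_rows, concept_ids, as_of=date(2024, 1, 1)):
    """Return person_ids of females aged 18+ with a matching condition record."""
    persons = {p["person_id"]: p for p in person_rows}
    cohort = set()
    for row in condition_rows:
        if row["condition_concept_id"] not in concept_ids:
            continue
        person = persons.get(row["person_id"])
        if person is None or person["gender"] != "F":
            continue
        if as_of.year - person["year_of_birth"] >= 18:
            cohort.add(person["person_id"])
    return cohort
```

In a real OMOP deployment this filter would be a SQL query over `condition_occurrence` and `person`; the in-memory version above only mirrors the selection logic.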
Affiliation(s)
- Lew Berman
- National Institutes of Health, All of Us Research Program, Bethesda, MD
- Yechiam Ostchega
- National Institutes of Health, All of Us Research Program, Bethesda, MD
- John Giannini
- National Institutes of Health, All of Us Research Program, Bethesda, MD
- Matthew Spotnitz
- National Institutes of Health, All of Us Research Program, Bethesda, MD
- Andrea Ramirez
- National Institutes of Health, All of Us Research Program, Bethesda, MD
2
Peng Y, Bathelt F, Gebler R, Gött R, Heidenreich A, Henke E, Kadioglu D, Lorenz S, Vengadeswaran A, Sedlmayr M. Use of Metadata-Driven Approaches for Data Harmonization in the Medical Domain: Scoping Review. JMIR Med Inform 2024;12:e52967. [PMID: 38354027] [PMCID: PMC10902772] [DOI: 10.2196/52967]
Abstract
BACKGROUND Multisite clinical studies increasingly use real-world data to generate real-world evidence. However, because source data are heterogeneous, it is difficult to analyze such data in a unified way across clinics. Implementing Extract-Transform-Load (ETL) or Extract-Load-Transform (ELT) processes to harmonize local health data is therefore necessary to guarantee data quality for research. However, developing such processes is time-consuming and unsustainable. A promising way to ease this is the generalization of ETL/ELT processes. OBJECTIVE In this work, we investigate existing possibilities for developing generic ETL/ELT processes. In particular, we focus on approaches with low development complexity that use descriptive and structural metadata. METHODS We conducted a literature review following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. We searched 4 publication databases (PubMed, IEEE Xplore, Web of Science, and BioMed Central) for relevant publications from 2012 to 2022. The PRISMA flow was visualized using an R-based tool (Evidence Synthesis Hackathon). All relevant content of the publications was extracted into a spreadsheet for further analysis and visualization. RESULTS Following the PRISMA guidelines, we included 33 publications in this literature review. The included publications were categorized into 7 focus groups (medicine, data warehouse, big data, industry, geoinformatics, archaeology, and military). Based on the extracted data, ontology-based and rule-based approaches were the 2 most used approaches across thematic categories. Different approaches and tools were chosen to achieve different purposes within the use cases. CONCLUSIONS Our literature review shows that metadata-driven (MDD) approaches to developing an ETL/ELT process can serve different purposes across thematic categories. The results suggest that it is promising to implement an ETL/ELT process by applying an MDD approach to automate the data transformation from Fast Healthcare Interoperability Resources to the Observational Medical Outcomes Partnership Common Data Model. However, determining an appropriate MDD approach and tool to implement such a process remains a challenge, because comprehensive insight into the characteristics of the MDD approaches presented in this study is still lacking. Our next step is therefore to evaluate these MDD approaches, determine the most appropriate ones, and establish how to integrate them into the ETL/ELT process. This could verify the ability of MDD approaches to generalize the ETL process for harmonizing medical data.
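The core idea of a metadata-driven transformation is that the field mapping lives in declarative metadata rather than in ETL code. A minimal Python sketch of that idea, with invented FHIR-like and OMOP-like field names (not taken from any of the reviewed tools), might look like this:

```python
# Illustrative metadata-driven mapping: the correspondence between
# source paths and target columns is data, so extending the ETL means
# editing the mapping, not the code. All names here are assumptions.

MAPPING = {
    # target OMOP-style column : path into a FHIR-style source dict
    "measurement_source_value": ("code", "text"),
    "value_as_number": ("valueQuantity", "value"),
    "unit_source_value": ("valueQuantity", "unit"),
}

def get_path(record, path):
    """Walk a nested dict along path; return None if any key is missing."""
    cur = record
    for key in path:
        if not isinstance(cur, dict) or key not in cur:
            return None
        cur = cur[key]
    return cur

def transform(record, mapping=MAPPING):
    """Produce a flat target row driven entirely by the mapping metadata."""
    return {target: get_path(record, path) for target, path in mapping.items()}
```

Real implementations surveyed in the review use ontologies or rule engines for this step; the dictionary above stands in for that metadata layer.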
Affiliation(s)
- Yuan Peng
- Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Richard Gebler
- Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Robert Gött
- Core Unit Datenintegrationszentrum, University Medicine Greifswald, Greifswald, Germany
- Andreas Heidenreich
- Department for Information and Communication Technology (DICT), Data Integration Center (DIC), Goethe University Frankfurt, University Hospital, Frankfurt am Main, Germany
- Elisa Henke
- Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Dennis Kadioglu
- Department for Information and Communication Technology (DICT), Data Integration Center (DIC), Goethe University Frankfurt, University Hospital, Frankfurt am Main, Germany
- Institute for Medical Informatics, Goethe University Frankfurt, University Hospital Frankfurt, Frankfurt am Main, Germany
- Stephan Lorenz
- Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Abishaa Vengadeswaran
- Institute for Medical Informatics, Goethe University Frankfurt, University Hospital Frankfurt, Frankfurt am Main, Germany
- Martin Sedlmayr
- Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
3
Zhang S, Benis N, Cornet R. Automated approach for quality assessment of RDF resources. BMC Med Inform Decis Mak 2023;23:90. [PMID: 37165363] [PMCID: PMC10170671] [DOI: 10.1186/s12911-023-02182-8]
Abstract
INTRODUCTION The Semantic Web community provides a common Resource Description Framework (RDF) that allows resources to be represented such that they can be linked. To maximize the potential of linked data - machine-actionable interlinked resources on the Web - a certain level of quality of RDF resources should be established, particularly in the biomedical domain, in which concepts are complex and high-quality biomedical ontologies are in high demand. However, it is unclear which quality metrics for RDF resources can be automated, which is required given the multitude of RDF resources. Therefore, we aim to determine these metrics and demonstrate an automated approach to assessing them. METHODS An initial set of metrics was identified through literature, standards, and existing tooling. From these, metrics were selected that fulfil three criteria: (1) objective, (2) automatable, and (3) foundational. Selected metrics were represented in RDF, semantically aligned to existing standards, and implemented in an open-source tool. To demonstrate the tool, eight commonly used RDF resources were assessed, including data models in the healthcare domain (HL7 RIM, HL7 FHIR, CDISC CDASH), ontologies (DCT, SIO, FOAF, ORDO), and a metadata profile (GRDDL). RESULTS Six objective metrics were identified in 3 categories: Resolvability (1), Parsability (1), and Consistency (4), and represented in RDF. The tool demonstrates that these metrics can be automated; application in the healthcare domain revealed non-resolvable URIs (ranging from 0.3% to 97%) among all eight resources and undefined URIs in HL7 RIM and FHIR. In the tested resources, no errors were found for parsability or for the other three consistency metrics covering correct usage of classes and properties. CONCLUSION We extracted six objective and automatable metrics from the literature as foundational quality requirements of RDF resources to maximize the potential of linked data. Automated tooling has proven effective at identifying quality issues that must be avoided. This approach can be expanded to incorporate more automatable metrics, reflecting additional quality dimensions in the assessment tool.
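Two of the metric categories named above, parsability and resolvability, can be sketched compactly. The sketch below is not the paper's tool: it checks a deliberately simplified N-Triples pattern, and it takes pre-fetched HTTP statuses as input instead of issuing requests (a real resolvability check would dereference each URI over HTTP).

```python
# Toy versions of two RDF quality metrics. The N-Triples pattern is
# simplified (no literals with language tags or datatypes, no blank
# nodes), and resolvability consumes pre-fetched HTTP statuses.
import re

NTRIPLE = re.compile(r'^<[^>]+> <[^>]+> (?:<[^>]+>|"[^"]*") \.$')

def parsability(lines):
    """Fraction of lines matching the simplified N-Triples pattern."""
    ok = sum(1 for line in lines if NTRIPLE.match(line.strip()))
    return ok / len(lines)

def resolvability(uri_status):
    """Fraction of URIs whose (pre-fetched) HTTP status is below 400."""
    ok = sum(1 for status in uri_status.values() if status < 400)
    return ok / len(uri_status)
```

A production implementation would use a full RDF parser (e.g., an RDF library's N-Triples reader) rather than a regular expression, which is why this is only a shape of the metric, not a conformant check.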
Affiliation(s)
- Shuxin Zhang
- Department of Medical Informatics, Amsterdam UMC location University of Amsterdam, Meibergdreef 9, Amsterdam, The Netherlands
- Amsterdam Public Health, Methodology & Digital Health, Amsterdam, The Netherlands
- Nirupama Benis
- Department of Medical Informatics, Amsterdam UMC location University of Amsterdam, Meibergdreef 9, Amsterdam, The Netherlands
- Amsterdam Public Health, Methodology & Digital Health, Amsterdam, The Netherlands
- Ronald Cornet
- Department of Medical Informatics, Amsterdam UMC location University of Amsterdam, Meibergdreef 9, Amsterdam, The Netherlands
- Amsterdam Public Health, Methodology & Digital Health, Amsterdam, The Netherlands
4
Touré V, Krauss P, Gnodtke K, Buchhorn J, Unni D, Horki P, Raisaro JL, Kalt K, Teixeira D, Crameri K, Österle S. FAIRification of health-related data using semantic web technologies in the Swiss Personalized Health Network. Sci Data 2023;10:127. [PMID: 36899064] [PMCID: PMC10006404] [DOI: 10.1038/s41597-023-02028-y]
Abstract
The Swiss Personalized Health Network (SPHN) is a government-funded initiative developing federated infrastructures for responsible and efficient secondary use of health data for research purposes, in compliance with the FAIR principles (Findable, Accessible, Interoperable and Reusable). We built a common standard infrastructure with a fit-for-purpose strategy to bring together health-related data, easing the work of data providers, who can supply data in a standard manner, and of researchers, who benefit from the enhanced quality of the collected data. As a result, the SPHN Resource Description Framework (RDF) schema was implemented together with a data ecosystem that encompasses data integration, validation tools, analysis helpers, training, and documentation for representing health metadata and data consistently and reaching nationwide data interoperability goals. Data providers can now efficiently deliver several types of health data in a standardized and interoperable way, while a high degree of flexibility is retained for the varied demands of individual research projects. Researchers in Switzerland have access to FAIR health data for further use in RDF triplestores.
Affiliation(s)
- Vasundra Touré
- Personalized Health Informatics Group, SIB Swiss Institute of Bioinformatics, 4051, Basel, Switzerland
- Philip Krauss
- Trivadis - Part of Accenture, 4051, Basel, Switzerland
- Kristin Gnodtke
- Personalized Health Informatics Group, SIB Swiss Institute of Bioinformatics, 4051, Basel, Switzerland
- Deepak Unni
- Personalized Health Informatics Group, SIB Swiss Institute of Bioinformatics, 4051, Basel, Switzerland
- Petar Horki
- Personalized Health Informatics Group, SIB Swiss Institute of Bioinformatics, 4051, Basel, Switzerland
- Jean Louis Raisaro
- Health Informatics and Data Privacy Group, Biomedical Data Science Center, Lausanne University Hospital, 1010, Lausanne, Switzerland
- Katie Kalt
- Clinical Data Platform Research, Directorate of Research and Education, Zurich University Hospital, 8091, Zurich, Switzerland
- Daniel Teixeira
- DSI - Data Group, Geneva University Hospital, 1205, Geneva, Switzerland
- Katrin Crameri
- Personalized Health Informatics Group, SIB Swiss Institute of Bioinformatics, 4051, Basel, Switzerland
- Sabine Österle
- Personalized Health Informatics Group, SIB Swiss Institute of Bioinformatics, 4051, Basel, Switzerland
5
Frid S, Pastor Duran X, Bracons Cucó G, Pedrera-Jiménez M, Serrano-Balazote P, Muñoz Carrero A, Lozano-Rubí R. An Ontology-Based Approach for Consolidating Patient Data Standardized With European Norm/International Organization for Standardization 13606 (EN/ISO 13606) Into Joint Observational Medical Outcomes Partnership (OMOP) Repositories: Description of a Methodology. JMIR Med Inform 2023;11:e44547. [PMID: 36884279] [PMCID: PMC10034609] [DOI: 10.2196/44547]
Abstract
BACKGROUND To discover new knowledge from data, the data must be correct and in a consistent format. OntoCR, a clinical repository developed at Hospital Clínic de Barcelona, uses ontologies to represent clinical knowledge and map locally defined variables to health information standards and common data models. OBJECTIVE The aim of the study is to design and implement a scalable methodology based on the dual-model paradigm and the use of ontologies to consolidate clinical data from different organizations in a standardized repository for research purposes without loss of meaning. METHODS First, the relevant clinical variables are defined, and the corresponding European Norm/International Organization for Standardization (EN/ISO) 13606 archetypes are created. Data sources are then identified, and an extract, transform, and load process is carried out. Once the final data set is obtained, the data are transformed to create EN/ISO 13606-normalized electronic health record (EHR) extracts. Afterward, ontologies that represent archetyped concepts and map them to EN/ISO 13606 and Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) standards are created and uploaded to OntoCR. Data stored in the extracts are inserted into their corresponding place in the ontology, thus obtaining instantiated patient data in the ontology-based repository. Finally, data can be extracted via SPARQL queries as OMOP CDM-compliant tables. RESULTS Using this methodology, EN/ISO 13606-standardized archetypes that allow for the reuse of clinical information were created, and the knowledge representation of our clinical repository was extended by modeling and mapping ontologies. Furthermore, EN/ISO 13606-compliant EHR extracts were created for patients (6,803), episodes (13,938), diagnoses (190,878), administered medication (222,225), cumulative drug dose (222,225), prescribed medication (351,247), movements between units (47,817), clinical observations (6,736,745), laboratory observations (3,392,873), limitation of life-sustaining treatment (1,298), and procedures (19,861). Because the application that inserts data from extracts into the ontologies is not yet finished, the queries were tested and the methodology validated by importing data from a random subset of patients into the ontologies using a locally developed Protégé plugin ("OntoLoad"). In total, 10 OMOP CDM-compliant tables ("Condition_occurrence," 864 records; "Death," 110; "Device_exposure," 56; "Drug_exposure," 5609; "Measurement," 2091; "Observation," 195; "Observation_period," 897; "Person," 922; "Visit_detail," 772; and "Visit_occurrence," 971) were successfully created and populated. CONCLUSIONS This study proposes a methodology for standardizing clinical data, thus allowing its reuse without any changes in the meaning of the modeled concepts. Although this paper focuses on health research, our methodology suggests that the data be initially standardized per EN/ISO 13606 to obtain EHR extracts with a high level of granularity that can be used for any purpose. Ontologies constitute a valuable approach for knowledge representation and standardization of health information in a standard-agnostic manner. With the proposed methodology, institutions can go from local raw data to standardized, semantically interoperable EN/ISO 13606 and OMOP repositories.
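The final step the abstract describes, emitting OMOP CDM-compliant tables from standardized records, can be sketched as a routing of typed records into per-table row lists. The record shapes and the type-to-table routing below are illustrative assumptions; in the paper this step is performed by SPARQL queries over the ontology, not by Python.

```python
# Hedged sketch: grouping standardized clinical records into
# OMOP CDM-style tables. The routing map and row shapes are invented
# for illustration and do not reproduce the paper's SPARQL transformers.

OMOP_TABLE_FOR = {
    "diagnosis": "condition_occurrence",
    "lab": "measurement",
    "drug": "drug_exposure",
}

def to_omop_tables(records):
    """Route records into {table_name: [rows]} by their record type."""
    tables = {}
    for rec in records:
        table = OMOP_TABLE_FOR.get(rec["type"])
        if table is None:
            continue  # record types without a mapping are skipped
        tables.setdefault(table, []).append(
            {"person_id": rec["person_id"], "source_value": rec["value"]}
        )
    return tables
```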
Affiliation(s)
- Santiago Frid
- Medical Informatics Unit, Hospital Clínic de Barcelona, Barcelona, Spain
- Clinical Foundations Department, Universitat de Barcelona, Barcelona, Spain
- Xavier Pastor Duran
- Medical Informatics Unit, Hospital Clínic de Barcelona, Barcelona, Spain
- Clinical Foundations Department, Universitat de Barcelona, Barcelona, Spain
- Adolfo Muñoz Carrero
- Unit of Investigation in Telemedicine and Digital Health, Instituto de Salud Carlos III, Madrid, Spain
- Raimundo Lozano-Rubí
- Medical Informatics Unit, Hospital Clínic de Barcelona, Barcelona, Spain
- Clinical Foundations Department, Universitat de Barcelona, Barcelona, Spain
6
Khnaisser C, Lavoie L, Fraikin B, Barton A, Dussault S, Burgun A, Ethier JF. Using an Ontology to Derive a Sharable and Interoperable Relational Data Model for Heterogeneous Healthcare Data and Various Applications. Methods Inf Med 2022;61:e73-e88. [PMID: 35709746] [PMCID: PMC9788910] [DOI: 10.1055/a-1877-9498]
Abstract
BACKGROUND A large volume of heavily fragmented data is generated daily in different healthcare contexts and is stored using various structures with different semantics. This fragmentation and heterogeneity make secondary use of data a challenge. Data integration approaches that derive a common data model from sources or requirements have some advantages. However, these approaches are often built for a specific application where the research questions are known, so the semantic and structural reconciliation is often neither reusable nor reproducible. A recent integration approach using knowledge models has been developed with ontologies that provide a strong semantic foundation. Nonetheless, deriving a data model that captures the richness of the ontology to store data with their full semantics remains a challenging task. OBJECTIVES This article addresses the following question: How can a sharable and interoperable data model be designed for storing heterogeneous healthcare data and their semantics to support various applications? METHOD This article describes a method using an ontological knowledge model to automatically generate a data model for a domain of interest. The model can then be implemented in a relational database, which efficiently enables the collection, storage, and retrieval of data while keeping semantic ontological annotations, so that the same data can be extracted for various applications for further processing. RESULTS This article (1) presents a comparison of existing methods for generating a relational data model from an ontology using 23 criteria, (2) describes standard conversion rules, and (3) presents OntoRela, a prototype developed to demonstrate the conversion rules. CONCLUSION This work is a first step toward automating and refining the generation of sharable and interoperable relational data models using ontologies, with a freely available tool. The remaining challenges in covering the full richness of the ontology in the relational model are pointed out.
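A common conversion rule in this family of methods, though not necessarily OntoRela's exact rule set, maps each ontology class to a table and each datatype property to a column. A minimal sketch, with invented naming conventions:

```python
# Illustrative ontology-to-relational conversion rule: one class becomes
# one table, each datatype property a column. The surrogate-key and
# TEXT-typing conventions are assumptions, not OntoRela's actual rules.

def class_to_ddl(cls_name, datatype_props):
    """Emit a CREATE TABLE statement for one ontology class."""
    cols = [f"  {p} TEXT" for p in datatype_props]
    cols.insert(0, f"  {cls_name.lower()}_uid INTEGER PRIMARY KEY")
    return f"CREATE TABLE {cls_name} (\n" + ",\n".join(cols) + "\n);"
```

A full converter would also handle object properties (as foreign keys or junction tables), class hierarchies, and cardinality constraints, which is where the 23 comparison criteria in the article come into play.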
Affiliation(s)
- Christina Khnaisser
- GRIIS, Université de Sherbrooke, Sherbrooke, Canada
- Address for correspondence: Christina Khnaisser, PhD, GRIIS, Université de Sherbrooke, Sherbrooke J1K 2R1, Canada
- Luc Lavoie
- GRIIS, Université de Sherbrooke, Sherbrooke, Canada
- Anita Burgun
- INSERM UMRS 1138 Team 22, Université de Paris, Paris, France
7
Pedrera-Jiménez M, García-Barrio N, Rubio-Mayo P, Tato-Gómez A, Cruz-Bermúdez JL, Bernal-Sobrino JL, Muñoz-Carrero A, Serrano-Balazote P. TransformEHRs: a flexible methodology for building transparent ETL processes for EHR reuse. Methods Inf Med 2022;61:e89-e102. [PMID: 36220109] [PMCID: PMC9788916] [DOI: 10.1055/s-0042-1757763]
Abstract
BACKGROUND During the COVID-19 pandemic, several methodologies were designed for obtaining electronic health record (EHR)-derived datasets for research. These processes are often black boxes: clinical researchers are unaware of how the data were recorded, extracted, and transformed. To solve this, it is essential that extract, transform, and load (ETL) processes are based on transparent, homogeneous, and formal methodologies, making them understandable, reproducible, and auditable. OBJECTIVES This study aims to design and implement a methodology, in accordance with the FAIR principles, for building ETL processes (focused on data extraction, selection, and transformation) for EHR reuse in a transparent and flexible manner, applicable to any clinical condition and health care organization. METHODS The proposed methodology comprises four stages: (1) analysis of secondary use models and identification of data operations, based on internationally used clinical repositories, case report forms, and aggregated datasets; (2) modeling and formalization of data operations, through the paradigm of Detailed Clinical Models; (3) agnostic development of data operations, selecting SQL and R as programming languages; and (4) automation of the ETL instantiation, building a formal configuration file in XML. RESULTS First, four international projects were analyzed to identify the 17 operations necessary to obtain datasets from the EHR according to the specifications of these projects. Each data operation was then formalized using the ISO 13606 reference model, specifying the valid data types as arguments, inputs and outputs, and their cardinality. Next, an agnostic catalog of data operations was developed in the previously selected data-oriented programming languages. Finally, an automated ETL instantiation process was built from a formally defined ETL configuration file. CONCLUSIONS This study provides a transparent and flexible solution to the difficulty of making the processes for obtaining EHR-derived data for secondary use understandable, auditable, and reproducible. Moreover, the abstraction carried out in this study means that any previous EHR reuse methodology can incorporate these results.
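Stage 4 of the methodology, instantiating an ETL run from a formal XML configuration, can be sketched with the standard library's XML parser. The element and attribute names in this configuration are invented for illustration; the paper does not publish its schema in the abstract.

```python
# Sketch of driving an ETL run from a formal XML configuration file.
# The <etl>/<operation> vocabulary below is hypothetical.
import xml.etree.ElementTree as ET

def load_operations(xml_text):
    """Return the ordered list of (operation, target) pairs from the config."""
    root = ET.fromstring(xml_text)
    return [(op.get("name"), op.get("target")) for op in root.iter("operation")]
```

An ETL engine would dispatch each pair to the corresponding SQL or R implementation from the operation catalog; keeping the sequence in the configuration file is what makes the process auditable and reproducible.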
Affiliation(s)
- Miguel Pedrera-Jiménez
- Data Science Unit, Instituto de Investigación Sanitaria Hospital Universitario 12 de Octubre, Madrid, Spain
- ETSI Telecomunicación, Universidad Politécnica de Madrid, Madrid, Spain
- Address for correspondence: Miguel Pedrera-Jiménez, Eng, MSc, Health Informatics Department, Hospital Universitario 12 de Octubre, Av. de Córdoba, s/n, 28041 Madrid, Spain
- Noelia García-Barrio
- Data Science Unit, Instituto de Investigación Sanitaria Hospital Universitario 12 de Octubre, Madrid, Spain
- Paula Rubio-Mayo
- Data Science Unit, Instituto de Investigación Sanitaria Hospital Universitario 12 de Octubre, Madrid, Spain
- Alberto Tato-Gómez
- Data Science Unit, Instituto de Investigación Sanitaria Hospital Universitario 12 de Octubre, Madrid, Spain
- Juan Luis Cruz-Bermúdez
- Data Science Unit, Instituto de Investigación Sanitaria Hospital Universitario 12 de Octubre, Madrid, Spain
- José Luis Bernal-Sobrino
- Data Science Unit, Instituto de Investigación Sanitaria Hospital Universitario 12 de Octubre, Madrid, Spain
- Pablo Serrano-Balazote
- Data Science Unit, Instituto de Investigación Sanitaria Hospital Universitario 12 de Octubre, Madrid, Spain
8
Shang Y, Tian Y, Zhou M, Zhou T, Lyu K, Wang Z, Xin R, Liang T, Zhu S, Li J. EHR-Oriented Knowledge Graph System: Toward Efficient Utilization of Non-Used Information Buried in Routine Clinical Practice. IEEE J Biomed Health Inform 2021;25:2463-2475. [PMID: 34057901] [DOI: 10.1109/jbhi.2021.3085003]
Abstract
Non-used clinical information has negative implications for healthcare quality. Clinicians pay priority attention to clinical information relevant to their specialties during routine clinical practice but may be insensitive or less attentive to information showing disease risks beyond their specialties, resulting in delayed or missed diagnoses or improper management. In this study, we introduce an electronic health record (EHR)-oriented knowledge graph system to efficiently utilize non-used information buried in EHRs. EHR data are transformed into a semantic, patient-centered information model under the ontology structure of a knowledge graph. The knowledge graph then creates an EHR data trajectory and performs reasoning through semantic rules to identify important clinical findings within the EHR data. A graphical reasoning pathway illustrates the reasoning process and explains the clinical significance, helping clinicians better understand the neglected information. An application study evaluated reminders for unconsidered chronic kidney disease (CKD) that help non-nephrology clinicians identify important neglected information. The study covered 71,679 patients in non-nephrology departments. The system identified 2,774 patients meeting CKD diagnosis criteria and 10,377 patients requiring high attention. A follow-up study of 5,439 patients showed that 82.1% of patients who met the diagnosis criteria and 61.4% of patients requiring high attention were confirmed CKD positive during follow-up. The application demonstrated that the proposed approach is feasible and effective for clinical information utilization. Additionally, it is valuable as an explainable artificial intelligence approach that provides interpretable recommendations for specialist physicians, helping them understand the importance of non-used data and make comprehensive decisions.
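To make the semantic-rule idea concrete, here is a toy rule in the spirit of a CKD reminder: flag a patient whose eGFR stays below 60 mL/min/1.73 m² across readings at least 90 days apart. This is a simplified reading of the widely used KDIGO chronicity criterion, not the paper's actual rule set, and the reading format is an assumption.

```python
# Toy semantic rule: sustained low eGFR suggests meeting CKD criteria.
# Simplified stand-in for the system's knowledge-graph reasoning rules.
from datetime import date

def meets_ckd_rule(egfr_readings, threshold=60, min_days=90):
    """egfr_readings: list of (date, value); True if low values span >= min_days."""
    low = sorted(d for d, v in egfr_readings if v < threshold)
    return bool(low) and (low[-1] - low[0]).days >= min_days
```

In the actual system, such rules run over a patient-centered knowledge graph and each firing is accompanied by a graphical reasoning pathway; the function above captures only the triggering condition.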
9
Gaudet-Blavignac C, Raisaro JL, Touré V, Österle S, Crameri K, Lovis C. A National, Semantic-Driven, Three-Pillar Strategy to Enable Health Data Secondary Usage Interoperability for Research Within the Swiss Personalized Health Network: Methodological Study. JMIR Med Inform 2021;9:e27591. [PMID: 34185008] [PMCID: PMC8277320] [DOI: 10.2196/27591]
Abstract
BACKGROUND Interoperability is a well-known challenge in medical informatics. Current trends in interoperability have moved from a technocentric, data-model-driven approach toward sustainable semantics, formal descriptive languages, and processes. Despite decades of initiatives and investment, the interoperability challenge remains crucial. The need to share data for most purposes, ranging from patient care to secondary uses such as public health, research, and quality assessment, faces unmet problems. OBJECTIVE This work was performed in the context of a large Swiss federal initiative, the Swiss Personalized Health Network (SPHN), aiming at building a national infrastructure for reusing consented data acquired in the health care and research system to enable research in the field of personalized medicine in Switzerland. This initiative provides funding to foster the use and exchange of health-related data for research. As part of the initiative, a national strategy to enable a semantically interoperable clinical data landscape was developed and implemented. METHODS A deep analysis of various approaches to interoperability was performed at the start, including large frameworks in health care, such as Health Level Seven (HL7) and Integrating the Healthcare Enterprise (IHE), and in several domains, such as regulatory agencies (eg, Clinical Data Interchange Standards Consortium [CDISC]) and research communities (eg, Observational Medical Outcomes Partnership [OMOP]), to identify bottlenecks and assess sustainability. Based on this research, a strategy composed of three pillars was designed: strong multidimensional semantics, a descriptive formal language for exchanges, and as many data models as needed to comply with the needs of various communities. RESULTS This strategy has been implemented stepwise in Switzerland since mid-2019 and has been adopted by all university hospitals and major research organizations. The initiative is coordinated by a central organization, the SPHN Data Coordination Center of the SIB Swiss Institute of Bioinformatics. The semantics are mapped by domain experts to various existing standards, such as Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), Logical Observation Identifiers Names and Codes (LOINC), and the International Classification of Diseases (ICD). The Resource Description Framework (RDF) is used for storing and transporting data and for integrating information from different sources and standards. Data transformers based on the SPARQL query language are implemented to convert RDF representations into the numerous data models required by the research community or to bridge with other systems, such as electronic case report forms. CONCLUSIONS The SPHN strategy successfully implemented existing standards in a pragmatic and applicable way. It did not try to build any new standards but used existing ones in a nondogmatic way. It has now been funded for another 4 years, bringing the Swiss landscape into a new dimension to support research in the field of personalized medicine and large interoperable clinical data.
Affiliation(s)
- Christophe Gaudet-Blavignac
- Division of Medical Information Sciences, Geneva University Hospitals, Geneva, Switzerland
- Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
- Jean Louis Raisaro
- Data Science Group, Division of Information Systems, Lausanne University Hospital, Lausanne, Switzerland
- Precision Medicine Unit, Department of Laboratories, Lausanne University Hospital, Lausanne, Switzerland
- Vasundra Touré
- Personalized Health Informatics Group, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Sabine Österle
- Personalized Health Informatics Group, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Katrin Crameri
- Personalized Health Informatics Group, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Christian Lovis
- Division of Medical Information Sciences, Geneva University Hospitals, Geneva, Switzerland
- Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
10
Hammad R, Barhoush M, Abed-alguni BH. A Semantic-Based Approach for Managing Healthcare Big Data: A Survey. J Healthc Eng 2020;2020:8865808. [PMID: 33489061] [PMCID: PMC7787845] [DOI: 10.1155/2020/8865808]
Abstract
Healthcare information systems can reduce the cost of treatment, anticipate epidemic outbreaks, help avoid preventable illnesses, and improve quality of life. In recent years, considerable volumes of heterogeneous and diverse healthcare data have been produced from various sources, covering hospital records of patients, laboratory results, and wearable devices, making it hard for conventional data processing to handle and manage this amount of data. Confronted with the difficulties and challenges of managing healthcare big data, such as volume, velocity, and variety, healthcare information systems need new methods and techniques for managing and processing such data to extract useful information and knowledge. In the past few years, many organizations and companies have shown enthusiasm for using semantic web technologies with healthcare big data to convert data into knowledge and intelligence. In this paper, we review the state of the art on the semantic web for the healthcare industry. Based on our literature review, we discuss how different techniques, standards, and points of view created by the semantic web community can help address the challenges related to healthcare big data.