Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Dugas M, Neuhaus P, Meidt A, Doods J, Storck M, Bruland P, Varghese J. Portal of medical data models: information infrastructure for medical research and healthcare. Database (Oxford) 2016;2016:bav121. [PMID: 26868052 PMCID: PMC4750548 DOI: 10.1093/database/bav121] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 09/22/2015] [Accepted: 12/01/2015] [Indexed: 11/14/2022]

Cited by Other Article(s)

Riepenhausen S, Blumenstock M, Niklas C, Hegselmann S, Neuhaus P, Meidt A, Püttmann C, Storck M, Ganzinger M, Varghese J, Dugas M. Europe's Largest Research Infrastructure for Curated Medical Data Models with Semantic Annotations. Methods Inf Med 2024. [PMID: 38740374 DOI: 10.1055/s-0044-1786839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]

Abrán H, Kovács K, Horvát Z, Erőss E, Hollins Martin CJ, Martin CR. Translation and validation of the Hungarian version of the Birth Satisfaction Scale-Revised (BSS-R). Midwifery 2024;132:103983. [PMID: 38581970 DOI: 10.1016/j.midw.2024.103983] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 02/27/2024] [Accepted: 03/25/2024] [Indexed: 04/08/2024]

Oehm JB, Riepenhausen SL, Storck M, Dugas M, Pryss R, Varghese J. Integration of Patient-Reported Outcome Data Collected Via Web Applications and Mobile Apps Into a Nation-Wide COVID-19 Research Platform Using Fast Healthcare Interoperability Resources: Development Study. J Med Internet Res 2024;26:e47846. [PMID: 38411999 PMCID: PMC10933715 DOI: 10.2196/47846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 07/30/2023] [Accepted: 12/12/2023] [Indexed: 02/28/2024] Open

Abstract

BACKGROUND

The Network University Medicine projects are an important part of the German COVID-19 research infrastructure. They comprise 2 subprojects: COVID-19 Data Exchange (CODEX) and Coordination on Mobile Pandemic Apps Best Practice and Solution Sharing (COMPASS). CODEX provides a centralized and secure data storage platform for research data, whereas in COMPASS, expert panels were gathered to develop a reference app framework for capturing patient-reported outcomes (PROs) that can be used by any researcher.

OBJECTIVE

Our study aims to integrate the data collected with the COMPASS reference app framework into the central CODEX platform, so that they can be used by secondary researchers. Although both projects used the Fast Healthcare Interoperability Resources (FHIR) standard, it was not used in a way that data could be shared directly. Given the short time frame and the parallel developments within the CODEX platform, a pragmatic and robust solution for an interface component was required.

METHODS

We have developed a means to facilitate and promote the use of the German Corona Consensus (GECCO) data set, a core data set for COVID-19 research in Germany. In this way, we ensured semantic interoperability for the app-collected PRO data with the COMPASS app. We also developed an interface component to sustain syntactic interoperability.

RESULTS

The use of different FHIR types by the COMPASS reference app framework (the general-purpose FHIR Questionnaire) and the CODEX platform (eg, Patient, Condition, and Observation) was found to be the most significant obstacle. Therefore, we developed an interface component that realigns the Questionnaire items with the corresponding items in the GECCO data set and provides the correct resources for the CODEX platform. We extended the existing COMPASS questionnaire editor with an import function for GECCO items, which also tags them for the interface component. This ensures syntactic interoperability and eases the reuse of the GECCO data set for researchers.

CONCLUSIONS

This paper shows how PRO data, which are collected across various studies conducted by different researchers, can be captured in a research-compatible way. This means that the data can be shared with a central research infrastructure and be reused by other researchers to gain more insights about COVID-19 and its sequelae.

Reinikainen J, Palosaari T, Canosa-Valls AJ, Schmidt CO, Wissa R, Chadalavada S, Codó L, Gelpí JL, Joseph B, van der Lugt A, Pacella E, Petersen SE, Pujadas ER, Szabo L, Zeller T, Niiranen T, Lekadir K, Kuulasmaa K. Cohort Profile: The Cardiovascular Research Data Catalogue. Int J Epidemiol 2024;53:dyad175. [PMID: 38142238 DOI: 10.1093/ije/dyad175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Accepted: 12/11/2023] [Indexed: 12/25/2023] Open

Affiliation(s)

Jaakko Reinikainen: Population Health Unit, Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
Tarja Palosaari: Population Health Unit, Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
Alejandro J Canosa-Valls: Barcelona Supercomputing Center (BSC), Barcelona, Spain
Carsten O Schmidt: Functional Division Quality in the Health Sciences (QIHS), Department SHIP-KEF, Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany
Rita Wissa: Maelstrom Research, Research Institute of the McGill University Health Centre, Montreal, Canada
Sucharitha Chadalavada: William Harvey Research Institute, NIHR Barts Biomedical Research Centre, Queen Mary University of London, Charterhouse Square, London, UK; Barts Heart Centre, St Bartholomew's Hospital, Barts Health NHS Trust, London, UK
Laia Codó: Barcelona Supercomputing Center (BSC), Barcelona, Spain
Josep Lluís Gelpí: Barcelona Supercomputing Center (BSC), Barcelona, Spain; Department of Biochemistry and Biomedicine, University of Barcelona, Barcelona, Spain
Bijoy Joseph: Data and Analytics Unit, Department of Knowledge Brokers, Finnish Institute for Health and Welfare, Helsinki, Finland
Aad van der Lugt: Department of Radiology and Nuclear Medicine, Erasmus University Medical Center Rotterdam, The Netherlands
Elsa Pacella: Scientific Affairs, Research Department, European Society of Cardiology, France
Steffen E Petersen: William Harvey Research Institute, NIHR Barts Biomedical Research Centre, Queen Mary University of London, Charterhouse Square, London, UK; Barts Heart Centre, St Bartholomew's Hospital, Barts Health NHS Trust, London, UK; Health Data Research UK, London, UK
Esmeralda Ruiz Pujadas: Department of Mathematics and Computer Science, Artificial Intelligence in Medicine Lab (BCN-AIM), Barcelona, Spain
Liliana Szabo: William Harvey Research Institute, NIHR Barts Biomedical Research Centre, Queen Mary University of London, Charterhouse Square, London, UK; Barts Heart Centre, St Bartholomew's Hospital, Barts Health NHS Trust, London, UK; Semmelweis University, Heart and Vascular Centre, Budapest, Hungary
Tanja Zeller: University Center of Cardiovascular Science, University Heart and Vascular Center, Hamburg, Germany; Department of Cardiology, University Heart and Vascular Center, Hamburg, Germany; German Center of Cardiovascular Research, Partner site Hamburg/Lübeck/Kiel, Hamburg, Germany
Teemu Niiranen: Population Health Unit, Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland; Department of Internal Medicine, University of Turku and Turku University Hospital, Turku, Finland
Karim Lekadir: Department of Mathematics and Computer Science, Artificial Intelligence in Medicine Lab (BCN-AIM), Barcelona, Spain
Kari Kuulasmaa: Population Health Unit, Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland

Dugas M, Blumenstock M, Dittrich T, Eisenmann U, Feder SC, Fritz-Kebede F, Kessler LJ, Klass M, Knaup P, Lehmann CU, Merzweiler A, Niklas C, Pausch TM, Zental N, Ganzinger M. Next-generation study databases require FAIR, EHR-integrated, and scalable Electronic Data Capture for medical documentation and decision support. NPJ Digit Med 2024;7:10. [PMID: 38216645 PMCID: PMC10786912 DOI: 10.1038/s41746-023-00994-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 12/11/2023] [Indexed: 01/14/2024] Open

Affiliation(s)

Martin Dugas: Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
Max Blumenstock: Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
Tobias Dittrich: Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany; Department of Hematology, Oncology and Rheumatology, Heidelberg University Hospital, Heidelberg, Germany
Urs Eisenmann: Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
Stephan Christoph Feder: Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany; Department of General Internal Medicine and Psychosomatics, Heidelberg University Hospital, Heidelberg, Germany
Fleur Fritz-Kebede: Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
Lucy J Kessler: Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany; Department of Ophthalmology, Heidelberg University Hospital, Heidelberg, Germany
Maximilian Klass: Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
Petra Knaup: Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
Christoph U Lehmann: Clinical Informatics Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
Angela Merzweiler: Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
Christian Niklas: Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
Thomas M Pausch: Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany; Department of General, Visceral, and Transplantation Surgery, Heidelberg University Hospital, Heidelberg, Germany
Nelly Zental: Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany; Department of Anesthesiology, Heidelberg University Hospital, Heidelberg, Germany
Matthias Ganzinger: Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany

Pallier K, Prot O, Naldi S, Silva F, Denis T, Giry O, Leobon S, Deluche E, Tubiana-Mathieu N. Patient Identification and Tumor Identification Management: Quality Program in a Cancer Multicentric Clinical Data Warehouse. Cancer Inform 2023;22:11769351231172609. [PMID: 37223319 PMCID: PMC10201142 DOI: 10.1177/11769351231172609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Accepted: 04/12/2023] [Indexed: 05/25/2023] Open

Cremonesi F, Planat V, Kalokyri V, Kondylakis H, Sanavia T, Miguel Mateos Resinas V, Singh B, Uribe S. The need for multimodal health data modeling: a practical approach for a federated-learning healthcare platform. J Biomed Inform 2023;141:104338. [PMID: 37023843 DOI: 10.1016/j.jbi.2023.104338] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/19/2022] [Revised: 03/06/2023] [Accepted: 03/11/2023] [Indexed: 04/08/2023]

Abstract

Federated learning initiatives in healthcare are being developed to collaboratively train predictive models without the need to centralize sensitive personal data. GenoMed4All is one such project, with the goal of connecting European clinical and -omics data repositories on rare diseases through a federated learning platform. Currently, the consortium faces the challenge of a lack of well-established international datasets and interoperability standards for federated learning applications on rare diseases. This paper presents our practical approach to select and implement a Common Data Model (CDM) suitable for the federated training of predictive models applied to the medical domain, during the initial design phase of our federated learning platform. We describe our selection process, composed of identifying the consortium's needs, reviewing our functional and technical architecture specifications, and extracting a list of business requirements. We review the state of the art and evaluate three widely-used approaches (FHIR, OMOP and Phenopackets) based on a checklist of requirements and specifications. We discuss the pros and cons of each approach considering the use cases specific to our consortium as well as the generic issues of implementing a European federated learning healthcare platform. A list of lessons learned from the experience in our consortium is discussed, from the importance of establishing the proper communication channels for all stakeholders to technical aspects related to -omics data. For federated learning projects focused on secondary use of health data for predictive modeling, encompassing multiple data modalities, a phase of data model convergence is sorely needed to gather different data representations developed in the context of medical research, interoperability of clinical care software, imaging, and -omics analysis into a coherent, unified data model. Our work identifies this need and presents our experience and a list of actionable lessons learned for future work in this direction.

Federated electronic data capture (fEDC): Architecture and prototype. J Biomed Inform 2023;138:104280. [PMID: 36623781 DOI: 10.1016/j.jbi.2023.104280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 12/23/2022] [Accepted: 01/03/2023] [Indexed: 01/09/2023]

Abstract

In clinical research as well as patient care, structured documentation of findings is an important task. In many cases, this is achieved by means of electronic case report forms (eCRF) using corresponding information technology systems. To avoid double data entry, eCRF systems can be integrated with electronic health records (EHR). However, when researchers from different institutions collaborate in collecting data, they often use a single joint eCRF system on the Internet. In this case, integration with EHR systems is not possible in most cases due to information security and data protection restrictions. To overcome this shortcoming, we propose a novel architecture for a federated electronic data capture system (fEDC). Four key requirements were identified for fEDC: Definitions of forms have to be available in a reliable and controlled fashion, integration with electronic health record systems must be possible, patient data should be under full local control until they are explicitly transferred for joint analysis, and the system must support data sharing principles accepted by the scientific community for both data model and data captured. With our approach, sites participating in a joint study can run their own instance of an fEDC system that complies with local standards (such as being behind a network firewall) while also being able to benefit from using identical form definitions by sharing metadata in the Operational Data Model (ODM) format published by the Clinical Data Interchange Standards Consortium (CDISC) throughout the collaboration. The fEDC architecture was validated with a working open-source prototype at five German university hospitals. The fEDC architecture provides a novel approach with the potential to significantly improve collaborative data capture: Efforts for data entry are reduced and at the same time, data quality is increased since barriers for integrating with local electronic health record systems are lowered. Further, metadata are shared and patient privacy is ensured at a high level.

Bernasconi A, Guizzardi G, Pastor O, Storey VC. Semantic interoperability: ontological unpacking of a viral conceptual model. BMC Bioinformatics 2022;23:491. [PMID: 36396980 PMCID: PMC9672571 DOI: 10.1186/s12859-022-05022-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/17/2022] [Accepted: 10/29/2022] [Indexed: 11/18/2022] Open

Abstract

BACKGROUND

Genomics and virology are unquestionably important, but complex, domains being investigated by a large number of scientists. The need to facilitate and support work within these domains requires sharing of databases, although it is often difficult to do so because of the different ways in which data is represented across the databases. To foster semantic interoperability, models are needed that provide a deep understanding and interpretation of the concepts in a domain, so that the data can be consistently interpreted among researchers.

RESULTS

In this research, we propose the use of conceptual models to support semantic interoperability among databases and assess their ontological clarity to support their effective use. This modeling effort is illustrated by its application to the Viral Conceptual Model (VCM) that captures and represents the sequencing of viruses, inspired by the need to understand the genomic aspects of the virus responsible for COVID-19. For achieving semantic clarity on the VCM, we leverage the "ontological unpacking" method, a process of ontological analysis that reveals the ontological foundation of the information that is represented in a conceptual model. This is accomplished by applying the stereotypes of the OntoUML ontology-driven conceptual modeling language. As a result, we propose a new OntoVCM, an ontologically grounded model, based on the initial VCM, but with guaranteed interoperability among the data sources that employ it.

CONCLUSIONS

We propose and illustrate how the unpacking of the Viral Conceptual Model resolves several issues related to semantic interoperability, the importance of which is recognized by the "I" in FAIR principles. The research addresses conceptual uncertainty within the domain of SARS-CoV-2 data and knowledge. The method employed provides the basis for further analyses of complex models currently used in life science applications, but lacking ontological grounding, subsequently hindering the interoperability needed for scientists to progress their research.

Rafee A, Riepenhausen S, Neuhaus P, Meidt A, Dugas M, Varghese J. ELaPro, a LOINC-mapped core dataset for top laboratory procedures of eligibility screening for clinical trials. BMC Med Res Methodol 2022;22:141. [PMID: 35568796 PMCID: PMC9107639 DOI: 10.1186/s12874-022-01611-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/01/2021] [Accepted: 04/20/2022] [Indexed: 12/21/2022] Open

Abstract

Background

Screening for eligible patients continues to pose a great challenge for many clinical trials. This has led to a rapidly growing interest in standardizing computable representations of eligibility criteria (EC) in order to develop tools that leverage data from electronic health record (EHR) systems. Although laboratory procedures (LP) represent a common entity of EC that is readily available and retrievable from EHR systems, there is a lack of interoperable data models for this entity of EC. A public, specialized data model that utilizes international, widely-adopted terminology for LP, e.g. Logical Observation Identifiers Names and Codes (LOINC®), is much needed to support automated screening tools.

Objective

The aim of this study is to establish a core dataset for LP most frequently requested to recruit patients for clinical trials using LOINC terminology. Employing such a core dataset could enhance the interface between study feasibility platforms and EHR systems and significantly improve automatic patient recruitment.

Methods

We used a semi-automated approach to analyze 10,516 screening forms from the Medical Data Models (MDM) portal’s data repository that are pre-annotated with Unified Medical Language System (UMLS). An automated semantic analysis based on concept frequency is followed by an extensive manual expert review performed by physicians to analyze complex recruitment-relevant concepts not amenable to automatic approach.

Results

Based on analysis of 138,225 EC from 10,516 screening forms, 55 laboratory procedures represented 77.87% of all UMLS laboratory concept occurrences identified in the selected EC forms. We identified 26,413 unique UMLS concepts from 118 UMLS semantic types and covered the vast majority of Medical Subject Headings (MeSH) disease domains.

Conclusions

Only a small set of common LP covers the majority of laboratory concepts in screening EC forms which supports the feasibility of establishing a focused core dataset for LP. We present ELaPro, a novel, LOINC-mapped, core dataset for the most frequent 55 LP requested in screening for clinical trials. ELaPro is available in multiple machine-readable data formats like CSV, ODM and HL7 FHIR. The extensive manual curation of this large number of free-text EC as well as the combining of UMLS and LOINC terminologies distinguishes this specialized dataset from previous relevant datasets in the literature.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12874-022-01611-y.

Lamer A, Fruchart M, Paris N, Popoff B, Payen A, Balcaen T, Gacquer W, Bouzille G, Cuggia M, Doutreligne M, Chazard E. Enhancing Data Reuse: Standardized Description of the Feature Extraction Process to Transform Raw Data into Meaningful Information (Preprint). JMIR Med Inform 2022;10:e38936. [DOI: 10.2196/38936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2022] [Revised: 07/19/2022] [Accepted: 08/11/2022] [Indexed: 11/13/2022] Open

Vaidyam A, Halamka J, Torous J. Enabling Research and Clinical Use of Patient-Generated Health Data (the mindLAMP Platform): Digital Phenotyping Study. JMIR Mhealth Uhealth 2022;10:e30557. [PMID: 34994710 PMCID: PMC8783287 DOI: 10.2196/30557] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 05/19/2021] [Revised: 08/18/2021] [Accepted: 11/11/2021] [Indexed: 02/06/2023] Open

Abstract

BACKGROUND

There is a growing need for the integration of patient-generated health data (PGHD) into research and clinical care to enable personalized, preventive, and interactive care, but technical and organizational challenges, such as the lack of standards and easy-to-use tools, preclude the effective use of PGHD generated from consumer devices, such as smartphones and wearables.

OBJECTIVE

This study outlines how we used mobile apps and semantic web standards such as HTTP 2.0, Representational State Transfer, JSON (JavaScript Object Notation), JSON Schema, Transport Layer Security (version 1.3), Advanced Encryption Standard-256, OpenAPI, HTML5, and Vega, in conjunction with patient and provider feedback to completely update a previous version of mindLAMP.

METHODS

The Learn, Assess, Manage, and Prevent (LAMP) platform addresses the abovementioned challenges in enhancing clinical insight by supporting research, data analysis, and implementation efforts around PGHD as an open-source solution with freely accessible and shared code.

RESULTS

With a simplified programming interface and novel data representation that captures additional metadata, the LAMP platform enables interoperability with existing Fast Healthcare Interoperability Resources-based health care systems as well as consumer wearables and services such as Apple HealthKit and Google Fit. The companion Cortex data analysis and machine learning toolkit offer robust support for artificial intelligence, behavioral feature extraction, interactive visualizations, and high-performance data processing through parallelization and vectorization techniques.

CONCLUSIONS

The LAMP platform incorporates feedback from patients and clinicians alongside a standards-based approach to address these needs and functions across a wide range of use cases through its customizable and flexible components. These range from simple survey-based research to international consortiums capturing multimodal data to simple delivery of mindfulness exercises through personalized, just-in-time adaptive interventions.

Stöhr MR, Günther A, Majeed RW. The Collaborative Metadata Repository (CoMetaR) Web App: Quantitative and Qualitative Usability Evaluation. JMIR Med Inform 2021;9:e30308. [PMID: 34847059 PMCID: PMC8669586 DOI: 10.2196/30308] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/10/2021] [Revised: 08/13/2021] [Accepted: 10/11/2021] [Indexed: 11/29/2022] Open

Abstract

Background

In the field of medicine and medical informatics, the importance of comprehensive metadata has long been recognized, and the composition of metadata has become its own field of profession and research. To ensure sustainable and meaningful metadata are maintained, standards and guidelines such as the FAIR (Findability, Accessibility, Interoperability, Reusability) principles have been published. The compilation and maintenance of metadata is performed by field experts supported by metadata management apps. The usability of these apps, for example, in terms of ease of use, efficiency, and error tolerance, crucially determines their benefit to those interested in the data.

Objective

This study aims to provide a metadata management app with high usability that assists scientists in compiling and using rich metadata. We aim to evaluate our recently developed interactive web app for our collaborative metadata repository (CoMetaR). This study reflects how real users perceive the app by assessing usability scores and explicit usability issues.

Methods

We evaluated the CoMetaR web app by measuring the usability of 3 modules: core module, provenance module, and data integration module. We defined 10 tasks in which users must acquire information specific to their user role. The participants were asked to complete the tasks in a live web meeting. We used the System Usability Scale questionnaire to measure the usability of the app. For qualitative analysis, we applied a modified think aloud method with the following thematic analysis and categorization into the ISO 9241-110 usability categories.

Results

A total of 12 individuals participated in the study. We found that over 97% (85/88) of all the tasks were completed successfully. We measured usability scores of 81, 81, and 72 for the 3 evaluated modules. The qualitative analysis resulted in 24 issues with the app.

Conclusions

A usability score of 81 implies very good usability for the 2 modules, whereas a usability score of 72 still indicates acceptable usability for the third module. We identified 24 issues that serve as starting points for further development. Our method proved to be effective and efficient in terms of effort and outcome. It can be adapted to evaluate apps within the medical informatics field and potentially beyond.

Vesteghem C, Brøndum RF, Sønderkær M, Sommer M, Schmitz A, Bødker JS, Dybkær K, El-Galaly TC, Bøgsted M. Implementing the FAIR Data Principles in precision oncology: review of supporting initiatives. Brief Bioinform 2021;21:936-945. [PMID: 31263868 PMCID: PMC7299292 DOI: 10.1093/bib/bbz044] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 02/11/2019] [Revised: 03/13/2019] [Accepted: 03/21/2019] [Indexed: 12/26/2022] Open

Hegselmann S, Storck M, Gessner S, Neuhaus P, Varghese J, Bruland P, Meidt A, Mertens C, Riepenhausen S, Baier S, Stöcker B, Henke J, Schmidt CO, Dugas M. Pragmatic MDR: a metadata repository with bottom-up standardization of medical metadata through reuse. BMC Med Inform Decis Mak 2021;21:160. [PMID: 34001121 PMCID: PMC8130274 DOI: 10.1186/s12911-021-01524-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/03/2020] [Accepted: 05/09/2021] [Indexed: 11/27/2022] Open

Abstract

Background

The variety of medical documentation often leads to incompatible data elements that impede data integration between institutions. A common approach to standardize and distribute metadata definitions are ISO/IEC 11179 norm-compliant metadata repositories with top-down standardization. To the best of our knowledge, however, it is not yet common practice to reuse the content of publicly accessible metadata repositories for creation of case report forms or routine documentation. We suggest an alternative concept called pragmatic metadata repository, which enables a community-driven bottom-up approach for agreeing on data collection models. A pragmatic metadata repository collects real-world documentation and considers frequent metadata definitions as high quality with potential for reuse.

Methods

We implemented a pragmatic metadata repository proof of concept application and filled it with medical forms from the Portal of Medical Data Models. We applied this prototype in two use cases to demonstrate its capabilities for reusing metadata: first, integration into a study editor for the suggestion of data elements and, second, metadata synchronization between two institutions. Moreover, we evaluated the emergence of bottom-up standards in the prototype and two medical data managers assessed their quality for 24 medical concepts.

Results

The resulting prototype contained 466,569 unique metadata definitions. Integration into the study editor led to a reuse of 1836 items and item groups. During the metadata synchronization, semantic codes of 4608 data elements were transferred. Our evaluation revealed that for less complex medical concepts weak bottom-up standards could be established. However, more diverse disease-related concepts showed no convergence of data elements due to an enormous heterogeneity of metadata. The survey showed fair agreement (K_alpha = 0.50, 95% CI 0.43–0.56) for good item quality of bottom-up standards.

Conclusions

We demonstrated the feasibility of the pragmatic metadata repository concept for medical documentation. Applications of the prototype in two use cases suggest that it facilitates the reuse of data elements. Our evaluation showed that bottom-up standardization based on a large collection of real-world metadata can yield useful results. The proposed concept shall not replace existing top-down approaches, rather it complements them by showing what is commonly used in the community to guide other researchers.
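
The bottom-up idea described in this abstract (treating frequently occurring metadata definitions as candidates for reuse) can be illustrated with a minimal sketch. This is not the prototype's implementation; the function name, the dictionary keys (`concept`, `question`, `datatype`) and the normalization rule are assumptions made purely for illustration.

```python
from collections import Counter


def most_frequent_definitions(forms, concept, top_n=3):
    """Rank the definitions used for one medical concept by how many forms contain them.

    `forms` is a list of forms; each form is a list of item dictionaries with
    the hypothetical keys "concept", "question" and "datatype".
    """
    counts = Counter()
    for form in forms:
        # Count each distinct definition at most once per form, so a single
        # form that repeats an item does not inflate its frequency.
        seen = {
            (item["question"].strip().lower(), item["datatype"])
            for item in form
            if item["concept"] == concept
        }
        counts.update(seen)
    return counts.most_common(top_n)


forms = [
    [{"concept": "body weight", "question": "Weight", "datatype": "float"}],
    [{"concept": "body weight", "question": "weight ", "datatype": "float"}],
    [{"concept": "body weight", "question": "Body weight (kg)", "datatype": "integer"}],
]
ranking = most_frequent_definitions(forms, "body weight")
```

In this toy data the definition used in two forms ranks first. A real repository would need a far more careful notion of when two definitions count as the same.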

Supplementary Information

The online version contains supplementary material available at 10.1186/s12911-021-01524-8.


Sass J, Bartschke A, Lehne M, Essenwanger A, Rinaldi E, Rudolph S, Heitmann KU, Vehreschild JJ, von Kalle C, Thun S. The German Corona Consensus Dataset (GECCO): a standardized dataset for COVID-19 research in university medicine and beyond. BMC Med Inform Decis Mak 2020;20:341. [PMID: 33349259 PMCID: PMC7751265 DOI: 10.1186/s12911-020-01374-w] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2020] [Accepted: 12/16/2020] [Indexed: 11/10/2022] Open

de Ridder S, Beliën JAM. The iCRF Generator: Generating interoperable electronic case report forms using online codebooks. F1000Res 2020;9:81. [PMID: 32566137 PMCID: PMC7291075 DOI: 10.12688/f1000research.21576.2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 03/16/2020] [Indexed: 11/23/2022] Open

von Martial S, Brix TJ, Klotz L, Neuhaus P, Berger K, Warnke C, Meuth SG, Wiendl H, Dugas M. EMR-integrated minimal core dataset for routine health care and multiple research settings: A case study for neuroinflammatory demyelinating diseases. PLoS One 2019;14:e0223886. [PMID: 31613917 PMCID: PMC6793844 DOI: 10.1371/journal.pone.0223886] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2019] [Accepted: 10/01/2019] [Indexed: 11/18/2022] Open

Abstract

Although routine health care and clinical trials usually require the documentation of similar information, data collection is performed independently from each other, resulting in redundant documentation efforts. Standardizing routine documentation can enable secondary use for medical research. Neuroinflammatory demyelinating diseases (NIDs) represent a heterogeneous group of diseases requiring further research to improve patient management. The aim of this work is to develop, implement and evaluate a minimal core dataset in routine health care with a focus on secondary use as a case study for NIDs. Therefore, a draft minimal core dataset for NIDs was created by analyzing routine, clinical trial, registry, biobank documentation and existing data standards for NIDs. Data elements (DEs) were converted into the standard format Operational Data Model, semantically annotated and analyzed via frequency analysis. The analysis produced 1958 DEs based on 864 distinct medical concepts. After review and finalization by an interdisciplinary team of neurologists, epidemiologists and medical computer scientists, the minimal core dataset (NID CDEs) consists of 46 common DEs capturing disease-specific information for reuse in the discharge letter and other research settings. It covers the areas of diagnosis, laboratory results, disease progress, expanded disability status scale, therapy and magnetic resonance imaging findings. NID CDEs was implemented in two German university hospitals and a usability study in clinical routine was conducted (participants n = 16), showing good usability (Mean SUS = 75). From May 2017 to February 2018, 755 patients were documented with the NID CDEs, which indicates the feasibility of developing a minimal core dataset for structured documentation based on previously used documentation standards and integrating the dataset into clinical routine.
By sharing, translating and reusing the minimal dataset, a transnational harmonized documentation of patients with NIDs might be realized, supporting interoperability in medical research.


Kentgen M, Varghese J, Samol A, Waltenberger J, Dugas M. Common Data Elements for Acute Coronary Syndrome: Analysis Based on the Unified Medical Language System. JMIR Med Inform 2019;7:e14107. [PMID: 31444871 PMCID: PMC6729118 DOI: 10.2196/14107] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2019] [Revised: 06/21/2019] [Accepted: 07/04/2019] [Indexed: 01/29/2023] Open

Holz C, Kessler T, Dugas M, Varghese J. Core Data Elements in Acute Myeloid Leukemia: A Unified Medical Language System-Based Semantic Analysis and Experts' Review. JMIR Med Inform 2019;7:e13554. [PMID: 31407666 PMCID: PMC6709897 DOI: 10.2196/13554] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2019] [Revised: 05/08/2019] [Accepted: 05/31/2019] [Indexed: 01/27/2023] Open

Abstract

Background

For cancer domains such as acute myeloid leukemia (AML), a large set of data elements is obtained from different institutions with heterogeneous data definitions within one patient course. The lack of clinical data harmonization impedes cross-institutional electronic data exchange and future meta-analyses.

Objective

This study aimed to identify and harmonize a semantic core of common data elements (CDEs) in clinical routine and research documentation, based on a systematic metadata analysis of existing documentation models.

Methods

Lists of relevant data items were collected and reviewed by hematologists from two university hospitals regarding routine documentation and several case report forms of clinical trials for AML. In addition, existing registries and international recommendations were included. Data items were coded to medical concepts via the Unified Medical Language System (UMLS) by a physician and reviewed by another physician. On the basis of the coded concepts, the data sources were analyzed for concept overlaps and identification of most frequent concepts. The most frequent concepts were then implemented as data elements in the standardized format of the Operational Data Model by the Clinical Data Interchange Standards Consortium.

Results

A total of 3265 medical concepts were identified, of which 1414 were unique. Among the 1414 unique medical concepts, the 50 most frequent ones cover 26.98% of all concept occurrences within the collected AML documentation. The top 100 concepts represent 39.48% of all concepts’ occurrences. Implementation of CDEs is available on a European research infrastructure and can be downloaded in different formats for reuse in different electronic data capture systems.
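
The coverage figures above are shares of concept occurrences: the occurrences of the k most frequent concepts divided by all occurrences. A minimal sketch of that calculation follows; the function name and the flat list input are assumptions for illustration, not the authors' analysis code.

```python
from collections import Counter


def top_k_coverage(concept_occurrences, k):
    """Return the fraction of all occurrences covered by the k most frequent concepts.

    `concept_occurrences` is a flat list with one entry per coded data item,
    so a concept that appears in several sources is listed several times.
    """
    counts = Counter(concept_occurrences)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(count for _, count in counts.most_common(k))
    return covered / total


# Toy data: 6 occurrences of 3 unique concepts.
occurrences = ["a", "a", "a", "b", "b", "c"]
coverage_top1 = top_k_coverage(occurrences, 1)
coverage_top2 = top_k_coverage(occurrences, 2)
```

Read this way, and assuming the 3265 identified concepts are the total number of occurrences, 26.98% would correspond to roughly 881 occurrences for the top 50 concepts.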

Conclusions

Information management is a complex process for research-intense disease entities such as AML, which is associated with a large set of lab-based diagnostics and different treatment options. Our systematic UMLS-based analysis revealed the existence of a core data set, and an exemplary reusable implementation for harmonized data capture is available on an established metadata repository.


Varghese J, Niewöhner S, Soto-Rey I, Schipmann-Miletić S, Warneke N, Warnecke T, Dugas M. A Smart Device System to Identify New Phenotypical Characteristics in Movement Disorders. Front Neurol 2019;10:48. [PMID: 30761078 PMCID: PMC6363699 DOI: 10.3389/fneur.2019.00048] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2018] [Accepted: 01/14/2019] [Indexed: 01/30/2023] Open

Varghese J, Sandmann S, Dugas M. Web-Based Information Infrastructure Increases the Interrater Reliability of Medical Coders: Quasi-Experimental Study. J Med Internet Res 2018;20:e274. [PMID: 30322834 PMCID: PMC6231825 DOI: 10.2196/jmir.9644] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2017] [Revised: 05/03/2018] [Accepted: 06/28/2018] [Indexed: 01/05/2023] Open

Abstract

Background

Medical coding is essential for standardized communication and integration of clinical data. The Unified Medical Language System by the National Library of Medicine is the largest clinical terminology system for medical coders and Natural Language Processing tools. However, the abundance of ambiguous codes leads to low rates of uniform coding among different coders.

Objective

The objective of our study was to measure uniform coding among different medical experts in terms of interrater reliability and analyze the effect on interrater reliability using an expert- and Web-based code suggestion system.

Methods

We conducted a quasi-experimental study in which 6 medical experts coded 602 medical items from structured quality assurance forms or free-text eligibility criteria of 20 different clinical trials. The medical item content was selected on the basis of mortality-leading diseases according to World Health Organization data. The intervention comprised using a semiautomatic code suggestion tool that is linked to a European information infrastructure providing a large medical text corpus of >300,000 medical form items with expert-assigned semantic codes. Krippendorff alpha (K_alpha) with bootstrap analysis was used for the interrater reliability analysis, and coding times were measured before and after the intervention.
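
For readers unfamiliar with the reliability measure named above, the following is a minimal sketch of Krippendorff's alpha for nominal codes, computed from a coincidence matrix. It is not the study's analysis code: the function name and input layout are assumptions, and the bootstrap confidence intervals are omitted.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of items; each item is a list of the codes assigned by
    the coders who rated it (None marks a missing rating).
    """
    coincidences = Counter()
    for ratings in units:
        ratings = [r for r in ratings if r is not None]
        m = len(ratings)
        if m < 2:
            continue  # items with fewer than two ratings carry no pairing information
        # Every ordered pair of ratings from different coders contributes 1/(m-1).
        for c, k in permutations(ratings, 2):
            coincidences[(c, k)] += 1 / (m - 1)

    totals = Counter()
    for (c, _), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())

    observed = sum(w for (c, k), w in coincidences.items() if c != k)
    expected = sum(totals[c] * totals[k] for c in totals for k in totals if c != k)
    if expected == 0:
        return float("nan")  # alpha is undefined when only one code was ever used
    return 1 - (n - 1) * observed / expected


# Two coders, four items, one disagreement.
alpha = krippendorff_alpha_nominal([["a", "a"], ["a", "b"], ["b", "b"], ["b", "b"]])
```

A value of 1 means perfect agreement and 0 means agreement no better than chance.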

Results

The intervention improved interrater reliability in structured quality assurance form items (from K_alpha=0.50, 95% CI 0.43-0.57 to K_alpha=0.62, 95% CI 0.55-0.69) and free-text eligibility criteria (from K_alpha=0.19, 95% CI 0.14-0.24 to K_alpha=0.43, 95% CI 0.37-0.50) while preserving or slightly reducing the mean coding time per item for all 6 coders. Regardless of the intervention, precoordination and structured items were associated with significantly high interrater reliability, but the proportion of items that were precoordinated significantly increased after intervention (eligibility criteria: OR 4.92, 95% CI 2.78-8.72; quality assurance: OR 1.96, 95% CI 1.19-3.25).

Conclusions

The Web-based code suggestion mechanism improved interrater reliability toward moderate or even substantial intercoder agreement. Precoordination and the use of structured versus free-text data elements are key drivers of higher interrater reliability.


Varghese J, Fujarski M, Hegselmann S, Neuhaus P, Dugas M. CDEGenerator: an online platform to learn from existing data models to build model registries. Clin Epidemiol 2018;10:961-970. [PMID: 30127646 PMCID: PMC6089100 DOI: 10.2147/clep.s170075] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/04/2022] Open

Tapuria A, Bruland P, Delaney B, Kalra D, Curcin V. Comparison and transformation between CDISC ODM and EN13606 EHR standards in connecting EHR data with clinical trial research data. Digit Health 2018;4:2055207618777676. [PMID: 29942639 PMCID: PMC6016569 DOI: 10.1177/2055207618777676] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2017] [Accepted: 04/13/2018] [Indexed: 01/01/2023] Open

Abstract

Objectives

Integrating Electronic Health Record (EHR) systems into the field of clinical trials still poses several challenges and obstacles. Heterogeneous standards and specifications are used to represent healthcare and clinical trial information. Therefore, this work investigates the mapping and data interoperability between healthcare and research standards: EN13606 used for the EHRs and the Clinical Data Interchange Standards Consortium Operational Data Model (CDISC ODM) used for clinical research.

Methods

Based on the specifications of CDISC ODM 1.3.2 and EN13606, a mapping between the structure and components of both standards has been performed. Archetype Definition Language (ADL) forms built with the EN13606 editor were transformed to ODM XML and reviewed. As a proof of concept, clinical sample data has been transformed into ODM and imported into an electronic data capture system. Reverse transformation from ODM to ADL has also been performed and finally reviewed concerning map-ability.

Results

The mapping between EN13606 and CDISC ODM shows the similarities and differences between the components and overall record structure of the two standards. An EN13606 archetype corresponds with a group of items within CDISC ODM. Transformations of element names, descriptions, different languages, datatypes, cardinality, optionality, units, value range and terminology codes are possible from EN13606 to CDISC ODM and vice versa.
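
The correspondence described above (one archetype maps to one group of items) can be sketched as follows. This is only an illustration under strong assumptions: the input dictionary is a hypothetical, heavily simplified stand-in for an archetype rather than real ADL, the OID prefixes and type map are invented, and the output is an un-namespaced metadata fragment, not a complete or schema-validated ODM file.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from simplified source types to ODM DataType values.
TYPE_MAP = {"string": "text", "integer": "integer", "real": "float", "date": "date"}


def archetype_to_odm(archetype):
    """Build an ODM-style MetaDataVersion fragment from a simplified archetype dict."""
    meta = ET.Element("MetaDataVersion", OID="MDV.1", Name="Sketch")
    group = ET.SubElement(
        meta,
        "ItemGroupDef",
        OID=f"IG.{archetype['id']}",
        Name=archetype["name"],
        Repeating="No",
    )
    for element in archetype["elements"]:
        item_oid = f"I.{element['id']}"
        # Optionality: a minimum occurrence of at least 1 becomes Mandatory="Yes".
        mandatory = "Yes" if element["min_occurrences"] >= 1 else "No"
        ET.SubElement(group, "ItemRef", ItemOID=item_oid, Mandatory=mandatory)
        ET.SubElement(
            meta,
            "ItemDef",
            OID=item_oid,
            Name=element["name"],
            DataType=TYPE_MAP[element["datatype"]],
        )
    return meta


example = {
    "id": "vitals",
    "name": "Vital signs",
    "elements": [
        {"id": "sbp", "name": "Systolic blood pressure", "datatype": "integer", "min_occurrences": 1},
        {"id": "note", "name": "Comment", "datatype": "string", "min_occurrences": 0},
    ],
}
fragment = archetype_to_odm(example)
xml_text = ET.tostring(fragment, encoding="unicode")
```

Units, value ranges, translations and terminology codes, which the abstract also lists as transformable, are left out of this sketch.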

Conclusion

It is feasible to map data elements between EN13606 and CDISC ODM, and transformation of forms between ADL and ODM XML format is possible with only minor limitations. EN13606 can accommodate clinical information in a more structured manner with more constraints, whereas CDISC ODM is more suitable and specific for clinical trials and studies. It is feasible to transform EHR data in the EN13606 form to ODM to transfer it into a research database. The attempt to use EN13606 to build a study protocol (that was already built with CDISC ODM) also suggests the possibility of using EN13606 standard in place of CDISC ODM if needed to avoid transformations.


Varghese J, Kleine M, Gessner SI, Sandmann S, Dugas M. Effects of computerized decision support system implementations on patient outcomes in inpatient care: a systematic review. J Am Med Inform Assoc 2018;25:593-602. [PMID: 29036406 PMCID: PMC7646949 DOI: 10.1093/jamia/ocx100] [Citation(s) in RCA: 71] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2017] [Revised: 08/10/2017] [Accepted: 08/22/2017] [Indexed: 02/07/2023] Open

Abstract

Objectives

To systematically classify the clinical impact of computerized clinical decision support systems (CDSSs) in inpatient care.

Materials and Methods

Medline, Cochrane Trials, and Cochrane Reviews were searched for CDSS studies that assessed patient outcomes in inpatient settings. For each study, 2 physicians independently mapped patient outcome effects to a predefined medical effect score to assess the clinical impact of reported outcome effects. Disagreements were measured by using weighted kappa and solved by consensus. An example set of promising disease entities was generated based on medical effect scores and risk of bias assessment. To summarize technical characteristics of the systems, reported input variables and algorithm types were extracted as well.

Results

Seventy studies were included. Five (7%) reported reduced mortality, 16 (23%) reduced life-threatening events, and 28 (40%) reduced non-life-threatening events; 20 (29%) had no significant impact on patient outcomes, and 1 showed a negative effect (weighted κ: 0.72, P < .001). Six of 24 disease entity settings showed high effect scores with medium or low risk of bias: blood glucose management, blood transfusion management, physiologic deterioration prevention, pressure ulcer prevention, acute kidney injury prevention, and venous thromboembolism prophylaxis. Most of the implemented algorithms (72%) were rule-based. Reported input variables are shared as standardized models on a metadata repository.

Discussion and Conclusion

Most of the included CDSS studies were associated with positive patient outcomes effects but with substantial differences regarding the clinical impact. A subset of 6 disease entities could be filtered in which CDSS should be given special consideration at sites where computer-assisted decision-making is deemed to be underutilized. Registration number on PROSPERO: CRD42016049946.


Read K, LaPolla FWZ. A new hat for librarians: providing REDCap support to establish the library as a central data hub. J Med Libr Assoc 2018;106:120-126. [PMID: 29339942 PMCID: PMC5764577 DOI: 10.5195/jmla.2018.327] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2016] [Accepted: 02/01/2017] [Indexed: 11/24/2022] Open

Johnson SB. Clinical Research Informatics: Supporting the Research Study Lifecycle. Yearb Med Inform 2017;26:193-200. [PMID: 29063565 PMCID: PMC6239240 DOI: 10.15265/iy-2017-022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2017] [Indexed: 12/27/2022] Open

Abstract

Objectives: The primary goal of this review is to summarize significant developments in the field of Clinical Research Informatics (CRI) over the years 2015-2016. The secondary goal is to contribute to a deeper understanding of CRI as a field, through the development of a strategy for searching and classifying CRI publications. Methods: A search strategy was developed to query the PubMed database, using medical subject headings to both select and exclude articles, and filtering publications by date and other characteristics. A manual review classified publications using stages in the "research study lifecycle", with key stages that include study definition, participant enrollment, data management, data analysis, and results dissemination. Results: The search strategy generated 510 publications. The manual classification identified 125 publications as relevant to CRI, which were classified into seven different stages of the research lifecycle, and one additional class that pertained to multiple stages, referring to general infrastructure or standards. Important cross-cutting themes included new applications of electronic media (Internet, social media, mobile devices), standardization of data and procedures, and increased automation through the use of data mining and big data methods. Conclusions: The review revealed increased interest and support for CRI in large-scale projects across institutions, regionally, nationally, and internationally. A search strategy based on medical subject headings can find many relevant papers, but a large number of non-relevant papers need to be detected using text words which pertain to closely related fields such as computational statistics and clinical informatics. The research lifecycle was useful as a classification scheme by highlighting the relevance to the users of clinical research informatics solutions.


Kellar E, Bornstein SM, Caban A, Célingant C, Crouthamel M, Johnson C, McIntire PA, Milstead KR, Patterson JK, Wilson B. Optimizing the Use of Electronic Data Sources in Clinical Trials: The Landscape, Part 1. Ther Innov Regul Sci 2016;50:682-696. [PMID: 30231749 DOI: 10.1177/2168479016670689] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]

Storck M, Krumm R, Dugas M. ODMSummary: A Tool for Automatic Structured Comparison of Multiple Medical Forms Based on Semantic Annotation with the Unified Medical Language System. PLoS One 2016;11:e0164569. [PMID: 27736972 PMCID: PMC5063379 DOI: 10.1371/journal.pone.0164569] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2016] [Accepted: 09/27/2016] [Indexed: 12/01/2022] Open

Dugas M. Sharing clinical trial data. Lancet 2016;387:2287. [PMID: 27302260 DOI: 10.1016/s0140-6736(16)30683-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]

ODMedit: uniform semantic annotation for data integration in medicine based on a public metadata repository. BMC Med Res Methodol 2016;16:65. [PMID: 27245222 PMCID: PMC4888420 DOI: 10.1186/s12874-016-0164-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2015] [Accepted: 05/14/2016] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

The volume and complexity of patient data - especially in personalised medicine - is steadily increasing, both regarding clinical data and genomic profiles: Typically more than 1,000 items (e.g., laboratory values, vital signs, diagnostic tests etc.) are collected per patient in clinical trials. In oncology hundreds of mutations can potentially be detected for each patient by genomic profiling. Therefore data integration from multiple sources constitutes a key challenge for medical research and healthcare.

METHODS

Semantic annotation of data elements can facilitate the identification of matching data elements in different sources and thereby supports data integration. Millions of different annotations are required due to the semantic richness of patient data. These annotations should be uniform, i.e., two matching data elements shall contain the same annotations. However, large terminologies like SNOMED CT or UMLS don't provide uniform coding. It is proposed to develop semantic annotations of medical data elements based on a large-scale public metadata repository. To achieve uniform codes, semantic annotations shall be re-used if a matching data element is available in the metadata repository.
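
The reuse rule in this paragraph (take the existing annotation whenever a matching data element is already in the repository) can be sketched in a few lines. This is not how ODMedit is implemented; the function names, the in-memory dictionary standing in for the repository, the exact-match rule after normalization, and the placeholder codes are all assumptions for illustration.

```python
def normalize(label):
    """Reduce a data element label to a simple matching key (lowercase, single spaces)."""
    return " ".join(label.lower().split())


def annotate(label, repository, propose_codes):
    """Return (codes, reused) for a data element label.

    If a matching element already exists in `repository`, its codes are
    reused so that matching elements end up with identical annotations.
    Otherwise `propose_codes` (a hypothetical callback, e.g. a human coder)
    supplies new codes, which are stored for later reuse.
    """
    key = normalize(label)
    if key in repository:
        return repository[key], True
    codes = propose_codes(label)
    repository[key] = codes
    return codes, False


repository = {}
first = annotate("Body  Weight", repository, lambda _: ("PLACEHOLDER-1",))
second = annotate("body weight", repository, lambda _: ("PLACEHOLDER-2",))
```

The second call returns the first element's codes rather than the newly proposed ones, which is the uniformity property the paragraph asks for. Real matching would need to go well beyond exact label comparison.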

RESULTS

A web-based tool called ODMedit ( https://odmeditor.uni-muenster.de/ ) was developed to create data models with uniform semantic annotations. It contains ~800,000 terms with semantic annotations which were derived from ~5,800 models from the portal of medical data models (MDM). The tool was successfully applied to manually annotate 22 forms with 292 data items from CDISC and to update 1,495 data models of the MDM portal.

CONCLUSION

Uniform manual semantic annotation of data models is feasible in principle, but requires a large-scale collaborative effort due to the semantic richness of patient data. A web-based tool for these annotations is available, which is linked to a public metadata repository.
