1
Chen JS, Copado IA, Vallejos C, Kalaw FGP, Soe P, Cai CX, Toy BC, Borkar D, Sun CQ, Shantha JG, Baxter SL. Variations in Electronic Health Record-Based Definitions of Diabetic Retinopathy Cohorts: A Literature Review and Quantitative Analysis. Ophthalmology Science 2024; 4:100468. [PMID: 38560278 PMCID: PMC10973665 DOI: 10.1016/j.xops.2024.100468] [Received: 07/11/2023] [Revised: 01/04/2024] [Accepted: 01/11/2024] [Indexed: 04/04/2024]
Abstract
Purpose: Use of the electronic health record (EHR) has motivated the need for data standardization. A gap in knowledge exists regarding variations in existing terminologies for defining diabetic retinopathy (DR) cohorts. This study aimed to review the literature and analyze variations regarding codified definitions of DR.
Design: Literature review and quantitative analysis.
Subjects: Published manuscripts.
Methods: Four graders reviewed PubMed and Google Scholar for peer-reviewed studies. Studies were included if they used codified definitions of DR (e.g., billing codes). Data elements such as author names, publication year, purpose, data set type, and DR definitions were manually extracted. Each study was reviewed by ≥ 2 authors to validate inclusion eligibility. Quantitative analyses of the codified definitions were then performed to characterize the variation between DR cohort definitions.
Main Outcome Measures: Number of studies included and numeric counts of billing codes used to define codified cohorts.
Results: In total, 43 studies met the inclusion criteria. Half of the included studies used datasets based on structured EHR data (i.e., data registries, institutional EHR review), and half used claims data. All but 1 of the studies used billing codes such as the International Classification of Diseases 9th or 10th edition (ICD-9 or ICD-10), either alone or in addition to another terminology for defining disease. Of the 27 included studies that used ICD-9 and the 20 studies that used ICD-10 codes, the most common codes used pertained to the full spectrum of DR severity. Diabetic retinopathy complications (e.g., vitreous hemorrhage) were also used to define some DR cohorts.
Conclusions: Substantial variations exist among codified definitions for DR cohorts within retrospective studies. Variable definitions may limit generalizability and reproducibility of retrospective studies. More work is needed to standardize disease cohorts.
Financial Disclosures Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
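As a rough illustration of why differing codified definitions matter, the sketch below applies two hypothetical ICD-10-CM prefix definitions to a toy set of patient records and compares the resulting cohorts. The patient data, the definition names, and the choice of prefixes are assumptions made for illustration; they are not taken from any of the reviewed studies.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy data: patient id -> ICD-10-CM codes on record.
PATIENT_CODES: Dict[str, Set[str]] = {
    "p1": {"E11.319"},          # type 2 diabetes with unspecified DR
    "p2": {"E11.3593"},         # type 2 diabetes with proliferative DR
    "p3": {"E11.9", "H43.11"},  # diabetes without complications + vitreous hemorrhage
    "p4": {"E11.9"},            # diabetes without complications only
}

# Assumed definition 1: DR severity codes only (type 1 and type 2 diabetes).
SEVERITY_ONLY: Tuple[str, ...] = tuple(
    f"{family}.3{stage}" for family in ("E10", "E11") for stage in "12345"
)

# Assumed definition 2: severity codes plus a DR complication (vitreous hemorrhage).
WITH_COMPLICATIONS: Tuple[str, ...] = SEVERITY_ONLY + ("H43.1",)


def build_cohort(patient_codes: Dict[str, Set[str]], prefixes: Tuple[str, ...]) -> Set[str]:
    """Return ids of patients with at least one code starting with a listed prefix."""
    return {
        patient_id
        for patient_id, codes in patient_codes.items()
        if any(code.startswith(prefixes) for code in codes)
    }


narrow = build_cohort(PATIENT_CODES, SEVERITY_ONLY)
broad = build_cohort(PATIENT_CODES, WITH_COMPLICATIONS)
print(sorted(narrow), sorted(broad), sorted(broad - narrow))
```

In this toy example the broader definition admits one additional patient who has no DR severity code at all, which is the kind of divergence that can limit comparability between retrospective studies.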
Affiliation(s)
- Jimmy S Chen
  - Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California
  - UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
- Ivan A Copado
  - Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California
  - UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
- Cecilia Vallejos
  - Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California
  - UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
- Fritz Gerald P Kalaw
  - Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California
  - UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
- Priyanka Soe
  - Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California
  - UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
- Cindy X Cai
  - Wilmer Eye Institute, Johns Hopkins School of Medicine, Baltimore, Maryland
- Brian C Toy
  - Department of Ophthalmology, Roski Eye Institute, Keck School of Medicine, University of Southern California, Los Angeles, California
- Durga Borkar
  - Department of Ophthalmology, Duke Eye Center, Duke University, Durham, North Carolina
- Catherine Q Sun
  - F.I. Proctor Foundation, University of California San Francisco, San Francisco, California
  - Department of Ophthalmology, University of California San Francisco, San Francisco, California
- Jessica G Shantha
  - F.I. Proctor Foundation, University of California San Francisco, San Francisco, California
  - Department of Ophthalmology, University of California San Francisco, San Francisco, California
- Sally L Baxter
  - Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California
  - UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
2
Blasini R, Buchowicz KM, Schneider H, Samans B, Sohrabi K. Implementation of inclusion and exclusion criteria in clinical studies in OHDSI ATLAS software. Sci Rep 2023; 13:22457. [PMID: 38105303 PMCID: PMC10725886 DOI: 10.1038/s41598-023-49560-w] [Received: 06/12/2023] [Accepted: 12/09/2023] [Indexed: 12/19/2023]
Abstract
Clinical trials are essential parts of a medical study process, but studies are often cancelled due to a lack of participants. Clinical Trial Recruitment Support Systems are systems that help to increase the number of participants by seeking more suitable subjects. The software ATLAS (developed by Observational Health Data Sciences and Informatics) can support the launch of a clinical trial by building cohorts of patients who fulfill certain criteria. The correct use of medical classification systems aiming at clearly defined inclusion and exclusion criteria in the studies is an important pillar of this software. The aim of this investigation was to determine whether ATLAS can be used in a Clinical Trial Recruitment Support System to portray the eligibility criteria of clinical studies. Our analysis considered the number of criteria feasible for integration with ATLAS and identified its strengths and weaknesses. Additionally, we investigated whether nonrepresentable criteria were associated with the utilized terminology systems. We analyzed ATLAS using 223 objective eligibility criteria from 30 randomly selected trials conducted in the last 10 years. In the next step, we selected appropriate ICD, OPS, LOINC, or ATC codes to feed the software. We classified each criterion and study based on its implementation capability in the software, ensuring a clear and logical progression of information. Based on our observations, 51% of the analyzed inclusion criteria were fully implemented in ATLAS. Within our selected example set, 10% of the studies were classified as fully portrayable, and 73% were portrayed to some extent. Additionally, we conducted an evaluation of the software regarding its technical limitations and interaction with medical classification systems. 
To improve and expand the scope of criteria within a cohort definition in a practical setting, it is recommended to work closely with personnel involved in the study to define the criteria precisely and to carefully select terminology systems. The chosen criteria should be combined according to the specific setting. Additional work is needed to specify the significance and amount of the extracted criteria.
Affiliation(s)
- Romina Blasini
  - Institute of Medical Informatics, Justus Liebig University, Giessen, Germany
- Kornelia Marta Buchowicz
  - Institute of Medical Informatics, Justus Liebig University, Giessen, Germany
  - Faculty of Health Sciences, University of Applied Sciences, Giessen, Germany
- Henning Schneider
  - Institute of Medical Informatics, Justus Liebig University, Giessen, Germany
  - Faculty of Health Sciences, University of Applied Sciences, Giessen, Germany
- Birgit Samans
  - Faculty of Health Sciences, University of Applied Sciences, Giessen, Germany
- Keywan Sohrabi
  - Institute of Medical Informatics, Justus Liebig University, Giessen, Germany
  - Faculty of Health Sciences, University of Applied Sciences, Giessen, Germany
3
Henke E, Zoch M, Kallfelz M, Ruhnke T, Leutner LA, Spoden M, Günster C, Sedlmayr M, Bathelt F. Assessing the Use of German Claims Data Vocabularies for Research in the Observational Medical Outcomes Partnership Common Data Model: Development and Evaluation Study. JMIR Med Inform 2023; 11:e47959. [PMID: 37942786 PMCID: PMC10653283 DOI: 10.2196/47959] [Received: 04/06/2023] [Revised: 09/07/2023] [Accepted: 09/09/2023] [Indexed: 11/10/2023]
Abstract
Background: National classifications and terminologies already routinely used for documentation within patient care settings enable the unambiguous representation of clinical information. However, the diversity of different vocabularies across health care institutions and countries is a barrier to achieving semantic interoperability and exchanging data across sites. The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) enables the standardization of structure and medical terminology. It allows the mapping of national vocabularies into so-called standard concepts, representing normative expressions for international analyses and research. Within our project "Hybrid Quality Indicators Using Machine Learning Methods" (Hybrid-QI), we aim to harmonize source codes used in German claims data vocabularies that are currently unavailable in the OMOP CDM.
Objective: This study aims to increase the coverage of German vocabularies in the OMOP CDM. We aim to completely transform the source codes used in German claims data into the OMOP CDM without data loss and make German claims data usable for OMOP CDM-based research.
Methods: To prepare the missing German vocabularies for the OMOP CDM, we defined a vocabulary preparation approach consisting of the identification of all codes of the corresponding vocabularies, their assembly into machine-readable tables, and the translation of German designations into English. Furthermore, we used 2 proposed approaches for OMOP-compliant vocabulary preparation: the mapping to standard concepts using the Observational Health Data Sciences and Informatics (OHDSI) tool Usagi and the preparation of new 2-billion concepts (ie, concept_id >2 billion). Finally, we evaluated the prepared vocabularies regarding completeness and correctness using synthetic German claims data and calculated the coverage of German claims data vocabularies in the OMOP CDM.
Results: Our vocabulary preparation approach was able to map 3 missing German vocabularies to standard concepts and prepare 8 vocabularies as new 2-billion concepts. The completeness evaluation showed that the prepared vocabularies cover 44.3% (3288/7417) of the source codes contained in German claims data. The correctness evaluation revealed that the specified validity periods in the OMOP CDM are compliant for the majority (705,531/706,032, 99.9%) of source codes and associated dates in German claims data. The calculation of the vocabulary coverage showed a noticeable decrease of missing vocabularies from 55% (11/20) to 10% (2/20) due to our preparation approach.
Conclusions: By preparing 10 vocabularies, we showed that our approach is applicable to any type of vocabulary used in a source data set. The prepared vocabularies are currently limited to German vocabularies, which can only be used in national OMOP CDM research projects, because the mapping of new 2-billion concepts to standard concepts is missing. To participate in international OHDSI network studies with German claims data, future work is required to map the prepared 2-billion concepts to standard concepts.
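The percentages reported above follow directly from the stated counts, and the "2-billion concept" convention amounts to a simple threshold on `concept_id`. The sketch below reproduces that arithmetic; the function names are hypothetical and the code is only a check of the published figures, not the authors' evaluation pipeline.

```python
# Threshold taken from the abstract: new local concepts use concept_id > 2 billion.
LOCAL_CONCEPT_ID_THRESHOLD = 2_000_000_000


def is_local_concept(concept_id: int) -> bool:
    """Return True if the id lies in the range used for locally defined concepts."""
    return concept_id > LOCAL_CONCEPT_ID_THRESHOLD


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts as reported in the abstract.
completeness = percentage(3288, 7417)        # source codes covered
correctness = percentage(705_531, 706_032)   # compliant validity periods
missing_before = percentage(11, 20)          # missing vocabularies before preparation
missing_after = percentage(2, 20)            # missing vocabularies after preparation

print(completeness, correctness, missing_before, missing_after)
print(is_local_concept(2_000_000_001), is_local_concept(4_329_847))
```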
Affiliation(s)
- Elisa Henke
  - Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Michéle Zoch
  - Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Thomas Ruhnke
  - Wissenschaftliches Institut der AOK (AOK Research Institute), Berlin, Germany
- Liz Annika Leutner
  - Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Melissa Spoden
  - Wissenschaftliches Institut der AOK (AOK Research Institute), Berlin, Germany
- Christian Günster
  - Wissenschaftliches Institut der AOK (AOK Research Institute), Berlin, Germany
- Martin Sedlmayr
  - Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
4
Zass L, Johnston K, Benkahla A, Chaouch M, Kumuthini J, Radouani F, Mwita LA, Alsayed N, Allie T, Sathan D, Masamu U, Seuneu Tchamga MS, Tamuhla T, Samtal C, Nembaware V, Gill Z, Ahmed S, Hamdi Y, Fadlelmola F, Tiffin N, Mulder N. Developing Clinical Phenotype Data Collection Standards for Research in Africa. Glob Health Epidemiol Genom 2023; 2023:6693323. [PMID: 37766808 PMCID: PMC10522421 DOI: 10.1155/2023/6693323] [Received: 11/30/2022] [Revised: 06/30/2023] [Accepted: 07/21/2023] [Indexed: 09/29/2023]
Abstract
Modern biomedical research is characterised by its high-throughput and interdisciplinary nature. Multiproject and consortium-based collaborations requiring meaningful analysis of multiple heterogeneous phenotypic datasets have become the norm; however, such analysis remains a challenge in many regions across the world. An increasing number of data harmonisation efforts are being undertaken by multistudy collaborations through either prospective standardised phenotype data collection or retrospective phenotype harmonisation. In this regard, the Phenotype Harmonisation Working Group (PHWG) of the Human Heredity and Health in Africa (H3Africa) consortium aimed to facilitate phenotype standardisation by promoting the use of existing data collection standards (hosted by PhenX), adapting existing data collection standards for appropriate use in low- and middle-income regions such as Africa, and developing novel data collection standards where relevant gaps were identified. Ultimately, the PHWG produced 11 data collection kits consisting of 82 protocols, of which 38 were existing protocols, 17 were adapted, and 27 were novel. The data collection kits will facilitate phenotype standardisation and harmonisation not only in Africa but also across the larger research community. In addition, the PHWG aims to feed back adapted and novel protocols to existing reference platforms such as PhenX.
Affiliation(s)
- Lyndon Zass
  - Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, University of Cape Town, Cape Town, South Africa
- Katherine Johnston
  - Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, University of Cape Town, Cape Town, South Africa
- Alia Benkahla
  - Laboratory of BioInformatics, BioMathematics and BioStatistics LR16IPT09, Institut Pasteur de Tunis, Tunis, Tunisia
- Melek Chaouch
  - Laboratory of BioInformatics, BioMathematics and BioStatistics LR16IPT09, Institut Pasteur de Tunis, Tunis, Tunisia
- Judit Kumuthini
  - South African National Bioinformatics Institute (SANBI), Life Sciences Building, University of Western Cape, Bellville, Cape Town, South Africa
- Fouzia Radouani
  - Chlamydiae & Mycoplasmas Laboratory Research Department, Institut Pasteur du Maroc, 20360 Casablanca, Morocco
- Liberata Alexander Mwita
  - Muhimbili Sickle Cell Program, Department of Hematology and Blood Transfusion, Muhimbili University of Health and Allied Sciences, Dar-es-Salaam, Tanzania
- Nihad Alsayed
  - Kush Centre for Genomics & Biomedical Informatics, Biotechnology Perspectives Organization, Khartoum 11111, Sudan
- Taryn Allie
  - Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, University of Cape Town, Cape Town, South Africa
- Dassen Sathan
  - Software Information Systems Department, FOICDT, University of Mauritius, Reduit, Mauritius
- Upendo Masamu
  - Muhimbili Sickle Cell Program, Department of Hematology and Blood Transfusion, Muhimbili University of Health and Allied Sciences, Dar-es-Salaam, Tanzania
- Tsaone Tamuhla
  - Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, University of Cape Town, Cape Town, South Africa
- Chaimae Samtal
  - Laboratory of Biotechnology, Environment, Agri-Food and Health, Faculty of Sciences Dhar El Mahraz-Sidi Mohammed Ben Abdellah University, Fez 30000, Morocco
- Victoria Nembaware
  - Division of Human Genetics, Department of Pathology, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Zoe Gill
  - Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, University of Cape Town, Cape Town, South Africa
  - Department of Molecular Biology, Johannes Gutenberg University, Mainz, Germany
- Samah Ahmed
  - Kush Centre for Genomics & Biomedical Informatics, Biotechnology Perspectives Organization, Khartoum 11111, Sudan
- Yosr Hamdi
  - Laboratory of Biomedical Genomics and Oncogenetics, Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia
  - Laboratory of Human and Experimental Pathology, Institut Pasteur de Tunis, Tunis, Tunisia
- Faisal Fadlelmola
  - Kush Centre for Genomics & Biomedical Informatics, Biotechnology Perspectives Organization, Khartoum 11111, Sudan
- Nicki Tiffin
  - Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, University of Cape Town, Cape Town, South Africa
  - South African National Bioinformatics Institute (SANBI), Life Sciences Building, University of Western Cape, Bellville, Cape Town, South Africa
  - Wellcome Centre for Infectious Disease Research in Africa, Institute of Infectious Diseases and Molecular Medicine, Faculty of Cape Town, University of Cape Town, Cape Town, South Africa
- Nicola Mulder
  - Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, University of Cape Town, Cape Town, South Africa
  - Wellcome Centre for Infectious Disease Research in Africa, Institute of Infectious Diseases and Molecular Medicine, Faculty of Cape Town, University of Cape Town, Cape Town, South Africa
5
Wolfien M, Ahmadi N, Fitzer K, Grummt S, Heine KL, Jung IC, Krefting D, Kühn A, Peng Y, Reinecke I, Scheel J, Schmidt T, Schmücker P, Schüttler C, Waltemath D, Zoch M, Sedlmayr M. Ten Topics to Get Started in Medical Informatics Research. J Med Internet Res 2023; 25:e45948. [PMID: 37486754 PMCID: PMC10407648 DOI: 10.2196/45948] [Received: 01/23/2023] [Revised: 03/29/2023] [Accepted: 04/11/2023] [Indexed: 07/25/2023]
Abstract
The vast and heterogeneous data being constantly generated in clinics can provide great wealth for patients and research alike. The quickly evolving field of medical informatics research has contributed numerous concepts, algorithms, and standards to facilitate this development. However, these difficult relationships, complex terminologies, and multiple implementations can present obstacles for people who want to get active in the field. With a particular focus on medical informatics research conducted in Germany, we present in our Viewpoint a set of 10 important topics to improve the overall interdisciplinary communication between different stakeholders (eg, physicians, computational experts, experimentalists, students, patient representatives). This may lower the barriers to entry and offer a starting point for collaborations at different levels. The suggested topics are briefly introduced, then general best practice guidance is given, and further resources for in-depth reading or hands-on tutorials are recommended. In addition, the topics are set to cover current aspects and open research gaps of the medical informatics domain, including data regulations and concepts; data harmonization and processing; and data evaluation, visualization, and dissemination. In addition, we give an example on how these topics can be integrated in a medical informatics curriculum for higher education. By recognizing these topics, readers will be able to (1) set clinical and research data into the context of medical informatics, understanding what is possible to achieve with data or how data should be handled in terms of data privacy and storage; (2) distinguish current interoperability standards and obtain first insights into the processes leading to effective data transfer and analysis; and (3) value the use of newly developed technical approaches to utilize the full potential of clinical data.
Affiliation(s)
- Markus Wolfien
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
  - Center for Scalable Data Analytics and Artificial Intelligence, Dresden, Germany
- Najia Ahmadi
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Kai Fitzer
  - Core Unit Data Integration Center, University Medicine Greifswald, Greifswald, Germany
- Sophia Grummt
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Kilian-Ludwig Heine
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Ian-C Jung
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Dagmar Krefting
  - Department of Medical Informatics, University Medical Center, Goettingen, Germany
- Andreas Kühn
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Yuan Peng
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Ines Reinecke
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Julia Scheel
  - Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany
- Tobias Schmidt
  - Institute for Medical Informatics, University of Applied Sciences Mannheim, Mannheim, Germany
- Paul Schmücker
  - Institute for Medical Informatics, University of Applied Sciences Mannheim, Mannheim, Germany
- Christina Schüttler
  - Central Biobank Erlangen, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Dagmar Waltemath
  - Core Unit Data Integration Center, University Medicine Greifswald, Greifswald, Germany
  - Department of Medical Informatics, University Medicine Greifswald, Greifswald, Germany
- Michele Zoch
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Martin Sedlmayr
  - Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
  - Center for Scalable Data Analytics and Artificial Intelligence, Dresden, Germany
6
Lovis C, Siebel J, Fuhrmann S, Fischer A, Sedlmayr M, Weidner J, Bathelt F. Assessment and Improvement of Drug Data Structuredness From Electronic Health Records: Algorithm Development and Validation. JMIR Med Inform 2023; 11:e40312. [PMID: 36696159 PMCID: PMC9909518 DOI: 10.2196/40312] [Received: 06/15/2022] [Revised: 09/27/2022] [Accepted: 11/18/2022] [Indexed: 11/19/2022]
Abstract
BACKGROUND: Digitization offers a multitude of opportunities to gain insights into current diagnostics and therapies from retrospective data. In this context, real-world data and their accessibility are of increasing importance to support unbiased and reliable research on big data. However, routinely collected data are not readily usable for research owing to the unstructured nature of health care systems and a lack of interoperability between these systems. This challenge is evident in drug data.
OBJECTIVE: This study aimed to present an approach that identifies and increases the structuredness of drug data while ensuring standardization according to Anatomical Therapeutic Chemical (ATC) classification.
METHODS: Our approach was based on available drug prescriptions and a drug catalog and consisted of 4 steps. First, we performed an initial analysis of the structuredness of local drug data to define a point of comparison for the effectiveness of the overall approach. Second, we applied 3 algorithms to unstructured data that translated text into ATC codes based on string comparisons in terms of ingredients and product names and performed similarity comparisons based on Levenshtein distance. Third, we validated the results of the 3 algorithms with expert knowledge based on the 1000 most frequently used prescription texts. Fourth, we performed a final validation to determine the increased degree of structuredness.
RESULTS: Initially, 47.73% (n=843,980) of 1,768,153 drug prescriptions were classified as structured. With the application of the 3 algorithms, we were able to increase the degree of structuredness to 85.18% (n=1,506,059) based on the 1000 most frequent medication prescriptions. In this regard, the combination of algorithms 1, 2, and 3 resulted in a correctness level of 100% (with 57,264 ATC codes identified), algorithms 1 and 3 resulted in 99.6% (with 152,404 codes identified), and algorithms 1 and 2 resulted in 95.9% (with 39,472 codes identified).
CONCLUSIONS: As shown in the first analysis steps of our approach, the availability of a product catalog to select during the documentation process is not sufficient to generate structured data. Our 4-step approach reduces the problems and reliably increases the structuredness automatically. Similarity matching shows promising results, particularly for entries with no connection to a product catalog. However, further enhancement of the correctness of such a similarity matching algorithm needs to be investigated in future work.
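The similarity-matching idea described above can be sketched as follows: compute the Levenshtein distance between a free-text prescription entry and each name in a catalog, and accept the closest name if it falls within a distance limit. This is a minimal illustration only, not the authors' algorithms; the three-entry catalog, the `match_atc` helper, and the `max_distance` threshold are all assumptions.

```python
from typing import Dict, Optional, Tuple


def levenshtein(a: str, b: str) -> int:
    """Return the edit distance (insertions, deletions, substitutions) between a and b."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Hypothetical mini catalog: ingredient name -> ATC code.
CATALOG: Dict[str, str] = {
    "ibuprofen": "M01AE01",
    "metformin": "A10BA02",
    "paracetamol": "N02BE01",
}


def match_atc(text: str, catalog: Dict[str, str] = CATALOG,
              max_distance: int = 2) -> Optional[Tuple[str, str, int]]:
    """Return (name, atc_code, distance) for the closest catalog name, or None."""
    normalized = text.strip().lower()
    best: Optional[Tuple[str, str, int]] = None
    for name, atc_code in catalog.items():
        distance = levenshtein(normalized, name)
        if distance <= max_distance and (best is None or distance < best[2]):
            best = (name, atc_code, distance)
    return best


print(match_atc("Ibuprofn"))
print(match_atc("aspirin"))
```

A stricter `max_distance` reduces false matches at the cost of leaving more entries unstructured, which mirrors the correctness concern raised in the conclusions.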
Affiliation(s)
- Joscha Siebel
  - Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Saskia Fuhrmann
  - Center for Evidence-Based Healthcare, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
  - Hospital Pharmacy, University Hospital Carl Gustav Carus, Dresden, Germany
- Andreas Fischer
  - Hospital Pharmacy, University Hospital Carl Gustav Carus, Dresden, Germany
- Martin Sedlmayr
  - Center for Evidence-Based Healthcare, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Jens Weidner
  - Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Franziska Bathelt
  - Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
7
Park K, Cho M, Song M, Yoo S, Baek H, Kim S, Kim K. Exploring the potential of OMOP common data model for process mining in healthcare. PLoS One 2023; 18:e0279641. [PMID: 36595527 PMCID: PMC9810199 DOI: 10.1371/journal.pone.0279641] [Received: 05/19/2022] [Accepted: 12/09/2022] [Indexed: 01/04/2023]
Abstract
BACKGROUND AND OBJECTIVE: Electronic Health Records (EHRs) are increasingly being converted to Common Data Models (CDMs), database schemas designed to provide standardized vocabularies to facilitate collaborative observational research. To date, however, few attempts have been made to leverage CDM data for healthcare process mining, a technique to derive process-related knowledge (e.g., a process model) from event logs. This paper presents a method to extract, construct, and analyze event logs from the Observational Medical Outcomes Partnership (OMOP) CDM for process mining. It demonstrates CDM-based healthcare process mining with several real-life study cases while answering frequently posed process mining questions in the CDM environment.
METHODS: We propose a method to extract, construct, and analyze event logs from the OMOP CDM for process types including inpatient, outpatient, and emergency room processes and the patient journey. Using the proposed method, we extract the retrospective data of several surgical procedure cases (i.e., Total Laparoscopic Hysterectomy (TLH), Total Hip Replacement (THR), Coronary Bypass (CB), Transcatheter Aortic Valve Implantation (TAVI), Pancreaticoduodenectomy (PD)) from the CDM of a Korean tertiary hospital. Patient data are extracted for each of the operations and analyzed using several process mining techniques.
RESULTS: Using process mining, the clinical pathways, outpatient process models, emergency room process models, and patient journeys are demonstrated using the extracted logs. The results show the CDM's usability as a novel and valuable data source for healthcare process analysis, with a few caveats. We found that the CDM should be complemented by different internal and external data sources to address the administrative and operational aspects of healthcare processes, particularly for outpatient and ER process analyses.
CONCLUSION: To the best of our knowledge, we are the first to exploit the CDM for healthcare process mining. Specifically, we provide step-by-step guidance by demonstrating process analysis from locating relevant CDM tables to visualizing results using process mining tools. The proposed method can be widely applicable across different institutions. This work can contribute to bringing a process mining perspective to existing CDM users in the changing Hospital Information Systems (HIS) environment and to facilitating CDM-based studies in the process mining research community.
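To make the notion of an event log concrete, the sketch below turns rows shaped like the OMOP CDM `procedure_occurrence` table into (case, activity, timestamp) events and counts directly-follows relations, a basic building block of process discovery. It is an illustrative assumption rather than the authors' method: the rows, the concept ids, the `ACTIVITY_NAMES` mapping, and both helper functions are hypothetical, and only the column names `person_id`, `procedure_concept_id`, and `procedure_date` come from the CDM.

```python
from collections import Counter
from datetime import date
from typing import Dict, List, Tuple

Event = Tuple[int, str, date]  # (case id, activity, timestamp)

# Hypothetical rows shaped like procedure_occurrence (deliberately unsorted).
ROWS: List[dict] = [
    {"person_id": 1, "procedure_concept_id": 102, "procedure_date": date(2020, 1, 5)},
    {"person_id": 1, "procedure_concept_id": 101, "procedure_date": date(2020, 1, 3)},
    {"person_id": 1, "procedure_concept_id": 103, "procedure_date": date(2020, 1, 20)},
    {"person_id": 2, "procedure_concept_id": 101, "procedure_date": date(2020, 2, 1)},
    {"person_id": 2, "procedure_concept_id": 102, "procedure_date": date(2020, 2, 2)},
]

# Hypothetical concept ids and labels, for illustration only.
ACTIVITY_NAMES: Dict[int, str] = {101: "Admission exam", 102: "Surgery", 103: "Follow-up"}


def build_event_log(rows: List[dict]) -> List[Event]:
    """Map rows to events and order them by case, then by timestamp."""
    events = [
        (row["person_id"],
         ACTIVITY_NAMES.get(row["procedure_concept_id"], str(row["procedure_concept_id"])),
         row["procedure_date"])
        for row in rows
    ]
    return sorted(events, key=lambda event: (event[0], event[2]))


def directly_follows(event_log: List[Event]) -> Counter:
    """Count how often one activity immediately follows another within a case."""
    counts: Counter = Counter()
    last_activity: Dict[int, str] = {}
    for case_id, activity, _ in event_log:
        if case_id in last_activity:
            counts[(last_activity[case_id], activity)] += 1
        last_activity[case_id] = activity
    return counts


event_log = build_event_log(ROWS)
relations = directly_follows(event_log)
print(relations)
```

Using the patient as the case id corresponds to a patient-journey view; a visit-level analysis would instead group events by a visit identifier.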
Affiliation(s)
- Kangah Park
  - Department of Industrial and Management Engineering, Pohang University of Science and Technology (POSTECH), Pohang, South Korea
- Minsu Cho
  - School of Information Convergence, Kwangwoon University, Seoul, South Korea
- Minseok Song
  - Department of Industrial and Management Engineering, Pohang University of Science and Technology (POSTECH), Pohang, South Korea
  - Corresponding author
- Sooyoung Yoo
  - Healthcare ICT Research Center, Office of eHealth Research and Businesses, Seoul National University Bundang Hospital, Seongnam, South Korea
  - Corresponding author
- Hyunyoung Baek
  - Healthcare ICT Research Center, Office of eHealth Research and Businesses, Seoul National University Bundang Hospital, Seongnam, South Korea
- Seok Kim
  - Healthcare ICT Research Center, Office of eHealth Research and Businesses, Seoul National University Bundang Hospital, Seongnam, South Korea
- Kidong Kim
  - Department of Obstetrics and Gynecology, Seoul National University Bundang Hospital, Seongnam, South Korea
|
8
|
Mavragani A, Lai J, Jin F, Liao X, Zhu H, Yao C. Clinical Source Data Production and Quality Control in Real-world Studies: Proposal for Development of the eSource Record System. JMIR Res Protoc 2022; 11:e42754. [PMID: 36563036 PMCID: PMC9823571 DOI: 10.2196/42754] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Revised: 11/18/2022] [Accepted: 11/21/2022] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND An eSource generally includes the direct capture, collection, and storage of electronic data to simplify clinical research. It can improve data quality and patient safety and reduce clinical trial costs. There has been some eSource-related research progress in relatively large projects. However, most of these studies focused on technical explorations to improve interoperability among systems to reuse retrospective data for research. Few studies have explored source data collection and quality control during prospective data collection from a methodological perspective. OBJECTIVE This study aimed to design a clinical source data collection method that is suitable for real-world studies and meets the data quality standards for clinical research and to improve efficiency when writing electronic medical records (EMRs). METHODS On the basis of our group's previous research experience, TransCelerate BioPharm Inc eSource logical architecture, and relevant regulations and guidelines, we designed a source data collection method and invited relevant stakeholders to optimize it. On the basis of this method, we proposed the eSource record (ESR) system as a solution and invited experts with different roles in the contract research organization company to discuss and design a flowchart for data connection between the ESR and electronic data capture (EDC). RESULTS The ESR method included 5 steps: research project preparation, initial survey collection, in-hospital medical record writing, out-of-hospital follow-up, and electronic case report form (eCRF) traceability. The data connection between the ESR and EDC covered the clinical research process from creating the eCRF to collecting data for the analysis. The intelligent data acquisition function of the ESR will automatically complete the empty eCRF to create an eCRF with values. 
When the clinical research associate and data manager conduct data verification, they can query the certified copy database through interface traceability and send data queries. The data queries are transmitted to the ESR through the EDC interface. The EDC and EMR systems interoperate through the ESR. The EMR and EDC systems transmit data to the ESR system through the data standards of the Health Level Seven Clinical Document Architecture and the Clinical Data Interchange Standards Consortium operational data model, respectively. When the implemented data standards for a given system are not consistent, the ESR will approach the problem by first automating mappings between standards and then handling extensions or corrections to a given data format through human evaluation. CONCLUSIONS The source data collection method proposed in this study will help to realize eSource's new strategy. The ESR solution is standardized and sustainable. It aims to ensure that research data meet the attributable, legible, contemporaneous, original, accurate, complete, consistent, enduring, and available standards for clinical research data quality and to provide a new model for prospective data collection in real-world studies.
Affiliation(s)
- Junkai Lai
  - Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Feifei Jin
  - Trauma Medicine Center, Peking University People's Hospital, Beijing, China
  - Key Laboratory of Trauma treatment and Neural Regeneration, Peking University, Ministry of Education, Beijing, China
  - National Center for Trauma Medicine of China, Beijing, China
- Xiwen Liao
  - Peking University Clinical Research Institute, Peking University First Hospital, Beijing, China
- Huan Zhu
  - Hangzhou LionMed Medical Information Technology Co, Ltd, Hangzhou, China
- Chen Yao
  - Peking University Clinical Research Institute, Peking University First Hospital, Beijing, China
  - Hainan Institute of Real World Data, Qionghai, China
|
9
|
Moberg R, Moyer EJ, Olson D, Rosenthal E, Foreman B. Harmonization of Physiological Data in Neurocritical Care: Challenges and a Path Forward. Neurocrit Care 2022. [PMID: 35641807 DOI: 10.1007/s12028-022-01524-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/20/2021] [Accepted: 04/20/2022] [Indexed: 10/18/2022]
Abstract
Continuous multimodal monitoring in neurocritical care provides valuable insights into the dynamics of the injured brain. Unfortunately, the "readiness" of this data for robust artificial intelligence (AI) and machine learning (ML) applications is low and presents a significant barrier for advancement. Harmonization standards and tools to implement those standards are key to overcoming existing barriers. Consensus in our professional community is essential for success.
|
10
|
Bathelt F, Reinecke I, Peng Y, Henke E, Weidner J, Bartos M, Gött R, Waltemath D, Engelmann K, Schwarz PE, Sedlmayr M. Opportunities of Digital Infrastructures for Disease Management-Exemplified on COVID-19-Related Change in Diagnosis Counts for Diabetes-Related Eye Diseases. Nutrients 2022; 14:2016. [PMID: 35631157 DOI: 10.3390/nu14102016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/20/2022] [Revised: 05/05/2022] [Accepted: 05/08/2022] [Indexed: 01/20/2023] Open
Abstract
Background: Retrospective research on real-world data can provide evidence on specific topics, especially when run across different sites in research networks. Such research networks have become increasingly relevant in recent years, not least because of the special situation caused by the COVID-19 pandemic. An important requirement for these networks is data harmonization that ensures semantic interoperability. Aims: In this paper, we (1) demonstrate how to use digital infrastructures to run a retrospective study in a research network spread across university and non-university hospital sites and (2) answer a medical question on the COVID-19-related change in diagnosis counts for diabetes-related eye diseases. Materials and methods: The study is retrospective and non-interventional and runs on medical case data documented in routine care at the participating sites. The technical infrastructure consists of the OMOP CDM and other OHDSI tools, which are provided in a transferable format. An ETL process was used to transfer and harmonize the data to the OMOP CDM. Cohort definitions for each year under observation were created centrally, applied locally to the medical case data of all participating sites, and analyzed with descriptive statistics. Results: The analyses showed an expected drop in the total number of diagnoses and in diagnoses of diabetes in general, whereas the number of diagnoses of diabetes-related eye diseases, surprisingly, decreased more strongly than that of non-eye diseases. Differences between sites in the relative changes in diagnosis counts show an urgent need to conduct multicenter rather than single-site studies to reduce bias in the data. Conclusions: This study has demonstrated the ability to take an existing portable and standardized infrastructure and ETL process from a university hospital setting and transfer it to non-university sites.
From a medical perspective, further work is needed to evaluate the quality of the real-world data documented in routine care and to investigate the suitability of these data for research.
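The per-site comparison of relative changes in diagnosis counts mentioned in the results can be sketched as follows. This is only a minimal illustration of the descriptive arithmetic, not the study's analysis code: the site names, years, counts, and the helper names `relative_change` and `summarize` are all invented for the example.

```python
# Hypothetical yearly diagnosis counts per site and diagnosis group.
# All numbers are invented and do not come from the study.
COUNTS = {
    "site_A": {"eye": {2019: 200, 2020: 150}, "non_eye": {2019: 1000, 2020: 900}},
    "site_B": {"eye": {2019: 80, 2020: 72}, "non_eye": {2019: 400, 2020: 380}},
}


def relative_change(before, after):
    """Return (after - before) / before, e.g. -0.25 for a 25% drop."""
    if before == 0:
        raise ValueError("baseline count must be non-zero")
    return (after - before) / before


def summarize(counts, baseline_year, comparison_year):
    """Relative change per site and diagnosis group between two years."""
    return {
        site: {
            group: relative_change(by_year[baseline_year], by_year[comparison_year])
            for group, by_year in groups.items()
        }
        for site, groups in counts.items()
    }


if __name__ == "__main__":
    for site, changes in summarize(COUNTS, 2019, 2020).items():
        print(site, {group: f"{value:+.1%}" for group, value in changes.items()})
```

Reporting the change per site, rather than only pooled, makes between-site differences visible, which is the concern the abstract raises about single-site studies.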
|