Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Klann JG, Phillips LC, Herrick C, Joss MAH, Wagholikar KB, Murphy SN. Web services for data warehouses: OMOP and PCORnet on i2b2. J Am Med Inform Assoc 2019;25:1331-1338. [PMID: 30085008 PMCID: PMC6188504 DOI: 10.1093/jamia/ocy093] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 01/07/2018] [Accepted: 06/28/2018] [Indexed: 01/09/2023] Open

Cited by Other Article(s)

Cohen AM, Kaner J, Miller R, Kopesky JW, Hersh W. Automatically pre-screening patients for the rare disease aromatic l-amino acid decarboxylase deficiency using knowledge engineering, natural language processing, and machine learning on a large EHR population. J Am Med Inform Assoc 2024;31:692-704. [PMID: 38134953 PMCID: PMC10873832 DOI: 10.1093/jamia/ocad244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 11/28/2023] [Accepted: 12/01/2023] [Indexed: 12/24/2023] Open

Rajendran S, Pan W, Sabuncu MR, Chen Y, Zhou J, Wang F. Learning across diverse biomedical data modalities and cohorts: Challenges and opportunities for innovation. Patterns (New York, N.Y.) 2024;5:100913. [PMID: 38370129 PMCID: PMC10873158 DOI: 10.1016/j.patter.2023.100913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]

Klann JG, Henderson DW, Morris M, Estiri H, Weber GM, Visweswaran S, Murphy SN. A broadly applicable approach to enrich electronic-health-record cohorts by identifying patients with complete data: a multisite evaluation. J Am Med Inform Assoc 2023;30:1985-1994. [PMID: 37632234 PMCID: PMC10654861 DOI: 10.1093/jamia/ocad166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 07/25/2023] [Accepted: 08/08/2023] [Indexed: 08/27/2023] Open

Scheible R, Thomczyk F, Blum M, Rautenberg M, Prunotto A, Yazijy S, Boeker M. Integrating row level security in i2b2: segregation of medical records into data marts without data replication and synchronization. JAMIA Open 2023;6:ooad068. [PMID: 37583654 PMCID: PMC10425194 DOI: 10.1093/jamiaopen/ooad068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Revised: 07/28/2023] [Accepted: 08/03/2023] [Indexed: 08/17/2023] Open

Sinaci AA, Gencturk M, Teoman HA, Laleci Erturkmen GB, Alvarez-Romero C, Martinez-Garcia A, Poblador-Plou B, Carmona-Pírez J, Löbe M, Parra-Calderon CL. A Data Transformation Methodology to Create Findable, Accessible, Interoperable, and Reusable Health Data: Software Design, Development, and Evaluation Study. J Med Internet Res 2023;25:e42822. [PMID: 36884270 PMCID: PMC10034606 DOI: 10.2196/42822] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/20/2022] [Revised: 01/04/2023] [Accepted: 01/31/2023] [Indexed: 03/09/2023] Open

Abstract

BACKGROUND

Sharing health data is challenging because of several technical, ethical, and regulatory issues. The Findable, Accessible, Interoperable, and Reusable (FAIR) guiding principles have been conceptualized to enable data interoperability. Many studies provide implementation guidelines, assessment metrics, and software to achieve FAIR-compliant data, especially for health data sets. Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) is a health data content modeling and exchange standard.

OBJECTIVE

Our goal was to devise a new methodology to extract, transform, and load existing health data sets into HL7 FHIR repositories in line with FAIR principles, develop a Data Curation Tool to implement the methodology, and evaluate it on health data sets from 2 different but complementary institutions. We aimed to increase the level of compliance with FAIR principles of existing health data sets through standardization and facilitate health data sharing by eliminating the associated technical barriers.

METHODS

Our approach automatically processes the capabilities of a given FHIR end point and directs the user while configuring mappings according to the rules enforced by FHIR profile definitions. Code system mappings can be configured for terminology translations through automatic use of FHIR resources. The validity of the created FHIR resources can be automatically checked, and the software does not allow invalid resources to be persisted. At each stage of our data transformation methodology, we used particular FHIR-based techniques so that the resulting data set could be evaluated as FAIR. We performed a data-centric evaluation of our methodology on health data sets from 2 different institutions.

RESULTS

Through an intuitive graphical user interface, users are prompted to configure the mappings into FHIR resource types with respect to the restrictions of selected profiles. Once the mappings are developed, our approach can syntactically and semantically transform existing health data sets into HL7 FHIR without loss of data utility according to our privacy-concerned criteria. In addition to the mapped resource types, behind the scenes, we create additional FHIR resources to satisfy several FAIR criteria. According to the data maturity indicators and evaluation methods of the FAIR Data Maturity Model, we achieved the maximum level (level 5) for being Findable, Accessible, and Interoperable and level 3 for being Reusable.

CONCLUSIONS

We developed and extensively evaluated our data transformation approach to unlock the value of existing health data residing in disparate data silos to make them available for sharing according to the FAIR principles. We showed that our method can successfully transform existing health data sets into HL7 FHIR without loss of data utility, and the result is FAIR in terms of the FAIR Data Maturity Model. We support institutional migration to HL7 FHIR, which not only leads to FAIR data sharing but also eases the integration with different research networks.
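
As a rough illustration of the map-then-validate flow this abstract describes, the Python sketch below maps one source row to a FHIR-style Patient resource and refuses to persist it when a toy check fails. The column names (`pid`, `sex`, `dob`), the `GENDER_MAP` table, and the helpers `to_patient`, `validate_patient`, and `persist` are hypothetical and are not taken from the authors' Data Curation Tool; real FHIR profile validation covers far more than this check.

```python
from datetime import date

# Hypothetical local-code -> FHIR administrative-gender translation table.
GENDER_MAP = {"M": "male", "F": "female", "O": "other", "U": "unknown"}
ALLOWED_GENDERS = set(GENDER_MAP.values())


def to_patient(row: dict) -> dict:
    """Map one source row (assumed columns: pid, sex, dob) to a Patient-shaped dict."""
    return {
        "resourceType": "Patient",
        "id": str(row["pid"]),
        # Unmapped codes pass through unchanged so validation can flag them.
        "gender": GENDER_MAP.get(row.get("sex"), row.get("sex")),
        "birthDate": row.get("dob"),
    }


def validate_patient(resource: dict) -> list:
    """Return a list of problems; an empty list means the toy check passed."""
    problems = []
    if resource.get("resourceType") != "Patient":
        problems.append("resourceType must be 'Patient'")
    if resource.get("gender") not in ALLOWED_GENDERS:
        problems.append(f"unknown gender code: {resource.get('gender')!r}")
    try:
        date.fromisoformat(resource.get("birthDate") or "")
    except ValueError:
        problems.append("birthDate must be an ISO date (YYYY-MM-DD)")
    return problems


def persist(resource: dict, store: list) -> bool:
    """Append the resource to an in-memory store only if it validates."""
    if validate_patient(resource):
        return False
    store.append(resource)
    return True
```

Keeping validation separate from persistence mirrors the abstract's statement that invalid resources are not allowed to be persisted; the in-memory list stands in for a FHIR repository.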

Synthetic data in health care: A narrative review. PLOS Digital Health 2023;2:e0000082. [PMID: 36812604 PMCID: PMC9931305 DOI: 10.1371/journal.pdig.0000082] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Received: 06/29/2022] [Accepted: 12/06/2022] [Indexed: 01/09/2023]

Wagholikar KB, Ainsworth L, Zelle D, Chaney K, Mendis M, Klann J, Blood AJ, Miller A, Chulyadyo R, Oates M, Gordon WJ, Aronson SJ, Scirica BM, Murphy SN. I2b2-etl: Python application for importing electronic health data into the informatics for integrating biology and the bedside platform. Bioinformatics 2022;38:4833-4836. [PMID: 36053173 PMCID: PMC9563689 DOI: 10.1093/bioinformatics/btac595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Revised: 07/15/2022] [Accepted: 08/31/2022] [Indexed: 11/14/2022] Open

Lenert LA, Zhu V, Jennings L, McCauley JL, Obeid JS, Ward R, Hassanpour S, Marsch LA, Hogarth M, Shipman P, Harris DR, Talbert JC. Enhancing research data infrastructure to address the opioid epidemic: the Opioid Overdose Network (O2-Net). JAMIA Open 2022;5:ooac055. [PMID: 35783072 PMCID: PMC9243402 DOI: 10.1093/jamiaopen/ooac055] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/08/2021] [Revised: 02/11/2022] [Accepted: 06/17/2022] [Indexed: 02/05/2023] Open

Yu Y, Zong N, Wen A, Liu S, Stone DJ, Knaack D, Chamberlain AM, Pfaff E, Gabriel D, Chute CG, Shah N, Jiang G. Developing an ETL tool for converting the PCORnet CDM into the OMOP CDM to facilitate the COVID-19 data integration. J Biomed Inform 2022;127:104002. [PMID: 35077901 PMCID: PMC8791245 DOI: 10.1016/j.jbi.2022.104002] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/14/2021] [Revised: 01/17/2022] [Accepted: 01/18/2022] [Indexed: 11/01/2022]

Abstract

OBJECTIVE

The large-scale collection of observational data and digital technologies could help curb the COVID-19 pandemic. However, the coexistence of multiple Common Data Models (CDMs) and the lack of extract, transform, and load (ETL) tools between different CDMs cause potential interoperability issues between different data systems. The objective of this study is to design, develop, and evaluate an ETL tool that transforms PCORnet CDM format data into the OMOP CDM.

METHODS

We developed an open-source ETL tool to facilitate data conversion from the PCORnet CDM to the OMOP CDM. The ETL tool was evaluated using a dataset of 1000 patients randomly selected from the PCORnet CDM at Mayo Clinic. Information loss, data mapping accuracy, and gap analysis approaches were used to assess the performance of the ETL tool. We designed an experiment to conduct a real-world COVID-19 surveillance task to assess the feasibility of the ETL tool. We also assessed the capacity of the ETL tool for COVID-19 data surveillance using the data collection criteria of the MN EHR Consortium COVID-19 project.

RESULTS

After the ETL process, all the records of 1000 patients from 18 PCORnet CDM tables were successfully transformed into 12 OMOP CDM tables. The information loss for all the concept mapping was less than 0.61%. The string mapping process for the unit concepts lost 2.84% of records. Almost all the fields in the manual mapping process achieved 0% information loss, except the specialty concept mapping. Moreover, the mapping accuracy for all the fields was 100%. The COVID-19 surveillance task collected almost the same set of cases (99.3% overlap) from the original PCORnet CDM and the target OMOP CDM separately. Finally, all the data elements for the MN EHR Consortium COVID-19 project could be captured from both the PCORnet CDM and the OMOP CDM.

CONCLUSION

We demonstrated that our ETL tool could satisfy the data conversion requirements between the PCORnet CDM and the OMOP CDM. The outcome of this work would facilitate data retrieval, communication, sharing, and analysis between different institutions, not only for COVID-19-related projects but also for other real-world evidence-based observational studies.
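
The evaluation metrics reported in this abstract (information loss and case overlap) can be sketched in a few lines of Python. The abstract does not give the exact formulas, so both definitions below are assumptions: loss is taken as the share of source records that did not survive mapping, and overlap as the share of the union of cases found in both data models. The function names are illustrative only.

```python
def information_loss(n_source: int, n_mapped: int) -> float:
    """Percentage of source records that did not survive the mapping (assumed definition)."""
    if n_source <= 0:
        raise ValueError("n_source must be positive")
    return 100.0 * (n_source - n_mapped) / n_source


def overlap_percent(source_cases: set, target_cases: set) -> float:
    """Percentage of the union of cases present in both sets (assumed, Jaccard-style)."""
    union = source_cases | target_cases
    if not union:
        return 100.0  # two empty case sets agree trivially
    return 100.0 * len(source_cases & target_cases) / len(union)
```

For example, `information_loss(1000, 990)` reports 1.0, meaning 1% of records were lost; a different denominator (for instance, only the source cohort) would give a different overlap figure, which is why the definition is flagged as an assumption.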

Wagholikar KB, Zelle D, Ainsworth L, Chaney K, Blood AJ, Miller A, Chulyadyo R, Oates M, Gordon WJ, Aronson SJ, Scirica BM, Murphy SN. Use of automatic SQL generation interface to enhance transparency and validity of health-data analysis. Informatics in Medicine Unlocked 2022;31. [PMID: 35874460 PMCID: PMC9306316 DOI: 10.1016/j.imu.2022.100996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022] Open

Bahmani A, Alavi A, Buergel T, Upadhyayula S, Wang Q, Ananthakrishnan SK, Alavi A, Celis D, Gillespie D, Young G, Xing Z, Nguyen MHH, Haque A, Mathur A, Payne J, Mazaheri G, Li JK, Kotipalli P, Liao L, Bhasin R, Cha K, Rolnik B, Celli A, Dagan-Rosenfeld O, Higgs E, Zhou W, Berry CL, Van Winkle KG, Contrepois K, Ray U, Bettinger K, Datta S, Li X, Snyder MP. A scalable, secure, and interoperable platform for deep data-driven health management. Nat Commun 2021;12:5757. [PMID: 34599181 PMCID: PMC8486823 DOI: 10.1038/s41467-021-26040-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 08/24/2020] [Accepted: 08/23/2021] [Indexed: 11/08/2022] Open

Affiliation(s)

Amir Bahmani: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Center for Genomics and Personalized Medicine, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA
Arash Alavi: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Center for Genomics and Personalized Medicine, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA
Thore Buergel: Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA
Sushil Upadhyayula: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA, USA
Qiwen Wang: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA, USA
Srinath Krishna Ananthakrishnan: Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA
Amir Alavi: Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA
Diego Celis: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA, USA
Dan Gillespie: Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA
Gregory Young: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA
Ziye Xing: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Center for Genomics and Personalized Medicine, Stanford University, Stanford, CA, USA
Minh Hoang Huynh Nguyen: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Center for Genomics and Personalized Medicine, Stanford University, Stanford, CA, USA
Audrey Haque: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Center for Genomics and Personalized Medicine, Stanford University, Stanford, CA, USA
Ankit Mathur: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA, USA
Josh Payne: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA, USA
Ghazal Mazaheri: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA
Jason Kenichi Li: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA, USA
Pramod Kotipalli: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA, USA
Lisa Liao: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA, USA
Rajat Bhasin: Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA
Kexin Cha: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA
Benjamin Rolnik: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA
Alessandra Celli: Department of Genetics, Stanford University, Stanford, CA, USA
Orit Dagan-Rosenfeld: Department of Genetics, Stanford University, Stanford, CA, USA
Emily Higgs: Department of Genetics, Stanford University, Stanford, CA, USA
Wenyu Zhou: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Center for Genomics and Personalized Medicine, Stanford University, Stanford, CA, USA
Camille Lauren Berry: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA
Katherine Grace Van Winkle: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA
Kévin Contrepois: Department of Genetics, Stanford University, Stanford, CA, USA
Utsab Ray: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Center for Genomics and Personalized Medicine, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA
Keith Bettinger: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Center for Genomics and Personalized Medicine, Stanford University, Stanford, CA, USA
Somalee Datta: Technology and Digital Solutions, Stanford Medicine, Stanford, CA, USA
Xiao Li: Department of Genetics, Stanford University, Stanford, CA, USA; Department of Biochemistry, The Center for RNA Science and Therapeutics, Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, OH, USA
Michael P Snyder: Department of Genetics, Stanford University, Stanford, CA, USA; Stanford Center for Genomics and Personalized Medicine, Stanford University, Stanford, CA, USA; Stanford Healthcare Innovation Lab, Stanford University, Stanford, CA, USA

Lenert LA, Ilatovskiy AV, Agnew J, Rudisill P, Jacobs J, Weatherston D, Deans KR. Automated production of research data marts from a canonical fast healthcare interoperability resource data repository: applications to COVID-19 research. J Am Med Inform Assoc 2021;28:1605-1611. [PMID: 33993254 PMCID: PMC8243354 DOI: 10.1093/jamia/ocab108] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/03/2021] [Accepted: 05/14/2021] [Indexed: 11/12/2022] Open

Shang Y, Tian Y, Zhou M, Zhou T, Lyu K, Wang Z, Xin R, Liang T, Zhu S, Li J. EHR-Oriented Knowledge Graph System: Toward Efficient Utilization of Non-Used Information Buried in Routine Clinical Practice. IEEE J Biomed Health Inform 2021;25:2463-2475. [PMID: 34057901 DOI: 10.1109/jbhi.2021.3085003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]

Abstract

Non-used clinical information has negative implications on healthcare quality. Clinicians pay priority attention to clinical information relevant to their specialties during routine clinical practices but may be insensitive or less concerned about information showing disease risks beyond their specialties, resulting in delayed and missed diagnoses or improper management. In this study, we introduced an electronic health record (EHR)-oriented knowledge graph system to efficiently utilize non-used information buried in EHRs. EHR data were transformed into a semantic patient-centralized information model under the ontology structure of a knowledge graph. The knowledge graph then creates an EHR data trajectory and performs reasoning through semantic rules to identify important clinical findings within EHR data. A graphical reasoning pathway illustrates the reasoning footage and explains the clinical significance for clinicians to better understand the neglected information. An application study was performed to evaluate unconsidered chronic kidney disease (CKD) reminding for non-nephrology clinicians to identify important neglected information. The study covered 71,679 patients in non-nephrology departments. The system identified 2,774 patients meeting CKD diagnosis criteria and 10,377 patients requiring high attention. A follow-up study of 5,439 patients showed that 82.1% of patients who met the diagnosis criteria and 61.4% of patients requiring high attention were confirmed to be CKD positive during follow-up research. The application demonstrated that the proposed approach is feasible and effective in clinical information utilization. Additionally, it's valuable as an explainable artificial intelligence to provide interpretable recommendations for specialist physicians to understand the importance of non-used data and make comprehensive decisions.

Kang B, Yoon J, Kim HY, Jo SJ, Lee Y, Kam HJ. Deep-learning-based automated terminology mapping in OMOP-CDM. J Am Med Inform Assoc 2021;28:1489-1496. [PMID: 33987667 DOI: 10.1093/jamia/ocab030] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/26/2020] [Revised: 01/07/2021] [Accepted: 02/05/2021] [Indexed: 11/14/2022] Open

Abstract

OBJECTIVE

Accessing medical data from multiple institutions is difficult owing to the interinstitutional diversity of vocabularies. Standardization schemes, such as the common data model, have been proposed as solutions to this problem, but such schemes require expensive human supervision. This study aims to construct a trainable system that can automate the process of semantic interinstitutional code mapping.

MATERIALS AND METHODS

To automate mapping between source and target codes, we compute the embedding-based semantic similarity between corresponding descriptive sentences. We also implement a systematic approach for preparing training data for similarity computation. Experimental results are compared to traditional word-based mappings.

RESULTS

The proposed model is compared against the state-of-the-art automated matching system, which is called Usagi, of the Observational Medical Outcomes Partnership common data model. By incorporating multiple negative training samples per positive sample, our semantic matching method significantly outperforms Usagi. Its matching accuracy is at least 10% greater than that of Usagi, and this trend is consistent across various top-k measurements.

DISCUSSION

The proposed deep learning-based mapping approach outperforms previous simple word-level matching algorithms because it can account for contextual and semantic information. Additionally, we demonstrate that the manner in which negative training samples are selected significantly affects the overall performance of the system.

CONCLUSION

Incorporating the semantics of code descriptions more significantly increases matching accuracy compared to traditional text co-occurrence-based approaches. The negative training sample collection methodology is also an important component of the proposed trainable system that can be adopted in both present and future related systems.
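
The matching step and the top-k evaluation described in this abstract can be sketched as follows. This is only a minimal illustration, not the authors' model: the vectors here are hand-written toy values standing in for learned sentence embeddings, and `cosine`, `top_k`, and `top_k_accuracy` are hypothetical helper names.

```python
import math


def cosine(u, v) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(source_vec, target_vecs: dict, k: int) -> list:
    """Return the k target codes whose embeddings are most similar to source_vec."""
    ranked = sorted(
        target_vecs,
        key=lambda code: cosine(source_vec, target_vecs[code]),
        reverse=True,
    )
    return ranked[:k]


def top_k_accuracy(source_vecs: dict, gold: dict, target_vecs: dict, k: int) -> float:
    """Fraction of source codes whose gold target appears among the top-k candidates."""
    hits = sum(
        1
        for code, vec in source_vecs.items()
        if gold[code] in top_k(vec, target_vecs, k)
    )
    return hits / len(source_vecs)
```

Reporting accuracy at several values of k, as the abstract does, shows how often the correct target code is at least offered to a human reviewer even when it is not ranked first.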

Lenert LA, Ilatovskiy AV, Agnew J, Rudsill P, Jacobs J, Weatherston D, Deans K. Automated Production of Research Data Marts from a Canonical Fast Healthcare Interoperability Resource (FHIR) Data Repository: Applications to COVID-19 Research. medRxiv: The Preprint Server for Health Sciences 2021. [PMID: 33758877 DOI: 10.1101/2021.03.11.21253384] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/25/2022]

Syed S, Baghal A, Prior F, Zozus M, Al-Shukri S, Syeda HB, Garza M, Begum S, Gates K, Syed M, Sexton KW. Toolkit to Compute Time-Based Elixhauser Comorbidity Indices and Extension to Common Data Models. Healthc Inform Res 2020;26:193-200. [PMID: 32819037 PMCID: PMC7438698 DOI: 10.4258/hir.2020.26.3.193] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/05/2020] [Accepted: 04/17/2020] [Indexed: 01/02/2023] Open

The promise of big data for precision population health management in the US. Public Health 2020;185:110-116. [PMID: 32615477 DOI: 10.1016/j.puhe.2020.04.040] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/24/2019] [Revised: 02/16/2020] [Accepted: 04/30/2020] [Indexed: 11/23/2022]

Abstract

OBJECTIVES

As we enter the year 2020, health data in the United States (US) is still in the process of being curated into a usable format. With coordinated data systems, it becomes possible to answer, with relative certainty, what preventive and medical interventions work in the real world and for whom they might work.

STUDY DESIGN

This is a non-systematic expert review.

METHODS

A non-systematic expert review was undertaken to identify relevant scientific and gray literature on the current state and the limitations of evaluation of health interventions and the health data infrastructure in the US. This review also included the literature on nations with unified data systems. We coupled this review with non-structured interviews of data scientists to gain insight into the progress in establishing the components necessary to support a unified data system and to facilitate data exchange for evaluations, as well as further guide our review. Our goal was to produce a critical analysis of the existing attempts to standardize and use data collected during patient encounters with physicians for public health purposes.

RESULTS

Data obtained from electronic health records are produced in a way that is challenging to use and difficult to compile across platforms in the US. One response to this problem has been to encourage the exchange and standardization of health record information through Distributed Research Networks and Common Data Models (CDMs). These data can be combined with mobile health, social media, and other sources of data to radically transform what we know about the prevention and management of disease. However, issues with the variety of CDMs and growing sense of distrust of institutions that maintain data continue to impede medical progress.

CONCLUSIONS

We present a framework for data use that will allow public health to answer a swath of unanswered research questions that can improve public health practice.

Ci B, Yang DM, Krailo M, Xia C, Yao B, Luo D, Zhou Q, Xiao G, Xu L, Skapek SX, Murray MJ, Amatruda JF, Klosterkemper L, Shaikh F, Faure-Conter C, Fresneau B, Volchenboum SL, Stoneham S, Lopes LF, Nicholson J, Frazier AL, Xie Y. Development of a Data Model and Data Commons for Germ Cell Tumors. JCO Clin Cancer Inform 2020;4:555-566. [PMID: 32568554 PMCID: PMC7328105 DOI: 10.1200/cci.20.00025] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 04/29/2020] [Indexed: 11/20/2022] Open

Affiliation(s)

Bo Ci: Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX
Donghan M. Yang: Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX
Mark Krailo: Keck School of Medicine, University of Southern California, Los Angeles, CA; Children’s Oncology Group, Monrovia, CA
Caihong Xia: Children’s Oncology Group, Monrovia, CA
Bo Yao: Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX
Danni Luo: Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX
Qinbo Zhou: Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX
Guanghua Xiao: Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX; Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX
Lin Xu: Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX
Stephen X. Skapek: Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX; Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX
Matthew J. Murray: Department of Pathology, University of Cambridge, Cambridge, United Kingdom
James F. Amatruda: Keck School of Medicine, University of Southern California, Los Angeles, CA; Cancer and Blood Disease Institute, Children’s Hospital Los Angeles, Los Angeles, CA
Lindsay Klosterkemper: Dana-Farber/Boston Children’s Blood and Cancer Disorders Center, Boston, MA
Furqan Shaikh: Hospital for Sick Children, University of Toronto, Toronto, ON, Canada
Cecile Faure-Conter: Institute of Hematology and Pediatric Oncology, Lyon, France
Brice Fresneau: Department of Pediatric Oncology, Gustave Roussy, University of Paris-Saclay, Villejuif, France
Samuel L. Volchenboum: Center for Research Informatics, Division of Medicine and Biological Sciences, University of Chicago, Chicago, IL
Sara Stoneham: Department of Paediatrics, University College London Hospitals, London, United Kingdom
Luiz Fernando Lopes: Children’s Cancer Hospital, Barretos Cancer Center, Barretos, Brazil
James Nicholson: Department of Paediatric Haematology and Oncology, Cambridge University Hospitals National Health Service Foundation Trust, Cambridge, United Kingdom
A. Lindsay Frazier: Dana-Farber/Boston Children’s Blood and Cancer Disorders Center, Boston, MA
Yang Xie: Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX; Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX; Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX

Danese MD, Halperin M, Duryea J, Duryea R. The Generalized Data Model for clinical research. BMC Med Inform Decis Mak 2019;19:117. [PMID: 31234921 PMCID: PMC6591926 DOI: 10.1186/s12911-019-0837-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 11/12/2017] [Accepted: 06/10/2019] [Indexed: 11/23/2022] Open

Abstract

BACKGROUND

Most healthcare data sources store information within their own unique schemas, making reliable and reproducible research challenging. Consequently, researchers have adopted various data models to improve the efficiency of research. Transforming and loading data into these models is a labor-intensive process that can alter the semantics of the original data. Therefore, we created a data model with a hierarchical structure that simplifies the transformation process and minimizes data alteration.

METHODS

There were two design goals in constructing the tables and table relationships for the Generalized Data Model (GDM). The first was to focus on clinical codes in their original vocabularies to retain the original semantic representation of the data. The second was to retain hierarchical information present in the original data while retaining provenance. The model was tested by transforming synthetic Medicare data; Surveillance, Epidemiology, and End Results data linked to Medicare claims; and electronic health records from the Clinical Practice Research Datalink. We also tested a subsequent transformation from the GDM into the Sentinel data model.

RESULTS

The resulting data model contains 19 tables, with the Clinical Codes, Contexts, and Collections tables serving as the core of the model, and containing most of the clinical, provenance, and hierarchical information. In addition, a Mapping table allows users to apply an arbitrarily complex set of relationships among vocabulary elements to facilitate automated analyses.

CONCLUSIONS

The GDM offers researchers a simpler process for transforming data, clear data provenance, and a path for users to transform their data into other data models. The GDM is designed to retain hierarchical relationships among data elements as well as the original semantic representation of the data, ensuring consistency in protocol implementation as part of a complete data pipeline for researchers.

Kirkendall ES, Ni Y, Lingren T, Leonard M, Hall ES, Melton K. Data Challenges With Real-Time Safety Event Detection And Clinical Decision Support. J Med Internet Res 2019;21:e13047. [PMID: 31120022 PMCID: PMC6549472 DOI: 10.2196/13047] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 12/06/2018] [Revised: 03/04/2019] [Accepted: 04/05/2019] [Indexed: 12/03/2022]

Abstract

Background

The continued digitization and maturation of health care information technology has made access to real-time data easier and feasible for more health care organizations. With this increased availability, the promise of using data to algorithmically detect health care–related events in real time has become more of a reality. However, as more researchers and clinicians utilize real-time data delivery capabilities, it has become apparent that simply gaining access to the data is not a panacea, and some unique data challenges have come to the forefront in the process.

Objective

The aim of this viewpoint was to highlight some of the challenges that are germane to real-time processing of health care system–generated data and the accurate interpretation of the results.

Methods

Distinct challenges related to the use and processing of real-time data for safety event detection were compiled and reported by several informatics and clinical experts at a quaternary pediatric academic institution. The challenges were collated from the experiences of the researchers implementing real-time event detection on more than half a dozen distinct projects. The challenges are presented in a three-part format: challenge category, specific challenge, and example.

Results

In total, 8 major challenge categories were reported, with 13 specific challenges and 9 specific examples detailed to provide context for the challenges. The examples reported are anchored to a specific project using medication order, medication administration record, and smart infusion pump data to detect discrepancies and errors among the 3 datasets.

Conclusions

The use of real-time data to drive safety event detection and clinical decision support is extremely powerful, but it presents its own set of challenges, including data quality and technical complexity. These challenges must be recognized and accommodated if the full promise of accurate, real-time safety event clinical decision support is to be realized.
