1
Priou S, Kempf E, Jankovic M, Lamé G. "Goldmine" or "big mess"? An interview study on the challenges of designing, operating, and ensuring the durability of Clinical Data Warehouses in France and Belgium. J Am Med Inform Assoc 2024; 31:2699-2707. [PMID: 39269930 PMCID: PMC11491596 DOI: 10.1093/jamia/ocae244] [Received: 05/23/2024] [Revised: 07/28/2024] [Accepted: 09/02/2024] [Indexed: 09/15/2024]
Abstract
OBJECTIVES: Clinical Data Warehouses (CDW) are the designated infrastructures to enable access and analysis of large quantities of electronic health record data. Building and managing such systems implies extensive "data work" and coordination between multiple stakeholders. Our study focuses on the challenges these stakeholders face when designing, operating, and ensuring the durability of CDWs for research.
MATERIALS AND METHODS: We conducted semistructured interviews with 21 professionals working with CDWs from France and Belgium. All interviews were recorded, transcribed verbatim, and coded inductively.
RESULTS: Prompted by the AI boom, healthcare institutions launched initiatives to repurpose data they were generating for care without a clear vision of how to generate value. Difficulties in operating CDWs arose quickly, strengthened by the multiplicity and diversity of stakeholders involved and grand discourses on the possibilities of CDWs, disjointed from their actual capabilities. Without proper management of the information flows, stakeholders struggled to build a shared vision. This was evident in our interviewees' contrasting appreciations of what mattered most to ensure data quality. Participants explained they struggled to manage knowledge inside and across institutions, generating knowledge loss, repeated mistakes, and impeding progress locally and nationally.
DISCUSSION AND CONCLUSION: Management issues strongly affect the deployment and operation of CDWs. This may stem from a simplistic linear vision of how this type of infrastructure operates. CDWs remain promising for research, and their design, implementation, and operation require careful management if they are to be successful. Building on innovation management, complex systems, and organizational learning knowledge will help.
Affiliation(s)
- Sonia Priou
- CentraleSupélec, Laboratoire de Génie Industriel, Université Paris-Saclay, 91190 Gif-sur-Yvette, France
- Emmanuelle Kempf
- Department of Medical Oncology, Université Paris Est Créteil, AP-HP, CHU Henri Mondor and Albert Chenevier, 94000 Créteil, France
- Laboratoire d’Informatique Médicale et d’Ingénierie des Connaissances pour la e-Santé, LIMICS, Sorbonne Université, Inserm, Université Sorbonne Paris Nord, 75006 Paris, France
- Marija Jankovic
- CentraleSupélec, Laboratoire de Génie Industriel, Université Paris-Saclay, 91190 Gif-sur-Yvette, France
- Guillaume Lamé
- CentraleSupélec, Laboratoire de Génie Industriel, Université Paris-Saclay, 91190 Gif-sur-Yvette, France
2
Lamer A, Saint-Dizier C, Paris N, Chazard E. Data Lake, Data Warehouse, Datamart, and Feature Store: Their Contributions to the Complete Data Reuse Pipeline. JMIR Med Inform 2024; 12:e54590. [PMID: 39037339 PMCID: PMC11267403 DOI: 10.2196/54590] [Received: 11/27/2023] [Revised: 03/11/2024] [Accepted: 04/05/2024] [Indexed: 07/23/2024]
Abstract
The growing adoption and use of health information technology has generated a wealth of clinical data in electronic format, offering opportunities for data reuse beyond direct patient care. However, as data are distributed across multiple software, it becomes challenging to cross-reference information between sources due to differences in formats, vocabularies, and technologies and the absence of common identifiers among software. To address these challenges, hospitals have adopted data warehouses to consolidate and standardize these data for research. Additionally, as a complement or alternative, data lakes store both source data and metadata in a detailed and unprocessed format, empowering exploration, manipulation, and adaptation of the data to meet specific analytical needs. Subsequently, datamarts are used to further refine data into usable information tailored to specific research questions. However, for efficient analysis, a feature store is essential to pivot and denormalize the data, simplifying queries. In conclusion, while data warehouses are crucial, data lakes, datamarts, and feature stores play essential and complementary roles in facilitating data reuse for research and analysis in health care.
Affiliation(s)
- Antoine Lamer
- Univ. Lille, CHU Lille, ULR 2694-METRICS, Centre d'Etudes et de Recherche en Informatique Médicale, Lille, France
- Fédération régionale de recherche en psychiatrie et santé mentale des Hauts-de-France, Saint-André-lez-Lille, France
- InterHop, Rennes, France
- Chloé Saint-Dizier
- Univ. Lille, CHU Lille, ULR 2694-METRICS, Centre d'Etudes et de Recherche en Informatique Médicale, Lille, France
- Fédération régionale de recherche en psychiatrie et santé mentale des Hauts-de-France, Saint-André-lez-Lille, France
- Emmanuel Chazard
- Univ. Lille, CHU Lille, ULR 2694-METRICS, Centre d'Etudes et de Recherche en Informatique Médicale, Lille, France
3
Karakachoff M, Goronflot T, Coudol S, Toublant D, Bazoge A, Constant Dit Beaufils P, Varey E, Leux C, Mauduit N, Wargny M, Gourraud PA. Implementing a Biomedical Data Warehouse From Blueprint to Bedside in a Regional French University Hospital Setting: Unveiling Processes, Overcoming Challenges, and Extracting Clinical Insight. JMIR Med Inform 2024; 12:e50194. [PMID: 38915177 PMCID: PMC11217163 DOI: 10.2196/50194] [Received: 06/22/2023] [Revised: 04/08/2024] [Accepted: 04/17/2024] [Indexed: 06/26/2024]
Abstract
Background: Biomedical data warehouses (BDWs) have become an essential tool to facilitate the reuse of health data for both research and decisional applications. Beyond technical issues, the implementation of BDWs requires strong institutional data governance and operational knowledge of the European and national legal framework for the management of research data access and use.
Objective: In this paper, we describe the compound process of implementation and the contents of a regional university hospital BDW.
Methods: We present the actions and challenges regarding organizational changes, technical architecture, and shared governance that took place to develop the Nantes BDW. We describe the process to access clinical contents, give details about patient data protection, and use examples to illustrate merging clinical insights.
Results: More than 68 million textual documents and 543 million pieces of coded information concerning approximately 1.5 million patients admitted to CHUN between 2002 and 2022 can be queried and transformed to be made available to investigators. Since its creation in 2018, 269 projects have benefited from the Nantes BDW. Access to data is organized according to data use and regulatory requirements.
Conclusions: Data use is entirely determined by the scientific question posed. It is the vector of legitimacy of data access for secondary use. Enabling access to a BDW is a game changer for research and all operational situations in need of data. Finally, data governance must prevail over technical issues in institution data strategy vis-à-vis care professionals and patients alike.
Affiliation(s)
- Matilde Karakachoff
- Centre d'Investigation Clinique 1413, INSERM, Clinique des données, Pôle Hospitalo-Universitaire 11: Santé Publique, Centre Hospitalier Universitaire Nantes, Nantes Université, Nantes, France
- Thomas Goronflot
- Centre d'Investigation Clinique 1413, INSERM, Clinique des données, Pôle Hospitalo-Universitaire 11: Santé Publique, Centre Hospitalier Universitaire Nantes, Nantes Université, Nantes, France
- Sandrine Coudol
- Centre d'Investigation Clinique 1413, INSERM, Clinique des données, Pôle Hospitalo-Universitaire 11: Santé Publique, Centre Hospitalier Universitaire Nantes, Nantes Université, Nantes, France
- Delphine Toublant
- Centre d'Investigation Clinique 1413, INSERM, Clinique des données, Pôle Hospitalo-Universitaire 11: Santé Publique, Centre Hospitalier Universitaire Nantes, Nantes Université, Nantes, France
- IT Services, Centre Hospitalier Universitaire Nantes, Nantes Université, Nantes, France
- Adrien Bazoge
- Centre d'Investigation Clinique 1413, INSERM, Clinique des données, Pôle Hospitalo-Universitaire 11: Santé Publique, Centre Hospitalier Universitaire Nantes, Nantes Université, Nantes, France
- Unité Mixte de Recherche 6004, Laboratoire des Sciences du Numérique de Nantes, Centre National de Recherche Scientifique, École Centrale Nantes, Nantes Université, Nantes, France
- Pacôme Constant Dit Beaufils
- Centre d'Investigation Clinique 1413, INSERM, Clinique des données, Pôle Hospitalo-Universitaire 11: Santé Publique, Centre Hospitalier Universitaire Nantes, Nantes Université, Nantes, France
- l’institut du thorax, Service de neuroradiologie diagnostique et interventionnelle, Centre Hospitalier Universitaire Nantes, Nantes Université, Nantes, France
- Emilie Varey
- Centre d'Investigation Clinique 1413, INSERM, Clinique des données, Pôle Hospitalo-Universitaire 11: Santé Publique, Centre Hospitalier Universitaire Nantes, Nantes Université, Nantes, France
- Direction de la Recherche et de l’Innovation, Centre Hospitalier Universitaire Nantes, Nantes Université, Nantes, France
- Christophe Leux
- Service d'information médicale, Centre Hospitalier Universitaire Nantes, Nantes Université, Nantes, France
- Nicolas Mauduit
- Service d'information médicale, Centre Hospitalier Universitaire Nantes, Nantes Université, Nantes, France
- Matthieu Wargny
- Centre d'Investigation Clinique 1413, INSERM, Clinique des données, Pôle Hospitalo-Universitaire 11: Santé Publique, Centre Hospitalier Universitaire Nantes, Nantes Université, Nantes, France
- Pierre-Antoine Gourraud
- Centre d'Investigation Clinique 1413, INSERM, Clinique des données, Pôle Hospitalo-Universitaire 11: Santé Publique, Centre Hospitalier Universitaire Nantes, Nantes Université, Nantes, France
- INSERM Center for Research in Transplantation and Translational Immunology, Nantes Université, Nantes, France
4
Priou S, Lame G, Jankovic M, Kempf E. "In conferences, everyone goes 'health data is the future' ": an interview study on challenges in re-using EHR data for research in Clinical Data Warehouses. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2024; 2023:579-588. [PMID: 38222365 PMCID: PMC10785853] [Indexed: 01/16/2024]
Abstract
More and more hospital Clinical Data Warehouses (CDWs) are developed to gain access to EHR data. The rapid growth of investments in CDWs suggests a real potential for innovation in healthcare. However, it is still not confirmed that CDWs will deliver on their promises as researchers working with CDWs face many challenges. To gain a better understanding of these challenges and how to overcome them, we conducted a series of semi-structured interviews with EHR data experts. In this article, we share some initial results from the ongoing interview study. Two main themes emerged from the analysis of the transcripts of the interviews: the importance of infrastructures in terms of data and how it is generated, and the difficulty of making care, clinical research, and data science work together. Finally, based on the experts' experience, several recommendations were identified when using a CDW.
Affiliation(s)
- Sonia Priou
- Université Paris-Saclay, CentraleSupélec, Laboratoire Génie Industriel, France
- Guillaume Lame
- Université Paris-Saclay, CentraleSupélec, Laboratoire Génie Industriel, France
- Marija Jankovic
- Université Paris-Saclay, CentraleSupélec, Laboratoire Génie Industriel, France
- Emmanuelle Kempf
- Université Paris Est Créteil, AP-HP, Department of medical oncology, CHU Henri Mondor and Albert Chenevier, Créteil, France
- Sorbonne Université, Inserm, Université Sorbonne Paris Nord, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances pour la e-Santé, LIMICS, Paris, France