1
Swart E, Gothe H, Ihle P. Bausteine und Strukturen für eine leistungsfähige Real-World-Data-Analyse [Building blocks and structures for powerful real-world data analysis]. Prävention und Gesundheitsförderung 2022. [DOI: 10.1007/s11553-022-01005-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Abstract
Background
Real-world data (RWD) analysis has gained considerable momentum in recent years, driven by expanded capabilities for the (exploratory) analysis of large data volumes and, in parallel, by the growing availability of datasets that were previously inaccessible to researchers.
Objective
The next developmental step for RWD analysis is its scientific establishment through network building, formal organization, and the generation of specific methods. Experience from the development of secondary data analysis, which is part of RWD analysis, can be helpful along the way. We discuss to what extent key steps in that process, and the scientific products that emerged from it, can serve as a model for RWD analysis, using the activities of the Working Group for the Collection and Use of Secondary Data (Arbeitsgruppe Erhebung und Nutzung von Sekundärdaten, AGENS) as an example.
Materials and methods
Potential processes and structures for the further development of RWD analysis are derived from key phases in the development of AGENS over the past 25 years.
Results
The essential characteristics of the current work of AGENS, and thus of the structures of secondary data analysis, are: a) networking among scientists from research and development and representatives of data owners, b) development of specific scientific standards, c) scientific visibility through focus articles, journals, and conference formats, d) dedicated training and continuing education offerings for early-career researchers, e) ongoing improvement of the research field by opening up new datasets.
Conclusion
The development of secondary data analysis, as an important part of RWD analysis, provides starting points for establishing RWD analysis as an independent branch of science.
2
Kollhorst B, Reinders T, Grill S, Eberle A, Intemann T, Kieschke J, Meyer M, Nennecke A, Rathmann W, Pigeot I. Record linkage of claims and cancer registries data - Evaluation of a deterministic linkage approach based on indirect personal identifiers. Pharmacoepidemiol Drug Saf 2022; 31:1287-1293. [PMID: 36129372 DOI: 10.1002/pds.5545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Revised: 07/20/2022] [Accepted: 09/05/2022] [Indexed: 12/15/2022]
Abstract
PURPOSE: In Germany, record linkage of claims and cancer registry data is costly and time-consuming, since until recently no unique personal identifier was available in both data sources. The aim of this study was to evaluate the feasibility and performance of a deterministic linkage procedure based on indirect personal identifiers included in the data sources.
METHODS: We identified users of glucose-lowering drugs residing in four federal states in Northern and Southern Germany (Bavaria, Bremen, Hamburg, Lower Saxony) in the German Pharmacoepidemiological Research Database (GePaRD) and assessed colorectal and thyroid cancer cases. The cancer registries of these federal states selected all colorectal and thyroid cancer cases between 2004 and 2015. A deterministic linkage was performed based on indirect personal identifiers such as year of birth, sex, area of residence, type of cancer, and an absolute difference of at most 90 days between the dates of cancer diagnosis in the two data sources. Results were compared to a probabilistic linkage using "direct" personal identifiers (gold standard).
RESULTS: The deterministic linkage procedure yielded a sensitivity of 71.8% for colorectal cancer and 66.6% for thyroid cancer. For thyroid cancer, the sensitivity improved to 71.4% when only inpatient diagnoses were used to define cancer in GePaRD. Specificity was always above 99%. When the probabilistic linkage was used to define cancer cases, the estimated risk for colorectal cancer was 10 percentage points lower than with the deterministic approach.
CONCLUSIONS: The sensitivity of the deterministic linkage approach appears too low for it to be considered a reasonable alternative to the probabilistic linkage procedure.
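The matching rule described in this abstract can be illustrated with a minimal sketch. It assumes each record carries exactly the indirect identifiers named above. The `Record` fields, the function names, the person-level evaluation, and the toy records are hypothetical and are not taken from the study. How the authors handled ties or multiple candidate matches is not stated, so this sketch simply flags a claims record as linked if at least one registry record satisfies the rule.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    # Hypothetical record layout holding the indirect identifiers from the abstract.
    record_id: str
    birth_year: int
    sex: str
    region: str
    cancer_type: str
    diagnosis_date: date


def is_match(claim: Record, registry: Record, max_days: int = 90) -> bool:
    """Exact agreement on the indirect identifiers plus a diagnosis-date window."""
    return (
        claim.birth_year == registry.birth_year
        and claim.sex == registry.sex
        and claim.region == registry.region
        and claim.cancer_type == registry.cancer_type
        and abs((claim.diagnosis_date - registry.diagnosis_date).days) <= max_days
    )


def deterministic_link(claims, registry, max_days: int = 90) -> set:
    """Return the IDs of claims records with at least one matching registry record."""
    return {
        c.record_id
        for c in claims
        if any(is_match(c, r, max_days) for r in registry)
    }


def sensitivity_specificity(predicted: set, gold: set, all_ids: set):
    """Compare linked IDs against a gold standard (assumed: person-level evaluation)."""
    tp = len(predicted & gold)
    fn = len(gold - predicted)
    negatives = all_ids - gold
    fp = len(negatives & predicted)
    tn = len(negatives - predicted)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy records, made up purely for illustration.
claims = [
    Record("c1", 1950, "f", "Bremen", "colorectal", date(2010, 3, 1)),
    Record("c2", 1948, "m", "Hamburg", "thyroid", date(2011, 1, 10)),
    Record("c3", 1960, "m", "Bavaria", "colorectal", date(2012, 6, 5)),
]
registry = [
    Record("r1", 1950, "f", "Bremen", "colorectal", date(2010, 3, 31)),
    Record("r2", 1948, "m", "Hamburg", "thyroid", date(2011, 5, 10)),  # 120 days apart
]
linked = deterministic_link(claims, registry)
sens, spec = sensitivity_specificity(linked, gold={"c1", "c2"}, all_ids={"c1", "c2", "c3"})
print(linked, sens, spec)
```

In the toy data, the second claims record is a true case in the assumed gold standard but falls outside the 90-day window, which is the kind of miss that lowers sensitivity while leaving specificity untouched.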
Affiliation(s)
- Bianca Kollhorst: Department of Biometry and Data Management, Leibniz Institute for Prevention Research and Epidemiology - BIPS, Bremen, Germany
- Tammo Reinders: Department of Biometry and Data Management, Leibniz Institute for Prevention Research and Epidemiology - BIPS, Bremen, Germany
- Susann Grill: Department of Biometry and Data Management, Leibniz Institute for Prevention Research and Epidemiology - BIPS, Bremen, Germany
- Andrea Eberle: Cancer Registry of Bremen, Leibniz Institute for Prevention Research and Epidemiology - BIPS, Bremen, Germany
- Timm Intemann: Department of Biometry and Data Management, Leibniz Institute for Prevention Research and Epidemiology - BIPS, Bremen, Germany
- Martin Meyer: Bavarian State Office for Food Safety and Health, Nürnberg, Germany
- Wolfgang Rathmann: Institute for Biometrics and Epidemiology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University, Düsseldorf, Germany
- Iris Pigeot: Department of Biometry and Data Management, Leibniz Institute for Prevention Research and Epidemiology - BIPS, Bremen, Germany; Faculty of Mathematics and Computer Science, University of Bremen, Bremen, Germany
3
Scholten N, Ihle P, Pfaff H. [Sustainable Infrastructure for Health Services Research: Development of a Regional SHI Routine Database]. Das Gesundheitswesen 2020; 83:463-469. [PMID: 33184806 DOI: 10.1055/a-1205-0751] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/22/2023]
Abstract
AIM: The scientific use of statutory health insurance (SHI) routine data is increasing, especially in health services research. This raises new questions about how to build databases that can store these data for longitudinal analyses over a longer period and combine data from different SHI funds. Based on the experience gained in setting up the CoRe-Net database, we want to show that it is possible to install such a research infrastructure and make it usable in the long term.
METHODOLOGY/RESULTS: Under the current regulatory framework (e.g. the added specification of § 75 SGB X) and subject to strict data protection criteria, it is possible to set up a database covering several health insurance funds. In CoRe-Net, a pseudonymisation centre and a trust centre were implemented for this purpose, and multiple pseudonymisation was carried out using a one-way hash procedure. Data analyses are only possible after approval by the participating health insurance funds and once valid approval has been obtained from the relevant ethics committees.
CONCLUSION: The amendment of § 75 SGB X in 2018 creates a legal framework for collecting and storing SHI routine data within a research project for future questions within a defined research area.
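Multiple pseudonymisation with a one-way hash can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the hash function, the number of stages, or how keys are divided between the pseudonymisation centre and the trust centre. The use of HMAC-SHA-256, the two stages, and all names and key values below are hypothetical.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Keyed one-way hash (HMAC-SHA-256 is an assumption, not the CoRe-Net method)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def two_stage_pseudonym(insurance_id: str, first_key: bytes, second_key: bytes) -> str:
    """Apply two hashing stages, each with a key held by a different party (assumed)."""
    first_stage = pseudonymise(insurance_id, first_key)  # e.g. at the data holder
    return pseudonymise(first_stage, second_key)         # e.g. at the trust centre


# Hypothetical keys and identifier, for illustration only.
key_a = b"key-held-by-first-party"
key_b = b"key-held-by-second-party"
pseudonym = two_stage_pseudonym("A123456789", key_a, key_b)
print(pseudonym)
```

Because each stage is deterministic for a given key, the same person receives the same pseudonym across data deliveries, which is what makes longitudinal analysis possible, while no single party holding only one key can reproduce the final pseudonym from the original identifier.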
Affiliation(s)
- Nadine Scholten: Institut für Medizinsoziologie, Versorgungsforschung und Rehabilitationswissenschaft (IMVR), Universität zu Köln, Köln
- Peter Ihle: PMV forschungsgruppe an der Medizinischen Fakultät und Uniklinik Köln, Universität zu Köln, Köln
- Holger Pfaff: Institut für Medizinsoziologie, Versorgungsforschung und Rehabilitationswissenschaft (IMVR), Universität zu Köln, Köln
4
March S, Andrich S, Drepper J, Horenkamp-Sonntag D, Icks A, Ihle P, Kieschke J, Kollhorst B, Maier B, Meyer I, Müller G, Ohlmeier C, Peschke D, Richter A, Rosenbusch ML, Scholten N, Schulz M, Stallmann C, Swart E, Wobbe-Ribinski S, Wolter A, Zeidler J, Hoffmann F. Good Practice Data Linkage (GPD): A Translation of the German Version. International Journal of Environmental Research and Public Health 2020; 17:ijerph17217852. [PMID: 33120886 PMCID: PMC7663300 DOI: 10.3390/ijerph17217852] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/17/2020] [Revised: 10/16/2020] [Accepted: 10/22/2020] [Indexed: 12/14/2022]
Abstract
Linkage of different data sources for research purposes has been used increasingly in recent years. However, generally accepted methodological guidance is missing. The aim of this article is to provide methodological guidelines and recommendations for research projects that have been agreed upon across different German research societies. Another aim is to give readers a checklist for the critical appraisal of research proposals and articles. This Good Practice Data Linkage (GPD) was already published in German in 2019, but the aspects it covers can easily be transferred to an international context, especially to other European Union (EU) member states. Therefore, it is now also published in English. Since 2016, an expert panel of members of different German scientific societies has worked together and developed seven guidelines with a total of 27 practical recommendations. The guidelines cover (1) the research objectives, research questions, data sources, and resources; (2) the data infrastructure and data flow; (3) data protection; (4) ethics; (5) the key variables and linkage methods; (6) data validation/quality assurance; and (7) the long-term use of data for questions still to be determined. The authors provide a rationale for each recommendation. Future revisions will include new developments in science and updates of data privacy regulations.
Affiliation(s)
- Stefanie March: Institute for Social Medicine and Health Systems Research (ISMHSR), Medical Faculty, Otto von Guericke University Magdeburg, 39120 Magdeburg, Germany; Department of Social Work, Health and Media, Magdeburg-Stendal University of Applied Sciences, 39114 Magdeburg, Germany
- Silke Andrich: Institute for Health Services Research and Health Economics, Centre for Health and Society, Faculty of Medicine, Heinrich-Heine-University Düsseldorf, 40225 Düsseldorf, Germany; Institute for Health Services Research and Health Economics, German Diabetes Center, Leibniz Center for Diabetes Research at the Heinrich-Heine-University Düsseldorf, 40225 Düsseldorf, Germany
- Johannes Drepper: TMF - Technology, Methods, and Infrastructure for Networked Medical Research, 10117 Berlin, Germany
- Andrea Icks: Institute for Health Services Research and Health Economics, Centre for Health and Society, Faculty of Medicine, Heinrich-Heine-University Düsseldorf, 40225 Düsseldorf, Germany; Institute for Health Services Research and Health Economics, German Diabetes Center, Leibniz Center for Diabetes Research at the Heinrich-Heine-University Düsseldorf, 40225 Düsseldorf, Germany
- Peter Ihle: PMV Research Group, University of Cologne, 50931 Cologne, Germany
- Joachim Kieschke: Epidemiological Cancer Registry of Lower Saxony, Register Center, 26121 Oldenburg, Germany
- Bianca Kollhorst: Leibniz Institute for Prevention Research and Epidemiology - BIPS, Department Biometry and Data Management, 28359 Bremen, Germany
- Birga Maier: Berlin-Brandenburg Myocardial Infarction Registry e. V., 10317 Berlin, Germany
- Ingo Meyer: PMV Research Group, University of Cologne, 50931 Cologne, Germany
- Gabriele Müller: Center for Evidence-Based Healthcare (ZEGV), University Hospital and Faculty of Medicine Carl Gustav Carus, Technical University of Dresden, 01307 Dresden, Germany
- Dirk Peschke: Institute for Public Health and Nursing Research (IPP), University of Bremen, 28359 Bremen, Germany; Department of Applied Health Sciences, University of Health Bochum, 44801 Bochum, Germany
- Adrian Richter: Institute for Community Medicine, Department SHIP-KEF, Greifswald University Medical Center, 17475 Greifswald, Germany
- Marie-Luise Rosenbusch: Central Research Institute for Ambulatory Healthcare in Germany (Zi), Department of Data Science and Healthcare Analyses, 10587 Berlin, Germany
- Nadine Scholten: Institute of Medical Sociology, Health Services Research and Rehabilitation Science (IMVR), Faculty of Human Sciences and Faculty of Medicine, University of Cologne, 50933 Cologne, Germany
- Mandy Schulz: Central Research Institute for Ambulatory Healthcare in Germany (Zi), Department of Data Science and Healthcare Analyses, 10587 Berlin, Germany
- Christoph Stallmann: Institute for Social Medicine and Health Systems Research (ISMHSR), Medical Faculty, Otto von Guericke University Magdeburg, 39120 Magdeburg, Germany
- Enno Swart: Institute for Social Medicine and Health Systems Research (ISMHSR), Medical Faculty, Otto von Guericke University Magdeburg, 39120 Magdeburg, Germany
- Stefanie Wobbe-Ribinski: DAK Gesundheit, Health Services Research and Innovation, 20097 Hamburg, Germany
- Antke Wolter: DAK Gesundheit, Health Services Research and Innovation, 20097 Hamburg, Germany
- Jan Zeidler: Center for Health Economics Research Hanover (CHERH), Leibniz University Hanover, 30159 Hanover, Germany
- Falk Hoffmann: Faculty of Medicine and Health Sciences, Department of Healthcare Research, Carl von Ossietzky University Oldenburg, 26129 Oldenburg, Germany
5
Ahrens W, Greiser KH, Linseisen J, Pischon T, Pigeot I. [The investigation of health outcomes in the German National Cohort: the most relevant endpoints and their assessment]. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz 2020; 63:376-384. [PMID: 32157353 DOI: 10.1007/s00103-020-03111-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/28/2022]
Abstract
The focus of the German National Cohort, the largest population-based cohort study in Germany to date, is the investigation of the most important widespread diseases, such as cardiovascular diseases, diabetes, cancer, neurological and psychiatric disorders, and frequent respiratory and infectious diseases. This cohort will answer questions on the development of these diseases and on the impact of genetic, environmental and lifestyle-related risk factors. Another focus is on the identification of early, subclinical markers of emerging diseases. To answer these questions, a comprehensive assessment of these health outcomes as well as of all potential determinants and precursors is mandatory.
This paper describes the various health outcomes that are assessed in the German National Cohort, as well as the examination modules that are applied for deep phenotyping of study participants. Repeated collection of biosamples as well as functional measurements and application of modern imaging techniques at various time points allow for assessing the dynamics of physiological changes related to the individuals' health status. The prognostic value of these changes for disease development will be explored and translated to novel approaches for prevention and personalised medicine. Incident diseases are being assessed through self-reports by study participants and through record linkage with data from health insurers and cancer registries. Additional information about clinical diagnoses is obtained from the treating physicians to ensure the highest possible validity.
Affiliation(s)
- Wolfgang Ahrens: Leibniz-Institut für Präventionsforschung und Epidemiologie - BIPS, Achterstr. 30, 28359 Bremen, Deutschland; Fachbereich Mathematik und Informatik, Universität Bremen, Bremen, Deutschland
- Karin H Greiser: Abteilung Epidemiologie von Krebserkrankungen, Deutsches Krebsforschungszentrum Heidelberg, Heidelberg, Deutschland
- Jakob Linseisen: Lehrstuhl für Epidemiologie am UNIKA-T, Ludwig-Maximilians-Universität München, Augsburg, Deutschland; Klinische Epidemiologie, Helmholtz Zentrum München, Neuherberg, Deutschland
- Tobias Pischon: Forschergruppe Molekulare Epidemiologie, Max-Delbrück-Centrum für Molekulare Medizin in der Helmholtz-Gemeinschaft (MDC), Berlin, Deutschland
- Iris Pigeot: Leibniz-Institut für Präventionsforschung und Epidemiologie - BIPS, Achterstr. 30, 28359 Bremen, Deutschland; Fachbereich Mathematik und Informatik, Universität Bremen, Bremen, Deutschland
6
Individual Data Linkage of Survey Data with Claims Data in Germany - An Overview Based on a Cohort Study. International Journal of Environmental Research and Public Health 2017; 14:ijerph14121543. [PMID: 29232834 PMCID: PMC5750961 DOI: 10.3390/ijerph14121543] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 10/26/2017] [Revised: 12/01/2017] [Accepted: 12/06/2017] [Indexed: 11/16/2022]
Abstract
Research based on health insurance data has a long tradition in Germany. By contrast, linkage of survey data with such claims data is a relatively new field of research with high potential. Data linkage opens up new opportunities for analyses in health services research and public health. Germany has comprehensive data protection rules and regulations that have to be followed. Therefore, written informed consent is needed for individual data linkage. Additionally, the health system is characterized by heterogeneity of health insurance. The lidA (living at work) study is a cohort study on work, age and health that linked survey data with claims data from a large number of statutory health insurance funds. All health insurance funds for which written consent had been given were contacted. This paper gives an overview of individual data linkage of survey data with German claims data, using the results of the lidA study as an example. The challenges and limitations of data linkage are presented. Despite the heterogeneity, studies of this kind are possible with a negligibly small influence of bias. The experience gained in lidA is described and provides important insights for other studies focusing on data linkage.
7
[Linkage of large secondary and registry data sources with data of cohort studies: usage of a dual potential]. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz 2016; 58:822-828. [PMID: 26063523 DOI: 10.1007/s00103-015-2184-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 10/23/2022]
Abstract
Among epidemiological observational studies, cohort studies provide the best evidence for identifying causal relationships between risk factors and diseases. However, this design has drawbacks that may affect the validity and reliability of the results, in particular systematic errors such as selection bias or recall bias. One way to avoid or counteract some of these drawbacks is to link primary data from cohort studies with secondary and register data. The linkage of these data may also be used for mutual validation. Data that have previously been linked with primary data in cohort studies in Germany were obtained from statutory health insurance and pension funds, the Federal Employment Agency, and cancer registries. All these data have two features in common: First, they all cover detailed information about a large population over a long period of time. Second, all sources are in principle able to provide data on an individual level, so that individual data linkage, e.g. with primary data, is possible. However, the use and linkage of each of these data sources are subject to several limitations. These have to be taken into account, as do the numerous legal restrictions that exist in Germany, especially to prevent the misuse of social data.