1
Kerasidou CX, Malone M, Daly A, Tava F. Machine learning models, trusted research environments and UK health data: ensuring a safe and beneficial future for AI development in healthcare. JOURNAL OF MEDICAL ETHICS 2023; 49:838-843. [PMID: 36997310 DOI: 10.1136/jme-2022-108696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Accepted: 03/11/2023] [Indexed: 06/19/2023]
Abstract
Digitalisation of health and the use of health data in artificial intelligence, and machine learning (ML), including for applications that will then in turn be used in healthcare are major themes permeating current UK and other countries' healthcare systems and policies. Obtaining rich and representative data is key for robust ML development, and UK health data sets are particularly attractive sources for this. However, ensuring that such research and development is in the public interest, produces public benefit and preserves privacy are key challenges. Trusted research environments (TREs) are positioned as a way of balancing the diverging interests in healthcare data research with privacy and public benefit. Using TRE data to train ML models presents various challenges to the balance previously struck between these societal interests, which have hitherto not been discussed in the literature. These challenges include the possibility of personal data being disclosed in ML models, the dynamic nature of ML models and how public benefit may be (re)conceived in this context. For ML research to be facilitated using UK health data, TREs and others involved in the UK health data policy ecosystem need to be aware of these issues and work to address them in order to continue to ensure a 'safe' health and care data environment that truly serves the public.
Affiliation(s)
- Maeve Malone
- Dundee Law School, School of Humanities Social Sciences and Law, University of Dundee, Dundee, UK
- Angela Daly
- Leverhulme Research Centre for Forensic Science, School of Science and Engineering, University of Dundee, Dundee, UK
2
Pearce LA, Borschmann R, Young JT, Kinner SA. Advancing cross-sectoral data linkage to understand and address the health impacts of social exclusion: Challenges and potential solutions. Int J Popul Data Sci 2023; 8:2116. [PMID: 37670956 PMCID: PMC10476462 DOI: 10.23889/ijpds.v8i1.2116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/07/2023] Open
Abstract
The use of administrative health data for research, monitoring, and quality improvement has proliferated in recent decades, leading to improvements in health across many disease areas and across the life course. However, not all populations are equally visible in administrative health data, and those that are less visible may be excluded from the benefits of associated research. Socially excluded populations - including the homeless, people with substance dependence, people involved in sex work, migrants or asylum seekers, and people with a history of incarceration - are typically characterised by health inequity. Yet people who experience social exclusion are often invisible within routinely collected administrative health data because information on their markers of social exclusion are not routinely recorded by healthcare providers. These circumstances make it difficult to understand the often complex health needs of socially excluded populations, evaluate and improve the quality of health services that they interact with, provide more accessible and appropriate health services, and develop effective and integrated responses to reduce health inequity. In this commentary we discuss how linking data from multiple sectors with administrative health data, often called cross-sectoral data linkage, is a key method for systematically identifying socially excluded populations in administrative health data and addressing other issues related to data quality and representativeness. We discuss how cross-sectoral data linkage can improve the representation of socially excluded populations in research, monitoring, and quality improvement initiatives, which can in turn inform coordinated responses across multiple sectors of service delivery. Finally, we articulate key challenges and potential solutions for advancing the use of cross-sectoral data linkage to improve the health of socially excluded populations, using international examples.
Affiliation(s)
- Lindsay A. Pearce
- School of Population Health, Curtin University, Perth, Western Australia, Australia
- Justice Health Group, Centre for Adolescent Health, Murdoch Children’s Research Institute, Melbourne, Victoria, Australia
- Rohan Borschmann
- School of Population Health, Curtin University, Perth, Western Australia, Australia
- Justice Health Group, Centre for Adolescent Health, Murdoch Children’s Research Institute, Melbourne, Victoria, Australia
- Melbourne School of Population and Global Health, University of Melbourne, Melbourne, Victoria, Australia
- Department of Psychiatry, University of Oxford, Oxford, UK
- Melbourne School of Psychological Sciences, The University of Melbourne, Melbourne, Victoria, Australia
- Jesse T. Young
- Melbourne School of Population and Global Health, University of Melbourne, Melbourne, Victoria, Australia
- School of Population and Global Health, The University of Western Australia, Perth, Western Australia, Australia
- National Drug Research Institute, Curtin University, Perth, Western Australia, Australia
- Stuart A. Kinner
- School of Population Health, Curtin University, Perth, Western Australia, Australia
- Justice Health Group, Centre for Adolescent Health, Murdoch Children’s Research Institute, Melbourne, Victoria, Australia
- Melbourne School of Population and Global Health, University of Melbourne, Melbourne, Victoria, Australia
- Griffith Criminology Institute, Griffith University, Brisbane, Queensland, Australia
3
Kavianpour S, Sutherland J, Mansouri-Benssassi E, Coull N, Jefferson E. Next-Generation Capabilities in Trusted Research Environments: Interview Study. J Med Internet Res 2022; 24:e33720. [PMID: 36125859 PMCID: PMC9533202 DOI: 10.2196/33720] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/20/2021] [Revised: 03/22/2022] [Accepted: 05/30/2022] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND A Trusted Research Environment (TRE; also known as a Safe Haven) is an environment supported by trained staff and agreed processes (principles and standards), providing access to data for research while protecting patient confidentiality. Accessing sensitive data without compromising the privacy and security of the data is a complex process. OBJECTIVE This paper presents the security measures, administrative procedures, and technical approaches adopted by TREs. METHODS We contacted 73 TRE operators, 22 (30%) of whom, in the United Kingdom and internationally, agreed to be interviewed remotely under a nondisclosure agreement and to complete a questionnaire about their TRE. RESULTS We observed many similar processes and standards that TREs follow to adhere to the Seven Safes principles. The security processes and TRE capabilities for supporting observational studies using classical statistical methods were mature, and the requirements were well understood. However, we identified limitations in the security measures and capabilities of TREs to support "next-generation" requirements such as wide ranges of data types, ability to develop artificial intelligence algorithms and software within the environment, handling of big data, and timely import and export of data. CONCLUSIONS We found a lack of software or other automation tools to support the community and limited knowledge of how to meet the next-generation requirements from the research community. Disclosure control for exporting artificial intelligence algorithms and software was found to be particularly challenging, and there is a clear need for additional controls to support this capability within TREs.
Affiliation(s)
- Sanaz Kavianpour
- School of Design and Informatics, Abertay University, Dundee, United Kingdom
- James Sutherland
- Health Informatics Centre, University of Dundee, Dundee, United Kingdom
- Natalie Coull
- School of Design and Informatics, Abertay University, Dundee, United Kingdom
- Emily Jefferson
- Health Informatics Centre, University of Dundee, Dundee, United Kingdom
4
Boyer P, Donia J, Whyne C, Burns D, Shaw J. Regulatory regimes and procedural values for health-related motion data in the United States and Canada. HEALTH POLICY AND TECHNOLOGY 2022. [DOI: 10.1016/j.hlpt.2022.100648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
5
Wagner SK, Hughes F, Cortina-Borja M, Pontikos N, Struyven R, Liu X, Montgomery H, Alexander DC, Topol E, Petersen SE, Balaskas K, Hindley J, Petzold A, Rahi JS, Denniston AK, Keane PA. AlzEye: longitudinal record-level linkage of ophthalmic imaging and hospital admissions of 353 157 patients in London, UK. BMJ Open 2022; 12:e058552. [PMID: 35296488 PMCID: PMC8928293 DOI: 10.1136/bmjopen-2021-058552] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Indexed: 12/18/2022] Open
Abstract
PURPOSE Retinal signatures of systemic disease ('oculomics') are increasingly being revealed through a combination of high-resolution ophthalmic imaging and sophisticated modelling strategies. Progress is currently limited not mainly by technical issues, but by the lack of large labelled datasets, a sine qua non for deep learning. Such data are derived from prospective epidemiological studies, in which retinal imaging is typically unimodal, cross-sectional, of modest number and relates to cohorts, which are not enriched with subpopulations of interest, such as those with systemic disease. We thus linked longitudinal multimodal retinal imaging from routinely collected National Health Service (NHS) data with systemic disease data from hospital admissions using a privacy-by-design third-party linkage approach. PARTICIPANTS Between 1 January 2008 and 1 April 2018, 353 157 participants aged 40 years or older, who attended Moorfields Eye Hospital NHS Foundation Trust, a tertiary ophthalmic institution incorporating a principal central site, four district hubs and five satellite clinics in and around London, UK serving a catchment population of approximately six million people. FINDINGS TO DATE Among the 353 157 individuals, 186 651 had a total of 1 337 711 Hospital Episode Statistics admitted patient care episodes. Systemic diagnoses recorded at these episodes include 12 022 patients with myocardial infarction, 11 735 with all-cause stroke and 13 363 with all-cause dementia. A total of 6 261 931 retinal images of seven different modalities and across three manufacturers were acquired from 154 830 patients. The majority of retinal images were retinal photographs (n=1 874 175) followed by optical coherence tomography (n=1 567 358). FUTURE PLANS AlzEye combines the world's largest single institution retinal imaging database with nationally collected systemic data to create an exceptional large-scale, enriched cohort that reflects the diversity of the population served.
First analyses will address cardiovascular diseases and dementia, with a view to identifying hidden retinal signatures that may lead to earlier detection and risk management of these life-threatening conditions.
Affiliation(s)
- Siegfried Karl Wagner
- Institute of Ophthalmology, University College London, London, UK
- NIHR Moorfields Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
- Fintan Hughes
- Department of Anaesthesiology, Duke University Hospital, Durham, North Carolina, USA
- Nikolas Pontikos
- Institute of Ophthalmology, University College London, London, UK
- NIHR Moorfields Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
- Robbert Struyven
- Institute of Ophthalmology, University College London, London, UK
- NIHR Moorfields Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
- Xiaoxuan Liu
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Academic Unit of Ophthalmology, Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Centre for Regulatory Science and Innovation, Birmingham Health Partners, Birmingham, UK
- Hugh Montgomery
- Centre for Human Health and Performance, University College London, London, UK
- Daniel C Alexander
- Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK
- Eric Topol
- Scripps Research Institute, La Jolla, California, USA
- Steffen Erhard Petersen
- William Harvey Research Institute, Queen Mary University of London, London, UK
- Barts Heart Centre, Barts Health NHS Trust, London, UK
- Konstantinos Balaskas
- Institute of Ophthalmology, University College London, London, UK
- NIHR Moorfields Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
- Medical Retina Service, Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Jack Hindley
- Department of Information Governance, University College London, London, UK
- Axel Petzold
- Institute of Ophthalmology, University College London, London, UK
- Institute of Neurology, University College London, London, UK
- Department of Neurophthalmology, Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Jugnoo S Rahi
- Institute of Ophthalmology, University College London, London, UK
- NIHR Moorfields Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
- Great Ormond Street Institute of Child Health, University College London, London, UK
- Great Ormond Street Hospital for Children NHS Foundation Trust, London, UK
- Ulverscroft Vision Research Group, University College London, London, UK
- Alastair K Denniston
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Academic Unit of Ophthalmology, Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Centre for Regulatory Science and Innovation, Birmingham Health Partners, Birmingham, UK
- Pearse A Keane
- Institute of Ophthalmology, University College London, London, UK
- NIHR Moorfields Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
- Medical Retina Service, Moorfields Eye Hospital NHS Foundation Trust, London, UK
6
Gao C, McGilchrist M, Mumtaz S, Hall C, Anderson LA, Zurowski J, Gordon S, Lumsden J, Munro V, Wozniak A, Sibley M, Banks C, Duncan C, Linksted P, Hume A, Stables CL, Mayor C, Caldwell J, Wilde K, Cole C, Jefferson E. A National Network of Safe Havens: Scottish Perspective. J Med Internet Res 2022; 24:e31684. [PMID: 35262495 PMCID: PMC8943560 DOI: 10.2196/31684] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/09/2021] [Revised: 11/18/2021] [Accepted: 12/03/2021] [Indexed: 01/22/2023] Open
Abstract
For over a decade, Scotland has implemented and operationalized a system of Safe Havens, which provides secure analytics platforms for researchers to access linked, deidentified electronic health records (EHRs) while managing the risk of unauthorized reidentification. In this paper, a perspective is provided on the state-of-the-art Scottish Safe Haven network, including its evolution, to define the key activities required to scale the Scottish Safe Haven network's capability to facilitate research and health care improvement initiatives. A set of processes related to EHR data and their delivery in Scotland have been discussed. An interview with each Safe Haven was conducted to understand their services in detail, as well as their commonalities. The results show how Safe Havens in Scotland have protected privacy while facilitating the reuse of the EHR data. This study provides a common definition of a Safe Haven and promotes a consistent understanding among the Scottish Safe Haven network and the clinical and academic research community. We conclude by identifying areas where efficiencies across the network can be made to meet the needs of population-level studies at scale.
Affiliation(s)
- Chuang Gao
- Health Informatics Centre, Ninewells Hospital & Medical School, University of Dundee, Dundee, United Kingdom
- Mark McGilchrist
- Health Informatics Centre, Ninewells Hospital & Medical School, University of Dundee, Dundee, United Kingdom
- Shahzad Mumtaz
- Health Informatics Centre, Ninewells Hospital & Medical School, University of Dundee, Dundee, United Kingdom
- Christopher Hall
- Health Informatics Centre, Ninewells Hospital & Medical School, University of Dundee, Dundee, United Kingdom
- Lesley Ann Anderson
- Centre for Health Data Science, University of Aberdeen, Aberdeen, United Kingdom
- John Zurowski
- Imaging Centre of Excellence, Queen Elizabeth University Hospital, Glasgow, United Kingdom
- Sharon Gordon
- Grampian Data Safe Haven, Aberdeen Centre for Health Data Science, University of Aberdeen, Aberdeen, United Kingdom
- Joanne Lumsden
- Grampian Data Safe Haven, Aberdeen Centre for Health Data Science, University of Aberdeen, Aberdeen, United Kingdom
- Vicky Munro
- Grampian Data Safe Haven, Aberdeen Centre for Health Data Science, University of Aberdeen, Aberdeen, United Kingdom
- Artur Wozniak
- Grampian Data Safe Haven, Aberdeen Centre for Health Data Science, University of Aberdeen, Aberdeen, United Kingdom
- Michael Sibley
- Electronic Data Research and Innovation Service, Public Health Scotland, Edinburgh, United Kingdom
- Christopher Banks
- Electronic Data Research and Innovation Service, Public Health Scotland, Edinburgh, United Kingdom
- Chris Duncan
- Lothian Research Safe Haven, Department of Public Health and Health Policy National Health Service Lothian, Edinburgh, United Kingdom
- Pamela Linksted
- Lothian Research Safe Haven, Department of Public Health and Health Policy National Health Service Lothian, Edinburgh, United Kingdom
- Alastair Hume
- EPCC, University of Edinburgh, Edinburgh, United Kingdom
- Catherine L Stables
- DataLoch, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Charlie Mayor
- Glasgow Safe Haven, Research and Development division of National Health Service Greater Glasgow and Clyde, Glasgow, United Kingdom
- Jacqueline Caldwell
- Electronic Data Research and Innovation Service, Public Health Scotland, Edinburgh, United Kingdom
- Katie Wilde
- Grampian Data Safe Haven, Aberdeen Centre for Health Data Science, University of Aberdeen, Aberdeen, United Kingdom
- Christian Cole
- Health Informatics Centre, Ninewells Hospital & Medical School, University of Dundee, Dundee, United Kingdom
- Emily Jefferson
- Health Informatics Centre, Ninewells Hospital & Medical School, University of Dundee, Dundee, United Kingdom
7
Avraam D, Jones E, Burton P. A deterministic approach for protecting privacy in sensitive personal data. BMC Med Inform Decis Mak 2022; 22:24. [PMID: 35090447 PMCID: PMC8796499 DOI: 10.1186/s12911-022-01754-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2021] [Accepted: 01/09/2022] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND Data privacy is one of the biggest challenges for any organisation which processes personal data, especially in the area of medical research where data include sensitive information about patients and study participants. Sharing of data is therefore problematic, which is at odds with the principle of open data that is so important to the advancement of society and science. Several statistical methods and computational tools have been developed to help data custodians and analysts overcome this challenge. METHODS In this paper, we propose a new deterministic approach for anonymising personal data. The method stratifies the underlying data by the categorical variables and re-distributes the continuous variables through a k nearest neighbours based algorithm. RESULTS We demonstrate the use of the deterministic anonymisation on real data, including data from a sample of Titanic passengers, and data from participants in the 1958 Birth Cohort. CONCLUSIONS The proposed procedure makes data re-identification difficult while minimising the loss of utility (by preserving the spatial properties of the underlying data); the latter means that informative statistical analysis can still be conducted.
Affiliation(s)
- Demetris Avraam
- Population Health Sciences Institute, Newcastle University, Newcastle, UK
- Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Elinor Jones
- Department of Statistical Science, University College London, London, UK
- Paul Burton
- Population Health Sciences Institute, Newcastle University, Newcastle, UK
8
Daniels H, Jones KH, Heys S, Ford DV. Exploring the Use of Genomic and Routinely Collected Data: Narrative Literature Review and Interview Study. J Med Internet Res 2021; 23:e15739. [PMID: 34559060 PMCID: PMC8501405 DOI: 10.2196/15739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2019] [Revised: 10/01/2020] [Accepted: 07/15/2021] [Indexed: 11/13/2022] Open
Abstract
Background Advancing the use of genomic data with routinely collected health data holds great promise for health care and research. Increasing the use of these data is a high priority to understand and address the causes of disease. Objective This study aims to provide an outline of the use of genomic data alongside routinely collected data in health research to date. As this field prepares to move forward, it is important to take stock of the current state of play in order to highlight new avenues for development, identify challenges, and ensure that adequate data governance models are in place for safe and socially acceptable progress. Methods We conducted a literature review to draw information from past studies that have used genomic and routinely collected data and conducted interviews with individuals who use these data for health research. We collected data on the following: the rationale of using genomic data in conjunction with routinely collected data, types of genomic and routinely collected data used, data sources, project approvals, governance and access models, and challenges encountered. Results The main purpose of using genomic and routinely collected data was to conduct genome-wide and phenome-wide association studies. Routine data sources included electronic health records, disease and death registries, health insurance systems, and deprivation indices. The types of genomic data included polygenic risk scores, single nucleotide polymorphisms, and measures of genetic activity, and biobanks generally provided these data. Although the literature search showed that biobanks released data to researchers, the case studies revealed a growing tendency for use within a data safe haven. Challenges of working with these data revolved around data collection, data storage, technical, and data privacy issues. Conclusions Using genomic and routinely collected data holds great promise for progressing health research. 
Several challenges are involved, particularly in terms of privacy. Overcoming these barriers will ensure that the use of these data to progress health research can be exploited to its full potential.
Affiliation(s)
- Helen Daniels
- Population Data Science, Swansea University, Swansea, United Kingdom
- Sharon Heys
- Population Data Science, Swansea University, Swansea, United Kingdom
9
Igumbor JO, Bosire EN, Vicente-Crespo M, Igumbor EU, Olalekan UA, Chirwa TF, Kinyanjui SM, Kyobutungi C, Fonn S. Considerations for an integrated population health databank in Africa: lessons from global best practices. Wellcome Open Res 2021; 6:214. [PMID: 35224211 PMCID: PMC8844538 DOI: 10.12688/wellcomeopenres.17000.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 08/12/2021] [Indexed: 12/17/2022] Open
Abstract
Background: The rising digitisation and proliferation of data sources and repositories cannot be ignored. This trend expands opportunities to integrate and share population health data. Such platforms have many benefits, including the potential to efficiently translate information arising from such data to evidence needed to address complex global health challenges. There are pockets of quality data on the continent that may benefit from greater integration. Integration of data sources is however under-explored in Africa. The aim of this article is to identify the requirements and provide practical recommendations for developing a multi-consortia public and population health data-sharing framework for Africa. Methods: We conducted a narrative review of global best practices and policies on data sharing and its optimisation. We searched eight databases for publications and undertook an iterative snowballing search of articles cited in the identified publications. The Leximancer software © enabled content analysis and selection of a sample of the most relevant articles for detailed review. Themes were developed through immersion in the extracts of selected articles using inductive thematic analysis. We also performed interviews with public and population health stakeholders in Africa to gather their experiences, perceptions, and expectations of data sharing. Results: Our findings described global stakeholder experiences on research data sharing. We identified some challenges and measures to harness available resources and incentivise data sharing. We further highlight progress made by the different groups in Africa and identified the infrastructural requirements and considerations when implementing data sharing platforms. Furthermore, the review suggests key reforms required, particularly in the areas of consenting, privacy protection, data ownership, governance, and data access. 
Conclusions: The findings underscore the critical role of inclusion, social justice, public good, data security, accountability, legislation, reciprocity, and mutual respect in developing a responsive, ethical, durable, and integrated research data sharing ecosystem.
Affiliation(s)
- Jude O. Igumbor
- School of Public Health, University of the Witwatersrand, Johannesburg, Gauteng, 2193, South Africa
- Edna N. Bosire
- School of Public Health, University of the Witwatersrand, Johannesburg, Gauteng, 2193, South Africa
- Marta Vicente-Crespo
- School of Public Health, University of the Witwatersrand, Johannesburg, Gauteng, 2193, South Africa
- African Population and Health Research Centre, Nairobi, Kenya
- Ehimario U. Igumbor
- Nigeria Centre for Disease Control, Abuja, Nigeria
- School of Public Health, University of the Western Cape, Cape Town, Western Cape, South Africa
- Uthman A. Olalekan
- Warwick-Centre for Applied Health Research and Delivery (WCAHRD), Division of Health Sciences, Warwick Medical School, University of Warwick, Coventry, UK
- Tobias F. Chirwa
- School of Public Health, University of the Witwatersrand, Johannesburg, Gauteng, 2193, South Africa
- Sharon Fonn
- School of Public Health, University of the Witwatersrand, Johannesburg, Gauteng, 2193, South Africa
10
Avraam D, Wilson R, Butters O, Burton T, Nicolaides C, Jones E, Boyd A, Burton P. Privacy preserving data visualizations. EPJ DATA SCIENCE 2021; 10:2. [PMID: 33442528 PMCID: PMC7790778 DOI: 10.1140/epjds/s13688-020-00257-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/20/2020] [Accepted: 12/10/2020] [Indexed: 06/12/2023]
Abstract
Data visualizations are a valuable tool used during both statistical analysis and the interpretation of results as they graphically reveal useful information about the structure, properties and relationships between variables, which may otherwise be concealed in tabulated data. In disciplines like medicine and the social sciences, where collected data include sensitive information about study participants, the sharing and publication of individual-level records is controlled by data protection laws and ethico-legal norms. Thus, as data visualizations - such as graphs and plots - may be linked to other released information and used to identify study participants and their personal attributes, their creation is often prohibited by the terms of data use. These restrictions are enforced to reduce the risk of breaching data subject confidentiality, however they limit analysts from displaying useful descriptive plots for their research features and findings. Here we propose the use of anonymization techniques to generate privacy-preserving visualizations that retain the statistical properties of the underlying data while still adhering to strict data disclosure rules. We demonstrate the use of (i) the well-known k-anonymization process which preserves privacy by reducing the granularity of the data using suppression and generalization, (ii) a novel deterministic approach that replaces individual-level observations with the centroids of each k nearest neighbours, and (iii) a probabilistic procedure that perturbs individual attributes with the addition of random stochastic noise. We apply the proposed methods to generate privacy-preserving data visualizations for exploratory data analysis and inferential regression plot diagnostics, and we discuss their strengths and limitations.
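As a rough illustration of approaches (ii) and (iii) described in this abstract, the sketch below masks a small set of 2-D points. It is not the authors' implementation: the function names `knn_centroids` and `add_noise` are hypothetical, Euclidean distance is assumed, each point is assumed to count among its own k nearest neighbours, and Gaussian noise with a fixed standard deviation is an assumed choice for the probabilistic step.

```python
import math
import random


def knn_centroids(points, k):
    """Replace each point with the centroid of its k nearest neighbours.

    Assumption: the point itself counts as one of the k neighbours and
    distances are Euclidean.
    """
    masked = []
    for p in points:
        # Sort all points by distance to p and keep the k closest.
        nearest = sorted(points, key=lambda q: math.dist(p, q))[:k]
        # Average each coordinate over the k neighbours.
        masked.append(tuple(sum(coord) / k for coord in zip(*nearest)))
    return masked


def add_noise(points, sd, seed=0):
    """Perturb every coordinate with zero-mean Gaussian noise (assumed form)."""
    rng = random.Random(seed)
    return [tuple(x + rng.gauss(0.0, sd) for x in p) for p in points]


points = [(0, 0), (0, 2), (2, 0), (10, 10)]
masked = knn_centroids(points, k=3)
noisy = add_noise(points, sd=0.5)
```

In this toy data the outlier at (10, 10) is pulled towards the cluster, which shows how a centroid-based mask can hide extreme individual values while keeping the overall shape of a scatter plot; larger k gives stronger masking at the cost of more distortion.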
Affiliation(s)
- Demetris Avraam
- Population Health Sciences Institute, Newcastle University, Newcastle Upon Tyne, UK
- Department of Business and Public Administration, University of Cyprus, Nicosia, Cyprus
- Rebecca Wilson
- Population Health Sciences Institute, Newcastle University, Newcastle Upon Tyne, UK
- Department of Public Health, Policy and Systems, Institute of Population Health, University of Liverpool, Liverpool, UK
- Oliver Butters
- Population Health Sciences Institute, Newcastle University, Newcastle Upon Tyne, UK
- Department of Public Health, Policy and Systems, Institute of Population Health, University of Liverpool, Liverpool, UK
| | - Thomas Burton
- Department of Computer Science, University of Oxford, Oxford, UK
| | - Christos Nicolaides
- Department of Business and Public Administration, University of Cyprus, Nicosia, Cyprus
- Nireas Research Center, University of Cyprus, Nicosia, Cyprus
- Sloan School of Management, Massachusetts Institute of Technology, Massachusetts, USA
| | - Elinor Jones
- Department of Statistical Science, University College London, London, UK
| | - Andy Boyd
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
| | - Paul Burton
- Population Health Sciences Institute, Newcastle University, Newcastle Upon Tyne, UK
| |
Collapse
|
11
|
Domingues MAP, Camacho R, Rodrigues PP. CMIID: A comprehensive medical information identifier for clinical search harmonization in Data Safe Havens. J Biomed Inform 2020; 114:103669. [PMID: 33359111 DOI: 10.1016/j.jbi.2020.103669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2020] [Revised: 11/28/2020] [Accepted: 12/16/2020] [Indexed: 11/27/2022]
Abstract
Over the last decades clinical research has been driven by informatics changes nourished by distinct research endeavors. Inherent to this evolution, several issues have been the focus of a variety of studies: multi-location patient data access, interoperability between terminological and classification systems, and harmonization of clinical practice and records. With these problems in mind, the Data Safe Haven paradigm emerged to promote a new architecture, better reasoning, and safe and easy access to distinct Clinical Data Repositories. This study aims to present a novel solution for clinical search harmonization within a safe environment, making use of a hybrid coding taxonomy that enables researchers to collect information from multiple repositories based on a clinical domain query definition. Results show that it is possible to query multiple repositories using a single query definition based on clinical domains and the capabilities of the Unified Medical Language System, although it leads to a deterioration of the framework's response times. Participants of a Focus Group and a System Usability Scale questionnaire rated the framework with a median value of 72.5, indicating the hybrid coding taxonomy could be enriched with additional metadata to further improve the refinement of the results and enable the possibility of using this system as a data quality tagging mechanism.
Affiliation(s)
- Rui Camacho
- Faculty of Engineering of the University of Porto, Portugal; LIAAD-INESC TEC, Porto, Portugal
- Pedro Pereira Rodrigues
- CINTESIS - Center for Health Technology and Services Research, Portugal; Faculty of Medicine of the University of Porto, Portugal
12
Nind T, Sutherland J, McAllister G, Hardy D, Hume A, MacLeod R, Caldwell J, Krueger S, Tramma L, Teviotdale R, Abdelatif M, Gillen K, Ward J, Scobbie D, Baillie I, Brooks A, Prodan B, Kerr W, Sloan-Murphy D, Herrera JFR, McManus D, Morris C, Sinclair C, Baxter R, Parsons M, Morris A, Jefferson E. An extensible big data software architecture managing a research resource of real-world clinical radiology data linked to other health data from the whole Scottish population. Gigascience 2020; 9:giaa095. [PMID: 32990744 PMCID: PMC7523405 DOI: 10.1093/gigascience/giaa095] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 03/13/2020] [Revised: 07/28/2020] [Accepted: 08/26/2020] [Indexed: 02/06/2023]
Abstract
AIM To enable a world-leading research dataset of routinely collected clinical images linked to other routinely collected data from the whole Scottish national population. This includes more than 30 million different radiological examinations from a population of 5.4 million and >2 PB of data collected since 2010. METHODS Scotland has a central archive of radiological data used to directly provide clinical care to patients. We have developed an architecture and platform to securely extract a copy of those data, link it to other clinical or social datasets, remove personal data to protect privacy, and make the resulting data available to researchers in a controlled Safe Haven environment. RESULTS An extensive software platform has been developed to host, extract, and link data from cohorts to answer research questions. The platform has been tested on 5 different test cases and is currently being further enhanced to support 3 exemplar research projects. CONCLUSIONS The data available are from a range of radiological modalities and scanner types and were collected under different environmental conditions. These real-world, heterogenous data are valuable for training algorithms to support clinical decision making, especially for deep learning where large data volumes are required. The resource is now available for international research access. The platform and data can support new health research using artificial intelligence and machine learning technologies, as well as enabling discovery science.
Affiliation(s)
- Thomas Nind
- Health Informatics Centre (HIC), School of Medicine, University of Dundee, (Main level 5 corridor), Second Floor, Level 7, Mailbox 15, Ninewells Hospital & Medical School, Dundee DD1 9SY2, UK
- James Sutherland
- Health Informatics Centre (HIC), School of Medicine, University of Dundee, (Main level 5 corridor), Second Floor, Level 7, Mailbox 15, Ninewells Hospital & Medical School, Dundee DD1 9SY2, UK
- Gordon McAllister
- Health Informatics Centre (HIC), School of Medicine, University of Dundee, (Main level 5 corridor), Second Floor, Level 7, Mailbox 15, Ninewells Hospital & Medical School, Dundee DD1 9SY2, UK
- Douglas Hardy
- Health Informatics Centre (HIC), School of Medicine, University of Dundee, (Main level 5 corridor), Second Floor, Level 7, Mailbox 15, Ninewells Hospital & Medical School, Dundee DD1 9SY2, UK
- Ally Hume
- Edinburgh Parallel Computing Centre (EPCC), Edinburgh University, Bayes Centre, 47 Potterrow, Edinburgh EH8 9BT, UK
- Ruairidh MacLeod
- Edinburgh Parallel Computing Centre (EPCC), Edinburgh University, Bayes Centre, 47 Potterrow, Edinburgh EH8 9BT, UK
- Jacqueline Caldwell
- Electronic Data Research and Innovation Service (eDRIS), Public Health Scotland (PHS), Nine, Edinburgh Bioquarter, Little France Road, Edinburgh EH16 4UX, UK
- Susan Krueger
- Health Informatics Centre (HIC), School of Medicine, University of Dundee, (Main level 5 corridor), Second Floor, Level 7, Mailbox 15, Ninewells Hospital & Medical School, Dundee DD1 9SY2, UK
- Leandro Tramma
- Health Informatics Centre (HIC), School of Medicine, University of Dundee, (Main level 5 corridor), Second Floor, Level 7, Mailbox 15, Ninewells Hospital & Medical School, Dundee DD1 9SY2, UK
- Ross Teviotdale
- Health Informatics Centre (HIC), School of Medicine, University of Dundee, (Main level 5 corridor), Second Floor, Level 7, Mailbox 15, Ninewells Hospital & Medical School, Dundee DD1 9SY2, UK
- Mohammed Abdelatif
- Health Informatics Centre (HIC), School of Medicine, University of Dundee, (Main level 5 corridor), Second Floor, Level 7, Mailbox 15, Ninewells Hospital & Medical School, Dundee DD1 9SY2, UK
- Kenny Gillen
- Health Informatics Centre (HIC), School of Medicine, University of Dundee, (Main level 5 corridor), Second Floor, Level 7, Mailbox 15, Ninewells Hospital & Medical School, Dundee DD1 9SY2, UK
- Joe Ward
- Health Informatics Centre (HIC), School of Medicine, University of Dundee, (Main level 5 corridor), Second Floor, Level 7, Mailbox 15, Ninewells Hospital & Medical School, Dundee DD1 9SY2, UK
- Donald Scobbie
- Edinburgh Parallel Computing Centre (EPCC), Edinburgh University, Bayes Centre, 47 Potterrow, Edinburgh EH8 9BT, UK
- Ian Baillie
- Electronic Data Research and Innovation Service (eDRIS), Public Health Scotland (PHS), Nine, Edinburgh Bioquarter, Little France Road, Edinburgh EH16 4UX, UK
- Andrew Brooks
- Edinburgh Parallel Computing Centre (EPCC), Edinburgh University, Bayes Centre, 47 Potterrow, Edinburgh EH8 9BT, UK
- Bianca Prodan
- Edinburgh Parallel Computing Centre (EPCC), Edinburgh University, Bayes Centre, 47 Potterrow, Edinburgh EH8 9BT, UK
- William Kerr
- Edinburgh Parallel Computing Centre (EPCC), Edinburgh University, Bayes Centre, 47 Potterrow, Edinburgh EH8 9BT, UK
- Dominic Sloan-Murphy
- Edinburgh Parallel Computing Centre (EPCC), Edinburgh University, Bayes Centre, 47 Potterrow, Edinburgh EH8 9BT, UK
- Juan F R Herrera
- Edinburgh Parallel Computing Centre (EPCC), Edinburgh University, Bayes Centre, 47 Potterrow, Edinburgh EH8 9BT, UK
- Dan McManus
- Edinburgh Parallel Computing Centre (EPCC), Edinburgh University, Bayes Centre, 47 Potterrow, Edinburgh EH8 9BT, UK
- Carole Morris
- Electronic Data Research and Innovation Service (eDRIS), Public Health Scotland (PHS), Nine, Edinburgh Bioquarter, Little France Road, Edinburgh EH16 4UX, UK
- Carol Sinclair
- Data Driven Innovation, Public Health Scotland (PHS), Gyle Square, 1 South Gyle Crescent, Edinburgh EH12 9EB, UK
- Rob Baxter
- Edinburgh Parallel Computing Centre (EPCC), Edinburgh University, Bayes Centre, 47 Potterrow, Edinburgh EH8 9BT, UK
- Mark Parsons
- Edinburgh Parallel Computing Centre (EPCC), Edinburgh University, Bayes Centre, 47 Potterrow, Edinburgh EH8 9BT, UK
- Andrew Morris
- Health Data Research (HDR) UK, Gibbs Building, 215 Euston Road, London NW1 2BE, UK
- Emily Jefferson
- Health Informatics Centre (HIC), School of Medicine, University of Dundee, (Main level 5 corridor), Second Floor, Level 7, Mailbox 15, Ninewells Hospital & Medical School, Dundee DD1 9SY2, UK
13
Jones K, Daniels H, Heys S, Lacey A, Ford DV. Toward a Risk-Utility Data Governance Framework for Research Using Genomic and Phenotypic Data in Safe Havens: Multifaceted Review. J Med Internet Res 2020; 22:e16346. [PMID: 32412420 PMCID: PMC7260661 DOI: 10.2196/16346] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/20/2019] [Revised: 01/13/2020] [Accepted: 01/30/2020] [Indexed: 02/06/2023]
Abstract
BACKGROUND Research using genomic data opens up new insights into health and disease. Being able to use the data in association with health and administrative record data held in safe havens can multiply the benefits. However, there is much discussion about the use of genomic data with perceptions of particular challenges in doing so safely and effectively. OBJECTIVE This study aimed to work toward a risk-utility data governance framework for research using genomic and phenotypic data in an anonymized form for research in safe havens. METHODS We carried out a multifaceted review drawing upon data governance arrangements in published research, case studies of organizations working with genomic and phenotypic data, public views and expectations, and example studies using genomic and phenotypic data in combination. The findings were contextualized against a backdrop of legislative and regulatory requirements and used to create recommendations. RESULTS We proposed recommendations toward a risk-utility model with a flexible suite of controls to safeguard privacy and retain data utility for research. These were presented as overarching principles aligned to the core elements in the data sharing framework produced by the Global Alliance for Genomics and Health and as practical control measures distilled from published literature and case studies of operational safe havens to be applied as required at a project-specific level. CONCLUSIONS The recommendations presented can be used to contribute toward a proportionate data governance framework to promote the safe, socially acceptable use of genomic and phenotypic data in safe havens. They do not purport to eradicate risk but propose case-by-case assessment with transparency and accountability. If the risks are adequately understood and mitigated, there should be no reason that linked genomic and phenotypic data should not be used in an anonymized form for research in safe havens.
Affiliation(s)
- Kerina Jones
- Population Data Science, Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Helen Daniels
- Population Data Science, Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Sharon Heys
- Population Data Science, Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Arron Lacey
- Population Data Science, Swansea University Medical School, Swansea University, Swansea, United Kingdom
- David V Ford
- Population Data Science, Swansea University Medical School, Swansea University, Swansea, United Kingdom
14
Lovestone S, EMIF Consortium. The European medical information framework: A novel ecosystem for sharing healthcare data across Europe. Learn Health Syst 2020; 4:e10214. [PMID: 32313838 PMCID: PMC7156868 DOI: 10.1002/lrh2.10214] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 06/19/2019] [Revised: 11/27/2019] [Accepted: 11/29/2019] [Indexed: 12/15/2022]
Abstract
INTRODUCTION The European medical information framework (EMIF) was an Innovative Medicines Initiative project jointly supported by the European Union and the European Federation of Pharmaceutical Industries and Associations that generated a common technology and governance framework to identify, assess and (re)use healthcare data, to facilitate real-world data research. The objectives of EMIF included providing a unified platform to support a wide range of studies within two verification programmes: Alzheimer's disease (EMIF-AD) and metabolic consequences of obesity (EMIF-MET). METHODS The EMIF platform was built around two main data-types, electronic health record data and research cohort data, and the platform architecture was composed of a set of tools designed to enable data discovery and characterisation. This included the EMIF catalogue, which allowed users to find relevant data sources, including the data-types collected. Data harmonisation via a common data model was central to the project, especially for population data sources. EMIF also developed an ethical code of practice to ensure data protection, patient confidentiality and compliance with the European Data Protection Directive and GDPR. RESULTS Currently 18 population-based disease agnostic and 60 cohort-based Alzheimer's data partners from across 14 countries are contained within the catalogue, and this will continue to expand. The work conducted in EMIF-AD and EMIF-MET includes standardizing cohorts, summarising baseline characteristics of patients, developing diagnostic algorithms, epidemiological studies, identifying and validating novel biomarkers and selecting potential patient samples for pharmacological intervention. CONCLUSIONS EMIF was designed to provide a sustainable model as demonstrated by the sustainability plans for EMIF-AD. Although network-wide studies using EMIF were not conducted during this project to evaluate its sustainability, learning from EMIF will be used in the follow-on IMI-2 project, European Health Data and Evidence Network (EHDEN). Furthermore, EMIF has facilitated collaborations between partners and continues to promote a wider adoption of principles, technology and architecture through some of its continued work.
Affiliation(s)
- Simon Lovestone
- Neurodegeneration, Janssen R&D, Janssen Pharmaceutica, Beerse, Belgium
15
Demmler JC, Brophy ST, Marchant A, John A, Tan JOA. Shining the light on eating disorders, incidence, prognosis and profiling of patients in primary and secondary care: national data linkage study. Br J Psychiatry 2020; 216:105-112. [PMID: 31256764 PMCID: PMC7557634 DOI: 10.1192/bjp.2019.153] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Indexed: 11/23/2022]
Abstract
BACKGROUND Diagnosing eating disorders can be difficult and few people with the disorder receive specialist services despite the associated high morbidity and mortality. AIMS To examine the burden of eating disorders in the population in terms of incidence, comorbidities and survival. METHOD We used linked electronic health records from general practitioner and hospital admissions in Wales, UK within the Secure Anonymised Information Linkage (SAIL) databank to investigate the incidence of new eating disorder diagnoses. We examined the frequency of comorbid diagnoses and prescribed medications in cases and controls in the 2 years before and 3 years after diagnosis, and performed a survival analysis. RESULTS A total of 15 558 people were diagnosed with eating disorders between 1990 and 2017. The incidence peaked at 24 per 100 000 people in 2003/04. People with eating disorders showed higher levels of other mental disorders (odds ratio 4.32, 95% CI 4.01-4.66) and external causes of morbidity and mortality (odds ratio 2.92, 95% CI 2.44-3.50). They had greater prescription of central nervous system drugs (odds ratio 3.15, 95% CI 2.97-3.33), gastrointestinal drugs (odds ratio 2.61, 95% CI 2.45-2.79) and dietetic drugs (odds ratio 2.42, 95% CI 2.24-2.62) before diagnosis. These excess diagnoses and prescriptions remained 3 years after diagnosis. Mortality was raised compared with controls for some eating disorders, particularly in females with anorexia nervosa. CONCLUSIONS Incidence of diagnosed eating disorders is relatively low in the population but there is a major longer term burden in morbidity and mortality to the individual.
Affiliation(s)
- Joanne C. Demmler
- Lecturer in Health Data Science, Swansea University, UK. Correspondence: Joanne C. Demmler, Data Science Building, College of Medicine, Swansea University, Singleton Park, Swansea SA2 8PP, UK.
- Ann John
- Professor of Public Health and Psychiatry, Swansea University, UK
16
Jones KH, Daniels H, Squires E, Ford DV. Public Views on Models for Accessing Genomic and Health Data for Research: Mixed Methods Study. J Med Internet Res 2019; 21:e14384. [PMID: 31436163 PMCID: PMC6727690 DOI: 10.2196/14384] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/14/2019] [Revised: 07/05/2019] [Accepted: 07/07/2019] [Indexed: 12/31/2022]
Abstract
BACKGROUND The literature abounds with increasing numbers of research studies using genomic data in combination with health data (eg, health records and phenotypic and lifestyle data), with great potential for large-scale research and precision medicine. However, concerns have been raised about social acceptability and risks posed for individuals and their kin. Although there has been public engagement on various aspects of this topic, there is a lack of information about public views on data access models. OBJECTIVE This study aimed to address the lack of information on the social acceptability of access models for reusing genomic data collected for research in conjunction with health data. Models considered were open web-based access, released externally to researchers, and access within a data safe haven. METHODS Views were ascertained using a series of 8 public workshops (N=116). The workshops included an explanation of benefits and risks in using genomic data with health data, a facilitated discussion, and an exit questionnaire. The resulting quantitative data were analyzed using descriptive and inferential statistics, and the qualitative data were analyzed for emerging themes. RESULTS Respondents placed a high value on the reuse of genomic data but raised concerns including data misuse, information governance, and discrimination. They showed a preference for giving consent and use of data within a safe haven over external release or open access. Perceived risks with open access included data being used by unscrupulous parties, with external release included data security, and with safe havens included the need for robust safeguards. CONCLUSIONS This is the first known study exploring public views of access models for reusing anonymized genomic and health data in research. It indicated that people are generally amenable but prefer data safe havens because of perceived sensitivities. We recommend that public views be incorporated into guidance on models for the reuse of genomic and health data.
Affiliation(s)
- Kerina H Jones
- Population Data Science, Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Helen Daniels
- Population Data Science, Swansea University Medical School, Swansea University, Swansea, United Kingdom
- Emma Squires
- Population Data Science, Swansea University Medical School, Swansea University, Swansea, United Kingdom
- David V Ford
- Population Data Science, Swansea University Medical School, Swansea University, Swansea, United Kingdom
17
Henare KL, Parker KE, Wihongi H, Blenkiron C, Jansen R, Reid P, Findlay MP, Lawrence B, Hudson M, Print CG. Mapping a route to Indigenous engagement in cancer genomic research. Lancet Oncol 2019; 20:e327-e335. [DOI: 10.1016/s1470-2045(19)30307-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 03/07/2019] [Revised: 03/28/2019] [Accepted: 04/01/2019] [Indexed: 12/23/2022]
18
Willison DJ, Trowbridge J, Greiver M, Keshavjee K, Mumford D, Sullivan F. Participatory governance over research in an academic research network: the case of Diabetes Action Canada. BMJ Open 2019; 9:e026828. [PMID: 31005936 PMCID: PMC6500288 DOI: 10.1136/bmjopen-2018-026828] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 09/21/2018] [Revised: 02/15/2019] [Accepted: 02/18/2019] [Indexed: 11/04/2022]
Abstract
Digital data generated in the course of clinical care are increasingly being leveraged for a wide range of secondary purposes. Researchers need to develop governance policies that can assure the public that their information is being used responsibly. Our aim was to develop a generalisable model for governance of research emanating from health data repositories that will invoke the trust of the patients and the healthcare professionals whose data are being accessed for health research. We developed our governance principles and processes through literature review and iterative consultation with key actors in the research network including: a data governance working group, the lead investigators and patient advisors. We then recruited persons to participate in the governing and advisory bodies. Our governance process is informed by eight principles: (1) transparency; (2) accountability; (3) follow rule of law; (4) integrity; (5) participation and inclusiveness; (6) impartiality and independence; (7) effectiveness, efficiency and responsiveness and (8) reflexivity and continuous quality improvement. We describe the rationale for these principles, as well as their connections to the subsequent policies and procedures we developed. We then describe the function of the Research Governing Committee, the majority of whom are either persons living with diabetes or physicians whose data are being used, and the patient and data provider advisory groups with whom they consult and communicate. In conclusion, we have developed a values-based information governance framework and process for Diabetes Action Canada that adds value over-and-above existing scientific and ethics review processes by adding a strong patient perspective and contextual integrity. This model is adaptable to other secure data repositories.
Affiliation(s)
- Donald J Willison
- Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
- Joslyn Trowbridge
- Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Michelle Greiver
- Family and Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Family and Community Medicine, North York General Hospital, Toronto, Ontario, Canada
- Frank Sullivan
- Family and Community Medicine, North York General Hospital, Toronto, Ontario, Canada
- School of Medicine, University of St. Andrews, St Andrews, UK
19
Kalkman S, Mostert M, Gerlinger C, van Delden JJM, van Thiel GJMW. Responsible data sharing in international health research: a systematic review of principles and norms. BMC Med Ethics 2019; 20:21. [PMID: 30922290 PMCID: PMC6437875 DOI: 10.1186/s12910-019-0359-9] [Citation(s) in RCA: 69] [Impact Index Per Article: 11.5] [Received: 10/22/2018] [Accepted: 03/12/2019] [Indexed: 11/18/2022]
Abstract
BACKGROUND Large-scale linkage of international clinical datasets could lead to unique insights into disease aetiology and facilitate treatment evaluation and drug development. Hereto, multi-stakeholder consortia are currently designing several disease-specific translational research platforms to enable international health data sharing. Despite the recent adoption of the EU General Data Protection Regulation (GDPR), the procedures for how to govern responsible data sharing in such projects are not at all spelled out yet. In search of a first, basic outline of an ethical governance framework, we set out to explore relevant ethical principles and norms. METHODS We performed a systematic review of literature and ethical guidelines for principles and norms pertaining to data sharing for international health research. RESULTS We observed an abundance of principles and norms with considerable convergence at the aggregate level of four overarching themes: societal benefits and value; distribution of risks, benefits and burdens; respect for individuals and groups; and public trust and engagement. However, at the level of principles and norms we identified substantial variation in the phrasing and level of detail, the number and content of norms considered necessary to protect a principle, and the contextual approaches in which principles and norms are used. CONCLUSIONS While providing some helpful leads for further work on a coherent governance framework for data sharing, the current collection of principles and norms prompts important questions about how to streamline terminology regarding de-identification and how to harmonise the identified principles and norms into a coherent governance framework that promotes data sharing while securing public trust.
Affiliation(s)
- Shona Kalkman
- Department of Medical Humanities, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, 3584, CG, Utrecht, the Netherlands.
- Menno Mostert
- Department of Medical Humanities, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, 3584, CG, Utrecht, the Netherlands
- Christoph Gerlinger
- Statistics and Data Insights, Bayer AG, Berlin, Germany
- Clinic for Gynecology, Obstetrics and Reproductive Medicine, Saarland University Medical Center, Homburg, Saarland, Germany
- Johannes J M van Delden
- Department of Medical Humanities, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, 3584, CG, Utrecht, the Netherlands
- Ghislaine J M W van Thiel
- Department of Medical Humanities, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, 3584, CG, Utrecht, the Netherlands
20
Protect us from poor-quality medical research. Hum Reprod 2019; 33:770-776. [PMID: 29617882 DOI: 10.1093/humrep/dey056] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 12/11/2017] [Accepted: 03/01/2018] [Indexed: 01/22/2023]
Abstract
Much of the published medical research is apparently flawed, cannot be replicated and/or has limited or no utility. This article presents an overview of the current landscape of biomedical research, identifies problems associated with common study designs and considers potential solutions. Randomized clinical trials, observational studies, systematic reviews and meta-analyses are discussed in terms of their inherent limitations and potential ways of improving their conduct, analysis and reporting. The current emphasis on statistical significance needs to be replaced by sound design, transparency and willingness to share data with a clear commitment towards improving the quality and utility of clinical research.
21
Foster J, McLeod J, Nolin J, Greifeneder E. Data work in context: Value, risks, and governance. J Assoc Inf Sci Technol 2018. [DOI: 10.1002/asi.24105] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 12/20/2022]
Affiliation(s)
- Jonathan Foster
- Information School; University of Sheffield; Sheffield South Yorkshire UK
- Julie McLeod
- Department of Computer and Information Sciences; Northumbria University; Newcastle upon Tyne UK
- Jan Nolin
- Swedish School of Library and Information Science; Borås Sweden
- Elke Greifeneder
- Berlin School of Library and Information Science, Humboldt University Berlin; Berlin Germany
22
van Veen EB. Observational health research in Europe: understanding the General Data Protection Regulation and underlying debate. Eur J Cancer 2018; 104:70-80. [PMID: 30336359 DOI: 10.1016/j.ejca.2018.09.032] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 09/27/2018] [Accepted: 09/27/2018] [Indexed: 01/26/2023]
Abstract
Insights into the incidence and survival of cancer, the influence of lifestyle and environmental factors and the interaction of treatment regimens with outcomes are hugely dependent on observational research, patient data derived from the healthcare system and from volunteers participating in cohort studies, often non-selective. Since 25th May 2018, the European General Data Protection Regulation (GDPR) applies to such data. The GDPR focusses on more individual control for data subjects of 'their' data. Yet, the GDPR was preceded by a long debate. The research community participated actively in that debate, and as a result, the GDPR has research exemptions as well. Some of those apply directly; other exemptions need to be implemented into national law. Those exemptions will be discussed together with a general outline of the GDPR. I propose a substantive definition of research-absent in the GDPR-which can warrant its special status in the GDPR. The debate is not over yet. Most legal texts exhibit ambiguity and are interpreted against a background of values. In this case, those could be subsumed under informational self-determination versus solidarity and the deeper meaning of autonomy. Values will also guide national implementation and their interpretation. The value of individual control or informational self-determination should be balanced by nuanced visions about our mutual dependency in healthcare, as an ever-learning system, especially in the European solidarity-based healthcare systems. Good research governance might be a way forward to escape the consent or anonymise dichotomy.
Affiliation(s)
- Evert-Ben van Veen
- MLC Foundation, Dagelijkse Groenmarkt 2, 2513 AL Den Haag, the Netherlands.
|
23
|
Lugg-Widger FV, Angel L, Cannings-John R, Hood K, Hughes K, Moody G, Robling M. Challenges in accessing routinely collected data from multiple providers in the UK for primary studies: Managing the morass. Int J Popul Data Sci 2018; 3:432. [PMID: 34095522 PMCID: PMC8142952 DOI: 10.23889/ijpds.v3i3.432] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 01/31/2023] Open
Abstract
INTRODUCTION: Researchers are increasingly using routinely collected data in addition to, or instead of, other data collection methods. The UK government continues to invest in research centres to encourage use of these data, and trials and cohort studies utilise data linkage methods in the follow-up of participants. This does not come without its limitations and challenges, such as data access delays. OBJECTIVE: This paper outlines the challenges faced by three projects utilising individual-level routinely-collected linked data for the longer-term follow-up of participants. METHODS: These studies are varied in design, study population and data providers. One researcher was common to the three studies and collated relevant study correspondence, formal documentary evidence such as data sharing agreements and, where relevant, meeting records to review. Key themes were identified and reviewed by other members of the research teams. Mitigating strategies were identified and discussed with a data provider representative and a broader group of researchers to finalise the recommendations presented. RESULTS: The challenges discussed are grouped into five themes: Data application process; Project timelines; Dependencies and considerations related to consent; Information Governance; Contractual. In presenting our results descriptively we summarise each case study, identify the main cross-cutting themes and consider the potential for mitigation of challenges. CONCLUSIONS: We make recommendations that identify responsibilities for both researchers and data providers for mitigating and managing data access challenges. A continued conversation within the research community and with data providers is needed to continue to enable researchers to access and utilise the wealth of routinely-collected data available. The suggestions made in this paper will help researchers be better prepared to deal with the challenges of applying for data from multiple data providers.
Affiliation(s)
- Fiona V Lugg-Widger
- Centre for Trials Research, Cardiff University, Neuadd Meirionnydd, Heath Park Way, Cardiff CF14 4YS
- Lianna Angel
- Centre for Trials Research, Cardiff University, Neuadd Meirionnydd, Heath Park Way, Cardiff CF14 4YS
- Rebecca Cannings-John
- Centre for Trials Research, Cardiff University, Neuadd Meirionnydd, Heath Park Way, Cardiff CF14 4YS
- Kerenza Hood
- Centre for Trials Research, Cardiff University, Neuadd Meirionnydd, Heath Park Way, Cardiff CF14 4YS
- Kathryn Hughes
- Division of Population Medicine, Cardiff University, School of Medicine, UHW Main Building, Heath Park, Cardiff, CF14 4XN
- Gwenllian Moody
- Centre for Trials Research, Cardiff University, Neuadd Meirionnydd, Heath Park Way, Cardiff CF14 4YS
- Michael Robling
- Centre for Trials Research, Cardiff University, Neuadd Meirionnydd, Heath Park Way, Cardiff CF14 4YS
|
25
|
Peek N, Rodrigues PP. Three controversies in health data science. International Journal of Data Science and Analytics 2018; 6:261-269. [PMID: 30957010 PMCID: PMC6413491 DOI: 10.1007/s41060-018-0109-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 04/11/2017] [Accepted: 02/24/2018] [Indexed: 12/18/2022]
Abstract
The routine operation of modern healthcare systems produces a wealth of data in electronic health records, administrative databases, clinical registries, and other clinical systems. It is widely acknowledged that there is great potential for utilising these routine data for health research to derive new knowledge about health, disease, and treatments. However, the reuse of routine healthcare data for research is not beyond debate. In this paper, we discuss three issues that have stirred considerable controversy among health data scientists. First, we discuss van der Lei's 1st Law of Medical Informatics, which states that data shall be used only for the purpose for which they were collected. Then, we discuss to which extent routine data sources and innovations in analytical methods alleviate the need to conduct randomised clinical trials. Finally, we address questions of governance, privacy, and trust when routine health data are made available for research. While we don't think that there is a definite "right answer" for any of these issues, we argue that data scientists should be aware of the arguments for different viewpoints, respect their validity, and contribute constructively to the debate. The three controversies discussed in this paper relate to core challenges for research with health data and define an essential research agenda for the health data science community.
Affiliation(s)
- Niels Peek
- Division of Informatics, Imaging, and Data Science, School of Health Sciences, University of Manchester, Manchester, UK
- NIHR Greater Manchester Patient Safety Translational Research Centre, University of Manchester, Manchester, UK
- Pedro Pereira Rodrigues
- Centre for Health Technology and Services Research, Faculty of Medicine, University of Porto, Porto, Portugal
|
26
|
McDonnell L, Delaney BC, Sullivan F. Finding and using routine clinical datasets for observational research and quality improvement. Br J Gen Pract 2018; 68:147-148. [PMID: 29472226 PMCID: PMC5819976 DOI: 10.3399/bjgp18x695237] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/12/2017] [Accepted: 12/18/2017] [Indexed: 10/31/2022] Open
Affiliation(s)
- Lucy McDonnell
- School of Population Health and Environmental Sciences, King's College London, London, UK
- Brendan C Delaney
- Faculty of Medicine, Department of Surgery and Cancer, Imperial College London, London, UK
- Frank Sullivan
- Population and Behavioural Science, University of St Andrews, St Andrews, UK; Gordon F Cheesbrough research chair, North York General Hospital, Toronto, ON, Canada; professor, Department of Family and Community Medicine and Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada; adjunct scientist, Institute for Clinical Evaluative Sciences, Ontario, Canada
|
27
|
Heeney C, Kerr SM. Balancing the local and the universal in maintaining ethical access to a genomics biobank. BMC Med Ethics 2017; 18:80. [PMID: 29282045 PMCID: PMC5745812 DOI: 10.1186/s12910-017-0240-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 03/27/2017] [Accepted: 12/18/2017] [Indexed: 01/11/2023] Open
Abstract
BACKGROUND: Issues of balancing data accessibility with ethical considerations and governance of a genomics research biobank, Generation Scotland, are explored within the evolving policy landscape of the past ten years. During this time data sharing and open data access have become increasingly important topics in biomedical research. Decisions around data access are influenced by local arrangements for governance and practices such as linkage to health records, and the global through policies for biobanking and the sharing of data with large-scale biomedical research data resources and consortia. METHODS: We use a literature review of policy relevant documents which apply to the conduct of biobanks in two areas: support for open access and the protection of data subjects and researchers managing a bioresource. We present examples of decision making within a biobank based upon observations of the Generation Scotland Access Committee. We reflect upon how the drive towards open access raises ethical dilemmas for established biorepositories containing data and samples from human subjects. RESULTS: Despite much discussion in science policy literature about standardisation, the contextual aspects of biobanking are often overlooked. Using our engagement with GS we demonstrate the importance of local arrangements in the creation of a responsive ethical approach to biorepository governance. We argue that governance decisions regarding access to the biobank are intertwined with considerations about maintenance and viability at the local level. We show that in addition to the focus upon ever more universal and standardised practices, the local expertise gained in the management of such repositories must be supported. CONCLUSIONS: A commitment to open access in genomics research has found almost universal backing in science and health policy circles, but repositories of data and samples from human subjects may have to operate under managed access, to protect privacy, align with participant consent and ensure that the resource can be managed in a sustainable way. Data access committees need to be reflexive and flexible, to cope with changing technology and opportunities and threats from the wider data sharing environment. To understand these interactions also involves nurturing what is particular about the biobank in its local context.
Affiliation(s)
- Catherine Heeney
- Science, Technology and Innovation Studies, University of Edinburgh, High School Yards, Edinburgh, Scotland EH1 1LZ UK
- Shona M. Kerr
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road South, Edinburgh, Scotland EH4 2XU UK
|
28
|
Jones KH, Laurie G, Stevens L, Dobbs C, Ford DV, Lea N. The other side of the coin: Harm due to the non-use of health-related data. Int J Med Inform 2016; 97:43-51. [PMID: 27919394 DOI: 10.1016/j.ijmedinf.2016.09.010] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.7] [Received: 03/31/2016] [Revised: 09/15/2016] [Accepted: 09/22/2016] [Indexed: 10/20/2022]
Abstract
INTRODUCTION: It is widely acknowledged that breaches and misuses of health-related data can have serious implications and consequently they often carry penalties. However, harm due to the omission of health data usage, or data non-use, is a subject that lacks attention. A better understanding of this 'other side of the coin' is required before it can be addressed effectively. APPROACH: This article uses an international case study approach to explore why data non-use is difficult to ascertain, the sources and types of health-related data non-use, its implications for citizens and society and some of the reasons it occurs. It does this by focussing on issues with clinical care records, research data and governance frameworks and associated examples of non-use. RESULTS AND DISCUSSION: The non-use of health-related data is a complex issue with multiple explanations. Individual instances of data non-use can be associated with harm, but taken together, they can describe a trail of data non-use that may complicate and compound its impacts. There is ample indirect evidence that health data non-use is implicated in the deaths of many thousands of people and potentially £billions in financial burdens to societies. CONCLUSIONS: Harm due to the non-use of health data is difficult to attribute unequivocally and actual proven evidence is sparse. Although it can be elusive, it is nevertheless a real problem with widespread and serious, if largely unquantifiable, consequences. The most effective initiatives to address specific contexts of data non-use will be those that: firstly, understand the pertinent sources, types and reasons for data non-use in a given domain in order to meet the challenges and create appropriate incentives and repercussions; and secondly, are cognisant of the multiple aspects to this complex issue in other domains to keep benefits and limitations in perspective, to move steadily towards socially responsible reuse of data becoming the norm to save lives and resources.
Affiliation(s)
- Kerina H Jones
- Data Science, Swansea School of Medicine, Swansea University, Swansea SA2 8PP, UK
- Graeme Laurie
- School of Law, University of Edinburgh, Old College, South Bridge, Edinburgh EH8 9YL, UK
- Leslie Stevens
- School of Law, University of Edinburgh, Old College, South Bridge, Edinburgh EH8 9YL, UK
- Christine Dobbs
- Data Science, Swansea School of Medicine, Swansea University, Swansea SA2 8PP, UK
- David V Ford
- Data Science, Swansea School of Medicine, Swansea University, Swansea SA2 8PP, UK
- Nathan Lea
- Centre for Health Informatics, University College London, Gower Street, London WC1E 6BT, UK
|