1
Lloyd KE, Smith SG. Dataset for a qualitative interview study exploring the barriers and facilitators to using and recommending aspirin for cancer prevention. Health Psychol Behav Med 2025; 13:2463916. [PMID: 39959432 PMCID: PMC11827028 DOI: 10.1080/21642850.2025.2463916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2024] [Accepted: 02/04/2025] [Indexed: 02/18/2025] Open
Abstract
Introduction: Aspirin is increasingly recommended for colorectal cancer prevention for people with Lynch syndrome, who are at higher risk. Before starting aspirin, patients should be reviewed by a healthcare professional for contraindications. We conducted interviews to explore the views of people with Lynch syndrome and healthcare professionals on aspirin for cancer prevention. While open data sharing is increasingly implemented for quantitative research, it is less likely to be adopted for qualitative data. We aimed to create and share a qualitative dataset of the interview transcripts in a restricted access repository. Methods: We interviewed 15 people with Lynch syndrome and 23 healthcare professionals. Healthcare professionals included general practitioners (GPs), community pharmacists, genetic counsellors, and specialist hospital clinicians (e.g. genetics, gastroenterology). The interview schedule was informed by the Theoretical Domains Framework. Interviews were conducted over video or telephone. Results: Participants could consent to their anonymised interview transcript being deposited in a restricted repository, with access limited to people using the data for non-commercial research, learning or teaching purposes. Those who did not consent could still participate in the interview. Several transcripts were removed due to identifiability concerns. In total, we deposited 12 transcripts with people with Lynch syndrome, and 8 transcripts with GPs. Discussion: To safeguard participants' identities, we fully anonymised the dataset. While this acted to protect participants' identities, it also potentially reduced its usability due to the removal of rich contextual detail. When sharing qualitative data, it is important to balance confidentiality with data reusability.
Affiliation(s)
- Kelly E. Lloyd
- Leeds Institute of Health Sciences, University of Leeds, Leeds, UK
- Samuel G. Smith
- Leeds Institute of Health Sciences, University of Leeds, Leeds, UK
2
Gore-Gorszewska G. "I'm telling you my story, not publishing a blog": Considerations and suggestions on data sharing in qualitative health psychology research on sensitive topics. J Health Psychol 2024; 29:665-673. [PMID: 38549221 DOI: 10.1177/13591053241239109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2024] Open
Abstract
Qualitative research plays a pivotal role in health psychology, offering insights into the intricacies of health-related issues. However, the specificity of qualitative methodology presents challenges in adhering to standard open science principles, including data sharing. The guidelines to address these issues are limited. Drawing from the author's experience in conducting in-depth interviews with middle-aged and older adults regarding their sexuality, this article discusses various challenges in implementing data sharing requirements. It emphasizes factors like participants' reasonable reluctance to share in specific populations, the depth of personal information gleaned from comprehensive interviews, concerns surrounding potential data misuse both within and outside academic circles, and the complex issue of obtaining informed consent. A universal approach to data sharing in qualitative research proves impractical, emphasizing the necessity for adaptable, context-specific guidelines that acknowledge the methodology's nuances. Striking a balance between transparency and ethical responsibility requires tailored strategies and thoughtful consideration.
Affiliation(s)
- Gabriela Gore-Gorszewska
- Institute of Psychology, Faculty of Philosophy, Jagiellonian University, Kraków, Poland
- Psychology Research Institute, Faculty of Social Studies, Masaryk University, Brno, Czechia
3
Khan S, Hirsch JS, Zeltzer-Zubida O. A dataset without a code book: ethnography and open science. Frontiers in Sociology 2024; 9:1308029. [PMID: 38505356 PMCID: PMC10949981 DOI: 10.3389/fsoc.2024.1308029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Accepted: 02/20/2024] [Indexed: 03/21/2024]
Abstract
This paper reflects upon calls for "open data" in ethnography, drawing on our experiences doing research on sexual violence. The core claim of this paper is not that open data is undesirable; it is that there is a lot we must know before we presume its benefits apply to ethnographic research. The epistemic and ontological foundation of open data is grounded in a logic that is not always consistent with that of ethnographic practice. We begin by identifying three logics of open data (epistemic, political-economic, and regulatory), which each address a perceived problem with knowledge production and point to open science as the solution. We then evaluate these logics in the context of the practice of ethnographic research. Claims that open data would improve data quality are, in our assessment, potentially reversed: in our own ethnographic work, open data practices would likely have compromised our data quality. And protecting subject identities would have meant creating accessible data that would not allow for replication. For ethnographic work, open data would be like having the data set without the codebook. Before we adopt open data to improve the quality of science, we need to answer a series of questions about what open data does to data quality. Rather than blindly make a normative commitment to a principle, we need empirical work on the impact of such practices, work which must be done with respect to the different epistemic cultures' modes of inquiry. Ethnographers, as well as the institutions that fund and regulate ethnographic research, should only embrace open data after the subject has been researched and evaluated within our own epistemic community.
Affiliation(s)
- Shamus Khan
- Departments of Sociology and American Studies, Princeton University, Princeton, NJ, United States
- Jennifer S. Hirsch
- Mailman School of Public Health, Columbia University, New York City, NY, United States
4
DuBois JM, Mozersky J, Parsons M, Walsh HA, Friedrich A, Pienta A. Exchanging words: Engaging the challenges of sharing qualitative research data. Proc Natl Acad Sci U S A 2023; 120:e2206981120. [PMID: 37831745 PMCID: PMC10614603 DOI: 10.1073/pnas.2206981120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2023] Open
Abstract
In January 2023, a new NIH policy on data sharing went into effect. The policy applies to both quantitative and qualitative research (QR) data such as data from interviews or focus groups. QR data are often sensitive and difficult to deidentify, and thus have rarely been shared in the United States. Over the past 5 y, our research team has engaged stakeholders on QR data sharing, developed software to support data deidentification, produced guidance, and collaborated with the ICPSR data repository to pilot the deposit of 30 QR datasets. In this perspective article, we share important lessons learned by addressing eight clusters of questions on issues such as where, when, and what to share; how to deidentify data and support high-quality secondary use; budgeting for data sharing; and the permissions needed to share data. We also offer a brief assessment of the state of preparedness of data repositories, QR journals, and QR textbooks to support data sharing. While QR data sharing could yield important benefits to the research community, we quickly need to develop enforceable standards, expertise, and resources to support responsible QR data sharing. Absent these resources, we risk violating participant confidentiality and wasting a significant amount of time and funding on data that are not useful for either secondary use or data transparency and verification.
Affiliation(s)
- James M. DuBois
- Bioethics Research Center, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110
- Jessica Mozersky
- Bioethics Research Center, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110
- Meredith Parsons
- Bioethics Research Center, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110
- Heidi A. Walsh
- Bioethics Research Center, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110
- Annie Friedrich
- Bioethics Research Center, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110
- Amy Pienta
- ICPSR, Institute for Social Research, University of Michigan, Ann Arbor, MI 48106
5
Huma B, Joyce JB. 'One size doesn't fit all': Lessons from interaction analysis on tailoring Open Science practices to qualitative research. British Journal of Social Psychology 2023; 62:1590-1604. [PMID: 35953889 DOI: 10.1111/bjso.12568] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/21/2021] [Revised: 07/20/2022] [Accepted: 07/28/2022] [Indexed: 11/25/2022]
Abstract
The Open Science Movement aims to enhance the soundness, transparency, and accessibility of scientific research, and at the same time increase public trust in science. Currently, Open Science practices are mainly presented as solutions to the 'reproducibility crisis' in hypothetico-deductive quantitative research. Increasing interest has been shown towards exploring how these practices can be adopted by qualitative researchers. In reviewing this emerging body of work, we conclude that the issue of diversity within qualitative research has not been adequately addressed. Furthermore, we find that many of these endeavours start with existing solutions for which they are trying to find matching problems to be solved. We contrast this approach with a natural incorporation of Open Science practices within interaction analysis and its constituent research traditions: conversation analysis, discursive psychology, ethnomethodology, and membership categorisation analysis. Zooming in on the development of conversation analysis starting in the 1960s, we highlight how practices for opening up and sharing data and analytic thinking have been embedded into its methodology. On the basis of this presentation, we propose a series of lessons learned for adopting Open Science practices in qualitative research.
Affiliation(s)
- Bogdana Huma
- Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
6
Mikkelsen JG, Sørensen NL, Merrild CH, Jensen MB, Thomsen JL. Patient perspectives on data sharing regarding implementing and using artificial intelligence in general practice - a qualitative study. BMC Health Serv Res 2023; 23:335. [PMID: 37016412 PMCID: PMC10071604 DOI: 10.1186/s12913-023-09324-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/07/2022] [Accepted: 03/22/2023] [Indexed: 04/06/2023] Open
Abstract
BACKGROUND: Due to more elderly and patients with complex illnesses, there is an increasing pressure on the healthcare system. General practice especially feels this pressure as being the first point of contact for the patients. Developments in digitalization have undergone fast progress and data-driven artificial intelligence (AI) has shown great potential for use in general practice. To develop AI as a support tool for general practitioners (GPs), access to patients' health data is needed, but patients have concerns regarding data sharing. Furthermore, studies show that trust is important regarding the patient-GP relationship, data sharing, and AI. The aim of this paper is to uncover patient perspectives on trust regarding the patient-GP relationship, data sharing and AI in general practice. METHOD: This study investigated 10 patients' perspectives through qualitative interviews and written vignettes were chosen to elicit the patients (interviewees) perspectives on topics that they were not familiar with prior to the interviews. The study specifically investigated perspectives on 1) The patient-GP relationship, 2) data sharing regarding developing AI for general practice, and 3) implementation and use of AI in general practice using thematic analysis. The study took place in the North Denmark Region and the interviewees included had to be registered in general practice and be above 18 years in age. We included four men between 25 to 74 years in age and six women between 27 to 46 years in age. RESULTS: The interviewees expressed a high level of trust towards their GP and were willing to share their health data with their GP. The interviewees believed that AI could be a great help to GPs if used as a support tool in general practice. However, it was important for the interviewees that the GP would still be the primary decision maker. CONCLUSION: Patients may be willing to share health data to help implement and use AI in general practice. If AI is implemented in a way that preserves the patient-GP relationship and used as a support tool for the GP, our results indicate that patients may be positive towards the use of AI in general practice.
7
Campbell R, Goodman-Williams R, Javorka M, Engleton J, Gregory K. Understanding Sexual Assault Survivors’ Perspectives on Archiving Qualitative Data: Implications for Feminist Approaches to Open Science. Psychology of Women Quarterly 2022. [DOI: 10.1177/03616843221131546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The open science movement has framed data sharing as necessary and achievable best practices for high-quality science. Feminist psychologists have complicated that narrative by questioning the purpose of data sharing across different paradigms, methodologies, and research populations. In these debates, the academic community has centered the needs and voices of researchers, and participants’ perspectives are largely missing from this literature. In this study, we sought to understand how research participants feel about sharing qualitative data on a sensitive subject—sexual victimization. As part of a participatory action research project, we conducted qualitative interviews with sexual assault survivors about their post-assault help-seeking experiences. The federal funding agency that supported this project requires researchers to archive de-identified data in a national data repository (the National Archive of Criminal Justice Data [NACJD]). All participants consented to archiving data, and the vast majority expressed positive views about data sharing because they wanted to help other survivors. Participants emphasized that our participatory action research approach and our stated goal of helping survivors were important considerations in their decisions regarding data sharing. Researchers should obtain informed consent from their participants for data sharing/archiving, and discuss their dissemination plans during the informed consent process.
Affiliation(s)
- Rebecca Campbell
- Department of Psychology, Michigan State University, East Lansing, MI, USA
- McKenzie Javorka
- Department of Psychology, Michigan State University, East Lansing, MI, USA
- Jasmine Engleton
- Department of Psychology, Michigan State University, East Lansing, MI, USA
- Katie Gregory
- Department of Psychology, Michigan State University, East Lansing, MI, USA
8
Freese J, Rauf T, Voelkel JG. Advances in transparency and reproducibility in the social sciences. Social Science Research 2022; 107:102770. [PMID: 36058608 DOI: 10.1016/j.ssresearch.2022.102770] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/03/2022] [Revised: 06/23/2022] [Accepted: 06/24/2022] [Indexed: 06/15/2023]
Abstract
Worries about a "credibility crisis" besieging science have ignited interest in research transparency and reproducibility as ways of restoring trust in published research. For quantitative social science, advances in transparency and reproducibility can be seen as a set of developments whose trajectory predates the recent alarm. We discuss several of these developments, including preregistration, data-sharing, formal infrastructure in the form of resources and policies, open access to research, and specificity regarding research contributions. We also discuss the spillovers of this predominantly quantitative effort towards transparency for qualitative research. We conclude by emphasizing the importance of mutual accountability for effective science, the essential role of openness for this accountability, and the importance of scholarly inclusiveness in figuring out the best ways for openness to be accomplished in practice.
9
Dolan EH, Shiells K, Goulding J, Skatova A. Public attitudes towards sharing loyalty card data for academic health research: a qualitative study. BMC Med Ethics 2022; 23:58. [PMID: 35672737 PMCID: PMC9171733 DOI: 10.1186/s12910-022-00795-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/22/2021] [Accepted: 05/04/2022] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND: A growing number of studies show the potential of loyalty card data for use in health research. However, research into public perceptions of using this data is limited. This study aimed to investigate public attitudes towards donating loyalty card data for academic health research, and the safeguards the public would want to see implemented. The way in which participant attitudes varied according to whether loyalty card data would be used for either cancer or COVID-19 research was also examined. METHODS: Participants (N = 40) were recruited via Prolific Academic to take part in semi-structured telephone interviews, with questions focused on data sharing related to either COVID-19 or ovarian/bowel cancer as the proposed health condition to be researched. Content analysis was used to identify sub-themes corresponding to the two a priori themes, attitudes and safeguards. RESULTS: Participant attitudes were found to fall into two categories, either rational or emotional. Under rational, most participants were in favour of sharing loyalty card data. Support of health research was seen as an important reason to donate such data, with loyalty card logs being considered as already within the public domain. With increased understanding of research purpose, participants expressed higher willingness to donate data. Within the emotional category, participants shared fears about revealing location information and of third parties obtaining their data. With regards to safeguards, participants described the importance of anonymisation and the level of data detail; the control, convenience and choice they desired in sharing data; and the need for transparency and data security. The change in hypothetical purpose of the data sharing, from COVID-19 to cancer research, had no impact on participants' decision to donate, although did affect their understanding of how loyalty card data could be used. CONCLUSIONS: Based on interviews with the public, this study contributes recommendations for those researchers and the wider policy community seeking to obtain loyalty card data for health research. Whilst participants were largely in favour of donating loyalty card data for academic health research, information, choice and appropriate safeguards are all exposed as prerequisites upon which decisions are made.
Affiliation(s)
- Elizabeth H Dolan
- N/LAB, Nottingham University Business School, University of Nottingham, Si Yuan Building, Jubilee Campus, Nottingham, NG8 1BB, UK
- Kate Shiells
- Medical Research Council (MRC) Integrative Epidemiology Unit, University of Bristol, Bristol, UK
- Department of Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Alan Turing Institute, London, UK
- James Goulding
- N/LAB, Nottingham University Business School, University of Nottingham, Si Yuan Building, Jubilee Campus, Nottingham, NG8 1BB, UK
- Anya Skatova
- Medical Research Council (MRC) Integrative Epidemiology Unit, University of Bristol, Bristol, UK
- Department of Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Alan Turing Institute, London, UK
10
Mozersky J, Friedrich AB, DuBois JM. A Content Analysis of 100 Qualitative Health Research Articles to Examine Researcher-Participant Relationships and Implications for Data Sharing. International Journal of Qualitative Methods 2022; 21:10.1177/16094069221105074. [PMID: 38404360 PMCID: PMC10888521 DOI: 10.1177/16094069221105074] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/27/2024]
Abstract
We conducted a qualitative content analysis of health science literature (N = 100) involving qualitative interviews or focus groups. Given recent data sharing mandates, our goal was to characterize the nature of relationships between the researchers and participants to inform ethical deliberations regarding qualitative data sharing and secondary analyses. Specifically, some researchers worry that data sharing might harm relationships, while others claim that data cannot be analyzed absent meaningful relationships with participants. We found little evidence of relationship building with participants. The majority of studies involve single encounters (95%), lasting less than 60 min (59%), with less than half of authors involved in primary data collection. Our findings suggest that relationships with participants might not pose a barrier to sharing some qualitative data collected in the health sciences and speak to the feasibility in principle of secondary analyses of these data.
Affiliation(s)
- Jessica Mozersky
- Bioethics Research Center, Washington University School of Medicine, St. Louis, Missouri, USA
- Annie B. Friedrich
- Bioethics Research Center, Washington University School of Medicine, St. Louis, Missouri, USA
- James M. DuBois
- Bioethics Research Center, Washington University School of Medicine, St. Louis, Missouri, USA
11
VandeVusse A, Mueller J, Karcher S. Qualitative Data Sharing: Participant Understanding, Motivation, and Consent. Qualitative Health Research 2022; 32:182-191. [PMID: 34847803 PMCID: PMC8739617 DOI: 10.1177/10497323211054058] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 05/29/2023]
Abstract
Expectations to share data underlying studies are increasing, but research on how participants, particularly those in qualitative research, respond to requests for data sharing is limited. We studied research participants' willingness to, understanding of, and motivations for data sharing. As part of a larger qualitative study on abortion reporting, we conducted interviews with 64 cisgender women in two states in early 2020 and asked for consent to share de-identified data. At the end of interviews, we asked participants to reflect on their motivations for agreeing or declining to share their data. The vast majority of respondents consented to data sharing and reported that helping others was a primary motivation for agreeing to share their data. However, a substantial number of participants showed a limited understanding of the concept of "data sharing." Additional research is needed on how to improve participants' understanding of data sharing and thus ensure fully informed consent.
12
Mozersky J, McIntosh T, Walsh HA, Parsons MV, Goodman M, DuBois JM. Barriers and facilitators to qualitative data sharing in the United States: A survey of qualitative researchers. PLoS One 2021; 16:e0261719. [PMID: 34972126 PMCID: PMC8719660 DOI: 10.1371/journal.pone.0261719] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/29/2021] [Accepted: 12/08/2021] [Indexed: 11/19/2022] Open
Abstract
Qualitative health data are rarely shared in the United States (U.S.). This is unfortunate because gathering qualitative data is labor and time-intensive, and data sharing enables secondary research, training, and transparency. A new U.S. federal policy mandates data sharing by 2023, and is agnostic to data type. We surveyed U.S. qualitative researchers (N = 425) on the barriers and facilitators of sharing qualitative health or sensitive research data. Most researchers (96%) have never shared qualitative data in a repository. Primary concerns were lack of participant permission to share data, data sensitivity, and breaching trust. Researcher willingness to share would increase if participants agreed and if sharing increased the societal impact of their research. Key resources to increase willingness to share were funding, guidance, and de-identification assistance. Public health and biomedical researchers were most willing to share. Qualitative researchers need to prepare for this new reality as sharing qualitative data requires unique considerations.
Affiliation(s)
- Jessica Mozersky
- Bioethics Research Center, Division of General Medical Sciences, Washington University School of Medicine, St. Louis, MO, United States of America
- Tristan McIntosh
- Bioethics Research Center, Division of General Medical Sciences, Washington University School of Medicine, St. Louis, MO, United States of America
- Heidi A. Walsh
- Bioethics Research Center, Division of General Medical Sciences, Washington University School of Medicine, St. Louis, MO, United States of America
- Meredith V. Parsons
- Bioethics Research Center, Division of General Medical Sciences, Washington University School of Medicine, St. Louis, MO, United States of America
- Melody Goodman
- School of Global Public Health, New York University, New York, NY, United States of America
- James M. DuBois
- Bioethics Research Center, Division of General Medical Sciences, Washington University School of Medicine, St. Louis, MO, United States of America
13
Gupta A, Lai A, Mozersky J, Ma X, Walsh H, DuBois JM. Enabling qualitative research data sharing using a natural language processing pipeline for deidentification: moving beyond HIPAA Safe Harbor identifiers. JAMIA Open 2021; 4:ooab069. [PMID: 34435175 PMCID: PMC8382275 DOI: 10.1093/jamiaopen/ooab069] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/10/2021] [Revised: 07/25/2021] [Accepted: 08/10/2021] [Indexed: 11/20/2022] Open
Abstract
Objective: Sharing health research data is essential for accelerating the translation of research into actionable knowledge that can impact health care services and outcomes. Qualitative health research data are rarely shared due to the challenge of deidentifying text and the potential risks of participant reidentification. Here, we establish and evaluate a framework for deidentifying qualitative research data using automated computational techniques including removal of identifiers that are not considered HIPAA Safe Harbor (HSH) identifiers but are likely to be found in unstructured qualitative data. Materials and Methods: We developed and validated a pipeline for deidentifying qualitative research data using automated computational techniques. An in-depth analysis and qualitative review of different types of qualitative health research data were conducted to inform and evaluate the development of a natural language processing (NLP) pipeline using named-entity recognition, pattern matching, dictionary, and regular expression methods to deidentify qualitative texts. Results: We collected 2 datasets with 1.2 million words derived from over 400 qualitative research data documents. We created a gold-standard dataset with 280K words (70 files) to evaluate our deidentification pipeline. The majority of identifiers in qualitative data are non-HSH and not captured by existing systems. Our NLP deidentification pipeline had a consistent F1-score of ∼0.90 for both datasets. Conclusion: The results of this study demonstrate that NLP methods can be used to identify both HSH identifiers and non-HSH identifiers. Automated tools to assist researchers with the deidentification of qualitative data will be increasingly important given the new National Institutes of Health (NIH) data-sharing mandate.
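Illustrative note: the abstract above names dictionary and regular expression methods among the pipeline's components and reports an F1-score. The sketch below is not the authors' pipeline; it is a minimal, hypothetical illustration of those two components only (named-entity recognition is omitted), plus the standard F1 formula. The patterns, the dictionary entries, and the names `deidentify` and `f1_score` are all assumptions made for this example.

```python
import re

# Hypothetical regular expressions for a few identifier types (illustrative only).
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}

# Hypothetical dictionary of study-specific terms mapped to placeholder labels.
DICTIONARY = {
    "Dr. Rivera": "NAME",
    "St. Louis": "CITY",
}


def deidentify(text: str) -> str:
    """Replace dictionary terms, then regex matches, with bracketed labels."""
    for term, label in DICTIONARY.items():
        text = re.sub(re.escape(term), f"[{label}]", text, flags=re.IGNORECASE)
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: harmonic mean of precision and recall, from raw counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


if __name__ == "__main__":
    sample = ("I saw Dr. Rivera in St. Louis on 3/14/2021; "
              "call 314-555-0199 or mail me at jo@example.org.")
    print(deidentify(sample))
    print(f1_score(tp=9, fp=1, fn=1))
```

A real pipeline would need far broader patterns, a curated dictionary, and manual review of the output, since free-text interviews contain identifiers that simple rules like these do not catch.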
Affiliation(s)
- Aditi Gupta
- Institute for Informatics, Washington University, St. Louis, Missouri, USA
- Albert Lai
- Institute for Informatics, Washington University, St. Louis, Missouri, USA
- Jessica Mozersky
- Bioethics Research Center, Division of General Medical Sciences, Washington University, St. Louis, Missouri, USA
- Xiaoteng Ma
- Institute for Informatics, Washington University, St. Louis, Missouri, USA
- Heidi Walsh
- Bioethics Research Center, Division of General Medical Sciences, Washington University, St. Louis, Missouri, USA
- James M DuBois
- Bioethics Research Center, Division of General Medical Sciences, Washington University, St. Louis, Missouri, USA
14
An Inside Perspective of the Opioid Overdose Crisis in Vancouver: A Secondary Qualitative Study. Canadian Journal of Addiction 2021. [DOI: 10.1097/cxa.0000000000000103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]