Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles

54
(from Reference Citation Analysis)

Article PDFs (14)

Cited by > 0 (28)

Searched Name

de-identification

Kovačević A, Bašaragin B, Milošević N, Nenadić G. De-identification of clinical free text using natural language processing: A systematic review of current approaches. Artif Intell Med 2024;151:102845. [PMID: 38555848 DOI: 10.1016/j.artmed.2024.102845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 03/13/2024] [Accepted: 03/18/2024] [Indexed: 04/02/2024]

Abstract

BACKGROUND

Electronic health records (EHRs) are a valuable resource for data-driven medical research. However, the presence of protected health information (PHI) makes EHRs unsuitable to be shared for research purposes. De-identification, i.e. the process of removing PHI, is a critical step in making EHR data accessible. Natural language processing has repeatedly demonstrated its feasibility in automating the de-identification process.

OBJECTIVES

Our study aims to provide systematic evidence on how the de-identification of clinical free text written in English has evolved in the last thirteen years, and to report on the performances and limitations of the current state-of-the-art systems for the English language. In addition, we aim to identify challenges and potential research opportunities in this field.

METHODS

A systematic search in PubMed, Web of Science, and the DBLP was conducted for studies published between January 2010 and February 2023. Titles and abstracts were examined to identify the relevant studies. Selected studies were then analysed in-depth, and information was collected on de-identification methodologies, data sources, and measured performance.

RESULTS

A total of 2125 publications were identified for the title and abstract screening. 69 studies were found to be relevant. Machine learning (37 studies) and hybrid (26 studies) approaches are predominant, while six studies relied only on rules. The majority of the approaches were trained and evaluated on public corpora. The 2014 i2b2/UTHealth corpus is the most frequently used (36 studies), followed by the 2006 i2b2 (18 studies) and 2016 CEGS N-GRID (10 studies) corpora.

CONCLUSION

Earlier de-identification approaches aimed at English were mainly rule and machine learning hybrids with extensive feature engineering and post-processing, while more recent performance improvements are due to feature-inferring recurrent neural networks. Current leading performance is achieved using attention-based neural models. Recent studies report state-of-the-art F1-scores (over 98 %) when evaluated in the manner usually adopted by the clinical natural language processing community. However, their performance needs to be more thoroughly assessed with different measures to judge their reliability to safely de-identify data in a real-world setting. Without additional manually labeled training data, state-of-the-art systems fail to generalise well across a wide range of clinical sub-domains.

Negash B, Katz A, Neilson CJ, Moni M, Nesca M, Singer A, Enns JE. De-identification of free text data containing personal health information: a scoping review of reviews. Int J Popul Data Sci 2023;8:2153. [PMID: 38414537 PMCID: PMC10898315 DOI: 10.23889/ijpds.v8i1.2153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/29/2024] Open

Abstract

Introduction

Using data in research often requires that the data first be de-identified, particularly in the case of health data, which often include Personal Identifiable Information (PII) and/or Personal Health Identifying Information (PHII). There are established procedures for de-identifying structured data, but de-identifying clinical notes, electronic health records, and other records that include free text data is more complex. Several different ways to achieve this are documented in the literature. This scoping review identifies categories of de-identification methods that can be used for free text data.

Methods

We adopted an established scoping review methodology to examine review articles published up to May 9, 2022, in Ovid MEDLINE; Ovid Embase; Scopus; the ACM Digital Library; IEEE Xplore; and Compendex. Our research question was: What methods are used to de-identify free text data? Two independent reviewers conducted title and abstract screening and full-text article screening using the online review management tool Covidence.

Results

The initial literature search retrieved 3,312 articles, most of which focused primarily on structured data. Eighteen publications describing methods of de-identification of free text data met the inclusion criteria for our review. The majority of the included articles focused on removing categories of personal health information identified by the Health Insurance Portability and Accountability Act (HIPAA). The de-identification methods they described combined rule-based methods or machine learning with other strategies such as deep learning.

Conclusion

Our review identifies and categorises de-identification methods for free text data as rule-based methods, machine learning, deep learning, and combinations of these and other approaches. Most of the articles we found in our search refer to de-identification methods that target some or all categories of PHII. Our review also highlights how de-identification systems for free text data have evolved over time and points to hybrid approaches as the most promising direction for the future.
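To make the "rule-based" category concrete, the following is a minimal Python sketch of pattern-based redaction. It is not taken from any of the reviewed systems: the three patterns, the `PATTERNS` table, and the `redact` helper are hypothetical illustrations, and a real rule set would need to cover far more identifier types and formats.

```python
import re

# Hypothetical patterns for three identifier types (illustrative only;
# production rule sets are much larger and locale-specific).
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed type tag."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


note = "Seen on 03/14/2021; call 555-123-4567 or mail jdoe@example.org."
redacted = redact(note)
print(redacted)
```

Rules like these are transparent and need no training data, but they miss anything the patterns do not anticipate, which is one reason the reviewed literature often pairs them with learned models.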

Radhakrishnan L, Schenk G, Muenzen K, Oskotsky B, Ashouri Choshali H, Plunkett T, Israni S, Butte AJ. A certified de-identification system for all clinical text documents for information extraction at scale. JAMIA Open 2023;6:ooad045. [PMID: 37416449 PMCID: PMC10320112 DOI: 10.1093/jamiaopen/ooad045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Revised: 03/25/2023] [Accepted: 06/27/2023] [Indexed: 07/08/2023] Open

Pilgram L, Schäffner E, Eckardt KU, Prasser F. Utility-Preserving Anonymization in a Real-World Scenario: Evidence from the German Chronic Kidney Disease (GCKD) Study. Stud Health Technol Inform 2023;302:28-32. [PMID: 37203603 DOI: 10.3233/shti230058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/20/2023]

Willmott C, Bryant J. Genomics is here: what can we do with it, and what ethical issues has it brought along for the ride? New Bioeth 2023;29:1-9. [PMID: 36871201 DOI: 10.1080/20502877.2023.2180839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]

Tichopád A, Augustynek M, Beneš J, Dlouhý M, Doležal T, Horáková D, Kršek M, Lhotska L, Panzner P, Penhaker M, Petr M, Piťha J, Popesko B, Rožánek M, Táborský M, Vrablík M. The way to data: opinions and recommendations for the provision of health data for secondary use. Cas Lek Cesk 2023;162:61-66. [PMID: 37474288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/22/2023]

Sepas A, Bangash AH, Alraoui O, El Emam K, El-Hussuna A. Algorithms to anonymize structured medical and healthcare data: A systematic review. Front Bioinform 2022;2:984807. [PMID: 36619476 PMCID: PMC9815524 DOI: 10.3389/fbinf.2022.984807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2022] [Accepted: 11/28/2022] [Indexed: 12/24/2022] Open

Abstract

Introduction

With many anonymization algorithms developed for structured medical health data (SMHD) in the last decade, our systematic review provides a comprehensive bird's eye view of algorithms for SMHD anonymization.

Methods

This systematic review was conducted according to the recommendations in the Cochrane Handbook for Reviews of Interventions and reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). Eligible articles from the PubMed, ACM digital library, Medline, IEEE, Embase, Web of Science Collection, Scopus, ProQuest Dissertation, and Theses Global databases were identified through systematic searches. The following parameters were extracted from the eligible studies: author, year of publication, sample size, and relevant algorithms and/or software applied to anonymize SMHD, along with the summary of outcomes.

Results

Among 1,804 initial hits, the present study considered 63 records including research articles, reviews, and books. Seventy-five evaluated the anonymization of demographic data, 18 assessed diagnosis codes, and 3 assessed genomic data. One of the most common approaches was k-anonymity, which was utilized mainly for demographic data, often in combination with another algorithm; e.g., l-diversity. No approaches have yet been developed for protection against membership disclosure attacks on diagnosis codes.

Conclusion

This study reviewed and categorized different anonymization approaches for MHD according to the anonymized data types (demographics, diagnosis codes, and genomic data). Further research is needed to develop more efficient algorithms for the anonymization of diagnosis codes and genomic data. The risk of reidentification can be minimized with adequate application of the addressed anonymization approaches.

Systematic Review Registration

[http://www.crd.york.ac.uk/prospero], identifier [CRD42021228200].
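As an illustration of the two privacy models named in this abstract, the sketch below checks k-anonymity and (distinct) l-diversity on a toy table. The records, column names, and helper functions are hypothetical and are not drawn from the reviewed studies. A table is k-anonymous when every combination of quasi-identifier values is shared by at least k records, and distinct l-diverse when each such group contains at least l different sensitive values.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def l_diversity(records, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values within any group."""
    values_per_group = {}
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        values_per_group.setdefault(key, set()).add(r[sensitive])
    return min(len(values) for values in values_per_group.values())


# Hypothetical, already-generalized demographic records with a diagnosis code.
records = [
    {"age": "30-39", "zip": "021**", "dx": "E11"},
    {"age": "30-39", "zip": "021**", "dx": "I10"},
    {"age": "40-49", "zip": "021**", "dx": "E11"},
    {"age": "40-49", "zip": "021**", "dx": "E11"},
]

k = k_anonymity(records, ["age", "zip"])
l = l_diversity(records, ["age", "zip"], "dx")
print(k, l)
```

In this toy table every group has two records, so it is 2-anonymous, yet the 40-49 group holds a single diagnosis code. That gap between k and l is the usual motivation for combining k-anonymity with a second model such as l-diversity.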

Bashir SR, Raza S, Kocaman V, Qamar U. Clinical Application of Detecting COVID-19 Risks: A Natural Language Processing Approach. Viruses 2022;14:2761. [PMID: 36560764 PMCID: PMC9781729 DOI: 10.3390/v14122761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Accepted: 12/08/2022] [Indexed: 12/14/2022] Open

Wang P, Li Y, Yang L, Li S, Li L, Zhao Z, Long S, Wang F, Wang H, Li Y, Wang C. An Efficient Method for Deidentifying Protected Health Information in Chinese Electronic Health Records: Algorithm Development and Validation. JMIR Med Inform 2022;10:e38154. [PMID: 36040774 PMCID: PMC9472063 DOI: 10.2196/38154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Revised: 07/19/2022] [Accepted: 07/31/2022] [Indexed: 11/13/2022] Open

Abstract

Background

With the popularization of electronic health records in China, the utilization of digitalized data has great potential for the development of real-world medical research. However, the data usually contains a great deal of protected health information and the direct usage of this data may cause privacy issues. The task of deidentifying protected health information in electronic health records can be regarded as a named entity recognition problem. Existing rule-based, machine learning–based, or deep learning–based methods have been proposed to solve this problem. However, these methods still face the difficulties of insufficient Chinese electronic health record data and the complex features of the Chinese language.

Objective

This paper proposes a method to overcome the difficulties of overfitting and a lack of training data for deep neural networks to enable Chinese protected health information deidentification.

Methods

We propose a new model that merges TinyBERT (bidirectional encoder representations from transformers) as a text feature extraction module and the conditional random field method as a prediction module for deidentifying protected health information in Chinese medical electronic health records. In addition, a hybrid data augmentation method that integrates a sentence generation strategy and a mention-replacement strategy is proposed for overcoming insufficient Chinese electronic health records.
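The abstract does not describe how the mention-replacement strategy is implemented. The following Python sketch shows one plausible reading, under the assumption that each annotated PHI mention is swapped for a random surrogate of the same entity type while the labels stay unchanged. The segment representation, the `SURROGATES` table, and `replace_mentions` are all hypothetical.

```python
import random

# Hypothetical surrogate values per entity type (not from the paper).
SURROGATES = {
    "NAME": ["Alex Kim", "Sam Lee", "Jo Park"],
    "HOSPITAL": ["North Clinic", "River Hospital"],
}


def replace_mentions(segments, surrogates, rng):
    """Return a new sentence in which every labelled segment is replaced
    by a random surrogate of the same type; unlabelled text is kept."""
    augmented = []
    for text, label in segments:
        if label is not None:
            text = rng.choice(surrogates[label])
        augmented.append((text, label))
    return augmented


# A sentence as (text, label) segments; label None marks non-PHI text.
sentence = [
    ("Patient ", None),
    ("Mary Jones", "NAME"),
    (" was admitted to ", None),
    ("City Hospital", "HOSPITAL"),
    (".", None),
]

rng = random.Random(0)  # seeded so repeated runs give the same output
augmented = replace_mentions(sentence, SURROGATES, rng)
print("".join(text for text, _ in augmented))
```

Because the labels travel with the segments, each augmented sentence is a new annotated training example at no extra annotation cost.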

Results

We compare our method with 5 baseline methods that utilize different BERT models as their feature extraction modules. Experimental results on the Chinese electronic health records that we collected demonstrate that our method had better performance (microprecision: 98.7%, microrecall: 99.13%, and micro-F1 score: 98.91%) and higher efficiency (40% faster) than all the BERT-based baseline methods.
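For readers unfamiliar with micro-averaging, the sketch below pools true positives, false positives, and false negatives across entity types before computing precision, recall, and F1. The per-type counts are hypothetical. The last lines check that the reported micro-F1 is the harmonic mean of the reported micro-precision and micro-recall.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def micro_prf(counts):
    """Micro-averaged precision, recall, and F1 from per-type counts."""
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall, f1_score(precision, recall)


# Hypothetical entity-level counts for two PHI types.
counts = {
    "NAME": {"tp": 90, "fp": 5, "fn": 10},
    "DATE": {"tp": 50, "fp": 5, "fn": 0},
}
p, r, f = micro_prf(counts)
print(p, r, f)

# Consistency check on the figures reported in the abstract (percentages).
reported_f1 = round(f1_score(98.7, 99.13), 2)
print(reported_f1)
```

With precision 98.7% and recall 99.13%, the harmonic mean rounds to 98.91%, matching the reported micro-F1.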

Conclusions

Compared to baseline methods, the efficiency advantage of TinyBERT on our proposed augmented data set was kept while the performance improved for the task of Chinese protected health information deidentification.

Shahid A, Bazargani MH, Banahan P, Mac Namee B, Kechadi T, Treacy C, Regan G, MacMahon P. A Two-Stage De-Identification Process for Privacy-Preserving Medical Image Analysis. Healthcare (Basel) 2022;10:755. [PMID: 35627892 PMCID: PMC9141493 DOI: 10.3390/healthcare10050755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Revised: 04/12/2022] [Accepted: 04/14/2022] [Indexed: 11/17/2022] Open

Chomutare T. Clinical Notes De-Identification: Scoping Recent Benchmarks for n2c2 Datasets. Stud Health Technol Inform 2022;289:293-296. [PMID: 35062150 DOI: 10.3233/shti210917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]

Brakel LAW. Self-constitution and "Infrastructural" Change: An Interdisciplinary Account of Psychoanalytic Action. Am J Psychoanal 2022;82:618-630. [PMID: 36470990 PMCID: PMC9734568 DOI: 10.1057/s11231-022-09383-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]

Lee K, Kayaalp M, Henry S, Uzuner Ö. A Context-Enhanced De-identification System. ACM Trans Comput Healthc 2021;3. [PMID: 34676376 DOI: 10.1145/3470980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]

Meurers T, Bild R, Do KM, Prasser F. A scalable software solution for anonymizing high-dimensional biomedical data. Gigascience 2021;10:giab068. [PMID: 34605868 PMCID: PMC8489190 DOI: 10.1093/gigascience/giab068] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/30/2020] [Revised: 07/19/2021] [Accepted: 09/09/2021] [Indexed: 11/17/2022] Open

Liao S, Kiros J, Chen J, Zhang Z, Chen T. Improving domain adaptation in de-identification of electronic health records through self-training. J Am Med Inform Assoc 2021;28:2093-2100. [PMID: 34363664 PMCID: PMC8449604 DOI: 10.1093/jamia/ocab128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2021] [Revised: 07/01/2021] [Accepted: 07/04/2021] [Indexed: 11/13/2022] Open

Lee B, Dupervil B, Deputy NP, Duck W, Soroka S, Bottichio L, Silk B, Price J, Sweeney P, Fuld J, Weber JT, Pollock D. Protecting Privacy and Transforming COVID-19 Case Surveillance Datasets for Public Use. Public Health Rep 2021;136:554-561. [PMID: 34139910 PMCID: PMC8216038 DOI: 10.1177/00333549211026817] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/23/2022] Open

Affiliation(s)

Brian Lee COVID-19 Response, Centers for Disease Control and Prevention, Atlanta, GA, USA; Office of the Chief Operations Officer, Office of the Chief Information Officer, Centers for Disease Control and Prevention, Atlanta, GA, USA; Brian Lee, MPH, Centers for Disease Control and Prevention, COVID-19 Response, 1600 Clifton Rd NE, MS TW-2, Atlanta, GA 30329, USA
Brandi Dupervil COVID-19 Response, Centers for Disease Control and Prevention, Atlanta, GA, USA; National Center for Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention, Atlanta, GA, USA
Nicholas P. Deputy COVID-19 Response, Centers for Disease Control and Prevention, Atlanta, GA, USA; National Center for Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention, Atlanta, GA, USA; US Public Health Service, Rockville, MD, USA
Wil Duck COVID-19 Response, Centers for Disease Control and Prevention, Atlanta, GA, USA; Center for Surveillance, Epidemiology, and Laboratory Services, Centers for Disease Control and Prevention, Atlanta, GA, USA
Stephen Soroka COVID-19 Response, Centers for Disease Control and Prevention, Atlanta, GA, USA; National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA
Lyndsay Bottichio COVID-19 Response, Centers for Disease Control and Prevention, Atlanta, GA, USA; National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA
Benjamin Silk COVID-19 Response, Centers for Disease Control and Prevention, Atlanta, GA, USA; US Public Health Service, Rockville, MD, USA; National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA
Jason Price COVID-19 Response, Centers for Disease Control and Prevention, Atlanta, GA, USA; National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA, USA
Patricia Sweeney COVID-19 Response, Centers for Disease Control and Prevention, Atlanta, GA, USA; National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA, USA
Jennifer Fuld COVID-19 Response, Centers for Disease Control and Prevention, Atlanta, GA, USA; Office of the Associate Director for Policy and Strategy, Centers for Disease Control and Prevention, Atlanta, GA, USA
J. Todd Weber COVID-19 Response, Centers for Disease Control and Prevention, Atlanta, GA, USA; National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA
Dan Pollock COVID-19 Response, Centers for Disease Control and Prevention, Atlanta, GA, USA; National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA

Murugadoss K, Rajasekharan A, Malin B, Agarwal V, Bade S, Anderson JR, Ross JL, Faubion WA, Halamka JD, Soundararajan V, Ardhanari S. Building a best-in-class automated de-identification tool for electronic health records through ensemble learning. Patterns (N Y) 2021;2:100255. [PMID: 34179842 PMCID: PMC8212138 DOI: 10.1016/j.patter.2021.100255] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 01/06/2021] [Revised: 02/24/2021] [Accepted: 04/07/2021] [Indexed: 10/29/2022]

Farzanehfar A, Houssiau F, de Montjoye YA. The risk of re-identification remains high even in country-scale location datasets. Patterns (N Y) 2021;2:100204. [PMID: 33748793 PMCID: PMC7961185 DOI: 10.1016/j.patter.2021.100204] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/24/2020] [Revised: 11/27/2020] [Accepted: 01/07/2021] [Indexed: 11/30/2022]

Theyers AE, Zamyadi M, O'Reilly M, Bartha R, Symons S, MacQueen GM, Hassel S, Lerch JP, Anagnostou E, Lam RW, Frey BN, Milev R, Müller DJ, Kennedy SH, Scott CJM, Strother SC, Arnott SR. Multisite Comparison of MRI Defacing Software Across Multiple Cohorts. Front Psychiatry 2021;12:617997. [PMID: 33716819 PMCID: PMC7943842 DOI: 10.3389/fpsyt.2021.617997] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 10/19/2020] [Accepted: 02/03/2021] [Indexed: 01/26/2023] Open

Abstract

With improvements to both scan quality and facial recognition software, there is an increased risk of participants being identified by a 3D render of their structural neuroimaging scans, even when all other personal information has been removed. To prevent this, facial features should be removed before data are shared or openly released, but while there are several publicly available software algorithms to do this, there has been no comprehensive review of their accuracy within the general population. To address this, we tested multiple algorithms on 300 scans from three neuroscience research projects, funded in part by the Ontario Brain Institute, to cover a wide range of ages (3-85 years) and multiple patient cohorts. While skull stripping is more thorough at removing identifiable features, we focused mainly on defacing software, as skull stripping also removes potentially useful information, which may be required for future analyses. We tested six publicly available algorithms (afni_refacer, deepdefacer, mri_deface, mridefacer, pydeface, quickshear), with one skull stripper (FreeSurfer) included for comparison. Accuracy was measured through a pass/fail system with two criteria; one, that all facial features had been removed and two, that no brain tissue was removed in the process. A subset of defaced scans were also run through several preprocessing pipelines to ensure that none of the algorithms would alter the resulting outputs. We found that the success rates varied strongly between defacers, with afni_refacer (89%) and pydeface (83%) having the highest rates, overall. In both cases, the primary source of failure came from a single dataset that the defacer appeared to struggle with - the youngest cohort (3-20 years) for afni_refacer and the oldest (44-85 years) for pydeface, demonstrating that defacer performance not only depends on the data provided, but that this effect varies between algorithms. 
While there were some very minor differences between the preprocessing results for defaced and original scans, none of these were significant and were within the range of variation between using different NIfTI converters, or using raw DICOM files.

Affiliation(s)

Athena E Theyers Rotman Research Institute, Baycrest Health Sciences Centre, Toronto, ON, Canada
Mojdeh Zamyadi Rotman Research Institute, Baycrest Health Sciences Centre, Toronto, ON, Canada
Mark O'Reilly Ontario Brain Institute, Toronto, ON, Canada
Robert Bartha Department of Medical Biophysics, Robarts Research Institute, Western University, London, ON, Canada
Sean Symons Department of Medical Imaging, Sunnybrook Health Sciences Centre, Toronto, ON, Canada
Glenda M MacQueen Department of Psychiatry, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Stefanie Hassel Department of Psychiatry, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Jason P Lerch Mouse Imaging Centre, Hospital for Sick Children, Toronto, ON, Canada
Evdokia Anagnostou Bloorview Research Institute, Holland Bloorview Kids Rehabilitation Hospital, Toronto, ON, Canada
Raymond W Lam Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
Benicio N Frey Department of Psychiatry and Behavioural Neurosciences, McMaster University, Hamilton, ON, Canada; Mood Disorders Program, St. Joseph's Healthcare, Hamilton, ON, Canada
Roumen Milev Departments of Psychiatry and Psychology, Queen's University, Providence Care Hospital, Kingston, ON, Canada
Daniel J Müller Molecular Brain Science, Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, ON, Canada; Department of Psychiatry, University of Toronto, Toronto, ON, Canada
Sidney H Kennedy Department of Psychiatry, University of Toronto, Toronto, ON, Canada; Department of Psychiatry, Krembil Research Centre, University Health Network, Toronto, ON, Canada; Department of Psychiatry, St. Michael's Hospital, University of Toronto, Toronto, ON, Canada; Keenan Research Centre for Biomedical Science, Li Ka Shing Knowledge Institute, St. Michael's Hospital, Toronto, ON, Canada
Christopher J M Scott LC Campbell Cognitive Neurology Research Unit, Toronto, ON, Canada; Heart & Stroke Foundation Centre for Stroke Recovery, Toronto, ON, Canada; Sunnybrook Health Sciences Centre, Brain Sciences Research Program, Sunnybrook Research Institute, Toronto, ON, Canada
Stephen C Strother Rotman Research Institute, Baycrest Health Sciences Centre, Toronto, ON, Canada; Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
Stephen R Arnott Rotman Research Institute, Baycrest Health Sciences Centre, Toronto, ON, Canada

Jeong YU, Yoo S, Kim YH, Shim WH. De-Identification of Facial Features in Magnetic Resonance Images: Software Development Using Deep Learning Technology. J Med Internet Res 2020;22:e22739. [PMID: 33208302 PMCID: PMC7759440 DOI: 10.2196/22739] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 07/22/2020] [Revised: 09/09/2020] [Accepted: 11/12/2020] [Indexed: 12/14/2022] Open

Abstract

Background

High-resolution medical images that include facial regions can be used to recognize the subject’s face when reconstructing 3-dimensional (3D)-rendered images from 2-dimensional (2D) sequential images, which might constitute a risk of infringement of personal information when sharing data. According to the Health Insurance Portability and Accountability Act (HIPAA) privacy rules, full-face photographic images and any comparable image are direct identifiers and considered as protected health information. Moreover, the General Data Protection Regulation (GDPR) categorizes facial images as biometric data and stipulates that special restrictions should be placed on the processing of biometric data.

Objective

This study aimed to develop software that can remove the header information from Digital Imaging and Communications in Medicine (DICOM) format files and facial features (eyes, nose, and ears) at the 2D sliced-image level to anonymize personal information in medical images.

Methods

A total of 240 cranial magnetic resonance (MR) images were used to train the deep learning model (144, 48, and 48 for the training, validation, and test sets, respectively, from the Alzheimer's Disease Neuroimaging Initiative [ADNI] database). To overcome the small sample size problem, we used a data augmentation technique to create 576 images per epoch. We used attention-gated U-net for the basic structure of our deep learning model. To validate the performance of the software, we adapted an external test set comprising 100 cranial MR images from the Open Access Series of Imaging Studies (OASIS) database.

Results

The facial features (eyes, nose, and ears) were successfully detected and anonymized in both test sets (48 from ADNI and 100 from OASIS). Each result was manually validated in both the 2D image plane and the 3D-rendered images. Furthermore, the ADNI test set was verified using Microsoft Azure's face recognition artificial intelligence service. By adding a user interface, we developed and distributed (via GitHub) software named “Deface program” for medical images as an open-source project.

Conclusions

We developed deep learning–based software for the anonymization of MR images that distorts the eyes, nose, and ears to prevent facial identification of the subject in reconstructed 3D images. It could be used to share medical big data for secondary research while making both data providers and recipients compliant with the relevant privacy regulations.

Jeon S, Seo J, Kim S, Lee J, Kim JH, Sohn JW, Moon J, Joo HJ. Proposal and Assessment of a De-Identification Strategy to Enhance Anonymity of the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM) in a Public Cloud-Computing Environment: Anonymization of Medical Data Using Privacy Models. J Med Internet Res 2020;22:e19597. [PMID: 33177037 PMCID: PMC7728527 DOI: 10.2196/19597] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/26/2020] [Revised: 07/29/2020] [Accepted: 11/11/2020] [Indexed: 02/01/2023] Open

Abstract

Background

De-identifying personal information is critical when using personal health data for secondary research. The Observational Medical Outcomes Partnership Common Data Model (CDM), defined by the nonprofit organization Observational Health Data Sciences and Informatics, has been gaining attention for its use in the analysis of patient-level clinical data obtained from various medical institutions. When analyzing such data in a public environment such as a cloud-computing system, an appropriate de-identification strategy is required to protect patient privacy.

Objective

This study proposes and evaluates a de-identification strategy that comprises several rules along with privacy models such as k-anonymity, l-diversity, and t-closeness. The proposed strategy was evaluated using an actual CDM database.

Methods

The CDM database used in this study was constructed by the Anam Hospital of Korea University. Analysis and evaluation were performed using the ARX anonymizing framework in combination with the k-anonymity, l-diversity, and t-closeness privacy models.

Results

The CDM database, which was constructed according to the rules established by Observational Health Data Sciences and Informatics, exhibited a low risk of re-identification: The highest re-identifiable record rate (11.3%) in the dataset was exhibited by the DRUG_EXPOSURE table, with a re-identification success rate of 0.03%. However, because all tables include at least one “highest risk” value of 100%, suitable anonymizing techniques are required; moreover, the CDM database preserves the “source values” (raw data), a combination of which could increase the risk of re-identification. Therefore, this study proposes an enhanced strategy to de-identify the source values to significantly reduce not only the highest risk in the k-anonymity, l-diversity, and t-closeness privacy models but also the overall possibility of re-identification.
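The privacy measures mentioned above can be illustrated with a minimal Python sketch. It is not the ARX framework and not the authors' pipeline: the toy records, the choice of quasi-identifiers, and all function names are hypothetical. It assumes the common definitions in which a record's re-identification risk is 1 divided by the size of its equivalence class (the records sharing its quasi-identifier values), so a record that is unique on its quasi-identifiers produces a highest risk of 100%.

```python
from collections import defaultdict

# Hypothetical toy records: (age_band, zip_prefix, diagnosis). Not from the study.
RECORDS = [
    ("30-39", "021", "flu"),
    ("30-39", "021", "asthma"),
    ("30-39", "021", "flu"),
    ("40-49", "021", "diabetes"),
    ("40-49", "021", "diabetes"),
    ("50-59", "100", "flu"),
]


def equivalence_classes(records, qi_indices):
    """Group records that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[i] for i in qi_indices)
        classes[key].append(rec)
    return dict(classes)


def k_anonymity(classes):
    """k is the size of the smallest equivalence class."""
    return min(len(members) for members in classes.values())


def highest_risk(classes):
    """Worst-case risk, assumed here to be 1 / (smallest class size)."""
    return 1.0 / k_anonymity(classes)


def distinct_l_diversity(classes, sensitive_index):
    """Smallest number of distinct sensitive values found in any class."""
    return min(
        len({rec[sensitive_index] for rec in members})
        for members in classes.values()
    )


def unique_record_rate(classes):
    """Fraction of records that are alone in their equivalence class."""
    total = sum(len(members) for members in classes.values())
    unique = sum(1 for members in classes.values() if len(members) == 1)
    return unique / total


# Quasi-identifiers: age band and ZIP prefix; sensitive attribute: diagnosis.
classes = equivalence_classes(RECORDS, qi_indices=(0, 1))
print("k-anonymity:", k_anonymity(classes))
print("highest risk:", highest_risk(classes))
print("distinct l-diversity:", distinct_l_diversity(classes, sensitive_index=2))
print("unique record rate:", unique_record_rate(classes))

# Suppressing the single unique record raises k and lowers the highest risk.
suppressed = [rec for rec in RECORDS if rec[:2] != ("50-59", "100")]
classes_after = equivalence_classes(suppressed, qi_indices=(0, 1))
print("k after suppression:", k_anonymity(classes_after))
```

In this toy table one record is unique on its quasi-identifiers, so the highest risk is 1.0 even though most records sit in larger classes. That matches the pattern described above, where a low overall re-identification rate coexists with a worst-case value of 100%.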

Conclusions

Our proposed de-identification strategy effectively enhanced the privacy of the CDM database, thereby encouraging clinical research involving multiple centers.

Delgado J, Llorente S. Security and Privacy when Applying FAIR Principles to Genomic Information. Stud Health Technol Inform 2020;275:37-41. [PMID: 33227736 DOI: 10.3233/shti200690] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]

El Emam K, Mosquera L, Bass J. Evaluating Identity Disclosure Risk in Fully Synthetic Health Data: Model Development and Validation. J Med Internet Res 2020;22:e23139. [PMID: 33196453 PMCID: PMC7704280 DOI: 10.2196/23139] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/02/2020] [Revised: 09/02/2020] [Accepted: 10/10/2020] [Indexed: 01/13/2023]

Parker W, Jaremko JL, Cicero M, Azar M, El-Emam K, Gray BG, Hurrell C, Lavoie-Cardinal F, Desjardins B, Lum A, Sheremeta L, Lee E, Reinhold C, Tang A, Bromwich R. Canadian Association of Radiologists White Paper on De-Identification of Medical Imaging: Part 1, General Principles. Can Assoc Radiol J 2020;72:13-24. [PMID: 33138621 DOI: 10.1177/0846537120967349] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022]

Parker W, Jaremko JL, Cicero M, Azar M, El-Emam K, Gray BG, Hurrell C, Lavoie-Cardinal F, Desjardins B, Lum A, Sheremeta L, Lee E, Reinhold C, Tang A, Bromwich R. Canadian Association of Radiologists White Paper on De-identification of Medical Imaging: Part 2, Practical Considerations. Can Assoc Radiol J 2020;72:25-34. [PMID: 33140663 DOI: 10.1177/0846537120967345] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/16/2022]

Ahn NY, Park JE, Lee DH, Hong PC. Balancing Personal Privacy and Public Safety During COVID-19: The Case of South Korea. IEEE Access 2020;8:171325-171333. [PMID: 34786290 PMCID: PMC8545276 DOI: 10.1109/access.2020.3025971] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/25/2020] [Accepted: 09/20/2020] [Indexed: 05/09/2023]

Elbers DC, Fillmore NR, Sung FC, Ganas SS, Prokhorenkov A, Meyer C, Hall RB, Ajjarapu SJ, Chen DC, Meng F, Grossman RL, Brophy MT, Do NV. The Veterans Affairs Precision Oncology Data Repository, a Clinical, Genomic, and Imaging Research Database. Patterns (N Y) 2020;1:100083. [PMID: 33205130 PMCID: PMC7660389 DOI: 10.1016/j.patter.2020.100083] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/13/2020] [Revised: 06/15/2020] [Accepted: 07/10/2020] [Indexed: 02/06/2023]

Affiliation(s)

Danne C Elbers: VA Cooperative Studies Program, VA Boston Healthcare System (151MAV), 150 S. Huntington Ave, Jamaica Plain, MA 02130, USA; University of Vermont, Complex Systems Center, Burlington, VT 05405, USA
Nathanael R Fillmore: VA Cooperative Studies Program, VA Boston Healthcare System (151MAV), 150 S. Huntington Ave, Jamaica Plain, MA 02130, USA; Harvard Medical School, Boston, MA 02115, USA; Dana-Farber Cancer Institute, Boston, MA 02215, USA
Feng-Chi Sung: VA Cooperative Studies Program, VA Boston Healthcare System (151MAV), 150 S. Huntington Ave, Jamaica Plain, MA 02130, USA
Spyridon S Ganas: VA Cooperative Studies Program, VA Boston Healthcare System (151MAV), 150 S. Huntington Ave, Jamaica Plain, MA 02130, USA
Andrew Prokhorenkov: University of Chicago, Center for Data Intensive Science, Chicago, IL 60615, USA
Christopher Meyer: University of Chicago, Center for Data Intensive Science, Chicago, IL 60615, USA
Robert B Hall: VA Cooperative Studies Program, VA Boston Healthcare System (151MAV), 150 S. Huntington Ave, Jamaica Plain, MA 02130, USA
Samuel J Ajjarapu: VA Cooperative Studies Program, VA Boston Healthcare System (151MAV), 150 S. Huntington Ave, Jamaica Plain, MA 02130, USA; Dana-Farber Cancer Institute, Boston, MA 02215, USA
Daniel C Chen: VA Cooperative Studies Program, VA Boston Healthcare System (151MAV), 150 S. Huntington Ave, Jamaica Plain, MA 02130, USA; Boston University School of Medicine, Boston, MA 02118, USA
Frank Meng: VA Cooperative Studies Program, VA Boston Healthcare System (151MAV), 150 S. Huntington Ave, Jamaica Plain, MA 02130, USA; Boston University School of Medicine, Boston, MA 02118, USA
Robert L Grossman: University of Chicago, Center for Data Intensive Science, Chicago, IL 60615, USA
Mary T Brophy: VA Cooperative Studies Program, VA Boston Healthcare System (151MAV), 150 S. Huntington Ave, Jamaica Plain, MA 02130, USA; Boston University School of Medicine, Boston, MA 02118, USA
Nhan V Do: VA Cooperative Studies Program, VA Boston Healthcare System (151MAV), 150 S. Huntington Ave, Jamaica Plain, MA 02130, USA; Boston University School of Medicine, Boston, MA 02118, USA

Carrell DS, Malin BA, Cronkite DJ, Aberdeen JS, Clark C, Li MR, Bastakoty D, Nyemba S, Hirschman L. Resilience of clinical text de-identified with "hiding in plain sight" to hostile reidentification attacks by human readers. J Am Med Inform Assoc 2020;27:1374-1382. [PMID: 32930712 DOI: 10.1093/jamia/ocaa095] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/02/2020] [Revised: 04/02/2020] [Accepted: 05/26/2020] [Indexed: 11/14/2022]

Chomutare T, Yigzaw KY, Budrionis A, Makhlysheva A, Godtliebsen F, Dalianis H. De-Identifying Swedish EHR Text Using Public Resources in the General Domain. Stud Health Technol Inform 2020;270:148-152. [PMID: 32570364 DOI: 10.3233/shti200140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]

Demuro PR, Petersen C. Managing Privacy and Data Sharing Through the Use of Health Care Information Fiduciaries. Stud Health Technol Inform 2019;265:157-162. [PMID: 31431592 DOI: 10.3233/shti190156] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/15/2022]

Cha HS, Jung JM, Shin SY, Jang YM, Park P, Lee JW, Chung SH, Choi KS. The Korea Cancer Big Data Platform (K-CBP) for Cancer Research. Int J Environ Res Public Health 2019;16:E2290. [PMID: 31261630 DOI: 10.3390/ijerph16132290] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 04/08/2019] [Revised: 05/31/2019] [Accepted: 06/24/2019] [Indexed: 12/23/2022]

Chevrier R, Foufi V, Gaudet-Blavignac C, Robert A, Lovis C. Use and Understanding of Anonymization and De-Identification in the Biomedical Literature: Scoping Review. J Med Internet Res 2019;21:e13484. [PMID: 31152528 PMCID: PMC6658290 DOI: 10.2196/13484] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 01/24/2019] [Revised: 03/29/2019] [Accepted: 04/26/2019] [Indexed: 01/19/2023]

Abstract

Background

The secondary use of health data is central to biomedical research in the era of data science and precision medicine. National and international initiatives, such as the Global Open Findable, Accessible, Interoperable, and Reusable (GO FAIR) initiative, are supporting this approach in different ways (eg, making the sharing of research data mandatory or improving the legal and ethical frameworks). Preserving patients’ privacy is crucial in this context. De-identification and anonymization are the two most common terms used to refer to the technical approaches that protect privacy and facilitate the secondary use of health data. However, it is difficult to find a consensus on the definitions of the concepts or on the reliability of the techniques used to apply them. A comprehensive review is needed to better understand the domain, its capabilities, its challenges, and the ratio of risk between the data subjects’ privacy on one side, and the benefit of scientific advances on the other.

Objective

This work aims to better understand how the research community comprehends and defines the concepts of de-identification and anonymization. A rich overview should also provide insights into the use and reliability of the methods. Six aspects will be studied: (1) terminology and definitions, (2) backgrounds and places of work of the researchers, (3) reasons for anonymizing or de-identifying health data, (4) limitations of the techniques, (5) legal and ethical aspects, and (6) recommendations of the researchers.

Methods

Based on a scoping review protocol designed a priori, MEDLINE was searched for publications discussing de-identification or anonymization and published between 2007 and 2017. The search was restricted to MEDLINE to focus on the life sciences community. The screening process was performed by two reviewers independently.

Results

After searching 7972 records that matched at least one search term, 135 publications were screened and 60 full-text articles were included. (1) Terminology: Definitions of the terms de-identification and anonymization were provided in fewer than half of the articles (29/60, 48%). When both terms were used (41/60, 68%), their meanings divided the authors into two equal groups (19/60, 32%, each) with opposed views. The remaining articles (3/60, 5%) were equivocal. (2) Backgrounds and locations: Research groups were based predominantly in North America (31/60, 52%) and in the European Union (22/60, 37%). The authors came from 19 different domains; computer science (91/248, 36.7%), biomedical informatics (47/248, 19.0%), and medicine (38/248, 15.3%) were the most prevalent ones. (3) Purpose: The main reason declared for applying these techniques is to facilitate biomedical research. (4) Limitations: Progress has been made on specific techniques but, overall, limitations remain numerous. (5) Legal and ethical aspects: Differences exist between nations in the definitions, approaches, and legal practices. (6) Recommendations: The combination of organizational, legal, ethical, and technical approaches is necessary to protect health data.

Conclusions

Interest in privacy-enhancing techniques is growing in the life sciences community. This interest crosses scientific boundaries, involving primarily computer science, biomedical informatics, and medicine. The variability observed in the use of the terms de-identification and anonymization emphasizes the need for clearer definitions as well as for better education and dissemination of information on the subject. The same observation applies to the methods. Several laws, such as the American Health Insurance Portability and Accountability Act (HIPAA) and the European General Data Protection Regulation (GDPR), regulate the domain. Using the definitions they provide could help address the variable use of these two concepts in the research community.

Caetano SJ, Dawe D, Ellis P, Earle CC, Pond GR. Methods to improve the estimation of time-to-event outcomes when data is de-identified. Stat Med 2019;38:625-635. [PMID: 30311241 DOI: 10.1002/sim.7990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2017] [Revised: 08/30/2018] [Accepted: 09/06/2018] [Indexed: 11/07/2022]

Kuo SIC, Wheeler LA, Updegraff KA, McHale SM, Umaña-Taylor AJ, Perez-Brena NJ. Parental Modeling and Deidentification in Romantic Relationships Among Mexican-origin Youth. J Marriage Fam 2017;79:1388-1403. [PMID: 29033465 PMCID: PMC5637550 DOI: 10.1111/jomf.12411] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 05/17/2023]

Coughlin SS. Reproducing Epidemiologic Research and Ensuring Transparency. Am J Epidemiol 2017;186:393-394. [PMID: 28830078 DOI: 10.1093/aje/kwx065] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 03/02/2017] [Accepted: 03/03/2017] [Indexed: 11/12/2022]

Shepherd BE, Blevins Peratikos M, Rebeiro PF, Duda SN, McGowan CC. A Pragmatic Approach for Reproducible Research With Sensitive Data. Am J Epidemiol 2017;186:387-392. [PMID: 28830079 DOI: 10.1093/aje/kwx066] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 07/30/2016] [Accepted: 02/24/2017] [Indexed: 11/13/2022]

Dernoncourt F, Lee JY, Uzuner O, Szolovits P. De-identification of patient notes with recurrent neural networks. J Am Med Inform Assoc 2017;24:596-606. [PMID: 28040687 PMCID: PMC7787254 DOI: 10.1093/jamia/ocw156] [Citation(s) in RCA: 106] [Impact Index Per Article: 15.1] [Received: 06/25/2016] [Revised: 09/06/2016] [Accepted: 10/06/2016] [Indexed: 01/16/2023]

Kulynych J, Greely HT. Clinical genomics, big data, and electronic medical records: reconciling patient rights with research when privacy and science collide. J Law Biosci 2017;4:94-132. [PMID: 28852559 PMCID: PMC5570692 DOI: 10.1093/jlb/lsw061] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 06/07/2023]

Lu Y, Sinnott RO, Verspoor K. A Semantic-Based K-Anonymity Scheme for Health Record Linkage. Stud Health Technol Inform 2017;239:84-90. [PMID: 28756441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]

Foufi V, Gaudet-Blavignac C, Chevrier R, Lovis C. De-Identification of Medical Narrative Data. Stud Health Technol Inform 2017;244:23-27. [PMID: 29039370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]

Henriksson A, Kvist M, Dalianis H. Prevalence Estimation of Protected Health Information in Swedish Clinical Text. Stud Health Technol Inform 2017;235:216-220. [PMID: 28423786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]

Spidlen J, Brinkman RR. Use FlowRepository to share your clinical data upon study publication. Cytometry B Clin Cytom 2016;94:196-198. [PMID: 27342384 DOI: 10.1002/cyto.b.21393] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 05/20/2016] [Accepted: 06/23/2016] [Indexed: 01/01/2023]

Pantazos K, Lauesen S, Lippert S. Preserving medical correctness, readability and consistency in de-identified health records. Health Informatics J 2016;23:291-303. [PMID: 27199298 DOI: 10.1177/1460458216647760] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/15/2022]

Song X, Wang J, Wang A, Meng Q, Prescott C, Tsu L, Eckert MA. DeID - a data sharing tool for neuroimaging studies. Front Neurosci 2015;9:325. [PMID: 26441500 PMCID: PMC4585207 DOI: 10.3389/fnins.2015.00325] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/12/2015] [Accepted: 08/31/2015] [Indexed: 11/25/2022]

Xia W, Heatherly R, Ding X, Li J, Malin BA. R-U policy frontiers for health data de-identification. J Am Med Inform Assoc 2015;22:1029-41. [PMID: 25911674 PMCID: PMC4986667 DOI: 10.1093/jamia/ocv004] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 07/14/2014] [Revised: 12/27/2014] [Accepted: 01/09/2015] [Indexed: 11/12/2022]

Moffet HH, Warton EM, Parker MM, Liu JY, Lyles CR, Karter AJ. The DISTANCE model for collaborative research: distributing analytic effort using scrambled data sets. ACTA ACUST UNITED AC 2014;2:33-8. [PMID: 25584364 DOI: 10.12691/iscf-2-3-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]

Li D, Mojarad MR, Li Y, Sohn S, Mehrabi S, Elayavilli RK, Yu Y, Liu H. A Frequency-based Strategy of Obtaining Sentences from Clinical Data Repository for Crowdsourcing. Stud Health Technol Inform 2015;216:1033-4. [PMID: 26262333 PMCID: PMC5859924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]

McGraw D. Building public trust in uses of Health Insurance Portability and Accountability Act de-identified data. J Am Med Inform Assoc 2013;20:29-34. [PMID: 22735615 PMCID: PMC3555317 DOI: 10.1136/amiajnl-2012-000936] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Received: 03/26/2012] [Accepted: 05/31/2012] [Indexed: 11/04/2022]

Cimino JJ. The false security of blind dates: chrononymization's lack of impact on data privacy of laboratory data. Appl Clin Inform 2012;3:392-403. [PMID: 23646086 DOI: 10.4338/aci-2012-07-ra-0028] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 07/13/2012] [Accepted: 10/01/2012] [Indexed: 11/23/2022]

Chervenak AL, van Erp TGM, Kesselman C, D'Arcy M, Sobell J, Keator D, Dahm L, Murry J, Law M, Hasso A, Ames J, Macciardi F, Potkin SG. A system architecture for sharing de-identified, research-ready brain scans and health information across clinical imaging centers. Stud Health Technol Inform 2012;175:19-28. [PMID: 22941984 PMCID: PMC4478050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]