1
Golder S, O’Connor K, Lopez-Garcia G, Tatonetti N, Gonzalez-Hernandez G. Leveraging Unstructured Data in Electronic Health Records to Detect Adverse Events From Pediatric Drug Use - A Scoping Review. medRxiv: The Preprint Server for Health Sciences 2025:2025.03.20.25324320. [PMID: 40166566 PMCID: PMC11957175 DOI: 10.1101/2025.03.20.25324320] [Indexed: 04/02/2025]
Abstract
Adverse drug events (ADEs) in pediatric populations pose significant public health challenges, yet research on their detection and monitoring remains limited. This scoping review evaluates the use of unstructured data from electronic health records (EHRs) to identify ADEs in children. We searched six databases, including MEDLINE, Embase and IEEE Xplore, in September 2024. From 984 records, only nine studies met our inclusion criteria, indicating a significant gap in research on identifying ADEs in children. We found that unstructured data in EHRs can indeed be of value and enhance pediatric pharmacovigilance, although its use has so far been very limited. Traditional Natural Language Processing (NLP) methods have been employed to extract ADEs, but the approaches used face challenges in generalizability and context interpretation. These challenges could be addressed with recent advances in transformer-based models and large language models (LLMs), unlocking the use of EHR data at scale for pediatric pharmacovigilance.
Affiliation(s)
- Su Golder
  - Department of Health Sciences, University of York, York, United Kingdom
- Karen O’Connor
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Guillermo Lopez-Garcia
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, West Hollywood, CA, USA
- Nicholas Tatonetti
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, West Hollywood, CA, USA
2
Beiler D, Chopra A, Gregor CM, Tusing LD, Pradhan AM, Romagnoli KM, Kraus CK, Piper BJ, Wright EA, Troiani V. Medical Marijuana Documentation Practices in Patient Electronic Health Records: Retrospective Observational Study Using Smart Data Elements and a Review of Medical Records. JMIR Form Res 2024; 8:e65957. [PMID: 39715532 PMCID: PMC11684775 DOI: 10.2196/65957] [Received: 08/30/2024] [Revised: 10/07/2024] [Accepted: 10/10/2024] [Indexed: 12/25/2024] [Open Access]
Abstract
Background: Medical marijuana (MMJ) is available in Pennsylvania, and participation in the state-regulated program requires patient registration and receiving certification by an approved physician. Currently, no integration of MMJ certification data with health records exists in Pennsylvania that would allow clinicians to rapidly identify patients using MMJ, as exists with other scheduled drugs. This absence of a formal data sharing structure necessitates tools aiding in consistent documentation practices to enable comprehensive patient care. Customized smart data elements (SDEs) were made available to clinicians at an integrated health system, Geisinger, following MMJ legalization in Pennsylvania.
Objective: The purpose of this project was to examine and contextualize the use of MMJ SDEs in the Geisinger population. We accomplished this goal by developing a systematic protocol for review of medical records and creating a tool that resulted in consistent human data extraction.
Methods: We developed a protocol for reviewing medical records for extracting MMJ-related information. The protocol was developed between August and December of 2022 and focused on a patient group that received one of several MMJ SDEs between January 25, 2019, and May 26, 2022. Characteristics were first identified on a pilot sample (n=5), which were then iteratively reviewed to optimize for consistency. Following the pilot, 2 reviewers were assigned 200 randomly selected patients' medical records, with a third reviewer examining a subsample (n=30) to determine reliability. We then summarized the clinician- and patient-level features from 156 medical records with a table-format SDE that best captured MMJ information.
Results: We found the review protocol for medical records was feasible for those with minimal medical background to complete, with high interrater reliability (κ=0.966; P<.001; odds ratio 0.97, 95% CI 0.954-0.978). MMJ certification was largely documented by nurses and medical assistants (n=138, 88.5%) and typically within primary care settings (n=107, 68.6%). The SDE has 6 preset field prompts with heterogeneous documentation completion rates, including certifying conditions (n=146, 93.6%), product (n=145, 92.9%), authorized dispensary (n=137, 87.8%), active ingredient (n=130, 83.3%), certifying provider (n=96, 61.5%), and dosage (n=48, 30.8%). We found preset fields were overall well-recorded (mean 76.6%, SD 23.7% across all fields). Primary diagnostic codes recorded at documentation encounters varied, with the most frequent being routine examinations and testing (n=34, 21.8%), musculoskeletal or nervous conditions, and signs and symptoms not classified elsewhere (n=21, 13.5%).
Conclusions: This method of reviewing medical records yields high-quality data extraction that can serve as a model for other health record inquiries. Our evaluation showed relatively high completeness of SDE fields, primarily by clinical staff responsible for rooming patients, with an overview of conditions under which MMJ is documented. Improving the adoption and fidelity of SDE data collection may present a valuable data source for future research on patient MMJ use, treatment efficacy, and outcomes.
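The interrater reliability statistic reported in this abstract (Cohen's κ) compares observed agreement between two reviewers with the agreement expected by chance. The following is a minimal sketch of that calculation, assuming two reviewers who each assign one categorical label per record; the function name `cohen_kappa` and the example labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    # Undefined (division by zero) if both raters always use one identical label.
    return (observed - expected) / (1 - expected)


# Hypothetical labels for illustration only; not data from the study.
reviewer_1 = ["yes", "yes", "yes", "no", "no", "no", "yes", "no", "yes", "yes"]
reviewer_2 = ["yes", "yes", "no", "no", "no", "no", "yes", "no", "yes", "yes"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

With these illustrative labels the reviewers agree on 9 of 10 records, chance agreement is 0.5, and κ works out to 0.8.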
Affiliation(s)
- Donielle Beiler
  - Autism and Developmental Medicine Institute, Geisinger, Lewisburg, PA, United States
- Aanya Chopra
  - Center for Pharmacy Innovation and Outcomes, Geisinger, Danville, PA, United States
- Christina M Gregor
  - Center for Pharmacy Innovation and Outcomes, Geisinger, Danville, PA, United States
- Lorraine D Tusing
  - Center for Pharmacy Innovation and Outcomes, Geisinger, Danville, PA, United States
- Apoorva M Pradhan
  - Center for Pharmacy Innovation and Outcomes, Geisinger, Danville, PA, United States
- Katrina M Romagnoli
  - Center for Pharmacy Innovation and Outcomes, Geisinger, Danville, PA, United States
  - Department of Population Health Sciences, Geisinger, Danville, PA, United States
- Chadd K Kraus
  - Department of Emergency and Hospital Medicine, Lehigh Valley Health Network, Hazelton, PA, United States
- Brian J Piper
  - Center for Pharmacy Innovation and Outcomes, Geisinger, Danville, PA, United States
  - Department of Medical Education, Geisinger Commonwealth School of Medicine, Scranton, PA, United States
- Eric A Wright
  - Center for Pharmacy Innovation and Outcomes, Geisinger, Danville, PA, United States
  - Department of Bioethics and Decision Sciences, Geisinger, Danville, PA, United States
- Vanessa Troiani
  - Autism and Developmental Medicine Institute, Geisinger, Lewisburg, PA, United States
3
Zheng C, Ackerson B, Qiu S, Sy LS, Daily LIV, Song J, Qian L, Luo Y, Ku JH, Cheng Y, Wu J, Tseng HF. Natural Language Processing Versus Diagnosis Code-Based Methods for Postherpetic Neuralgia Identification: Algorithm Development and Validation. JMIR Med Inform 2024; 12:e57949. [PMID: 39254589 PMCID: PMC11407135 DOI: 10.2196/57949] [Received: 03/01/2024] [Revised: 07/02/2024] [Accepted: 07/08/2024] [Indexed: 09/11/2024] [Open Access]
Abstract
Background: Diagnosis codes and prescription data are used in algorithms to identify postherpetic neuralgia (PHN), a debilitating complication of herpes zoster (HZ). Because of the questionable accuracy of codes and prescription data, manual chart review is sometimes used to identify PHN in electronic health records (EHRs), which can be costly and time-consuming.
Objective: This study aims to develop and validate a natural language processing (NLP) algorithm for automatically identifying PHN from unstructured EHR data and to compare its performance with that of code-based methods.
Methods: This retrospective study used EHR data from Kaiser Permanente Southern California, a large integrated health care system that serves over 4.8 million members. The source population included members aged ≥50 years who received an incident HZ diagnosis and accompanying antiviral prescription between 2018 and 2020 and had ≥1 encounter within 90-180 days of the incident HZ diagnosis. The study team manually reviewed the EHR and identified PHN cases. For NLP development and validation, 500 and 800 random samples from the source population were selected, respectively. The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F-score, and Matthews correlation coefficient (MCC) of NLP and the code-based methods were evaluated using chart-reviewed results as the reference standard.
Results: The NLP algorithm identified PHN cases with a 90.9% sensitivity, 98.5% specificity, 82% PPV, and 99.3% NPV. The composite scores of the NLP algorithm were 0.89 (F-score) and 0.85 (MCC). The prevalences of PHN in the validation data were 6.9% (reference standard), 7.6% (NLP), and 5.4%-13.1% (code-based). The code-based methods achieved a 52.7%-61.8% sensitivity, 89.8%-98.4% specificity, 27.6%-72.1% PPV, and 96.3%-97.1% NPV. The F-scores and MCCs ranged between 0.45 and 0.59 and between 0.32 and 0.61, respectively.
Conclusions: The automated NLP-based approach identified PHN cases from the EHR with good accuracy. This method could be useful in population-based PHN research.
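The validation metrics named in this abstract (sensitivity, specificity, PPV, NPV, F-score, and MCC) can all be derived from a 2x2 confusion matrix of algorithm output against the chart-reviewed reference standard. The sketch below shows the standard formulas, assuming the F-score is the balanced F1 measure; the function name `binary_metrics` and the counts are hypothetical and are not the study's data.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard validation metrics from 2x2 confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    # F1: harmonic mean of PPV and sensitivity.
    f_score = 2 * ppv * sensitivity / (ppv + sensitivity)
    # Matthews correlation coefficient uses all four cells of the matrix.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f_score": f_score,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only; not taken from the study.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=140)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because MCC accounts for true negatives as well as the positive class, it is often reported alongside the F-score when the condition is rare, as PHN is in the validation sample described above.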
Affiliation(s)
- Chengyi Zheng
  - Department of Research & Evaluation, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA, 91101, United States, 1 626-986-8665, 1 626-564-7872
- Bradley Ackerson
  - South Bay Medical Center, Kaiser Permanente Southern California, Harbor City, CA, United States
- Sijia Qiu
  - Department of Research & Evaluation, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA, 91101, United States
- Lina S Sy
  - Department of Research & Evaluation, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA, 91101, United States
- Leticia I Vega Daily
  - Department of Research & Evaluation, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA, 91101, United States
- Jeannie Song
  - Department of Research & Evaluation, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA, 91101, United States
- Lei Qian
  - Department of Research & Evaluation, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA, 91101, United States
- Yi Luo
  - Department of Research & Evaluation, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA, 91101, United States
- Jennifer H Ku
  - Department of Research & Evaluation, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA, 91101, United States
- Yanjun Cheng
  - Department of Research & Evaluation, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA, 91101, United States
- Jun Wu
  - Department of Research & Evaluation, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA, 91101, United States
- Hung Fu Tseng
  - Department of Research & Evaluation, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA, 91101, United States
  - Kaiser Permanente Bernard J. Tyson School of Medicine, Pasadena, CA, United States
4
Wu JJ, Hauben M, Younus M. Current Approaches in Postapproval Vaccine Safety Studies Using Real-World Data: A Systematic Review of Published Literature. Clin Ther 2024; 46:555-564. [PMID: 39142925 DOI: 10.1016/j.clinthera.2024.06.005] [Received: 12/04/2023] [Revised: 05/06/2024] [Accepted: 06/05/2024] [Indexed: 08/16/2024]
Abstract
PURPOSE: Well-designed observational postmarketing studies using real-world data (RWD) are critical in supporting an evidence base and bolstering public confidence in vaccine safety. This systematic review presents current research methodologies in vaccine safety research in postapproval settings, technological advancements contributing to research resources and capabilities, and their major strengths and limitations.
METHODS: A comprehensive search was conducted using PubMed to identify relevant articles published from January 1, 2019, to December 31, 2022. Eligible studies were summarized overall by study design and other study characteristics (eg, country, vaccine studied, types of data source, and study population). An in-depth review of select studies representative of conventional or new designs, analytical approaches, or data collection methods was conducted to summarize current methods in vaccine safety research.
FINDINGS: Out of 977 articles screened for inclusion, 135 were reviewed. The review shows that recent advancements in scientific methods, digital technology, and analytic approaches have significantly contributed to postapproval vaccine safety studies using RWD. "Near real-time surveillance" using large datasets (via collaborative or distributed databases) has been used to facilitate rapid signal detection that complements passive surveillance. There was increasing appreciation for self-controlled case-only designs (self-controlled case series and self-controlled risk interval) to assess acute-onset safety outcomes; for artificial intelligence and natural language processing to improve outcome accuracy and study timeliness; and for emerging artificial intelligence-based analysis to capture adverse events from social media platforms.
IMPLICATIONS: Continued development in the area of vaccine safety research methodologies using RWD is warranted. The future of successful vaccine safety research, especially evaluation of rare safety events, is likely to comprise digital technologies including linking RWD networks, machine learning, and advanced analytic methods to generate rapid and robust real-world safety information.
Affiliation(s)
- Juan Joanne Wu
  - Safety Surveillance Research, Worldwide Medical and Safety, Pfizer Inc, New York, NY
- Manfred Hauben
  - Department of Family and Community Medicine, New York Medical College, Valhalla, NY and Truliant Consulting, Baltimore, Maryland
- Muhammad Younus
  - Safety Surveillance Research, Worldwide Medical and Safety, Pfizer Inc, New York, NY
5
Iqbal U, Hsu YHE, Celi LA, Li YCJ. Artificial intelligence in healthcare: Opportunities come with landmines. BMJ Health Care Inform 2024; 31:e101086. [PMID: 38839426 PMCID: PMC11163668 DOI: 10.1136/bmjhci-2024-101086] [Received: 04/03/2024] [Accepted: 05/02/2024] [Indexed: 06/07/2024] [Open Access]
Affiliation(s)
- Usman Iqbal
  - School of Population Health, Faculty of Medicine and Health, University of New South Wales (UNSW), Sydney, NSW, Australia
  - Global Health and Health Security Department, College of Public Health, Taipei Medical University, Taipei, Taiwan
  - International Center for Health Information and Technology, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Yi-Hsin Elsa Hsu
  - Biotechnology Executive Master's Degree in Business Administration (BioTech EMBA), Taipei Medical University, Taipei, Taiwan
  - School of Healthcare Administration, College of Management, Taipei Medical University, Taipei, Taiwan
  - International Ph.D. Program in BioTech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan
  - Department of Humanities in Medicine, College of Medicine, School of Medicine, Taipei Medical University, Taipei, Taiwan
- Leo Anthony Celi
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
  - Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Yu-Chuan Jack Li
  - International Center for Health Information and Technology, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
  - Graduate Institute of Biomedical Informatics, College of Medical Science & Technology, Taipei Medical University, Taipei, Taiwan
  - Department of Dermatology, Taipei Municipal Wanfang Hospital, Taipei, Taiwan
  - The International Medical Informatics Association (IMIA), Zürich, Switzerland
6
Zheng C, Lee MS, Bansal N, Go AS, Chen C, Harrison TN, Fan D, Allen A, Garcia E, Lidgard B, Singer D, An J. Identification of recurrent atrial fibrillation using natural language processing applied to electronic health records. European Heart Journal. Quality of Care & Clinical Outcomes 2024; 10:77-88. [PMID: 36997334 PMCID: PMC10785579 DOI: 10.1093/ehjqcco/qcad021] [Received: 01/10/2023] [Revised: 03/14/2023] [Accepted: 03/29/2023] [Indexed: 04/01/2023]
Abstract
AIMS: This study aimed to develop and apply natural language processing (NLP) algorithms to identify recurrent atrial fibrillation (AF) episodes following rhythm control therapy initiation using electronic health records (EHRs).
METHODS AND RESULTS: We included adults with new-onset AF who initiated rhythm control therapies (ablation, cardioversion, or antiarrhythmic medication) within two US integrated healthcare delivery systems. A code-based algorithm identified potential AF recurrence using diagnosis and procedure codes. An automated NLP algorithm was developed and validated to capture AF recurrence from electrocardiograms, cardiac monitor reports, and clinical notes. Compared with the reference standard cases confirmed by physicians' adjudication, the F-scores, sensitivity, and specificity were all above 0.90 for the NLP algorithms at both sites. We applied the NLP and code-based algorithms to patients with incident AF (n = 22 970) during the 12 months after initiating rhythm control therapy. Applying the NLP algorithms, the percentages of patients with AF recurrence for sites 1 and 2 were 60.7% and 69.9% (ablation), 64.5% and 73.7% (cardioversion), and 49.6% and 55.5% (antiarrhythmic medication), respectively. In comparison, the percentages of patients with code-identified AF recurrence for sites 1 and 2 were 20.2% and 23.7% for ablation, 25.6% and 28.4% for cardioversion, and 20.0% and 27.5% for antiarrhythmic medication, respectively.
CONCLUSION: When compared with a code-based approach alone, this study's high-performing automated NLP method identified significantly more patients with recurrent AF. The NLP algorithms could enable efficient evaluation of treatment effectiveness of AF therapies in large populations and help develop tailored interventions.
Affiliation(s)
- Chengyi Zheng
  - Research and Evaluation Department, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA 91101, USA
- Ming-sum Lee
  - Department of Cardiology, Kaiser Permanente Los Angeles Medical Center, Los Angeles, CA 90027, USA
- Nisha Bansal
  - Kidney Research Institute, Division of Nephrology, University of Washington, Seattle, WA 98104, USA
- Alan S Go
  - Division of Research, Kaiser Permanente Northern California, Oakland, CA 94612, USA
  - Department of Health Systems Science, Kaiser Permanente Bernard J. Tyson School of Medicine, Pasadena, CA 91101, USA
  - Department of Medicine and Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA 94158, USA
  - Departments of Medicine, Stanford University, Palo Alto, CA 94305, USA
- Cheng Chen
  - Department of Cardiology, Kaiser Permanente Fontana Medical Center, Fontana, CA 92335, USA
- Teresa N Harrison
  - Research and Evaluation Department, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA 91101, USA
- Dongjie Fan
  - Division of Research, Kaiser Permanente Northern California, Oakland, CA 94612, USA
- Amanda Allen
  - Division of Research, Kaiser Permanente Northern California, Oakland, CA 94612, USA
- Elisha Garcia
  - Division of Research, Kaiser Permanente Northern California, Oakland, CA 94612, USA
- Ben Lidgard
  - Department of Health Systems Science, Kaiser Permanente Bernard J. Tyson School of Medicine, Pasadena, CA 91101, USA
- Daniel Singer
  - Clinical Epidemiology Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Jaejin An
  - Research and Evaluation Department, Kaiser Permanente Southern California, 100 S Los Robles Ave, 2nd Floor, Pasadena, CA 91101, USA
  - Department of Health Systems Science, Kaiser Permanente Bernard J. Tyson School of Medicine, Pasadena, CA 91101, USA