1
Painter JL, Ramcharran D, Bate A. Perspective review: Will generative AI make common data models obsolete in future analyses of distributed data networks? Ther Adv Drug Saf 2025; 16:20420986251332743. [PMID: 40290511 PMCID: PMC12033412 DOI: 10.1177/20420986251332743] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2024] [Accepted: 03/19/2025] [Indexed: 04/30/2025] Open
Abstract
Integrating real-world healthcare data is challenging due to diverse formats and terminologies, making standardization resource-intensive. While Common Data Models (CDMs) facilitate interoperability, they often cause information loss, exhibit semantic inconsistencies, and are labor-intensive to implement and update. We explore how generative artificial intelligence (GenAI), especially large language models (LLMs), could make CDMs obsolete in quantitative healthcare data analysis by interpreting natural language queries and generating code, enabling direct interaction with raw data. Knowledge graphs (KGs) standardize relationships and semantics across heterogeneous data, preserving integrity. This perspective review proposes a fourth generation of distributed data network analysis, building on previous generations categorized by their approach to data standardization and utilization. It emphasizes the potential of GenAI to overcome the limitations of CDMs with GenAI-enabled access, KGs, and automatic code generation. A data commons may further enhance this capability, and KGs may well be needed to enable effective GenAI. Addressing privacy, security, and governance is critical; any new method must ensure protections comparable to CDM-based models. Our approach would aim to enable efficient, real-time analyses across diverse datasets and enhance patient safety. We recommend prioritizing research to assess how GenAI can transform quantitative healthcare data analysis by overcoming current limitations.
Affiliation(s)
- Andrew Bate
  - GSK, London, UK
  - London School of Hygiene and Tropical Medicine, London, UK
2
Crisafulli S, Bate A, Brown JS, Candore G, Chandler RE, Hammad TA, Lane S, Maro JC, Norén GN, Pariente A, Russom M, Salas M, Segec A, Shakir S, Spini A, Toh S, Tuccori M, van Puijenbroek E, Trifirò G. Interplay of Spontaneous Reporting and Longitudinal Healthcare Databases for Signal Management: Position Statement from the Real-World Evidence and Big Data Special Interest Group of the International Society of Pharmacovigilance. Drug Saf 2025:10.1007/s40264-025-01548-3. [PMID: 40223041 DOI: 10.1007/s40264-025-01548-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/25/2025] [Indexed: 04/15/2025]
Abstract
Signal management, defined as the set of activities from signal detection to recommendations for action, is conducted using different data sources and leveraging data from spontaneous reporting databases (SRDs), which represent the cornerstone of pharmacovigilance. However, the exponentially increasing generation and availability of real-world data collected in longitudinal healthcare databases (LHDs), along with the rapid evolution of artificial intelligence-based algorithms and other advanced analytical methods, offers a wide range of opportunities to complement SRDs throughout all stages of signal management, especially signal detection. Integrating information derived from SRDs and LHDs may reduce their respective limitations, thus potentially enhancing post-marketing surveillance. The aim of this position statement is to critically evaluate the complementary role of SRDs and LHDs in signal management, exploring the potential benefits and challenges in integrating information coming from these two data sources. Furthermore, we presented successful cases of the interplay between SRDs and LHDs for signal management, along with future opportunities and directions to improve such interplay.
Affiliation(s)
- Salvatore Crisafulli
  - Department of Diagnostics and Public Health, University of Verona, P.le L.A. Scuro 10, 37124, Verona, Italy
- Andrew Bate
  - Global Safety, GSK, Brentford, UK
  - Department of Non-Communicable Epidemiology, London School of Hygiene and Tropical Medicine, London, UK
- Jeffrey Stuart Brown
  - TriNetX, Cambridge, MA, USA
  - Department of Population Medicine, Harvard Medical School, Boston, MA, USA
- Tarek A Hammad
  - Takeda Development Center Americas, Inc., Cambridge, MA, USA
- Samantha Lane
  - Drug Safety Research Unit, Southampton, UK
  - University of Portsmouth, Portsmouth, UK
- Antoine Pariente
  - Université de Bordeaux, INSERM, BPH, Team AHeaD, U1219, 33000, Bordeaux, France
  - Service de Pharmacologie Médicale, CHU de Bordeaux, INSERM, U1219, 33000, Bordeaux, France
- Mulugeta Russom
  - National Medicines and Food Administration, Ministry of Health, Asmara, Eritrea
  - Department of Medical Informatics, Erasmus Medical Center, Rotterdam, The Netherlands
- Maribel Salas
  - Bayer Pharmaceuticals Inc., Whippany, NJ, USA
  - University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Andrej Segec
  - Data Analytics and Methods Task Force, European Medicines Agency, Amsterdam, The Netherlands
- Saad Shakir
  - Drug Safety Research Unit, Southampton, UK
  - University of Portsmouth, Portsmouth, UK
- Andrea Spini
  - Department of Diagnostics and Public Health, University of Verona, P.le L.A. Scuro 10, 37124, Verona, Italy
- Sengwee Toh
  - Department of Population Medicine, Harvard Medical School, Boston, MA, USA
- Marco Tuccori
  - Department of Diagnostics and Public Health, University of Verona, P.le L.A. Scuro 10, 37124, Verona, Italy
- Eugène van Puijenbroek
  - Netherlands Pharmacovigilance Centre Lareb, 's-Hertogenbosch, The Netherlands
  - PharmacoTherapy, Epidemiology and Economics, University of Groningen, Groningen Research Institute of Pharmacy, Groningen, The Netherlands
- Gianluca Trifirò
  - Department of Diagnostics and Public Health, University of Verona, P.le L.A. Scuro 10, 37124, Verona, Italy.
3
Stammers M, Ramgopal B, Owusu Nimako A, Vyas A, Nouraei R, Metcalf C, Batchelor J, Shepherd J, Gwiggner M. A foundation systematic review of natural language processing applied to gastroenterology & hepatology. BMC Gastroenterol 2025; 25:58. [PMID: 39915703 PMCID: PMC11800601 DOI: 10.1186/s12876-025-03608-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2024] [Accepted: 01/13/2025] [Indexed: 02/11/2025] Open
Abstract
OBJECTIVE This review assesses the progress of NLP in gastroenterology to date, grades the robustness of the methodology, exposes the field to a new generation of authors, and highlights opportunities for future research. DESIGN Seven scholarly databases (ACM Digital Library, arXiv, Embase, IEEE Xplore, PubMed, Scopus and Google Scholar) were searched for studies published between 2015 and 2023 that met the inclusion criteria. Studies lacking a description of appropriate validation or NLP methods were excluded, as were studies unavailable in English, those focused on non-gastrointestinal diseases and those that were duplicates. Two independent reviewers extracted study information, clinical/algorithm details, and relevant outcome data. Methodological quality and bias risks were appraised using a checklist of quality indicators for NLP studies. RESULTS Fifty-three studies were identified utilising NLP in endoscopy, inflammatory bowel disease, gastrointestinal bleeding, liver and pancreatic disease. Colonoscopy was the focus of 21 (38.9%) studies; 13 (24.1%) focused on liver disease, 7 (13.0%) on inflammatory bowel disease, 4 (7.4%) on gastroscopy, 4 (7.4%) on pancreatic disease and 2 (3.7%) on endoscopic sedation/ERCP and gastrointestinal bleeding. Only 30 (56.6%) of the studies reported patient demographics, and only 13 (24.5%) had a low risk of validation bias. Thirty-five (66%) studies mentioned generalisability, but only 5 (9.4%) mentioned explainability or shared code/models. CONCLUSION NLP can unlock substantial clinical information from free-text notes stored in EPRs and is already being used, particularly to interpret colonoscopy and radiology reports. However, the models we have thus far lack transparency, leading to duplication, bias, and doubts about generalisability. Therefore, greater clinical engagement, collaboration, and open sharing of appropriate datasets and code are needed.
Affiliation(s)
- Matthew Stammers
  - University Hospital Southampton, Tremona Road, Southampton, SO16 6YD, UK.
  - Southampton Emerging Therapies and Technologies (SETT) Centre, Southampton, SO16 6YD, UK.
  - Clinical Informatics Research Unit (CIRU), Coxford Road, Southampton, SO16 5AF, UK.
  - University of Southampton, Southampton, SO17 1BJ, UK.
- Anand Vyas
  - University Hospital Southampton, Tremona Road, Southampton, SO16 6YD, UK
- Reza Nouraei
  - Clinical Informatics Research Unit (CIRU), Coxford Road, Southampton, SO16 5AF, UK
  - University of Southampton, Southampton, SO17 1BJ, UK
  - Queen's Medical Centre, ENT Department, Nottingham, NG7 2UH, UK
- Cheryl Metcalf
  - University of Southampton, Southampton, SO17 1BJ, UK
  - School of Healthcare Enterprise and Innovation, University of Southampton, University of Southampton Science Park, Enterprise Road, Chilworth, Southampton, SO16 7NS, UK
- James Batchelor
  - Clinical Informatics Research Unit (CIRU), Coxford Road, Southampton, SO16 5AF, UK
  - University of Southampton, Southampton, SO17 1BJ, UK
- Jonathan Shepherd
  - Southampton Health Technologies Assessment Centre (SHTAC), Enterprise Road, Alpha House, Southampton, SO16 7NS, England
- Markus Gwiggner
  - University Hospital Southampton, Tremona Road, Southampton, SO16 6YD, UK
  - University of Southampton, Southampton, SO17 1BJ, UK
4
Bate A, Stegmann JU. Safety of medicines and vaccines - building next generation capability. Trends Pharmacol Sci 2021; 42:1051-1063. [PMID: 34635346 DOI: 10.1016/j.tips.2021.09.007] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/25/2021] [Revised: 09/10/2021] [Accepted: 09/14/2021] [Indexed: 10/20/2022]
Abstract
The systematic safety surveillance of real-world use of medicinal products and related activities (pharmacovigilance) started in earnest as a scientific field only in the 1960s. While developments have occurred over the past 50 years, adding to its complexity and sophistication, the extent to which some of these advances have positively impacted the capability for ensuring patient safety is questionable. We review how the conduct of safety surveillance has changed, highlight recent scientific advances, and argue how they need to be harnessed to enhance pharmacovigilance in the future. Specifically, we describe five changes that we believe should and will need to happen globally in the coming years: (i) better, more diverse data used for safety; (ii) the switch from manual activities to automation; (iii) removal of limited value, extraneous transactional activities and replacement with sharpened focus on scientific efforts to improve patient safety; (iv) patient-involved and focussed safety; and (v) personalised safety.
Affiliation(s)
- Andrew Bate
  - GSK, London, UK
  - London School of Hygiene and Tropical Medicine, University of London, London, UK
  - New York University, New York, NY, USA.
5
Brown JP, Douglas IJ, Hanif S, Thwaites RMA, Bate A. Measuring the Effectiveness of Real-World Evidence to Ensure Appropriate Impact. Value Health 2021; 24:1241-1244. [PMID: 34452702 DOI: 10.1016/j.jval.2021.03.020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/25/2021] [Revised: 03/22/2021] [Accepted: 03/25/2021] [Indexed: 06/13/2023]
Abstract
The value of real-world evidence (RWE) in medicines regulation and health technology assessment has been increasingly emphasized. Nevertheless, although RWE is increasingly used, there has been limited systematic evidence of its value. A recent study that examined the role and impact of RWE in regulatory assessments conducted through the European Medicines Agency provided such evidence. Results of the study demonstrated RWE was important to decision making, particularly for certain questions such as the quantification of adverse events, the evaluation of risk minimization measures, and the assessment of product usage. The study suggested, however, that in many of the assessments further RWE would have been valuable and concluded that RWE has, as yet, played a limited role in hypothesis generation and in the assessment of medication effectiveness. This study had been possible only because of the transparency of the European Medicines Agency decision making. Ensuring transparency of RWE evidence collection, study design and conduct, and of decision making based on this evidence will facilitate further development of the uses and value of RWE. Keywords: benefit-risk assessment; medicines regulation; real-world evidence; regulatory decision making.
Affiliation(s)
- Jeremy P Brown
  - Department of Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, England, UK.
- Ian J Douglas
  - Department of Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, England, UK
- Andrew Bate
  - Department of Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, England, UK
  - Global Safety, GSK, Brentford, Middlesex, England, UK
6
Barbieri JS, Shin DB, Wang S, Margolis DJ, Takeshita J. Association of Race/Ethnicity and Sex With Differences in Health Care Use and Treatment for Acne. JAMA Dermatol 2020; 156:312-319. [PMID: 32022834 DOI: 10.1001/jamadermatol.2019.4818] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Indexed: 12/14/2022]
Abstract
Importance Our understanding of potential racial/ethnic, sex, and other differences in health care use and treatment for acne is limited. Objective To identify potential disparities in acne care by evaluating factors associated with health care use and specific treatments for acne. Design, Setting, and Participants This retrospective cohort study used the Optum deidentified electronic health record data set to identify patients treated for acne from January 1, 2007, to June 30, 2017. Patients had at least 1 International Classification of Diseases, Ninth Revision (ICD-9) or International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10) code for acne and at least 1 year of continuous enrollment after the first diagnosis of acne. Data analysis was performed from September 1, 2019, to November 20, 2019. Main Outcomes and Measures Multivariable regression was used to quantify associations between basic patient demographic and socioeconomic characteristics and the outcomes of health care use and treatment for acne during 1 year of follow-up. Results A total of 29 928 patients (median [interquartile range] age, 20.2 [15.4-34.9] years; 19 127 [63.9%] female; 20 310 [67.9%] white) met the inclusion criteria for the study. Compared with non-Hispanic white patients, non-Hispanic black patients were more likely to be seen by a dermatologist (odds ratio [OR], 1.20; 95% CI, 1.09-1.31) but received fewer prescriptions for acne medications (incidence rate ratio, 0.89; 95% CI, 0.84-0.95). Of the acne treatment options, non-Hispanic black patients were more likely to receive prescriptions for topical retinoids (OR, 1.25; 95% CI, 1.14-1.38) and topical antibiotics (OR, 1.35; 95% CI, 1.21-1.52) and less likely to receive prescriptions for oral antibiotics (OR, 0.80; 95% CI, 0.72-0.87), spironolactone (OR, 0.68; 95% CI, 0.49-0.94), and isotretinoin (OR, 0.39; 95% CI, 0.23-0.65) than non-Hispanic white patients. 
Male patients were more likely to be prescribed isotretinoin than female patients (OR, 2.44; 95% CI, 2.01-2.95). Compared with patients with commercial insurance, those with Medicaid were less likely to see a dermatologist (OR, 0.46; 95% CI, 0.41-0.52) or to be prescribed topical retinoids (OR, 0.82; 95% CI, 0.73-0.92), oral antibiotics (OR, 0.87; 95% CI, 0.79-0.97), spironolactone (OR, 0.50; 95% CI, 0.31-0.80), and isotretinoin (OR, 0.43; 95% CI, 0.25-0.75). Conclusions and Relevance The findings identify racial/ethnic, sex, and insurance-based differences in health care use and prescribing patterns for acne that are independent of other sociodemographic factors and suggest potential disparities in acne care. In particular, the study found underuse of systemic therapies among racial/ethnic minorities and isotretinoin among female patients with acne. Further study is needed to confirm and understand the reasons for these differences.
Affiliation(s)
- John S Barbieri
  - Department of Dermatology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia
- Daniel B Shin
  - Department of Dermatology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia
- Shiyu Wang
  - Department of Dermatology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia
- David J Margolis
  - Department of Dermatology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia
  - Center for Clinical Epidemiology and Biostatistics, Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia
- Junko Takeshita
  - Department of Dermatology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia
  - Center for Clinical Epidemiology and Biostatistics, Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia
7
Barbieri JS, Wang S, Ogdie AR, Shin DB, Takeshita J. Age-appropriate cancer screening: A cohort study of adults with psoriasis prescribed biologics, adults in the general population, and adults with hypertension. J Am Acad Dermatol 2020; 84:1602-1609. [PMID: 33470207 DOI: 10.1016/j.jaad.2020.10.045] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/29/2020] [Revised: 09/23/2020] [Accepted: 10/18/2020] [Indexed: 12/15/2022]
Abstract
BACKGROUND Psoriasis is associated with increased risk of developing and dying from cancer. OBJECTIVE To evaluate whether psoriasis patients who are prescribed biologics receive the recommended screening for cervical, breast, and colon cancer. METHODS We conducted a retrospective cohort study using the Optum deidentified Electronic Health Record data set. Incidence rates for cervical, breast, and colon cancer screening were compared between psoriasis patients who were prescribed biologics and 2 matched comparator cohorts: general patient population and patients being managed for hypertension. Multivariable Cox proportional hazards regression was performed to assess for differences in the rates of cancer screening. RESULTS Compared with those in the general population without psoriasis, psoriasis patients who were prescribed biologics had higher screening rates for cervical cancer (adjusted hazard ratio [aHR] 1.09; 95% confidence interval [CI] 1.02-1.16) and colon cancer (aHR 1.10; 95% CI 1.02-1.18). Compared with those with hypertension, patients with psoriasis who were prescribed biologics had lower screening rates for breast cancer (aHR 0.88; 95% CI 0.83-0.94) and colon cancer (aHR 0.89; 95% CI 0.83-0.95). CONCLUSIONS AND RELEVANCE Patients with psoriasis who are prescribed biologic therapies may not be receiving adequate age-appropriate cancer screening, especially for breast and colon cancer.
Affiliation(s)
- John S Barbieri
  - Department of Dermatology, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania
- Shiyu Wang
  - Department of Dermatology, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania
- Alexis R Ogdie
  - Division of Rheumatology, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania
  - Center for Clinical Epidemiology and Biostatistics, Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania
- Daniel B Shin
  - Department of Dermatology, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania
- Junko Takeshita
  - Department of Dermatology, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania
  - Center for Clinical Epidemiology and Biostatistics, Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania.
8
Bate A, Hobbiger SF. Artificial Intelligence, Real-World Automation and the Safety of Medicines. Drug Saf 2020; 44:125-132. [PMID: 33026641 DOI: 10.1007/s40264-020-01001-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Accepted: 09/08/2020] [Indexed: 12/16/2022]
Abstract
Despite huge technological advances in the capabilities to capture, store, link and analyse data electronically, there has been some but limited impact on routine pharmacovigilance. We discuss emerging research in the use of artificial intelligence, machine learning and automation across the pharmacovigilance lifecycle including pre-licensure. Reasons are provided on why adoption is challenging and we also provide a perspective on changes needed to accelerate adoption, and thereby improve patient safety. Last, we make clear that while technologies could be superimposed on existing pharmacovigilance processes for incremental improvements, these great societal advances in data and technology also provide us with a timely opportunity to reconsider everything we do in pharmacovigilance operations to maximise the benefit of these advances.
Affiliation(s)
- Andrew Bate
  - Clinical Safety and Pharmacovigilance, GSK, 980 Great West Road, Brentford, Middlesex, TW8 9GS, UK.
  - Department of Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel St, London, WC1E 7HT, UK.
- Steve F Hobbiger
  - Clinical Safety and Pharmacovigilance, GSK, 980 Great West Road, Brentford, Middlesex, TW8 9GS, UK
9
Crowson MG, Hamour A, Lin V, Chen JM, Chan TCY. Machine learning for pattern detection in cochlear implant FDA adverse event reports. Cochlear Implants Int 2020; 21:313-322. [DOI: 10.1080/14670100.2020.1784569] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 02/06/2023]
Affiliation(s)
- Matthew G. Crowson
  - Department of Otolaryngology-HNS, Sunnybrook Health Sciences Center, University of Toronto, Toronto, Ontario
  - Department of Mechanical & Industrial Engineering, University of Toronto, Toronto, Ontario
- Amr Hamour
  - Department of Otolaryngology-HNS, Sunnybrook Health Sciences Center, University of Toronto, Toronto, Ontario
- Vincent Lin
  - Department of Otolaryngology-HNS, Sunnybrook Health Sciences Center, University of Toronto, Toronto, Ontario
- Joseph M. Chen
  - Department of Otolaryngology-HNS, Sunnybrook Health Sciences Center, University of Toronto, Toronto, Ontario
- Timothy C. Y. Chan
  - Department of Mechanical & Industrial Engineering, University of Toronto, Toronto, Ontario
10
Gruber S, Krakower D, Menchaca JT, Hsu K, Hawrusik R, Maro JC, Cocoros NM, Kruskal BA, Wilson IB, Mayer KH, Klompas M. Using electronic health records to identify candidates for human immunodeficiency virus pre-exposure prophylaxis: An application of super learning to risk prediction when the outcome is rare. Stat Med 2020; 39:3059-3073. [PMID: 32578905 DOI: 10.1002/sim.8591] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 05/16/2019] [Revised: 04/13/2020] [Accepted: 05/07/2020] [Indexed: 01/08/2023]
Abstract
Human immunodeficiency virus (HIV) pre-exposure prophylaxis (PrEP) protects high risk patients from becoming infected with HIV. Clinicians need help to identify candidates for PrEP based on information routinely collected in electronic health records (EHRs). The greatest statistical challenge in developing a risk prediction model is that acquisition is extremely rare. METHODS Data consisted of 180 covariates (demographic, diagnoses, treatments, prescriptions) extracted from records on 399 385 patients (150 cases) seen at Atrius Health (2007-2015), a clinical network in Massachusetts. Super learner is an ensemble machine learning algorithm that uses k-fold cross validation to evaluate and combine predictions from a collection of algorithms. We trained 42 variants of sophisticated algorithms, using different sampling schemes that more evenly balanced the ratio of cases to controls. We compared super learner's cross validated area under the receiver operating curve (cv-AUC) with that of each individual algorithm. RESULTS The least absolute shrinkage and selection operator (lasso) using a 1:20 class ratio outperformed the super learner (cv-AUC = 0.86 vs 0.84). A traditional logistic regression model restricted to 23 clinician-selected main terms was slightly inferior (cv-AUC = 0.81). CONCLUSION Machine learning was successful at developing a model to predict 1-year risk of acquiring HIV based on a physician-curated set of predictors extracted from EHRs.
Affiliation(s)
- Susan Gruber
  - Putnam Data Sciences, LLC, Cambridge, Massachusetts, USA
- Douglas Krakower
  - Division of Infectious Diseases, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
  - The Fenway Institute, Fenway Health, Boston, Massachusetts, USA
  - Harvard Medical School, Boston, Massachusetts, USA
  - Department of Population Medicine, Harvard Medical School, Boston, Massachusetts, USA
- John T Menchaca
  - Department of Population Medicine, Harvard Medical School, Boston, Massachusetts, USA
- Katherine Hsu
  - Massachusetts Department of Public Health, Boston, Massachusetts, USA
  - Department of Pediatrics, Boston Medical Center, Boston, Massachusetts, USA
- Rebecca Hawrusik
  - Massachusetts Department of Public Health, Boston, Massachusetts, USA
- Judith C Maro
  - Department of Population Medicine, Harvard Medical School, Boston, Massachusetts, USA
- Noelle M Cocoros
  - Department of Population Medicine, Harvard Medical School, Boston, Massachusetts, USA
- Ira B Wilson
  - Department of Health Services, Policy and Practice, Brown University, Providence, Rhode Island, USA
- Kenneth H Mayer
  - Division of Infectious Diseases, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
  - The Fenway Institute, Fenway Health, Boston, Massachusetts, USA
  - Harvard Medical School, Boston, Massachusetts, USA
- Michael Klompas
  - Department of Population Medicine, Harvard Medical School, Boston, Massachusetts, USA
  - Division of Infectious Diseases, Brigham and Women's Hospital, Boston, Massachusetts, USA
11
Wehner MR, Micheletti R, Noe MH, Linos E, Margolis DJ, Naik HB. Hidradenitis suppurativa encounters in a national electronic health record database notable for low dermatology utilization, infrequent biologic prescriptions, and frequent opiate prescriptions. J Am Acad Dermatol 2019; 82:1239-1241. [PMID: 31866261 DOI: 10.1016/j.jaad.2019.12.030] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 06/11/2019] [Revised: 10/28/2019] [Accepted: 12/12/2019] [Indexed: 11/25/2022]
Affiliation(s)
- Megan H Noe
  - Department of Dermatology, University of Pennsylvania, Philadelphia
- Eleni Linos
  - Department of Dermatology, Stanford University, Stanford, California
- David J Margolis
  - Department of Dermatology, University of Pennsylvania, Philadelphia
- Haley B Naik
  - Department of Dermatology, University of California, San Francisco.
12
Lack of association of biologic therapy for psoriasis with psychiatric illness: An electronic medical records cohort study. J Am Acad Dermatol 2019; 81:709-716. [DOI: 10.1016/j.jaad.2019.04.055] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 01/15/2019] [Revised: 04/15/2019] [Accepted: 04/21/2019] [Indexed: 01/05/2023]
13
Wang SV, Patterson OV, Gagne JJ, Brown JS, Ball R, Jonsson P, Wright A, Zhou L, Goettsch W, Bate A. Transparent Reporting on Research Using Unstructured Electronic Health Record Data to Generate ‘Real World’ Evidence of Comparative Effectiveness and Safety. Drug Saf 2019; 42:1297-1309. [DOI: 10.1007/s40264-019-00851-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/17/2022]
14
Barbieri JS, Shin DB, Wang S, Margolis DJ, Takeshita J. The clinical utility of laboratory monitoring during isotretinoin therapy for acne and changes to monitoring practices over time. J Am Acad Dermatol 2019; 82:72-79. [PMID: 31228528 DOI: 10.1016/j.jaad.2019.06.025] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Received: 04/20/2019] [Revised: 06/10/2019] [Accepted: 06/12/2019] [Indexed: 11/18/2022]
Abstract
BACKGROUND As a result of concerns about hypertriglyceridemia, liver enzyme abnormalities, and leukopenia during isotretinoin therapy for acne, patients are often monitored closely with routine laboratory assessments, although the value of this practice has been questioned. METHODS We conducted a cohort study of patients receiving isotretinoin for acne between January 1, 2008, and June 30, 2017, using the OptumInsights Electronic Health Record Database (Optum, Eden Prairie, MN) to evaluate the frequency of laboratory abnormalities. Poisson regression was used to evaluate for changes to the frequency of routine laboratory monitoring over time. RESULTS Among 1863 patients treated with isotretinoin, grade 3 or greater triglyceride and liver function testing abnormalities were noted in fewer than 1% and 0.5% of patients screened, respectively. No grade 3 or greater cholesterol or complete blood count abnormalities were observed. There were no meaningful changes in the frequency of laboratory monitoring over time. LIMITATIONS Limitations include that we are unable to evaluate the clinical notes to understand the exact clinical decision making when clinicians encountered abnormal laboratory values. CONCLUSION Although laboratory abnormalities are rare and often do not influence management, frequent laboratory monitoring remains a common practice. There are opportunities to improve the quality of care among patients being treated with isotretinoin for acne by reducing the frequency of lipid and liver function monitoring and by eliminating complete blood count monitoring.
Affiliation(s)
- John S Barbieri
- Department of Dermatology, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania.
- Daniel B Shin
- Department of Dermatology, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania
- Shiyu Wang
- Department of Dermatology, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania
- David J Margolis
- Department of Dermatology, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania; Center for Clinical Epidemiology and Biostatistics, Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania
- Junko Takeshita
- Department of Dermatology, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania; Center for Clinical Epidemiology and Biostatistics, Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania
|
15
|
Shortreed SM, Cook AJ, Coley RY, Bobb JF, Nelson JC. Challenges and Opportunities for Using Big Health Care Data to Advance Medical Science and Public Health. Am J Epidemiol 2019; 188:851-861. [PMID: 30877288 DOI: 10.1093/aje/kwy292] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 10/04/2018] [Accepted: 12/20/2018] [Indexed: 12/14/2022] Open
Abstract
Methodological advancements in epidemiology, biostatistics, and data science have strengthened the research world's ability to use data captured from electronic health records (EHRs) to address pressing medical questions, but gaps remain. We describe methods investments that are needed to curate EHR data toward research quality and to integrate complementary data sources when EHR data alone are insufficient for research goals. We highlight new methods and directions for improving the integrity of medical evidence generated from pragmatic trials, observational studies, and predictive modeling. We also discuss needed methods contributions to further ease data sharing across multisite EHR data networks. Throughout, we identify opportunities for training and for bolstering collaboration among subject matter experts, methodologists, practicing clinicians, and health system leaders to help ensure that methods problems are identified and resulting advances are translated into mainstream research practice more quickly.
Affiliation(s)
- Susan M Shortreed
- Biostatistics Unit, Kaiser Permanente Washington Health Research Institute, Seattle, Washington
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, Washington
- Andrea J Cook
- Biostatistics Unit, Kaiser Permanente Washington Health Research Institute, Seattle, Washington
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, Washington
- R Yates Coley
- Biostatistics Unit, Kaiser Permanente Washington Health Research Institute, Seattle, Washington
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, Washington
- Jennifer F Bobb
- Biostatistics Unit, Kaiser Permanente Washington Health Research Institute, Seattle, Washington
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, Washington
- Jennifer C Nelson
- Biostatistics Unit, Kaiser Permanente Washington Health Research Institute, Seattle, Washington
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, Washington
|
16
|
Bate A. Guidance to reinforce the credibility of health care database studies and ensure their appropriate impact. Pharmacoepidemiol Drug Saf 2017; 26:1013-1017. [PMID: 28913965 DOI: 10.1002/pds.4305] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 08/01/2017] [Revised: 08/07/2017] [Accepted: 08/07/2017] [Indexed: 12/28/2022]
Affiliation(s)
- Andrew Bate
- Pfizer, Walton Oaks, UK; New York University, New York, USA
|
17
|
Trifirò G, Sultana J, Bate A. From Big Data to Smart Data for Pharmacovigilance: The Role of Healthcare Databases and Other Emerging Sources. Drug Saf 2018; 41:143-149. [PMID: 28840504 DOI: 10.1007/s40264-017-0592-4] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Indexed: 12/15/2022]
Abstract
In the last decade 'big data' has become a buzzword used in several industrial sectors, including but not limited to telephony, finance and healthcare. Despite its popularity, it is not always clear what big data refers to exactly. Big data has become a very popular topic in healthcare, where the term primarily refers to the vast and growing volumes of computerized medical information available in the form of electronic health records, administrative or health claims data, disease and drug monitoring registries and so on. This kind of data is generally collected routinely during administrative processes and clinical practice by different healthcare professionals: from doctors recording their patients' medical history, drug prescriptions or medical claims to pharmacists registering dispensed prescriptions. For a long time, this data accumulated without its value being fully recognized and leveraged. Today big data has an important place in healthcare, including in pharmacovigilance. The expanding role of big data in pharmacovigilance includes signal detection, substantiation and validation of drug or vaccine safety signals, and increasingly new sources of information such as social media are also being considered. The aim of the present paper is to discuss the uses of big data for drug safety post-marketing assessment.
Affiliation(s)
- Gianluca Trifirò
- Department of Biomedical and Dental Sciences and Morpho-Functional Imaging, University of Messina, Messina, Italy.
- Department of Medical Informatics, Erasmus Medical Centre, Rotterdam, The Netherlands.
- Janet Sultana
- Department of Biomedical and Dental Sciences and Morpho-Functional Imaging, University of Messina, Messina, Italy
- Department of Medical Informatics, Erasmus Medical Centre, Rotterdam, The Netherlands
- Andrew Bate
- Epidemiology Group Lead, Analytics, Worldwide Safety, Pfizer, Tadworth, UK
- Department of Clinical Pharmacology, New York University (NYU), New York, USA
|
18
|
Ball R, Toh S, Nolan J, Haynes K, Forshee R, Botsis T. Evaluating automated approaches to anaphylaxis case classification using unstructured data from the FDA Sentinel System. Pharmacoepidemiol Drug Saf 2018; 27:1077-1084. [DOI: 10.1002/pds.4645] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 12/31/2017] [Revised: 07/03/2018] [Accepted: 08/01/2018] [Indexed: 11/08/2022]
Affiliation(s)
- Robert Ball
- Office of Surveillance and Epidemiology; Center for Drug Evaluation and Research, FDA; Silver Spring MD USA
- Sengwee Toh
- Department of Population Medicine; Harvard Medical School and Harvard Pilgrim Health Care Institute; Boston MA USA
- Jamie Nolan
- Department of Population Medicine; Harvard Medical School and Harvard Pilgrim Health Care Institute; Boston MA USA
- Kevin Haynes
- Translational Research for Affordability and Quality; HealthCore, Inc.; Wilmington DE USA
- Richard Forshee
- Office of Biostatistics and Epidemiology; Center for Biologics Evaluation and Research, FDA; Silver Spring MD USA
- Taxiarchis Botsis
- Office of Biostatistics and Epidemiology; Center for Biologics Evaluation and Research, FDA; Silver Spring MD USA
|
19
|
Kennell TI, Willig JH, Cimino JJ. Clinical Informatics Researcher's Desiderata for the Data Content of the Next Generation Electronic Health Record. Appl Clin Inform 2017; 8:1159-1172. [PMID: 29270955 DOI: 10.4338/aci-2017-06-r-0101] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 01/29/2023] Open
Abstract
OBJECTIVE Clinical informatics researchers depend on the availability of high-quality data from the electronic health record (EHR) to design and implement new methods and systems for clinical practice and research. However, these data are frequently unavailable or present in a format that requires substantial revision. This article reports the results of a review of informatics literature published from 2010 to 2016 that addresses these issues by identifying categories of data content that might be included or revised in the EHR. MATERIALS AND METHODS We used an iterative review process on 1,215 biomedical informatics research articles. We placed them into generic categories, reviewed and refined the categories, and then assigned additional articles, for a total of three iterations. RESULTS Our process identified eight categories of data content issues: Adverse Events, Clinician Cognitive Processes, Data Standards Creation and Data Communication, Genomics, Medication List Data Capture, Patient Preferences, Patient-reported Data, and Phenotyping. DISCUSSION These categories summarize discussions in biomedical informatics literature that concern data content issues restricting clinical informatics research. These barriers to research result from data that are either absent from the EHR or are inadequate (e.g., in narrative text form) for the downstream applications of the data. In light of these categories, we discuss changes to EHR data storage that should be considered in the redesign of EHRs, to promote continued innovation in clinical informatics. CONCLUSION Based on published literature of clinical informaticians' reuse of EHR data, we characterize eight types of data content that, if included in the next generation of EHRs, would find immediate application in advanced informatics tools and techniques.
Affiliation(s)
- Timothy I Kennell
- Informatics Institute, School of Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States
- James H Willig
- Informatics Institute, School of Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States; Department of Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States
- James J Cimino
- Informatics Institute, School of Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States; Department of Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States
|
20
|
Affiliation(s)
- Sengwee Toh
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA
|
21
|
A Case Study of the Incremental Utility for Disease Identification of Natural Language Processing in Electronic Medical Records. Pharmaceut Med 2017. [DOI: 10.1007/s40290-017-0216-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/28/2023]
|
22
|
Bate A, Reynolds RF, Caubel P. The hope, hype and reality of Big Data for pharmacovigilance. Ther Adv Drug Saf 2017; 9:5-11. [PMID: 29318002 DOI: 10.1177/2042098617736422] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Indexed: 12/28/2022] Open
Affiliation(s)
- Andrew Bate
- Epidemiology, Worldwide Safety, Pfizer R&D, Walton Oaks, England, UK; New York University, New York, NY, USA
- Robert F Reynolds
- Global Head of Epidemiology, Worldwide Safety, Pfizer R&D, New York, NY, USA
- Patrick Caubel
- Global Head of Worldwide Safety, Pfizer R&D, New York, NY, USA
|
23
|
Abstract
Information that is not made explicit is nonetheless embedded in most of our standard procedures. In its simplest form, embedded information may take the form of prior knowledge held by the researcher and presumed to be agreed to by consumers of the research product. More interesting are the settings in which the prior information is held unconsciously by both researcher and reader, or when the very form of an "effective procedure" incorporates its creator's (unspoken) understanding of a problem. While it may not be productive to exhaustively detail the embedded or tacit knowledge that manifests itself in creative scientific work, at least at the beginning, we may want to routinize methods for extracting and documenting the ways of thinking that make "experts" expert. We should not back away from both expecting and respecting the tacit knowledge that pervades our work and the work of others.
Affiliation(s)
- Alexander Muir Walker
- World Health Information Science Consultants, 275 Grove St., Suite 2-400, Newton, MA, 02466, USA.
|
24
|
A tamper-proof audit and control system for the doctor in the loop. Brain Inform 2016; 3:269-279. [PMID: 27747816 PMCID: PMC5106408 DOI: 10.1007/s40708-016-0046-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 01/03/2016] [Accepted: 03/01/2016] [Indexed: 12/17/2022] Open
Abstract
The “doctor in the loop” is a new paradigm in information-driven medicine, picturing the doctor as authority inside a loop supplying an expert system with information on actual patients, treatment results, and possible additional (side-)effects, including general information in order to enhance data-driven medical science, as well as giving back treatment advice to the doctor himself. While this approach can be very beneficial for new medical approaches like P4 medicine (personal, predictive, preventive, and participatory), it also relies heavily on the authenticity of the data and thus increases the need for secure and reliable databases. In this paper, we propose a solution in order to protect the doctor in the loop against responsibility derived from manipulated data, thus enabling this new paradigm to gain acceptance in the medical community. This work is an extension of the conference paper Kieseberg et al. (Brain Informatics and Health, 2015), which includes extensions to the original concept.
|
25
|
Berger ML, Curtis MD, Smith G, Harnett J, Abernethy AP. Opportunities and challenges in leveraging electronic health record data in oncology. Future Oncol 2016; 12:1261-74. [PMID: 27096309 DOI: 10.2217/fon-2015-0043] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Indexed: 11/21/2022] Open
Abstract
The widespread adoption of electronic health records (EHRs) and the growing wealth of digitized information sources about patients is ushering in an era of 'Big Data' that may revolutionize clinical research in oncology. Research will likely be more efficient and potentially more accurate than the current gold standard of manual chart review studies. However, EHRs as they exist today have significant limitations: important data elements are missing or are only captured in free text or PDF documents. Using two case studies, we illustrate the challenges of leveraging the data that are routinely collected by the healthcare system in EHRs (e.g., real-world data), specific challenges encountered in the cancer domain and opportunities that can be achieved when these are overcome.
Affiliation(s)
- Marc L Berger
- Pfizer Inc., 235 East 42nd Street, New York, NY 10017, USA
- Gregory Smith
- Pfizer Inc., 235 East 42nd Street, New York, NY 10017, USA
- James Harnett
- Pfizer Inc., 235 East 42nd Street, New York, NY 10017, USA
|