1. Xu S, Cobzaru R, Finkelstein SN, Welsch RE, Ng K, Middleton L. Foundational model aided automatic high-throughput drug screening using self-controlled cohort study. medRxiv: The Preprint Server for Health Sciences 2024:2024.08.04.24311480. [PMID: 39148849 PMCID: PMC11326319 DOI: 10.1101/2024.08.04.24311480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
Background Developing a medicine from scratch through to governmental authorization and detecting adverse drug reactions (ADRs) have rarely been economical, expeditious, or low-risk investments. The availability of large-scale observational healthcare databases and the popularity of large language models offer an unparalleled opportunity to enable automatic high-throughput drug screening for both repurposing and pharmacovigilance. Objectives To demonstrate a general workflow for automatic high-throughput drug screening with the following advantages: (i) the associations of various exposures with diseases can be estimated; (ii) both repurposing and pharmacovigilance are integrated; (iii) an accurate exposure length for each prescription is parsed from clinical texts; (iv) intrinsic relationships between drugs and diseases are removed jointly by bioinformatic mapping and a large language model (ChatGPT); (v) causal-wise interpretations for incidence rate contrasts are provided. Methods Using a self-controlled cohort study design in which subjects serve as their own control group, we tested the intention-to-treat association between medications and the incidence of diseases. The exposure length for each prescription is determined by parsing common dosages in English free text into a structured format. The exposure period runs from the initial prescription to treatment discontinuation. A period of the same length preceding the initial treatment serves as the control period. Clinical outcomes and categories are identified using existing phenotyping algorithms. Incidence rate ratios (IRR) are tested using uniformly most powerful (UMP) unbiased tests. Results We assessed 3,444 medications against 276 diseases in 6,613,198 patients from the Clinical Practice Research Datalink (CPRD), a UK primary care electronic health record (EHR) database spanning 1987 to 2018. Because of the built-in selection bias of self-controlled cohort studies, ingredient-disease pairs confounded by deterministic medical relationships are removed using existing mappings from RxNorm and, where no mapping exists, by calling ChatGPT. A total of 16,901 drug-disease pairs showed a significant risk reduction and can be considered candidates for repurposing, while a total of 11,089 pairs showed a significant risk increase, where drug safety might be a concern instead. Conclusions This work developed a data-driven, nonparametric, hypothesis-generating, and automatic high-throughput workflow, which reveals the potential of natural language processing in pharmacoepidemiology. We demonstrate the paradigm on a large observational health dataset to help discover potential novel therapies and adverse drug effects. The framework of this study can be extended to other observational medical databases.
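The incidence rate ratio test described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `irr_exact_test` and its arguments are hypothetical, and the sketch uses the conventional non-randomized exact conditional (binomial) test for comparing two Poisson rates, which approximates but is not identical to a randomized UMP unbiased test. The idea is that, conditional on the total number of events, the count in the exposure period is binomial with success probability equal to the exposure period's share of person-time when the IRR is 1.

```python
from math import comb


def irr_exact_test(events_exposed, events_control, time_exposed, time_control):
    """Return (IRR, two-sided exact conditional p-value) for two Poisson counts.

    Hypothetical helper for illustration only. Under H0 (IRR = 1), and
    conditional on the total event count n, the exposed-period count follows
    Binomial(n, time_exposed / (time_exposed + time_control)).
    """
    n = events_exposed + events_control
    p0 = time_exposed / (time_exposed + time_control)

    # Point estimate of the incidence rate ratio (exposed rate / control rate).
    if events_control > 0:
        irr = (events_exposed / time_exposed) / (events_control / time_control)
    else:
        irr = float("inf")

    # Binomial probabilities of every possible exposed-period count.
    pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = pmf[events_exposed]

    # Two-sided p-value: total probability of outcomes no more likely than
    # the observed one (small tolerance guards against rounding).
    p_value = min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))
    return irr, p_value


if __name__ == "__main__":
    # Equal-length exposure and control periods, as in a self-controlled cohort.
    irr, p = irr_exact_test(5, 15, 100.0, 100.0)
    print(f"IRR = {irr:.3f}, p = {p:.4f}")
```

With equal exposure and control lengths the null probability is 0.5, so the test reduces to an exact sign-type test on the two event counts. A screening workflow over many drug-disease pairs would also need a multiple-testing correction, which this sketch omits.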
Affiliation(s)
- Shenbo Xu
  - Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Raluca Cobzaru
  - Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Stan N. Finkelstein
  - Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Roy E. Welsch
  - Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Kenney Ng
  - Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Lefkos Middleton
  - Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
2. Zafari Z, Park JE, Shah CH, dosReis S, Gorman EF, Hua W, Ma Y, Tian F. The State of Use and Utility of Negative Controls in Pharmacoepidemiologic Studies. Am J Epidemiol 2024; 193:426-453. [PMID: 37851862 PMCID: PMC11484649 DOI: 10.1093/aje/kwad201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Revised: 07/27/2023] [Accepted: 10/06/2023] [Indexed: 10/20/2023] Open
Abstract
Uses of real-world data in drug safety and effectiveness studies are often challenged by various sources of bias. We undertook a systematic search of the published literature through September 2020 to evaluate the state of use and utility of negative controls to address bias in pharmacoepidemiologic studies. Two reviewers independently evaluated study eligibility and abstracted data. Our search identified 184 eligible studies for inclusion. Cohort studies (115, 63%) and administrative data (114, 62%) were, respectively, the most common study design and data type used. Most studies used negative control outcomes (91, 50%), and for most studies the target source of bias was unmeasured confounding (93, 51%). We identified 4 utility domains of negative controls: 1) bias detection (149, 81%), 2) bias correction (16, 9%), 3) P-value calibration (8, 4%), and 4) performance assessment of different methods used in drug safety studies (31, 17%). The most popular methodologies used were the 95% confidence interval and P-value calibration. In addition, we identified 2 reference sets with structured steps to check the causality assumption of the negative control. While negative controls are powerful tools in bias detection, we found many studies lacked checking the underlying assumptions. This article is part of a Special Collection on Pharmacoepidemiology.
Affiliation(s)
- Zafar Zafari
  - Correspondence to Dr. Zafar Zafari, 220 N. Arch Street, Baltimore, Maryland, 21201
3. Coste A, Wong A, Bokern M, Bate A, Douglas IJ. Methods for drug safety signal detection using routinely collected observational electronic health care data: A systematic review. Pharmacoepidemiol Drug Saf 2023; 32:28-43. [PMID: 36218170 PMCID: PMC10092128 DOI: 10.1002/pds.5548] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/14/2022] [Revised: 09/21/2022] [Accepted: 10/02/2022] [Indexed: 02/06/2023]
Abstract
PURPOSE Signal detection is a crucial step in the discovery of post-marketing adverse drug reactions. There is a growing interest in using routinely collected data to complement established spontaneous report analyses. This work aims to systematically review the methods for drug safety signal detection using routinely collected healthcare data and their performance, both in general and for specific types of drugs and outcomes. METHODS We conducted a systematic review following the PRISMA guidelines, and registered a protocol in PROSPERO. MEDLINE, EMBASE, PubMed, Web of Science, Scopus, and the Cochrane Library were searched until July 13, 2021. RESULTS The review included 101 articles, among which there were 39 methodological works, 25 performance assessment papers, and 24 observational studies. Methods included adaptations from those used with spontaneous reports, traditional epidemiological designs, and methods specific to signal detection with real-world data. More recently, implementations of machine learning have been studied in the literature. Twenty-five studies evaluated method performance, 16 of them using the area under the curve (AUC) for a range of positive and negative controls as their main measure. Despite the likelihood that performance measurement could vary by drug-event pair, only 10 studies reported performance stratified by drugs and outcomes, in a heterogeneous manner. The replicability of the performance assessment results was limited due to lack of transparency in reporting and the lack of a gold standard reference set. CONCLUSIONS A variety of methods have been described in the literature for signal detection with routinely collected data. No method showed superior performance in all papers and across all drugs and outcomes, and performance assessment and reporting were heterogeneous. However, there is limited evidence that self-controlled designs, high dimensional propensity scores, and machine learning can achieve higher performance than other methods.
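The AUC measure mentioned in the Results can be sketched as follows. This is an illustrative example only, not code from any reviewed study: the function name `auc_from_controls` and the example scores are hypothetical. It assumes each drug-outcome pair in a reference set has a signal score (for instance a log relative risk), and it computes the AUC as the probability that a randomly chosen positive control scores higher than a randomly chosen negative control, counting ties as one half.

```python
def auc_from_controls(positive_scores, negative_scores):
    """Area under the ROC curve for separating positive from negative controls.

    Hypothetical helper: uses the pairwise (Mann-Whitney) formulation, where
    the AUC equals the fraction of positive/negative pairs in which the
    positive control has the higher score, with ties counted as 0.5.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative control")

    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Made-up signal scores for three positive and two negative controls.
    print(auc_from_controls([2.0, 3.0, 1.0], [1.0, 0.5]))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation. The quadratic pairwise loop is adequate for reference sets of a few hundred controls.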
Affiliation(s)
- Astrid Coste
  - Department of Non-Communicable Disease Epidemiology, LSHTM, London, UK
- Angel Wong
  - Department of Non-Communicable Disease Epidemiology, LSHTM, London, UK
- Marleen Bokern
  - Department of Non-Communicable Disease Epidemiology, LSHTM, London, UK
- Andrew Bate
  - Department of Non-Communicable Disease Epidemiology, LSHTM, London, UK; Global Safety, GSK, Brentford, UK
- Ian J Douglas
  - Department of Non-Communicable Disease Epidemiology, LSHTM, London, UK
4. Waddingham E, Miller A, Dobson R, Matthews PM. Challenges and Opportunities of Real-World Data: Statistical Analysis Plan for the Optimise:MS Multicenter Prospective Cohort Pharmacovigilance Study. Front Neurol 2022; 13:799531. [PMID: 35418938 PMCID: PMC8996123 DOI: 10.3389/fneur.2022.799531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2021] [Accepted: 03/03/2022] [Indexed: 11/21/2022] Open
Abstract
Introduction Optimise:MS is an observational pharmacovigilance study aimed at characterizing the safety profile of disease-modifying therapies (DMTs) for multiple sclerosis (MS) in a real world population. The study will categorize and quantify the occurrence of serious adverse events (SAEs) in a cohort of MS patients recruited from clinical sites around the UK. The study was motivated particularly by a need to establish the safety profile of newer DMTs, but will also gather data on outcomes among treatment-eligible but untreated patients and those receiving established DMTs (interferons and glatiramer acetate). It will also explore the impact of treatment switching. Methods Causal pathway confounding between treatment selection and outcomes, together with the variety and complexity of treatment and disease patterns observed among MS patients in the real world, present statistical challenges to be addressed in the analysis plan. We developed an approach for analysis of the Optimise:MS data that will include disproportionality-based signal detection methods adapted to the longitudinal structure of the data and a longitudinal time-series analysis of a cohort of participants receiving second-generation DMT for the first time. The time-series analyses will use a number of exposure definitions in order to identify temporal patterns, carryover effects and interactions with prior treatments. Time-dependent confounding will be allowed for via inverse-probability-of-treatment weighting (IPTW). Additional analyses will examine rates and outcomes of pregnancies and explore interactions of these with treatment type and duration. Results To date 14 hospitals have joined the study and over 2,000 participants have been recruited. A statistical analysis plan has been developed and is described here. 
Conclusion Optimise:MS is expected to be a rich source of data on the outcomes of DMTs in real-world conditions over several years of follow-up in an inclusive sample of UK MS patients. Analysis is complicated by the influence of confounding factors including complex treatment histories and a highly variable disease course, but the statistical analysis plan includes measures to mitigate the biases such factors can introduce. It will enable us to address key questions that are beyond the reach of randomized controlled trials.
Affiliation(s)
- Ed Waddingham
  - Department of Brain Sciences and Dementia Research Institute, Imperial College London, Hammersmith Campus, London, United Kingdom
- Aleisha Miller
  - Department of Brain Sciences and Dementia Research Institute, Imperial College London, Hammersmith Campus, London, United Kingdom
- Ruth Dobson
  - Preventive Neurology Unit, Wolfson Institute of Preventive Medicine, Queen Mary University of London, London, United Kingdom
- Paul M Matthews
  - Department of Brain Sciences and Dementia Research Institute, Imperial College London, Hammersmith Campus, London, United Kingdom
5. Thurin NH, Lassalle R, Schuemie M, Pénichon M, Gagne JJ, Rassen JA, Benichou J, Weill A, Blin P, Moore N, Droz-Perroteau C. Empirical assessment of case-based methods for drug safety alert identification in the French National Healthcare System database (SNDS): Methodology of the ALCAPONE project. Pharmacoepidemiol Drug Saf 2020; 29:993-1000. [PMID: 32133717 DOI: 10.1002/pds.4983] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/07/2019] [Revised: 01/02/2020] [Accepted: 02/12/2020] [Indexed: 01/22/2023]
Abstract
OBJECTIVES To introduce the methodology of the ALCAPONE project. BACKGROUND The French National Healthcare System Database (SNDS), covering 99% of the French population, provides a potentially valuable opportunity for drug safety alert generation. ALCAPONE aimed to empirically assess, in the SNDS, case-based designs for alert generation related to four health outcomes of interest. METHODS ALCAPONE used a reference set adapted from the Observational Medical Outcomes Partnership (OMOP) and the Exploring and Understanding Adverse Drug Reactions (EU-ADR) project, covering four outcomes: acute liver injury (ALI), myocardial infarction (MI), acute kidney injury (AKI), and upper gastrointestinal bleeding (UGIB), together with positive and negative drug controls. ALCAPONE consisted of four main phases: (1) data preparation to fit the OMOP Common Data Model and select the drug controls; (2) detection of the selected controls via three case-based designs: case-population, case-control, and self-controlled case series, including design variants (varying risk window, adjustment strategy, etc.); (3) comparison of design variant performance (area under the ROC curve, mean square error, etc.); and (4) selection of the optimal design variants and their calibration for each outcome. RESULTS Over 2009-2014, 5225 cases of ALI, 354 109 MI, 12 633 AKI, and 156 057 UGIB were identified using specific definitions. The number of detectable drugs ranged from 61 for MI to 25 for ALI. Design variants generated more than 50 000 point estimates. Results by outcome will be published in forthcoming papers. CONCLUSIONS ALCAPONE has shown the value of empirically assessing pharmacoepidemiological approaches for drug safety alert generation and may encourage other researchers to do the same in other databases.
Affiliation(s)
- Nicolas H Thurin
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France; INSERM U1219, Université de Bordeaux, Bordeaux, France
- Régis Lassalle
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
- Martijn Schuemie
  - Epidemiology Analytics, Janssen Research and Development, Titusville, New Jersey, USA; Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA
- Marine Pénichon
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
- Joshua J Gagne
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Jacques Benichou
  - Department of Biostatistics and Clinical Research, Rouen University Hospital, Rouen, France; INSERM U1181, Paris, France
- Alain Weill
  - Caisse Nationale de l'Assurance Maladie, Paris, France
- Patrick Blin
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
- Nicholas Moore
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France; INSERM U1219, Université de Bordeaux, Bordeaux, France; CHU de Bordeaux, Bordeaux, France
6. Schuemie MJ, Cepeda MS, Suchard MA, Yang J, Tian Y, Schuler A, Ryan PB, Madigan D, Hripcsak G. How Confident Are We about Observational Findings in Healthcare: A Benchmark Study. Harvard Data Science Review 2020; 2. [PMID: 33367288 DOI: 10.1162/99608f92.147cc28e] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 12/11/2022] Open
Abstract
Healthcare professionals increasingly rely on observational healthcare data, such as administrative claims and electronic health records, to estimate the causal effects of interventions. However, limited prior studies raise concerns about the real-world performance of the statistical and epidemiological methods that are used. We present the "OHDSI Methods Benchmark" that aims to evaluate the performance of effect estimation methods on real data. The benchmark comprises a gold standard, a set of metrics, and a set of open source software tools. The gold standard is a collection of real negative controls (drug-outcome pairs where no causal effect appears to exist) and synthetic positive controls (drug-outcome pairs that augment negative controls with simulated causal effects). We apply the benchmark using four large healthcare databases to evaluate methods commonly used in practice: the new-user cohort, self-controlled cohort, case-control, case-crossover, and self-controlled case series designs. The results confirm the concerns about these methods, showing that for most methods the operating characteristics deviate considerably from nominal levels. For example, in most contexts, only half of the 95% confidence intervals we calculated contain the corresponding true effect size. We previously developed an "empirical calibration" procedure to restore these characteristics and we also evaluate this procedure. While no one method dominates, self-controlled methods such as the empirically calibrated self-controlled case series perform well across a wide range of scenarios.
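The coverage statistic quoted in this abstract (the share of 95% confidence intervals that contain the true effect size) can be made concrete with a small sketch. It is not the OHDSI Methods Benchmark software: the function names `ci_95` and `coverage` and the example estimates are hypothetical, and the sketch assumes Wald-type intervals built on the log scale from a log relative risk and its standard error. For negative controls the true relative risk is taken to be 1.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def ci_95(log_rr, standard_error):
    """Wald 95% confidence interval for a relative risk, built on the log scale."""
    return (
        math.exp(log_rr - Z_95 * standard_error),
        math.exp(log_rr + Z_95 * standard_error),
    )


def coverage(estimates, true_rr=1.0):
    """Fraction of 95% intervals that contain the true relative risk.

    `estimates` is a list of (log_rr, standard_error) pairs for controls that
    all share the same assumed true effect (1.0 for negative controls).
    """
    hits = 0
    for log_rr, standard_error in estimates:
        lower, upper = ci_95(log_rr, standard_error)
        if lower <= true_rr <= upper:
            hits += 1
    return hits / len(estimates)


if __name__ == "__main__":
    # Made-up negative-control estimates: two intervals include 1, two do not.
    negative_controls = [(0.0, 0.1), (0.5, 0.1), (-0.1, 0.2), (0.3, 0.1)]
    print(coverage(negative_controls))
```

A well-calibrated method would give coverage close to 0.95 on such controls. Markedly lower coverage is the kind of deviation from nominal levels that the abstract reports.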
Affiliation(s)
- Martijn J Schuemie
  - Observational Health Data Sciences and Informatics; Epidemiology Analytics, Janssen Research and Development; Department of Biostatistics, University of California, Los Angeles
- M Soledad Cepeda
  - Observational Health Data Sciences and Informatics; Epidemiology Analytics, Janssen Research and Development
- Marc A Suchard
  - Observational Health Data Sciences and Informatics; Department of Biostatistics, University of California, Los Angeles; Department of Biomathematics, University of California, Los Angeles; Department of Human Genetics, University of California, Los Angeles
- Jianxiao Yang
  - Observational Health Data Sciences and Informatics; Department of Biomathematics, University of California, Los Angeles
- Yuxi Tian
  - Observational Health Data Sciences and Informatics; Department of Biomathematics, University of California, Los Angeles
- Alejandro Schuler
  - Observational Health Data Sciences and Informatics; Center for Biomedical Informatics Research, Stanford University
- Patrick B Ryan
  - Observational Health Data Sciences and Informatics; Epidemiology Analytics, Janssen Research and Development; Department of Biomedical Informatics, Columbia University
- David Madigan
  - Observational Health Data Sciences and Informatics; Department of Statistics, Columbia University
- George Hripcsak
  - Observational Health Data Sciences and Informatics; Department of Biomedical Informatics, Columbia University; Medical Informatics Services, New York-Presbyterian Hospital
7. Kuang Z, Peissig P, Costa VS, Maclin R, Page D. Pharmacovigilance via Baseline Regularization with Large-Scale Longitudinal Observational Data. KDD: Proceedings. International Conference on Knowledge Discovery & Data Mining 2017; 2017:1537-1546. [PMID: 29755826 PMCID: PMC5945223 DOI: 10.1145/3097983.3097998] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/19/2022]
Abstract
Several prominent public health hazards [29] that occurred at the beginning of this century due to adverse drug events (ADEs) have raised international awareness of governments and industries about pharmacovigilance (PhV) [6,7], the science and activities to monitor and prevent adverse events caused by pharmaceutical products after they are introduced to the market. A major data source for PhV is large-scale longitudinal observational databases (LODs) [6] such as electronic health records (EHRs) and medical insurance claim databases. Inspired by the Self-Controlled Case Series (SCCS) model [27], arguably the leading method for ADE discovery from LODs, we propose baseline regularization, a regularized generalized linear model that leverages the diverse health profiles available in LODs across different individuals at different times. We apply the proposed method as well as SCCS to the Marshfield Clinic EHR. Experimental results suggest that the proposed method outperforms SCCS under various settings in identifying benchmark ADEs from the Observational Medical Outcomes Partnership ground truth [26].
8. Bridging islands of information to establish an integrated knowledge base of drugs and health outcomes of interest. Drug Saf 2015; 37:557-67. [PMID: 24985530 PMCID: PMC4134480 DOI: 10.1007/s40264-014-0189-0] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Indexed: 11/29/2022]
Abstract
The entire drug safety enterprise has a need to search, retrieve, evaluate, and synthesize scientific evidence more efficiently. This discovery and synthesis process would be greatly accelerated through access to a common framework that brings all relevant information sources together within a standardized structure. This presents an opportunity to establish an open-source community effort to develop a global knowledge base, one that brings together and standardizes all available information for all drugs and all health outcomes of interest (HOIs) from all electronic sources pertinent to drug safety. To make this vision a reality, we have established a workgroup within the Observational Health Data Sciences and Informatics (OHDSI, http://ohdsi.org) collaborative. The workgroup’s mission is to develop an open-source standardized knowledge base for the effects of medical products and an efficient procedure for maintaining and expanding it. The knowledge base will make it simpler for practitioners to access, retrieve, and synthesize evidence so that they can reach a rigorous and accurate assessment of causal relationships between a given drug and HOI. Development of the knowledge base will proceed with the measurable goal of supporting an efficient and thorough evidence-based assessment of the effects of 1,000 active ingredients across 100 HOIs. This non-trivial task will result in a high-quality and generally applicable drug safety knowledge base. It will also yield a reference standard of drug–HOI pairs that will enable more advanced methodological research that empirically evaluates the performance of drug safety analysis methods.
9. Vilar S, Ryan PB, Madigan D, Stang PE, Schuemie MJ, Friedman C, Tatonetti NP, Hripcsak G. Similarity-based modeling applied to signal detection in pharmacovigilance. CPT: Pharmacometrics & Systems Pharmacology 2014; 3:e137. [PMID: 25250527 PMCID: PMC4211266 DOI: 10.1038/psp.2014.35] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 04/10/2014] [Accepted: 07/06/2014] [Indexed: 12/31/2022]
Abstract
One of the main objectives in pharmacovigilance is the detection of adverse drug events (ADEs) through mining of healthcare databases, such as electronic health records or administrative claims data. Although different approaches have been shown to be of great value, research is still focusing on the enhancement of signal detection to gain efficiency in further assessment and follow-up. We applied similarity-based modeling techniques, using 2D and 3D molecular structure, ADE, target, and ATC (anatomical therapeutic chemical) similarity measures, to the candidate associations selected previously in a medication-wide association study for four ADE outcomes. Our results showed an improvement in the precision when we ranked the subset of ADE candidates using similarity scorings. This method is simple, useful to strengthen or prioritize signals generated from healthcare databases, and facilitates ADE detection through the identification of the most similar drugs for which ADE information is available.
Affiliation(s)
- S Vilar
  - [1] Department of Biomedical Informatics, Columbia University, New York, New York, USA; [2] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA
- P B Ryan
  - [1] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA; [2] Janssen Research and Development, Titusville, New Jersey, USA
- D Madigan
  - [1] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA; [2] Department of Statistics, Columbia University, New York, New York, USA
- P E Stang
  - [1] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA; [2] Janssen Research and Development, Titusville, New Jersey, USA
- M J Schuemie
  - [1] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA; [2] Janssen Research and Development, Titusville, New Jersey, USA
- C Friedman
  - [1] Department of Biomedical Informatics, Columbia University, New York, New York, USA; [2] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA
- N P Tatonetti
  - [1] Department of Biomedical Informatics, Columbia University, New York, New York, USA; [2] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA; [3] Department of Systems Biology, Columbia University Medical Center, New York, New York, USA; [4] Department of Medicine, Columbia University Medical Center, New York, New York, USA
- G Hripcsak
  - [1] Department of Biomedical Informatics, Columbia University, New York, New York, USA; [2] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA
10. Ryan PB, Schuemie MJ, Welebob E, Duke J, Valentine S, Hartzema AG. Defining a reference set to support methodological research in drug safety. Drug Saf 2014; 36 Suppl 1:S33-47. [PMID: 24166222 DOI: 10.1007/s40264-013-0097-8] [Citation(s) in RCA: 85] [Impact Index Per Article: 7.7] [Indexed: 12/18/2022]
Abstract
BACKGROUND Methodological research to evaluate the performance of methods requires a benchmark to serve as a referent comparison. In drug safety, the performance of analyses of spontaneous adverse event reporting databases and observational healthcare data, such as administrative claims and electronic health records, has been limited by the lack of such standards. OBJECTIVES To establish a reference set of test cases that contain both positive and negative controls, which can serve as the basis for methodological research in evaluating methods performance in identifying drug safety issues. RESEARCH DESIGN Systematic literature review and natural language processing of structured product labeling were performed to identify evidence to support the classification of drugs as either positive controls or negative controls for four outcomes: acute liver injury, acute kidney injury, acute myocardial infarction, and upper gastrointestinal bleeding. RESULTS Three hundred and ninety-nine test cases, comprising 165 positive controls and 234 negative controls, were identified across the four outcomes. The majority of positive controls for acute kidney injury and upper gastrointestinal bleeding were supported by randomized clinical trial evidence, while the majority of positive controls for acute liver injury and acute myocardial infarction were supported only by published case reports. Literature estimates for the positive controls show substantial variability that limits the ability to establish a reference set with known effect sizes. CONCLUSIONS A reference set of test cases can be established to facilitate methodological research in drug safety. Creating a sufficient sample of drug-outcome pairs with binary classification of having no effect (negative controls) or having an increased effect (positive controls) is possible and can enable estimation of predictive accuracy through discrimination. Since the magnitude of the positive effects cannot be reliably obtained and the quality of evidence may vary across outcomes, assumptions are required to use the test cases in real data for purposes of measuring bias, mean squared error, or coverage probability.
Affiliation(s)
- Patrick B Ryan
  - Janssen Research and Development LLC, 1125 Trenton-Harbourton Road, Room K30205, PO Box 200, Titusville, NJ, 08560, USA
11. Replication of the OMOP Experiment in Europe: Evaluating Methods for Risk Identification in Electronic Health Record Databases. Drug Saf 2013; 36 Suppl 1:S159-69. [DOI: 10.1007/s40264-013-0109-8] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 11/26/2022]
12. Ryan PB, Schuemie MJ. Evaluating Performance of Risk Identification Methods Through a Large-Scale Simulation of Observational Data. Drug Saf 2013; 36 Suppl 1:S171-80. [DOI: 10.1007/s40264-013-0110-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 11/28/2022]
13. Overhage JM, Ryan PB, Schuemie MJ, Stang PE. Desideratum for Evidence Based Epidemiology. Drug Saf 2013; 36 Suppl 1:S5-14. [DOI: 10.1007/s40264-013-0102-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 10/26/2022]
14. Stang PE, Ryan PB, Overhage JM, Schuemie MJ, Hartzema AG, Welebob E. Variation in Choice of Study Design: Findings from the Epidemiology Design Decision Inventory and Evaluation (EDDIE) Survey. Drug Saf 2013; 36 Suppl 1:S15-25. [PMID: 24166220 DOI: 10.1007/s40264-013-0103-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 11/29/2022]
Affiliation(s)
- Paul E Stang
  - Janssen Research and Development LLC, Titusville, NJ, USA
15. Ryan PB, Stang PE, Overhage JM, Suchard MA, Hartzema AG, DuMouchel W, Reich CG, Schuemie MJ, Madigan D. A Comparison of the Empirical Performance of Methods for a Risk Identification System. Drug Saf 2013; 36 Suppl 1:S143-58. [PMID: 24166231 DOI: 10.1007/s40264-013-0108-9] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.9] [Indexed: 12/26/2022]
Affiliation(s)
- Patrick B Ryan
  - Janssen Research and Development LLC, 1125 Trenton-Harbourton Road, Room K30205, PO Box 200, Titusville, NJ, 08560, USA