1
Shi Y, Yang Y, Liu R, Sun A, Peng X, Li L, Zhang P, Zhang P. A Drug Similarity-Based Bayesian Method for Early Adverse Drug Event Detection. Drug Saf 2025:10.1007/s40264-025-01545-6. [PMID: 40261506 DOI: 10.1007/s40264-025-01545-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/17/2025] [Indexed: 04/24/2025]
Abstract
INTRODUCTION Biochemical drug similarity-based methods have been successful in predicting adverse drug events (ADEs) in preclinical settings and in enhancing ADE signals in real-world data mining. Despite these successes, drug similarity-based ADE detection should be extended with false-positive control and evaluated in a time-to-detection setting. METHODS We tested a drug similarity-based Bayesian method for early ADE detection with false-positive control. Under the tested method, the prior distribution of the ADE probability of a less frequent drug could be derived from frequent drugs with high biochemical similarity, and the posterior probability of the null hypothesis could be used for signal detection and false-positive control. We evaluated the tested and reference methods by mining relatively newer drugs in real-world data (e.g., the US Food and Drug Administration (FDA)'s Adverse Event Reporting System (FAERS) data) and by conducting a simulation study. RESULTS In the FAERS analysis, the times needed to achieve the same probability of detection for drug-labeled ADEs following initial drug reporting were 5 years for the tested method and ≥ 7 years for the reference methods. Additionally, the tested method had higher AUC values than the reference methods (0.57-0.79 vs. 0.32-0.71), especially within 3 years following initial drug reporting. In the simulation study, the tested method demonstrated proper false-positive control and had higher probabilities of detection (0.31-0.60 vs. 0.11-0.41) and AUC values (0.88-0.95 vs. 0.69-0.86) than the reference methods. Additionally, we found that different types of drug similarity had comparable performance in high-throughput ADE mining. CONCLUSION The drug similarity-based Bayesian ADE detection method might be able to accelerate ADE detection while controlling the false-positive rate.
Affiliation(s)
- Yi Shi
  - Department of Biostatistics and Health Data Science, Indiana University, Indianapolis, IN, USA
- Yuedi Yang
  - Department of Biostatistics and Health Data Science, Indiana University, Indianapolis, IN, USA
- Ruoqi Liu
  - Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
- Anna Sun
  - Department of Biostatistics and Health Data Science, Indiana University, Indianapolis, IN, USA
- Xueqiao Peng
  - Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
- Lang Li
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Ping Zhang
  - Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Pengyue Zhang
  - Department of Biostatistics and Health Data Science, Indiana University, Indianapolis, IN, USA
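The abstract of entry 1 describes borrowing a prior for a sparsely reported drug from similar, frequently reported drugs and then using the posterior probability of the null hypothesis as the detection statistic. The sketch below illustrates that general idea with a simple Beta-binomial model. It is not the authors' method: the function names, the similarity-weighted prior mean, the `prior_strength` parameter, the background rate, the 0.05 cutoff, and all counts are assumptions chosen for illustration.

```python
import random


def similarity_prior(similar_drugs, prior_strength=10.0):
    """Build a Beta(a, b) prior from (similarity, ade_reports, total_reports) tuples.

    Assumption: the prior mean is the similarity-weighted mean ADE proportion
    among frequent drugs, and prior_strength acts as a pseudo-sample size.
    """
    weight_sum = sum(s for s, _, _ in similar_drugs)
    mean = sum(s * ade / total for s, ade, total in similar_drugs) / weight_sum
    return mean * prior_strength, (1.0 - mean) * prior_strength


def posterior_null_probability(ade, total, prior, background, draws=20000, seed=0):
    """Monte Carlo estimate of P(p <= background | data) under a Beta-binomial model."""
    a, b = prior
    rng = random.Random(seed)
    post_a, post_b = a + ade, b + (total - ade)  # conjugate Beta update
    hits = sum(rng.betavariate(post_a, post_b) <= background for _ in range(draws))
    return hits / draws


# Hypothetical inputs: two frequent drugs with similarity scores 0.9 and 0.6.
prior = similarity_prior([(0.9, 30, 1000), (0.6, 10, 1000)])
p_null = posterior_null_probability(ade=4, total=40, prior=prior, background=0.01)
signal = p_null < 0.05  # hypothetical decision threshold
```

A small posterior null probability flags the drug-ADE pair as a signal. Controlling the false-positive rate across many pairs would require an additional step that this sketch does not attempt.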
2
Shi Y, Sun A, Yang Y, Xu J, Li J, Eadon M, Su J, Zhang P. A theoretical model for detecting drug interaction with awareness of timing of exposure. Sci Rep 2025; 15:13693. [PMID: 40258952 PMCID: PMC12012107 DOI: 10.1038/s41598-025-98528-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2024] [Accepted: 04/14/2025] [Indexed: 04/23/2025] Open
Abstract
Drug-drug interaction-induced (DDI-induced) adverse drug events (ADEs) are a significant public health burden. The risk of an ADE can be related to the timing of exposure (TOE), such as initiating two drugs concurrently or adding one drug to an existing drug. Thus, real-world data-based DDI detection should be extended to investigate precise adverse DDIs with special awareness of TOE. We developed a Sensitive and Timing-awarE Model (STEM), which was able to optimize the probability of detection and control the false positive rate for mining all two-drug combinations under a case-crossover design, in particular for DDIs with TOE-dependent risk. We analyzed large-scale US administrative claims data and conducted performance evaluation analyses. We identified signals of DDIs by using STEM, in particular for DDIs with TOE-dependent risk. We also observed that STEM identified significantly more signals than the conditional logistic regression model-based (CLRM-based) methods and the Benjamini-Hochberg procedure. In the performance evaluation, we found that STEM demonstrated proper false positive control and achieved a higher probability of detection than CLRM-based methods and the Benjamini-Hochberg procedure. STEM has a high probability of identifying signals of DDIs in high-throughput DDI mining while controlling the false positive rate, in particular for detecting signals of DDIs with TOE-dependent risk.
Affiliation(s)
- Yi Shi
  - Department of Biostatistics and Health Data Science, Indiana University, Indianapolis, IN, USA
- Anna Sun
  - Department of Biostatistics and Health Data Science, Indiana University, Indianapolis, IN, USA
- Yuedi Yang
  - Department of Biostatistics and Health Data Science, Indiana University, Indianapolis, IN, USA
- Jing Xu
  - Department of Biostatistics and Health Data Science, Indiana University, Indianapolis, IN, USA
- Justin Li
  - Park Tudor School, Indianapolis, IN, USA
- Michael Eadon
  - Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, USA
- Jing Su
  - Department of Biostatistics and Health Data Science, Indiana University, Indianapolis, IN, USA
- Pengyue Zhang
  - Department of Biostatistics and Health Data Science, Indiana University, Indianapolis, IN, USA
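The abstract of entry 2 names the Benjamini-Hochberg procedure as one of the comparators for STEM. For readers unfamiliar with it, the following is a minimal sketch of the standard Benjamini-Hochberg step-up rule applied to a list of p-values. It does not implement STEM, and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k whose sorted p-value satisfies p_(k) <= k/m * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff (the step-up property).
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for five drug-pair tests.
flags = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.20])
```

Because the rule is step-up, a p-value that misses its own rank threshold can still be rejected when a larger p-value further down the ordering meets its threshold.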
3
Ito S, Narukawa M. Development of a Drug Safety Signal Detection Reference Set Using Japanese Safety Information. Ther Innov Regul Sci 2025; 59:288-294. [PMID: 39709323 DOI: 10.1007/s43441-024-00729-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2024] [Accepted: 12/12/2024] [Indexed: 12/23/2024]
Abstract
INTRODUCTION One of the main objectives of pharmacovigilance activities is to confirm unknown adverse drug reactions (ADRs), and data-mining methods have been developed to detect signals that are candidates for ADRs. Reference sets have been developed to evaluate the performance of the data-mining methods. However, reference sets generated in previous studies are not based on Japanese safety information; therefore, they are not suitable for use in evaluation studies in Japan because some drugs have not been approved or marketed for a long time in Japan. This study aimed to develop a reference set using safety information for drugs marketed in Japan and to evaluate its performance. METHODS A reference set was developed for 43 drugs and 15 events. Among the combinations of selected drugs and events, those listed as important identified risks in the Japan Risk Management Plan (J-RMP) were set as "positive controls" and those not listed as adverse reactions in the package insert were set as "negative controls." In addition, we performed data mining using the Japanese Adverse Drug Event Report database (JADER) and evaluated the results against the reference set to empirically confirm its effectiveness. RESULTS The reference set included 127 positive and 386 negative controls. A comparison of the signals obtained from data mining using JADER with the reference set revealed higher correlations than those in previous studies. CONCLUSION A reference set was developed using the safety information of drugs approved in Japan to promote research on data-mining methods.
Affiliation(s)
- Satoru Ito
  - Department of Clinical Medicine (Pharmaceutical Medicine), Graduate School of Pharmaceutical Sciences, Kitasato University, 5-9-1 Shirokane, Minato-ku, Tokyo, 108-8641, Japan
  - Kyowa Kirin Co., Ltd., Tokyo, Japan
- Mamoru Narukawa
  - Department of Clinical Medicine (Pharmaceutical Medicine), Graduate School of Pharmaceutical Sciences, Kitasato University, 5-9-1 Shirokane, Minato-ku, Tokyo, 108-8641, Japan
4
Berkowitz J, Weissenbacher D, Srinivasan A, Friedrich NA, Acitores Cortina JM, Kivelson S, Hernandez GG, Tatonetti NP. Probing Large Language Model Hidden States for Adverse Drug Reaction Knowledge. medRxiv 2025:2025.02.09.25321620. [PMID: 39990542 PMCID: PMC11844579 DOI: 10.1101/2025.02.09.25321620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2025]
Abstract
Large language models (LLMs) integrate knowledge from diverse sources into a single set of internal weights. However, these representations are difficult to interpret, complicating our understanding of the models' learning capabilities. Sparse autoencoders (SAEs) linearize LLM embeddings, creating monosemantic features that both provide insight into the model's comprehension and simplify downstream machine learning tasks. These features are especially important in biomedical applications where explainability is critical. Here, we evaluate the use of Gemma Scope SAEs to identify how LLMs store known facts involving adverse drug reactions (ADRs). We transform hidden-state embeddings of drug names from Gemma2-9b-it into interpretable features and train a linear classifier on these features to classify ADR likelihood, evaluating against an established benchmark. These embeddings provide strong predictive performance, giving AUC-ROC of 0.957 for identifying acute kidney injury, 0.902 for acute liver injury, 0.954 for acute myocardial infarction, and 0.963 for gastrointestinal bleeds. Notably, there are no significant differences (p > 0.05) in performance between the simple linear classifiers built on SAE outputs and neural networks trained on the raw embeddings, suggesting that the information lost in reconstruction is minimal. This finding suggests that SAE-derived representations retain the essential information from the LLM while reducing model complexity, paving the way for more transparent, compute-efficient strategies. We believe that this approach can help synthesize the biomedical knowledge our models learn in training and be used for downstream applications, such as expanding reference sets for pharmacovigilance.
Affiliation(s)
- Jacob Berkowitz
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Davy Weissenbacher
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Apoorva Srinivasan
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Nadine A Friedrich
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Sophia Kivelson
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Nicholas P Tatonetti
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
  - Cedars-Sinai Cancer, Cedars-Sinai Medical Center, Los Angeles, California, USA
5
Bots SH, Belitser S, Groenwold RHH, Durán CE, Riera-Arnau J, Schultze A, Messina D, Segundo E, Douglas I, Carreras JJ, Garcia-Poza P, Gini R, Huerta C, Martín-Pérez M, Martin I, Paoletti O, Bissacco CA, Correcher-Martínez E, Souverein P, Urchueguía-Fornes A, Villalobos F, Sturkenboom MCJM, Klungel OH. Applying two approaches to detect unmeasured confounding due to time-varying variables in a self-controlled risk interval design evaluating COVID-19 vaccine safety signals, using myocarditis as a case example. Am J Epidemiol 2025; 194:208-219. [PMID: 38960670 PMCID: PMC11735966 DOI: 10.1093/aje/kwae172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 05/07/2024] [Accepted: 06/27/2024] [Indexed: 07/05/2024] Open
Abstract
We test the robustness of the self-controlled risk interval (SCRI) design in a setting where time between doses may introduce time-varying confounding, using both negative control outcomes (NCOs) and quantitative bias analysis (QBA). All vaccinated cases identified from 5 European databases between September 1, 2020, and end of data availability were included. Exposures were doses 1-3 of the Pfizer, Moderna, AstraZeneca, and Janssen COVID-19 vaccines; outcomes were myocarditis and, as the NCO, otitis externa. The SCRI used a 60-day control window and dose-specific 28-day risk windows, stratified by vaccine brand and adjusted for calendar time. The QBA included two scenarios: (1) baseline probability of the confounder was higher in the control window and (2) vice versa. The NCO was not associated with any of the COVID-19 vaccine types or doses except Moderna dose 1 (IRR = 1.09; 95% CI 1.01-1.09). The QBA suggested that even the strongest literature-reported confounder (COVID-19; RR for myocarditis = 18.3) could only explain away part of the observed effect, from IRR = 3 to IRR = 1.40. The SCRI seems robust to unmeasured confounding in the COVID-19 setting, although a strong unmeasured confounder could bias the observed effect upward. Replication of our findings for other safety signals would strengthen this conclusion. This article is part of a Special Collection on Pharmacoepidemiology.
Affiliation(s)
- Sophie H Bots
  - Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht University, 3508 TB, Utrecht, The Netherlands
- Svetlana Belitser
  - Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht University, 3508 TB, Utrecht, The Netherlands
- Rolf H H Groenwold
  - Department of Clinical Epidemiology, Leiden University Medical Centre, 2333 ZA, Leiden, the Netherlands
- Carlos E Durán
  - Department of Data Science and Biostatistics, Julius Center for Health Sciences and Primary Health, University Medical Center Utrecht, 3584 CG, Utrecht, The Netherlands
- Judit Riera-Arnau
  - Department of Data Science and Biostatistics, Julius Center for Health Sciences and Primary Health, University Medical Center Utrecht, 3584 CG, Utrecht, The Netherlands
  - Clinical Pharmacology Service, Vall d’Hebron Hospital Universitari, Vall d’Hebron Barcelona Hospital Campus, Universitat Autònoma de Barcelona, 08035, Barcelona, Spain
- Anna Schultze
  - Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, WC1E 7HT, London, UK
- Davide Messina
  - Agenzia Regionale di Sanità, 50141, Florence, Toscana, Italy
- Elena Segundo
  - Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), 08007, Barcelona, Spain
- Ian Douglas
  - Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, WC1E 7HT, London, UK
- Juan José Carreras
  - Vaccine Research Department, Foundation for the Promotion of Health and Biomedical Research in the Valencian Region (FISABIO - Public Health), 46020, Valencia, Spain
- Rosa Gini
  - Agenzia Regionale di Sanità, 50141, Florence, Toscana, Italy
- Consuelo Huerta
  - Spanish Agency for Medicines and Medical Devices (AEMPS), 28022, Madrid, Spain
- Mar Martín-Pérez
  - Spanish Agency for Medicines and Medical Devices (AEMPS), 28022, Madrid, Spain
- Ivonne Martin
  - Department of Data Science and Biostatistics, Julius Center for Health Sciences and Primary Health, University Medical Center Utrecht, 3584 CG, Utrecht, The Netherlands
- Olga Paoletti
  - Agenzia Regionale di Sanità, 50141, Florence, Toscana, Italy
- Carlo Alberto Bissacco
  - Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), 08007, Barcelona, Spain
- Elisa Correcher-Martínez
  - Vaccine Research Department, Foundation for the Promotion of Health and Biomedical Research in the Valencian Region (FISABIO - Public Health), 46020, Valencia, Spain
- Patrick Souverein
  - Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht University, 3508 TB, Utrecht, The Netherlands
- Arantxa Urchueguía-Fornes
  - Vaccine Research Department, Foundation for the Promotion of Health and Biomedical Research in the Valencian Region (FISABIO - Public Health), 46020, Valencia, Spain
- Felipe Villalobos
  - Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), 08007, Barcelona, Spain
- Miriam C J M Sturkenboom
  - Department of Data Science and Biostatistics, Julius Center for Health Sciences and Primary Health, University Medical Center Utrecht, 3584 CG, Utrecht, The Netherlands
- Olaf H Klungel
  - Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht University, 3508 TB, Utrecht, The Netherlands
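The abstract of entry 5 reports incidence rate ratios (IRRs) from a self-controlled risk interval design with a 28-day risk window and a 60-day control window. The sketch below shows how a crude, unadjusted IRR and a Wald 95% confidence interval could be computed from case counts in the two windows. It omits the stratification by vaccine brand and the calendar-time adjustment described in the abstract, and the counts are hypothetical.

```python
import math


def scri_irr(cases_risk, cases_control, risk_days=28, control_days=60):
    """Crude IRR and Wald 95% CI from case counts in the risk and control windows.

    Assumes every case contributes the same fixed window lengths, so the
    person-time ratio reduces to risk_days / control_days.
    """
    irr = (cases_risk / risk_days) / (cases_control / control_days)
    # Standard error of log(IRR) conditional on the total number of cases.
    se = math.sqrt(1 / cases_risk + 1 / cases_control)
    lower = math.exp(math.log(irr) - 1.96 * se)
    upper = math.exp(math.log(irr) + 1.96 * se)
    return irr, lower, upper


# Hypothetical counts: 42 cases in the risk window, 30 in the control window.
irr, lower, upper = scri_irr(cases_risk=42, cases_control=30)
```

With these made-up counts the crude IRR is 3.0. Any resemblance to the IRR quoted in the abstract is only because the counts were picked to give a round number.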
6
Zafari Z, Park JE, Shah CH, dosReis S, Gorman EF, Hua W, Ma Y, Tian F. The State of Use and Utility of Negative Controls in Pharmacoepidemiologic Studies. Am J Epidemiol 2024; 193:426-453. [PMID: 37851862 PMCID: PMC11484649 DOI: 10.1093/aje/kwad201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Revised: 07/27/2023] [Accepted: 10/06/2023] [Indexed: 10/20/2023] Open
Abstract
Uses of real-world data in drug safety and effectiveness studies are often challenged by various sources of bias. We undertook a systematic search of the published literature through September 2020 to evaluate the state of use and utility of negative controls to address bias in pharmacoepidemiologic studies. Two reviewers independently evaluated study eligibility and abstracted data. Our search identified 184 eligible studies for inclusion. Cohort studies (115, 63%) and administrative data (114, 62%) were, respectively, the most common study design and data type used. Most studies used negative control outcomes (91, 50%), and for most studies the target source of bias was unmeasured confounding (93, 51%). We identified 4 utility domains of negative controls: 1) bias detection (149, 81%), 2) bias correction (16, 9%), 3) P-value calibration (8, 4%), and 4) performance assessment of different methods used in drug safety studies (31, 17%). The most popular methodologies used were the 95% confidence interval and P-value calibration. In addition, we identified 2 reference sets with structured steps to check the causality assumption of the negative control. While negative controls are powerful tools in bias detection, we found many studies lacked checking the underlying assumptions. This article is part of a Special Collection on Pharmacoepidemiology.
Affiliation(s)
- Zafar Zafari
  - Correspondence to Dr. Zafar Zafari, 220 N. Arch Street, Baltimore, Maryland, 21201
7
Dauner DG, Leal E, Adam TJ, Zhang R, Farley JF. Evaluation of four machine learning models for signal detection. Ther Adv Drug Saf 2023; 14:20420986231219472. [PMID: 38157242 PMCID: PMC10752114 DOI: 10.1177/20420986231219472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Accepted: 11/17/2023] [Indexed: 01/03/2024] Open
Abstract
Background: Logistic regression-based signal detection algorithms have benefits over disproportionality analysis due to their ability to handle potential confounders and masking factors. Feature exploration and developing alternative machine learning algorithms can further strengthen signal detection. Objectives: Our objective was to compare the signal detection performance of logistic regression, gradient-boosted trees, random forest and support vector machine models utilizing Food and Drug Administration adverse event reporting system data. Design: Cross-sectional study. Methods: The quarterly data extract files from 1 October 2017 through 31 December 2020 were downloaded. Due to an imbalanced outcome, two training sets were used: one stratified on the outcome variable and another using Synthetic Minority Oversampling Technique (SMOTE). A crude model and a model with tuned hyperparameters were developed for each algorithm. Model performance was compared against a reference set using accuracy, precision, F1 score, recall, the receiver operating characteristic area under the curve (ROCAUC), and the precision-recall curve area under the curve (PRCAUC). Results: Models trained on the balanced training set had higher accuracy, F1 score and recall compared to models trained on the SMOTE training set. When using the balanced training set, logistic regression, gradient-boosted trees, random forest and support vector machine models obtained similar performance evaluation metrics. The gradient-boosted trees hyperparameter tuned model had the highest ROCAUC (0.646) and the random forest crude model had the highest PRCAUC (0.839) when using the balanced training set. Conclusion: All models trained on the balanced training set performed similarly. Logistic regression models had higher accuracy, precision and recall. Logistic regression, random forest and gradient-boosted trees hyperparameter tuned models had a PRCAUC ⩾ 0.8. All models had an ROCAUC ⩾ 0.5. Including both disproportionality analysis results and additional case report information in models resulted in higher performance evaluation metrics than disproportionality analysis alone.
Affiliation(s)
- Daniel G. Dauner
  - Department of Pharmaceutical Care and Health Systems, College of Pharmacy, University of Minnesota Duluth, 232 Life Science, 1110 Kirby Drive, Duluth, MN 55812, USA
- Eleazar Leal
  - Department of Computer Science, Swenson College of Science and Engineering, University of Minnesota Duluth, Duluth, MN, USA
- Terrence J. Adam
  - Department of Pharmaceutical Care and Health Systems, College of Pharmacy, Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA
- Rui Zhang
  - Division of Computational Health Sciences, Department of Surgery, University of Minnesota, Minneapolis, MN, USA
- Joel F. Farley
  - Department of Pharmaceutical Care and Health Systems, College of Pharmacy, University of Minnesota, Minneapolis, MN, USA
8
Steinberg E, Ignatiadis N, Yadlowsky S, Xu Y, Shah N. Using public clinical trial reports to probe non-experimental causal inference methods. BMC Med Res Methodol 2023; 23:204. [PMID: 37689623 PMCID: PMC10492298 DOI: 10.1186/s12874-023-02025-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 08/24/2023] [Indexed: 09/11/2023] Open
Abstract
BACKGROUND Non-experimental studies (also known as observational studies) are valuable for estimating the effects of various medical interventions, but are notoriously difficult to evaluate because the methods used in non-experimental studies require untestable assumptions. This lack of intrinsic verifiability makes it difficult both to compare different non-experimental study methods and to trust the results of any particular non-experimental study. METHODS We introduce TrialProbe, a data resource and statistical framework for the evaluation of non-experimental methods. We first collect a dataset of pseudo "ground truths" about the relative effects of drugs by using empirical Bayesian techniques to analyze adverse events recorded in public clinical trial reports. We then develop a framework for evaluating non-experimental methods against that ground truth by measuring concordance between the non-experimental effect estimates and the estimates derived from clinical trials. As a demonstration of our approach, we also perform an example methods evaluation between propensity score matching, inverse propensity score weighting, and an unadjusted approach on a large national insurance claims dataset. RESULTS From the 33,701 clinical trial records in our version of the ClinicalTrials.gov dataset, we are able to extract 12,967 unique drug/drug adverse event comparisons to form a ground truth set. During our corresponding methods evaluation, we are able to use that reference set to demonstrate that both propensity score matching and inverse propensity score weighting can produce estimates that have high concordance with clinical trial results and substantially outperform an unadjusted baseline. CONCLUSIONS We find that TrialProbe is an effective approach for probing non-experimental study methods, being able to generate large ground truth sets that are able to distinguish how well non-experimental methods perform in real world observational data.
Affiliation(s)
- Ethan Steinberg
  - Center for Biomedical Informatics Research, Stanford University, Stanford, US
- Yizhe Xu
  - Center for Biomedical Informatics Research, Stanford University, Stanford, US
- Nigam Shah
  - Center for Biomedical Informatics Research, Stanford University, Stanford, US
9
Niazi SK. The Coming of Age of AI/ML in Drug Discovery, Development, Clinical Testing, and Manufacturing: The FDA Perspectives. Drug Des Devel Ther 2023; 17:2691-2725. [PMID: 37701048 PMCID: PMC10493153 DOI: 10.2147/dddt.s424991] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 06/28/2023] [Accepted: 08/24/2023] [Indexed: 09/14/2023] Open
Abstract
Artificial intelligence (AI) and machine learning (ML) represent significant advancements in computing, building on technologies that humanity has developed over millions of years, from the abacus to quantum computers. These tools have reached a pivotal moment in their development. In 2021 alone, the U.S. Food and Drug Administration (FDA) received over 100 product registration submissions that heavily relied on AI/ML for applications such as monitoring and improving human performance in compiling dossiers. To ensure the safe and effective use of AI/ML in drug discovery and manufacturing, the FDA and numerous other U.S. federal agencies have issued continuously updated, stringent guidelines. Intriguingly, these guidelines are often generated or updated with the aid of AI/ML tools themselves. The overarching goal is to expedite drug discovery, enhance the safety profiles of existing drugs, introduce novel treatment modalities, and improve manufacturing compliance and robustness. Recent FDA publications offer an encouraging outlook on the potential of these tools, emphasizing the need for their careful deployment. This has expanded market opportunities for retraining personnel handling these technologies and enabled innovative applications in emerging therapies such as gene editing, CRISPR-Cas9, CAR-T cells, mRNA-based treatments, and personalized medicine. In summary, the maturation of AI/ML technologies is a testament to human ingenuity. Far from being autonomous entities, these are tools created by and for humans, designed to solve complex problems now and in the future. This paper aims to present the status of these technologies, along with examples of their present and future applications.
10
Trillenberg P, Sprenger A, Machner B. Sensitivity and specificity in signal detection with the reporting odds ratio and the information component. Pharmacoepidemiol Drug Saf 2023; 32:910-917. [PMID: 36966482 DOI: 10.1002/pds.5624] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/15/2022] [Revised: 03/21/2023] [Accepted: 03/22/2023] [Indexed: 03/27/2023]
Abstract
PURPOSE The reporting odds ratio (ROR) and the information component (IC) can be used as measures of association between an adverse drug reaction (ADR) and exposure to a drug. We sought to test the reliability of signal detection with these measures. METHODS We simulated ADR counts as binomially distributed random numbers for different expected ADR frequencies and theoretical RORs. We then calculated the empirical IC and the empirical ROR and their confidence intervals. The rate of detected signals represented the false positive rate when the theoretical ROR was 1 and the sensitivity when the theoretical ROR was >1. RESULTS For expected case counts below 1, the false positive rate oscillates from 0.01 to 0.1 even though 0.025 was intended. Even beyond expected case counts of 5, oscillations can cover a range of 0.018 to 0.035. The first n oscillations with the largest amplitude are eliminated if a minimum case count of n is required. To detect an ROR of 2 with a sensitivity of 0.8, a minimum of 12 expected ADRs is required. In contrast, 2 expected ADRs suffice to detect an ROR of 4. CONCLUSION Summaries of disproportionality measures should include the expected number of cases in the group of interest if a signal was detected. If no signal was detected, the sensitivity for the detection of a representative ROR, or the minimum ROR that could be detected with probability 0.8, should be reported.
Affiliation(s)
- Peter Trillenberg
  - University Hospital of Schleswig-Holstein, Campus Lübeck, Dept. of Neurology, Ratzeburger Allee, 160, Lübeck, Germany
- Andreas Sprenger
  - University Hospital of Schleswig-Holstein, Campus Lübeck, Dept. of Neurology, Ratzeburger Allee, 160, Lübeck, Germany
  - Institute of Psychology II, University of Lübeck, Marie-Curie-Straße, 23562, Lübeck, Germany
- Björn Machner
  - University Hospital of Schleswig-Holstein, Campus Lübeck, Dept. of Neurology, Ratzeburger Allee, 160, Lübeck, Germany
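The abstract of entry 10 describes simulating binomially distributed ADR counts and measuring how often the ROR confidence interval flags a signal. The sketch below follows that outline for the ROR only (the IC is not covered). The comparator counts are held fixed, a Wald interval on the log scale is used, and the sample sizes, background probability, and seed are all assumptions, so the output is not expected to reproduce the paper's figures.

```python
import math
import random


def signal_rate(true_ror, n_drug=200, p_background=0.025, n_other=100_000,
                sims=2000, seed=1):
    """Fraction of simulated datasets whose lower 95% bound of the ROR exceeds 1."""
    rng = random.Random(seed)
    # Convert the theoretical ROR into the ADR probability among drug reports.
    odds = true_ror * p_background / (1 - p_background)
    p_drug = odds / (1 + odds)
    c = round(n_other * p_background)  # comparator reports with the ADR (held fixed)
    d = n_other - c                    # comparator reports without the ADR
    signals = 0
    for _ in range(sims):
        # Binomial draw: number of drug reports that mention the ADR.
        a = sum(rng.random() < p_drug for _ in range(n_drug))
        b = n_drug - a
        if a == 0 or b == 0:
            continue  # ROR undefined; treated as no signal
        log_ror = math.log((a * d) / (b * c))
        se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
        if log_ror - 1.96 * se > 0:
            signals += 1
    return signals / sims


false_positive_rate = signal_rate(true_ror=1.0)  # expected case count of 5
sensitivity = signal_rate(true_ror=4.0)
```

Because the case count is discrete, the signal rule effectively becomes a threshold on the count `a`. That discreteness is one way to see why a false positive rate can drift away from the nominal 0.025 as the expected count changes.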
11
Salvo F, Micallef J, Lahouegue A, Chouchana L, Létinier L, Faillie JL, Pariente A. Will the future of pharmacovigilance be more automated? Expert Opin Drug Saf 2023; 22:541-548. [PMID: 37435796 DOI: 10.1080/14740338.2023.2227091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Accepted: 06/15/2023] [Indexed: 07/13/2023]
Abstract
INTRODUCTION Artificial intelligence (AI) based tools offer new opportunities for pharmacovigilance (PV) activities. Nevertheless, their contribution to PV needs to be tailored to preserve and strengthen medical and pharmacological expertise in drug safety. AREAS COVERED This work aims to describe PV tasks in which the contribution of AI and intelligent automation (IA) tools is required, in the context of a continuous increase of spontaneous reporting cases and regulatory tasks. A narrative review with expert selection of pertinent references was performed through Medline. Two areas were covered, management of spontaneous reporting cases and signal detection. PERSPECTIVE The use of AI and IA tools will assist a large spectrum of PV activities, both in public and private PV systems, in particular for tasks of low added value (e.g. initial quality check, verification of essential regulatory information, search for duplicates). Testing, validating, and integrating these tools in the PV routine are the actual challenges for modern PV systems, to guarantee high-quality standards in terms of case management and signal detection.
Affiliation(s)
- Francesco Salvo
- University of Bordeaux, Inserm, BPH, Team AHeaD, Bordeaux, France
- CHU de Bordeaux, Service de Pharmacologie Medicale, Bordeaux, France
- Joelle Micallef
- Pharmacovigilance Centre, Department of Clinical Pharmacology and Pharmacovigilance, University of Aix Marseille, INSERM UMR 1106 Institut de Neurosciences des Systèmes, Marseille, France
- Amir Lahouegue
- Department of Pharmacovigilance and Medical Information, AstraZeneca, Courbevoie, France
- Laurent Chouchana
- Regional Center of Pharmacovigilance, Pharmacology Department, Cochin Port Royal University Hospital, Paris, France
- Louis Létinier
- University of Bordeaux, Inserm, BPH, Team AHeaD, Bordeaux, France
- CHU de Bordeaux, Service de Pharmacologie Medicale, Bordeaux, France
- Synapse Medicine, Bordeaux, France
- Jean-Luc Faillie
- Inserm, Departement de Pharmacologie Medicale Et Toxicologie, Centre Regional de PV, Institut Desbrest D'epidemiologie Et de Sante Publique, CHU de Montpellier, Universite Montpellier, Montpellier, France
- Antoine Pariente
- University of Bordeaux, Inserm, BPH, Team AHeaD, Bordeaux, France
- CHU de Bordeaux, Service de Pharmacologie Medicale, Bordeaux, France
12
Dauner DG, Zhang R, Adam TJ, Leal E, Heitlage V, Farley JF. Performance of subgrouped proportional reporting ratios in the US Food and Drug Administration (FDA) adverse event reporting system. Expert Opin Drug Saf 2023; 22:589-597. [PMID: 36800190 DOI: 10.1080/14740338.2023.2182289] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/08/2022] [Accepted: 02/06/2023] [Indexed: 02/18/2023]
Abstract
BACKGROUND Many signal detection algorithms give the same weight to information from all products and patients, which may result in signals being masked or false positives being flagged as potential signals. Subgrouped analysis can be used to help correct for this. RESEARCH DESIGN AND METHODS The publicly available US Food and Drug Administration Adverse Event Reporting System quarterly data extract files from 1 January 2015 through 30 September 2017 were utilized. A proportional reporting ratio (PRR) analysis subgrouped by either age, sex, ADE report type, seriousness of ADE, or reporter was compared to the crude PRR analysis using sensitivity, specificity, precision, and c-statistic. RESULTS Subgrouping by age (n = 78, 34.5% increase), sex (n = 67, 15.5% increase), and reporter (n = 64, 10.3% increase) identified more signals than the crude analysis. Subgrouping by either age or sex increased both the sensitivity and precision. Subgrouping by report type or seriousness resulted in fewer signals (n = 50, -13.8% for both). Subgrouped analyses had higher c-statistic values, with age having the highest (0.468). CONCLUSIONS Subgrouping by either age or sex produced more signals with higher sensitivity and precision than the crude PRR analysis. Subgrouping by these variables can unmask potentially important associations.
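The crude versus subgrouped PRR comparison in this abstract can be illustrated with the standard 2x2 formulation of the proportional reporting ratio. The following is a minimal sketch, not the authors' code: the names `prr` and `subgrouped_prr`, the stratum labels, and all counts are hypothetical, and the abstract does not state how stratum-level results were combined into a signal decision.

```python
from typing import Dict, Tuple

# Counts of a 2x2 reporting table, in the order (a, b, c, d):
#   a: reports with the drug of interest and the event of interest
#   b: reports with the drug of interest and any other event
#   c: reports with any other drug and the event of interest
#   d: reports with any other drug and any other event
Table = Tuple[int, int, int, int]


def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional reporting ratio: [a / (a + b)] / [c / (c + d)]."""
    if a + b == 0 or c == 0:
        raise ValueError("PRR is undefined when a + b == 0 or c == 0")
    return (a / (a + b)) / (c / (c + d))


def subgrouped_prr(strata: Dict[str, Table]) -> Dict[str, float]:
    """Compute one PRR per stratum (for example per sex or age band)."""
    return {name: prr(*counts) for name, counts in strata.items()}


# Hypothetical counts; the two strata sum to the crude table.
crude = (20, 80, 100, 900)
by_sex = {"female": (15, 35, 40, 460), "male": (5, 45, 60, 440)}

print("crude PRR:", prr(*crude))
print("PRR by stratum:", subgrouped_prr(by_sex))
```

In this toy table the female stratum shows a larger PRR than the crude analysis while the male stratum falls below 1, which is the kind of masking that a subgrouped analysis is meant to expose.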
Affiliation(s)
- Daniel G Dauner
- Department of Pharmaceutical Care and Health Systems, College of Pharmacy, University of Minnesota, Minneapolis, Minnesota, USA
- Rui Zhang
- Division of Computational Health Sciences, Department of Surgery, University of Minnesota, Minneapolis, Minnesota, USA
- Terrence J Adam
- Department of Pharmaceutical Care and Health Systems, College of Pharmacy, Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota, USA
- Eleazar Leal
- Department of Computer Science, Swenson College of Science and Engineering, University of Minnesota, Duluth, Minnesota, USA
- Viviene Heitlage
- Department of Pharmaceutical Care and Health Systems, College of Pharmacy, University of Minnesota, Minneapolis, Minnesota, USA
- Joel F Farley
- Department of Pharmaceutical Care and Health Systems, College of Pharmacy, University of Minnesota, Minneapolis, Minnesota, USA
13
Conover MM, Weaver J, Fan B, Leitz G, Richarz U, Li Q, Gifkins D. Cardiovascular outcomes among patients with castration-resistant prostate cancer: A comparative safety study using US administrative claims data. Prostate 2023; 83:729-739. [PMID: 36879362 DOI: 10.1002/pros.24510] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/23/2021] [Revised: 05/23/2022] [Accepted: 02/22/2023] [Indexed: 03/08/2023]
Abstract
BACKGROUND Cardiovascular conditions are the most prevalent comorbidity among patients with prostate cancer, regardless of treatment. Additionally, cardiovascular risk has been shown to increase following exposure to certain treatments for advanced prostate cancer. There is conflicting evidence on risk of overall and specific cardiovascular outcomes among men treated for metastatic castrate resistant prostate cancer (CRPC). We, therefore, sought to compare incidence of serious cardiovascular events among CRPC patients treated with abiraterone acetate plus predniso(lo)ne (AAP) and enzalutamide (ENZ), the two most widely used CRPC therapies. METHODS Using US administrative claims data, we selected CRPC patients newly exposed to either treatment after August 31, 2012, with prior androgen deprivation therapy (ADT). We assessed incidence of hospitalization for heart failure (HHF), ischemic stroke, and acute myocardial infarction (AMI) during the period 30-days after AAP or ENZ initiation to discontinuation, outcome occurrence, death, or disenrollment. We matched treatment groups on propensity-scores (PSs) to control for observed confounding to estimate the average treatment effect among the treated (AAP) using conditional Cox proportional hazards models. To account for residual bias, we calibrated our estimates against a distribution of effect estimates from 124 negative-control outcomes. RESULTS The HHF analysis included 2322 (45.1%) AAP initiators and 2827 (54.9%) ENZ initiators. In this analysis, the median follow-up times among AAP and ENZ initiators (after PS matching) were 144 and 122 days, respectively. The empirically calibrated hazard ratio (HR) estimate for HHF was 2.56 (95% confidence interval [CI]: 1.32, 4.94). Corresponding HRs for AMI and ischemic stroke were 1.94 (95% CI: 0.90, 4.18) and 1.25 (95% CI: 0.54, 2.85), respectively. 
CONCLUSIONS Our study sought to quantify risk of HHF, AMI and ischemic stroke among CRPC patients initiating AAP relative to ENZ within a national administrative claims database. Increased risk for HHF among AAP compared to ENZ users was observed. The difference in myocardial infarction did not attain statistical significance after controlling for residual bias, and no differences were noted in ischemic stroke between the two treatments. These findings confirm labeled warnings and precautions for AAP for HHF and contribute to the comparative real-world evidence on AAP relative to ENZ.
Affiliation(s)
- James Weaver
- Janssen Research & Development, Titusville, New Jersey, USA
- Bo Fan
- Janssen Research & Development, Titusville, New Jersey, USA
- Gerhard Leitz
- Janssen Research & Development, Titusville, New Jersey, USA
- Ute Richarz
- Janssen Research & Development, Titusville, New Jersey, USA
- Qing Li
- Janssen Research & Development, Titusville, New Jersey, USA
- Dina Gifkins
- Janssen Research & Development, Titusville, New Jersey, USA
14
Early Detection of Adverse Drug Reaction Signals by Association Rule Mining Using Large-Scale Administrative Claims Data. Drug Saf 2023; 46:371-389. [PMID: 36828947 PMCID: PMC10113351 DOI: 10.1007/s40264-023-01278-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 02/06/2023] [Indexed: 02/26/2023]
Abstract
INTRODUCTION Adverse drug reactions (ADRs) are a leading cause of mortality worldwide and should be detected promptly to reduce health risks to patients. A data-mining approach using large-scale medical records might be a useful method for the early detection of ADRs. Many studies have analyzed medical records to detect ADRs; however, most of them have focused on a narrow range of ADRs, limiting their usefulness. OBJECTIVE This study aimed to identify methods for the early detection of a wide range of ADR signals. METHODS First, to evaluate the performance in signal detection of ADRs by data-mining, we attempted to create a gold standard based on clinical evidence. Second, association rule mining (ARM) was applied to patient symptoms and medications registered in claims data, followed by evaluating ADR signal detection performance. RESULTS We created a new gold standard consisting of 92 positive and 88 negative controls. In the assessment of ARM using claims data, the areas under the receiver-operating characteristic curve and the precision-recall curve were 0.80 and 0.83, respectively. If the detection criteria were defined as lift > 1, conviction > 1, and p-value < 0.05, ARM could identify 156 signals, of which 90 were true positive controls (sensitivity: 0.98, specificity: 0.25). Evaluation of the capability of ARM with short periods of data revealed that ARM could detect a greater number of positive controls than the conventional analysis method. CONCLUSIONS ARM of claims data may be effective in the early detection of a wide range of ADR signals.
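The detection criteria named in this abstract (lift > 1, conviction > 1) follow the usual association rule definitions, and the reported sensitivity and specificity can be checked against the reported counts. The sketch below is illustrative only: the function names and the drug/symptom counts in the rule example are hypothetical, the p-value criterion is not covered, and the check assumes all 156 detected signals fall among the 180 gold-standard controls.

```python
from typing import Tuple


def lift(n_total: int, n_drug: int, n_symptom: int, n_both: int) -> float:
    """Lift of the rule drug -> symptom: confidence / support(symptom)."""
    confidence = n_both / n_drug
    return confidence / (n_symptom / n_total)


def conviction(n_total: int, n_drug: int, n_symptom: int, n_both: int) -> float:
    """Conviction of drug -> symptom: (1 - support(symptom)) / (1 - confidence)."""
    confidence = n_both / n_drug
    if confidence == 1.0:
        return float("inf")  # the rule never fails in the data
    return (1 - n_symptom / n_total) / (1 - confidence)


def sensitivity_specificity(
    n_pos: int, n_neg: int, n_signals: int, n_true_pos: int
) -> Tuple[float, float]:
    """Derive sensitivity and specificity from control and signal counts."""
    false_pos = n_signals - n_true_pos   # signals raised on negative controls
    true_neg = n_neg - false_pos         # negative controls not flagged
    return n_true_pos / n_pos, true_neg / n_neg


# Hypothetical rule: 1000 patients, 100 on the drug, 200 with the symptom, 40 with both.
print(lift(1000, 100, 200, 40), conviction(1000, 100, 200, 40))

# Counts reported in the abstract: 92 positive and 88 negative controls,
# 156 signals, of which 90 were true positive controls.
sens, spec = sensitivity_specificity(92, 88, 156, 90)
print(round(sens, 2), round(spec, 2))  # 0.98 0.25
```

The last line reproduces the sensitivity of 0.98 and specificity of 0.25 given in the abstract, which confirms that the low specificity corresponds to 66 of the 88 negative controls being flagged.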
15
Swerdel JN, Ramcharran D, Hardin J. Using a data-driven approach for the development and evaluation of phenotype algorithms for systemic lupus erythematosus. PLoS One 2023; 18:e0281929. [PMID: 36795690 PMCID: PMC9934349 DOI: 10.1371/journal.pone.0281929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2022] [Accepted: 02/04/2023] [Indexed: 02/17/2023] Open
Abstract
BACKGROUND Systemic lupus erythematosus (SLE) is a chronic autoimmune disease of unknown origin. The objective of this research was to develop phenotype algorithms for SLE suitable for use in epidemiological studies using empirical evidence from observational databases. METHODS We used a process for empirically determining and evaluating phenotype algorithms for health conditions to be analyzed in observational research. The process started with a literature search to discover prior algorithms used for SLE. We then used a set of Observational Health Data Sciences and Informatics (OHDSI) open-source tools to refine and validate the algorithms. These included tools to discover codes for SLE that may have been missed in prior studies and to determine possible low specificity and index date misclassification in algorithms for correction. RESULTS We developed four algorithms using our process: two algorithms for prevalent SLE and two for incident SLE. The algorithms for both incident and prevalent cases are comprised of a more specific version and a more sensitive version. Each of the algorithms corrects for possible index date misclassification. After validation, we found the highest positive predictive value estimate for the prevalent, specific algorithm (89%). The highest sensitivity estimate was found for the sensitive, prevalent algorithm (77%). CONCLUSION We developed phenotype algorithms for SLE using a data-driven approach. The four final algorithms may be used directly in observational studies. The validation of these algorithms provides researchers an added measure of confidence that the algorithms are selecting subjects correctly and allows for the application of quantitative bias analysis.
Affiliation(s)
- Joel N. Swerdel
- Janssen Research and Development Epidemiology, Titusville, New Jersey, United States of America
- Observational Health Data Sciences and Informatics (OHDSI), New York, New York, United States of America
- Darmendra Ramcharran
- Janssen Research and Development Epidemiology, Titusville, New Jersey, United States of America
- Jill Hardin
- Janssen Research and Development Epidemiology, Titusville, New Jersey, United States of America
- Observational Health Data Sciences and Informatics (OHDSI), New York, New York, United States of America
16
Cherkas Y, Ide J, van Stekelenborg J. Leveraging Machine Learning to Facilitate Individual Case Causality Assessment of Adverse Drug Reactions. Drug Saf 2022; 45:571-582. [PMID: 35579819 DOI: 10.1007/s40264-022-01163-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 02/13/2022] [Indexed: 11/29/2022]
Abstract
INTRODUCTION Causality assessment of individual case safety reports (ICSRs) is an important step in pharmacovigilance case-level review and aims to establish a position on whether a patient's exposure to a drug is causally related to the patient experiencing an untoward adverse event. There are many different approaches for case causality adjudication, including the use of expert opinions and algorithmic frameworks; however, a great deal of variability exists between assessment methods, products, therapeutic classes, individual physicians, change of process and conventions over time, and other factors. OBJECTIVE The objective of this study was to develop a machine learning-based model that can predict the likelihood of a causal association of an observed drug-reaction combination in an ICSR. METHODS In this study, we used a set of annotated solicited ICSRs (50K cases) from a company post-marketing database. These data were enriched with novel supplementary features from external and internal data sources that aim to capture facets such as temporal plausibility, scientific validity, and confoundedness that have been shown to contribute to causality adjudication. Using these features, we constructed a Bayesian network (BN) model to predict drug-event pair causality assessment. BN topology was driven by an internally developed ICSR causality decision support tool. Performance of the model was evaluated through examination of sensitivity, positive predictive value (PPV), and the area under the receiver operating characteristic curve (AUC) on an independent set of data from a temporally adjacent interval (20K cases). No external validation was performed because of a lack of publicly available ICSRs with causality assessments for drug-event pairs. RESULTS The model demonstrated high performance in predicting the causality assessment of drug-event pairs compared with clinical judgment using global introspection (AUC 0.924; 95% confidence interval [CI] 0.922-0.927). 
The sensitivity of the model was 0.900 (95% CI 0.896-0.904), and the PPV of the model was 0.778 (95% CI 0.773-0.783). CONCLUSION These results show that robust probabilistic modeling of ICSR causality is feasible, and the approach used in the development of the model can serve as a framework for such causality assessments, leading to improvements in safety decision making.
Affiliation(s)
- Joshua Ide
- Johnson & Johnson Consumer, Inc, Skillman, NJ, USA
17
Kontsioti E, Maskell S, Dutta B, Pirmohamed M. A reference set of clinically relevant adverse drug-drug interactions. Sci Data 2022; 9:72. [PMID: 35246559 PMCID: PMC8897500 DOI: 10.1038/s41597-022-01159-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 07/27/2021] [Accepted: 01/13/2022] [Indexed: 12/03/2022] Open
Abstract
The accurate and timely detection of adverse drug-drug interactions (DDIs) during the postmarketing phase is an important yet complex task with potentially major clinical implications. The development of data mining methodologies that scan healthcare databases for drug safety signals requires appropriate reference sets for performance evaluation. Methodologies for establishing DDI reference sets are limited in the literature, while there is no publicly available resource simultaneously focusing on clinical relevance of DDIs and individual behaviour of interacting drugs. By automatically extracting and aggregating information from multiple clinical resources, we provide a scalable approach for generating a reference set for DDIs that could support research in postmarketing safety surveillance. CRESCENDDI contains 10,286 positive and 4,544 negative controls, covering 454 drugs and 179 adverse events mapped to RxNorm and MedDRA concepts, respectively. It also includes single drug information for the included drugs (i.e., adverse drug reactions, indications, and negative drug-event associations). We demonstrate usability of the resource by scanning a spontaneous reporting system database for signals of DDIs using traditional signal detection algorithms.
Measurement(s): Adverse Event
Technology Type(s): digital curation
Sample Characteristic - Organism: Homo sapiens
Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.16681933
Affiliation(s)
- Elpida Kontsioti
- Department of Electrical Engineering and Electronics, University of Liverpool, Liverpool, UK
- Simon Maskell
- Department of Electrical Engineering and Electronics, University of Liverpool, Liverpool, UK
- Bhaskar Dutta
- Patient Safety Center of Excellence, Chief Medical Office Organization, AstraZeneca Pharmaceuticals, Gaithersburg, MD, USA
- Munir Pirmohamed
- The Wolfson Centre for Personalized Medicine, MRC Centre for Drug Safety Science, Department of Pharmacology and Therapeutics, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, United Kingdom
18
Ji X, Cui G, Xu C, Hou J, Zhang Y, Ren Y. Combining a Pharmacological Network Model with a Bayesian Signal Detection Algorithm to Improve the Detection of Adverse Drug Events. Front Pharmacol 2022; 12:773135. [PMID: 35046809 PMCID: PMC8762263 DOI: 10.3389/fphar.2021.773135] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/09/2021] [Accepted: 11/30/2021] [Indexed: 11/13/2022] Open
Abstract
Introduction: Improving adverse drug event (ADE) detection is important for post-marketing drug safety surveillance. Existing statistical approaches can be further optimized owing to their high efficiency and low cost. Objective: The objective of this study was to evaluate the proposed approach for use in pharmacovigilance, the early detection of potential ADEs, and the improvement of drug safety. Methods: We developed a novel integrated approach, the Bayesian signal detection algorithm, based on the pharmacological network model (ICPNM) using the FDA Adverse Event Reporting System (FAERS) data published from 2004 to 2009 and from 2014 to 2019Q2, together with the PubChem and DrugBank databases. First, we used a pharmacological network model to generate the probabilities for drug-ADE associations, which comprised the proper prior information component (IC). We then defined the probability of the propensity score adjustment based on a logistic regression model to control for the confounding bias. Finally, we chose the Side Effect Resource (SIDER) and the Observational Medical Outcomes Partnership (OMOP) data to evaluate the detection performance and robustness of the ICPNM compared with the statistical approaches [disproportionality analysis (DPA)] by using the area under the receiver operating characteristic curve (AUC) and Youden’s index. Results: Of the statistical approaches implemented, the ICPNM showed the best performance (AUC, 0.8291; Youden’s index, 0.5836). Meanwhile, the AUCs of the IC, EBGM, ROR, and PRR were 0.7343, 0.7231, 0.6828, and 0.6721, respectively. Conclusion: The proposed ICPNM combined the strengths of the pharmacological network model and the Bayesian signal detection algorithm and performed better in detecting true drug-ADE associations. It also detected newer ADE signals than a DPA and may be complementary to the existing statistical approaches.
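Youden's index, one of the two evaluation metrics named in this abstract, is defined as sensitivity + specificity - 1 and is commonly reported at the threshold that maximizes it. The sketch below shows that computation on made-up data; the function name `youden_index`, the scores, and the labels are hypothetical and are not taken from the study.

```python
from typing import List, Optional, Tuple


def youden_index(scores: List[float], labels: List[int]) -> Tuple[float, Optional[float]]:
    """Return the maximum Youden's J and the threshold that attains it.

    A drug-ADE pair is called a signal when its score is >= threshold.
    labels: 1 for a known (reference) association, 0 for a negative control.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    best_j, best_threshold = -1.0, None
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        true_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
        j = true_pos / n_pos + true_neg / n_neg - 1
        if j > best_j:  # ties keep the lowest threshold
            best_j, best_threshold = j, threshold
    return best_j, best_threshold


# Hypothetical signal scores for six drug-ADE pairs.
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0]
print(youden_index(scores, labels))
```

A J of 0 means the score is no better than chance at the chosen threshold and a J of 1 means perfect separation, so the reported 0.5836 sits between those extremes.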
Affiliation(s)
- Xiangmin Ji
- School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, China
- Guimei Cui
- School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, China
- Chengzhen Xu
- School of Computer Science and Technology, Huaibei Normal University, Huaibei, China
- Jie Hou
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, China
- Yunfei Zhang
- Department of Mathematics and Computer Engineering, Ordos Institute of Technology, Ordos, China
- Yan Ren
- School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, China
19
Lee S, Lee JH, Kim GJ, Kim JY, Shin H, Ko I, Choe S, Kim JH. Development of a Data-Driven Reference Standard for Adverse Drug Reaction (RS-ADR) Signal Assessment (Preprint). J Med Internet Res 2021; 24:e35464. [PMID: 36201386 PMCID: PMC9585444 DOI: 10.2196/35464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2021] [Revised: 04/29/2022] [Accepted: 07/14/2022] [Indexed: 11/13/2022] Open
Abstract
Background Pharmacovigilance using real-world data (RWD), such as multicenter electronic health records (EHRs), yields massively parallel adverse drug reaction (ADR) signals. However, proper validation of computationally detected ADR signals is not possible due to the lack of a reference standard for positive and negative associations. Objective This study aimed to develop a reference standard for ADR (RS-ADR) to streamline the systematic detection, assessment, and understanding of almost all drug-ADR associations suggested by RWD analyses. Methods We integrated well-known reference sets for drug-ADR pairs, including Side Effect Resource, Observational Medical Outcomes Partnership, and EU-ADR. We created a pharmacovigilance dictionary using controlled vocabularies and systematically annotated EHR data. Drug-ADR associations computed from MetaLAB and MetaNurse analyses of multicenter EHRs and extracted from the Food and Drug Administration Adverse Event Reporting System were integrated as “empirically determined” positive and negative reference sets by means of cross-validation between institutions. Results The RS-ADR consisted of 1344 drugs, 4485 ADRs, and 6,027,840 drug-ADR pairs with positive and negative consensus votes as pharmacovigilance reference sets. After the curation of the initial version of RS-ADR, novel ADR signals such as “famotidine–hepatic function abnormal” were detected and reasonably validated by RS-ADR. Although the validation of the entire reference standard is challenging, especially with this initial version, the reference standard will improve as more RWD participate in the consensus voting with advanced pharmacovigilance dictionaries and analytic algorithms. One can check if a drug-ADR pair has been reported by our web-based search interface for RS-ADRs. 
Conclusions RS-ADRs enriched with the pharmacovigilance dictionary, ADR knowledge, and real-world evidence from EHRs may streamline the systematic detection, evaluation, and causality assessment of computationally detected ADR signals.
Affiliation(s)
- Suehyun Lee
- Department of Biomedical Informatics, College of Medicine, Konyang University, Daejeon, Republic of Korea
- Jeong Hoon Lee
- Seoul National University Biomedical Informatics (SNUBI), Division of Biomedical Informatics, Seoul National University College of Medicine, Seoul, Republic of Korea
- Grace Juyun Kim
- Seoul National University Biomedical Informatics (SNUBI), Division of Biomedical Informatics, Seoul National University College of Medicine, Seoul, Republic of Korea
- Jong-Yeup Kim
- Healthcare Data Science Center, Konyang University Hospital, Daejeon, Republic of Korea
- Hyunah Shin
- Healthcare Data Science Center, Konyang University Hospital, Daejeon, Republic of Korea
- Inseok Ko
- Healthcare Data Science Center, Konyang University Hospital, Daejeon, Republic of Korea
- Seon Choe
- Seoul National University Biomedical Informatics (SNUBI), Division of Biomedical Informatics, Seoul National University College of Medicine, Seoul, Republic of Korea
- Ju Han Kim
- Seoul National University Biomedical Informatics (SNUBI), Division of Biomedical Informatics, Seoul National University College of Medicine, Seoul, Republic of Korea
20
Dasgupta S, Jayagopal A, Jun Hong AL, Mariappan R, Rajan V. Adverse Drug Event Prediction Using Noisy Literature-Derived Knowledge Graphs: Algorithm Development and Validation. JMIR Med Inform 2021; 9:e32730. [PMID: 34694230 PMCID: PMC8576589 DOI: 10.2196/32730] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/08/2021] [Revised: 09/07/2021] [Accepted: 09/18/2021] [Indexed: 01/19/2023] Open
Abstract
BACKGROUND Adverse drug events (ADEs) are unintended side effects of drugs that cause substantial clinical and economic burdens globally. Not all ADEs are discovered during clinical trials; therefore, postmarketing surveillance, called pharmacovigilance, is routinely conducted to find unknown ADEs. A wealth of information, which facilitates ADE discovery, lies in the growing body of biomedical literature. Knowledge graphs (KGs) encode information from the literature, where the vertices and the edges represent clinical concepts and their relations, respectively. The scale and unstructured form of the literature necessitates the use of natural language processing (NLP) to automatically create such KGs. Previous studies have demonstrated the utility of such literature-derived KGs in ADE prediction. Through unsupervised learning of the representations (features) of clinical concepts from the KG, which are used in machine learning models, state-of-the-art results for ADE prediction were obtained on benchmark data sets. OBJECTIVE Due to the use of NLP to infer literature-derived KGs, there is noise in the form of false positive (erroneous) and false negative (absent) nodes and edges. Previous representation learning methods do not account for such inaccuracies in the graph. NLP algorithms can quantify the confidence in their inference of extracted concepts and relations from the literature. Our hypothesis, which motivates this work, is that by using such confidence scores during representation learning, the learned embeddings would yield better features for ADE prediction models. METHODS We developed methods to use these confidence scores on two well-known representation learning methods-DeepWalk and Translating Embeddings for Modeling Multi-relational Data (TransE)-to develop their weighted versions: Weighted DeepWalk and Weighted TransE. 
These methods were used to learn representations from a large literature-derived KG, the Semantic MEDLINE Database, which contains more than 93 million clinical relations. They were compared with Embedding of Semantic Predications, which, to our knowledge, is the best reported representation learning method using the Semantic MEDLINE Database with state-of-the-art results for ADE prediction. Representations learned from different methods were used (separately) as features of drugs and diseases to build classification models for ADE prediction using benchmark data sets. The methods were compared rigorously over multiple cross-validation settings. RESULTS The weighted versions we designed were able to learn representations that yielded more accurate predictive models than the corresponding unweighted versions of both DeepWalk and TransE, as well as Embedding of Semantic Predications, in our experiments. There were performance improvements of up to 5.75% in the F1-score and 8.4% in the area under the receiver operating characteristic curve value, thus advancing the state of the art in ADE prediction from literature-derived KGs. CONCLUSIONS Our classification models can be used to aid pharmacovigilance teams in detecting potentially new ADEs. Our experiments demonstrate the importance of modeling inaccuracies in the inferred KGs for representation learning.
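One plausible way to use NLP confidence scores in a DeepWalk-style method is to bias the random walk so that high-confidence edges are followed more often. The abstract does not spell out how Weighted DeepWalk uses the scores, so the sketch below is an assumption for illustration only: the function `weighted_walk`, the toy graph, its node names, and the confidence values are all hypothetical.

```python
import random
from typing import Dict, List, Tuple

# Adjacency list: node -> list of (neighbor, confidence score in (0, 1]).
Graph = Dict[str, List[Tuple[str, float]]]


def weighted_walk(graph: Graph, start: str, length: int, rng: random.Random) -> List[str]:
    """Random walk in which the next node is drawn in proportion to edge confidence."""
    walk = [start]
    while len(walk) < length:
        edges = graph.get(walk[-1], [])
        if not edges:
            break  # dead end: stop the walk early
        neighbors = [node for node, _ in edges]
        weights = [confidence for _, confidence in edges]  # must not all be zero
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Toy knowledge graph with made-up extraction confidences.
toy_graph: Graph = {
    "drug_A": [("gene_G", 0.9), ("event_X", 0.2)],
    "gene_G": [("event_X", 0.8), ("drug_A", 0.9)],
    "event_X": [("drug_A", 0.2), ("gene_G", 0.8)],
}

rng = random.Random(0)  # fixed seed so the example is repeatable
walks = [weighted_walk(toy_graph, "drug_A", 5, rng) for _ in range(3)]
for walk in walks:
    print(" -> ".join(walk))
```

In DeepWalk, walks like these are treated as sentences and passed to a skip-gram model to learn node embeddings; under this sketch's assumption, low-confidence edges would simply appear less often in that training corpus.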
Affiliation(s)
- Abel Lim Jun Hong
- School of Computing, National University of Singapore, Singapore, Singapore
- Vaibhav Rajan
- Department of Information Systems and Analytics, National University of Singapore, Singapore, Singapore
21
Ding X, Mower J, Subramanian D, Cohen T. Augmenting aer2vec: Enriching distributed representations of adverse event report data with orthographic and lexical information. J Biomed Inform 2021; 119:103833. [PMID: 34111555 DOI: 10.1016/j.jbi.2021.103833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2021] [Revised: 05/10/2021] [Accepted: 06/02/2021] [Indexed: 11/29/2022]
Abstract
Adverse Drug Events (ADEs) are prevalent, costly, and sometimes preventable. Post-marketing drug surveillance aims to monitor ADEs that occur after a drug is released to market. Reports of such ADEs are aggregated by reporting systems, such as the Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS). In this paper, we consider the topic of how best to represent data derived from reports in FAERS for the purpose of detecting post-marketing surveillance signals, in order to inform regulatory decision making. In our previous work, we developed aer2vec, a method for deriving distributed representations (concept embeddings) of drugs and side effects from ADE reports, establishing the utility of distributional information for pharmacovigilance signal detection. In this paper, we advance this line of research further by evaluating the utility of encoding orthographic and lexical information. We do so by adapting two Natural Language Processing methods, subword embedding and vector retrofitting, which were developed to encode such information into word embeddings. Models were compared for their ability to distinguish between positive and negative examples in a set of manually curated drug/ADE relationships, with both aer2vec enhancements offering advantages in performances over baseline models, and best performance obtained when retrofitting and subword embeddings were applied in concert. In addition, this work demonstrates that models leveraging distributed representations do not require extensive manual preprocessing to perform well on this pharmacovigilance signal detection task, and may even benefit from information that would otherwise be lost during the normalization and standardization process.
Affiliation(s)
- Xiruo Ding
- Department of Biomedical Informatics & Medical Education, University of Washington, Seattle, WA, USA
- Justin Mower
- Department of Computer Science, Rice University, Houston, TX, USA
- Trevor Cohen
- Department of Biomedical Informatics & Medical Education, University of Washington, Seattle, WA, USA
22
Khouri C, Nguyen T, Revol B, Lepelley M, Pariente A, Roustit M, Cracowski JL. Leveraging the Variability of Pharmacovigilance Disproportionality Analyses to Improve Signal Detection Performances. Front Pharmacol 2021; 12:668765. [PMID: 34122089 PMCID: PMC8193489 DOI: 10.3389/fphar.2021.668765] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 02/17/2021] [Accepted: 05/14/2021] [Indexed: 11/13/2022] Open
Abstract
Background: A plethora of methods and models of disproportionality analyses for safety surveillance have been developed to date without consensus or a gold standard, leading to methodological heterogeneity and substantial variability in results. We hypothesized that this variability is inversely correlated to the robustness of a signal of disproportionate reporting (SDR) and could be used to improve signal detection performance. Methods: We used a validated reference set containing 399 true and false drug-event pairs and performed, with a frequentist and a Bayesian disproportionality method, seven types of analyses (models) for which the results were very unlikely to be related to actual differences in absolute risks of ADR. We calculated sensitivity and specificity and plotted ROC curves for each model. We then evaluated the predictive capacities of all models and assessed the impact of combining such models with the number of positive SDRs for a given drug-event pair through binomial regression models. Results: We found considerable variability in disproportionality analysis results: both positive and negative SDRs could be generated for 60% of all drug-event pairs depending on the model used, regardless of whether the pair was a true association. Furthermore, using the number of positive SDRs for a given drug-event pair largely improved the signal detection performance of all models. Conclusion: We therefore advocate for the pre-registration of protocols and the presentation of a set of secondary and sensitivity analyses instead of a unique result to avoid selective outcome reporting and because variability in the results may reflect the likelihood of a signal being a true adverse drug reaction.
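The idea of counting positive SDRs across analysis variants can be sketched as follows. This is only an illustration under stated assumptions: it uses the reporting odds ratio (ROR) with a Wald-type 95% confidence interval as the disproportionality measure and flags an SDR when the lower bound exceeds 1. The function names, the threshold, and the 2x2 counts are hypothetical and are not the seven models evaluated in the study.

```python
import math

def reporting_odds_ratio(a, b, c, d, z=1.96):
    """Return (ROR, lower, upper) for a 2x2 table of report counts.

    a: reports with the drug and the event
    b: reports with the drug and other events
    c: reports with other drugs and the event
    d: reports with other drugs and other events
    """
    ror = (a * d) / (b * c)
    # Standard error of log(ROR) for a Wald-type interval.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se)
    upper = math.exp(math.log(ror) + z * se)
    return ror, lower, upper

def count_positive_sdrs(tables):
    """Count analysis variants whose lower CI bound exceeds 1 (assumed SDR rule)."""
    return sum(1 for table in tables if reporting_odds_ratio(*table)[1] > 1.0)

# Hypothetical counts for one drug-event pair under three analysis variants.
tables = [(20, 980, 100, 9900), (5, 995, 100, 9900), (12, 488, 150, 9350)]
n_positive = count_positive_sdrs(tables)
print(n_positive)
```

In this toy example only the first variant flags the pair, which mirrors the abstract's point that the same pair can yield both positive and negative SDRs depending on the model.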
Affiliation(s)
- Charles Khouri
  - Pharmacovigilance Unit, Grenoble Alpes University Hospital, Grenoble, France
  - Clinical Pharmacology Department INSERM CIC 1406, Grenoble Alpes University Hospital, Grenoble, France
  - Hypoxia and PhysioPathology, UMR 1300, INSERM, University Grenoble Alpes, Grenoble, France
- Thuy Nguyen
  - Pharmacovigilance Unit, Grenoble Alpes University Hospital, Grenoble, France
- Bruno Revol
  - Pharmacovigilance Unit, Grenoble Alpes University Hospital, Grenoble, France
  - Clinical Pharmacology Department INSERM CIC 1406, Grenoble Alpes University Hospital, Grenoble, France
  - Hypoxia and PhysioPathology, UMR 1300, INSERM, University Grenoble Alpes, Grenoble, France
- Marion Lepelley
  - Pharmacovigilance Unit, Grenoble Alpes University Hospital, Grenoble, France
  - Clinical Pharmacology Department INSERM CIC 1406, Grenoble Alpes University Hospital, Grenoble, France
- Antoine Pariente
  - INSERM U1219, Bordeaux Population Health, Team Pharmacoepidemiology, University of Bordeaux, Bordeaux, France
  - Service de Pharmacologie Médicale, Pôle de Santé Publique, CHU de Bordeaux, Bordeaux, France
- Matthieu Roustit
  - Clinical Pharmacology Department INSERM CIC 1406, Grenoble Alpes University Hospital, Grenoble, France
  - Hypoxia and PhysioPathology, UMR 1300, INSERM, University Grenoble Alpes, Grenoble, France
- Jean-Luc Cracowski
  - Pharmacovigilance Unit, Grenoble Alpes University Hospital, Grenoble, France
  - Hypoxia and PhysioPathology, UMR 1300, INSERM, University Grenoble Alpes, Grenoble, France
|
23
|
Malec SA, Wei P, Bernstam EV, Boyce RD, Cohen T. Using computable knowledge mined from the literature to elucidate confounders for EHR-based pharmacovigilance. J Biomed Inform 2021; 117:103719. [PMID: 33716168 PMCID: PMC8559730 DOI: 10.1016/j.jbi.2021.103719] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/15/2020] [Revised: 12/31/2020] [Accepted: 01/04/2021] [Indexed: 10/21/2022]
Abstract
INTRODUCTION Drug safety research asks causal questions but relies on observational data. Confounding bias threatens the reliability of studies using such data. The successful control of confounding requires knowledge of variables called confounders affecting both the exposure and outcome of interest. However, causal knowledge of dynamic biological systems is complex and challenging. Fortunately, computable knowledge mined from the literature may hold clues about confounders. In this paper, we tested the hypothesis that incorporating literature-derived confounders can improve causal inference from observational data. METHODS We introduce two methods (semantic vector-based and string-based confounder search) that query literature-derived information for confounder candidates to control, using SemMedDB, a database of computable knowledge mined from the biomedical literature. These methods search SemMedDB for confounders by applying semantic constraint search for indications treated by the drug (exposure) and that are also known to cause the adverse event (outcome). We then include the literature-derived confounder candidates in statistical and causal models derived from free-text clinical notes. For evaluation, we use a reference dataset widely used in drug safety containing labeled pairwise relationships between drugs and adverse events and attempt to rediscover these relationships from a corpus of 2.2 M NLP-processed free-text clinical notes. We employ standard adjustment and causal inference procedures to predict and estimate causal effects by informing the models with varying numbers of literature-derived confounders and instantiating the exposure, outcome, and confounder variables in the models with dichotomous EHR-derived data. Finally, we compare the results from applying these procedures with naive measures of association (χ2 and reporting odds ratio) and with each other. 
RESULTS AND CONCLUSIONS We found semantic vector-based search to be superior to string-based search at reducing confounding bias. However, the effect of including more rather than fewer literature-derived confounders was inconclusive. We recommend using targeted learning estimation methods that can address treatment-confounder feedback, where confounders also behave as intermediate variables, and engaging subject-matter experts to adjudicate the handling of problematic covariates.
Affiliation(s)
- Scott A Malec
  - University of Pittsburgh School of Medicine, Department of Biomedical Informatics, Pittsburgh, PA, United States
- Peng Wei
  - The University of Texas MD Anderson Cancer Center, Department of Biostatistics, Houston, TX, United States
- Elmer V Bernstam
  - University of Texas Health Science Center at Houston, School of Biomedical Informatics, Houston, TX, United States
- Richard D Boyce
  - University of Pittsburgh School of Medicine, Department of Biomedical Informatics, Pittsburgh, PA, United States
- Trevor Cohen
  - University of Washington, Department of Biomedical Informatics and Medical Education, Seattle, WA, United States
|
24
|
Thurin NH, Lassalle R, Schuemie M, Pénichon M, Gagne JJ, Rassen JA, Benichou J, Weill A, Blin P, Moore N, Droz-Perroteau C. Empirical assessment of case-based methods for identification of drugs associated with acute liver injury in the French National Healthcare System database (SNDS). Pharmacoepidemiol Drug Saf 2020; 30:320-333. [PMID: 33099844 DOI: 10.1002/pds.5161] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/31/2019] [Revised: 10/20/2020] [Accepted: 10/21/2020] [Indexed: 11/10/2022]
Abstract
PURPOSE Drug-induced acute liver injury (ALI) is a frequent cause of liver failure. Case-based designs were empirically assessed and calibrated in the French National claims database (SNDS), aiming to identify the optimum design for drug safety alert generation associated with ALI. METHODS All cases of ALI were extracted from SNDS (2009-2014) using specific and sensitive definitions. Positive and negative drug controls were used to compare 196 self-controlled case series (SCCS), case-control (CC), and case-population (CP) design variants, using area under the receiver operating characteristic curve (AUC), mean square error (MSE) and coverage probability. Parameters that had major impacts on results were identified through logistic regression. RESULTS Using a specific ALI definition, AUCs ranged from 0.78 to 0.94, 0.64 to 0.92 and 0.48 to 0.85, for SCCS, CC and CP, respectively. MSE ranged from 0.12 to 0.40, 0.22 to 0.39 and 1.03 to 5.29, respectively. Variants adjusting for multiple drug use had higher coverage probabilities. Univariate regressions showed that high AUCs were achieved with SCCS using exposed time as the risk window. The top SCCS variant yielded an AUC of 0.93, an MSE of 0.22, and a coverage of 86%, with 1/7 negative and 13/18 positive controls presenting significant estimates. CONCLUSIONS SCCS adjusting for multiple drugs and using exposed time as the risk window performed best in generating ALI-related drug safety alerts and providing estimates of the magnitude of the risk. This approach may be useful for ad-hoc pharmacoepidemiology studies to support regulatory actions.
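To make the AUC comparison concrete, the sketch below scores one design variant against positive and negative drug controls using the rank-based (Mann-Whitney) definition of AUC: the probability that a randomly chosen positive control receives a higher estimate than a randomly chosen negative control, with ties counted as one half. The function name and the effect estimates are hypothetical and are not taken from the study.

```python
def auc_from_controls(positive_scores, negative_scores):
    """Rank-based AUC: share of (positive, negative) pairs ranked correctly."""
    concordant = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5  # ties count as half a concordant pair
    return concordant / (len(positive_scores) * len(negative_scores))

# Hypothetical relative-risk estimates produced by one design variant.
positive_controls = [2.1, 1.8, 0.9, 3.0]  # drugs assumed to cause the outcome
negative_controls = [1.0, 0.7, 1.9]       # drugs assumed not to cause it
auc = auc_from_controls(positive_controls, negative_controls)
print(auc)
```

An AUC of 0.5 corresponds to chance-level discrimination, which is why the lower end of the case-population range reported above (0.48) indicates essentially no discriminative ability.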
Affiliation(s)
- Nicolas H Thurin
  - Univ. Bordeaux, INSERM CIC-P1401, Bordeaux PharmacoEpi, Bordeaux, France
- Régis Lassalle
  - Univ. Bordeaux, INSERM CIC-P1401, Bordeaux PharmacoEpi, Bordeaux, France
- Martijn Schuemie
  - Epidemiology Analytics, Janssen Research and Development, Titusville, New Jersey, USA
  - Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA
- Marine Pénichon
  - Univ. Bordeaux, INSERM CIC-P1401, Bordeaux PharmacoEpi, Bordeaux, France
- Joshua J Gagne
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Jacques Benichou
  - Department of Biostatistics and Clinical Research, Rouen University Hospital, Rouen, France
  - INSERM U1181, Paris, France
- Alain Weill
  - Caisse Nationale de l'Assurance Maladie, Paris, France
- Patrick Blin
  - Univ. Bordeaux, INSERM CIC-P1401, Bordeaux PharmacoEpi, Bordeaux, France
- Nicholas Moore
  - Univ. Bordeaux, INSERM CIC-P1401, Bordeaux PharmacoEpi, Bordeaux, France
  - CHU de Bordeaux, Bordeaux, France
|
25
|
Li Y, Jimeno Yepes A, Xiao C. Combining Social Media and FDA Adverse Event Reporting System to Detect Adverse Drug Reactions. Drug Saf 2020; 43:893-903. [PMID: 32385840 PMCID: PMC7434724 DOI: 10.1007/s40264-020-00943-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 12/28/2022]
Abstract
INTRODUCTION Adverse drug reactions (ADRs) are unintended reactions caused by a drug or combination of drugs taken by a patient. The current safety surveillance system relies on spontaneous reporting systems (SRSs) and more recently on observational health data; however, ADR detection may be delayed and lack geographic diversity. The broad scope of social media conversations, such as those on Twitter, can include health-related topics. Consequently, these data could be used to detect potentially novel ADRs with less latency. Although research regarding ADR detection using social media has made progress, findings are based on single information sources, and no study has yet integrated drug safety evidence from both an SRS and Twitter. OBJECTIVE The aim of this study was to combine signals from an SRS and Twitter to facilitate the detection of safety signals and compare the performance of the combined system with signals generated by individual data sources. METHODS We extracted potential drug-ADR posts from Twitter, used Monte Carlo expectation maximization to generate drug safety signals from both the US FDA Adverse Event Reporting System and posts from Twitter, and then integrated these signals using a Bayesian hierarchical model. The results from the integrated system and two individual sources were evaluated using a reference standard derived from drug labels. Area under the receiver operating characteristics curve (AUC) was computed to measure performance. RESULTS We observed a significant improvement in the AUC of the combined system when comparing it with Twitter alone, and no improvement when comparing with the SRS alone. The AUCs ranged from 0.587 to 0.637 for the combined SRS and Twitter, from 0.525 to 0.534 for Twitter alone, and from 0.612 to 0.642 for the SRS alone. The results varied because different preprocessing procedures were applied to Twitter. 
CONCLUSION The accuracy of signal detection using social media can be improved by combining signals with those from SRSs. However, the combined system cannot achieve better AUC performance than data from FAERS alone, which may indicate that Twitter data are not ready to be integrated into a purely data-driven combination system.
Affiliation(s)
- Ying Li
  - Center for Computational Health, IBM Thomas J. Watson Research Center, 1101 Kitchawan Rd, Yorktown Heights, NY, 10598, USA
- Cao Xiao
  - Analytics Center of Excellence, IQVIA, Cambridge, MA, USA
|
26
|
Yuan Z, DeFalco F, Wang L, Hester L, Weaver J, Swerdel JN, Freedman A, Ryan P, Schuemie M, Qiu R, Yee J, Meininger G, Berlin JA, Rosenthal N. Acute pancreatitis risk in type 2 diabetes patients treated with canagliflozin versus other antihyperglycemic agents: an observational claims database study. Curr Med Res Opin 2020; 36:1117-1124. [PMID: 32338068 DOI: 10.1080/03007995.2020.1761312] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/20/2022]
Abstract
Objective: Observational evidence suggests that patients with type 2 diabetes mellitus (T2DM) are at increased risk for acute pancreatitis (AP) versus those without T2DM. A small number of AP events were reported in clinical trials of the sodium glucose co-transporter 2 inhibitor canagliflozin, though no imbalances were observed between treatment groups. This observational study evaluated risk of AP among new users of canagliflozin compared with new users of six classes of other antihyperglycemic agents (AHAs). Methods: Three US claims databases were analyzed based on a prespecified protocol approved by the European Medicines Agency. Propensity score adjustment controlled for imbalances in baseline covariates. Cox regression models estimated the hazard ratio of AP with canagliflozin compared with other AHAs using on-treatment (primary) and intent-to-treat approaches. Sensitivity analyses assessed robustness of findings. Results: Across the three databases, there were between 12,023 and 80,986 new users of canagliflozin; the unadjusted incidence rates of AP (per 1000 person-years) were between 1.5 and 2.2 for canagliflozin and between 1.1 and 6.6 for other AHAs. The risk of AP was generally similar for new users of canagliflozin compared with new users of glucagon-like peptide-1 receptor agonists, dipeptidyl peptidase-4 inhibitors, sulfonylureas, thiazolidinediones, insulin, and other AHAs, with no consistent between-treatment differences observed across databases. Intent-to-treat and sensitivity analysis findings were qualitatively consistent with on-treatment findings. Conclusions: In this large observational study, incidence rates of AP in patients with T2DM treated with canagliflozin or other AHAs were generally similar, with no evidence suggesting that canagliflozin is associated with increased risk of AP compared with other AHAs.
Affiliation(s)
- Zhong Yuan
  - Epidemiology, Janssen Research & Development, LLC, Titusville, NJ, USA
- Frank DeFalco
  - Epidemiology, Janssen Research & Development, LLC, Raritan, NJ, USA
- Lu Wang
  - Epidemiology, Janssen Research & Development, LLC, Titusville, NJ, USA
- Laura Hester
  - Epidemiology, Janssen Research & Development, LLC, Titusville, NJ, USA
- James Weaver
  - Epidemiology, Janssen Research & Development, LLC, Titusville, NJ, USA
- Joel N Swerdel
  - Epidemiology, Janssen Research & Development, LLC, Titusville, NJ, USA
- Amy Freedman
  - Global Medical Safety, Janssen Research & Development, LLC, Titusville, NJ, USA
- Patrick Ryan
  - Epidemiology, Janssen Research & Development, LLC, Titusville, NJ, USA
- Martijn Schuemie
  - Epidemiology, Janssen Research & Development, LLC, Titusville, NJ, USA
- Rose Qiu
  - Cardiovascular and Metabolism, Janssen Research & Development, LLC, Raritan, NJ, USA
- Jacqueline Yee
  - Cardiovascular and Metabolism, Janssen Research & Development, LLC, Raritan, NJ, USA
- Gary Meininger
  - Cardiovascular and Metabolism, Janssen Research & Development, LLC, Raritan, NJ, USA
- Norman Rosenthal
  - Cardiovascular and Metabolism, Janssen Research & Development, LLC, Raritan, NJ, USA
|
27
|
Thurin NH, Lassalle R, Schuemie M, Pénichon M, Gagne JJ, Rassen JA, Benichou J, Weill A, Blin P, Moore N, Droz‐Perroteau C. Empirical assessment of case‐based methods for identification of drugs associated with upper gastrointestinal bleeding in the French National Healthcare System database (SNDS). Pharmacoepidemiol Drug Saf 2020; 29:890-903. [DOI: 10.1002/pds.5038] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/01/2019] [Revised: 02/21/2020] [Accepted: 05/08/2020] [Indexed: 01/05/2023]
Affiliation(s)
- Nicolas H. Thurin
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
  - INSERM U1219, Université de Bordeaux, Bordeaux, France
- Régis Lassalle
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
- Martijn Schuemie
  - Epidemiology Analytics, Janssen Research and Development, Titusville, New Jersey, USA
  - Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA
- Marine Pénichon
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
- Joshua J. Gagne
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Jacques Benichou
  - Department of Biostatistics and Clinical Research, Rouen University Hospital, Rouen, France
  - INSERM U1181, Paris, France
- Alain Weill
  - Caisse Nationale de l'Assurance Maladie, Paris, France
- Patrick Blin
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
- Nicholas Moore
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
  - INSERM U1219, Université de Bordeaux, Bordeaux, France
  - CHU de Bordeaux, Bordeaux, France
|
28
|
Wilcox MA, Villasis-Keever A, Sena AG, Knoll C, Fife D. Evaluation of disability in patients exposed to fluoroquinolones. BMC Pharmacol Toxicol 2020; 21:40. [PMID: 32493505 PMCID: PMC7268406 DOI: 10.1186/s40360-020-00415-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/05/2019] [Accepted: 05/19/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Fluoroquinolones are used for conditions including sinusitis, bronchitis, and urinary tract infections. It has been suggested that exposure to fluoroquinolones for these conditions is associated with disability resulting from adverse events in 2 or more organ systems. The objectives were to 1) describe fluoroquinolone, azithromycin, and sulfamethoxazole/trimethoprim utilization for these infections; 2) describe the rate of disability associated with exposure to each of these antibiotic classes and adverse events in 2 or more system organ classes; and 3) compare outcome rates for each of the antibiotic classes. METHODS This study was conducted using administrative data to mitigate the limitations of spontaneous reports. The sampling frame was a U.S. population with both medical and disability insurance, including patients with the above uncomplicated infections who were prescribed the antibiotics of interest. The primary outcome was an incident short-term disability claim associated with adverse events in 2 different organ systems within 120 days of exposure. A matched analysis was used to compare the outcome for patients receiving each of the drug classes. RESULTS After propensity score matching, there were 119,653 individuals in each of the exposure groups. There were 264 fluoroquinolone-associated disability events and 243 azithromycin/sulfamethoxazole-associated disability events (relative risk = 1.09; 95% CI: 0.92-1.30; calibrated p = 0.84). The results were not significantly different from the null hypothesis of no difference between groups. CONCLUSION Comparative assessments are difficult to conduct in spontaneous reports. This examination of disability associated with adverse events in different system organ classes showed no difference between fluoroquinolones and azithromycin or sulfamethoxazole in administrative data.
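The reported relative risk can be reproduced from the counts in the abstract (264 and 243 events among 119,653 matched individuals per group). The sketch below also computes a crude Wald-type 95% confidence interval on the log scale; this is an assumption for illustration, and the crude interval need not coincide exactly with the published one, which may reflect the matched analysis and calibration used by the authors.

```python
import math

def relative_risk(events_a, n_a, events_b, n_b, z=1.96):
    """Return (RR, lower, upper) using a crude Wald interval on log(RR)."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent cumulative incidences.
    se = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper

# Counts taken from the abstract: events and group sizes after matching.
rr, lower, upper = relative_risk(264, 119_653, 243, 119_653)
print(round(rr, 2), round(lower, 2), round(upper, 2))
```

Because the two groups have equal size, the relative risk reduces to the ratio of event counts, 264/243, which rounds to the reported 1.09.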
Affiliation(s)
- Marsha A Wilcox
  - Janssen Research & Development, LLC, 1125 Trenton Harbourton Road, Titusville, NJ, 08560, USA
- Anthony G Sena
  - Janssen Research & Development, LLC, 1125 Trenton Harbourton Road, Titusville, NJ, 08560, USA
- Christopher Knoll
  - Janssen Research & Development, LLC, 1125 Trenton Harbourton Road, Titusville, NJ, 08560, USA
- Daniel Fife
  - Janssen Research & Development, LLC, 1125 Trenton Harbourton Road, Titusville, NJ, 08560, USA
|
29
|
Peng M, Lee S, D'Souza AG, Doktorchik CTA, Quan H. Development and validation of data quality rules in administrative health data using association rule mining. BMC Med Inform Decis Mak 2020; 20:75. [PMID: 32334599 PMCID: PMC7183129 DOI: 10.1186/s12911-020-1089-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 11/04/2019] [Accepted: 04/07/2020] [Indexed: 11/17/2022] Open
Abstract
Background Data quality assessment presents a challenge for research using coded administrative health data. The objective of this study is to develop and validate a set of coding association rules for coded diagnostic data. Methods We used the Canadian re-abstracted hospital discharge abstract data coded in International Classification of Disease, 10th revision (ICD-10) codes. Association rule mining was conducted on the re-abstracted data in four age groups (0–4, 20–44, 45–64; ≥ 65) to extract ICD-10 coding association rules at the three-digit (category of diagnosis) and four-digit levels (category of diagnosis with etiology, anatomy, or severity). The rules were reviewed by a panel of 5 physicians and 2 classification specialists using a modified Delphi rating process. We proposed and defined variance and bias measures to assess data quality using the rules. Results After the rule mining process and the panel review, 388 rules at the three-digit level and 275 rules at the four-digit level were developed. Half of the rules were from the age group of ≥ 65. Rules captured meaningful age-specific clinical associations, with rules in the age group of ≥ 65 being more complex and comprehensive than those in other age groups. The variance and bias measures can identify rules with high bias and variance in Alberta data and provide directions for quality improvement. Conclusions A set of ICD-10 data quality rules was developed and validated by a clinical and classification expert panel. The rules can be used as a tool to assess ICD-coded data, enabling the monitoring and comparison of data quality across institutions, provinces, and countries.
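Association rule mining over coded diagnoses can be illustrated with a minimal sketch restricted to pairwise rules (one antecedent code, one consequent code), scored by support and confidence. The thresholds, the function name, and the toy discharge records are hypothetical; the study's actual mining algorithm, thresholds, and rule sizes are not specified in the abstract.

```python
from collections import Counter
from itertools import permutations

def mine_pair_rules(records, min_support=0.3, min_confidence=0.8):
    """Return sorted (lhs, rhs, support, confidence) rules of the form lhs -> rhs."""
    n = len(records)
    item_counts = Counter()
    pair_counts = Counter()
    for codes in records:
        unique = set(codes)
        item_counts.update(unique)
        # Ordered pairs so that A -> B and B -> A are scored separately.
        pair_counts.update(permutations(sorted(unique), 2))
    rules = []
    for (lhs, rhs), count in pair_counts.items():
        support = count / n                  # share of records with both codes
        confidence = count / item_counts[lhs]  # share of lhs records that also have rhs
        if support >= min_support and confidence >= min_confidence:
            rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

# Hypothetical discharge records with three-digit ICD-10 categories.
records = [
    {"E11", "I10"},
    {"E11", "I10", "N18"},
    {"I10"},
    {"E11", "I10"},
    {"J18"},
]
rules = mine_pair_rules(records)
print(rules)
```

In this toy data every record coded E11 is also coded I10, so the rule E11 -> I10 passes both thresholds, while the reverse rule fails on confidence (3 of 4). A rule of this kind could then be used to flag records where the expected co-coded diagnosis is absent.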
Affiliation(s)
- Mingkai Peng
  - Department of Community Health Sciences, University of Calgary, Calgary, Alberta, Canada
  - Analytics, Alberta Health Services, Calgary, Alberta, Canada
- Sangmin Lee
  - Department of Community Health Sciences, University of Calgary, Calgary, Alberta, Canada
- Adam G D'Souza
  - Analytics, Alberta Health Services, Calgary, Alberta, Canada
  - Centre for Health Informatics, University of Calgary, Calgary, Alberta, Canada
- Hude Quan
  - Department of Community Health Sciences, University of Calgary, Calgary, Alberta, Canada
  - Centre for Health Informatics, University of Calgary, Calgary, Alberta, Canada
|
30
|
Thurin NH, Lassalle R, Schuemie M, Pénichon M, Gagne JJ, Rassen JA, Benichou J, Weill A, Blin P, Moore N, Droz-Perroteau C. Empirical assessment of case-based methods for drug safety alert identification in the French National Healthcare System database (SNDS): Methodology of the ALCAPONE project. Pharmacoepidemiol Drug Saf 2020; 29:993-1000. [PMID: 32133717 DOI: 10.1002/pds.4983] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/07/2019] [Revised: 01/02/2020] [Accepted: 02/12/2020] [Indexed: 01/22/2023]
Abstract
OBJECTIVES To introduce the methodology of the ALCAPONE project. BACKGROUND The French National Healthcare System Database (SNDS), covering 99% of the French population, provides a potentially valuable opportunity for drug safety alert generation. ALCAPONE aimed to empirically assess case-based designs in the SNDS for alert generation related to four health outcomes of interest. METHODS ALCAPONE used a reference set adapted from the Observational Medical Outcomes Partnership (OMOP) and the Exploring and Understanding Adverse Drug Reactions (EU-ADR) project, with four outcomes: acute liver injury (ALI), myocardial infarction (MI), acute kidney injury (AKI), and upper gastrointestinal bleeding (UGIB), together with positive and negative drug controls. ALCAPONE consisted of four main phases: (1) data preparation to fit the OMOP Common Data Model and select the drug controls; (2) detection of the selected controls via three case-based designs: case-population, case-control, and self-controlled case series, including design variants (varying risk window, adjustment strategy, etc.); (3) comparison of design variant performance (area under the ROC curve, mean square error, etc.); and (4) selection of the optimal design variants and their calibration for each outcome. RESULTS Over 2009-2014, 5225 cases of ALI, 354 109 MI, 12 633 AKI, and 156 057 UGIB were identified using specific definitions. The number of detectable drugs ranged from 61 for MI to 25 for ALI. Design variants generated more than 50 000 point estimates. Results by outcome will be published in forthcoming papers. CONCLUSIONS ALCAPONE has shown the value of empirically assessing pharmacoepidemiological approaches for drug safety alert generation and may encourage other researchers to do the same in other databases.
Affiliation(s)
- Nicolas H Thurin
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
  - INSERM U1219, Université de Bordeaux, Bordeaux, France
- Régis Lassalle
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
- Martijn Schuemie
  - Epidemiology Analytics, Janssen Research and Development, Titusville, New Jersey, USA
  - Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA
- Marine Pénichon
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
- Joshua J Gagne
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Jacques Benichou
  - Department of Biostatistics and Clinical Research, Rouen University Hospital, Rouen, France
  - INSERM U1181, Paris, France
- Alain Weill
  - Caisse Nationale de l'Assurance Maladie, Paris, France
- Patrick Blin
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
- Nicholas Moore
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
  - INSERM U1219, Université de Bordeaux, Bordeaux, France
  - CHU de Bordeaux, Bordeaux, France
|
31
|
Schuemie MJ, Cepeda MS, Suchard MA, Yang J, Tian Y, Schuler A, Ryan PB, Madigan D, Hripcsak G. How Confident Are We about Observational Findings in Healthcare: A Benchmark Study. HARVARD DATA SCIENCE REVIEW 2020; 2. [PMID: 33367288 DOI: 10.1162/99608f92.147cc28e] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 12/11/2022] Open
Abstract
Healthcare professionals increasingly rely on observational healthcare data, such as administrative claims and electronic health records, to estimate the causal effects of interventions. However, limited prior studies raise concerns about the real-world performance of the statistical and epidemiological methods that are used. We present the "OHDSI Methods Benchmark" that aims to evaluate the performance of effect estimation methods on real data. The benchmark comprises a gold standard, a set of metrics, and a set of open source software tools. The gold standard is a collection of real negative controls (drug-outcome pairs where no causal effect appears to exist) and synthetic positive controls (drug-outcome pairs that augment negative controls with simulated causal effects). We apply the benchmark using four large healthcare databases to evaluate methods commonly used in practice: the new-user cohort, self-controlled cohort, case-control, case-crossover, and self-controlled case series designs. The results confirm the concerns about these methods, showing that for most methods the operating characteristics deviate considerably from nominal levels. For example, in most contexts, only half of the 95% confidence intervals we calculated contain the corresponding true effect size. We previously developed an "empirical calibration" procedure to restore these characteristics and we also evaluate this procedure. While no one method dominates, self-controlled methods such as the empirically calibrated self-controlled case series perform well across a wide range of scenarios.
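The coverage metric mentioned above (the share of 95% confidence intervals that contain the true effect size) can be sketched in a few lines. The function name, the intervals, and the true effect sizes below are hypothetical; in the benchmark's terms, negative controls would have a true effect of 1 and synthetic positive controls a known simulated effect.

```python
def coverage(intervals, true_effects):
    """Fraction of (lower, upper) intervals that contain the matching true effect."""
    hits = sum(
        1 for (lower, upper), truth in zip(intervals, true_effects)
        if lower <= truth <= upper
    )
    return hits / len(intervals)

# Hypothetical 95% confidence intervals and the corresponding true effect sizes.
intervals = [(0.8, 1.3), (1.1, 1.6), (1.5, 2.4), (2.2, 3.1)]
true_effects = [1.0, 1.0, 2.0, 2.0]
observed_coverage = coverage(intervals, true_effects)
print(observed_coverage)
```

A well-behaved method would have coverage close to the nominal 0.95; a value near 0.5, as in this toy example, corresponds to the situation the abstract describes, where only about half of the intervals contain the true effect.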
Affiliation(s)
- Martijn J Schuemie
  - Observational Health Data Sciences and Informatics
  - Epidemiology Analytics, Janssen Research and Development
  - Department of Biostatistics, University of California, Los Angeles
- M Soledad Cepeda
  - Observational Health Data Sciences and Informatics
  - Epidemiology Analytics, Janssen Research and Development
- Marc A Suchard
  - Observational Health Data Sciences and Informatics
  - Department of Biostatistics, University of California, Los Angeles
  - Department of Biomathematics, University of California, Los Angeles
  - Department of Human Genetics, University of California, Los Angeles
- Jianxiao Yang
  - Observational Health Data Sciences and Informatics
  - Department of Biomathematics, University of California, Los Angeles
- Yuxi Tian
  - Observational Health Data Sciences and Informatics
  - Department of Biomathematics, University of California, Los Angeles
- Alejandro Schuler
  - Observational Health Data Sciences and Informatics
  - Center for Biomedical Informatics Research, Stanford University
- Patrick B Ryan
  - Observational Health Data Sciences and Informatics
  - Epidemiology Analytics, Janssen Research and Development
  - Department of Biomedical Informatics, Columbia University
- David Madigan
  - Observational Health Data Sciences and Informatics
  - Department of Statistics, Columbia University
- George Hripcsak
  - Observational Health Data Sciences and Informatics
  - Department of Biomedical Informatics, Columbia University
  - Medical Informatics Services, New York-Presbyterian Hospital
|
32
|
Spiro A, Fernández García J, Yanover C. Inferring new relations between medical entities using literature curated term co-occurrences. JAMIA Open 2020; 2:378-385. [PMID: 31984370 PMCID: PMC6951958 DOI: 10.1093/jamiaopen/ooz022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2019] [Revised: 06/05/2019] [Accepted: 06/08/2019] [Indexed: 11/17/2022] Open
Abstract
Objectives Identifying new relations between medical entities, such as drugs, diseases, and side effects, is typically a resource-intensive task, involving experimentation and clinical trials. The increased availability of related data and curated knowledge enables a computational approach to this task, notably by training models to predict likely relations. Such models rely on meaningful representations of the medical entities being studied. We propose a generic features vector representation that leverages co-occurrences of medical terms, linked with PubMed citations. Materials and Methods We demonstrate the usefulness of the proposed representation by inferring two types of relations: a drug causes a side effect and a drug treats an indication. To predict these relations and assess their effectiveness, we applied 2 modeling approaches: multi-task modeling using neural networks and single-task modeling based on gradient boosting machines and logistic regression. Results These trained models, which predict either side effects or indications, obtained significantly better results than baseline models that use a single direct co-occurrence feature. The results demonstrate the advantage of a comprehensive representation. Discussion Selecting the appropriate representation has an immense impact on the predictive performance of machine learning models. Our proposed representation is powerful, as it spans multiple medical domains and can be used to predict a wide range of relation types. Conclusion The discovery of new relations between various medical entities can be translated into meaningful insights, for example, related to drug development or disease understanding. Our representation of medical entities can be used to train models that predict such relations, thus accelerating healthcare-related discoveries.
Affiliation(s)
- Adam Spiro
  - Machine Learning for Healthcare and Life Sciences, Department of Health Informatics, IBM Research, Haifa, Israel
- Jonatan Fernández García
  - Machine Learning for Healthcare and Life Sciences, Department of Health Informatics, IBM Research, Haifa, Israel
- Chen Yanover
  - Machine Learning for Healthcare and Life Sciences, Department of Health Informatics, IBM Research, Haifa, Israel
|
33
|
A Comparison Study of Algorithms to Detect Drug-Adverse Event Associations: Frequentist, Bayesian, and Machine-Learning Approaches. Drug Saf 2020; 42:743-750. [PMID: 30762164 DOI: 10.1007/s40264-018-00792-0] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Indexed: 10/27/2022]
Abstract
INTRODUCTION It is important to monitor the safety profile of drugs, and mining for strong associations between drugs and adverse events is an effective and inexpensive method of post-marketing safety surveillance. OBJECTIVE The objective of our work was to compare the accuracy of both common and innovative methods of data mining for pharmacovigilance purposes. METHODS We used the reference standard provided by the Observational Medical Outcomes Partnership, which contains 398 drug-adverse event pairs (165 positive controls, 233 negative controls). Ten methods and algorithms were applied to the US FDA Adverse Event Reporting System data to investigate the 398 pairs. The ten methods include popular methods in the pharmacovigilance literature, newly developed pharmacovigilance methods as at 2018, and popular methods in the genome-wide association study literature. We compared their performance using the receiver operating characteristic (ROC) plot, area under the curve (AUC), and Youden's index. RESULTS The Bayesian confidence propagation neural network had the highest AUC overall. Monte Carlo expectation maximization, a method developed in 2018, had the second highest AUC and the highest Youden's index, and performed very well in terms of high specificity. The regression-adjusted gamma Poisson shrinkage model performed best under high-sensitivity requirements. CONCLUSION Our results will be useful to help choose a method for a given desired level of specificity. Methods popular in the genome-wide association study literature did not perform well because of the sparsity of data and will need modification before their properties can be used in the drug-adverse event association problem.
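As a rough illustration of one of the comparison metrics named in this abstract, Youden's index at a given threshold is sensitivity plus specificity minus 1. The sketch below is not the authors' code; the function names, scores, and labels are hypothetical and chosen only to show the arithmetic.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when pairs with score >= threshold are flagged.

    Assumes labels contain at least one positive (1) and one negative (0).
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def best_youden(scores, labels):
    """Return (J, threshold) maximising J = sensitivity + specificity - 1."""
    candidates = []
    for t in set(scores):
        sens, spec = sensitivity_specificity(scores, labels, t)
        candidates.append((sens + spec - 1, t))
    return max(candidates)


# Hypothetical signal scores for four drug-event pairs (1 = positive control).
example_scores = [0.9, 0.8, 0.4, 0.3]
example_labels = [1, 1, 0, 1]
best_j, best_threshold = best_youden(example_scores, example_labels)
```

A method with a high AUC can still have a modest Youden's index if no single threshold balances sensitivity and specificity well, which is one reason the two metrics can rank methods differently.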
|
34
|
Mower J, Cohen T, Subramanian D. Complementing Observational Signals with Literature-Derived Distributed Representations for Post-Marketing Drug Surveillance. Drug Saf 2019; 43:67-77. [PMID: 31646442 DOI: 10.1007/s40264-019-00872-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/25/2022]
Abstract
INTRODUCTION As a result of the well documented limitations of data collected by spontaneous reporting systems (SRS), such as bias and under-reporting, a number of authors have evaluated the utility of other data sources for the purpose of pharmacovigilance, including the biomedical literature. Previous work has demonstrated the utility of literature-derived distributed representations (concept embeddings) with machine learning for the purpose of drug side-effect prediction. In terms of data sources, these methods are complementary, observing drug safety from two different perspectives (knowledge extracted from the literature and statistics from SRS data). However, the combined utility of these pharmacovigilance methods has yet to be evaluated. OBJECTIVE This research investigates the utility of directly or indirectly combining an observational signal from SRS with literature-derived distributed representations into a single feature vector or in an ensemble approach for downstream machine learning (logistic regression). METHODS Leveraging a recently developed representation scheme, concept embeddings were generated from relational connections extracted from the literature and composed to represent drug and associated adverse reactions, as defined by two reference standards of positive (likely causal) and negative (no causal evidence) pairs. Embeddings were presented with and without common measures of observational signal from SRS sources to logistic regressors, and performance was evaluated with the receiver operating characteristic (ROC) area under the curve (AUC) metric. RESULTS ROC AUC performance with these composite models improves up to ≈ 20% over SRS-based disproportionality metrics alone and exceeds the best prior results reported in the literature when models leverage both sources of information. 
CONCLUSIONS Results from this study support the hypothesis that knowledge extracted from the literature can enhance the performance of SRS-based methods (and vice versa). Across reference sets, using literature and SRS information together performed better than using either source alone, providing strong support for the complementary nature of these approaches to post-marketing drug surveillance.
Affiliation(s)
- Justin Mower
  - Department of Computer Science, Rice University, Houston, TX, 77018, USA
- Trevor Cohen
  - University of Washington, Biomedical Informatics and Medical Education, Seattle, WA, 98195, USA
- Devika Subramanian
  - Department of Computer Science, Rice University, Houston, TX, 77018, USA
|
35
|
Koutkias V. From Data Silos to Standardized, Linked, and FAIR Data for Pharmacovigilance: Current Advances and Challenges with Observational Healthcare Data. Drug Saf 2019; 42:583-586. [PMID: 30666591 DOI: 10.1007/s40264-018-00793-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/27/2022]
Affiliation(s)
- Vassilis Koutkias
  - Institute of Applied Biosciences, Centre for Research and Technology Hellas, 6th Km. Charilaou-Thermi Road, Thermi, P.O. Box 60631, 57001, Thessaloniki, Greece
|
36
|
Thilakaratne M, Falkner K, Atapattu T. A systematic review on literature-based discovery workflow. PeerJ Comput Sci 2019; 5:e235. [PMID: 33816888 PMCID: PMC7924697 DOI: 10.7717/peerj-cs.235] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 07/08/2019] [Accepted: 10/17/2019] [Indexed: 05/02/2023]
Abstract
As scientific publication rates increase, knowledge acquisition and the research development process have become more complex and time-consuming. Literature-Based Discovery (LBD), supporting automated knowledge discovery, helps facilitate this process by eliciting novel knowledge by analysing existing scientific literature. This systematic review provides a comprehensive overview of the LBD workflow by answering nine research questions related to the major components of the LBD workflow (i.e., input, process, output, and evaluation). With regards to the input component, we discuss the data types and data sources used in the literature. The process component presents filtering techniques, ranking/thresholding techniques, domains, generalisability levels, and resources. Subsequently, the output component focuses on the visualisation techniques used in LBD discipline. As for the evaluation component, we outline the evaluation techniques, their generalisability, and the quantitative measures used to validate results. To conclude, we summarise the findings of the review for each component by highlighting the possible future research directions.
Affiliation(s)
- Menasha Thilakaratne
  - Faculty of Engineering, Computer and Mathematical Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Katrina Falkner
  - Faculty of Engineering, Computer and Mathematical Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Thushari Atapattu
  - Faculty of Engineering, Computer and Mathematical Sciences, The University of Adelaide, Adelaide, South Australia, Australia
|
37
|
Ryan PB, Buse JB, Schuemie MJ, DeFalco F, Yuan Z, Stang PE, Berlin JA, Rosenthal N. Comparative effectiveness of canagliflozin, SGLT2 inhibitors and non-SGLT2 inhibitors on the risk of hospitalization for heart failure and amputation in patients with type 2 diabetes mellitus: A real-world meta-analysis of 4 observational databases (OBSERVE-4D). Diabetes Obes Metab 2018; 20:2585-2597. [PMID: 29938883 PMCID: PMC6220807 DOI: 10.1111/dom.13424] [Citation(s) in RCA: 133] [Impact Index Per Article: 19.0] [Received: 05/03/2018] [Revised: 06/01/2018] [Accepted: 06/12/2018] [Indexed: 12/16/2022]
Abstract
AIMS Sodium glucose co-transporter 2 inhibitors (SGLT2i) are indicated for treatment of type 2 diabetes mellitus (T2DM); some SGLT2i have reported cardiovascular benefit, and some have reported risk of below-knee lower extremity (BKLE) amputation. This study examined the real-world comparative effectiveness within the SGLT2i class and compared with non-SGLT2i antihyperglycaemic agents. MATERIALS AND METHODS Data from 4 large US administrative claims databases were used to characterize risk and provide population-level estimates of canagliflozin's effects on hospitalization for heart failure (HHF) and BKLE amputation vs other SGLT2i and non-SGLT2i in T2DM patients. Comparative analyses using a propensity score-adjusted new-user cohort design examined relative hazards of outcomes across all new users and a subpopulation with established cardiovascular disease. RESULTS Across the 4 databases (142 800 new users of canagliflozin, 110 897 new users of other SGLT2i, 460 885 new users of non-SGLT2i), the meta-analytic hazard ratio estimate for HHF with canagliflozin vs non-SGLT2i was 0.39 (95% CI, 0.26-0.60) in the on-treatment analysis. The estimate for BKLE amputation with canagliflozin vs non-SGLT2i was 0.75 (95% CI, 0.40-1.41) in the on-treatment analysis and 1.01 (95% CI, 0.93-1.10) in the intent-to-treat analysis. Effects in the subpopulation with established cardiovascular disease were similar for both outcomes. No consistent differences were observed between canagliflozin and other SGLT2i. CONCLUSIONS In this large comprehensive analysis, canagliflozin and other SGLT2i demonstrated HHF benefits consistent with clinical trial data, but showed no increased risk of BKLE amputation vs non-SGLT2i. HHF and BKLE amputation results were similar in the subpopulation with established cardiovascular disease. 
This study helps further characterize the potential benefits and harms of SGLT2i in routine clinical practice to complement evidence from clinical trials and prior observational studies.
Affiliation(s)
- John B. Buse
  - University of North Carolina School of Medicine, Department of Medicine, Chapel Hill, North Carolina
- Frank DeFalco
  - Janssen Research & Development, LLC, Raritan, New Jersey
- Zhong Yuan
  - Janssen Research & Development, LLC, Titusville, New Jersey
- Paul E. Stang
  - Janssen Research & Development, LLC, Titusville, New Jersey
|
38
|
Zhou X, Douglas IJ, Shen R, Bate A. Signal Detection for Recently Approved Products: Adapting and Evaluating Self-Controlled Case Series Method Using a US Claims and UK Electronic Medical Records Database. Drug Saf 2018; 41:523-536. [PMID: 29327136 DOI: 10.1007/s40264-017-0626-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 11/29/2022]
Abstract
INTRODUCTION The Self-Controlled Case Series (SCCS) method has been widely used for hypothesis testing, but there is limited evidence of its performance for safety signal detection. OBJECTIVE The objective of this study was to evaluate SCCS for signal detection on recently approved products. METHODS A retrospective study covered the period after three recently marketed drugs were launched through to 31 December 2010 using The Health Improvement Network, a UK primary care database, and Optum, a US claims database. The SCCS method was applied to examine five heterogeneous outcomes with desvenlafaxine and escitalopram and six outcomes with adalimumab for Signals of Disproportional Recording (SDRs); a positive finding was defined as the lower bound of the 95% confidence interval of the incidence rate ratio (IRR) estimate being > 1. Multiple design choices were tested and the trend in IRR estimates over calendar time for one drug-event pair was examined. RESULTS All six outcomes with adalimumab, three of five outcomes with desvenlafaxine, and four of five outcomes with escitalopram had SDRs. SCCS highlighted all acute events in the primary analysis but was less successful with slower-onset outcomes. Performance varied by risk period definition. Changes in IRR estimates over quarterly intervals for adalimumab with herpes zoster showed a markedly higher SDR within 9 months of drug launch. CONCLUSION SCCS shows promise for signal detection: it may highlight known associations for recently marketed products and has potential for early signal identification. SCCS performance varied by design choice and the nature of both exposure and event pair. Future work is needed to determine how effective the approach is in prospective testing and to determine the performance characteristics of the approach.
Affiliation(s)
- Xiaofeng Zhou
  - Epidemiology, Worldwide Safety and Regulatory, Pfizer Inc, 219 E. 42nd Street, Mail Stop 219/9/01, New York, NY, 10017, USA
- Ian J Douglas
  - London School of Hygiene & Tropical Medicine, London, UK
- Rongjun Shen
  - Epidemiology, Worldwide Safety and Regulatory, Pfizer Inc, 219 E. 42nd Street, Mail Stop 219/9/01, New York, NY, 10017, USA
- Andrew Bate
  - Epidemiology, Worldwide Safety and Regulatory, Pfizer Inc, 219 E. 42nd Street, Mail Stop 219/9/01, New York, NY, 10017, USA; Division of Clinical Pharmacology, NYU School of Medicine, New York, NY, USA
|
39
|
Mower J, Subramanian D, Cohen T. Learning predictive models of drug side-effect relationships from distributed representations of literature-derived semantic predications. J Am Med Inform Assoc 2018; 25:1339-1350. [PMID: 30010902 PMCID: PMC6454491 DOI: 10.1093/jamia/ocy077] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 01/24/2018] [Revised: 04/23/2018] [Accepted: 06/05/2018] [Indexed: 02/01/2023] Open
Abstract
Objective The aim of this work is to leverage relational information extracted from biomedical literature using a novel synthesis of unsupervised pretraining, representational composition, and supervised machine learning for drug safety monitoring. Methods Using ≈80 million concept-relationship-concept triples extracted from the literature using the SemRep Natural Language Processing system, distributed vector representations (embeddings) were generated for concepts as functions of their relationships utilizing two unsupervised representational approaches. Embeddings for drugs and side effects of interest from two widely used reference standards were then composed to generate embeddings of drug/side-effect pairs, which were used as input for supervised machine learning. This methodology was developed and evaluated using cross-validation strategies and compared to contemporary approaches. To qualitatively assess generalization, models trained on the Observational Medical Outcomes Partnership (OMOP) drug/side-effect reference set were evaluated against a list of ≈1100 drugs from an online database. Results The employed method improved performance over previous approaches. Cross-validation results advance the state of the art (AUC 0.96; F1 0.90 and AUC 0.95; F1 0.84 across the two sets), outperforming methods utilizing literature and/or spontaneous reporting system data. Examination of predictions for unseen drug/side-effect pairs indicates the ability of these methods to generalize, with over tenfold label support enrichment in the top 100 predictions versus the bottom 100 predictions. Discussion and Conclusion Our methods can assist the pharmacovigilance process using information from the biomedical literature. Unsupervised pretraining generates a rich relationship-based representational foundation for machine learning techniques to classify drugs in the context of a putative side effect, given known examples.
Affiliation(s)
- Justin Mower
  - Baylor College of Medicine, Quantitative and Computational Biosciences, Houston, Texas, USA
- Trevor Cohen
  - School of Biomedical Informatics, University of Texas Health Science Center Houston, Texas, USA
|
40
|
Gulmez SE, Unal US, Lassalle R, Chartier A, Grolleau A, Moore N. Risk of hospital admission for liver injury in users of NSAIDs and nonoverdose paracetamol: Preliminary results from the EPIHAM study. Pharmacoepidemiol Drug Saf 2018; 27:1174-1181. [PMID: 30112779 DOI: 10.1002/pds.4640] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 07/09/2017] [Revised: 07/12/2018] [Accepted: 07/17/2018] [Indexed: 12/12/2022]
Abstract
PURPOSE The SALT study found similar per-user risks of acute liver failure (ALF) leading to transplantation (ALFT) between NSAIDs and a threefold higher risk in nonoverdose paracetamol (NOP) users. The objective of EPIHAM was to identify the risks of hospital admission for acute liver injury (ALI) associated with NSAIDs and NOP. METHODS Case-population study in the 1/97 sample of the French population claims database. Acute liver injury was identified from hospital discharge summaries, from 2009 to 2013. Exposure for cases was dispensing of NSAID or NOP resulting in exposure within 30 days before admission. Population exposure was the number of patients using the drugs over the study timeframe and total number of DDD dispensed. RESULTS Of 63 cases of ALI, 13 had been exposed to NSAIDs and 24 to NOP. Events per million DDD (95% CI) ranged from 0.46 (0.09-1.34) (ketoprofen) to 1.43 (0.04-7.97) (diclofenac combinations), 0.43 (0.23-0.73) all NSAIDs combined, 0.58 (0.37-0.86) for NOP. There was no association with average duration of treatment. Per patient risk ranged from 19.5 (5.31-49.9) (ibuprofen) per million users to 37.2 (19.8-63.6) all NSAIDs combined, 58.0 (37.2-86.3) for NOP. There was a linear relationship between average treatment duration and per-user risk (R2 = 0.51, P < .05 for NSAIDs, R2 = 0.97, P < .01 for NOP). CONCLUSIONS Risk of hospital admission for ALI with NSAIDs and NOP was similar and indicative of a dose- and duration-related (pharmacological) effect. Acute liver injury rates were not predictive of ALFT risk.
Affiliation(s)
- Sinem Ezgi Gulmez
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
- Ulku Sur Unal
  - Tekirdağ Çerkezköy Tepe Emlak Family Medicine Centre, Cumhuriyet District Tepe Emlak Part 2, Çerkezköy-Tekirdağ, Turkey
- Régis Lassalle
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
- Anaïs Chartier
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
- Adeline Grolleau
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
- Nicholas Moore
  - Bordeaux PharmacoEpi, INSERM CIC1401, Université de Bordeaux, Bordeaux, France
|
41
|
Zhang P, Li M, Chiang C, Wang L, Xiang Y, Cheng L, Feng W, Schleyer TK, Quinney SK, Wu H, Zeng D, Li L. Three-Component Mixture Model-Based Adverse Drug Event Signal Detection for the Adverse Event Reporting System. CPT Pharmacometrics Syst Pharmacol 2018; 7:499-506. [PMID: 30091855 PMCID: PMC6118321 DOI: 10.1002/psp4.12294] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 12/06/2017] [Accepted: 02/26/2018] [Indexed: 01/24/2023] Open
Abstract
The US Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS) is an important source for detecting adverse drug event (ADE) signals. In this article, we propose a three-component mixture model (3CMM) for FAERS signal detection. In 3CMM, a drug-ADE pair is assumed to have either a zero relative risk (RR), or a background RR (mean RR = 1), or an increased RR (mean RR >1). By clearly defining the second component (mean RR = 1) as the null distribution, 3CMM estimates local false discovery rates (FDRs) for ADE signals under the empirical Bayes framework. Compared with existing approaches, the local FDR's top signals have noninferior or better sensitivities to detect true signals in both FAERS analysis and simulation studies. Additionally, we identify that the top signals of different approaches have different patterns, and they are complementary to each other.
Affiliation(s)
- Pengyue Zhang
  - Department of Biomedical Informatics, College of Medicine, the Ohio State University, Columbus, Ohio, USA
- Meng Li
  - Biomedical Engineering Institute, College of Automation, Harbin Engineering University, Harbin, Heilongjiang, China
  - CAS‐MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Shanghai, China
- Chien‐Wei Chiang
  - Department of Biomedical Informatics, College of Medicine, the Ohio State University, Columbus, Ohio, USA
- Lei Wang
  - Department of Biomedical Informatics, College of Medicine, the Ohio State University, Columbus, Ohio, USA
  - Biomedical Engineering Institute, College of Automation, Harbin Engineering University, Harbin, Heilongjiang, China
- Yang Xiang
  - Department of Biomedical Informatics, College of Medicine, the Ohio State University, Columbus, Ohio, USA
- Lijun Cheng
  - Department of Biomedical Informatics, College of Medicine, the Ohio State University, Columbus, Ohio, USA
- Weixing Feng
  - Biomedical Engineering Institute, College of Automation, Harbin Engineering University, Harbin, Heilongjiang, China
- Sara K. Quinney
  - Department of Obstetrics and Gynecology, Indiana University, Indianapolis, Indiana, USA
- Heng‐Yi Wu
  - Department of Biomedical Informatics, College of Medicine, the Ohio State University, Columbus, Ohio, USA
- Donglin Zeng
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Lang Li
  - Department of Biomedical Informatics, College of Medicine, the Ohio State University, Columbus, Ohio, USA
|
42
|
Trinh NTH, Solé E, Benkebil M. Benefits of combining change-point analysis with disproportionality analysis in pharmacovigilance signal detection. Pharmacoepidemiol Drug Saf 2018; 28:370-376. [PMID: 29992679 DOI: 10.1002/pds.4613] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/02/2017] [Revised: 01/08/2018] [Accepted: 06/04/2018] [Indexed: 11/08/2022]
Abstract
BACKGROUND Change-point analysis (CPA) is a powerful method to analyse pharmacovigilance data but it has never been used on the disproportionality metric. OBJECTIVES To optimize signal detection by investigating the value of time-series analysis in pharmacovigilance and the benefits of combining CPA with the proportional reporting ratio (PRR). METHODS We investigated the pair benfluorex and aortic valve incompetence (AVI) using the French National Pharmacovigilance and EudraVigilance databases: CPA was applied on monthly counts of reports and the lower bound of the monthly computed PRR (PRR-). We stated a CPA hypothesis that the substance-event combination is more likely to be a signal when the 2 following criteria are fulfilled: PRR- is greater than 1 with at least 5 cases, and the CPA method detects at least 2 successive change points of PRR- that form consecutively increasing segments. We tested this hypothesis on 95 test cases identified from a drug safety reference set and 2 validated signals from the EudraVigilance database: CPA was applied on PRR-. RESULTS For benfluorex and AVI, change points detected by CPA on PRR- were more meaningful compared with monthly counts of reports: more change points were detected, and they were detected earlier. In the reference set, 14 positive controls satisfied the CPA hypothesis, 6 positive controls met only the first requirement, 3 negative controls met only the first requirement, and 2 validated signals satisfied the CPA hypothesis. CONCLUSIONS The combination of CPA and PRR represents a significant advantage in detecting earlier signals and reducing false-positive signals. This approach should be confirmed in further studies.
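The first criterion in this abstract (PRR- greater than 1 with at least 5 cases) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the usual 2x2 contingency table and the common log-normal approximation for the 95% confidence interval, and the function names and counts are hypothetical. The change-point step is not shown.

```python
import math


def prr_with_lower_bound(a, b, c, d, z=1.96):
    """Proportional reporting ratio and the lower bound of its 95% CI.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports with other drugs and the event
    d: reports with other drugs, without the event
    Assumes a > 0 and c > 0 (log-normal approximation).
    """
    prr = (a / (a + b)) / (c / (c + d))
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(prr) - z * se_log)
    return prr, lower


def meets_first_criterion(a, b, c, d, min_cases=5):
    """True when there are at least min_cases cases and PRR- exceeds 1."""
    if a < min_cases:
        return False
    _, lower = prr_with_lower_bound(a, b, c, d)
    return lower > 1


# Hypothetical monthly counts, for illustration only.
prr_value, prr_minus = prr_with_lower_bound(20, 80, 100, 9900)
flagged = meets_first_criterion(20, 80, 100, 9900)
```

Computing this per month yields the PRR- time series on which a change-point method could then be run.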
Affiliation(s)
- Nhung T H Trinh
  - Inserm UMR 1153, Obstetrical, Perinatal and Pediatric Epidemiology Research Team, Research Center for Epidemiology and Biostatistics Sorbonne Paris Cité (CRESS), Paris Descartes University, Paris, France; Adverse Events and incidents Department-Surveillance Division, Agence nationale de sécurité du médicament et des produits de santé (ANSM), Saint Denis, France
- Elodie Solé
  - Adverse Events and incidents Department-Surveillance Division, Agence nationale de sécurité du médicament et des produits de santé (ANSM), Saint Denis, France
- Mehdi Benkebil
  - Adverse Events and incidents Department-Surveillance Division, Agence nationale de sécurité du médicament et des produits de santé (ANSM), Saint Denis, France
|
43
|
Channeling in the Use of Nonprescription Paracetamol and Ibuprofen in an Electronic Medical Records Database: Evidence and Implications. Drug Saf 2018; 40:1279-1292. [PMID: 28780741 PMCID: PMC5688206 DOI: 10.1007/s40264-017-0581-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 11/08/2022]
Abstract
Introduction Over-the-counter analgesics such as paracetamol and ibuprofen are among the most widely used, and having a good understanding of their safety profile is important to public health. Prior observational studies estimating the risks associated with paracetamol use acknowledge the inherent limitations of these studies. One threat to the validity of observational studies is channeling bias, i.e. the notion that patients are systematically exposed to one drug or the other, based on current and past comorbidities, in a manner that affects estimated relative risk. Objectives The aim of this study was to examine whether evidence of channeling bias exists in observational studies that compare paracetamol with ibuprofen, and, if so, the extent to which confounding adjustment can mitigate this bias. Study Design and Setting In a cohort of 140,770 patients, we examined whether those who received any paracetamol (including concomitant users) were more likely to have prior diagnoses of gastrointestinal (GI) bleeding, myocardial infarction (MI), stroke, or renal disease than those who received ibuprofen alone. We compared propensity score distributions between drugs, and examined the degree to which channeling bias could be controlled using a combination of negative control disease outcome models and large-scale propensity score matching. Analyses were conducted using the Clinical Practice Research Datalink. Results The proportions of prior MI, GI bleeding, renal disease, and stroke were significantly higher in those prescribed any paracetamol versus ibuprofen alone, after adjusting for sex and age. We were not able to adequately remove selection bias using a selected set of covariates for propensity score adjustment; however, when we fit the propensity score model using a substantially larger number of covariates, evidence of residual bias was attenuated. 
Conclusions Although using selected covariates for propensity score adjustment may not sufficiently reduce bias, large-scale propensity score matching offers a novel approach to consider to mitigate the effects of channeling bias.
|
44
|
Peng M, Sundararajan V, Williamson T, Minty EP, Smith TC, Doktorchik CTA, Quan H. Exploration of association rule mining for coding consistency and completeness assessment in inpatient administrative health data. J Biomed Inform 2018; 79:41-47. [PMID: 29425732 DOI: 10.1016/j.jbi.2018.02.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 10/16/2017] [Revised: 01/23/2018] [Accepted: 02/04/2018] [Indexed: 10/18/2022]
Abstract
OBJECTIVE Data quality assessment is a challenging facet for research using coded administrative health data. Current assessment approaches are time and resource intensive. We explored whether association rule mining (ARM) can be used to develop rules for assessing data quality. MATERIALS AND METHODS We extracted 2013 and 2014 records from the hospital discharge abstract database (DAD) for patients between the ages of 55 and 65 from five acute care hospitals in Alberta, Canada. The ARM was conducted using the 2013 DAD to extract rules with support ≥0.0019 and confidence ≥0.5 using the bootstrap technique, and tested in the 2014 DAD. The rules were compared against the method of coding frequency and assessed for their ability to detect error introduced by two kinds of data manipulation: random permutation and random deletion. RESULTS The association rules generally had clear clinical meanings. Comparing 2014 data to 2013 data (both original), there were 3 rules with a confidence difference >0.1, while coding frequency difference of codes in the right hand of rules was less than 0.004. After random permutation of 50% of codes in the 2014 data, average rule confidence dropped from 0.72 to 0.27 while coding frequency remained unchanged. Rule confidence decreased with the increase of coding deletion, as expected. Rule confidence was more sensitive to code deletion compared to coding frequency, with slope of change ranging from 1.7 to 184.9 with a median of 9.1. CONCLUSION The ARM is a promising technique to assess data quality. It offers a systematic way to derive coding association rules hidden in data, and potentially provides a sensitive and efficient method of assessing data quality compared to standard methods.
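The support and confidence thresholds quoted in this abstract follow the standard association-rule definitions: support is the fraction of records containing every code in the rule, and confidence is the support of the whole rule divided by the support of its left-hand side. The sketch below only illustrates those two quantities; the records and code labels are hypothetical placeholders, and the study's bootstrap procedure is not reproduced.

```python
def support(records, itemset):
    """Fraction of records that contain every code in itemset."""
    itemset = set(itemset)
    return sum(1 for r in records if itemset <= set(r)) / len(records)


def confidence(records, lhs, rhs):
    """Confidence of the rule lhs -> rhs; 0.0 if lhs never occurs."""
    lhs_support = support(records, lhs)
    if lhs_support == 0:
        return 0.0
    return support(records, set(lhs) | set(rhs)) / lhs_support


# Hypothetical discharge records, each a set of placeholder diagnosis codes.
example_records = [
    {"code_A", "code_B"},
    {"code_A", "code_B", "code_C"},
    {"code_A"},
    {"code_B"},
]
rule_support = support(example_records, {"code_A", "code_B"})
rule_confidence = confidence(example_records, {"code_A"}, {"code_B"})
```

A drop in a rule's confidence between two years of data, with coding frequency unchanged, is the kind of shift the abstract describes using as a data quality signal.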
Affiliation(s)
- Mingkai Peng
  - Department of Community Health Sciences, University of Calgary, Calgary, Canada
- Vijaya Sundararajan
  - Department of Medicine, St. Vincent's Hospital, University of Melbourne, Melbourne, Australia
- Tyler Williamson
  - Department of Community Health Sciences, University of Calgary, Calgary, Canada
- Evan P Minty
  - Cumming School of Medicine, University of Calgary, Calgary, Canada
- Tony C Smith
  - Department of Computer Science, University of Waikato, Hamilton, New Zealand
- Hude Quan
  - Department of Community Health Sciences, University of Calgary, Calgary, Canada
|
45
|
An MCEM Framework for Drug Safety Signal Detection and Combination from Heterogeneous Real World Evidence. Sci Rep 2018; 8:1806. [PMID: 29379048 PMCID: PMC5789130 DOI: 10.1038/s41598-018-19979-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 09/12/2017] [Accepted: 01/11/2018] [Indexed: 11/08/2022] Open
Abstract
Delayed drug safety insights can impact patients, pharmaceutical companies, and society as a whole. Post-market drug safety surveillance plays a critical role in providing drug safety insights, where real-world evidence such as spontaneous reporting systems (SRS) and a series of disproportionality analyses serve as a cornerstone of proactive and predictive drug safety surveillance. However, these approaches still face several challenges, including confounding by concomitant drugs, rare adverse drug reaction (ADR) detection, data bias, and under-reporting. In this paper, we develop a new framework that detects improved drug safety signals from multiple data sources via Monte Carlo Expectation-Maximization (MCEM) and signal combination. In the MCEM procedure, we propose a new sampling approach that generates more accurate SRS signals for each ADR by iteratively down-weighting its associations with irrelevant drugs in case reports. In the signal combination step, we adopt a Bayesian hierarchical model and propose a new summary statistic so that SRS signals can be combined with signals derived from other observational health data, allowing related signals to borrow statistical support with adjustment for data reliability. Combined, the two steps effectively alleviate the concomitant-drug confounding, data bias, rare ADR, and under-reporting issues. Experimental results demonstrated the effectiveness and usefulness of the proposed framework.
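For readers unfamiliar with the disproportionality analyses this abstract refers to, the sketch below computes one widely used measure, the proportional reporting ratio (PRR), from a 2x2 table of spontaneous reports. It illustrates the baseline technique only and is not the MCEM procedure proposed in the paper; the function name and the example counts are assumptions.

```python
def proportional_reporting_ratio(a: int, b: int, c: int, d: int) -> float:
    """PRR from a 2x2 table of spontaneous reports.

    a: reports with the drug of interest and the event of interest
    b: reports with the drug of interest and any other event
    c: reports with all other drugs and the event of interest
    d: reports with all other drugs and any other event
    """
    if a + b == 0 or c + d == 0:
        raise ValueError("each row of the table needs at least one report")
    if c == 0:
        raise ValueError("PRR is undefined when no comparator report has the event")
    rate_drug = a / (a + b)    # share of the drug's reports that mention the event
    rate_other = c / (c + d)   # share of all other reports that mention the event
    return rate_drug / rate_other


# Hypothetical counts, chosen only to show the arithmetic.
print(proportional_reporting_ratio(a=20, b=80, c=100, d=9800))
```

A PRR well above 1 means the event is reported disproportionately often with the drug, which is the kind of raw signal that the framework in this abstract aims to refine.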
|
46
|
Harpaz R, DuMouchel W, Schuemie M, Bodenreider O, Friedman C, Horvitz E, Ripple A, Sorbello A, White RW, Winnenburg R, Shah NH. Toward multimodal signal detection of adverse drug reactions. J Biomed Inform 2017; 76:41-49. [PMID: 29081385 PMCID: PMC8502488 DOI: 10.1016/j.jbi.2017.10.013] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 01/14/2017] [Revised: 10/14/2017] [Accepted: 10/24/2017] [Indexed: 11/27/2022]
Abstract
OBJECTIVE Improving mechanisms to detect adverse drug reactions (ADRs) is key to strengthening post-marketing drug safety surveillance. Signal detection is presently unimodal, relying on a single information source. Multimodal signal detection is based on jointly analyzing multiple information sources. Building on and expanding the work done in prior studies, the aim of the article is to further research on multimodal signal detection, explore its potential benefits, and propose methods for its construction and evaluation. MATERIAL AND METHODS Four data sources are investigated: FDA's adverse event reporting system, insurance claims, the MEDLINE citation database, and the logs of major Web search engines. Published methods are used to generate and combine signals from each data source. Two distinct reference benchmarks, corresponding to well-established and recently labeled ADRs respectively, are used to evaluate the performance of multimodal signal detection in terms of area under the ROC curve (AUC) and lead-time-to-detection, with the latter relative to labeling revision dates. RESULTS Limited to our reference benchmarks, multimodal signal detection provides AUC improvements ranging from 0.04 to 0.09 based on a widely used evaluation benchmark, and a comparative added lead-time of 7-22 months relative to labeling revision dates from a time-indexed benchmark. CONCLUSIONS The results support the notion that utilizing and jointly analyzing multiple data sources may lead to improved signal detection. Given certain data and benchmark limitations, the early stage of development, and the complexity of ADRs, it is currently not possible to make definitive statements about the ultimate utility of the concept. Continued development of multimodal signal detection requires a deeper understanding of the data sources used, additional benchmarks, and further research on methods to generate and synthesize signals.
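The AUC figures in this abstract summarize how well signal scores rank benchmark positives above benchmark negatives. A minimal sketch of that calculation follows, using the pairwise (Mann-Whitney) definition of AUC; the function name, the scores, and the labels are invented for illustration and are not taken from the study.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (positive, negative) pairs in which the positive
    drug-event pair receives the higher score; ties count as half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical signal scores and benchmark labels (1 = known ADR, 0 = control).
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
print(auc(toy_scores, toy_labels))
```

Under this definition, an AUC improvement of 0.04 to 0.09 means that a correspondingly larger share of positive-negative pairs is ranked in the correct order.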
Affiliation(s)
- Rave Harpaz
  - Oracle Health Sciences, Bedford, MA, United States
- Anna Ripple
  - National Library of Medicine, NIH, Bethesda, MD, United States
- Nigam H Shah
  - Stanford University, Stanford, CA, United States
|
47
|
Zhou X, Bao W, Gaffney M, Shen R, Young S, Bate A. Assessing performance of sequential analysis methods for active drug safety surveillance using observational data. J Biopharm Stat 2017; 28:668-681. [PMID: 29157113 DOI: 10.1080/10543406.2017.1372776] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 10/18/2022]
Abstract
The routine use of sequential methods is well established in clinical studies. Recently, there has been increasing interest in applying these methods to prospectively monitor the safety of newly approved drugs through accrual of real-world data. However, the application to marketed drugs using real-world data has been limited, and work is needed to determine which sequential approaches are most suited to such data. In this study, the conditional sequential sampling procedure (CSSP), a group sequential method, was compared with a log-linear model with Poisson distribution (LLMP), fitted through a SAS procedure (PROC GENMOD) and combined with an alpha-spending function, on two large longitudinal US administrative health claims databases. Relative performance in identifying known drug-outcome associations was examined using a set of 50 well-studied drug-outcome pairs. The study finds that neither method correctly identified all pairs, but that LLMP often identified the known drug-outcome associations more reliably and sooner than CSSP, with superior computational performance, albeit with more false positives. With the features of flexible confounding control and ease of implementation, LLMP may be a good alternative or complement to CSSP.
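The abstract does not say which alpha-spending function was combined with the log-linear model. As an illustration of the general idea, the sketch below implements two common Lan-DeMets spending functions (Pocock-type and O'Brien-Fleming-type); the function names, the default overall alpha of 0.05, and the look times are assumptions, not details from the study.

```python
import math
from statistics import NormalDist


def pocock_type_spending(t: float, alpha: float = 0.05) -> float:
    """Cumulative type I error spent at information fraction t (0 < t <= 1)."""
    if not 0 < t <= 1:
        raise ValueError("information fraction t must be in (0, 1]")
    return alpha * math.log(1 + (math.e - 1) * t)


def obrien_fleming_type_spending(t: float, alpha: float = 0.05) -> float:
    """Cumulative type I error spent at t; spends very little alpha early."""
    if not 0 < t <= 1:
        raise ValueError("information fraction t must be in (0, 1]")
    normal = NormalDist()
    z = normal.inv_cdf(1 - alpha / 2)
    return 2 * (1 - normal.cdf(z / math.sqrt(t)))


# Hypothetical schedule of four equally spaced looks at accruing data.
for look in (0.25, 0.5, 0.75, 1.0):
    print(look, pocock_type_spending(look), obrien_fleming_type_spending(look))
```

Both functions spend the full alpha by the final look (t = 1). The choice between them trades earlier detection against a higher chance of early false positives, which is the same trade-off the abstract reports between the two methods it compares.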
Affiliation(s)
- Xiaofeng Zhou
  - Epidemiology, Worldwide Safety and Regulatory, Pfizer Inc, New York, NY, USA
- Warren Bao
  - Epidemiology, Worldwide Safety and Regulatory, Pfizer Inc, New York, NY, USA
- Mike Gaffney
  - Epidemiology, Worldwide Safety and Regulatory, Pfizer Inc, New York, NY, USA
- Rongjun Shen
  - Epidemiology, Worldwide Safety and Regulatory, Pfizer Inc, New York, NY, USA
- Sarah Young
  - Epidemiology, Worldwide Safety and Regulatory, Pfizer Inc, New York, NY, USA
- Andrew Bate
  - Epidemiology, Worldwide Safety and Regulatory, Pfizer Inc, New York, NY, USA
|
48
|
Bezin J, Duong M, Lassalle R, Droz C, Pariente A, Blin P, Moore N. The national healthcare system claims databases in France, SNIIRAM and EGB: Powerful tools for pharmacoepidemiology. Pharmacoepidemiol Drug Saf 2017; 26:954-962. [PMID: 28544284 DOI: 10.1002/pds.4233] [Citation(s) in RCA: 400] [Impact Index Per Article: 50.0] [Received: 09/20/2016] [Revised: 03/30/2017] [Accepted: 04/23/2017] [Indexed: 12/11/2022]
Affiliation(s)
- Julien Bezin
  - Department of Medical Pharmacology, CHU de Bordeaux; Université de Bordeaux; 33076 Bordeaux France
  - INSERM U1219; 33076 Bordeaux France
- Mai Duong
  - INSERM U1219; 33076 Bordeaux France
  - Bordeaux PharmacoEpi; INSERM CIC1401; 33076 Bordeaux France
- Régis Lassalle
  - Bordeaux PharmacoEpi; INSERM CIC1401; 33076 Bordeaux France
- Cécile Droz
  - Bordeaux PharmacoEpi; INSERM CIC1401; 33076 Bordeaux France
- Antoine Pariente
  - Department of Medical Pharmacology, CHU de Bordeaux; Université de Bordeaux; 33076 Bordeaux France
  - INSERM U1219; 33076 Bordeaux France
- Patrick Blin
  - Bordeaux PharmacoEpi; INSERM CIC1401; 33076 Bordeaux France
- Nicholas Moore
  - Department of Medical Pharmacology, CHU de Bordeaux; Université de Bordeaux; 33076 Bordeaux France
  - INSERM U1219; 33076 Bordeaux France
  - Bordeaux PharmacoEpi; INSERM CIC1401; 33076 Bordeaux France
|
49
|
Arnaud M, Bégaud B, Thurin N, Moore N, Pariente A, Salvo F. Methods for safety signal detection in healthcare databases: a literature review. Expert Opin Drug Saf 2017; 16:721-732. [DOI: 10.1080/14740338.2017.1325463] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Indexed: 01/16/2023]
Affiliation(s)
- Mickael Arnaud
  - University of Bordeaux, Bordeaux, France
  - Bordeaux Population Health Research Centre, Pharmacoepidemiology team, INSERM UMR1219, Bordeaux, France
- Bernard Bégaud
  - University of Bordeaux, Bordeaux, France
  - Bordeaux Population Health Research Centre, Pharmacoepidemiology team, INSERM UMR1219, Bordeaux, France
  - CHU Bordeaux, Service de Pharmacologie Médicale, Bordeaux, France
- Nicolas Thurin
  - University of Bordeaux, Bordeaux, France
  - Bordeaux Population Health Research Centre, Pharmacoepidemiology team, INSERM UMR1219, Bordeaux, France
  - CIC Bordeaux
- Nicholas Moore
  - University of Bordeaux, Bordeaux, France
  - Bordeaux Population Health Research Centre, Pharmacoepidemiology team, INSERM UMR1219, Bordeaux, France
  - CHU Bordeaux, Service de Pharmacologie Médicale, Bordeaux, France
  - CIC Bordeaux
- Antoine Pariente
  - University of Bordeaux, Bordeaux, France
  - Bordeaux Population Health Research Centre, Pharmacoepidemiology team, INSERM UMR1219, Bordeaux, France
  - CHU Bordeaux, Service de Pharmacologie Médicale, Bordeaux, France
  - CIC Bordeaux
- Francesco Salvo
  - University of Bordeaux, Bordeaux, France
  - Bordeaux Population Health Research Centre, Pharmacoepidemiology team, INSERM UMR1219, Bordeaux, France
  - CHU Bordeaux, Service de Pharmacologie Médicale, Bordeaux, France
|
50
|
Sorbello A, Ripple A, Tonning J, Munoz M, Hasan R, Ly T, Francis H, Bodenreider O. Harnessing scientific literature reports for pharmacovigilance. Prototype software analytical tool development and usability testing. Appl Clin Inform 2017; 8:291-305. [PMID: 28326432 PMCID: PMC5373771 DOI: 10.4338/aci-2016-11-ra-0188] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/02/2016] [Accepted: 01/14/2017] [Indexed: 11/23/2022] Open
Abstract
Objectives We seek to develop a prototype software analytical tool to augment FDA regulatory reviewers’ capacity to harness scientific literature reports in PubMed/MEDLINE for pharmacovigilance and adverse drug event (ADE) safety signal detection. We also aim to gather feedback through usability testing to assess design, performance, and user satisfaction with the tool. Methods A prototype, open source, web-based, software analytical tool generated statistical disproportionality data mining signal scores and dynamic visual analytics for ADE safety signal detection and management. We leveraged Medical Subject Heading (MeSH) indexing terms assigned to published citations in PubMed/MEDLINE to generate candidate drug-adverse event pairs for quantitative data mining. Six FDA regulatory reviewers participated in usability testing by employing the tool as part of their ongoing real-life pharmacovigilance activities to provide subjective feedback on its practical impact, added value, and fitness for use. Results All usability test participants cited the tool’s ease of learning, ease of use, and generation of quantitative ADE safety signals, some of which corresponded to known established adverse drug reactions. Potential concerns included the comparability of the tool’s automated literature search relative to a manual ‘all fields’ PubMed search, missing drugs and adverse event terms, interpretation of signal scores, and integration with existing computer-based analytical tools. Conclusions Usability testing demonstrated that this novel tool can automate the detection of ADE safety signals from published literature reports. Various mitigation strategies are described to foster improvements in design, productivity, and end user satisfaction.
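The step of turning MeSH indexing terms into candidate drug-adverse event pairs can be pictured as a co-occurrence count over citations. The sketch below is one way to do that, not the tool's actual implementation; the record layout (a dict with `drugs` and `events` sets), the function name, and the placeholder terms are all assumptions.

```python
from collections import Counter
from itertools import product
from typing import Dict, Iterable, Set, Tuple


def count_candidate_pairs(
    citations: Iterable[Dict[str, Set[str]]],
) -> Counter:
    """Count how many citations mention each (drug, event) combination."""
    counts: Counter = Counter()
    for citation in citations:
        # Every drug term in a citation is paired with every event term in it.
        for pair in product(sorted(citation["drugs"]), sorted(citation["events"])):
            counts[pair] += 1
    return counts


# Hypothetical citations, already reduced to drug and adverse-event terms.
toy_citations = [
    {"drugs": {"drug_X"}, "events": {"event_Y", "event_Z"}},
    {"drugs": {"drug_X", "drug_W"}, "events": {"event_Y"}},
]

pair_counts = count_candidate_pairs(toy_citations)
for pair, n in sorted(pair_counts.items()):
    print(pair, n)
```

Counts of this kind are the inputs that a disproportionality score needs: for each pair, the number of citations mentioning both terms, compared against the totals for the drug and for the event.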
Affiliation(s)
- Alfred Sorbello
  - Alfred Sorbello, DO, MPH, US Food and Drug Administration, Center for Drug Evaluation and Research, Office of Translational Sciences, 10903 New Hampshire Avenue, Silver Spring, MD 20993-0002 USA
|