1
|
Ball R, Dal Pan G. "Artificial Intelligence" for Pharmacovigilance: Ready for Prime Time? Drug Saf 2022; 45:429-438. [PMID: 35579808 PMCID: PMC9112277 DOI: 10.1007/s40264-022-01157-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Accepted: 02/10/2022] [Indexed: 01/28/2023]
Abstract
There is great interest in the application of 'artificial intelligence' (AI) to pharmacovigilance (PV). Although US FDA is broadly exploring the use of AI for PV, we focus on the application of AI to the processing and evaluation of Individual Case Safety Reports (ICSRs) submitted to the FDA Adverse Event Reporting System (FAERS). We describe a general framework for considering the readiness of AI for PV, followed by some examples of the application of AI to ICSR processing and evaluation in industry and FDA. We conclude that AI can usefully be applied to some aspects of ICSR processing and evaluation, but the performance of current AI algorithms requires a 'human-in-the-loop' to ensure good quality. We identify outstanding scientific and policy issues to be addressed before the full potential of AI can be exploited for ICSR processing and evaluation, including approaches to quality assurance of 'human-in-the-loop' AI systems, large-scale, publicly available training datasets, a well-defined and computable 'cognitive framework', a formal sociotechnical framework for applying AI to PV, and development of best practices for applying AI to PV. Practical experience with stepwise implementation of AI for ICSR processing and evaluation will likely provide important lessons that will inform the necessary policy and regulatory framework to facilitate widespread adoption and provide a foundation for further development of AI approaches to other aspects of PV.
Affiliation(s)
- Robert Ball
  - US Food and Drug Administration, Center for Drug Evaluation and Research, Office of Surveillance and Epidemiology, Silver Spring, MD, USA (grid.483500.a; 0000 0001 2154 2448)
- Gerald Dal Pan
  - US Food and Drug Administration, Center for Drug Evaluation and Research, Office of Surveillance and Epidemiology, Silver Spring, MD, USA (grid.483500.a; 0000 0001 2154 2448)
|
2
|
Du J, Xiang Y, Sankaranarayanapillai M, Zhang M, Wang J, Si Y, Pham HA, Xu H, Chen Y, Tao C. Extracting postmarketing adverse events from safety reports in the vaccine adverse event reporting system (VAERS) using deep learning. J Am Med Inform Assoc 2021; 28:1393-1400. [PMID: 33647938 DOI: 10.1093/jamia/ocab014] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 08/26/2020] [Revised: 01/14/2021] [Accepted: 01/20/2021] [Indexed: 11/14/2022]
Abstract
OBJECTIVE Automated analysis of vaccine postmarketing surveillance narrative reports is important to understand the progression of rare but severe vaccine adverse events (AEs). This study implemented and evaluated state-of-the-art deep learning algorithms for named entity recognition to extract nervous system disorder-related events from vaccine safety reports. MATERIALS AND METHODS We collected Guillain-Barré syndrome (GBS) related influenza vaccine safety reports from the Vaccine Adverse Event Reporting System (VAERS) from 1990 to 2016. VAERS reports were selected and manually annotated with major entities related to nervous system disorders, including investigation, nervous_AE, other_AE, procedure, social_circumstance, and temporal_expression. A variety of conventional machine learning and deep learning algorithms were then evaluated for the extraction of the above entities. We further pretrained domain-specific BERT (Bidirectional Encoder Representations from Transformers) using VAERS reports (VAERS BERT) and compared its performance with existing models. RESULTS AND CONCLUSIONS Ninety-one VAERS reports were annotated, resulting in 2512 entities. The corpus was made publicly available to promote community efforts on vaccine AE identification. Deep learning-based methods (eg, bi-long short-term memory and BERT models) outperformed conventional machine learning-based methods (ie, conditional random fields with extensive features). The BioBERT large model achieved the highest exact match F-1 scores on nervous_AE, procedure, social_circumstance, and temporal_expression, while VAERS BERT large models achieved the highest exact match F-1 scores on investigation and other_AE. An ensemble of these 2 models achieved the highest exact match microaveraged F-1 score at 0.6802 and the second highest lenient match microaveraged F-1 score at 0.8078 among peer models.
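The distinction between exact match and lenient match microaveraged F-1 scores in the abstract above can be illustrated with a minimal sketch. It assumes that entities are represented as `(start, end, label)` character spans and that "lenient" means any overlap between spans of the same label; the abstract does not define lenient matching, so that rule, the `Span` type, and the function names are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (start offset, end offset exclusive, entity label)
Span = Tuple[int, int, str]


def _overlaps(a: Span, b: Span) -> bool:
    """True when two spans share a label and at least one character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def micro_f1(gold_docs: List[List[Span]],
             pred_docs: List[List[Span]],
             lenient: bool = False) -> float:
    """Microaveraged F1: counts are pooled over all documents and labels."""
    matched_pred = matched_gold = n_pred = n_gold = 0
    for gold, pred in zip(gold_docs, pred_docs):
        n_gold += len(gold)
        n_pred += len(pred)
        if lenient:
            # Assumed rule: a span counts as matched if it overlaps any
            # same-label span on the other side.
            matched_pred += sum(any(_overlaps(p, g) for g in gold) for p in pred)
            matched_gold += sum(any(_overlaps(g, p) for p in pred) for g in gold)
        else:
            # Exact match: boundaries and label must be identical.
            hits = len(set(gold) & set(pred))
            matched_pred += hits
            matched_gold += hits
    precision = matched_pred / n_pred if n_pred else 0.0
    recall = matched_gold / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up example: the second prediction has the right label but a shifted start.
gold = [[(0, 5, "nervous_AE"), (10, 20, "procedure")]]
pred = [[(0, 5, "nervous_AE"), (12, 20, "procedure")]]
print(micro_f1(gold, pred), micro_f1(gold, pred, lenient=True))
```

Under these assumptions a boundary error lowers the exact match score but not the lenient one, which is consistent with the lenient score in the abstract being higher than the exact one.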
Affiliation(s)
- Jingcheng Du
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Yang Xiang
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Meng Zhang
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Jingqi Wang
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Yuqi Si
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Huy Anh Pham
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Hua Xu
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Yong Chen
  - Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Cui Tao
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
|
3
|
Spiker J, Kreimeyer K, Dang O, Boxwell D, Chan V, Cheng C, Gish P, Lardieri A, Wu E, De S, Naidoo J, Lehmann H, Rosner GL, Ball R, Botsis T. Information Visualization Platform for Postmarket Surveillance Decision Support. Drug Saf 2020; 43:905-915. [DOI: 10.1007/s40264-020-00945-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/16/2022]
|
4
|
Cook MJ, Yao L, Wang X. Facilitating accurate health provider directories using natural language processing. BMC Med Inform Decis Mak 2019; 19:80. [PMID: 30943977 PMCID: PMC6448184 DOI: 10.1186/s12911-019-0788-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
Abstract
BACKGROUND Accurate information in provider directories is vital in health care, including health information exchange, health benefits exchange, quality reporting, and the reimbursement and delivery of care. Maintaining provider directory data and keeping it up to date is challenging. The objective of this study is to determine the feasibility of using natural language processing (NLP) techniques to combine disparate resources and acquire accurate information on health providers. METHODS Publicly available state licensure lists in Connecticut were obtained along with National Plan and Provider Enumeration System (NPPES) public use files. Connecticut licensure lists textual information of each health professional who is licensed to practice within the state. An NLP-based system was developed based on healthcare provider taxonomy code, location, name and address information to identify textual data within the state and federal records. Qualitative and quantitative evaluations were performed, and the recall and precision were calculated. RESULTS We identified nurse midwives, nurse practitioners, and dentists in the State of Connecticut. The recall and precision were 0.95 and 0.93, respectively. Using the system, we were able to accurately acquire 6849 of the 7177 records of health provider directory information. CONCLUSIONS The authors demonstrated that the NLP-based approach was effective at acquiring health provider information. Furthermore, the NLP-based system can always be applied to update information, further reducing processing burdens as data changes.
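The evaluation described in the abstract above (linking state and federal records, then reporting recall and precision) can be sketched roughly as follows. This is not the authors' system: the normalization rule, the `(id, name, city)` record layout, the function names, and the sample rows are all hypothetical and serve only to show how the two metrics are computed from a set of proposed links and a gold standard.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    return " ".join(text.split())


def link_records(state_rows, federal_rows):
    """Return (state_id, federal_id) pairs whose normalized name and city agree."""
    index = {}
    for fid, name, city in federal_rows:
        index.setdefault((normalize(name), normalize(city)), []).append(fid)
    links = set()
    for sid, name, city in state_rows:
        for fid in index.get((normalize(name), normalize(city)), []):
            links.add((sid, fid))
    return links


def precision_recall(predicted: set, gold: set):
    """Precision = correct links / proposed links; recall = correct links / true links."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Made-up rows for illustration only.
state = [("S1", "Jane Q. Doe", "Hartford"),
         ("S2", "John Roe", "New Haven"),
         ("S3", "Ann Poe", "Storrs")]
federal = [("F1", "JANE Q DOE", "hartford"),
           ("F2", "John Roe", "New  Haven"),
           ("F3", "Anne Poe", "Storrs")]  # spelling variant is missed by exact keys
gold = {("S1", "F1"), ("S2", "F2"), ("S3", "F3")}

print(precision_recall(link_records(state, federal), gold))
```

For reference, 6849 of 7177 is about 0.954, close to the reported recall of 0.95, although the abstract does not say which metric that ratio corresponds to.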
Affiliation(s)
- Matthew J. Cook
  - Center for Quantitative Medicine, University of Connecticut Health Center, Farmington, CT 06030 USA
  - Office of the Vice President for Research, University of Connecticut, Storrs, CT 06269 USA
  - Department of Community Medicine and Health Care, University of Connecticut Health Center, Farmington, CT 06030 USA
- Lixia Yao
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905 USA
- Xiaoyan Wang
  - Center for Quantitative Medicine, University of Connecticut Health Center, Farmington, CT 06030 USA
  - Department of Community Medicine and Health Care, University of Connecticut Health Center, Farmington, CT 06030 USA
  - Department of Family Medicine, University of Connecticut Health Center, Farmington, CT 06030 USA
|
5
|
Ball R, Toh S, Nolan J, Haynes K, Forshee R, Botsis T. Evaluating automated approaches to anaphylaxis case classification using unstructured data from the FDA Sentinel System. Pharmacoepidemiol Drug Saf 2018; 27:1077-1084. [DOI: 10.1002/pds.4645] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 12/31/2017] [Revised: 07/03/2018] [Accepted: 08/01/2018] [Indexed: 11/08/2022]
Affiliation(s)
- Robert Ball
  - Office of Surveillance and Epidemiology; Center for Drug Evaluation and Research, FDA; Silver Spring MD USA
- Sengwee Toh
  - Department of Population Medicine; Harvard Medical School and Harvard Pilgrim Health Care Institute; Boston MA USA
- Jamie Nolan
  - Department of Population Medicine; Harvard Medical School and Harvard Pilgrim Health Care Institute; Boston MA USA
- Kevin Haynes
  - Translational Research for Affordability and Quality; HealthCore, Inc.; Wilmington DE USA
- Richard Forshee
  - Office of Biostatistics and Epidemiology; Center for Biologics Evaluation and Research, FDA; Silver Spring MD USA
- Taxiarchis Botsis
  - Office of Biostatistics and Epidemiology; Center for Biologics Evaluation and Research, FDA; Silver Spring MD USA
|
6
|
Han L, Ball R, Pamer CA, Altman RB, Proestel S. Development of an automated assessment tool for MedWatch reports in the FDA adverse event reporting system. J Am Med Inform Assoc 2018; 24:913-920. [PMID: 28371826 DOI: 10.1093/jamia/ocx022] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 09/01/2016] [Accepted: 02/24/2017] [Indexed: 11/13/2022]
Abstract
Objective As the US Food and Drug Administration (FDA) receives over a million adverse event reports associated with medication use every year, a system is needed to aid FDA safety evaluators in identifying reports most likely to demonstrate causal relationships to the suspect medications. We combined text mining with machine learning to construct and evaluate such a system to identify medication-related adverse event reports. Methods FDA safety evaluators assessed 326 reports for medication-related causality. We engineered features from these reports and constructed random forest, L1 regularized logistic regression, and support vector machine models. We evaluated model accuracy and further assessed utility by generating report rankings that represented a prioritized report review process. Results Our random forest model showed the best performance in report ranking and accuracy, with an area under the receiver operating characteristic curve of 0.66. The generated report ordering assigns reports with a higher probability of medication-related causality a higher rank and is significantly correlated to a perfect report ordering, with a Kendall's tau of 0.24 (P = .002). Conclusion Our models produced prioritized report orderings that enable FDA safety evaluators to focus on reports that are more likely to contain valuable medication-related adverse event information. Applying our models to all FDA adverse event reports has the potential to streamline the manual review process and greatly reduce reviewer workload.
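The comparison of a model-generated report ordering with a perfect ordering via Kendall's tau, as reported in the abstract above, can be sketched as below. The abstract does not say which tau variant was used; this sketch assumes tau-b, which adjusts for ties (relevant when the reference ordering comes from a small number of causality categories). The function name and the sample scores and labels are hypothetical.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equally long sequences of numbers.

    Pairs tied in both sequences are ignored; pairs tied in only one
    sequence enlarge the denominator, which pulls tau toward zero.
    """
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x_only) * (untied + ties_y_only))
    return (concordant - discordant) / denom if denom else 0.0


# Made-up data: model scores for four reports and evaluator labels
# (1 = judged medication-related, 0 = not).
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 1, 0, 0]
print(kendall_tau_b(scores, labels))
```

A value near 1 means the model ranks reports in nearly the same order as the reference; a value near 0, such as the 0.24 in the abstract, indicates only a weak positive association.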
Affiliation(s)
- Lichy Han
  - Biomedical Informatics Training Program, Stanford University, Stanford, CA, USA
- Robert Ball
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Carol A Pamer
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Russ B Altman
  - Department of Genetics, Stanford University
  - Department of Bioengineering, Stanford University
- Scott Proestel
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
|
7
|
Botsis T, Foster M, Kreimeyer K, Pandey A, Forshee R. Monitoring biomedical literature for post-market safety purposes by analyzing networks of text-based coded information. AMIA Jt Summits Transl Sci Proc 2017; 2017:66-75. [PMID: 28815108 PMCID: PMC5543357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
Literature review is critical but time-consuming in the post-market surveillance of medical products. We focused on the safety signal of intussusception after the vaccination of infants with the Rotashield Vaccine in 1999 and retrieved all PubMed abstracts for rotavirus vaccines published after January 1, 1998. We used the Event-based Text-mining of Health Electronic Records system, the MetaMap tool, and the National Center for Biomedical Ontologies Annotator to process the abstracts and generate coded terms stamped with the date of publication. Data were analyzed in the Pattern-based and Advanced Network Analyzer for Clinical Evaluation and Assessment to evaluate the intussusception-related findings before and after the release of the new rotavirus vaccines in 2006. The tight connection of intussusception with the historical signal in the first period and the absence of any safety concern for the new vaccines in the second period were verified. We demonstrated the feasibility of semi-automated solutions that may assist medical reviewers in monitoring biomedical literature.
Affiliation(s)
- Taxiarchis Botsis
  - Office of Biostatistics and Epidemiology, Center for Biologics Evaluation & Research, US Food and Drug Administration, Silver Spring, MD
- Matthew Foster
  - Office of Biostatistics and Epidemiology, Center for Biologics Evaluation & Research, US Food and Drug Administration, Silver Spring, MD
- Kory Kreimeyer
  - Office of Biostatistics and Epidemiology, Center for Biologics Evaluation & Research, US Food and Drug Administration, Silver Spring, MD
- Abhishek Pandey
  - Office of Biostatistics and Epidemiology, Center for Biologics Evaluation & Research, US Food and Drug Administration, Silver Spring, MD
- Richard Forshee
  - Office of Biostatistics and Epidemiology, Center for Biologics Evaluation & Research, US Food and Drug Administration, Silver Spring, MD
|
8
|
Decision support environment for medical product safety surveillance. J Biomed Inform 2016; 64:354-362. [DOI: 10.1016/j.jbi.2016.07.023] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 02/04/2016] [Revised: 07/22/2016] [Accepted: 07/27/2016] [Indexed: 02/04/2023]
|
9
|
Baer B, Nguyen M, Woo EJ, Winiecki S, Scott J, Martin D, Botsis T, Ball R. Can Natural Language Processing Improve the Efficiency of Vaccine Adverse Event Report Review? Methods Inf Med 2015; 55:144-50. [PMID: 26394725 DOI: 10.3414/me14-01-0066] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 06/27/2014] [Accepted: 06/30/2015] [Indexed: 11/09/2022]
Abstract
BACKGROUND Individual case review of spontaneous adverse event (AE) reports remains a cornerstone of medical product safety surveillance for industry and regulators. Previously we developed the Vaccine Adverse Event Text Miner (VaeTM) to offer automated information extraction and potentially accelerate the evaluation of large volumes of unstructured data and facilitate signal detection. OBJECTIVE To assess how the information extraction performed by VaeTM impacts the accuracy of a medical expert's review of the vaccine adverse event report. METHODS The "outcome of interest" (diagnosis, cause of death, second level diagnosis), "onset time," and "alternative explanations" (drug, medical and family history) for the adverse event were extracted from 1000 reports from the Vaccine Adverse Event Reporting System (VAERS) using the VaeTM system. We compared the human interpretation, by medical experts, of the VaeTM extracted data with their interpretation of the traditional full text reports for these three variables. Two experienced clinicians alternately reviewed text miner output and full text. A third clinician scored the match rate using a predefined algorithm; the proportion of matches and 95% confidence intervals (CI) were calculated. Review time per report was analyzed. RESULTS Proportion of matches between the interpretation of the VaeTM extracted data, compared to the interpretation of the full text: 93% for outcome of interest (95% CI: 91-94%) and 78% for alternative explanation (95% CI: 75-81%). Extracted data on the time to onset was used in 14% of cases and was a match in 54% (95% CI: 46-63%) of those cases. When supported by structured time data from reports, the match for time to onset was 79% (95% CI: 76-81%). The extracted text averaged 136 (74%) fewer words, resulting in a mean reduction in review time of 50 (58%) seconds per report. 
CONCLUSION Despite a 74% reduction in words, the clinical conclusion from VaeTM extracted data agreed with the full text in 93% and 78% of reports for the outcome of interest and alternative explanation, respectively. The limited amount of extracted time interval data indicates the need for further development of this feature. VaeTM may improve review efficiency, but further study is needed to determine if this level of agreement is sufficient for routine use.
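The match proportions with 95% confidence intervals reported in the abstract above can be reproduced in spirit with a short sketch. The abstract gives percentages rather than counts and does not name the interval method, so both the Wilson score interval used here and the example count of 930 matches out of 1000 reports are assumptions for illustration.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion.

    z = 1.96 gives an approximate 95% interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Assumed count: 930 matches in 1000 reports (the abstract reports 93%).
low, high = wilson_interval(930, 1000)
print(f"{low:.3f} to {high:.3f}")
```

The Wilson interval is chosen here because it behaves better than the simple normal approximation when the proportion is close to 0 or 1; other methods would give slightly different bounds.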
Affiliation(s)
- B Baer
  - Bethany Baer, FDA Center for Biologics Evaluation and Research, 10903 New Hampshire Ave, WO71-1323, Silver Spring, MD 20993-0002, 240-402-8584, USA, E-mail:
|
10
|
Lehmann CU, Haux R. From bench to bed: bridging from informatics theory to practice. An exploratory analysis. Methods Inf Med 2014; 53:511-5. [PMID: 25377761 DOI: 10.3414/me14-01-0098] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Accepted: 10/21/2014] [Indexed: 11/09/2022]
Abstract
BACKGROUND In 2009, the journal Applied Clinical Informatics (ACI) commenced publication. Focused on applications in clinical informatics, ACI was intended to be a companion journal to Methods of Information in Medicine (MIM). Both journals are official journals of IMIA, the International Medical Informatics Association. OBJECTIVES To explore, after five years, which congruencies and interdependencies exist in publications of these journals and to determine if gaps exist. To achieve this goal, major topics discussed in ACI and in MIM had to be analysed. Finally, we wanted to explore whether the intention of publishing these companion journals to provide an information bridge from informatics theory to informatics practice and from practice to theory could be supported by this model. In this manuscript we will report on congruencies and interdependencies from practice to theory and on major topics in ACI. Further results will be reported in a second paper. METHODS Retrospective, prolective observational study on recent publications of ACI and MIM. All publications of the years 2012 and 2013 from these journals were indexed and analysed. RESULTS One hundred and ninety-six publications have been analysed (87 ACI, 109 MIM). In ACI, publications addressed care coordination, shared decision support, and provider communication in its importance for complex patient care and safety and quality. Other major themes included improving clinical documentation quality and efficiency, effectiveness of clinical decision support and alerts, and implementation of health information technology systems, including discussion of failures and successes. An emerging topic in the years analyzed was a focus on health information technology to predict and prevent hospital admissions and managing population health, including the application of mobile health technology. Congruencies between journals could be found in themes, but with different focus in their contents. Interdependencies from practice to theory found in these publications were only limited. CONCLUSIONS Bridging from informatics theory to practice and vice versa remains a major component of successful research and practice as well as a major challenge.
Affiliation(s)
- C U Lehmann
  - Prof. Dr. Christoph U. Lehmann, Pediatrics and Biomedical Informatics, Vanderbilt University, 2200 Children's Way, 11111 Doctors' Office Tower, Nashville, TN 37232-9544, USA, E-mail:
|
11
|
Botsis T, Scott J, Woo EJ, Ball R. Identifying Similar Cases in Document Networks Using Cross-Reference Structures. IEEE J Biomed Health Inform 2014; 19:1906-17. [PMID: 25122604 DOI: 10.1109/jbhi.2014.2345873] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 11/07/2022]
Abstract
Our objective was to explore the creation of document networks based on different thresholds of shared information and different clustering algorithms on those networks to identify document clusters describing similar clinical cases. We created networks from vaccine adverse event report sets using seven approaches for linking reports. We then applied three clustering algorithms [visualization of similarities (VOS), Louvain, k-means] to these networks and evaluated their ability to identify known clusters. The report sets included one simulated set and three sets from the Vaccine Adverse Event Reporting System; each was split into training and testing subsets. Training subsets were used to estimate parameter values for the clustering algorithms and testing subsets to evaluate clusters. We created the networks by linking reports based on shared information in the form either of individual Medical Dictionary for Regulatory Activities Preferred Terms (PTs) or of dyads, triplets, quadruplets, quintuplets, and sextuplets of PTs; we created another network by weighting the single PT network connections by Lin's information theoretic approach to similarity. We then repeated this entire process using networks based on text mining output rather than structured data. We evaluated report clustering using recall, precision, and f-measure. The VOS algorithm outperformed Louvain and k-means in general. The best weighting scheme appeared to be related to the complexity of the known cluster. For example, singleton weighting performed best for an intussusception cluster driven by a single PT. We observed marginal differences between the code- and textual-based clustering. In conclusion, our approach supported identification of similar nodes in a document network.
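The general idea in the abstract above (link reports that share coded terms, group the resulting network, then score a group against a known cluster with recall, precision, and f-measure) can be sketched as follows. This is a simplified stand-in, not the authors' method: connected components replace the VOS, Louvain, and k-means algorithms named in the abstract, the `min_shared` threshold only loosely mirrors the singleton/dyad/triplet linking schemes, and all report IDs, term strings, and function names are hypothetical.

```python
from itertools import combinations


def build_network(reports: dict, min_shared: int = 1) -> dict:
    """Link two reports when they share at least `min_shared` coded terms."""
    edges = {rid: set() for rid in reports}
    for a, b in combinations(reports, 2):
        if len(reports[a] & reports[b]) >= min_shared:
            edges[a].add(b)
            edges[b].add(a)
    return edges


def connected_components(edges: dict) -> list:
    """Group reports reachable from one another (simple stand-in for clustering)."""
    seen, components = set(), []
    for start in edges:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(edges[node] - component)
        seen |= component
        components.append(component)
    return components


def cluster_scores(found: set, known: set):
    """Precision, recall, and f-measure of a found cluster against a known one."""
    overlap = len(found & known)
    precision = overlap / len(found) if found else 0.0
    recall = overlap / len(known) if known else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f


# Made-up reports mapped to illustrative coded terms.
reports = {
    "R1": {"Intussusception", "Vomiting"},
    "R2": {"Intussusception", "Haematochezia"},
    "R3": {"Intussusception"},
    "R4": {"Pyrexia", "Rash"},
    "R5": {"Pyrexia"},
    "R6": {"Abdominal pain"},
}
known_cluster = {"R1", "R2", "R3", "R6"}
clusters = connected_components(build_network(reports, min_shared=1))
best = max(clusters, key=lambda c: len(c & known_cluster))
print(sorted(best), cluster_scores(best, known_cluster))
```

Raising `min_shared` makes links stricter, which tends to trade recall for precision; the abstract's observation that the best scheme depends on the complexity of the known cluster fits that intuition.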
|