1
Padhi B, Liu R, Yang Y, Peng X, Li L, Zhang P, Zhang P. Using multiple drug similarity networks to promote adverse drug event detection. Heliyon 2024; 10:e39728. [PMID: 39748955 PMCID: PMC11693886 DOI: 10.1016/j.heliyon.2024.e39728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2024] [Accepted: 10/22/2024] [Indexed: 01/04/2025]
Abstract
The occurrence of adverse drug events (ADEs) has become a serious public health concern. Early detection of ADEs can reduce both drug safety risks and drug-related costs. While post-market spontaneous reports of ADEs remain a cornerstone of pharmacovigilance, most existing signal detection algorithms rely on substantial accumulated data, limiting their applicability to early ADE detection when reports are scarce. To address this issue, we propose a label propagation model for generating enhanced drug safety signals using multiple drug features. We first construct multiple drug similarity networks using a range of drug features. We then calculate initial drug safety signals using conventional signal detection algorithms. These original signals are subsequently propagated across each drug similarity network to obtain enhanced drug safety signals. We evaluate our proposed model using two common signal detection algorithms on data from the FDA Adverse Event Reporting System (FAERS). Results demonstrate that enhanced drug safety signals with pre-clinical information outperform the standard safety signal detection algorithms on early ADE detection. In addition, we systematically evaluate the performance of different drug similarities against different types of ADEs. Furthermore, we have developed a web interface (http://drug-drug-sim.aimedlab.net/) to display our multiple drug similarity scores, facilitating access to this valuable resource for drug safety monitoring.
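The propagation step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact update rule, so the code below assumes a common iterative form in which each drug's score is a blend of its own initial signal and the similarity-weighted average of its neighbors' scores. The function name `propagate`, the mixing weight `alpha`, and the iteration count are hypothetical choices for illustration only.

```python
def propagate(similarity, signals, alpha=0.5, iterations=50):
    """Spread initial safety signals over one drug similarity network.

    similarity: square list of lists with pairwise drug similarities.
    signals:    initial signal score per drug (e.g. from a detection algorithm).
    alpha:      assumed mixing weight; 0 keeps the original signals unchanged.
    """
    n = len(signals)
    # Row-normalize the network, ignoring self-similarity, so each drug
    # takes a weighted average over its neighbors.
    weights = []
    for i in range(n):
        row = [similarity[i][j] if i != j else 0.0 for j in range(n)]
        total = sum(row)
        weights.append([w / total if total > 0 else 0.0 for w in row])

    scores = list(signals)
    for _ in range(iterations):
        scores = [
            alpha * sum(weights[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * signals[i]
            for i in range(n)
        ]
    return scores


# Toy example: drug 2 has no reports yet but is structurally close to drug 0.
toy_similarity = [
    [1.0, 0.1, 0.9],
    [0.1, 1.0, 0.1],
    [0.9, 0.1, 1.0],
]
toy_signals = [2.0, 0.0, 0.0]
enhanced = propagate(toy_similarity, toy_signals)
```

In this toy network the unreported drug that closely resembles the flagged drug ends up with a higher enhanced score than the dissimilar one, which is the intuition behind using similarity networks when reports are scarce. With several networks, one such pass would be run per network.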
Affiliation(s)
- Biswajit Padhi
  - Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Ave, Columbus, OH 43210, USA
- Ruoqi Liu
  - Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Ave, Columbus, OH 43210, USA
- Yuedi Yang
  - Department of Biostatistics and Health Data Science, Indiana University School of Medicine, 410 W. 10th Street HITS 3000, Indianapolis, IN 46202, USA
- Xueqiao Peng
  - Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Ave, Columbus, OH 43210, USA
- Lang Li
  - Department of Biomedical Informatics, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, USA
- Pengyue Zhang
  - Department of Biostatistics and Health Data Science, Indiana University School of Medicine, 410 W. 10th Street HITS 3000, Indianapolis, IN 46202, USA
- Ping Zhang
  - Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Ave, Columbus, OH 43210, USA
  - Department of Biomedical Informatics, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, USA
  - Translational Data Analytics Institute, The Ohio State University, 1760 Neil Ave, Columbus, OH 43210, USA
2
Krix S, DeLong LN, Madan S, Domingo-Fernández D, Ahmad A, Gul S, Zaliani A, Fröhlich H. MultiGML: Multimodal graph machine learning for prediction of adverse drug events. Heliyon 2023; 9:e19441. [PMID: 37681175 PMCID: PMC10481305 DOI: 10.1016/j.heliyon.2023.e19441] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/15/2022] [Revised: 08/22/2023] [Accepted: 08/23/2023] [Indexed: 09/09/2023]
Abstract
Adverse drug events constitute a major challenge for the success of clinical trials. Several computational strategies have been suggested to estimate the risk of adverse drug events in preclinical drug development. While these approaches have demonstrated high utility in practice, they are at the same time limited to specific information sources. Thus, many current computational approaches neglect a wealth of information which results from the integration of different data sources, such as biological protein function, gene expression, chemical compound structure, cell-based imaging and others. In this work we propose an integrative and explainable multi-modal Graph Machine Learning approach (MultiGML), which fuses knowledge graphs with multiple further data modalities to predict drug related adverse events and general drug target-phenotype associations. MultiGML demonstrates excellent prediction performance compared to alternative algorithms, including various traditional knowledge graph embedding techniques. MultiGML distinguishes itself from alternative techniques by providing in-depth explanations of model predictions, which point towards biological mechanisms associated with predictions of an adverse drug event. Hence, MultiGML could be a versatile tool to support decision making in preclinical drug development.
Affiliation(s)
- Sophia Krix
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
  - Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
  - Fraunhofer Center for Machine Learning, Germany
- Lauren Nicole DeLong
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
  - Artificial Intelligence and its Applications Institute, School of Informatics, University of Edinburgh, 10 Crichton Street, EH8 9AB, UK
- Sumit Madan
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
  - Department of Computer Science, University of Bonn, 53115, Bonn, Germany
- Daniel Domingo-Fernández
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
  - Fraunhofer Center for Machine Learning, Germany
  - Enveda Biosciences, Boulder, CO, 80301, USA
- Ashar Ahmad
  - Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
  - Grunenthal GmbH, 52099, Aachen, Germany
- Sheraz Gul
  - Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, Schnackenburgallee 114, 22525, Hamburg, Germany
  - Fraunhofer Cluster of Excellence for Immune-Mediated Diseases CIMD, Schnackenburgallee 114, 22525, Hamburg, Germany
- Andrea Zaliani
  - Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, Schnackenburgallee 114, 22525, Hamburg, Germany
  - Fraunhofer Cluster of Excellence for Immune-Mediated Diseases CIMD, Schnackenburgallee 114, 22525, Hamburg, Germany
- Holger Fröhlich
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
  - Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
3
The 2011–2020 Trends of Data-Driven Approaches in Medical Informatics for Active Pharmacovigilance. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11052249] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/20/2022]
Abstract
Pharmacovigilance, the scientific discipline pertaining to drug safety, has been studied extensively and is progressing continuously. In this field, medical informatics techniques and interpretation play important roles, and appropriate approaches are required. In this study, we investigated and analyzed the trends of pharmacovigilance systems, especially the data collection, detection, assessment, and monitoring processes. We used PubMed to collect papers on pharmacovigilance published over the past 10 years, and analyzed a total of 40 significant papers to determine the characteristics of the databases and data analysis methods used to identify drug safety indicators. Through systematic review, we identified the difficulty of standardizing data and terminology and of establishing an adverse drug reaction (ADR) evaluation system in pharmacovigilance, and the corresponding implications. We found that appropriate methods and guidelines for active pharmacovigilance using medical big data are still required and should continue to be developed.
4
Liu R, Zhang P. Towards early detection of adverse drug reactions: combining pre-clinical drug structures and post-market safety reports. BMC Med Inform Decis Mak 2019; 19:279. [PMID: 31849321 PMCID: PMC6918608 DOI: 10.1186/s12911-019-0999-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/01/2019] [Accepted: 12/04/2019] [Indexed: 01/10/2023]
Abstract
Background: Adverse drug reactions (ADRs) are a major burden for patients and the healthcare industry. Early and accurate detection of potential ADRs can help to improve drug safety and reduce financial costs. Post-market spontaneous reports of ADRs remain a cornerstone of pharmacovigilance, and a series of drug safety signal detection methods play an important role in providing drug safety insights. However, existing methods require sufficient case reports to generate signals, limiting their use for newly approved drugs with few (or even no) reports. Methods: In this study, we propose a label propagation framework to enhance drug safety signals by combining drug chemical structures with data from the FDA Adverse Event Reporting System (FAERS). First, we compute original drug safety signals via common signal detection algorithms. Then, we construct a drug similarity network based on chemical structures. Finally, we generate enhanced drug safety signals by propagating original signals on the drug similarity network. Our proposed framework enriches post-market safety reports with a pre-clinical drug similarity network, effectively alleviating the issue of insufficient cases for newly approved drugs. Results: We apply the label propagation framework to four popular signal detection algorithms (PRR, ROR, MGPS, BCPNN) and find that our proposed framework generates more accurate drug safety signals than the corresponding baselines. In addition, our framework identifies potential ADRs for newly approved drugs, thus paving the way for early detection of ADRs. Conclusions: The proposed label propagation framework combines pre-clinical drug structures with post-market safety reports, generates enhanced drug safety signals, and can potentially help to accurately detect ADRs ahead of time. Availability: The source code for this paper is available at: https://github.com/ruoqi-liu/LP-SDA.
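Two of the signal detection algorithms named in this abstract, PRR and ROR, are disproportionality measures computed from a 2x2 table of spontaneous reports. The sketch below uses their standard textbook definitions; it is not taken from the linked repository, and the function names and the example counts are hypothetical.

```python
def prr(a, b, c, d):
    """Proportional reporting ratio.

    a: reports with the drug and the event
    b: reports with the drug and other events
    c: reports with other drugs and the event
    d: reports with other drugs and other events
    Assumes a + b > 0 and c > 0.
    """
    return (a / (a + b)) / (c / (c + d))


def ror(a, b, c, d):
    """Reporting odds ratio; assumes b > 0 and c > 0."""
    return (a * d) / (b * c)


# Hypothetical counts for one drug-event pair.
example_prr = prr(10, 90, 20, 880)
example_ror = ror(10, 90, 20, 880)
```

Both ratios depend on the cell `a`, which is small or zero for a newly approved drug. That is the data scarcity problem the abstract says label propagation is meant to alleviate.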
Affiliation(s)
- Ruoqi Liu
  - Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Ave, Columbus, 43210, Ohio, USA
- Ping Zhang
  - Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Ave, Columbus, 43210, Ohio, USA
  - Department of Biomedical Informatics, The Ohio State University, 1800 Cannon Drive, Columbus, 43210, Ohio, USA
5
Chapman AB, Peterson KS, Alba PR, DuVall SL, Patterson OV. Detecting Adverse Drug Events with Rapidly Trained Classification Models. Drug Saf 2019; 42:147-156. [PMID: 30649737 PMCID: PMC6373386 DOI: 10.1007/s40264-018-0763-y] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Indexed: 02/06/2023]
Abstract
INTRODUCTION: Identifying occurrences of medication side effects and adverse drug events (ADEs) is an important and challenging task because they are frequently only mentioned in clinical narrative and are not formally reported. METHODS: We developed a natural language processing (NLP) system that aims to identify mentions of symptoms and drugs in clinical notes and label the relationship between the mentions as indications or ADEs. The system leverages an existing word embeddings model with induced word clusters for dimensionality reduction. It employs a conditional random field (CRF) model for named entity recognition (NER) and a random forest model for relation extraction (RE). RESULTS: Final performance of each model was evaluated separately and then combined on a manually annotated evaluation set. The micro-averaged F1 score was 80.9% for NER, 88.1% for RE, and 61.2% for the integrated systems. Outputs from our systems were submitted to the NLP Challenges for Detecting Medication and Adverse Drug Events from Electronic Health Records (MADE 1.0) competition (Yu et al. in http://bio-nlp.org/index.php/projects/39-nlp-challenges, 2018). System performance was evaluated in three tasks (NER, RE, and complete system) with multiple teams submitting output from their systems for each task. Our RE system placed first in Task 2 of the challenge and our integrated system achieved third place in Task 3. CONCLUSION: Adding to the growing number of publications that utilize NLP to detect occurrences of ADEs, our study illustrates the benefits of employing innovative feature engineering.
Affiliation(s)
- Kelly S Peterson
  - VA Salt Lake City Health Care System, University of Utah, Salt Lake City, UT, USA
  - Division of Epidemiology, University of Utah, Salt Lake City, UT, USA
- Patrick R Alba
  - VA Salt Lake City Health Care System, University of Utah, Salt Lake City, UT, USA
  - Division of Epidemiology, University of Utah, Salt Lake City, UT, USA
- Scott L DuVall
  - VA Salt Lake City Health Care System, University of Utah, Salt Lake City, UT, USA
  - Division of Epidemiology, University of Utah, Salt Lake City, UT, USA
- Olga V Patterson
  - VA Salt Lake City Health Care System, University of Utah, Salt Lake City, UT, USA
  - Division of Epidemiology, University of Utah, Salt Lake City, UT, USA
6
Vilar S, Friedman C, Hripcsak G. Detection of drug-drug interactions through data mining studies using clinical sources, scientific literature and social media. Brief Bioinform 2018; 19:863-877. [PMID: 28334070 PMCID: PMC6454455 DOI: 10.1093/bib/bbx010] [Citation(s) in RCA: 90] [Impact Index Per Article: 12.9] [Received: 10/14/2016] [Revised: 12/28/2016] [Indexed: 11/13/2022]
Abstract
Drug-drug interactions (DDIs) constitute an important concern in drug development and postmarketing pharmacovigilance. They are considered the cause of many adverse drug effects, exposing patients to higher risks and increasing public health system costs. Methods to follow up and discover possible DDIs causing harm to the population are a primary aim of drug safety researchers. Here, we review different methodologies and recent advances using data mining to detect DDIs with impact on patients. We focus on data mining of different pharmacovigilance sources, such as the US Food and Drug Administration Adverse Event Reporting System and electronic health records from medical institutions, as well as on the diverse data mining studies that use narrative text available in the scientific biomedical literature and social media. We pay attention to the strengths but also further explain challenges related to these methods. Data mining has important applications in the analysis of DDIs: showing the impact of the interactions as a cause of adverse effects, extracting interactions to create knowledge data sets and gold standards, and discovering novel and dangerous DDIs.
Affiliation(s)
- Santiago Vilar
  - Department of Biomedical Informatics, Columbia University, New York, USA
  - Department of Organic Chemistry, University of Santiago de Compostela, Spain
- Carol Friedman
  - Department of Biomedical Informatics, Columbia University, New York, USA
- George Hripcsak
  - Department of Biomedical Informatics, Columbia University, New York, USA
7
Meystre SM, Lovis C, Bürkle T, Tognola G, Budrionis A, Lehmann CU. Clinical Data Reuse or Secondary Use: Current Status and Potential Future Progress. Yearb Med Inform 2017; 26:38-52. [PMID: 28480475 PMCID: PMC6239225 DOI: 10.15265/iy-2017-007] [Citation(s) in RCA: 89] [Impact Index Per Article: 11.1] [Received: 05/08/2017] [Indexed: 12/30/2022]
Abstract
Objective: To perform a review of recent research in clinical data reuse or secondary use, and envision future advances in this field. Methods: The review is based on a large literature search in MEDLINE (through PubMed), conference proceedings, and the ACM Digital Library, focusing only on research published between 2005 and early 2016. Each selected publication was reviewed by the authors, and a structured analysis and summarization of its content was developed. Results: The initial search produced 359 publications, reduced after a manual examination of abstracts and full publications. The following aspects of clinical data reuse are discussed: motivations and challenges, privacy and ethical concerns, data integration and interoperability, data models and terminologies, unstructured data reuse, structured data mining, clinical practice and research integration, and examples of clinical data reuse (quality measurement and learning healthcare systems). Conclusion: Reuse of clinical data is a fast-growing field recognized as essential to realize the potentials for high quality healthcare, improved healthcare management, reduced healthcare costs, population health management, and effective clinical research.
Affiliation(s)
- S. M. Meystre
  - Medical University of South Carolina, Charleston, SC, USA
- C. Lovis
  - Division of Medical Information Sciences, University Hospitals of Geneva, Switzerland
- T. Bürkle
  - University of Applied Sciences, Bern, Switzerland
- G. Tognola
  - Institute of Electronics, Computer and Telecommunication Engineering, Italian Natl. Research Council IEIIT-CNR, Milan, Italy
- A. Budrionis
  - Norwegian Centre for E-health Research, University Hospital of North Norway, Tromsø, Norway
- C. U. Lehmann
  - Departments of Biomedical Informatics and Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA
8
Abstract
Background and Objective: Several studies have demonstrated the ability to detect adverse events potentially related to multiple drug exposure via data mining. However, the number of putative associations produced by such computational approaches is typically large, making experimental validation difficult. We theorized that those potential associations for which there is evidence from multiple complementary sources are more likely to be true, and explored this idea using a published database of drug–drug-adverse event associations derived from electronic health records (EHRs). Methods: We prioritized drug–drug-event associations derived from EHRs using four sources of information: (1) public databases, (2) sources of spontaneous reports, (3) literature, and (4) non-EHR drug–drug interaction (DDI) prediction methods. After pre-filtering the associations by removing those found in public databases, we devised a ranking for associations based on the support from the remaining sources, and evaluated the results of this rank-based prioritization. Results: We collected information for 5983 putative EHR-derived drug–drug-event associations involving 345 drugs and ten adverse events from four data sources and four prediction methods. Only seven drug–drug-event associations (<0.5%) had support from the majority of evidence sources, and about one third (1777) had support from at least one of the evidence sources. Conclusions: Our proof-of-concept method for scoring putative drug–drug-event associations from EHRs offers a systematic and reproducible way of prioritizing associations for further study. Our findings also quantify the agreement (or lack thereof) among complementary sources of evidence for drug–drug-event associations and highlight the challenges of developing a robust approach for prioritizing signals of these associations.
Electronic supplementary material: The online version of this article (doi:10.1007/s40264-015-0352-2) contains supplementary material, which is available to authorized users.
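The pre-filter and rank-based prioritization outlined in this abstract can be sketched as follows. The abstract does not specify the actual scoring, so this example simply assumes that associations already present in public databases are dropped and the rest are ordered by how many remaining sources support them. The function name `prioritize`, the source labels, and the toy associations are all hypothetical.

```python
def prioritize(evidence, known_source="public_database"):
    """Rank drug-drug-event associations by number of supporting sources.

    evidence maps an association, written as a (drug, drug, event) tuple,
    to the set of evidence sources that support it.
    """
    # Pre-filter: drop associations that are already documented.
    novel = {
        assoc: sources
        for assoc, sources in evidence.items()
        if known_source not in sources
    }
    # More supporting sources first; ties broken by the association itself.
    return sorted(novel, key=lambda assoc: (-len(novel[assoc]), assoc))


toy_evidence = {
    ("d1", "d2", "e1"): {"spontaneous_reports", "literature", "prediction"},
    ("d1", "d3", "e1"): {"public_database", "literature"},
    ("d2", "d4", "e2"): {"literature"},
    ("d5", "d6", "e2"): set(),
}
ranking = prioritize(toy_evidence)
```

A plain count treats every source as equally informative. Weighting sources differently would be a natural variation, but nothing in the abstract indicates whether the authors did so.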
9
Cole AM, Stephens KA, Keppel GA, Estiri H, Baldwin LM. Extracting Electronic Health Record Data in a Practice-Based Research Network: Processes to Support Translational Research across Diverse Practice Organizations. EGEMS 2016; 4:1206. [PMID: 27141519 PMCID: PMC4827782 DOI: 10.13063/2327-9214.1206] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 11/25/2022]
Abstract
Context: The widespread adoption of electronic health records (EHRs) offers significant opportunities to conduct research with clinical data from patients outside traditional academic research settings. Because EHRs are designed primarily for clinical care and billing, significant challenges are inherent in the use of EHR data for clinical and translational research. Efficient processes are needed for translational researchers to overcome these challenges. The Data QUEST Coordinating Center (DQCC), which oversees Data Query Extraction Standardization Translation (Data QUEST) – a primary-care, EHR data-sharing infrastructure – created processes that guide EHR data extraction for clinical and translational research across these diverse practices. We describe these processes and their application in a case example. Case Description: The DQCC process for developing EHR data extractions not only supports researchers’ access to EHR data, but supports this access for the purpose of answering scientific questions. This process requires complex coordination across multiple domains, including the following: (1) understanding the context of EHR data; (2) creating and maintaining a governance structure to support exchange of EHR data; and (3) defining data parameters that are used in order to extract data from the EHR. We use the Northwest-Alaska Pharmacogenomics Research Network (NWA-PGRN) as a case example that focuses on pharmacogenomic discovery and clinical applications to describe the DQCC process. The NWA-PGRN collaborates with Data QUEST to explore ways to leverage primary-care EHR data to support pharmacogenomics research. Findings: Preliminary analysis on the case example shows that initial decisions about how researchers define the study population can influence study outcomes. 
Major Themes and Conclusions: The experience of the DQCC demonstrates that coordinating centers provide expertise in helping researchers understand the context of EHR data, create and maintain governance structures, and guide the definition of parameters for data extractions. This expertise is critical to supporting research with EHR data. Replication of these strategies through coordinating centers may lead to more efficient translational research. Investigators must also consider the impact of initial decisions in defining study groups that may potentially affect outcomes.
Affiliation(s)
- Allison M Cole
  - University of Washington, Institute of Translational Health Sciences
- Kari A Stephens
  - University of Washington, Institute of Translational Health Sciences
- Gina A Keppel
  - University of Washington, Institute of Translational Health Sciences
- Hossein Estiri
  - University of Washington, Institute of Translational Health Sciences
- Laura-Mae Baldwin
  - University of Washington, Institute of Translational Health Sciences
10
Vilar S, Lorberbaum T, Hripcsak G, Tatonetti NP. Improving Detection of Arrhythmia Drug-Drug Interactions in Pharmacovigilance Data through the Implementation of Similarity-Based Modeling. PLoS One 2015; 10:e0129974. [PMID: 26068584 PMCID: PMC4466327 DOI: 10.1371/journal.pone.0129974] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 01/29/2015] [Accepted: 05/14/2015] [Indexed: 11/18/2022]
Abstract
Identification of Drug-Drug Interactions (DDIs) is a significant challenge during drug development and clinical practice. DDIs are responsible for many adverse drug effects (ADEs), decreasing patient quality of life and causing higher care expenses. DDIs are not systematically evaluated in pre-clinical or clinical trials, so the U.S. Food and Drug Administration (FDA) relies on post-marketing surveillance to monitor patient safety. However, existing pharmacovigilance algorithms perform poorly at detecting DDIs, exhibiting prohibitively high false positive rates. Alternatively, methods based on chemical structure and pharmacological similarity have shown promise in adverse drug event detection. We hypothesize that the use of chemical biology data in a post hoc analysis of pharmacovigilance results will significantly improve the detection of dangerous interactions. Our model integrates a reference standard of DDIs known to cause arrhythmias with drug similarity data. To compare similarity between drugs we used chemical structure (both 2D and 3D molecular structure), adverse drug side effects, chemogenomic targets, drug indication classes, and known drug-drug interactions. We evaluated the method on external reference standards. Our results showed an enhancement of sensitivity, specificity and precision in different top positions with the use of similarity measures to rank the candidates extracted from pharmacovigilance data. For the top 100 DDI candidates, similarity-based modeling yielded close to twofold precision enhancement compared to the proportional reporting ratio (PRR). Moreover, the method helps in DDI decision making through the identification of the DDI in the reference standard that generated the candidate.
Affiliation(s)
- Santiago Vilar
  - Department of Biomedical Informatics, Columbia University, New York, NY, United States of America
  - Department of Systems Biology, Columbia University, New York, NY, United States of America
  - Observational Health Data Sciences and Informatics (OHDSI), New York, NY, United States of America
- Tal Lorberbaum
  - Department of Biomedical Informatics, Columbia University, New York, NY, United States of America
  - Department of Systems Biology, Columbia University, New York, NY, United States of America
- George Hripcsak
  - Department of Biomedical Informatics, Columbia University, New York, NY, United States of America
  - Observational Health Data Sciences and Informatics (OHDSI), New York, NY, United States of America
- Nicholas P. Tatonetti
  - Department of Biomedical Informatics, Columbia University, New York, NY, United States of America
  - Department of Systems Biology, Columbia University, New York, NY, United States of America
  - Observational Health Data Sciences and Informatics (OHDSI), New York, NY, United States of America
  - Department of Medicine, Columbia University, New York, NY, United States of America
11
Liu M, Hu Y, Tang B. Role of text mining in early identification of potential drug safety issues. Methods Mol Biol 2015; 1159:227-51. [PMID: 24788270 DOI: 10.1007/978-1-4939-0709-0_13] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 12/29/2022]
Abstract
Drugs are an important part of today's medicine, designed to treat, control, and prevent diseases; however, besides their therapeutic effects, drugs may also cause adverse effects that range from cosmetic to severe morbidity and mortality. To identify these potential drug safety issues early, surveillance must be conducted for each drug throughout its life cycle, from drug development to different phases of clinical trials, and continued after market approval. A major aim of pharmacovigilance is to identify the potential drug-event associations that may be novel in nature, severity, and/or frequency. Currently, the state-of-the-art approach for signal detection is through automated procedures by analyzing vast quantities of data for clinical knowledge. There exists a variety of resources for the task, and many of them are textual data that require text analytics and natural language processing to derive high-quality information. This chapter focuses on the utilization of text mining techniques in identifying potential safety issues of drugs from textual sources such as biomedical literature, consumer posts in social media, and narrative electronic medical records.
Affiliation(s)
- Mei Liu
  - Department of Computer Science, New Jersey Institute of Technology, University Heights, Newark, NJ, 07102, USA
12
3D pharmacophoric similarity improves multi adverse drug event identification in pharmacovigilance. Sci Rep 2015; 5:8809. [PMID: 25744369 PMCID: PMC4351525 DOI: 10.1038/srep08809] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 10/27/2014] [Accepted: 01/30/2015] [Indexed: 11/08/2022]
Abstract
Detection of adverse drug events (ADEs) is a considerable concern in patient safety and public health care. For this reason, it is important to develop methods that improve ADE signal detection in pharmacovigilance databases. Our objective is to apply 3D pharmacophoric similarity models to enhance ADE recognition in Offsides, a pharmacovigilance resource with drug-ADE associations extracted from the FDA Adverse Event Reporting System (FAERS). We developed a multi-ADE predictor implementing 3D drug similarity based on a pharmacophoric approach, with an ADE reference standard extracted from the SIDER database. The results showed that the application of our 3D multi-type ADE predictor to the pharmacovigilance data in Offsides improved ADE identification and generated enriched sets of drug-ADE signals. The global ROC curve for the Offsides ADE candidates ranked with the 3D similarity score showed an area of 0.7. The 3D predictor also allows the identification of the most similar drug that causes the ADE under study, which could provide hypotheses about mechanisms of action and ADE etiology. Our method is useful in drug development, for screening potential adverse effects in experimental drugs, and in drug safety, where it is applicable to the evaluation of ADE signals selected through pharmacovigilance data mining.
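The area under the ROC curve reported in this abstract has a simple rank interpretation: it is the probability that a randomly chosen true drug-ADE pair receives a higher score than a randomly chosen false one. The sketch below computes it that way on made-up scores and labels; it is a generic illustration, not the authors' evaluation code, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted score per candidate (higher means more likely an ADE).
    labels: 1 for candidates in the reference standard, 0 otherwise.
    Assumes at least one positive and one negative label.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive-over-negative wins; ties count as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Made-up similarity scores for five candidates, two of them true ADEs.
toy_auc = roc_auc([0.9, 0.8, 0.4, 0.3, 0.2], [1, 0, 1, 0, 0])
```

An area of 0.5 corresponds to random ranking and 1.0 to a perfect one, which gives context for the value of 0.7 quoted above.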
13
Hung WY, Abreu Lanfranco O. Contemporary review of drug-induced pancreatitis: A different perspective. World J Gastrointest Pathophysiol 2014; 5:405-415. [PMID: 25400984 PMCID: PMC4231505 DOI: 10.4291/wjgp.v5.i4.405] [Citation(s) in RCA: 70] [Impact Index Per Article: 6.4] [Received: 02/20/2014] [Revised: 06/17/2014] [Accepted: 07/29/2014] [Indexed: 02/06/2023]
Abstract
Although gallstones and alcohol use have been considered the most common causes of acute pancreatitis, hundreds of frequently prescribed medications are associated with this disease state. The true incidence is unknown since few population-based studies are available. Knowledge of drug-induced acute pancreatitis is limited by the availability and quality of the evidence, as the majority of data is extrapolated from case reports. Establishing a definitive causal relationship between a drug and acute pancreatitis poses a challenge to clinicians. Several causative agent classification systems are often used to identify the suspected agents. They require regular updates since new drug-induced acute pancreatitis cases are reported continuously. In addition, infrequently prescribed medications and herbal medications are often omitted. Furthermore, identification of drug-induced acute pancreatitis with new medications often requires accumulation of post-market case reports. The unrealistic expectation of a comprehensive list of medications and the multifactorial nature of acute pancreatitis call for a different approach. In this article, we review the potential mechanisms of drug-induced acute pancreatitis and provide the perspective of deductive reasoning in order to allow clinicians to identify potential drug-induced acute pancreatitis with limited data.
|
14
|
Vilar S, Ryan PB, Madigan D, Stang PE, Schuemie MJ, Friedman C, Tatonetti NP, Hripcsak G. Similarity-based modeling applied to signal detection in pharmacovigilance. CPT Pharmacometrics Syst Pharmacol 2014; 3:e137. [PMID: 25250527 PMCID: PMC4211266 DOI: 10.1038/psp.2014.35] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 04/10/2014] [Accepted: 07/06/2014] [Indexed: 12/31/2022]
Abstract
One of the main objectives in pharmacovigilance is the detection of adverse drug events (ADEs) through mining of healthcare databases, such as electronic health records or administrative claims data. Although different approaches have been shown to be of great value, research is still focusing on the enhancement of signal detection to gain efficiency in further assessment and follow-up. We applied similarity-based modeling techniques, using 2D and 3D molecular structure, ADE, target, and ATC (anatomical therapeutic chemical) similarity measures, to the candidate associations selected previously in a medication-wide association study for four ADE outcomes. Our results showed an improvement in the precision when we ranked the subset of ADE candidates using similarity scorings. This method is simple, useful to strengthen or prioritize signals generated from healthcare databases, and facilitates ADE detection through the identification of the most similar drugs for which ADE information is available.
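The ranking idea described in this abstract can be sketched as follows: each candidate drug is scored by its highest similarity to any drug already known to cause the ADE, and candidates are then sorted by that score. This is a minimal sketch under stated assumptions, not the authors' implementation. Drugs are represented here by hypothetical sets of feature identifiers and compared with the Tanimoto coefficient; the drug names, fingerprints, and function names are all invented for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Optional, Tuple

Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_score(
    candidate: str,
    known_positive: Iterable[str],
    fingerprints: Dict[str, Fingerprint],
) -> Tuple[float, Optional[str]]:
    """Return the best similarity to a known ADE drug and that drug's name."""
    best_score, best_drug = 0.0, None
    for ref in known_positive:
        if ref == candidate:
            continue  # never score a drug against itself
        s = tanimoto(fingerprints[candidate], fingerprints[ref])
        if s > best_score:
            best_score, best_drug = s, ref
    return best_score, best_drug


def rank_candidates(
    candidates: Iterable[str],
    known_positive: List[str],
    fingerprints: Dict[str, Fingerprint],
) -> List[Tuple[str, float, Optional[str]]]:
    """Sort candidates by descending similarity score."""
    scored = [(c, *similarity_score(c, known_positive, fingerprints)) for c in candidates]
    return sorted(scored, key=lambda row: row[1], reverse=True)


# Hypothetical feature sets; real work would use 2D/3D structure, target,
# ATC, or ADE-profile descriptors as named in the abstract.
toy_fingerprints = {
    "drug_a": frozenset({1, 2, 3, 4}),
    "drug_b": frozenset({1, 2, 3, 5}),
    "drug_c": frozenset({7, 8}),
    "ref_x": frozenset({1, 2, 3, 4, 6}),
    "ref_y": frozenset({8, 9, 10}),
}
ranking = rank_candidates(["drug_c", "drug_b", "drug_a"], ["ref_x", "ref_y"], toy_fingerprints)
```

Returning the most similar reference drug alongside the score mirrors the interpretability point made in the abstract: a prioritized signal can be traced back to a specific drug for which ADE information is available.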
Affiliation(s)
- S Vilar: [1] Department of Biomedical Informatics, Columbia University, New York, New York, USA; [2] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA
- P B Ryan: [1] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA; [2] Janssen Research and Development, Titusville, New Jersey, USA
- D Madigan: [1] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA; [2] Department of Statistics, Columbia University, New York, New York, USA
- P E Stang: [1] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA; [2] Janssen Research and Development, Titusville, New Jersey, USA
- M J Schuemie: [1] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA; [2] Janssen Research and Development, Titusville, New Jersey, USA
- C Friedman: [1] Department of Biomedical Informatics, Columbia University, New York, New York, USA; [2] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA
- N P Tatonetti: [1] Department of Biomedical Informatics, Columbia University, New York, New York, USA; [2] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA; [3] Department of Systems Biology, Columbia University Medical Center, New York, New York, USA; [4] Department of Medicine, Columbia University Medical Center, New York, New York, USA
- G Hripcsak: [1] Department of Biomedical Informatics, Columbia University, New York, New York, USA; [2] Observational Health Data Sciences and Informatics (OHDSI), New York, New York, USA
|
15
|
Vilar S, Uriarte E, Santana L, Lorberbaum T, Hripcsak G, Friedman C, Tatonetti NP. Similarity-based modeling in large-scale prediction of drug-drug interactions. Nat Protoc 2014; 9:2147-63. [PMID: 25122524 PMCID: PMC4422192 DOI: 10.1038/nprot.2014.151] [Citation(s) in RCA: 121] [Impact Index Per Article: 11.0] [Indexed: 02/06/2023]
Abstract
Drug-drug interactions (DDIs) are a major cause of adverse drug effects and a public health concern, as they increase hospital care expenses and reduce patients' quality of life. DDI detection is, therefore, an important objective in patient safety, one whose pursuit affects drug development and pharmacovigilance. In this article, we describe a protocol applicable on a large scale to predict novel DDIs based on similarity of drug interaction candidates to drugs involved in established DDIs. The method integrates a reference standard database of known DDIs with drug similarity information extracted from different sources, such as 2D and 3D molecular structure, interaction profile, target and side-effect similarities. The method is interpretable in that it generates drug interaction candidates that are traceable to pharmacological or clinical effects. We describe a protocol with applications in patient safety and preclinical toxicity screening. The time frame to implement this protocol is 5-7 h, with additional time potentially necessary, depending on the complexity of the reference standard DDI database and the similarity measures implemented.
Affiliation(s)
- Santiago Vilar: [1] Department of Biomedical Informatics, Columbia University Medical Center, New York, New York, USA; [2] Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, Santiago de Compostela, Spain
- Eugenio Uriarte: Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, Santiago de Compostela, Spain
- Lourdes Santana: Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, Santiago de Compostela, Spain
- Tal Lorberbaum: [1] Department of Biomedical Informatics, Columbia University Medical Center, New York, New York, USA; [2] Department of Physiology and Cellular Biophysics, Columbia University Medical Center, New York, New York, USA; [3] Department of Systems Biology, Columbia University Medical Center, New York, New York, USA
- George Hripcsak: Department of Biomedical Informatics, Columbia University Medical Center, New York, New York, USA
- Carol Friedman: Department of Biomedical Informatics, Columbia University Medical Center, New York, New York, USA
- Nicholas P Tatonetti: [1] Department of Biomedical Informatics, Columbia University Medical Center, New York, New York, USA; [2] Department of Systems Biology, Columbia University Medical Center, New York, New York, USA; [3] Department of Medicine, Columbia University Medical Center, New York, New York, USA
|
16
|
Abstract
OBJECTIVES: Implementation of Electronic Health Record (EHR) systems continues to expand. The massive number of patient encounters results in large amounts of stored data. Transforming clinical data into knowledge to improve patient care has been the goal of biomedical informatics professionals for many decades, and this work is now increasingly recognized outside our field. In reviewing the literature for the past three years, we focus on "big data" in the context of EHR systems and report on some examples of how secondary use of data has been put into practice. METHODS: We searched the PubMed database for articles from January 1, 2011 to November 1, 2013. We initiated the search with keywords related to "big data" and EHR. We identified relevant articles, and additional keywords from the retrieved articles were added. Based on the new keywords, more articles were retrieved, and we manually narrowed down the set using predefined inclusion and exclusion criteria. RESULTS: Our final review includes articles categorized into the themes of data mining (pharmacovigilance, phenotyping, natural language processing), data application and integration (clinical decision support, personal monitoring, social media), and privacy and security. CONCLUSION: The increasing adoption of EHR systems worldwide makes it possible to capture large amounts of clinical data. There is an increasing number of articles addressing the theme of "big data", and the concepts associated with these articles vary. The next step is to transform healthcare big data into actionable knowledge.
Affiliation(s)
- M K Ross
- Lucila Ohno-Machado, Division of Biomedical Informatics, 9500 Gilman Drive, MC 0505, La Jolla, California, 92037-0505, USA, Tel: +1 858 822 4931, E-mail:
|
17
|
Low YS, Sedykh AY, Rusyn I, Tropsha A. Integrative approaches for predicting in vivo effects of chemicals from their structural descriptors and the results of short-term biological assays. Curr Top Med Chem 2014; 14:1356-64. [PMID: 24805064 PMCID: PMC5344042 DOI: 10.2174/1568026614666140506121116] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 01/19/2014] [Revised: 02/05/2014] [Accepted: 02/05/2014] [Indexed: 12/22/2022]
Abstract
Cheminformatics approaches such as Quantitative Structure Activity Relationship (QSAR) modeling have been used traditionally for predicting chemical toxicity. In recent years, high throughput biological assays have been increasingly employed to elucidate mechanisms of chemical toxicity and predict toxic effects of chemicals in vivo. The data generated in such assays can be considered as biological descriptors of chemicals that can be combined with molecular descriptors and employed in QSAR modeling to improve the accuracy of toxicity prediction. In this review, we discuss several approaches for integrating chemical and biological data for predicting biological effects of chemicals in vivo and compare their performance across several data sets. We conclude that while no method consistently shows superior performance, the integrative approaches rank consistently among the best yet offer enriched interpretation of models over those built with either chemical or biological data alone. We discuss the outlook for such interdisciplinary methods and offer recommendations to further improve the accuracy and interpretability of computational models that predict chemical toxicity.
Affiliation(s)
- Alexander Tropsha: 100K Beard Hall, Campus Box 7568, University of North Carolina, Chapel Hill, NC 27599-7568, USA.
|
18
|
Liu M, Cai R, Hu Y, Matheny ME, Sun J, Hu J, Xu H. Determining molecular predictors of adverse drug reactions with causality analysis based on structure learning. J Am Med Inform Assoc 2013; 21:245-51. [PMID: 24334612 DOI: 10.1136/amiajnl-2013-002051] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 11/04/2022]
Abstract
OBJECTIVE: Adverse drug reactions (ADRs) can have dire consequences. However, our current understanding of the causes of drug-induced toxicity is still limited. Hence it is of paramount importance to determine molecular factors of adverse drug responses so that safer therapies can be designed. METHODS: We propose a causality analysis model based on structure learning (CASTLE) for identifying factors that contribute significantly to ADRs from an integration of chemical and biological properties of drugs. This study aims to address two major limitations of existing ADR prediction studies. First, ADR prediction is mostly performed by assessing the correlations between the input features and ADRs, and the identified associations may not indicate causal relations. Second, most predictive models lack biological interpretability. RESULTS: CASTLE was evaluated in terms of prediction accuracy on 12 organ-specific ADRs using 830 approved drugs. The prediction was carried out by first extracting causal features with structure learning and then applying them to a support vector machine (SVM) for classification. Through rigorous experimental analyses, we observed significant increases in both macro and micro F1 scores compared with the traditional SVM classifier, from 0.88 to 0.89 and 0.74 to 0.81, respectively. Most importantly, identified links between the biological factors and organ-specific drug toxicities were partially supported by evidence in Online Mendelian Inheritance in Man. CONCLUSIONS: The proposed CASTLE model not only performed better in prediction than the baseline SVM but also produced more interpretable results (i.e., biological factors responsible for ADRs), which is critical to discovering molecular activators of ADRs.
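The macro and micro F1 scores reported in this abstract differ in how they aggregate over classes: macro F1 averages the per-class F1 values, so every ADR class counts equally, whereas micro F1 pools true positives, false positives, and false negatives across classes before computing a single F1, so frequent classes dominate. The sketch below illustrates the two definitions only; the class names and counts are hypothetical and are not taken from the CASTLE study.

```python
from typing import Dict, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(per_class: Dict[str, Counts]) -> float:
    """Unweighted mean of per-class F1 scores."""
    return sum(f1(*c) for c in per_class.values()) / len(per_class)


def micro_f1(per_class: Dict[str, Counts]) -> float:
    """F1 computed from counts pooled over all classes."""
    tp = sum(c[0] for c in per_class.values())
    fp = sum(c[1] for c in per_class.values())
    fn = sum(c[2] for c in per_class.values())
    return f1(tp, fp, fn)


# Hypothetical confusion counts for two organ-specific ADR classes.
toy_counts = {"hepatic": (8, 2, 2), "renal": (1, 1, 3)}
toy_macro = macro_f1(toy_counts)
toy_micro = micro_f1(toy_counts)
```

In this toy case the rare, poorly predicted class pulls the macro score well below the micro score, which is why the two figures are usually reported together for multi-class ADR prediction.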
Affiliation(s)
- Mei Liu: Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey, USA
|
19
|
van den Heever M, Mittal A, Haydock M, Windsor J. The use of intelligent database systems in acute pancreatitis--a systematic review. Pancreatology 2013; 14:9-16. [PMID: 24555973 DOI: 10.1016/j.pan.2013.11.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 06/05/2013] [Revised: 10/15/2013] [Accepted: 11/18/2013] [Indexed: 12/11/2022]
Abstract
INTRODUCTION: Acute pancreatitis (AP) is a complex disease with multiple aetiological factors, wide-ranging severity, and multiple challenges to effective triage and management. Databases, data mining and machine learning algorithms (MLAs), including artificial neural networks (ANNs), may assist by storing and interpreting data from multiple sources, potentially improving clinical decision-making. AIMS: 1) Identify database technologies used to store AP data, 2) collate and categorise variables stored in AP databases, 3) identify the MLA technologies, including ANNs, used to analyse AP data, and 4) identify clinical and non-clinical benefits and obstacles in establishing a national or international AP database. METHODS: Comprehensive systematic search of online reference databases. The predetermined inclusion criteria were all papers discussing 1) databases, 2) data mining or 3) MLAs, pertaining to AP, independently assessed by two reviewers with conflicts resolved by a third author. RESULTS: Forty-three papers were included. Three data mining technologies and five ANN methodologies were reported in the literature. There were 187 collected variables identified. ANNs increase the accuracy of severity prediction; one study showed ANNs had a sensitivity of 0.89 and specificity of 0.96 six hours after admission, compared with 0.80 and 0.85, respectively, for APACHE II (cutoff score ≥8). Problems with databases were incomplete data, lack of clinical data, diagnostic reliability and missing clinical data. CONCLUSION: This is the first systematic review examining the use of databases, MLAs and ANNs in the management of AP. The clinical benefits these technologies have over current systems and other advantages to adopting them are identified.
Affiliation(s)
- Anubhav Mittal: Department of Surgery, University of Auckland, Auckland, New Zealand
- Matthew Haydock: Department of Surgery, University of Auckland, Auckland, New Zealand
- John Windsor: Department of Surgery, University of Auckland, Auckland, New Zealand.
|
20
|
Predicting Adverse Drug Events by Analyzing Electronic Patient Records. Artif Intell Med 2013. [DOI: 10.1007/978-3-642-38326-7_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
|