1
Park SJ, Yang S, Lee S, Joo SH, Park T, Kim DH, Kim H, Park S, Kim JT, Kwack WG, Kang SW, Song YK, Cha JM, Rhee SY, Chung EK. Machine-Learning Parsimonious Prediction Model for Diagnostic Screening of Severe Hematological Adverse Events in Cancer Patients Treated with PD-1/PD-L1 Inhibitors: Retrospective Observational Study by Using the Common Data Model. Diagnostics (Basel) 2025; 15:226. [PMID: 39857110 PMCID: PMC11763827 DOI: 10.3390/diagnostics15020226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2024] [Revised: 01/08/2025] [Accepted: 01/17/2025] [Indexed: 01/27/2025]
Abstract
Background/Objectives: Earlier detection of severe immune-related hematological adverse events (irHAEs) in cancer patients treated with a PD-1 or PD-L1 inhibitor is critical to improving treatment outcomes. The study aimed to develop a simple machine learning (ML) model for predicting irHAEs associated with PD-1/PD-L1 inhibitors. Methods: We utilized the Observational Medical Outcomes Partnership-Common Data Model based on electronic medical records from a tertiary (KHMC) and a secondary (KHNMC) hospital in South Korea. Severe irHAEs were defined as Grades 3-5 by the Common Terminology Criteria for Adverse Events (version 5.0). The predictive model was developed using the KHMC dataset, and then cross-validated against an independent cohort (KHNMC). The full ML models were then simplified by selecting critical features based on the feature importance values (FIVs). Results: Overall, 397 and 255 patients were included in the primary (KHMC) and cross-validation (KHNMC) cohort, respectively. Among the tested ML algorithms, random forest achieved the highest accuracy (area under the receiver operating characteristic curve [AUROC] 0.88 for both cohorts). Parsimonious models reduced to 50% FIVs of the full models showed comparable performance to the full models (AUROC 0.83-0.86, p > 0.05). The KHMC and KHNMC parsimonious models shared common predictive features including furosemide, oxygen gas, piperacillin/tazobactam, and acetylcysteine. Conclusions: Considering the simplicity and adequate predictive performance, our simplified ML models might be easily implemented in clinical practice with broad applicability. Our model might enhance early diagnostic screening of irHAEs induced by PD-1/PD-L1 inhibitors, contributing to minimizing the risk of severe irHAEs and improving the effectiveness of cancer immunotherapy.
Affiliation(s)
- Seok Jun Park
  - Department of Regulatory Science, College of Pharmacy, Graduate School, Kyung Hee University, Seoul 02447, Republic of Korea
  - Institute of Regulatory Innovation Through Science (IRIS), Kyung Hee University, Seoul 02447, Republic of Korea
- Seungwon Yang
  - Department of Regulatory Science, College of Pharmacy, Graduate School, Kyung Hee University, Seoul 02447, Republic of Korea
  - Institute of Regulatory Innovation Through Science (IRIS), Kyung Hee University, Seoul 02447, Republic of Korea
  - Department of Pharmacy, College of Pharmacy, Kyung Hee University, Seoul 02447, Republic of Korea
- Suhyun Lee
  - Department of Pharmacy, College of Pharmacy, Kyung Hee University, Seoul 02447, Republic of Korea
- Sung Hwan Joo
  - Department of Regulatory Science, College of Pharmacy, Graduate School, Kyung Hee University, Seoul 02447, Republic of Korea
  - Institute of Regulatory Innovation Through Science (IRIS), Kyung Hee University, Seoul 02447, Republic of Korea
- Taemin Park
  - Department of Regulatory Science, College of Pharmacy, Graduate School, Kyung Hee University, Seoul 02447, Republic of Korea
  - Institute of Regulatory Innovation Through Science (IRIS), Kyung Hee University, Seoul 02447, Republic of Korea
  - Department of Pharmacy, College of Pharmacy, Kyung Hee University, Seoul 02447, Republic of Korea
- Dong Hyun Kim
  - Department of Regulatory Science, College of Pharmacy, Graduate School, Kyung Hee University, Seoul 02447, Republic of Korea
  - Institute of Regulatory Innovation Through Science (IRIS), Kyung Hee University, Seoul 02447, Republic of Korea
  - Department of Pharmacy, College of Pharmacy, Kyung Hee University, Seoul 02447, Republic of Korea
- Hyeonji Kim
  - Department of Regulatory Science, College of Pharmacy, Graduate School, Kyung Hee University, Seoul 02447, Republic of Korea
  - Institute of Regulatory Innovation Through Science (IRIS), Kyung Hee University, Seoul 02447, Republic of Korea
  - Department of Pharmacy, College of Pharmacy, Kyung Hee University, Seoul 02447, Republic of Korea
- Soyun Park
  - Department of Regulatory Science, College of Pharmacy, Graduate School, Kyung Hee University, Seoul 02447, Republic of Korea
  - Institute of Regulatory Innovation Through Science (IRIS), Kyung Hee University, Seoul 02447, Republic of Korea
  - Department of Pharmacy, College of Pharmacy, Kyung Hee University, Seoul 02447, Republic of Korea
- Jung-Tae Kim
  - Department of Pharmacy, Kyung Hee University Hospital at Gangdong, Seoul 05278, Republic of Korea
- Won Gun Kwack
  - Division of Pulmonary, Allergy and Critical Care Medicine, Kyung Hee University Hospital, Seoul 02447, Republic of Korea
- Sung Wook Kang
  - Department of Pulmonary and Critical Care Medicine, Kyung Hee University Hospital at Gangdong, Seoul 05278, Republic of Korea
- Yun-Kyoung Song
  - College of Pharmacy, The Catholic University of Korea-Sungsim Campus, Bucheon 14662, Gyeonggi-do, Republic of Korea
- Jae Myung Cha
  - Division of Gastroenterology, Department of Internal Medicine, Kyung Hee University Hospital at Gangdong, Kyung Hee University School of Medicine, Seoul 05278, Republic of Korea
- Sang Youl Rhee
  - Center for Digital Health, Medical Science Research Institute, College of Medicine, Kyung Hee University, Seoul 02447, Republic of Korea
  - Department of Endocrinology and Metabolism, Kyung Hee University School of Medicine, Seoul 02447, Republic of Korea
- Eun Kyoung Chung
  - Department of Regulatory Science, College of Pharmacy, Graduate School, Kyung Hee University, Seoul 02447, Republic of Korea
  - Institute of Regulatory Innovation Through Science (IRIS), Kyung Hee University, Seoul 02447, Republic of Korea
  - Department of Pharmacy, College of Pharmacy, Kyung Hee University, Seoul 02447, Republic of Korea
  - Department of Pharmacy, Kyung Hee University Hospital at Gangdong, Seoul 05278, Republic of Korea
2
Zhao Y, Brush M, Wang C, Wagner AH, Liu H, Freimuth RR. Leveraging a pharmacogenomics knowledgebase to formulate a drug response phenotype terminology for genomic medicine. Bioinformatics 2022; 38:5279-5287. [PMID: 36222570 PMCID: PMC9710557 DOI: 10.1093/bioinformatics/btac646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2021] [Revised: 05/31/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION Despite increasing evidence of the utility of genomic medicine in clinical practice, systematically integrating genomic medicine information and knowledge into clinical systems with a high level of consistency, scalability and computability remains challenging. A comprehensive terminology is required for relevant concepts and the associated knowledge model for representing relationships. In this study, we leveraged PharmGKB, a comprehensive pharmacogenomics (PGx) knowledgebase, to formulate a terminology for drug response phenotypes that can represent relationships between genetic variants and treatments. We evaluated coverage of the terminology through manual review of a randomly selected subset of 200 sentences extracted from genetic reports that contained concepts for 'Genes and Gene Products' and 'Treatments'. RESULTS Our proposed drug response phenotype terminology covered 96% of the drug response phenotypes in genetic reports. Among 18 653 sentences that contained both 'Genes and Gene Products' and 'Treatments', 3011 sentences could be mapped to a drug response phenotype in our proposed terminology, among which the most discussed drug response phenotypes were response (994), sensitivity (829) and survival (332). In addition, we were able to re-analyze genetic report context incorporating the proposed terminology and enrich our previously proposed PGx knowledge model to reveal relationships between genetic variants and treatments. In conclusion, we proposed a drug response phenotype terminology that enhanced structured knowledge representation of genomic medicine. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yiqing Zhao
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN 55905, USA
- Matthew Brush
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Chen Wang
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA
- Alex H Wagner
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children’s Hospital, Columbus, OH 43205, USA
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH 43210, USA
  - Department of Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH 43210, USA
- Hongfang Liu
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN 55905, USA
- Robert R Freimuth
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN 55905, USA
3
Zong N, Wen A, Moon S, Fu S, Wang L, Zhao Y, Yu Y, Huang M, Wang Y, Zheng G, Mielke MM, Cerhan JR, Liu H. Computational drug repurposing based on electronic health records: a scoping review. NPJ Digit Med 2022; 5:77. [PMID: 35701544 PMCID: PMC9198008 DOI: 10.1038/s41746-022-00617-6] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 10/12/2021] [Accepted: 05/19/2022] [Indexed: 11/30/2022]
Abstract
Computational drug repurposing methods adapt artificial intelligence (AI) algorithms for the discovery of new applications of approved or investigational drugs. Among the heterogeneous datasets, electronic health record (EHR) datasets provide rich longitudinal and pathophysiological data that facilitate the generation and validation of drug repurposing. Here, we present an appraisal of recently published research on computational drug repurposing utilizing the EHR. Thirty-three research articles, retrieved from Embase, Medline, Scopus, and Web of Science between January 2000 and January 2022, were included in the final review. Four themes were presented: (1) publication venue, (2) data types and sources, (3) method for data processing and prediction, and (4) targeted disease, validation, and released tools. The review summarized the contribution of EHRs to drug repurposing and revealed that their utilization is hindered by the validation, accessibility, and understanding of EHRs. These findings can support researchers in the utilization of medical data resources and the development of computational methods for drug repurposing.
Affiliation(s)
- Nansu Zong
  - Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
- Andrew Wen
  - Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
- Sungrim Moon
  - Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
- Sunyang Fu
  - Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
- Liwei Wang
  - Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
- Yiqing Zhao
  - Department of Preventive Medicine, Northwestern Medicine, Northwestern University, Chicago, IL, USA
- Yue Yu
  - Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
- Ming Huang
  - Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
- Yanshan Wang
  - Department of Health Information Management, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, USA
- Gang Zheng
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
- James R Cerhan
  - Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
- Hongfang Liu
  - Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN, USA
4
Zheng NS, Stone CA, Jiang L, Shaffer CM, Kerchberger VE, Chung CP, Feng Q, Cox NJ, Stein CM, Roden DM, Denny JC, Phillips EJ, Wei WQ. High-throughput framework for genetic analyses of adverse drug reactions using electronic health records. PLoS Genet 2021; 17:e1009593. [PMID: 34061827 PMCID: PMC8195357 DOI: 10.1371/journal.pgen.1009593] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/03/2020] [Revised: 06/11/2021] [Accepted: 05/10/2021] [Indexed: 11/30/2022]
Abstract
Understanding the contribution of genetic variation to drug response can improve the delivery of precision medicine. However, genome-wide association studies (GWAS) for drug response are uncommon and are often hindered by small sample sizes. We present a high-throughput framework to efficiently identify eligible patients for genetic studies of adverse drug reactions (ADRs) using “drug allergy” labels from electronic health records (EHRs). As a proof-of-concept, we conducted GWAS for ADRs to 14 common drug/drug groups with 81,739 individuals from Vanderbilt University Medical Center’s BioVU DNA Biobank. We identified 7 genetic loci associated with ADRs at P < 5 × 10⁻⁸, including known genetic associations such as CYP2D6 and OPRM1 for CYP2D6-metabolized opioid ADR. Additional expression quantitative trait loci and phenome-wide association analyses added evidence to the observed associations. Our high-throughput framework is both scalable and portable, enabling impactful pharmacogenomic research to improve precision medicine.

Adverse drug reactions are a considerable burden on the healthcare system. Genetic studies can improve our understanding of the pathophysiological mechanisms of adverse drug reactions but have been hindered by small sample sizes. Drug responses are less often recorded than physiological traits and common diseases. Here, we present a high-throughput framework to efficiently identify eligible patients for genetic studies of adverse drug reactions from electronic health records. We validated our approach by conducting genome-wide association studies for adverse reactions to 14 common drug/drug groups with 81,739 individuals from Vanderbilt University Medical Center’s BioVU DNA Biobank, identifying 7 genetic loci associated with adverse drug reactions. Our high-throughput framework can enable impactful pharmacogenomic research to help develop clinical guidelines for the delivery of the right drug to the right person.
Affiliation(s)
- Neil S. Zheng
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Cosby A. Stone
  - Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Lan Jiang
  - Division of Rheumatology & Immunology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Christian M. Shaffer
  - Tennessee Valley Healthcare System—Nashville Campus, Nashville, Tennessee, United States of America
- V. Eric Kerchberger
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Cecilia P. Chung
  - Division of Rheumatology & Immunology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - Tennessee Valley Healthcare System—Nashville Campus, Nashville, Tennessee, United States of America
  - Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- QiPing Feng
  - Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Nancy J. Cox
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- C. Michael Stein
  - Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - Department of Pharmacology, Vanderbilt University, Nashville, Tennessee, United States of America
- Dan M. Roden
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - Department of Pharmacology, Vanderbilt University, Nashville, Tennessee, United States of America
  - Division of Cardiovascular Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Joshua C. Denny
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Elizabeth J. Phillips
  - Division of Infectious Diseases, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Wei-Qi Wei
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
5
Caraballo PJ, Sutton JA, Giri J, Wright JA, Nicholson WT, Kullo IJ, Parkulo MA, Bielinski SJ, Moyer AM. Integrating pharmacogenomics into the electronic health record by implementing genomic indicators. J Am Med Inform Assoc 2020; 27:154-158. [PMID: 31591640 DOI: 10.1093/jamia/ocz177] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 06/18/2019] [Revised: 08/19/2019] [Accepted: 09/11/2019] [Indexed: 12/27/2022]
Abstract
Pharmacogenomics (PGx) clinical decision support (CDS) integrated into the electronic health record (EHR) has the potential to provide relevant knowledge to clinicians to enable individualized care. However, past experience implementing PGx CDS in multiple EHR platforms has identified important clinical, procedural, and technical challenges. Commercial EHRs have been widely criticized for their lack of readiness to implement precision medicine. Herein, we share our experiences and lessons learned implementing new EHR functionality that charts PGx phenotypes in a unique repository, genomic indicators (Gen-Ind), instead of using the problem or allergy list. The Gen-Ind has additional features including a brief description of the clinical impact, a hyperlink to the original laboratory report, and links to additional educational resources. The automatic generation of genomic indicators from interfaced PGx test results facilitates implementation and long-term maintenance of PGx data in the EHR, and the indicators can be used as criteria for synchronous and asynchronous CDS.
Affiliation(s)
- Pedro J Caraballo
  - Division of General Internal Medicine, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
  - Center for the Science of Health Care Delivery, Mayo Clinic, Rochester, Minnesota, USA
- Joseph A Sutton
  - Department of Information Technology, Mayo Clinic, Rochester, Minnesota, USA
- Jyothsna Giri
  - Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Jessica A Wright
  - Department of Pharmacy Services, Mayo Clinic, Rochester, Minnesota, USA
- Wayne T Nicholson
  - Department of Anesthesiology and Perioperative Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Iftikhar J Kullo
  - Department of Cardiovascular Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Mark A Parkulo
  - Center for the Science of Health Care Delivery, Mayo Clinic, Rochester, Minnesota, USA
  - Division of Community Internal Medicine, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Suzette J Bielinski
  - Division of Epidemiology, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA
- Ann M Moyer
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota, USA
6
Lee HJ, Jiang M, Wu Y, Shaffer CM, Cleator JH, Friedman EA, Lewis JP, Roden DM, Denny J, Xu H. A comparative study of different methods for automatic identification of clopidogrel-induced bleedings in electronic health records. AMIA Joint Summits on Translational Science Proceedings 2017; 2017:185-192. [PMID: 28815128 PMCID: PMC5543340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
Electronic health records (EHRs) linked with biobanks have been recognized as valuable data sources for pharmacogenomic studies, which require identification of patients with certain adverse drug reactions (ADRs) from a large population. Since manual chart review is costly and time-consuming, automatic methods to accurately identify patients with ADRs have been called for. In this study, we developed and compared different informatics approaches to identify ADRs from EHRs, using clopidogrel-induced bleeding as our case study. Three different types of methods were investigated: 1) rule-based methods; 2) machine learning-based methods; and 3) scoring function-based methods. Our results show that both machine learning and scoring methods are effective and the scoring method can achieve a high precision with a reasonable recall. We also analyzed the contributions of different types of features and found that the temporality information between clopidogrel and bleeding events, as well as textual evidence from physicians' assertion of the adverse events are helpful. We believe that our findings are valuable in advancing EHR-based pharmacogenomic studies.
Affiliation(s)
- Hee-Jin Lee
  - University of Texas Health Science Center at Houston, Houston, TX
- Min Jiang
  - University of Texas Health Science Center at Houston, Houston, TX
- Yonghui Wu
  - University of Texas Health Science Center at Houston, Houston, TX
- Christian M Shaffer
  - Department of Pharmacology, Vanderbilt University School of Medicine, Nashville, TN
- John H Cleator
  - Department of Pharmacology, Vanderbilt University School of Medicine, Nashville, TN
  - Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN
- Eitan A Friedman
  - Division of Cardiovascular Medicine, Vanderbilt University, Nashville, TN
- Joshua P Lewis
  - Division of Endocrinology, Diabetes and Nutrition, University of Maryland School of Medicine, Baltimore, MD
- Dan M Roden
  - Department of Pharmacology, Vanderbilt University School of Medicine, Nashville, TN
  - Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN
- Josh Denny
  - Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN
- Hua Xu
  - University of Texas Health Science Center at Houston, Houston, TX
7
Wei WQ, Bastarache LA, Carroll RJ, Marlo JE, Osterman TJ, Gamazon ER, Cox NJ, Roden DM, Denny JC. Evaluating phecodes, clinical classification software, and ICD-9-CM codes for phenome-wide association studies in the electronic health record. PLoS One 2017; 12:e0175508. [PMID: 28686612 PMCID: PMC5501393 DOI: 10.1371/journal.pone.0175508] [Citation(s) in RCA: 250] [Impact Index Per Article: 31.3] [Received: 09/13/2016] [Accepted: 03/27/2017] [Indexed: 12/20/2022]
Abstract
OBJECTIVE To compare three groupings of Electronic Health Record (EHR) billing codes for their ability to represent clinically meaningful phenotypes and to replicate known genetic associations. The three tested coding systems were the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes, the Agency for Healthcare Research and Quality Clinical Classification Software for ICD-9-CM (CCS), and manually curated "phecodes" designed to facilitate phenome-wide association studies (PheWAS) in EHRs. METHODS AND MATERIALS We selected 100 disease phenotypes and compared the ability of each coding system to accurately represent them without performing additional groupings. The 100 phenotypes included 25 randomly chosen clinical phenotypes pursued in prior genome-wide association studies (GWAS) and another 75 common disease phenotypes mentioned across free-text problem lists from 189,289 individuals. We then evaluated the ability of each coding system to replicate known associations for 440 SNP-phenotype pairs. RESULTS Out of the 100 tested clinical phenotypes, phecodes exactly matched 83, compared to 53 for ICD-9-CM and 32 for CCS. ICD-9-CM codes were typically too detailed (requiring custom groupings) while CCS codes were often not granular enough. Among 440 tested known SNP-phenotype associations, use of phecodes replicated 153 SNP-phenotype pairs compared to 143 for ICD-9-CM and 139 for CCS. Phecodes also generally produced stronger odds ratios and lower p-values for known associations than ICD-9-CM and CCS. Finally, evaluation of several SNPs via PheWAS identified novel potential signals, some seen only when using the phecode approach. Among them, rs7318369 in PEPD was associated with gastrointestinal hemorrhage. CONCLUSION Our results suggest that the phecode groupings better align with clinical diseases mentioned in clinical practice or for genomic studies. ICD-9-CM, CCS, and phecode groupings all worked for PheWAS-type studies, though the phecode groupings produced superior results.
Affiliation(s)
- Wei-Qi Wei
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Lisa A. Bastarache
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Robert J. Carroll
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Joy E. Marlo
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Travis J. Osterman
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Eric R. Gamazon
  - Vanderbilt Genetic Institute and the Division of Genetic Medicine, Vanderbilt University, Nashville, TN, United States of America
  - Department of Clinical Epidemiology, Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands
  - Department of Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands
  - Department of Psychiatry, Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands
- Nancy J. Cox
  - Vanderbilt Genetic Institute and the Division of Genetic Medicine, Vanderbilt University, Nashville, TN, United States of America
- Dan M. Roden
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America
  - Department of Clinical Pharmacology, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Joshua C. Denny
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America
9
Dumitrescu L, Ritchie MD, Denny JC, El Rouby NM, McDonough CW, Bradford Y, Ramirez AH, Bielinski SJ, Basford MA, Chai HS, Peissig P, Carrell D, Pathak J, Rasmussen LV, Wang X, Pacheco JA, Kho AN, Hayes MG, Matsumoto M, Smith ME, Li R, Cooper-DeHoff RM, Kullo IJ, Chute CG, Chisholm RL, Jarvik GP, Larson EB, Carey D, McCarty CA, Williams MS, Roden DM, Bottinger E, Johnson JA, de Andrade M, Crawford DC. Genome-wide study of resistant hypertension identified from electronic health records. PLoS One 2017; 12:e0171745. [PMID: 28222112 PMCID: PMC5319785 DOI: 10.1371/journal.pone.0171745] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 10/08/2016] [Accepted: 01/25/2017] [Indexed: 12/11/2022]
Abstract
Resistant hypertension is defined as high blood pressure that remains above treatment goals in spite of the concurrent use of three antihypertensive agents from different classes. Despite the important health consequences of resistant hypertension, few studies of resistant hypertension have been conducted. To perform a genome-wide association study for resistant hypertension, we defined and identified cases of resistant hypertension and hypertensives with treated, controlled hypertension among >47,500 adults residing in the US linked to electronic health records (EHRs) and genotyped as part of the electronic MEdical Records & GEnomics (eMERGE) Network. Electronic selection logic using billing codes, laboratory values, text queries, and medication records was used to identify resistant hypertension cases and controls at each site, and a total of 3,006 cases of resistant hypertension and 876 controlled hypertensives were identified among eMERGE Phase I and II sites. After imputation and quality control, a total of 2,530,150 SNPs were tested for an association among 2,830 multi-ethnic cases of resistant hypertension and 876 controlled hypertensives. No test of association was genome-wide significant in the full dataset or in the dataset limited to European American cases (n = 1,719) and controls (n = 708). The most significant finding was CLNK rs13144136 at p = 1.00 × 10⁻⁶ (odds ratio = 0.68; 95% CI = 0.58–0.80) in the full dataset with similar results in the European American only dataset. We also examined whether SNPs known to influence blood pressure or hypertension also influenced resistant hypertension. None was significant after correction for multiple testing. These data highlight both the difficulties and the potential utility of EHR-linked genomic data to study clinically relevant traits such as resistant hypertension.
Affiliation(s)
- Logan Dumitrescu
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Marylyn D. Ritchie
- Biomedical and Translational Informatics, Geisinger Health System, Danville, Pennsylvania, United States of America
- Joshua C. Denny
- Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Medicine, Vanderbilt University, Nashville, Tennessee, United States of America
- Nihal M. El Rouby
- Department of Pharmacotherapy and Translational Research and Center for Pharmacogenomics, College of Pharmacy, University of Florida, Gainesville, Florida, United States of America
- Caitrin W. McDonough
- Department of Pharmacotherapy and Translational Research and Center for Pharmacogenomics, College of Pharmacy, University of Florida, Gainesville, Florida, United States of America
- Yuki Bradford
- Biomedical and Translational Informatics, Geisinger Health System, Danville, Pennsylvania, United States of America
- Andrea H. Ramirez
- Department of Medicine, Vanderbilt University, Nashville, Tennessee, United States of America
- Suzette J. Bielinski
- Division of Epidemiology, Mayo Clinic, Rochester, Minnesota, United States of America
- Melissa A. Basford
- Office of Research, Vanderbilt University, Nashville, Tennessee, United States of America
- High Seng Chai
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, United States of America
- Peggy Peissig
- Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, United States of America
- David Carrell
- Group Health Research Institute, Seattle, Washington, United States of America
- Jyotishman Pathak
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, United States of America
- Luke V. Rasmussen
- Department of Preventive Medicine, Division of Health and Biomedical Informatics, Northwestern University, Chicago, Illinois, United States of America
- Xiaoming Wang
- Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, United States of America
- Jennifer A. Pacheco
- Center for Genetic Medicine, Northwestern University, Chicago, Illinois, United States of America
- Abel N. Kho
- Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
- M. Geoffrey Hayes
- Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States of America
- Martha Matsumoto
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, United States of America
- Maureen E. Smith
- Center for Genetic Medicine, Northwestern University, Chicago, Illinois, United States of America
- Rongling Li
- Division of Genomic Medicine, National Human Genome Research Institute, Bethesda, Maryland, United States of America
- Rhonda M. Cooper-DeHoff
- Department of Pharmacotherapy and Translational Research and Center for Pharmacogenomics, College of Pharmacy, University of Florida, Gainesville, Florida, United States of America
- Epidemiology and Biostatistics, Institute for Computational Biology, Case Western Reserve University, Cleveland, Ohio, United States of America
- Iftikhar J. Kullo
- Department of Cardiovascular Diseases, Mayo Clinic, Rochester, Minnesota, United States of America
- Christopher G. Chute
- Division of General Internal Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America
- Rex L. Chisholm
- Center for Genetic Medicine, Northwestern University, Chicago, Illinois, United States of America
- Gail P. Jarvik
- Department of Medicine, University of Washington Medical Center, Seattle, Washington, United States of America
- Eric B. Larson
- Group Health Research Institute, Seattle, Washington, United States of America
- David Carey
- Weis Center for Research, Geisinger Health System, Danville, Pennsylvania, United States of America
- Marc S. Williams
- Genomic Medicine Institute, Geisinger Health System, Danville, Pennsylvania, United States of America
- Dan M. Roden
- Department of Medicine, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Pharmacology, Vanderbilt University, Nashville, Tennessee, United States of America
- Erwin Bottinger
- Charles R. Bronfman Institute for Personalized Medicine, Mount Sinai, New York, New York, United States of America
- Julie A. Johnson
- Department of Pharmacotherapy and Translational Research and Center for Pharmacogenomics, College of Pharmacy, University of Florida, Gainesville, Florida, United States of America
- Division of Cardiovascular Medicine, College of Medicine, University of Florida, Gainesville, Florida, United States of America
- Mariza de Andrade
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, United States of America
- Dana C. Crawford
- Epidemiology and Biostatistics, Institute for Computational Biology, Case Western Reserve University, Cleveland, Ohio, United States of America
10
Kirby JC, Speltz P, Rasmussen LV, Basford M, Gottesman O, Peissig PL, Pacheco JA, Tromp G, Pathak J, Carrell DS, Ellis SB, Lingren T, Thompson WK, Savova G, Haines J, Roden DM, Harris PA, Denny JC. PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability. J Am Med Inform Assoc 2016; 23:1046-1052. [PMID: 27026615 PMCID: PMC5070514 DOI: 10.1093/jamia/ocv202] [Citation(s) in RCA: 246] [Impact Index Per Article: 27.3] [Received: 08/19/2015] [Revised: 10/27/2015] [Accepted: 11/25/2015] [Indexed: 01/29/2023] Open
Abstract
OBJECTIVE Health care generated data have become an important source for clinical and genomic research. Often, investigators create and iteratively refine phenotype algorithms to achieve high positive predictive values (PPVs) or sensitivity, thereby identifying valid cases and controls. These algorithms achieve the greatest utility when validated and shared by multiple health care systems. MATERIALS AND METHODS We report the current status and impact of the Phenotype KnowledgeBase (PheKB, http://phekb.org), an online environment supporting the workflow of building, sharing, and validating electronic phenotype algorithms. We analyze the most frequent components used in algorithms and their performance at authoring institutions and secondary implementation sites. RESULTS As of June 2015, PheKB contained 30 finalized phenotype algorithms and 62 algorithms in development spanning a range of traits and diseases. Phenotypes have had over 3500 unique views in a 6-month period and have been reused by other institutions. International Classification of Disease codes were the most frequently used component, followed by medications and natural language processing. Among algorithms with published performance data, the median PPV was nearly identical when evaluated at the authoring institutions (n = 44; case 96.0%, control 100%) compared to implementation sites (n = 40; case 97.5%, control 100%). DISCUSSION These results demonstrate that a broad range of algorithms to mine electronic health record data from different health systems can be developed with high PPV, and algorithms developed at one site are generally transportable to others. CONCLUSION By providing a central repository, PheKB enables improved development, transportability, and validity of algorithms for research-grade phenotypes using health care generated data.
Affiliation(s)
- Peter Speltz
- Vanderbilt University Medical Center, Nashville, TN, USA
- Luke V Rasmussen
- Northwestern University, Feinberg School of Medicine, Chicago, IL, USA
- Omri Gottesman
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Todd Lingren
- Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Will K Thompson
- Northwestern University, Feinberg School of Medicine, Chicago, IL, USA
- Guergana Savova
- Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Dan M Roden
- Vanderbilt University Medical Center, Nashville, TN, USA
- Paul A Harris
- Vanderbilt University Medical Center, Nashville, TN, USA
- Joshua C Denny
- Vanderbilt University Medical Center, Nashville, TN, USA
11
Brilliant MH, Vaziri K, Connor TB, Schwartz SG, Carroll JJ, McCarty CA, Schrodi SJ, Hebbring SJ, Kishor KS, Flynn HW, Moshfeghi AA, Moshfeghi DM, Fini ME, McKay BS. Mining Retrospective Data for Virtual Prospective Drug Repurposing: L-DOPA and Age-related Macular Degeneration. Am J Med 2016; 129:292-8. [PMID: 26524704 PMCID: PMC4841631 DOI: 10.1016/j.amjmed.2015.10.015] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 09/17/2015] [Revised: 10/03/2015] [Accepted: 10/05/2015] [Indexed: 11/16/2022]
Abstract
BACKGROUND Age-related macular degeneration (AMD) is a leading cause of visual loss among the elderly. A key cell type involved in AMD, the retinal pigment epithelium, expresses a G protein-coupled receptor that, in response to its ligand, L-DOPA, up-regulates pigment epithelia-derived factor, while down-regulating vascular endothelial growth factor. In this study we investigated the potential relationship between L-DOPA and AMD. METHODS We used retrospective analysis to compare the incidence of AMD between patients taking vs not taking L-DOPA. We analyzed 2 separate cohorts of patients with extensive medical records from the Marshfield Clinic (approximately 17,000 and approximately 20,000) and the Truven MarketScan outpatient and databases (approximately 87 million) patients. We used International Classification of Diseases, 9th Revision codes to identify AMD diagnoses and L-DOPA prescriptions to determine the relative risk of developing AMD and age of onset with or without an L-DOPA prescription. RESULTS In the retrospective analysis of patients without an L-DOPA prescription, AMD age of onset was 71.2, 71.3, and 71.3 in 3 independent retrospective cohorts. Age-related macular degeneration occurred significantly later in patients with an L-DOPA prescription, 79.4 in all cohorts. The odds ratio of developing AMD was also significantly negatively correlated by L-DOPA (odds ratio 0.78; confidence interval, 0.76-0.80; P <.001). Similar results were observed for neovascular AMD (P <.001). CONCLUSIONS Exogenous L-DOPA was protective against AMD. L-DOPA is normally produced in pigmented tissues, such as the retinal pigment epithelium, as a byproduct of melanin synthesis by tyrosinase. GPR143 is the only known L-DOPA receptor; it is therefore plausible that GPR143 may be a fruitful target to combat this devastating disease.
Affiliation(s)
- Murray H Brilliant
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wis
- Kamyar Vaziri
- Department of Ophthalmology, Bascom Palmer Eye Institute, Miller School of Medicine, University of Miami, Palm Beach Gardens, Fla
- Thomas B Connor
- Department of Ophthalmology, Medical College of Wisconsin, Milwaukee
- Stephen G Schwartz
- Department of Ophthalmology, Bascom Palmer Eye Institute, Miller School of Medicine, University of Miami, Palm Beach Gardens, Fla
- Joseph J Carroll
- Department of Ophthalmology, Medical College of Wisconsin, Milwaukee; Department of Cell Biology, Neurobiology & Anatomy, Medical College of Wisconsin, Milwaukee
- Steven J Schrodi
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wis
- Scott J Hebbring
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wis
- Krishna S Kishor
- Department of Ophthalmology, Bascom Palmer Eye Institute, Miller School of Medicine, University of Miami, Palm Beach Gardens, Fla
- Harry W Flynn
- Department of Ophthalmology, Bascom Palmer Eye Institute, Miller School of Medicine, University of Miami, Palm Beach Gardens, Fla
- Andrew A Moshfeghi
- Department of Ophthalmology, USC Eye Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles
- Darius M Moshfeghi
- Department of Ophthalmology, Byers Eye Institute, Stanford University School of Medicine, Palo Alto, Calif
- M Elizabeth Fini
- USC Institute for Genetic Medicine, Keck School of Medicine of USC, University of Southern California, Los Angeles; Department of Cell & Neurobiology, Keck School of Medicine of USC, University of Southern California, Los Angeles; Department of Ophthalmology, Keck School of Medicine of USC, University of Southern California, Los Angeles
- Brian S McKay
- Department of Ophthalmology and Vision Science, University of Arizona, Tucson; Department of Cellular and Molecular Medicine, University of Arizona, Tucson
12
Shah RR, Gaedigk A, LLerena A, Eichelbaum M, Stingl J, Smith RL. CYP450 genotype and pharmacogenetic association studies: a critical appraisal. Pharmacogenomics 2016; 17:259-75. [DOI: 10.2217/pgs.15.172] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Indexed: 12/15/2022] Open
Abstract
Despite strong pharmacological support, association studies using genotype-predicted phenotype as a variable have yielded conflicting or inconclusive evidence to promote personalized pharmacotherapy. Unless the patient is a genotypic poor metabolizer, imputation of patient's metabolic capacity (or metabolic phenotype), a major factor in drug exposure-related clinical response, is a complex and highly challenging task because of limited number of alleles interrogated, population-specific differences in allele frequencies, allele-specific substrate-selectivity and importantly, phenoconversion mediated by co-medications and inflammatory co-morbidities that modulate the functional activity of drug metabolizing enzymes. Furthermore, metabolic phenotype and clinical outcomes are not binary functions; there is large intragenotypic and intraindividual variability. Therefore, the ability of association studies to identify relationships between genotype and clinical outcomes can be greatly enhanced by determining phenotype measures of study participants and/or by therapeutic drug monitoring to correlate drug concentrations with genotype and actual metabolic phenotype. To facilitate improved analysis and reporting of association studies, we propose acronyms with the prefixes ‘g’ (genotype-predicted phenotype) and ‘m’ (measured metabolic phenotype) to better describe this important variable of the study subjects. Inclusion of actually measured metabolic phenotype, and when appropriate therapeutic drug monitoring, promises to reveal relationships that may not be detected by using genotype alone as the variable.
Affiliation(s)
- Andrea Gaedigk
- Clinical Pharmacology, Toxicology & Therapeutic Innovation, Children's Mercy-Kansas City, 2401 Gillham Rd, Kansas City, MO 64108, USA
- School of Medicine, University of Missouri-Kansas City, MO, USA
- Adrián LLerena
- CICAB Clinical Research Centre, Extremadura University Hospital & Medical School, Badajoz, Spain
- Michel Eichelbaum
- Dr. Margarete Fischer-Bosch-Institut für Klinische Pharmakologie, Auerbachstr. 112, 70376 Stuttgart, Germany
- Julia Stingl
- Centre for Translational Medicine, University of Bonn Medical School, Bonn, Germany
- Robert L Smith
- Department of Surgery & Cancer, Faculty of Medicine, Imperial College, South Kensington Campus, London, UK
13
Razavian N, Blecker S, Schmidt AM, Smith-McLallen A, Nigam S, Sontag D. Population-Level Prediction of Type 2 Diabetes From Claims Data and Analysis of Risk Factors. Big Data 2015; 3:277-287. [PMID: 27441408 DOI: 10.1089/big.2015.0020] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.8] [Indexed: 06/06/2023]
Abstract
We present a new approach to population health, in which data-driven predictive models are learned for outcomes such as type 2 diabetes. Our approach enables risk assessment from readily available electronic claims data on large populations, without additional screening cost. Proposed model uncovers early and late-stage risk factors. Using administrative claims, pharmacy records, healthcare utilization, and laboratory results of 4.1 million individuals between 2005 and 2009, an initial set of 42,000 variables were derived that together describe the full health status and history of every individual. Machine learning was then used to methodically enhance predictive variable set and fit models predicting onset of type 2 diabetes in 2009-2011, 2010-2012, and 2011-2013. We compared the enhanced model with a parsimonious model consisting of known diabetes risk factors in a real-world environment, where missing values are common and prevalent. Furthermore, we analyzed novel and known risk factors emerging from the model at different age groups at different stages before the onset. Parsimonious model using 21 classic diabetes risk factors resulted in area under ROC curve (AUC) of 0.75 for diabetes prediction within a 2-year window following the baseline. The enhanced model increased the AUC to 0.80, with about 900 variables selected as predictive (p < 0.0001 for differences between AUCs). Similar improvements were observed for models predicting diabetes onset 1-3 years and 2-4 years after baseline. The enhanced model improved positive predictive value by at least 50% and identified novel surrogate risk factors for type 2 diabetes, such as chronic liver disease (odds ratio [OR] 3.71), high alanine aminotransferase (OR 2.26), esophageal reflux (OR 1.85), and history of acute bronchitis (OR 1.45). Liver risk factors emerge later in the process of diabetes development compared with obesity-related factors such as hypertension and high hemoglobin A1c. 
In conclusion, population-level risk prediction for type 2 diabetes using readily available administrative data is feasible and has better prediction performance than classical diabetes risk prediction algorithms on very large populations with missing data. The new model enables intervention allocation at national scale quickly and accurately and recovers potentially novel risk factors at different stages before the disease onset.
Affiliation(s)
- Narges Razavian
- Department of Computer Science, New York University, New York, New York
- Saul Blecker
- Department of Population Health, NYU Langone Medical Center, New York University, New York, New York
- Ann Marie Schmidt
- Department of Medicine, Department of Biochemistry and Molecular Pharmacology, Department of Pathology Medicine, and Diabetes Research Program, NYU Langone Medical Center, New York University, New York, New York
- Somesh Nigam
- Advanced Analytics, Independence Blue Cross, Philadelphia, Pennsylvania
- David Sontag
- Department of Computer Science, New York University, New York, New York
14
Zheng K, Vydiswaran VGV, Liu Y, Wang Y, Stubbs A, Uzuner Ö, Gururaj AE, Bayer S, Aberdeen J, Rumshisky A, Pakhomov S, Liu H, Xu H. Ease of adoption of clinical natural language processing software: An evaluation of five systems. J Biomed Inform 2015; 58 Suppl:S189-S196. [PMID: 26210361 PMCID: PMC4974203 DOI: 10.1016/j.jbi.2015.07.008] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 02/16/2015] [Revised: 06/09/2015] [Accepted: 07/06/2015] [Indexed: 12/19/2022]
Abstract
OBJECTIVE In recognition of potential barriers that may inhibit the widespread adoption of biomedical software, the 2014 i2b2 Challenge introduced a special track, Track 3 - Software Usability Assessment, in order to develop a better understanding of the adoption issues that might be associated with the state-of-the-art clinical NLP systems. This paper reports the ease of adoption assessment methods we developed for this track, and the results of evaluating five clinical NLP system submissions. MATERIALS AND METHODS A team of human evaluators performed a series of scripted adoptability test tasks with each of the participating systems. The evaluation team consisted of four "expert evaluators" with training in computer science, and eight "end user evaluators" with mixed backgrounds in medicine, nursing, pharmacy, and health informatics. We assessed how easy it is to adopt the submitted systems along the following three dimensions: communication effectiveness (i.e., how effective a system is in communicating its designed objectives to intended audience), effort required to install, and effort required to use. We used a formal software usability testing tool, TURF, to record the evaluators' interactions with the systems and 'think-aloud' data revealing their thought processes when installing and using the systems and when resolving unexpected issues. RESULTS Overall, the ease of adoption ratings that the five systems received are unsatisfactory. Installation of some of the systems proved to be rather difficult, and some systems failed to adequately communicate their designed objectives to intended adopters. Further, the average ratings provided by the end user evaluators on ease of use and ease of interpreting output are -0.35 and -0.53, respectively, indicating that this group of users generally deemed the systems extremely difficult to work with. 
While the ratings provided by the expert evaluators are higher, 0.6 and 0.45, respectively, these ratings are still low, indicating that they also experienced considerable struggles. DISCUSSION The results of the Track 3 evaluation show that the adoptability of the five participating clinical NLP systems has a great margin for improvement. Remedy strategies suggested by the evaluators included (1) more detailed and operating system-specific use instructions; (2) provision of more pertinent onscreen feedback for easier diagnosis of problems; (3) including screen walk-throughs in use instructions so users know what to expect and what might have gone wrong; (4) avoiding jargon and acronyms in materials intended for end users; and (5) packaging prerequisites required within software distributions so that prospective adopters of the software do not have to obtain each of the third-party components on their own.
Affiliation(s)
- Kai Zheng
- School of Public Health Department of Health Management and Policy, University of Michigan, Ann Arbor, MI, USA; School of Information, University of Michigan, Ann Arbor, MI, USA
- V G Vinod Vydiswaran
- Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, MI, USA
- Yang Liu
- School of Information, University of Michigan, Ann Arbor, MI, USA
- Yue Wang
- Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA
- Amber Stubbs
- School of Library and Information Science, Simmons College, Boston, MA, USA
- Özlem Uzuner
- Department of Information Studies, University at Albany, SUNY, Albany, NY, USA
- Anupama E Gururaj
- The University of Texas School of Biomedical Informatics at Houston, Houston, TX, USA
- Anna Rumshisky
- Department of Computer Science, University of Massachusetts, Lowell, MA, USA
- Hongfang Liu
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Hua Xu
- The University of Texas School of Biomedical Informatics at Houston, Houston, TX, USA
15
Wei WQ, Teixeira PL, Mo H, Cronin RM, Warner JL, Denny JC. Combining billing codes, clinical notes, and medications from electronic health records provides superior phenotyping performance. J Am Med Inform Assoc 2015; 23:e20-7. [PMID: 26338219 DOI: 10.1093/jamia/ocv130] [Citation(s) in RCA: 126] [Impact Index Per Article: 12.6] [Received: 01/08/2015] [Accepted: 07/15/2015] [Indexed: 02/06/2023] Open
Abstract
OBJECTIVE To evaluate the phenotyping performance of three major electronic health record (EHR) components: International Classification of Disease (ICD) diagnosis codes, primary notes, and specific medications. MATERIALS AND METHODS We conducted the evaluation using de-identified Vanderbilt EHR data. We preselected ten diseases: atrial fibrillation, Alzheimer's disease, breast cancer, gout, human immunodeficiency virus infection, multiple sclerosis, Parkinson's disease, rheumatoid arthritis, and types 1 and 2 diabetes mellitus. For each disease, patients were classified into seven categories based on the presence of evidence in diagnosis codes, primary notes, and specific medications. Twenty-five patients per disease category (a total number of 175 patients for each disease, 1750 patients for all ten diseases) were randomly selected for manual chart review. Review results were used to estimate the positive predictive value (PPV), sensitivity, and F-score for each EHR component alone and in combination. RESULTS The PPVs of single components were inconsistent and inadequate for accurately phenotyping (0.06-0.71). Using two or more ICD codes improved the average PPV to 0.84. We observed a more stable and higher accuracy when using at least two components (mean ± standard deviation: 0.91 ± 0.08). Primary notes offered the best sensitivity (0.77). The sensitivity of ICD codes was 0.67. Again, two or more components provided a reasonably high and stable sensitivity (0.59 ± 0.16). Overall, the best performance (F-score: 0.70 ± 0.12) was achieved by using two or more components. Although the overall performance of using ICD codes (0.67 ± 0.14) was only slightly lower than using two or more components, its PPV (0.71 ± 0.13) is substantially worse (0.91 ± 0.08). CONCLUSION Multiple EHR components provide a more consistent and higher performance than a single one for the selected phenotypes. 
We suggest considering multiple EHR components for future phenotyping design in order to obtain an ideal result.
Affiliation(s)
- Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
- Pedro L Teixeira
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
- Huan Mo
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
- Robert M Cronin
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA; Department of Medicine, Vanderbilt University, Nashville, TN, USA
- Jeremy L Warner
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA; Department of Medicine, Vanderbilt University, Nashville, TN, USA
- Joshua C Denny
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA; Department of Medicine, Vanderbilt University, Nashville, TN, USA
16
Extracting research-quality phenotypes from electronic health records to support precision medicine. Genome Med 2015; 7:41. [PMID: 25937834 PMCID: PMC4416392 DOI: 10.1186/s13073-015-0166-y] [Citation(s) in RCA: 158] [Impact Index Per Article: 15.8] [Indexed: 12/16/2022] Open
Abstract
The convergence of two rapidly developing technologies - high-throughput genotyping and electronic health records (EHRs) - gives scientists an unprecedented opportunity to utilize routine healthcare data to accelerate genomic discovery. Institutions and healthcare systems have been building EHR-linked DNA biobanks to enable such a vision. However, the precise extraction of detailed disease and drug-response phenotype information hidden in EHRs is not an easy task. EHR-based studies have successfully replicated known associations, made new discoveries for diseases and drug response traits, rapidly contributed cases and controls to large meta-analyses, and demonstrated the potential of EHRs for broad-based phenome-wide association studies. In this review, we summarize the advantages and challenges of repurposing EHR data for genetic research. We also highlight recent notable studies and novel approaches to provide an overview of advanced EHR-based phenotyping.
17
Wang RS, Maron BA, Loscalzo J. Systems medicine: evolution of systems biology from bench to bedside. Wiley Interdiscip Rev Syst Biol Med 2015; 7:141-61. [PMID: 25891169 DOI: 10.1002/wsbm.1297] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 12/19/2014] [Revised: 03/04/2015] [Accepted: 03/06/2015] [Indexed: 12/11/2022]
Abstract
High-throughput experimental techniques for generating genomes, transcriptomes, proteomes, metabolomes, and interactomes have provided unprecedented opportunities to interrogate biological systems and human diseases on a global level. Systems biology integrates the mass of heterogeneous high-throughput data and predictive computational modeling to understand biological functions as system-level properties. Most human diseases are biological states caused by multiple components of perturbed pathways and regulatory networks rather than individual failing components. Systems biology not only facilitates basic biological research but also provides new avenues through which to understand human diseases, identify diagnostic biomarkers, and develop disease treatments. At the same time, systems biology seeks to assist in drug discovery, drug optimization, drug combinations, and drug repositioning by investigating the molecular mechanisms of action of drugs at a system's level. Indeed, systems biology is evolving to systems medicine as a new discipline that aims to offer new approaches for addressing the diagnosis and treatment of major human diseases uniquely, effectively, and with personalized precision.
Affiliation(s)
- Rui-Sheng Wang
- Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Bradley A Maron
- Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Department of Cardiology, Veterans Affairs Boston Healthcare System, West Roxbury, MA, USA
- Joseph Loscalzo
- Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
18
Carroll RJ, Eyler AE, Denny JC. Intelligent use and clinical benefits of electronic health records in rheumatoid arthritis. Expert Rev Clin Immunol 2015; 11:329-37. [PMID: 25660652 DOI: 10.1586/1744666x.2015.1009895] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 12/31/2022]
Abstract
In the past 10 years, electronic health records (EHRs) have had growing impact in clinical care. EHRs efficiently capture and reuse clinical information, which can directly benefit patient care by guiding treatments and providing effective reminders for best practices. The increased adoption has also led to more complex implementations, including robust, disease-specific tools, such as for rheumatoid arthritis (RA). In addition, the data collected through normal clinical care is also used in secondary research, helping to refine patient treatment for the future. Although few studies have directly demonstrated benefits for direct clinical care of RA, the opposite is true for EHR-based research - RA has been a particularly fertile ground for clinical and genomic research that has typically leveraged advanced informatics methods to accurately define RA populations. We discuss the clinical impact of EHRs in RA treatment and their impact on secondary research, and provide recommendations for improved utility in future EHR installations.
Affiliation(s)
- Robert J Carroll
- Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
|
19
|
Abstract
Clinicians already face "personalized" medicine every day while experiencing the great variation in toxicities and drug efficacy among individual patients. Pharmacogenetics studies are the platform for discovering the DNA determinants of variability in drug response and tolerability. Research now focuses on the genome after its beginning with analyses of single genes. Therapeutic outcomes from several psychotropic drugs have been weakly linked to specific genetic variants without independent replication. Drug side effects show stronger associations to genetic variants, including human leukocyte antigen loci with carbamazepine-induced dermatologic outcome and MC4R with atypical antipsychotic weight gain. Clinical implementation has proven challenging, with barriers including a lack of replicable prospective evidence for clinical utility required for altering medical care. More recent studies show promising approaches for reducing these barriers to routine incorporation of pharmacogenetics data into clinical care.
|
20
|
Ben-Assuli O. Electronic health records, adoption, quality of care, legal and privacy issues and their implementation in emergency departments. Health Policy 2014; 119:287-97. [PMID: 25483873 DOI: 10.1016/j.healthpol.2014.11.014] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Received: 04/28/2014] [Revised: 11/06/2014] [Accepted: 11/21/2014] [Indexed: 11/26/2022]
Abstract
Recently, the healthcare sector has shown a growing interest in information technologies. Two popular health IT (HIT) products are the electronic health record (EHR) and health information exchange (HIE) networks. The introduction of these tools is believed to improve care, but has also raised some important questions and legal and privacy issues. The implementation of these systems has not gone smoothly, and still faces some considerable barriers. This article reviews EHR and HIE to address these obstacles, and analyzes the current state of development and adoption in various countries around the world. Moreover, legal and ethical concerns that may be encountered by EHR users and purchasers are reviewed. Finally, links and interrelations between EHR and HIE and several quality of care issues in today's healthcare domain are examined with a focus on EHR and HIE in the emergency department (ED), whose unique characteristics make it an environment in which the implementation of such technology may be a major contributor to health, but also faces substantial challenges. The paper ends with a discussion of specific policy implications and recommendations based on an examination of the current limitations of these systems.
Affiliation(s)
- Ofir Ben-Assuli
- Ono Academic College, Faculty of Business Administration, 104 Zahal Street, 55000 Kiryat Ono, Israel.
|
21
|
Rasmussen LV, Thompson WK, Pacheco JA, Kho AN, Carrell DS, Pathak J, Peissig PL, Tromp G, Denny JC, Starren JB. Design patterns for the development of electronic health record-driven phenotype extraction algorithms. J Biomed Inform 2014; 51:280-6. [PMID: 24960203 DOI: 10.1016/j.jbi.2014.06.007] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.8] [Received: 12/23/2013] [Revised: 05/27/2014] [Accepted: 06/16/2014] [Indexed: 12/22/2022]
Abstract
Background: Design patterns, in the context of software development and ontologies, provide generalized approaches and guidance to solving commonly occurring problems, or addressing common situations typically informed by intuition, heuristics and experience. While the biomedical literature contains broad coverage of specific phenotype algorithm implementations, no work to date has attempted to generalize common approaches into design patterns, which may then be distributed to the informatics community to efficiently develop more accurate phenotype algorithms. Methods: Using phenotyping algorithms stored in the Phenotype KnowledgeBase (PheKB), we conducted an independent iterative review to identify recurrent elements within the algorithm definitions. We extracted and generalized recurrent elements in these algorithms into candidate patterns. The authors then assessed the candidate patterns for validity by group consensus, and annotated them with attributes. Results: A total of 24 electronic Medical Records and Genomics (eMERGE) phenotypes available in PheKB as of 1/25/2013 were downloaded and reviewed. From these, a total of 21 phenotyping patterns were identified, which are available as an online data supplement. Conclusions: Repeatable patterns within phenotyping algorithms exist, and when codified and cataloged may help to educate both experienced and novice algorithm developers. The dissemination and application of these patterns has the potential to decrease the time to develop algorithms, while improving portability and accuracy.
Affiliation(s)
- Luke V Rasmussen
- Feinberg School of Medicine, Northwestern University, Chicago, IL, United States
- Will K Thompson
- Feinberg School of Medicine, Northwestern University, Chicago, IL, United States; Center for Biomedical Research Informatics, NorthShore University HealthSystem, Evanston, IL, United States
- Jennifer A Pacheco
- Feinberg School of Medicine, Northwestern University, Chicago, IL, United States
- Abel N Kho
- Feinberg School of Medicine, Northwestern University, Chicago, IL, United States
- Jyotishman Pathak
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Peggy L Peissig
- Marshfield Clinic Research Foundation, Marshfield, WI, United States
- Gerard Tromp
- Sigfried and Janet Weis Center for Research, Geisinger Health System, Danville, PA, United States
- Joshua C Denny
- Departments of Biomedical Informatics and Medicine, Vanderbilt University, Nashville, TN, United States
- Justin B Starren
- Feinberg School of Medicine, Northwestern University, Chicago, IL, United States
|
22
|
Hall JB, Dumitrescu L, Dilks HH, Crawford DC, Bush WS. Accuracy of administratively-assigned ancestry for diverse populations in an electronic medical record-linked biobank. PLoS One 2014; 9:e99161. [PMID: 24896101 PMCID: PMC4045967 DOI: 10.1371/journal.pone.0099161] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 01/12/2014] [Accepted: 05/12/2014] [Indexed: 11/19/2022] Open
Abstract
Recently, the development of biobanks linked to electronic medical records has presented new opportunities for genetic and epidemiological research. Studies based on these resources, however, present unique challenges, including the accurate assignment of individual-level population ancestry. In this work we examine the accuracy of administratively-assigned race in diverse populations by comparing assigned races to genetically-defined ancestry estimates. Using 220 ancestry informative markers, we generated principal components for patients in our dataset, which were used to cluster patients into groups based on genetic ancestry. Consistent with other studies, we find a strong overall agreement (Kappa = 0.872) between genetic ancestry and assigned race, with higher rates of agreement for African-descent and European-descent assignments, and reduced agreement for Hispanic, East Asian-descent, and South Asian-descent assignments. These results suggest caution when selecting study samples of non-African and non-European backgrounds when administratively-assigned race from biobanks is used.
Affiliation(s)
- Jacob B. Hall
- Center for Human Genetics Research, Vanderbilt University, Nashville, Tennessee, United States of America
- Logan Dumitrescu
- Center for Human Genetics Research, Vanderbilt University, Nashville, Tennessee, United States of America
- Holli H. Dilks
- Vanderbilt Technologies for Advanced Genomics (VANTAGE), Vanderbilt University, Nashville, Tennessee, United States of America
- Dana C. Crawford
- Center for Human Genetics Research, Vanderbilt University, Nashville, Tennessee, United States of America
- William S. Bush
- Center for Human Genetics Research, Vanderbilt University, Nashville, Tennessee, United States of America
|
23
|
Wei WQ, Feng Q, Weeke P, Bush W, Waitara MS, Iwuchukwu OF, Roden DM, Wilke RA, Stein CM, Denny JC. Creation and Validation of an EMR-based Algorithm for Identifying Major Adverse Cardiac Events while on Statins. AMIA Jt Summits Transl Sci Proc 2014; 2014:112-9. [PMID: 25717410 PMCID: PMC4333709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
Statin medications are often prescribed to ameliorate a patient's risk of cardiovascular events due in part to cholesterol reduction. We developed and evaluated an algorithm that can accurately identify subjects with major adverse cardiac events (MACE) while on statins using electronic medical record (EMR) data. The algorithm also identifies subjects experiencing their first MACE while on statins for primary prevention. The algorithm achieved 90% to 97% PPVs in identification of MACE cases as compared against physician review. By applying the algorithm to EMR data in BioVU, cases and controls were identified and used subsequently to replicate known associations with eight genetic variants. We replicated 6/8 previously reported genetic associations with cardiovascular diseases or lipid metabolism disorders. Our results demonstrated that the algorithm can be used to accurately identify subjects with MACE and MACE while on statins. Consequently, future studies can be conducted to investigate and validate the relationship between statins and MACE using real-world clinical data.
Affiliation(s)
- Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN
- Qiping Feng
- Division of Clinical Pharmacology, Vanderbilt University School of Medicine, Nashville, TN
- Peter Weeke
- Division of Clinical Pharmacology, Vanderbilt University School of Medicine, Nashville, TN
- William Bush
- Center for Human Genetics Research, Vanderbilt University Medical Center, Nashville, TN
- Magarya S. Waitara
- Division of Clinical Pharmacology, Vanderbilt University School of Medicine, Nashville, TN
- Otito F. Iwuchukwu
- Division of Clinical Pharmacology, Vanderbilt University School of Medicine, Nashville, TN
- Dan M. Roden
- Division of Clinical Pharmacology, Vanderbilt University School of Medicine, Nashville, TN; Oates Institute for Experimental Therapeutics, Vanderbilt University, Nashville, TN; Office of Personalized Medicine, Vanderbilt University, Nashville, TN
- Charles M Stein
- Division of Clinical Pharmacology, Vanderbilt University School of Medicine, Nashville, TN
- Joshua C. Denny
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN
|
24
|
Shivade C, Raghavan P, Fosler-Lussier E, Embi PJ, Elhadad N, Johnson SB, Lai AM. A review of approaches to identifying patient phenotype cohorts using electronic health records. J Am Med Inform Assoc 2013; 21:221-30. [PMID: 24201027 PMCID: PMC3932460 DOI: 10.1136/amiajnl-2013-001935] [Citation(s) in RCA: 316] [Impact Index Per Article: 26.3] [Indexed: 01/11/2023] Open
Abstract
Objective: To summarize literature describing approaches aimed at automatically identifying patients with a common phenotype. Materials and methods: We performed a review of studies describing systems or reporting techniques developed for identifying cohorts of patients with specific phenotypes. Every full text article published in (1) Journal of American Medical Informatics Association, (2) Journal of Biomedical Informatics, (3) Proceedings of the Annual American Medical Informatics Association Symposium, and (4) Proceedings of Clinical Research Informatics Conference within the past 3 years was assessed for inclusion in the review. Only articles using automated techniques were included. Results: Ninety-seven articles met our inclusion criteria. Forty-six used natural language processing (NLP)-based techniques, 24 described rule-based systems, 41 used statistical analyses, data mining, or machine learning techniques, while 22 described hybrid systems. Nine articles described the architecture of large-scale systems developed for determining cohort eligibility of patients. Discussion: We observe that there is a rise in the number of studies associated with cohort identification using electronic medical records. Statistical analyses or machine learning, followed by NLP techniques, are gaining popularity over the years in comparison with rule-based systems. Conclusions: There are a variety of approaches for classifying patients into a particular phenotype. Different techniques and data sources are used, and good performance is reported on datasets at respective institutions. However, no system makes comprehensive use of electronic medical records addressing all of their known weaknesses.
Affiliation(s)
- Chaitanya Shivade
- Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio, USA
|
25
|
Kannry JM, Williams MS. Integration of genomics into the electronic health record: mapping terra incognita. Genet Med 2013; 15:757-60. [PMID: 24097178 PMCID: PMC4157459 DOI: 10.1038/gim.2013.102] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 06/15/2013] [Accepted: 06/17/2013] [Indexed: 12/31/2022] Open
Abstract
Successfully realizing the vision of genomic medicine will require management of large amounts of complex data. The electronic health record (EHR) is destined to play a critical role in the translation of genomic information into clinical care. The papers in this special issue explore the challenges associated with the implementation of genomics in the EHR. The proposed solutions are meant to provide guidance for those responsible for moving genomics into the clinic.
Affiliation(s)
- Joseph M. Kannry
- EMR Clinical Transformation Group, Mount Sinai Medical Center, New York, NY, USA
- Marc S. Williams
- Genomic Medicine Institute, Geisinger Health System, Danville, PA, USA
|
26
|
Vandamme D, Fitzmaurice W, Kholodenko B, Kolch W. Systems medicine: helping us understand the complexity of disease. QJM 2013; 106:891-5. [PMID: 23904523 DOI: 10.1093/qjmed/hct163] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 11/14/2022] Open
Abstract
Advances in genomics and other -omic fields in the last decade have resulted in unprecedented volumes of complex data now being available. These data can enable physicians to provide their patients with care that is more personalized, predictive, preventive and participatory. The expertise required to manage and understand this data is to be found in fields outside of medical science, thus multidisciplinary collaboration coupled to a systems approach is key to unlocking its potential, with concomitant new ways of working. Systems medicine can build on the successes in the field of systems biology, recognizing the human body as the multidimensional network of networks that it is. While systems medicine can provide a conceptual and theoretical framework, its practical goal is to provide physicians the tools necessary for harnessing the rapid advances in basic biomedical science into their routine clinical arsenal.
Affiliation(s)
- D Vandamme
- Systems Biology Ireland, University College Dublin, Belfield, Dublin 4, Ireland.
|
27
|
Characterization of schizophrenia adverse drug interactions through a network approach and drug classification. Biomed Res Int 2013; 2013:458989. [PMID: 24089679 PMCID: PMC3782118 DOI: 10.1155/2013/458989] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 07/05/2013] [Accepted: 08/08/2013] [Indexed: 11/17/2022]
Abstract
Antipsychotic drugs are medications commonly used for schizophrenia (SCZ) treatment and fall into two groups: typical and atypical. SCZ patients have multiple comorbidities, and the coadministration of drugs is quite common. This may result in adverse drug-drug interactions, which are events that occur when the effect of a drug is altered by the coadministration of another drug. Therefore, it is important to provide a comprehensive view of these interactions for further coadministration improvement. Here, we extracted SCZ drugs and their adverse drug interactions from DrugBank and compiled an SCZ-specific adverse drug interaction network. This network included 28 SCZ drugs, 241 non-SCZ drugs, and 991 interactions. By integrating the Anatomical Therapeutic Chemical (ATC) classification with the network analysis, we characterized those interactions. Our results indicated that SCZ drugs tended to have more adverse drug interactions than other drugs. Furthermore, typical SCZ drugs had significant interactions with drugs of the "alimentary tract and metabolism" category, while atypical SCZ drugs had significant interactions with drugs of the categories "nervous system" and "antiinfectives for systemic uses." This study is the first to characterize the adverse drug interactions in the course of SCZ treatment and might provide useful information for future SCZ treatment.
|
28
|
Gurwitz D, McLeod HL. Genome-wide studies in pharmacogenomics: harnessing the power of extreme phenotypes. Pharmacogenomics 2013; 14:337-9. [PMID: 23438876 DOI: 10.2217/pgs.13.35] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 02/01/2023] Open
Affiliation(s)
- David Gurwitz
- Department of Human Molecular Genetics & Biochemistry, Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv 69978, Israel.
|
29
|
Ury AG. Storing and interpreting genomic information in widely deployed electronic health record systems. Genet Med 2013; 15:779-85. [DOI: 10.1038/gim.2013.111] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 04/12/2013] [Accepted: 06/24/2013] [Indexed: 01/19/2023] Open
|
30
|
Maltais S, Joggerst SJ, Hatzopoulos A, DiSalvo TG, Zhao D, Sung HJ, Wang X, Byrne JG, Naftilan AJ. Stem cell therapy for chronic heart failure: an updated appraisal. Expert Opin Biol Ther 2013; 13:503-16. [PMID: 23289619 DOI: 10.1517/14712598.2013.749852] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 01/06/2023]
Abstract
Introduction: Significant advances have been made to understand the mechanisms involved in cardiac cell-based therapies. The early translational application of basic science knowledge has led to several animal and human clinical trials. The initial promising beneficial effect of stem cells on cardiac function restoration has been eclipsed by the inability of animal studies to translate into sustained clinical improvements in human clinical trials. Areas covered: In this review, the authors cover an updated overview of various stem cell populations used in chronic heart failure. A critical review of clinical trials conducted in advanced heart failure patients is proposed, and finally promising avenues for developments in the field of cardiac cell-based therapies are presented. Expert opinion: Several questions remain unanswered, and this limits our ability to understand basic mechanisms involved in stem cell therapeutics. Human studies have revealed critical unresolved issues. Further elucidation of the proper timing, mode of delivery and prosurvival factors is imperative, if the field is to advance. The limited benefits seen to date are simply not enough if the potential for substantial recovery of nonfunctioning myocardium is to be realized.
|
31
|
Roden DM. Cardiovascular pharmacogenomics: the future of cardiovascular therapeutics? Can J Cardiol 2013; 29:58-66. [PMID: 23200096 PMCID: PMC3529768 DOI: 10.1016/j.cjca.2012.07.845] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 05/13/2012] [Revised: 07/17/2012] [Accepted: 07/31/2012] [Indexed: 01/08/2023] Open
Abstract
Responses to drug therapy vary from benefit to no effect to adverse effects which can be serious or occasionally fatal. Increasing evidence supports the idea that genetic variants can play a major role in this spectrum of responses. Well-studied examples in cardiovascular therapeutics include predictors of steady-state warfarin dosage, predictors of reduced efficacy among patients receiving clopidogrel for drug eluting stents, and predictors of some serious adverse drug effects. This review summarizes contemporary approaches to identifying and validating genetic predictors of variability in response to drug treatment. Approaches to incorporating this new knowledge into clinical care, and the barriers to this concept, are addressed.
Affiliation(s)
- Dan M Roden
- Departments of Medicine and Pharmacology, Vanderbilt University School of Medicine, Nashville, TN, USA.
|
32
|
McGregor TL, Van Driest SL, Brothers KB, Bowton EA, Muglia LJ, Roden DM. Inclusion of pediatric samples in an opt-out biorepository linking DNA to de-identified medical records: pediatric BioVU. Clin Pharmacol Ther 2012; 93:204-11. [PMID: 23281421 DOI: 10.1038/clpt.2012.230] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 11/09/2022]
Abstract
The Vanderbilt DNA repository, BioVU, links DNA from leftover clinical blood samples to de-identified electronic medical records (EMRs). After initiating adult sample collection, pediatric extension required consideration of ethical concerns specific to pediatrics and implementation of specialized DNA extraction methods. In the first year of pediatric sample collection, more than 11,000 samples from individuals younger than 18 years were included. We compared data from the pediatric BioVU cohort with those from the overall Vanderbilt University Medical Center pediatric population and found similar demographic characteristics; however, the BioVU cohort had higher rates of select diseases, medication exposures, and laboratory testing, demonstrating enriched representation of severe or chronic disease. The fact that the sample accumulation is not balanced may accelerate research in some cohorts while limiting the study of relatively benign conditions and the accrual of unaffected and unbiased control samples. BioVU represents a feasible model for pediatric DNA biobanking but involves both ethical and practical considerations specific to the pediatric population.
Affiliation(s)
- T L McGregor
- Department of Pediatrics, Vanderbilt University and the Monroe Carell Jr. Children's Hospital at Vanderbilt, Nashville, Tennessee, USA
|
33
|
Gurwitz D. High-Quality Phenomics are Crucial for Informative Omics Studies. Drug Dev Res 2012. [DOI: 10.1002/ddr.21025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- David Gurwitz
- Department of Human Molecular Genetics and Biochemistry; Sackler Faculty of Medicine; Tel-Aviv University; Tel-Aviv; 69978; Israel
|
34
|
Abstract
A new generation of technologies commonly named omics permits assessment of the entirety of the components of biological systems and produces an explosion of data and a major shift in our concepts of disease. These technologies will likely shape the future of health care. One aspect of these advances is that the data generated document the uniqueness of each human being in regard to disease risk and treatment response. These developments have reemphasized the concept of personalized medicine. Here we review the impact of omics technologies on one key aspect of personalized medicine: the individual drug response. We describe how knowledge of different omics may affect treatment decisions, namely drug choice and drug dose, and how it can be used to improve clinical outcomes.
Affiliation(s)
- Urs A Meyer
- Division of Pharmacology and Neurobiology, Biozentrum of the University of Basel, CH-4056 Basel, Switzerland.
|
35
|
Azuaje F. Drug interaction networks: an introduction to translational and clinical applications. Cardiovasc Res 2012; 97:631-41. [DOI: 10.1093/cvr/cvs289] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 12/21/2022] Open
|