51
Hulme OL, Khurshid S, Weng LC, Anderson CD, Wang EY, Ashburner JM, Ko D, McManus DD, Benjamin EJ, Ellinor PT, Trinquart L, Lubitz SA. Development and Validation of a Prediction Model for Atrial Fibrillation Using Electronic Health Records. JACC Clin Electrophysiol 2019; 5:1331-1341. [PMID: 31753441 PMCID: PMC6884135 DOI: 10.1016/j.jacep.2019.07.016] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Received: 06/11/2019] [Accepted: 07/22/2019] [Indexed: 02/07/2023]
Abstract
OBJECTIVES: This study sought to determine whether the risk of atrial fibrillation (AF) can be estimated accurately by using routinely ascertained features in the electronic health record (EHR) and whether AF risk is associated with stroke. BACKGROUND: Early diagnosis of AF and treatment with anticoagulation may prevent strokes. METHODS: Using a multi-institutional EHR, this study identified 412,085 individuals 45 to 95 years of age without prevalent AF between 2000 and 2014. A prediction model was derived and validated for 5-year AF risk by using split-sample validation, and model performance was compared with other methods of AF risk assessment. RESULTS: Within 5 years, 14,334 individuals developed AF. In the derivation sample (7,216 AF events of 206,042 total), the optimal risk model included sex, age, race, smoking, height, weight, diastolic blood pressure, hypertension, hyperlipidemia, heart failure, coronary heart disease, valvular disease, prior stroke, peripheral arterial disease, chronic kidney disease, hypothyroidism, and quadratic terms for height, weight, and age. In the validation sample (7,118 AF events of 206,043 total), the AF risk model demonstrated good discrimination (C-statistic: 0.777; 95% confidence interval [CI]: 0.771 to 0.783) and calibration (0.99; 95% CI: 0.96 to 1.01). Model discrimination and calibration were superior to CHARGE-AF (Cohorts for Heart and Aging Research in Genomic Epidemiology AF) (C-statistic: 0.753; 95% CI: 0.747 to 0.759; calibration slope: 0.72; 95% CI: 0.71 to 0.74), C2HEST (Coronary artery disease / chronic obstructive pulmonary disease; Hypertension; Elderly [age ≥75 years]; Systolic heart failure; Thyroid disease [hyperthyroidism]) (C-statistic: 0.754; 95% CI: 0.747 to 0.762; calibration slope: 0.44; 95% CI: 0.43 to 0.45), and CHA2DS2-VASc (Congestive heart failure, Hypertension, Age ≥75 years, Diabetes mellitus, Prior stroke, transient ischemic attack [TIA], or thromboembolism, Vascular disease, Age 65-74 years, Sex category [female]) scores (C-statistic: 0.702; 95% CI: 0.693 to 0.710; calibration slope: 0.37; 95% CI: 0.36 to 0.38). AF risk discriminated incident stroke (n = 4,814; C-statistic: 0.684; 95% CI: 0.677 to 0.692) and stroke within 90 days of incident AF (n = 327; C-statistic: 0.789; 95% CI: 0.764 to 0.814). CONCLUSIONS: A model developed from a real-world EHR database predicted AF accurately and stratified stroke risk. Incorporating AF prediction into EHRs may enable risk-guided screening for AF.
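For readers unfamiliar with the C-statistic reported in this abstract, the following is a minimal sketch of the idea for a binary outcome. It is an illustration only: it ignores censoring and follow-up time, which a 5-year risk model would normally have to handle, and the function name `c_statistic` and the toy inputs are assumptions, not anything taken from the paper.

```python
def c_statistic(risks, events):
    """Concordance for a binary outcome (equivalent to the ROC AUC).

    Returns the fraction of (event, non-event) pairs in which the subject
    with the event received the higher predicted risk. Ties count as half.
    Simplifying assumption: no censoring, outcome fully observed.
    """
    cases = [r for r, e in zip(risks, events) if e]
    controls = [r for r, e in zip(risks, events) if not e]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0
            elif case_risk == control_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical predicted risks and observed outcomes (1 = developed AF).
toy_risks = [0.9, 0.8, 0.3, 0.2]
toy_events = [1, 0, 1, 0]
print(c_statistic(toy_risks, toy_events))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.777 should be read.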
Affiliation(s)
- Olivia L Hulme
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, Massachusetts
- Shaan Khurshid
  - Cardiology Division, Massachusetts General Hospital, Boston, Massachusetts
- Lu-Chen Weng
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, Massachusetts
- Christopher D Anderson
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts; J.P. Kistler Stroke Research Center, Massachusetts General Hospital, Boston, Massachusetts
- Elizabeth Y Wang
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, Massachusetts
- Jeffrey M Ashburner
  - Division of General Internal Medicine, Massachusetts General Hospital, Boston, Massachusetts; Department of Medicine, Harvard Medical School, Boston, Massachusetts
- Darae Ko
  - Department of Medicine, Boston University Medical Center, Boston, Massachusetts
- David D McManus
  - Departments of Medicine and Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, Massachusetts
- Emelia J Benjamin
  - Boston University and National Heart, Lung, and Blood Institute Framingham Heart Study, Framingham, Massachusetts; Sections of Preventive Medicine and Cardiovascular Medicine, Department of Medicine, Boston University School of Medicine, and Department of Epidemiology, Boston University School of Public Health, Boston, Massachusetts
- Patrick T Ellinor
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, Massachusetts; Cardiac Arrhythmia Service, Massachusetts General Hospital, Boston, Massachusetts
- Ludovic Trinquart
  - Boston University and National Heart, Lung, and Blood Institute Framingham Heart Study, Framingham, Massachusetts; Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts
- Steven A Lubitz
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, Massachusetts; Cardiac Arrhythmia Service, Massachusetts General Hospital, Boston, Massachusetts
52
Bruehl S, Gamazon ER, Van de Ven T, Buchheit T, Walsh CG, Mishra P, Ramanujan K, Shaw A. DNA methylation profiles are associated with complex regional pain syndrome after traumatic injury. Pain 2019; 160:2328-2337. [PMID: 31145213 PMCID: PMC7473388 DOI: 10.1097/j.pain.0000000000001624] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 11/26/2022]
Abstract
Factors contributing to development of complex regional pain syndrome (CRPS) are not fully understood. This study examined possible epigenetic mechanisms that may contribute to CRPS after traumatic injury. DNA methylation profiles were compared between individuals developing CRPS (n = 9) and those developing non-CRPS neuropathic pain (n = 38) after undergoing amputation following military trauma. Linear Models for Microarray (LIMMA) analyses revealed 48 differentially methylated cytosine-phosphate-guanine dinucleotide (CpG) sites between groups (unadjusted P's < 0.005), with the top gene COL11A1 meeting Bonferroni-adjusted P < 0.05. The second largest differential methylation was observed for the HLA-DRB6 gene, an immune-related gene linked previously to CRPS in a small gene expression study. For all but 7 of the significant CpG sites, the CRPS group was hypomethylated. Numerous functional Gene Ontology-Biological Process categories were significantly enriched (false discovery rate-adjusted q value <0.15), including multiple immune-related categories (eg, activation of immune response, immune system development, regulation of immune system processes, and antigen processing and presentation). Differentially methylated genes were more highly connected in human protein-protein networks than expected by chance (P < 0.05), supporting the biological relevance of the findings. Results were validated in an independent sample linking a DNA biobank with electronic health records (n = 126 CRPS phenotype, n = 19,768 non-CRPS chronic pain phenotype). Analyses using PrediXcan methodology indicated differences in the genetically determined component of gene expression in 7 of 48 genes identified in methylation analyses (P's < 0.02). Results suggest that immune- and inflammatory-related factors might confer risk of developing CRPS after traumatic injury. Validation findings demonstrate the potential of using electronic health records linked to DNA for genomic studies of CRPS.
Affiliation(s)
- Stephen Bruehl
  - Department of Anesthesiology, Vanderbilt University Medical Center, Nashville, TN, United States. Mr. Shaw is now with Department of Anesthesiology and Pain Medicine, University of Alberta, Edmonton, AB, Canada
- Eric R. Gamazon
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, United States
  - Department of Anesthesiology, Clare Hall, University of Cambridge, Cambridge, United Kingdom
- Thomas Van de Ven
  - Department of Anesthesiology, Duke University Medical Center, Durham, NC, United States
- Thomas Buchheit
  - Department of Anesthesiology, Duke University Medical Center, Durham, NC, United States
- Colin G. Walsh
  - Departments of Medicine and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
- Puneet Mishra
  - Department of Anesthesiology, Vanderbilt University Medical Center, Nashville, TN, United States. Mr. Shaw is now with Department of Anesthesiology and Pain Medicine, University of Alberta, Edmonton, AB, Canada
- Krishnan Ramanujan
  - Department of Anesthesiology, Vanderbilt University Medical Center, Nashville, TN, United States. Mr. Shaw is now with Department of Anesthesiology and Pain Medicine, University of Alberta, Edmonton, AB, Canada
- Andrew Shaw
  - Department of Anesthesiology, Vanderbilt University Medical Center, Nashville, TN, United States. Mr. Shaw is now with Department of Anesthesiology and Pain Medicine, University of Alberta, Edmonton, AB, Canada
53
Oni-Orisan A, Hoffmann TJ, Ranatunga D, Medina MW, Jorgenson E, Schaefer C, Krauss RM, Iribarren C, Risch N. Characterization of Statin Low-Density Lipoprotein Cholesterol Dose-Response Using Electronic Health Records in a Large Population-Based Cohort. CIRCULATION-GENOMIC AND PRECISION MEDICINE 2019; 11:e002043. [PMID: 30354326 DOI: 10.1161/circgen.117.002043] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Indexed: 12/30/2022]
Abstract
BACKGROUND: Low-density lipoprotein cholesterol (LDL-C) response to statin therapy has not been fully elucidated in real-world populations. The primary objective of this study was to characterize statin LDL-C dose-response and its heritability in a large, multiethnic population of statin users. METHODS: We determined the effect of statin dosing on lipid measures utilizing electronic health records in 33,139 statin users from the Kaiser Permanente GERA cohort (Genetic Epidemiology Research on Adult Health and Aging). The relationship between statin defined daily dose and lipid parameter response (percent change) was determined. RESULTS: Defined daily dose and LDL-C response were associated in a log-linear relationship (β, -6.17; SE, 0.09; P<10⁻³⁰⁰), which remained significant after adjusting for prespecified covariates (adjusted β, -5.59; SE, 0.12; P<10⁻³⁰⁰). Statin type, sex, age, smoking status, diabetes mellitus, and East Asian race/ethnicity were significant independent predictors of statin-induced changes in LDL-C. Based on a variance-component method within the subset of statin users who had at least 1 first-degree relative who was also a statin user (n=1036), heritability of statin LDL-C response was estimated at 11.7% (SE, 8.6%; P=0.087). CONCLUSIONS: Using electronic health record data, we observed a statin LDL-C dose-response consistent with the rule of 6% from prior clinical trial data. Clinical and demographic predictors of statin LDL-C response exhibited highly significant but modest effects. Finally, statin-induced changes in LDL-C were not found to be strongly inherited. Ultimately, these findings demonstrate (1) the utility of electronic health records as a reliable source to generate robust phenotypes for pharmacogenomic research and (2) the potential role of statin precision medicine in lipid management.
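The log-linear dose-response described here can be illustrated with a small least-squares fit of percent LDL-C change against the base-2 logarithm of dose. This is a sketch under stated assumptions: the abstract does not say which logarithm base was used, so treating the slope as "percent change per doubling of dose" is an assumption made to match the rule of 6%, and the dose and response values below are synthetic numbers invented for the example, not study data.

```python
import math


def fit_log2_dose_response(doses, pct_changes):
    """Ordinary least squares of percent LDL-C change on log2(dose).

    Returns (intercept, slope). Under the log2 assumption, the slope is the
    additional percent change in LDL-C for each doubling of the dose.
    """
    if len(doses) != len(pct_changes) or len(doses) < 2:
        raise ValueError("need two or more paired observations")
    x = [math.log2(d) for d in doses]
    x_mean = sum(x) / len(x)
    y_mean = sum(pct_changes) / len(pct_changes)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, pct_changes))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Synthetic example: doses in units of defined daily dose, responses in
# percent change from baseline, generated from an exact 6%-per-doubling line.
synthetic_doses = [0.5, 1.0, 2.0, 4.0]
synthetic_pct_change = [-24.0, -30.0, -36.0, -42.0]
intercept, slope = fit_log2_dose_response(synthetic_doses, synthetic_pct_change)
print(intercept, slope)
```

On this reading, the reported β of -6.17 would mean roughly six additional percentage points of LDL-C lowering per doubling of dose, which is how the abstract's reference to the rule of 6% is usually understood.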
Affiliation(s)
- Akinyemi Oni-Orisan
  - Department of Clinical Pharmacy (A.O.), University of California, San Francisco, CA; Institute for Human Genetics (A.O., T.J.H., N.R.), University of California, San Francisco, CA
- Thomas J Hoffmann
  - Institute for Human Genetics (A.O., T.J.H., N.R.), University of California, San Francisco, CA; Department of Epidemiology and Biostatistics (T.J.H., C.I., N.R.), University of California, San Francisco, CA
- Dilrini Ranatunga
  - Kaiser Permanente Northern California Division of Research, Oakland, CA (D.R., E.J., C.S., C.I., N.R.)
- Marisa W Medina
  - Children's Hospital Oakland Research Institute, Oakland, CA (M.W.M., R.M.K.)
- Eric Jorgenson
  - Kaiser Permanente Northern California Division of Research, Oakland, CA (D.R., E.J., C.S., C.I., N.R.)
- Catherine Schaefer
  - Kaiser Permanente Northern California Division of Research, Oakland, CA (D.R., E.J., C.S., C.I., N.R.)
- Ronald M Krauss
  - Department of Medicine (R.M.K.), University of California, San Francisco, CA; Children's Hospital Oakland Research Institute, Oakland, CA (M.W.M., R.M.K.)
- Carlos Iribarren
  - Department of Epidemiology and Biostatistics (T.J.H., C.I., N.R.), University of California, San Francisco, CA; Kaiser Permanente Northern California Division of Research, Oakland, CA (D.R., E.J., C.S., C.I., N.R.)
- Neil Risch
  - Institute for Human Genetics (A.O., T.J.H., N.R.), University of California, San Francisco, CA; Department of Epidemiology and Biostatistics (T.J.H., C.I., N.R.), University of California, San Francisco, CA; Kaiser Permanente Northern California Division of Research, Oakland, CA (D.R., E.J., C.S., C.I., N.R.)
54
Barnado A, Carroll RJ, Casey C, Wheless L, Denny JC, Crofford LJ. Phenome-Wide Association Studies Uncover a Novel Association of Increased Atrial Fibrillation in Male Patients With Systemic Lupus Erythematosus. Arthritis Care Res (Hoboken) 2019; 70:1630-1636. [PMID: 29481723 DOI: 10.1002/acr.23553] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 07/24/2017] [Accepted: 02/20/2018] [Indexed: 12/15/2022]
Abstract
OBJECTIVE: Phenome-wide association studies (PheWAS) scan across billing codes in the electronic health record (EHR) and re-purpose clinical EHR data for research. In this study, we examined whether PheWAS could function as an EHR-based discovery tool for systemic lupus erythematosus (SLE) and identified novel clinical associations in male versus female patients with SLE. METHODS: We used a de-identified version of the Vanderbilt University Medical Center EHR, which includes more than 2.8 million subjects. We performed EHR-based PheWAS to compare SLE patients with age-, sex-, and race-matched control subjects and to compare male SLE patients with female SLE patients, controlling for multiple testing using a false discovery rate (FDR) P value of 0.05. RESULTS: We identified 1,097 patients with SLE and 5,735 matched control subjects. In a comparison of patients with SLE and matched controls, SLE patients were shown to be more likely to have International Classification of Diseases, Ninth Revision codes related to the SLE disease criteria. In the PheWAS of male versus female SLE patients, with adjustment for age and race, male patients were shown to be more likely to have atrial fibrillation (odds ratio 4.50, false discovery rate P = 3.23 × 10⁻³). Chart review confirmed atrial fibrillation, with the majority of patients developing atrial fibrillation after the SLE diagnosis and having multiple risk factors for atrial fibrillation. After adjustment for age, sex, race, and coronary artery disease, SLE disease status was shown to be significantly associated with atrial fibrillation (P = 0.002). CONCLUSION: Using PheWAS to compare male and female patients with SLE, we identified a novel association of an increased incidence of atrial fibrillation in male patients. SLE disease status was shown to be independently associated with atrial fibrillation, even after adjustment for age, sex, race, and coronary artery disease. These results demonstrate the utility of PheWAS as an EHR-based discovery tool for SLE.
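Because a PheWAS tests many billing codes at once, the abstract controls the false discovery rate at 0.05. The abstract does not name the procedure used, so the sketch below assumes the common Benjamini-Hochberg step-up method; the function name and the example P values are hypothetical and serve only to show the mechanics.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure (assumed here, not confirmed
    by the abstract). Returns one boolean per input P value, True where
    the test is declared significant at FDR level q.
    """
    m = len(p_values)
    # Indices of the P values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k whose P value satisfies p_(k) <= (k / m) * q.
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_rank = rank

    # Every test ranked at or below that k is rejected.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            rejected[idx] = True
    return rejected


# Hypothetical P values for four phenotype codes.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

The step-up rule means a P value that misses its own threshold can still be rejected if a larger P value further down the ordering meets its threshold.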
Affiliation(s)
- April Barnado
  - Vanderbilt University Medical Center, Nashville, Tennessee
- Carolyn Casey
  - Lehigh Valley Health Network, Allentown, Pennsylvania
- Lee Wheless
  - Vanderbilt University Medical Center, Nashville, Tennessee
- Joshua C Denny
  - Vanderbilt University Medical Center, Nashville, Tennessee
55
Nelson CA, Butte AJ, Baranzini SE. Integrating biomedical research and electronic health records to create knowledge-based biologically meaningful machine-readable embeddings. Nat Commun 2019; 10:3045. [PMID: 31292438 PMCID: PMC6620318 DOI: 10.1038/s41467-019-11069-0] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Received: 01/15/2019] [Accepted: 06/18/2019] [Indexed: 12/16/2022]
Abstract
To advance precision medicine, detailed clinical features ought to be described in a way that leverages current knowledge. Although data collected from biomedical research is expanding at an almost exponential rate, our ability to transform that information into patient care has not kept pace. A major barrier preventing this transformation is that multi-dimensional data collection and analysis is usually carried out without much understanding of the underlying knowledge structure. Here, in an effort to bridge this gap, Electronic Health Records (EHRs) of individual patients are connected to a heterogeneous knowledge network called Scalable Precision Medicine Oriented Knowledge Engine (SPOKE). Then an unsupervised machine-learning algorithm creates Propagated SPOKE Entry Vectors (PSEVs) that encode the importance of each SPOKE node for any code in the EHRs. We argue that these results, alongside the natural integration of PSEVs into any EHR machine-learning platform, provide a key step toward precision medicine.
Affiliation(s)
- Charlotte A Nelson
  - Integrated Program in Quantitative Biology, University of California San Francisco, San Francisco, CA, USA
- Atul J Butte
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
- Sergio E Baranzini
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, Department of Neurology, University of California San Francisco, San Francisco, CA, USA
56
Dennis J, Yengo-Kahn AM, Kirby P, Solomon GS, Cox NJ, Zuckerman SL. Diagnostic Algorithms to Study Post-Concussion Syndrome Using Electronic Health Records: Validating a Method to Capture an Important Patient Population. J Neurotrauma 2019; 36:2167-2177. [PMID: 30773988 DOI: 10.1089/neu.2018.5916] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 02/04/2023]
Abstract
Post-concussion syndrome (PCS) is characterized by persistent cognitive, somatic, and emotional symptoms after a mild traumatic brain injury (mTBI). Genetic and other biological variables may contribute to PCS etiology, and the emergence of biobanks linked to electronic health records (EHRs) offers new opportunities for research on PCS. We sought to validate the EHR data of PCS patients by comparing two diagnostic algorithms deployed in the Vanderbilt University Medical Center de-identified database of 2.8 million patient EHRs. The algorithms identified individuals with PCS by: 1) natural language processing (NLP) of narrative text in the EHR combined with structured demographic, diagnostic, and encounter data; or 2) coded billing and procedure data. The predictive value of each algorithm was assessed, and cases and controls identified by each approach were compared on demographic and medical characteristics. The NLP algorithm identified 507 cases and 10,857 controls. The negative predictive value in controls was 78% and the positive predictive value (PPV) in cases was 82%. Conversely, the coded algorithm identified 1142 patients with two or more PCS billing codes and had a PPV of 76%. Comparisons of PCS controls to both case groups recovered known epidemiology of PCS: cases were more likely than controls to be female and to have pre-morbid diagnoses of anxiety, migraine, and post-traumatic stress disorder. In contrast, controls and cases were equally likely to have attention deficit hyperactive disorder and learning disabilities, in accordance with the findings of recent systematic reviews of PCS risk factors. We conclude that EHRs are a valuable research tool for PCS. Ascertainment based on coded data alone had a predictive value comparable to an NLP algorithm, recovered known PCS risk factors, and maximized the number of included patients.
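The predictive values quoted in this abstract come from comparing algorithm labels against a reference standard. The sketch below shows the arithmetic only. The abstract reports the percentages but not the underlying review counts, so the counts used here are hypothetical numbers chosen to reproduce 82% and 78%, and the function names are illustrative.

```python
def positive_predictive_value(true_positives, false_positives):
    """Share of algorithm-identified cases confirmed as true cases."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no algorithm-identified cases were reviewed")
    return true_positives / flagged


def negative_predictive_value(true_negatives, false_negatives):
    """Share of algorithm-identified controls confirmed as true non-cases."""
    cleared = true_negatives + false_negatives
    if cleared == 0:
        raise ValueError("no algorithm-identified controls were reviewed")
    return true_negatives / cleared


# Hypothetical review of 50 algorithm cases and 50 algorithm controls;
# these counts are NOT from the paper.
print(positive_predictive_value(true_positives=41, false_positives=9))
print(negative_predictive_value(true_negatives=39, false_negatives=11))
```

Both quantities depend on how common the condition is in the reviewed sample, so values from one database do not transfer automatically to another.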
Affiliation(s)
- Jessica Dennis
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee
- Aaron M Yengo-Kahn
  - Vanderbilt Sports Concussion Center, Vanderbilt University School of Medicine, Nashville, Tennessee; Department of Neurological Surgery, Vanderbilt University School of Medicine, Nashville, Tennessee
- Paul Kirby
  - Vanderbilt Sports Concussion Center, Vanderbilt University School of Medicine, Nashville, Tennessee
- Gary S Solomon
  - Vanderbilt Sports Concussion Center, Vanderbilt University School of Medicine, Nashville, Tennessee; Department of Neurological Surgery, Vanderbilt University School of Medicine, Nashville, Tennessee
- Nancy J Cox
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee
- Scott L Zuckerman
  - Vanderbilt Sports Concussion Center, Vanderbilt University School of Medicine, Nashville, Tennessee; Department of Neurological Surgery, Vanderbilt University School of Medicine, Nashville, Tennessee
57
Yu S, Ma Y, Gronsbell J, Cai T, Ananthakrishnan AN, Gainer VS, Churchill SE, Szolovits P, Murphy SN, Kohane IS, Liao KP, Cai T. Enabling phenotypic big data with PheNorm. J Am Med Inform Assoc 2019; 25:54-60. [PMID: 29126253 DOI: 10.1093/jamia/ocx111] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Received: 06/19/2017] [Accepted: 09/14/2017] [Indexed: 01/20/2023]
Abstract
Objective: Electronic health record (EHR)-based phenotyping infers whether a patient has a disease based on the information in his or her EHR. A human-annotated training set with gold-standard disease status labels is usually required to build an algorithm for phenotyping based on a set of predictive features. The time intensiveness of annotation and feature curation severely limits the ability to achieve high-throughput phenotyping. While previous studies have successfully automated feature curation, annotation remains a major bottleneck. In this paper, we present PheNorm, a phenotyping algorithm that does not require expert-labeled samples for training. Methods: The most predictive features, such as the number of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes or mentions of the target phenotype, are normalized to resemble a normal mixture distribution with high area under the receiver operating curve (AUC) for prediction. The transformed features are then denoised and combined into a score for accurate disease classification. Results: We validated the accuracy of PheNorm with 4 phenotypes: coronary artery disease, rheumatoid arthritis, Crohn's disease, and ulcerative colitis. The AUCs of the PheNorm score reached 0.90, 0.94, 0.95, and 0.94 for the 4 phenotypes, respectively, which were comparable to the accuracy of supervised algorithms trained with sample sizes of 100-300, with no statistically significant difference. Conclusion: The accuracy of the PheNorm algorithms is on par with algorithms trained with annotated samples. PheNorm fully automates the generation of accurate phenotyping algorithms and demonstrates the capacity for EHR-driven annotations to scale to the next level: phenotypic big data.
Affiliation(s)
- Sheng Yu
  - Center for Statistical Science, Tsinghua University, Beijing, China; Department of Industrial Engineering, Tsinghua University, Beijing, China
- Yumeng Ma
  - Department of Mathematical Sciences, Tsinghua University, Beijing, China
- Jessica Gronsbell
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Tianrun Cai
  - Department of Radiology, Brigham and Women's Hospital, Boston, MA, USA
- Vivian S Gainer
  - Research Information Science and Computing, Partners HealthCare, Charlestown, MA, USA
- Susanne E Churchill
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Peter Szolovits
  - Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, MA, USA
- Shawn N Murphy
  - Research Information Science and Computing, Partners HealthCare, Charlestown, MA, USA; Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
- Isaac S Kohane
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Katherine P Liao
  - Department of Medicine, Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Boston, MA, USA
- Tianxi Cai
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
58
Pendergrass SA, Crawford DC. Using Electronic Health Records To Generate Phenotypes For Research. CURRENT PROTOCOLS IN HUMAN GENETICS 2019; 100:e80. [PMID: 30516347 PMCID: PMC6318047 DOI: 10.1002/cphg.80] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Indexed: 12/19/2022]
Abstract
Electronic health records contain patient-level data collected during and for clinical care. Data within the electronic health record include diagnostic billing codes, procedure codes, vital signs, laboratory test results, clinical imaging, and physician notes. With repeated clinic visits, these data are longitudinal, providing important information on disease development, progression, and response to treatment or intervention strategies. The near universal adoption of electronic health records nationally has the potential to provide population-scale real-world clinical data accessible for biomedical research, including genetic association studies. For this research potential to be realized, high-quality research-grade variables must be extracted from these clinical data warehouses. We describe here common and emerging electronic phenotyping approaches applied to electronic health records, as well as current limitations of both the approaches and the biases associated with these clinically collected data that impact their use in research. © 2018 by John Wiley & Sons, Inc.
Affiliation(s)
- Sarah A. Pendergrass
  - Biomedical and Translational Informatics Institute, Geisinger Research, Rockville, MD
- Dana C. Crawford
  - Institute for Computational Biology, Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH
59
Saad MN, Mabrouk MS, Eldeib AM, Shaker OG. Comparative study for haplotype block partitioning methods - Evidence from chromosome 6 of the North American Rheumatoid Arthritis Consortium (NARAC) dataset. PLoS One 2018; 13:e0209603. [PMID: 30596705 PMCID: PMC6312333 DOI: 10.1371/journal.pone.0209603] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/08/2018] [Accepted: 12/07/2018] [Indexed: 11/19/2022]
Abstract
Haplotype-based methods compete with "one-SNP-at-a-time" approaches for preference in association studies. Chromosome 6 contains most of the known genetic biomarkers for rheumatoid arthritis (RA) and therefore serves as a benchmark for testing haplotype methods. The aim of this study is to test the North American Rheumatoid Arthritis Consortium (NARAC) dataset to find out whether haplotype block methods or single-locus approaches alone can sufficiently provide the significant single nucleotide polymorphisms (SNPs) associated with RA. In addition, could we be satisfied with only one of the haplotype block methods for partitioning chromosome 6 of the NARAC dataset? In the NARAC dataset, chromosome 6 comprises 35,574 SNPs for 2,062 individuals (868 cases, 1,194 controls). An individual SNP approach and three haplotype block methods were applied to the NARAC dataset to identify the RA biomarkers. The three haplotype partitioning methods were the confidence interval test (CIT), four gamete test (FGT), and solid spine of linkage disequilibrium (SSLD). P-values after stringent Bonferroni correction for multiple testing were measured to assess the strength of association between the genetic variants and RA susceptibility. Moreover, the block size (in base pairs (bp) and number of SNPs included), number of blocks, percentage of SNPs not covered by the block method, percentage of significant blocks out of the total number of blocks, and number of significant haplotypes and SNPs were used to compare the three haplotype block methods. The individual SNP, CIT, FGT, and SSLD methods detected 432, 1,086, 1,099, and 1,322 associated SNPs, respectively. Each method identified significant SNPs that were not detected by any other method (individual SNP: 12, FGT: 37, CIT: 55, and SSLD: 189 SNPs). A total of 916 SNPs were discovered by all three haplotype block methods, and 367 SNPs were discovered by the haplotype block methods and the individual SNP approach. The P-values of these 367 SNPs were lower than those of the SNPs uniquely detected by only one method. The 367 SNPs detected by all the methods represent promising candidates for RA susceptibility. They should be further investigated for the European population. A hybrid technique including the four methods should be applied to detect the significant SNPs associated with RA for chromosome 6 of the NARAC dataset. Moreover, the SSLD method may be preferred when only one method is to be selected.
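The Bonferroni correction mentioned in this abstract divides the significance level by the number of tests. The sketch below illustrates it. The abstract does not state the number of tests or the significance level used for each method (block-based methods test blocks or haplotypes rather than single SNPs), so using the 35,574 SNP count with a level of 0.05 is an assumption for illustration, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def bonferroni_significant(p_values, alpha=0.05, n_tests=None):
    """Flag P values that stay significant after Bonferroni correction.

    n_tests defaults to the number of P values supplied; pass it explicitly
    when only a subset of the tested P values is being examined.
    """
    m = len(p_values) if n_tests is None else n_tests
    threshold = bonferroni_threshold(alpha, m)
    return [p < threshold for p in p_values]


# Illustrative threshold if every one of the 35,574 SNPs were tested singly
# at alpha = 0.05 (an assumption, not a figure from the paper).
print(bonferroni_threshold(0.05, 35574))

# Small made-up example with three tests.
print(bonferroni_significant([0.001, 0.02, 0.04]))
```

Bonferroni controls the chance of any false positive, which makes it more conservative than false discovery rate methods when tests are correlated, as neighboring SNPs usually are.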
Affiliation(s)
- Mohamed N. Saad
  - Biomedical Engineering Department, Faculty of Engineering, Minia University, Minia, Egypt
- Mai S. Mabrouk
  - Biomedical Engineering Department, Faculty of Engineering, Misr University for Science and Technology (MUST), 6th of October City, Egypt
- Ayman M. Eldeib
  - Systems and Biomedical Engineering Department, Faculty of Engineering, Cairo University, Giza, Egypt
- Olfat G. Shaker
  - Medical Biochemistry and Molecular Biology Department, Faculty of Medicine, Cairo University, Cairo, Egypt
60
Xie S, Himes BE. Approaches to Link Geospatially Varying Social, Economic, and Environmental Factors with Electronic Health Record Data to Better Understand Asthma Exacerbations. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018; 2018:1561-1570. [PMID: 30815202 PMCID: PMC6371292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Electronic health record (EHR)-derived data has become an invaluable resource for biomedical research, but is seldom used for the study of the health impacts of social and environmental factors due in part to the unavailability of relevant variables. We describe how EHR-derived data can be enhanced via linking of external sources of social, economic and environmental data when patient-related geospatial information is available, and we illustrate an approach to better understand the geospatial patterns of asthma exacerbation rates in Philadelphia. Specifically, we relate the spatial distribution of asthma exacerbations observed in EHR-derived data to that of known and potential risk factors (i.e., economic deprivation, crime, vehicular traffic, tree cover). Areas of highest risk based on integrated social and environmental data were consistent with an area with increased asthma exacerbations, demonstrating that data external to the EHR can enhance our understanding of negative health-related outcomes.
Affiliation(s)
- Sherrie Xie
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Blanca E Himes
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
|
61
|
Henderson J, He H, Malin BA, Denny JC, Kho AN, Ghosh J, Ho JC. Phenotyping through Semi-Supervised Tensor Factorization (PSST). AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018; 2018:564-573. [PMID: 30815097 PMCID: PMC6371355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
A computational phenotype is a set of clinically relevant and interesting characteristics that describe patients with a given condition. Various machine learning methods have been proposed to derive phenotypes in an automatic, high-throughput manner. Among these methods, computational phenotyping through tensor factorization has been shown to produce clinically interesting phenotypes. However, few of these methods incorporate auxiliary patient information into the phenotype derivation process. In this work, we introduce Phenotyping through Semi-Supervised Tensor Factorization (PSST), a method that leverages disease status knowledge about subsets of patients to generate computational phenotypes from tensors constructed from the electronic health records of patients. We demonstrate the potential of PSST to uncover predictive and clinically interesting computational phenotypes through case studies focusing on type-2 diabetes and resistant hypertension. PSST yields more discriminative phenotypes compared to the unsupervised methods and more meaningful phenotypes compared to a supervised method.
Affiliation(s)
- Huan He
- Emory University, Atlanta, GA
|
62
|
Barnado A, Carroll RJ, Casey C, Wheless L, Denny JC, Crofford LJ. Phenome-wide association study identifies dsDNA as a driver of major organ involvement in systemic lupus erythematosus. Lupus 2018; 28:66-76. [PMID: 30477398 DOI: 10.1177/0961203318815577] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/20/2022]
Abstract
In systemic lupus erythematosus (SLE), dsDNA antibodies are associated with renal disease. Less is known about comorbidities in patients without dsDNA or other autoantibodies. Using an electronic health record (EHR) SLE cohort, we employed a phenome-wide association study (PheWAS) that scans across billing codes to compare comorbidities in SLE patients with and without autoantibodies. We used our validated algorithm to identify SLE subjects. Autoantibody status was defined as ever positive for dsDNA, RNP, Smith, SSA and SSB. PheWAS was performed in antibody positive vs. negative SLE patients adjusting for age and race and using a false discovery rate of 0.05. We identified 1097 SLE subjects. In the PheWAS of dsDNA positive vs. negative subjects, dsDNA positive subjects were more likely to have nephritis (p = 2.33 × 10⁻⁹) and renal failure (p = 1.85 × 10⁻⁵). After adjusting for sex, race, age and other autoantibodies, dsDNA was independently associated with nephritis and chronic kidney disease. Subjects negative for dsDNA, RNP, SSA and SSB were all more likely to have billing codes for sleep, pain and mood disorders. PheWAS uncovered a hierarchy within SLE-specific autoantibodies with dsDNA having the greatest impact on major organ involvement.
Affiliation(s)
- A Barnado
- Department of Medicine, Vanderbilt University Medical Center, Nashville, USA
- R J Carroll
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, USA
- C Casey
- Department of Medicine, Lehigh Valley Health Network, Allentown, USA
- L Wheless
- Department of Dermatology, Vanderbilt University Medical Center, Nashville, USA
- J C Denny
- Department of Medicine, Vanderbilt University Medical Center, Nashville, USA
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, USA
- L J Crofford
- Department of Medicine, Vanderbilt University Medical Center, Nashville, USA
|
63
|
Smoller JW. The use of electronic health records for psychiatric phenotyping and genomics. Am J Med Genet B Neuropsychiatr Genet 2018; 177:601-612. [PMID: 28557243 PMCID: PMC6440216 DOI: 10.1002/ajmg.b.32548] [Citation(s) in RCA: 76] [Impact Index Per Article: 10.9] [Received: 03/19/2017] [Accepted: 04/20/2017] [Indexed: 12/22/2022]
Abstract
The widespread adoption of electronic health records (EHRs) in healthcare systems has created a vast and continuously growing resource of clinical data and provides new opportunities for population-based research. In particular, the linking of EHRs to biospecimens and genomic data in biobanks may help address what has become a rate-limiting step for genetic research: the need for large sample sizes. The principal roadblock to capitalizing on these resources is the need to establish the validity of phenotypes extracted from the EHR. For psychiatric genetic research, this represents a particular challenge given that diagnosis is based on patient reports and clinician observations that may not be well-captured in billing codes or narrative records. This review addresses the opportunities and pitfalls in EHR-based phenotyping with a focus on their application to psychiatric genetic research. A growing number of studies have demonstrated that diagnostic algorithms with high positive predictive value can be derived from EHRs, especially when structured data are supplemented by text mining approaches. Such algorithms enable semi-automated phenotyping for large-scale case-control studies. In addition, the scale and scope of EHR databases have been used successfully to identify phenotypic subgroups and derive algorithms for longitudinal risk prediction. EHR-based genomics are particularly well-suited to rapid look-up replication of putative risk genes, studies of pleiotropy (phenome-wide association studies or PheWAS), investigations of genetic networks and overlap across the phenome, and pharmacogenomic research. EHR phenotyping has been relatively under-utilized in psychiatric genomic research but may become a key component of efforts to advance precision psychiatry.
Affiliation(s)
- Jordan W. Smoller
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Department of Psychiatry, Massachusetts General Hospital, Boston, MA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA
|
64
|
Dang LC, Samanez-Larkin GR, Smith CT, Castrellon JJ, Perkins SF, Cowan RL, Claassen DO, Zald DH. FTO affects food cravings and interacts with age to influence age-related decline in food cravings. Physiol Behav 2018; 192:188-193. [PMID: 29233619 PMCID: PMC5994171 DOI: 10.1016/j.physbeh.2017.12.013] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 09/29/2017] [Revised: 12/04/2017] [Accepted: 12/08/2017] [Indexed: 12/31/2022]
Abstract
The fat mass and obesity associated gene (FTO) was the first gene identified by genome-wide association studies to correlate with higher body mass index (BMI) and increased odds of obesity. FTO remains the locus with the largest and most replicated effect on body weight, but the mechanism whereby FTO affects body weight and the development of obesity is not fully understood. Here we tested whether FTO is associated with differences in food cravings and a key aspect of dopamine function that has been hypothesized to influence food reward mechanisms. Moreover, as food cravings and dopamine function are known to decline with age, we explored effects of age on relations between FTO and food cravings and dopamine function. Seventy-eight healthy subjects between 22 and 83 years old completed the Food Cravings Questionnaire and underwent genotyping for FTO rs9939609, the first FTO single nucleotide polymorphism associated with obesity. Compared to TT homozygotes, individuals carrying the obesity-susceptible A allele had higher total food cravings, which correlated with higher BMI. Additionally, food cravings declined with age, but this age effect differed across variants of FTO rs9939609: while TT homozygotes showed the typical age-related decline in food cravings, there was no such decline among A carriers. All subjects were scanned with [18F]fallypride PET to assess a recent proposal that at the neurochemical level FTO alters dopamine D2-like receptor (DRD2) function to influence food reward related mechanisms. However, we observed no evidence of FTO effects on DRD2 availability.
Affiliation(s)
- Linh C Dang
- Department of Psychology, Vanderbilt University, 219 Wilson Hall, 111 21st Avenue South, Nashville, TN 37203, United States
- Gregory R Samanez-Larkin
- Department of Psychology and Neuroscience, Duke University, 417 Chapel Drive, Campus Box 90086, Durham, NC 27708, United States
- Christopher T Smith
- Department of Psychology, Vanderbilt University, 219 Wilson Hall, 111 21st Avenue South, Nashville, TN 37203, United States
- Jaime J Castrellon
- Department of Psychology and Neuroscience, Duke University, 417 Chapel Drive, Campus Box 90086, Durham, NC 27708, United States
- Scott F Perkins
- Department of Psychology, Vanderbilt University, 219 Wilson Hall, 111 21st Avenue South, Nashville, TN 37203, United States
- Ronald L Cowan
- Department of Psychiatry and Behavioral Sciences, Vanderbilt University School of Medicine, 1601 23rd Ave South, Nashville, TN 37212, United States
- Department of Radiology and Radiological Sciences, Vanderbilt University Medical Center, 1211 Medical Center Drive, Nashville, TN 37232, United States
- Daniel O Claassen
- Department of Neurology, Vanderbilt University Medical Center, 1211 Medical Center Drive, Nashville, TN 37232, United States
- David H Zald
- Department of Psychology, Vanderbilt University, 219 Wilson Hall, 111 21st Avenue South, Nashville, TN 37203, United States
- Department of Psychiatry and Behavioral Sciences, Vanderbilt University School of Medicine, 1601 23rd Ave South, Nashville, TN 37212, United States
|
65
|
Song W, Huang H, Zhang CZ, Bates DW, Wright A. Using whole genome scores to compare three clinical phenotyping methods in complex diseases. Sci Rep 2018; 8:11360. [PMID: 30054501 PMCID: PMC6063939 DOI: 10.1038/s41598-018-29634-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 03/20/2018] [Accepted: 07/16/2018] [Indexed: 12/19/2022] Open
Abstract
Genome-wide association studies depend on accurate ascertainment of patient phenotype. However, phenotyping is difficult, and it is often treated as an afterthought in these studies because of the expense involved. Electronic health records (EHRs) may provide higher fidelity phenotypes for genomic research than other sources such as administrative data. We used whole genome association models to evaluate different EHR and administrative data-based phenotyping methods in a cohort of 16,858 Caucasian subjects for type 1 diabetes mellitus, type 2 diabetes mellitus, coronary artery disease and breast cancer. For each disease, we trained and evaluated polygenic models using three different phenotype definitions: phenotypes derived from billing data, the clinical problem list, or a curated phenotyping algorithm. We observed that for these diseases, the curated phenotype outperformed the problem list, and the problem list outperformed administrative billing data. This suggests that using advanced EHR-derived phenotypes can further increase the power of genome-wide association studies.
Affiliation(s)
- Wenyu Song
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, 02120, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, 02115, USA
- Hailiang Huang
- Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, 02114, USA
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, 02142, USA
- Cheng-Zhong Zhang
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, 02115, USA
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts, 02215, USA
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, 02142, USA
- David W Bates
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, 02120, USA
- Information Systems Department, Partners HealthCare, Somerville, Massachusetts, 02145, USA
- Adam Wright
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, 02120, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, 02115, USA
- Information Systems Department, Partners HealthCare, Somerville, Massachusetts, 02145, USA
|
66
|
Abstract
Biomedical data science has experienced an explosion of new data over the past decade. Abundant genetic and genomic data are increasingly available in large, diverse data sets due to the maturation of modern molecular technologies. Along with these molecular data, dense, rich phenotypic data are also available on comprehensive clinical data sets from health care provider organizations, clinical trials, population health registries, and epidemiologic studies. The methods and approaches for interrogating these large genetic/genomic and clinical data sets continue to evolve rapidly, as our understanding of the questions and challenges continues to emerge. In this review, the state-of-the-art methodologies for genetic/genomic analysis along with complex phenomics will be discussed. This field is changing and adapting to the novel data types made available, as well as technological advances in computation and machine learning. Thus, I will also discuss the future challenges in this exciting and innovative space. The promises of precision medicine rely heavily on the ability to marry complex genetic/genomic data with clinical phenotypes in meaningful ways.
Affiliation(s)
- Marylyn D. Ritchie
- Department of Genetics and Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
|
67
|
Banda JM, Seneviratne M, Hernandez-Boussard T, Shah NH. Advances in Electronic Phenotyping: From Rule-Based Definitions to Machine Learning Models. Annu Rev Biomed Data Sci 2018; 1:53-68. [PMID: 31218278 PMCID: PMC6583807 DOI: 10.1146/annurev-biodatasci-080917-013315] [Citation(s) in RCA: 123] [Impact Index Per Article: 17.6] [Indexed: 12/18/2022]
Abstract
With the widespread adoption of electronic health records (EHRs), large repositories of structured and unstructured patient data are becoming available to conduct observational studies. Finding patients with specific conditions or outcomes, known as phenotyping, is one of the most fundamental research problems encountered when using these new EHR data. Phenotyping forms the basis of translational research, comparative effectiveness studies, clinical decision support, and population health analyses using routinely collected EHR data. We review the evolution of electronic phenotyping, from the early rule-based methods to the cutting edge of supervised and unsupervised machine learning models. We aim to cover the most influential papers in commensurate detail, with a focus on both methodology and implementation. Finally, future research directions are explored.
Affiliation(s)
- Juan M Banda
- Stanford Center for Biomedical Informatics Research, Stanford, California 94305, USA
- Martin Seneviratne
- Stanford Center for Biomedical Informatics Research, Stanford, California 94305, USA
- Nigam H Shah
- Stanford Center for Biomedical Informatics Research, Stanford, California 94305, USA
|
68
|
Henderson J, Ke J, Ho JC, Ghosh J, Wallace BC. Phenotype Instance Verification and Evaluation Tool (PIVET): A Scaled Phenotype Evidence Generation Framework Using Web-Based Medical Literature. J Med Internet Res 2018; 20:e164. [PMID: 29728351 PMCID: PMC5960038 DOI: 10.2196/jmir.9610] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 12/14/2017] [Revised: 02/26/2018] [Accepted: 02/28/2018] [Indexed: 12/24/2022] Open
Abstract
Background Researchers are developing methods to automatically extract clinically relevant and useful patient characteristics from raw healthcare datasets. These characteristics, often capturing essential properties of patients with common medical conditions, are called computational phenotypes. Being generated by automated or semiautomated, data-driven methods, such potential phenotypes need to be validated as clinically meaningful (or not) before they are acceptable for use in decision making. Objective The objective of this study was to present Phenotype Instance Verification and Evaluation Tool (PIVET), a framework that uses co-occurrence analysis on an online corpus of publicly available medical journal articles to build clinical relevance evidence sets for user-supplied phenotypes. PIVET adopts a conceptual framework similar to the pioneering prototype tool PheKnow-Cloud that was developed for the phenotype validation task. PIVET completely refactors each part of the PheKnow-Cloud pipeline to deliver vast improvements in speed without sacrificing the quality of the insights PheKnow-Cloud achieved. Methods PIVET leverages indexing in NoSQL databases to efficiently generate evidence sets. Specifically, PIVET uses a succinct representation of the phenotypes that corresponds to the index on the corpus database and an optimized co-occurrence algorithm inspired by the Aho-Corasick algorithm. We compare PIVET’s phenotype representation with PheKnow-Cloud’s by using PheKnow-Cloud’s experimental setup. In PIVET’s framework, we also introduce a statistical model trained on domain expert–verified phenotypes to automatically classify phenotypes as clinically relevant or not. Additionally, we show how the classification model can be used to examine user-supplied phenotypes in an online, rather than batch, manner.
Results PIVET maintains the discriminative power of PheKnow-Cloud in terms of identifying clinically relevant phenotypes for the same corpus with which PheKnow-Cloud was originally developed, but PIVET’s analysis is an order of magnitude faster than that of PheKnow-Cloud. Not only is PIVET much faster, it can be scaled to a larger corpus and still retain speed. We evaluated multiple classification models on top of the PIVET framework and found ridge regression to perform best, realizing an average F1 score of 0.91 when predicting clinically relevant phenotypes. Conclusions Our study shows that PIVET improves on the most notable existing computational tool for phenotype validation in terms of speed and automation and is comparable in terms of accuracy.
Affiliation(s)
- Jette Henderson
- The University of Texas at Austin, Austin, TX, United States
- Junyuan Ke
- Emory University, Atlanta, GA, United States
- Joyce C Ho
- Emory University, Atlanta, GA, United States
- Joydeep Ghosh
- The University of Texas at Austin, Austin, TX, United States
|
69
|
Robinson JR, Wei WQ, Roden DM, Denny JC. Defining Phenotypes from Clinical Data to Drive Genomic Research. Annu Rev Biomed Data Sci 2018; 1:69-92. [PMID: 34109303 DOI: 10.1146/annurev-biodatasci-080917-013335] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 12/16/2022]
Abstract
The rise in available longitudinal patient information in electronic health records (EHRs) and their coupling to DNA biobanks has resulted in a dramatic increase in genomic research using EHR data for phenotypic information. EHRs have the benefit of providing a deep and broad data source of health-related phenotypes, including drug response traits, expanding the phenome available to researchers for discovery. The earliest efforts at repurposing EHR data for research involved manual chart review of limited numbers of patients but now typically involve applications of rule-based and machine learning algorithms operating on sometimes huge corpora for both genome-wide and phenome-wide approaches. We highlight here the current methods, impact, challenges, and opportunities for repurposing clinical data to define patient phenotypes for genomics discovery. Use of EHR data has proven a powerful method for elucidation of genomic influences on diseases, traits, and drug-response phenotypes and will continue to have increasing applications in large cohort studies.
Affiliation(s)
- Jamie R Robinson
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Department of General Surgery, Vanderbilt University Medical Center, Nashville, TN
- Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Dan M Roden
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
- Department of Pharmacology, Vanderbilt University Medical Center
- Joshua C Denny
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
|
70
|
Horsky J, Drucker EA, Ramelson HZ. Accuracy and Completeness of Clinical Coding Using ICD-10 for Ambulatory Visits. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018; 2017:912-920. [PMID: 29854158 PMCID: PMC5977598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
This study describes a simulation of diagnostic coding using an EHR. Twenty-three ambulatory clinicians were asked to enter appropriate codes for six standardized scenarios with two different EHRs. Their interactions with the query interface were analyzed for patterns and variations in search strategies and the resulting sets of entered codes for accuracy and completeness. Just over a half of entered codes were appropriate for a given scenario and about a quarter were omitted. Crohn's disease and diabetes scenarios had the highest rate of inappropriate coding and code variation. The omission rate was higher for secondary than for primary visit diagnoses. Codes for immunization, dialysis dependence and nicotine dependence were the most often omitted. We also found a high rate of variation in the search terms used to query the EHR for the same diagnoses. Changes to the training of clinicians and improved design of EHR query modules may lower the rate of inappropriate and omitted codes.
Affiliation(s)
- Jan Horsky
- Brigham and Women's Hospital, Div. of General Internal Medicine, Boston, MA
- Harvard Medical School, Boston, MA
- Elizabeth A Drucker
- Newton-Wellesley Hospital, Dept. of Radiology, Newton, MA
- Massachusetts General Hospital, Dept. of Radiology, Boston, MA
- Harvard Medical School, Boston, MA
- Harley Z Ramelson
- Brigham and Women's Hospital, Div. of General Internal Medicine, Boston, MA
- Harvard Medical School, Boston, MA
- Partners HealthCare, Inc., Boston, MA
|
71
|
Barnado A, Carroll RJ, Casey C, Wheless L, Denny JC, Crofford LJ. Phenome-wide association study identifies marked increased in burden of comorbidities in African Americans with systemic lupus erythematosus. Arthritis Res Ther 2018; 20:69. [PMID: 29636090 PMCID: PMC5894248 DOI: 10.1186/s13075-018-1561-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 01/23/2018] [Accepted: 03/06/2018] [Indexed: 01/08/2023] Open
Abstract
Background African Americans with systemic lupus erythematosus (SLE) have increased renal disease compared to Caucasians, but differences in other comorbidities have not been well-described. We used an electronic health record (EHR) technique to test for differences in comorbidities in African Americans compared to Caucasians with SLE. Methods We used a de-identified EHR with 2.8 million subjects to identify SLE cases using a validated algorithm. We performed phenome-wide association studies (PheWAS) comparing African American to Caucasian SLE cases and African American SLE cases to matched non-SLE controls. Controls were age, sex, and race matched to SLE cases. For multiple testing, a false discovery rate (FDR) p value of 0.05 was used. Results We identified 270 African Americans and 715 Caucasians with SLE and 1425 matched African American controls. Compared to Caucasians with SLE adjusting for age and sex, African Americans with SLE had more comorbidities in every organ system. The most striking included hypertension odds ratio (OR) = 4.25, FDR p = 5.49 × 10⁻¹⁵; renal dialysis OR = 10.90, FDR p = 8.75 × 10⁻¹⁴; and pneumonia OR = 3.57, FDR p = 2.32 × 10⁻⁸. Compared to the African American matched controls without SLE, African Americans with SLE were more likely to have comorbidities in every organ system. The most significant codes were renal and cardiac, and included renal failure (OR = 9.55, FDR p = 2.26 × 10⁻⁴⁰) and hypertensive heart and renal disease (OR = 8.08, FDR p = 1.78 × 10⁻²²). Adjusting for race, age, and sex in a model including both African American and Caucasian SLE cases and controls, SLE was independently associated with renal, cardiovascular, and infectious diseases (all p < 0.01). Conclusions African Americans with SLE have an increased comorbidity burden compared to Caucasians with SLE and matched controls.
This increase in comorbidities in African Americans with SLE highlights the need to monitor for cardiovascular and infectious complications. Electronic supplementary material The online version of this article (10.1186/s13075-018-1561-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- April Barnado
- Department of Medicine, Vanderbilt University Medical Center, 1161 21st Avenue South, T3113 MCN, Nashville, TN, 37232, USA
- Robert J Carroll
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Carolyn Casey
- Department of Medicine, Lehigh Valley Health Network, Allentown, PA, USA
- Lee Wheless
- Department of Dermatology, Vanderbilt University Medical Center, Nashville, TN, USA
- Joshua C Denny
- Department of Medicine, Vanderbilt University Medical Center, 1161 21st Avenue South, T3113 MCN, Nashville, TN, 37232, USA
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Leslie J Crofford
- Department of Medicine, Vanderbilt University Medical Center, 1161 21st Avenue South, T3113 MCN, Nashville, TN, 37232, USA
|
72
|
Verma A, Bradford Y, Dudek S, Lucas AM, Verma SS, Pendergrass SA, Ritchie MD. A simulation study investigating power estimates in phenome-wide association studies. BMC Bioinformatics 2018; 19:120. [PMID: 29618318 PMCID: PMC5885318 DOI: 10.1186/s12859-018-2135-0] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Received: 03/16/2017] [Accepted: 03/26/2018] [Indexed: 01/01/2023] Open
Abstract
Background Phenome-wide association studies (PheWAS) are a high-throughput approach to evaluate comprehensive associations between genetic variants and a wide range of phenotypic measures. PheWAS has varying sample sizes for quantitative traits, and variable numbers of cases and controls for binary traits across the many phenotypes of interest, which can affect the statistical power to detect associations. The motivation of this study is to investigate the various parameters which affect the estimation of statistical power in PheWAS, including sample size, case-control ratio, minor allele frequency, and disease penetrance. Results We performed a PheWAS simulation study, where we investigated variations in statistical power based on different parameters, such as overall sample size, number of cases, case-control ratio, minor allele frequency, and disease penetrance. The simulation was performed on both binary and quantitative phenotypic measures. Our simulation on binary traits suggests that the number of cases has more impact on statistical power than the case to control ratio; also, we found that a sample size of 200 cases or more maintains the statistical power to identify associations for common variants. For quantitative traits, a sample size of 1000 or more individuals performed best in the power calculations. We focused on common genetic variants (MAF > 0.01) in this study; however, in future studies, we will be extending this effort to perform similar simulations on rare variants. Conclusions This study provides a series of PheWAS simulation analyses that can be used to estimate statistical power for some potential scenarios. These results can be used to provide guidelines for appropriate study design for future PheWAS analyses. Electronic supplementary material The online version of this article (10.1186/s12859-018-2135-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Anurag Verma
- Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
- The Huck Institutes of the Life Science, Pennsylvania State University, University Park, PA, USA
- Yuki Bradford
- Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
- Scott Dudek
- Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
- Anastasia M Lucas
- Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
- Shefali S Verma
- Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
- The Huck Institutes of the Life Science, Pennsylvania State University, University Park, PA, USA
- Marylyn D Ritchie
- Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
- The Huck Institutes of the Life Science, Pennsylvania State University, University Park, PA, USA
|
73
|
Wright F, Fessele K. Primer in Genetics and Genomics, Article 5-Further Defining the Concepts of Genotype and Phenotype and Exploring Genotype-Phenotype Associations. Biol Res Nurs 2018; 19:576-585. [PMID: 28920489 DOI: 10.1177/1099800417725190] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
As nurses begin to incorporate genetic and genomic sciences into clinical practice, education, and research, it is essential that they have a working knowledge of the terms foundational to the science. The first article in this primer series provided brief definitions of the basic terms (e.g., genetics and genomics) and introduced the concept of phenotype during the discussion of Mendelian inheritance. These terms, however, are inconsistently used in publications and conversations, and the linkage between genotype and phenotype requires clarification. The goal of this fifth article in the series is to elucidate these terms, provide an overview of the research methods used to determine genotype-phenotype associations, and discuss their significance to nursing through examples from the current nursing literature.
Affiliation(s)
- Fay Wright
- Yale School of Nursing, Orange, CT, USA
- Kristen Fessele
- Scientific Project Leader, Flatiron Health, New York, NY, USA; Post-doctoral Research Fellow, University of Utah College of Nursing, Salt Lake City, UT, USA
|
74
|
Replication confirms the association of loci in FOXE1, PDE8B, CAPZB and PDE10A with thyroid traits: a Genetics of Diabetes Audit and Research Tayside study (GoDARTS). Pharmacogenet Genomics 2018; 27:356-362. [PMID: 28727628 DOI: 10.1097/fpc.0000000000000299] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 12/16/2022]
Abstract
OBJECTIVE Replication of associations in genome-wide association studies is desirable to ensure that such signals are potentially clinically meaningful. This study aimed to replicate associations of selected single-nucleotide polymorphisms (SNPs) with hypothyroidism and serum thyroid-stimulating hormone (TSH) using electronic medical records (EMRs). PATIENTS AND METHODS A cross-sectional study was carried out among patients of European Caucasian ethnicity from the Genetics of Diabetes Audit and Research Tayside recruited in Tayside (Scotland, UK). EMRs (biochemistry, prescribing, hospital admissions and demographics) were used to ascertain patients with hypothyroidism and their controls as well as average serum TSH concentration, and were linked to genetic biobank data. Genetic tests of association were performed using logistic and linear regression models. RESULTS We analysed 1703 cases of hypothyroidism and 9457 controls. All four SNPs located on chromosome 9 at FOXE1 were associated with hypothyroidism with similar effect estimates (odds ratio=0.75-0.76, P<5e-08). Also, loci on chromosomes 1 (PTPN22), 6 (HLA-E/HLA-C) and 12 (SH2B3) were replicated. For serum TSH, we confirmed 12 SNPs previously reported at PDE8B, CAPZB, PDE10A, LOC105371356, NR3C2, VEGFA, IGFBP5, INSR, PRDM11, NFIA, ITPK1 and ABO. Overall, these SNPs accounted for 6.8% of the serum TSH variation (P<1e-04). CONCLUSION EMRs linked to genomic data in large populations enable validation of genome-wide association study discoveries without additional genotyping costs. Our replication confirmed at genome-wide significance the association of loci at FOXE1 with hypothyroidism, and PDE8B, CAPZB and PDE10A with serum TSH. A total of 12 SNPs seemed to explain nearly 7% of the serum TSH variation.
|
75
|
Damotte V, Lizée A, Tremblay M, Agrawal A, Khankhanian P, Santaniello A, Gomez R, Lincoln R, Tang W, Chen T, Lee N, Villoslada P, Hollenbach JA, Bevan CD, Graves J, Bove R, Goodin DS, Green AJ, Baranzini SE, Cree BAC, Henry RG, Hauser SL, Gelfand JM, Gourraud PA. Harnessing electronic medical records to advance research on multiple sclerosis. Mult Scler 2018; 25:408-418. [DOI: 10.1177/1352458517747407] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 01/29/2023]
Abstract
Background: Electronic medical records (EMR) data are increasingly used in research, but no studies have yet evaluated similarity between EMR and research-quality data and between characteristics of an EMR multiple sclerosis (MS) population and known natural MS history. Objectives: To (1) identify MS patients in an EMR system and extract clinical data, (2) compare EMR-extracted data with gold-standard research data, and (3) compare EMR MS population characteristics to expected MS natural history. Methods: Algorithms were implemented to identify MS patients from the University of California San Francisco EMR, de-identify the data and extract clinical variables. EMR-extracted data were compared to research cohort data in a subset of patients. Results: We identified 4142 MS patients via search of the EMR and extracted their clinical data with good accuracy. EMR and research values showed good concordance for Expanded Disability Status Scale (EDSS), timed-25-foot walk, and subtype. We replicated several expected MS epidemiological features from MS natural history including higher EDSS for progressive versus relapsing–remitting patients and for male versus female patients and increased EDSS with age at examination and disease duration. Conclusion: Large real-world cohorts algorithmically extracted from the EMR can expand opportunities for MS clinical research.
Affiliation(s)
- Vincent Damotte
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Antoine Lizée
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA; Université de Nantes, INSERM, UMR 1064, ATIP-Avenir, Equipe 5 Centre de Recherche en Transplantation et Immunologie, Nantes, France
- Matthew Tremblay
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA; Department of Neurology, John Dempsey Hospital, University of Connecticut Health Center, Farmington, CT, USA
- Alisha Agrawal
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Pouya Khankhanian
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA; Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Adam Santaniello
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Refujia Gomez
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Robin Lincoln
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Wendy Tang
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Tiffany Chen
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Nelson Lee
- Information Technology, University of California San Francisco (UCSF), San Francisco, CA, USA
- Pablo Villoslada
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA; IDIBAPS-Hospital Clinic of Barcelona, Barcelona, Spain
- Jill A Hollenbach
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Carolyn D Bevan
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Jennifer Graves
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Riley Bove
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Douglas S Goodin
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Ari J Green
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Sergio E Baranzini
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Bruce AC Cree
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Roland G Henry
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Stephen L Hauser
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Jeffrey M Gelfand
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA
- Pierre-Antoine Gourraud
- MS Genetics, Department of Neurology, School of Medicine, University of California San Francisco (UCSF), San Francisco, CA, USA; Université de Nantes, INSERM, UMR 1064, ATIP-Avenir, Equipe 5 Centre de Recherche en Transplantation et Immunologie, Nantes, France
|
76
|
Barradas-Bautista D, Rosell M, Pallara C, Fernández-Recio J. Structural Prediction of Protein–Protein Interactions by Docking: Application to Biomedical Problems. Protein-Protein Interactions in Human Disease, Part A 2018; 110:203-249. [DOI: 10.1016/bs.apcsb.2017.06.003] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/31/2022]
|
77
|
Huang J, Duan R, Hubbard RA, Wu Y, Moore JH, Xu H, Chen Y. PIE: A prior knowledge guided integrated likelihood estimation method for bias reduction in association studies using electronic health records data. J Am Med Inform Assoc 2017; 25:345-352. [PMID: 29206922 PMCID: PMC7378882 DOI: 10.1093/jamia/ocx137] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 03/23/2017] [Revised: 10/10/2017] [Accepted: 11/15/2017] [Indexed: 12/17/2022] [Open Access]
Abstract
Objectives This study proposes a novel Prior knowledge guided Integrated likelihood Estimation (PIE) method to correct bias in estimations of associations due to misclassification of electronic health record (EHR)-derived binary phenotypes, and evaluates the performance of the proposed method by comparing it to 2 methods in common practice. Methods We conducted simulation studies and data analysis of real EHR-derived data on diabetes from Kaiser Permanente Washington to compare the estimation bias of associations using the proposed method, the method ignoring phenotyping errors, the maximum likelihood method with misspecified sensitivity and specificity, and the maximum likelihood method with correctly specified sensitivity and specificity (gold standard). The proposed method effectively leverages available information on phenotyping accuracy to construct a prior distribution for sensitivity and specificity, and incorporates this prior information through the integrated likelihood for bias reduction. Results Our simulation studies and real data application demonstrated that the proposed method effectively reduces the estimation bias compared to the 2 current methods. It performed almost as well as the gold standard method when the prior had highest density around true sensitivity and specificity. The analysis of EHR data from Kaiser Permanente Washington showed that the estimated associations from PIE were very close to the estimates from the gold standard method and reduced bias by 60%–100% compared to the 2 commonly used methods in current practice for EHR data. Conclusions This study demonstrates that the proposed method can effectively reduce estimation bias caused by imperfect phenotyping in EHR-derived data by incorporating prior information through integrated likelihood.
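The bias this abstract targets can be illustrated with a small worked sketch. It does not implement PIE or an integrated likelihood; it only shows, under assumed values, how non-differential misclassification of a binary phenotype pulls an odds ratio toward 1, and how a standard algebraic correction recovers the true prevalence when sensitivity and specificity are known exactly (the "gold standard" situation the abstract describes). Function names and all numbers are hypothetical.

```python
def observed_prevalence(true_prev, sensitivity, specificity):
    """Prevalence seen through an imperfect phenotyping algorithm."""
    return sensitivity * true_prev + (1.0 - specificity) * (1.0 - true_prev)


def corrected_prevalence(obs_prev, sensitivity, specificity):
    """Invert the misclassification model; assumes sensitivity + specificity > 1."""
    return (obs_prev + specificity - 1.0) / (sensitivity + specificity - 1.0)


def odds_ratio(p_exposed, p_unexposed):
    """Odds ratio comparing phenotype prevalence in exposed vs. unexposed groups."""
    return (p_exposed / (1.0 - p_exposed)) / (p_unexposed / (1.0 - p_unexposed))


if __name__ == "__main__":
    # Illustrative values only: true prevalence by exposure group and algorithm accuracy.
    true_exposed, true_unexposed = 0.20, 0.10
    sens, spec = 0.80, 0.95

    obs_exposed = observed_prevalence(true_exposed, sens, spec)
    obs_unexposed = observed_prevalence(true_unexposed, sens, spec)

    print("true OR:     ", odds_ratio(true_exposed, true_unexposed))
    print("observed OR: ", odds_ratio(obs_exposed, obs_unexposed))
    print("corrected OR:", odds_ratio(
        corrected_prevalence(obs_exposed, sens, spec),
        corrected_prevalence(obs_unexposed, sens, spec),
    ))
```

When sensitivity and specificity are misspecified, the same correction can over- or under-shoot, which is the gap a prior over these two quantities is meant to address.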
Affiliation(s)
- Jing Huang
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Rui Duan
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Rebecca A Hubbard
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Yonghui Wu
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Jason H Moore
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Hua Xu
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Yong Chen
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
78
|
Abstract
PURPOSE OF REVIEW Over many decades, researchers have been designing studies to investigate the relationship between genotypes and phenotypes to gain an understanding of the effect of genetics on disease. Recently, a high-throughput approach called phenome-wide association studies (PheWAS) has been extensively used to identify associations between genetic variants and many diseases and traits simultaneously. In this review, we describe the value of PheWAS along with methodological issues and challenges in interpretation for current applications of PheWAS. RECENT FINDINGS PheWAS have uncovered a paradigm to identify new associations for genetic loci across many diseases. The application of PheWAS has been effective with phenotype data from electronic health records, epidemiological studies, and clinical trials. SUMMARY The key strength of a PheWAS is to identify the association of one or more genetic variants with multiple phenotypes, which can showcase interconnections among the phenotypes due to shared genetic associations. While the PheWAS approach appears promising, there are a number of challenges that need to be addressed to provide additional robustness to PheWAS findings.
Affiliation(s)
- Anurag Verma
- Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA
- The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA
- Marylyn D Ritchie
- Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA
- The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA
|
79
|
Wang L, Damrauer SM, Zhang H, Zhang AX, Xiao R, Moore JH, Chen J. Phenotype validation in electronic health records based genetic association studies. Genet Epidemiol 2017; 41:790-800. [PMID: 29023970 DOI: 10.1002/gepi.22080] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/19/2016] [Revised: 06/30/2017] [Accepted: 08/01/2017] [Indexed: 12/13/2022]
Abstract
The linkage between electronic health records (EHRs) and genotype data makes it plausible to study the genetic susceptibility of a wide range of disease phenotypes. Although EHR-derived phenotype data are subject to misclassification, they have been shown to be useful for discovering susceptibility genes, particularly in the setting of phenome-wide association studies (PheWAS). It is essential to characterize discovered associations using gold standard phenotype data obtained by chart review. In this work, we propose a genotype stratified case-control sampling strategy to select subjects for phenotype validation. We develop a closed-form maximum-likelihood estimator for the odds ratio parameters and a score statistic for testing genetic association using the combined validated and error-prone EHR-derived phenotype data, and assess the extent of power improvement provided by this approach. Compared with case-control sampling based only on EHR-derived phenotype data, our genotype stratified strategy maintains nominal type I error rates and results in higher power for detecting associations. It also corrects the bias in the odds ratio parameter estimates and reduces the corresponding variance, especially when the minor allele frequency is small.
Affiliation(s)
- Lu Wang
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Scott M Damrauer
- Division of Vascular Surgery and Endovascular Therapy, Hospital of the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America; Department of Surgery, Corporal Michael Crescenz VA Medical Center, Philadelphia, Pennsylvania, United States of America
- Hong Zhang
- Institute of Biostatistics, Fudan University, Shanghai, P.R. China
- Alan X Zhang
- Sidwell Friends School, Washington, DC, United States of America
- Rui Xiao
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Jason H Moore
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America; Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Jinbo Chen
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
|
80
|
Schneider BP, Shen F, Jiang G, O’Neill A, Radovich M, Li L, Gardner L, Lai D, Foroud T, Sparano JA, Sledge GW, Miller KD. Impact of Genetic Ancestry on Outcomes in ECOG-ACRIN-E5103. JCO Precis Oncol 2017; 2017:PO.17.00059. [PMID: 29333527 PMCID: PMC5765553 DOI: 10.1200/po.17.00059] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Indexed: 11/20/2022] [Open Access]
Abstract
PURPOSE Racial disparity in breast cancer outcomes exists between African American and Caucasian women in the United States. We have evaluated the impact of genetically determined ancestry on disparity in efficacy and therapy-induced toxicity for breast cancer patients in the context of a randomized, phase III adjuvant trial. PATIENTS AND METHODS This study compared outcomes between 386 patients of African ancestry (AA) and 2473 patients of European ancestry (EA) in a randomized, phase III breast cancer trial, ECOG-ACRIN-E5103. The primary efficacy endpoint, invasive disease-free survival (DFS), and clinically significant toxicities were compared, including anthracycline-induced congestive heart failure (CHF), taxane-induced peripheral neuropathy (TIPN), and bevacizumab-induced hypertension. RESULTS Overall, AAs had significantly inferior DFS (p=0.002; HR=1.5) compared with EAs. This was significant in the estrogen receptor-positive subgroup (p=0.03), with a similar, non-significant trend for those who had triple negative breast cancer (TNBC; p=0.12). AAs also had significantly more grade 3-4 TIPN (OR=2.9; p=2.4 × 10⁻¹¹) and grade 3-4 bevacizumab-induced hypertension (OR=1.6; p=0.02), with a trend for more CHF (OR=1.8; p=0.08). AAs had significantly more dose reductions for paclitaxel (p=6.6 × 10⁻⁶). In AAs, dose reductions in paclitaxel had a significant negative impact on DFS (p=0.03), whereas in EAs, dose reductions did not impact outcome (p=0.35). CONCLUSION AAs had inferior DFS with more clinically important toxicities in ECOG-ACRIN-E5103. The altered risk to benefit ratio for adjuvant breast cancer chemotherapy should lead to additional research with the focus centered on the impact of genetic ancestry on both efficacy and toxicity. Strategies to minimize dose reductions for paclitaxel, especially due to TIPN, are warranted for this population.
Affiliation(s)
- Bryan P. Schneider, Fei Shen, Guanglong Jiang, Milan Radovich, Lang Li, Laura Gardner, Dongbing Lai, Tatiana Foroud, and Kathy D. Miller
- Indiana University School of Medicine, Indianapolis, IN
- Anne O’Neill
- Dana Farber Cancer Institute–ECOG-ACRIN Biostatistics Center, Boston, MA
- Joseph A. Sparano
- Albert Einstein University, Bronx, NY
- George W. Sledge Jr
- Stanford University, Stanford, CA
|
81
|
Wang L, Widatalla SE, Whalen DS, Ochieng J, Sakwe AM. Association of calcium sensing receptor polymorphisms at rs1801725 with circulating calcium in breast cancer patients. BMC Cancer 2017; 17:511. [PMID: 28764683 PMCID: PMC5540567 DOI: 10.1186/s12885-017-3502-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 06/21/2017] [Accepted: 07/24/2017] [Indexed: 01/04/2023] [Open Access]
Abstract
BACKGROUND Breast cancer (BC) patients with late-stage and/or rapidly growing tumors are prone to develop high serum calcium levels which have been shown to be associated with larger and aggressive breast tumors in post and premenopausal women respectively. Given the pivotal role of the calcium sensing receptor (CaSR) in calcium homeostasis, we evaluated whether polymorphisms of the CASR gene at rs1801725 and rs1801726 SNPs in exon 7, are associated with circulating calcium levels in African American and Caucasian control subjects and BC cases. METHODS In this retrospective case-control study, we assessed the mean circulating calcium levels, the distribution of two inactivating CaSR SNPs at rs1801725 and rs1801726 in 199 cases and 384 age-matched controls, and used multivariable regression analysis to determine whether these SNPs are associated with circulating calcium in control subjects and BC cases. RESULTS We found that the mean circulating calcium levels in African American subjects were higher than those in Caucasian subjects (p < 0.001). As expected, the mean calcium levels were higher in BC cases compared to control subjects (p < 0.001), but the calcium levels in BC patients were independent of race. We also show that in BC cases and control subjects, the major alleles at rs1801725 (G/T, A986S) and at rs1801726 (C/G, Q1011E) were common among Caucasians and African Americans respectively. Compared to the wild type alleles, polymorphisms at the rs1801725 SNP were associated with higher calcium levels (p = 0.006) while those at rs1801726 were not. Using multivariable linear mixed-effects models and adjusting for age and race, we show that circulating calcium levels in BC cases were associated with tumor grade (p = 0.009), clinical stage (p = 0.003) and more importantly, with inactivating mutations of the CASR at the rs1801725 SNP (p = 0.038). 
CONCLUSIONS These data suggest that decreased sensitivity of the CaSR to calcium due to inactivating polymorphisms at rs1801725, may predispose up to 20% of BC cases to high circulating calcium-associated larger and/or aggressive breast tumors.
Affiliation(s)
- Li Wang
- Vanderbilt Center for Quantitative Sciences, Department of Biostatistics, Vanderbilt University, Nashville, TN, USA
- Sarrah E Widatalla
- Department of Biochemistry and Cancer Biology, School of Graduate Studies and Research, Meharry Medical College, Nashville, TN, 37208, USA
- Diva S Whalen
- Department of Biochemistry and Cancer Biology, School of Graduate Studies and Research, Meharry Medical College, Nashville, TN, 37208, USA
- Josiah Ochieng
- Department of Biochemistry and Cancer Biology, School of Graduate Studies and Research, Meharry Medical College, Nashville, TN, 37208, USA
- Amos M Sakwe
- Department of Biochemistry and Cancer Biology, School of Graduate Studies and Research, Meharry Medical College, Nashville, TN, 37208, USA
|
82
|
Lee HJ, Jiang M, Wu Y, Shaffer CM, Cleator JH, Friedman EA, Lewis JP, Roden DM, Denny J, Xu H. A comparative study of different methods for automatic identification of clopidogrel-induced bleedings in electronic health records. AMIA Joint Summits on Translational Science Proceedings 2017; 2017:185-192. [PMID: 28815128 PMCID: PMC5543340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
Electronic health records (EHRs) linked with biobanks have been recognized as valuable data sources for pharmacogenomic studies, which require identification of patients with certain adverse drug reactions (ADRs) from a large population. Since manual chart review is costly and time-consuming, automatic methods to accurately identify patients with ADRs have been called for. In this study, we developed and compared different informatics approaches to identify ADRs from EHRs, using clopidogrel-induced bleeding as our case study. Three different types of methods were investigated: 1) rule-based methods; 2) machine learning-based methods; and 3) scoring function-based methods. Our results show that both machine learning and scoring methods are effective and the scoring method can achieve a high precision with a reasonable recall. We also analyzed the contributions of different types of features and found that the temporality information between clopidogrel and bleeding events, as well as textual evidence from physicians' assertion of the adverse events are helpful. We believe that our findings are valuable in advancing EHR-based pharmacogenomic studies.
Affiliation(s)
- Hee-Jin Lee
- University of Texas Health Science Center at Houston, Houston, TX
| | - Min Jiang
- University of Texas Health Science Center at Houston, Houston, TX
| | - Yonghui Wu
- University of Texas Health Science Center at Houston, Houston, TX
| | - Christian M Shaffer
- Department of Pharmacology, Vanderbilt University School of Medicine, Nashville, TN
| | - John H Cleator
- Department of Pharmacology, Vanderbilt University School of Medicine, Nashville, TN
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN
| | - Eitan A Friedman
- Division of Cardiovascular Medicine, Vanderbilt University, Nashville, TN
| | - Joshua P Lewis
- Division of Endocrinology, Diabetes and Nutrition, University of Maryland School of Medicine, Baltimore, MD
| | - Dan M Roden
- Department of Pharmacology, Vanderbilt University School of Medicine, Nashville, TN
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN
| | - Josh Denny
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN
| | - Hua Xu
- University of Texas Health Science Center at Houston, Houston, TX
|
83
|
Xie S, Greenblatt R, Levy MZ, Himes BE. Enhancing Electronic Health Record Data with Geospatial Information. AMIA Joint Summits on Translational Science Proceedings 2017; 2017:123-132. [PMID: 28815121 PMCID: PMC5543367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
Electronic Health Record (EHR)-derived data is a valuable resource for research, and efforts are underway to overcome some of its limitations by using data from external sources to gain a fuller picture of patient characteristics, symptoms, and exposures. Our goal was to assess the utility of augmenting EHR data with geocoded patient addresses to identify geospatial variation of disease that is not explained by EHR-derived demographic factors. Using 2011-2014 encounter data from 27,604 University of Pennsylvania Hospital System asthma patients, we identified factors associated with asthma exacerbations: risk was higher in female, black, middle aged to elderly, and obese patients, as well as those with positive smoking history and with Medicare or Medicaid vs. private insurance. Significant geospatial variability of asthma exacerbations was found using generalized additive models, even after adjusting for demographic factors. Our work shows that geospatial data can be used to cost-effectively enhance EHR data.
Affiliation(s)
- Sherrie Xie
- Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA, USA
| | - Rebecca Greenblatt
- Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA, USA
| | - Michael Z Levy
- Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA, USA
| | - Blanca E Himes
- Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA, USA
|
84
|
Wei WQ, Bastarache LA, Carroll RJ, Marlo JE, Osterman TJ, Gamazon ER, Cox NJ, Roden DM, Denny JC. Evaluating phecodes, clinical classification software, and ICD-9-CM codes for phenome-wide association studies in the electronic health record. PLoS One 2017; 12:e0175508. [PMID: 28686612 PMCID: PMC5501393 DOI: 10.1371/journal.pone.0175508] [Citation(s) in RCA: 250] [Impact Index Per Article: 31.3] [Received: 09/13/2016] [Accepted: 03/27/2017] [Indexed: 12/20/2022]
Abstract
OBJECTIVE To compare three groupings of Electronic Health Record (EHR) billing codes for their ability to represent clinically meaningful phenotypes and to replicate known genetic associations. The three tested coding systems were the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes, the Agency for Healthcare Research and Quality Clinical Classification Software for ICD-9-CM (CCS), and manually curated "phecodes" designed to facilitate phenome-wide association studies (PheWAS) in EHRs. METHODS AND MATERIALS We selected 100 disease phenotypes and compared the ability of each coding system to accurately represent them without performing additional groupings. The 100 phenotypes included 25 randomly-chosen clinical phenotypes pursued in prior genome-wide association studies (GWAS) and another 75 common disease phenotypes mentioned across free-text problem lists from 189,289 individuals. We then evaluated the performance of each coding system to replicate known associations for 440 SNP-phenotype pairs. RESULTS Out of the 100 tested clinical phenotypes, phecodes exactly matched 83, compared to 53 for ICD-9-CM and 32 for CCS. ICD-9-CM codes were typically too detailed (requiring custom groupings) while CCS codes were often not granular enough. Among 440 tested known SNP-phenotype associations, use of phecodes replicated 153 SNP-phenotype pairs compared to 143 for ICD-9-CM and 139 for CCS. Phecodes also generally produced stronger odds ratios and lower p-values for known associations than ICD-9-CM and CCS. Finally, evaluation of several SNPs via PheWAS identified novel potential signals, some seen only when using the phecode approach. Among them, rs7318369 in PEPD was associated with gastrointestinal hemorrhage. CONCLUSION Our results suggest that the phecode groupings better align with clinical diseases mentioned in clinical practice or for genomic studies. ICD-9-CM, CCS, and phecode groupings all worked for PheWAS-type studies, though the phecode groupings produced superior results.
Affiliation(s)
- Wei-Qi Wei
- Departments of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
| | - Lisa A. Bastarache
- Departments of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
| | - Robert J. Carroll
- Departments of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
| | - Joy E. Marlo
- Departments of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
| | - Travis J. Osterman
- Departments of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Departments of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America
| | - Eric R. Gamazon
- Vanderbilt Genetic Institute and the Division of Genetic Medicine, Vanderbilt University, Nashville, TN, United States of America
- Department of Clinical Epidemiology, Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands
- Department of Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands
- Department of Psychiatry, Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands
| | - Nancy J. Cox
- Vanderbilt Genetic Institute and the Division of Genetic Medicine, Vanderbilt University, Nashville, TN, United States of America
| | - Dan M. Roden
- Departments of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Departments of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Department of Clinical Pharmacology, Vanderbilt University Medical Center, Nashville, TN, United States of America
| | - Joshua C. Denny
- Departments of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Departments of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America
|
85
|
Mosley JD, Shoemaker MB, Wells QS, Darbar D, Shaffer CM, Edwards TL, Bastarache L, McCarty CA, Thompson W, Chute CG, Jarvik GP, Crosslin DR, Larson EB, Kullo IJ, Pacheco JA, Peissig PL, Brilliant MH, Linneman JG, Witte JS, Denny JC, Roden DM. Investigating the Genetic Architecture of the PR Interval Using Clinical Phenotypes. Circ Cardiovasc Genet 2017; 10:CIRCGENETICS.116.001482. [PMID: 28416512 DOI: 10.1161/circgenetics.116.001482] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 04/20/2016] [Accepted: 03/03/2017] [Indexed: 01/24/2023]
Abstract
BACKGROUND One potential use for the PR interval is as a biomarker of disease risk. We hypothesized that quantifying the shared genetic architectures of the PR interval and a set of clinical phenotypes would identify genetic mechanisms contributing to PR variability and identify diseases associated with a genetic predictor of PR variability. METHODS AND RESULTS We used ECG measurements from the ARIC study (Atherosclerosis Risk in Communities; n=6731 subjects) and 63 genetically modulated diseases from the eMERGE network (Electronic Medical Records and Genomics; n=12 978). We measured pairwise genetic correlations (rG) between PR phenotypes (PR interval, PR segment, P-wave duration) and each of the 63 phenotypes. The PR segment was genetically correlated with atrial fibrillation (rG=-0.88; P=0.0009). An analysis of metabolic phenotypes in ARIC also showed that the P wave was genetically correlated with waist circumference (rG=0.47; P=0.02). A genetically predicted PR interval phenotype based on 645 714 single-nucleotide polymorphisms was associated with atrial fibrillation (odds ratio=0.89 per SD change; 95% confidence interval, 0.83-0.95; P=0.0006). The differing pattern of associations among the PR phenotypes is consistent with analyses that show that the genetic correlation between the P wave and PR segment was not significantly different from 0 (rG=-0.03 [0.16]). CONCLUSIONS The genetic architecture of the PR interval comprises modulators of atrial fibrillation risk and obesity.
Affiliation(s)
- Jonathan D Mosley
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.).
| | - M Benjamin Shoemaker
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
| | - Quinn S Wells
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
| | - Dawood Darbar
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
| | - Christian M Shaffer
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
| | - Todd L Edwards
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
| | - Lisa Bastarache
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
| | - Catherine A McCarty
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
| | - Will Thompson
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
| | - Christopher G Chute
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
| | - Gail P Jarvik
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
| | - David R Crosslin
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
| | - Eric B Larson
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
| | - Iftikhar J Kullo
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
| | - Jennifer A Pacheco
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
| | - Peggy L Peissig
- From the Department of Medicine (J.D.M., M.B.S., Q.S.W., C.M.S., J.C.D., D.M.R.), Vanderbilt Epidemiology Center (T.L.E.), Department of Biomedical Informatics (L.B., J.C.D., D.M.R.), Department of Pharmacology (D.M.R.), Vanderbilt University, Nashville, TN; Division of Cardiology, University of Illinois at Chicago (D.D.); Essentia Institute of Rural Health, Duluth, MN (C.A.M.); Center for Biomedical Research Informatics, NorthShore University Health System, Evanston, IL (W.T.); School of Medicine (C.G.C.), School of Public Health (C.G.C.), and School of Nursing (C.G.C.), Johns Hopkins University, Baltimore, MD; Division of Medical Genetics, Department of Medicine (G.P.J.), Department of Genome Sciences (G.P.J.), Department of Biomedical Informatics (D.R.C.), Department of Medical Education (D.R.C.), University of Washington; Group Health Research Institute, Seattle, WA (E.B.L.); Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN (I.J.K.); Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.); Biomedical Informatics Research Center (P.L.P.), Center for Human Genetics (M.H.B., J.G.L.), Marshfield Clinic Research Foundation, WI; and Department of Epidemiology and Biostatistics, University of California, San Francisco (J.S.W.)
- Murray H Brilliant
- James G Linneman
- John S Witte
- Josh C Denny
- Dan M Roden
86
Velez Edwards DR, Hartmann KE, Wellons M, Shah A, Xu H, Edwards TL. Evaluating the role of race and medication in protection of uterine fibroids by type 2 diabetes exposure. BMC Womens Health 2017; 17:28. [PMID: 28399866 PMCID: PMC5387248 DOI: 10.1186/s12905-017-0386-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 03/09/2016] [Accepted: 04/04/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND Uterine fibroids (UF) affect 77% of women by menopause and account for $9.4 billion in annual healthcare costs. Type 2 diabetes (T2D) has been inconsistently associated with protection from UFs in prior studies. To further evaluate the relationship between T2D and UFs, we tested for association between T2D and UF risk in a large clinical population, as well as for potential differences due to T2D medications and interaction with race. METHODS This nested case-control study is derived from a clinical cohort. Our outcome was UF case-control status and our exposure was T2D. UF outcomes and T2D exposure were classified using validated electronic medical record (EMR) algorithms. Logistic regression, adjusted for covariates, was used to model the association between T2D diagnosis and UF risk. Secondary analyses evaluated the interaction between T2D exposure and race and stratified T2D-exposed subjects by the T2D medication being taken. RESULTS We identified 3,789 subjects with UF outcomes (608 UF cases and 3,181 controls); 714 were diabetic and 3,075 were non-diabetic. We observed a nominally significant interaction between T2D exposure and race in adjusted models (interaction p = 0.083). Race-stratified analyses demonstrated more protection by T2D exposure on UF risk among European Americans (adjusted odds ratio [aOR] = 0.50, 95% CI 0.35 to 0.72) than African Americans (aOR = 0.76, 95% CI 0.50 to 1.17). We also observed a protective effect of T2D regardless of the type of T2D medication being taken, with slightly more protection among subjects on insulin treatments (European Americans aOR = 0.42, 95% CI 0.26 to 0.68; African Americans aOR = 0.60, 95% CI 0.36 to 1.01). CONCLUSIONS These data, from a large population of UF cases and controls, support prior studies that have found a protective association between diabetes and UF risk, and suggest that this association is further modified by race. Protection from UFs by T2D exposure was observed regardless of medication type, with slightly more protection among insulin users. Further mechanistic research in larger cohorts is necessary to reconcile the potential role of T2D in UF risk.
Affiliation(s)
- Digna R. Velez Edwards
  - Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, 2525 West End Ave., Suite 600 6th Floor, Nashville, TN 37203 USA
  - Institute of Medicine and Public Health, Vanderbilt University Medical Center, Nashville, TN USA
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN USA
  - Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN USA
- Katherine E. Hartmann
  - Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, 2525 West End Ave., Suite 600 6th Floor, Nashville, TN 37203 USA
  - Institute of Medicine and Public Health, Vanderbilt University Medical Center, Nashville, TN USA
  - Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN USA
- Melissa Wellons
  - Division of Diabetes, Endocrinology, and Metabolism, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN USA
- Anushi Shah
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN USA
- Hua Xu
  - The University of Texas School Health Science Center, School of Biomedical Informatics, Houston, TX USA
- Todd L. Edwards
  - Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, 2525 West End Ave., Suite 600 6th Floor, Nashville, TN 37203 USA
  - Institute of Medicine and Public Health, Vanderbilt University Medical Center, Nashville, TN USA
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN USA
  - Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN USA
87
Smith CT, Dang LC, Buckholtz JW, Tetreault AM, Cowan RL, Kessler RM, Zald DH. The impact of common dopamine D2 receptor gene polymorphisms on D2/3 receptor availability: C957T as a key determinant in putamen and ventral striatum. Transl Psychiatry 2017; 7:e1091. [PMID: 28398340 PMCID: PMC5416688 DOI: 10.1038/tp.2017.45] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 09/03/2016] [Revised: 12/02/2016] [Accepted: 01/17/2017] [Indexed: 12/20/2022]
Abstract
Dopamine function is broadly implicated in multiple neuropsychiatric conditions believed to have a genetic basis. Although a few positron emission tomography (PET) studies have investigated the impact of single-nucleotide polymorphisms (SNPs) in the dopamine D2 receptor gene (DRD2) on D2/3 receptor availability (binding potential, BPND), these studies have often been limited by small sample size. Furthermore, the most commonly studied SNP in D2/3 BPND (Taq1A) is not located in the DRD2 gene itself, suggesting that its linkage with other DRD2 SNPs may explain previous PET findings. Here, in the largest PET genetic study to date (n=84), we tested for effects of the C957T and -141C Ins/Del SNPs (located within DRD2) as well as Taq1A on BPND of the high-affinity D2 receptor tracer 18F-Fallypride. In a whole-brain voxelwise analysis, we found a positive linear effect of C957T T allele status on striatal BPND bilaterally. The multilocus genetic scores containing C957T and one or both of the other SNPs produced qualitatively similar striatal results to C957T alone. The number of C957T T alleles predicted BPND in anatomically defined putamen and ventral striatum (but not caudate) regions of interest, suggesting some regional specificity of effects in the striatum. By contrast, no significant effects arose in cortical regions. Taken together, our data support the critical role of C957T in striatal D2/3 receptor availability. This work has implications for a number of psychiatric conditions in which dopamine signaling and variation in C957T status have been implicated, including schizophrenia and substance use disorders.
Affiliation(s)
- C T Smith
  - Department of Psychology, Vanderbilt University, Nashville, TN, USA
- L C Dang
  - Department of Psychology, Vanderbilt University, Nashville, TN, USA
- J W Buckholtz
  - Department of Psychology, Harvard University, Cambridge, MA, USA
  - Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA
- A M Tetreault
  - Department of Psychology, Vanderbilt University, Nashville, TN, USA
- R L Cowan
  - Department of Psychology, Vanderbilt University, Nashville, TN, USA
  - Department of Psychiatry, Vanderbilt University School of Medicine, Nashville, TN, USA
- R M Kessler
  - Department of Radiology, UAB School of Medicine, Birmingham, AL, USA
- D H Zald
  - Department of Psychology, Vanderbilt University, Nashville, TN, USA
  - Department of Psychiatry, Vanderbilt University School of Medicine, Nashville, TN, USA
88
Barnado A, Casey C, Carroll RJ, Wheless L, Denny JC, Crofford LJ. Developing Electronic Health Record Algorithms That Accurately Identify Patients With Systemic Lupus Erythematosus. Arthritis Care Res (Hoboken) 2017; 69:687-693. [PMID: 27390187 DOI: 10.1002/acr.22989] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Received: 02/10/2016] [Revised: 06/02/2016] [Accepted: 07/05/2016] [Indexed: 12/19/2022]
Abstract
OBJECTIVE To study systemic lupus erythematosus (SLE) in the electronic health record (EHR), we must accurately identify patients with SLE. Our objective was to develop and validate novel EHR algorithms that use International Classification of Diseases, Ninth Revision (ICD-9), Clinical Modification codes, laboratory testing, and medications to identify SLE patients. METHODS We used Vanderbilt's Synthetic Derivative, a de-identified version of the EHR, with 2.5 million subjects. We selected all individuals with at least 1 SLE ICD-9 code (710.0), yielding 5,959 individuals. To create a training set, 200 subjects were randomly selected for chart review. A subject was defined as a case if diagnosed with SLE by a rheumatologist, nephrologist, or dermatologist. Positive predictive values (PPVs) and sensitivity were calculated for combinations of code counts of the SLE ICD-9 code, a positive antinuclear antibody (ANA), ever use of medications, and a keyword of "lupus" in the problem list. The algorithms with the highest PPV were each internally validated using a random set of 100 individuals from the remaining 5,759 subjects. RESULTS The algorithm with the highest PPV at 95% in the training set and 91% in the validation set was 3 or more counts of the SLE ICD-9 code, ANA positive (≥1:40), and ever use of both disease-modifying antirheumatic drugs and steroids, while excluding individuals with systemic sclerosis and dermatomyositis ICD-9 codes. CONCLUSION We developed and validated the first EHR algorithm that incorporates laboratory values and medications with the SLE ICD-9 code to identify patients with SLE accurately.
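The best-performing rule described in this abstract can be written down as a short predicate, and PPV and sensitivity then follow from comparing it with chart review. The sketch below is only an illustration under stated assumptions: the `Patient` fields, the encoding of the ANA titer as a reciprocal integer, and the four toy records are hypothetical and are not taken from the paper or from Vanderbilt's Synthetic Derivative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # All field names are hypothetical; they mirror the criteria in the abstract.
    sle_code_count: int          # number of SLE ICD-9 (710.0) codes on record
    ana_titer: Optional[int]     # reciprocal titer (40 means 1:40); None if never tested
    ever_dmard: bool             # ever used a disease-modifying antirheumatic drug
    ever_steroid: bool           # ever used steroids
    has_exclusion_code: bool     # systemic sclerosis or dermatomyositis ICD-9 code
    chart_review_case: bool      # reference standard: specialist-diagnosed SLE


def meets_algorithm(p: Patient) -> bool:
    """>=3 SLE codes, ANA >= 1:40, both drug classes, no exclusion codes."""
    ana_positive = p.ana_titer is not None and p.ana_titer >= 40
    return (
        p.sle_code_count >= 3
        and ana_positive
        and p.ever_dmard
        and p.ever_steroid
        and not p.has_exclusion_code
    )


def ppv_and_sensitivity(patients):
    """Compare algorithm output with chart review; return (PPV, sensitivity)."""
    tp = sum(1 for p in patients if meets_algorithm(p) and p.chart_review_case)
    fp = sum(1 for p in patients if meets_algorithm(p) and not p.chart_review_case)
    fn = sum(1 for p in patients if not meets_algorithm(p) and p.chart_review_case)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity


# Toy records, invented purely to exercise the functions.
patients = [
    Patient(4, 160, True, True, False, True),    # flagged, true case
    Patient(3, 40, True, True, False, False),    # flagged, not a case
    Patient(5, 320, True, True, True, False),    # removed by exclusion code
    Patient(1, 80, True, False, False, True),    # true case missed by the rule
]
ppv, sensitivity = ppv_and_sensitivity(patients)
```

Requiring every criterion at once favors PPV over sensitivity, which matches the abstract's stated goal of selecting the algorithm with the highest PPV.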
Affiliation(s)
- April Barnado
  - Vanderbilt University Medical Center, Nashville, Tennessee
- Carolyn Casey
  - Vanderbilt University Medical Center, Nashville, Tennessee
- Lee Wheless
  - Vanderbilt University Medical Center, Nashville, Tennessee
- Joshua C Denny
  - Vanderbilt University Medical Center, Nashville, Tennessee
89
Vogel MME, Combs SE, Kessel KA. mHealth and Application Technology Supporting Clinical Trials: Today's Limitations and Future Perspective of smartRCTs. Front Oncol 2017; 7:37. [PMID: 28348978 PMCID: PMC5346562 DOI: 10.3389/fonc.2017.00037] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 01/05/2017] [Accepted: 02/27/2017] [Indexed: 11/13/2022]
Abstract
Nowadays, applications (apps) for smartphones and tablets have become indispensable, especially for younger generations. The estimated number of mobile devices will exceed 2.16 billion in 2016. Over 2.2 million apps are available in the Google Play store®, and about 1.8 million apps are available in the Apple App Store®. Google and Apple distribute nearly 70,000 apps each in the category Health and Fitness, and about 33,000 and 46,000, respectively, in medical apps. The willingness to use mHealth apps appears to be high, and the intention to share data for health research exists. This leads to one conclusion: the time for app-accompanied clinical trials (smartRCTs) has come. In this perspective article, we point out the obstacles encountered when trying to implement apps in clinical research. Further, we try to offer a glimpse of what the future of smartRCT research may hold.
Affiliation(s)
- Marco M E Vogel
  - Department of Radiation Oncology, Technische Universität München (TUM), Munich, Germany; Institute for Innovative Radiotherapy, Helmholtz Zentrum München, Neuherberg, Germany
- Stephanie E Combs
  - Department of Radiation Oncology, Technische Universität München (TUM), Munich, Germany; Institute for Innovative Radiotherapy, Helmholtz Zentrum München, Neuherberg, Germany
- Kerstin A Kessel
  - Department of Radiation Oncology, Technische Universität München (TUM), Munich, Germany; Institute for Innovative Radiotherapy, Helmholtz Zentrum München, Neuherberg, Germany
90
Chase HS, Mitrani LR, Lu GG, Fulgieri DJ. Early recognition of multiple sclerosis using natural language processing of the electronic health record. BMC Med Inform Decis Mak 2017; 17:24. [PMID: 28241760 PMCID: PMC5329909 DOI: 10.1186/s12911-017-0418-4] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 10/30/2016] [Accepted: 02/10/2017] [Indexed: 12/31/2022]
Abstract
BACKGROUND Diagnostic accuracy might be improved by algorithms that search patients' clinical notes in the electronic health record (EHR) for signs and symptoms of diseases such as multiple sclerosis (MS). The focus of this study was to determine whether patients with MS could be identified from their clinical notes prior to the initial recognition by their healthcare providers. METHODS An MS-enriched cohort of patients with well-established MS (n = 165) and controls (n = 545) was generated from the adult outpatient clinic. A random sample cohort was generated from randomly selected patients (n = 2289) from the same adult outpatient clinic, some of whom had MS (n = 16). Patients' notes were extracted from the data warehouse, and signs and symptoms were mapped to UMLS terms using MedLEE. Approximately 1000 MS-related terms occurred significantly more frequently in MS patients' notes than in controls'. Synonymous terms were manually clustered into 50 buckets and used as classification features. Patients were classified as MS or not using Naïve Bayes classification. RESULTS Classification of patients known to have MS using notes of the MS-enriched cohort entered after the initial ICD9[MS] code yielded an ROC AUC, sensitivity, and specificity of 0.90 [0.87-0.93], 0.75 [0.66-0.82], and 0.91 [0.87-0.93], respectively. Similar classification accuracy was achieved using the notes from the random sample cohort. Classification of patients not yet known to have MS using notes of the MS-enriched cohort entered before the initial ICD9[MS] documentation identified 40% [23-59%] as having MS. Manual review of the EHR of 45 patients of the random sample cohort classified as having MS but lacking an ICD9[MS] code identified four who might have unrecognized MS. CONCLUSIONS Diagnostic accuracy might be improved by mining patients' clinical notes for signs and symptoms of specific diseases using NLP. Using this approach, we identified patients with MS early in the course of their disease, which could potentially shorten the time to diagnosis. This approach could also be applied to other diseases often missed by primary care providers, such as cancer. Whether implementing computerized diagnostic support ultimately shortens the time from earliest symptoms to formal recognition of the disease remains to be seen.
Affiliation(s)
- Herbert S Chase
  - Department of Biomedical Informatics, Columbia University Medical Center, PH-20, 622 West 168th street, New York, NY, 10032, USA
- Lindsey R Mitrani
  - Department of Biomedical Informatics, Columbia University Medical Center, PH-20, 622 West 168th street, New York, NY, 10032, USA
- Gabriel G Lu
  - Department of Biomedical Informatics, Columbia University Medical Center, PH-20, 622 West 168th street, New York, NY, 10032, USA
- Dominick J Fulgieri
  - Department of Biomedical Informatics, Columbia University Medical Center, PH-20, 622 West 168th street, New York, NY, 10032, USA
91
Heit JA, Armasu SM, McCauley BM, Kullo IJ, Sicotte H, Pathak J, Chute CG, Gottesman O, Bottinger EP, Denny JC, Roden DM, Li R, Ritchie MD, de Andrade M. Identification of unique venous thromboembolism-susceptibility variants in African-Americans. Thromb Haemost 2017; 117:758-768. [PMID: 28203683 DOI: 10.1160/th16-08-0652] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 08/19/2016] [Accepted: 01/12/2017] [Indexed: 12/30/2022]
Abstract
To identify novel single nucleotide polymorphisms (SNPs) associated with venous thromboembolism (VTE) in African-Americans (AAs), we performed a genome-wide association study (GWAS) of VTE in AAs using the Electronic Medical Records and Genomics (eMERGE) Network, comprised of seven sites each with DNA biobanks (total ~39,200 unique DNA samples) with genome-wide SNP data (imputed to 1000 Genomes Project cosmopolitan reference panel) and linked to electronic health records (EHRs). Using a validated EHR-driven phenotype extraction algorithm, we identified VTE cases and controls and tested for an association between each SNP and VTE using unconditional logistic regression, adjusted for age, sex, stroke, site-platform combination and sickle cell risk genotype. Among 393 AA VTE cases and 4,941 AA controls, three intragenic SNPs reached genome-wide significance: LEMD3 rs138916004 (OR=3.2; p=1.3E-08), LY86 rs3804476 (OR=1.8; p=2E-08) and LOC100130298 rs142143628 (OR=4.5; p=4.4E-08); all three SNPs validated using internal cross-validation, parametric bootstrap and meta-analysis methods. LEMD3 rs138916004 and LOC100130298 rs142143628 are only present in Africans (1000G data). LEMD3 showed a significant differential expression in both NCBI Gene Expression Omnibus (GEO) and the Mayo Clinic gene expression data, LOC100130298 showed a significant differential expression only in the GEO expression data, and LY86 showed a significant differential expression only in the Mayo expression data. LEMD3 encodes for an antagonist of TGF-β-induced cell proliferation arrest. LY86 encodes for MD-1 which down-regulates the pro-inflammatory response to lipopolysaccharide; LY86 variation was previously associated with VTE in white women; LOC100130298 is a non-coding RNA gene with unknown regulatory activity in gene expression and epigenetics.
Affiliation(s)
- John A Heit
  - John A. Heit, MD, Stabile 6-Hematology Research, Mayo Clinic, 200 First Street, SW, Rochester, MN 55905, USA, Tel.: +1 507 284 4634, Fax: +1 507 266 9302, E-mail:
92
Duan R, Cao M, Wu Y, Huang J, Denny JC, Xu H, Chen Y. An Empirical Study for Impacts of Measurement Errors on EHR based Association Studies. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2017; 2016:1764-1773. [PMID: 28269935 PMCID: PMC5333313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
Over the last decade, Electronic Health Records (EHR) systems have been increasingly implemented at US hospitals. Despite their great potential, the complex and uneven nature of clinical documentation and data quality brings additional challenges for analyzing EHR data. A critical challenge is the information bias due to the measurement errors in outcome and covariates. We conducted empirical studies to quantify the impacts of the information bias on association study. Specifically, we designed our simulation studies based on the characteristics of the Electronic Medical Records and Genomics (eMERGE) Network. Through simulation studies, we quantified the loss of power due to misclassifications in case ascertainment and measurement errors in covariate status extraction, with respect to different levels of misclassification rates, disease prevalence, and covariate frequencies. These empirical findings can inform investigators for better understanding of the potential power loss due to misclassification and measurement errors under a variety of conditions in EHR based association studies.
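The abstract quantifies power loss by simulation, and the simulation design is not given here. As a rough, deterministic illustration of why outcome misclassification costs power, the sketch below shows how nondifferential errors in case ascertainment pull an observed odds ratio toward 1. The disease rates, sensitivity, and specificity are hypothetical values chosen for the example, not figures from the paper or from eMERGE.

```python
def observed_case_rate(true_rate, sensitivity, specificity):
    """Probability of being labeled a case when the phenotype label is imperfect.

    True cases are labeled correctly with probability `sensitivity`;
    true non-cases are mislabeled as cases with probability 1 - specificity.
    """
    return sensitivity * true_rate + (1.0 - specificity) * (1.0 - true_rate)


def odds_ratio(rate_exposed, rate_unexposed):
    """Odds ratio comparing the exposed group with the unexposed group."""
    def odds(rate):
        return rate / (1.0 - rate)
    return odds(rate_exposed) / odds(rate_unexposed)


# Hypothetical inputs: true disease rates in exposed and unexposed groups,
# and the accuracy of the EHR case-ascertainment algorithm.
true_exposed, true_unexposed = 0.20, 0.10
sensitivity, specificity = 0.80, 0.95

true_or = odds_ratio(true_exposed, true_unexposed)
observed_or = odds_ratio(
    observed_case_rate(true_exposed, sensitivity, specificity),
    observed_case_rate(true_unexposed, sensitivity, specificity),
)
print(round(true_or, 2), round(observed_or, 2))
```

With these assumed inputs the odds ratio shrinks from 2.25 to 1.75. A smaller apparent effect needs a larger sample to detect, which is one route by which misclassification reduces power.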
Affiliation(s)
- Rui Duan
  - Perelman School of Medicine, The University of Pennsylvania, Philadelphia, PA, USA
- Ming Cao
  - School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Yonghui Wu
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Jing Huang
  - School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Joshua C Denny
  - Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee, USA; Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Hua Xu
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Yong Chen
  - Perelman School of Medicine, The University of Pennsylvania, Philadelphia, PA, USA
93
Kolek MJ, Graves AJ, Xu M, Bian A, Teixeira PL, Shoemaker MB, Parvez B, Xu H, Heckbert SR, Ellinor PT, Benjamin EJ, Alonso A, Denny JC, Moons KGM, Shintani AK, Harrell FE, Roden DM, Darbar D. Evaluation of a Prediction Model for the Development of Atrial Fibrillation in a Repository of Electronic Medical Records. JAMA Cardiol 2016; 1:1007-1013. [PMID: 27732699 DOI: 10.1001/jamacardio.2016.3366] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Indexed: 01/30/2023]
Abstract
Importance Atrial fibrillation (AF) contributes to substantial morbidity, mortality, and health care expenditures. Accurate prediction of incident AF would enhance AF management and potentially improve patient outcomes. Objective To validate the AF risk prediction model originally developed by the Cohorts for Heart and Aging Research in Genomic Epidemiology-Atrial Fibrillation (CHARGE-AF) investigators using a large repository of electronic medical records (EMRs). Design, Setting, and Participants In this prediction model study, deidentified EMRs of 33 494 individuals 40 years or older who were white or African American and had no history of AF were reviewed and analyzed. The participants were followed up in the internal medicine outpatient clinics at Vanderbilt University Medical Center for incident AF from December 31, 2005, until December 31, 2010. Adjusting for differences in baseline hazard, the CHARGE-AF Cox proportional hazards model regression coefficients were applied to the EMR cohort. A simple version of the model with no echocardiographic variables was also evaluated. Data were analyzed from October 31, 2013, to January 31, 2014. Main Outcomes and Measures Incident AF. Predictors in the model included age, race, height, weight, systolic and diastolic blood pressure, treatment for hypertension, smoking status, type 2 diabetes, heart failure, history of myocardial infarction, left ventricular hypertrophy, and PR interval. Results Among the 33 494 participants, the median age was 57 (interquartile range, 49-67) years; 57% of patients were women, 43% were men, 85.7% were white, and 14.3% were African American. During the mean (SD) follow-up of 4.8 (0.9) years, 2455 individuals (7.3%) developed AF. Both models had poor calibration in the EMR cohort, with underprediction of AF among low-risk individuals and overprediction of AF among high-risk individuals (10th and 90th percentiles for predicted probability of incident AF, 0.005 and 0.179, respectively). 
The full CHARGE-AF model had a C index of 0.708 (95% CI, 0.699-0.718) in our cohort. The simple model had similar discrimination (C index, 0.709; 95% CI, 0.699-0.718; P = .70 for difference between models). Conclusions and Relevance Despite reasonable discrimination, the CHARGE-AF models showed poor calibration in this EMR cohort. This study highlights the difficulties of applying a risk model derived from prospective cohort studies to an EMR cohort and suggests that these AF risk prediction models be used with caution in the EMR setting. Future risk models may need to be developed and validated within EMR cohorts.
Affiliation(s)
- Matthew J Kolek
  - Division of Cardiovascular Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee
- Amy J Graves
  - Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, Tennessee
- Meng Xu
  - Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, Tennessee
- Aihua Bian
  - Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, Tennessee
- Pedro Luis Teixeira
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee
- M Benjamin Shoemaker
  - Division of Cardiovascular Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee
- Babar Parvez
  - Division of Cardiovascular Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee
- Hua Xu
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston
- Emelia J Benjamin
  - Framingham Heart Study, National Heart Lung and Blood Institute and Boston University, Framingham, Massachusetts
  - Department of Medicine, Boston University School of Medicine, Boston, Massachusetts
- Alvaro Alonso
  - Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, Georgia
- Joshua C Denny
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee
- Karel G M Moons
  - Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, Tennessee
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, the Netherlands
- Frank E Harrell
  - Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, Tennessee
- Dan M Roden
  - Division of Cardiovascular Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee
- Dawood Darbar
  - Division of Cardiovascular Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee
  - Division of Cardiology, University of Illinois at Chicago
94
Kirby JC, Speltz P, Rasmussen LV, Basford M, Gottesman O, Peissig PL, Pacheco JA, Tromp G, Pathak J, Carrell DS, Ellis SB, Lingren T, Thompson WK, Savova G, Haines J, Roden DM, Harris PA, Denny JC. PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability. J Am Med Inform Assoc 2016; 23:1046-1052. [PMID: 27026615 PMCID: PMC5070514 DOI: 10.1093/jamia/ocv202] [Citation(s) in RCA: 246] [Impact Index Per Article: 27.3] [Received: 08/19/2015] [Revised: 10/27/2015] [Accepted: 11/25/2015] [Indexed: 01/29/2023]
Abstract
OBJECTIVE Health care generated data have become an important source for clinical and genomic research. Often, investigators create and iteratively refine phenotype algorithms to achieve high positive predictive values (PPVs) or sensitivity, thereby identifying valid cases and controls. These algorithms achieve the greatest utility when validated and shared by multiple health care systems. MATERIALS AND METHODS We report the current status and impact of the Phenotype KnowledgeBase (PheKB, http://phekb.org), an online environment supporting the workflow of building, sharing, and validating electronic phenotype algorithms. We analyze the most frequent components used in algorithms and their performance at authoring institutions and secondary implementation sites. RESULTS As of June 2015, PheKB contained 30 finalized phenotype algorithms and 62 algorithms in development spanning a range of traits and diseases. Phenotypes have had over 3500 unique views in a 6-month period and have been reused by other institutions. International Classification of Disease codes were the most frequently used component, followed by medications and natural language processing. Among algorithms with published performance data, the median PPV was nearly identical when evaluated at the authoring institutions (n = 44; case 96.0%, control 100%) compared to implementation sites (n = 40; case 97.5%, control 100%). DISCUSSION These results demonstrate that a broad range of algorithms to mine electronic health record data from different health systems can be developed with high PPV, and algorithms developed at one site are generally transportable to others. CONCLUSION By providing a central repository, PheKB enables improved development, transportability, and validity of algorithms for research-grade phenotypes using health care generated data.
Affiliation(s)
| | - Peter Speltz
- Vanderbilt University Medical Center, Nashville, TN, USA
| | - Luke V Rasmussen
- Northwestern University, Feinberg School of Medicine, Chicago, IL, USA
| | | | - Omri Gottesman
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | | | | | | | | | | | | | - Todd Lingren
- Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Will K Thompson
- Northwestern University, Feinberg School of Medicine, Chicago, IL, USA
| | - Guergana Savova
- Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
| | | | - Dan M Roden
- Vanderbilt University Medical Center, Nashville, TN, USA
| | - Paul A Harris
- Vanderbilt University Medical Center, Nashville, TN, USA
| | - Joshua C Denny
- Vanderbilt University Medical Center, Nashville, TN, USA
|
95
|
Restrepo NA, Butkiewicz M, McGrath JA, Crawford DC. Shared Genetic Etiology of Autoimmune Diseases in Patients from a Biorepository Linked to De-identified Electronic Health Records. Front Genet 2016; 7:185. [PMID: 27812365 PMCID: PMC5071319 DOI: 10.3389/fgene.2016.00185] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 06/14/2016] [Accepted: 10/03/2016] [Indexed: 01/15/2023] Open
Abstract
Autoimmune diseases represent a significant medical burden affecting up to 5–8% of the U.S. population. While genetics is known to play a role, studies of common autoimmune diseases are complicated by phenotype heterogeneity, limited sample sizes, and a single disease approach. Here we performed a targeted genetic association study for cases of multiple sclerosis (MS), rheumatoid arthritis (RA), and Crohn's disease (CD) to assess which common genetic variants contribute individually and pleiotropically to disease risk. Joint modeling and pathway analysis combining the three phenotypes were performed to identify common underlying mechanisms of risk of autoimmune conditions. European American cases of MS, RA, and CD (n = 119, 53, and 129, respectively) and 1924 controls were identified using de-identified electronic health records (EHRs) through a combination of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) billing codes, Current Procedural Terminology (CPT) codes, medication lists, and text matching. As expected, hallmark SNPs in MS, such as DQA1 rs9271366 (OR = 1.91; p = 0.008), replicated in the present study. Both MS and CD were associated with TIMMDC1 rs2293370 (OR = 0.27, p = 0.01; OR = 0.25, p = 0.02; respectively). Additionally, PDE2A rs3781913 was significantly associated with both CD and RA (OR = 0.46, p = 0.02; OR = 0.32, p = 0.02; respectively). Joint modeling and pathway analysis identified variants within the KEGG NOD-like receptor signaling pathway and Shigellosis pathway as being correlated with the combined autoimmune phenotype. Our study replicated previously reported genetic associations for MS and CD in a population derived from de-identified EHRs. We found evidence to support a shared genetic etiology between CD/MS and CD/RA outside of the major histocompatibility complex region and identified KEGG pathways indicative of a bacterial pathogenesis risk for autoimmunity in a joint model. Future work to elucidate this shared etiology will be key in the development of risk models as envisioned in the era of precision medicine.
Affiliation(s)
- Nicole A Restrepo
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, USA
| | - Mariusz Butkiewicz
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, USA
| | - Josephine A McGrath
- Vanderbilt Eye Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Dana C Crawford
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, USA; Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, USA
|
96
|
Li M, Wei C, Wen Y, Wang T, Lu Q. Detecting Gene-Gene Interactions Associated with Multiple Complex Traits with U-Statistics. Curr Genomics 2016; 17:403-415. [PMID: 28479869 PMCID: PMC5320542 DOI: 10.2174/1389202917666160513100946] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/11/2015] [Revised: 05/26/2015] [Accepted: 06/06/2015] [Indexed: 12/02/2022] Open
Abstract
Many complex diseases, such as psychiatric and behavioral disorders, are commonly characterized through various measurements that reflect physical, behavioral and psychological aspects of diseases. While it remains a great challenge to find a unified measurement to characterize a disease, the available multiple phenotypes can be analyzed jointly in the genetic association study. Simultaneously testing these phenotypes has many advantages, including considering different aspects of the disease in the analysis, and utilizing correlated phenotypes to improve the power of detecting disease-associated variants. Furthermore, complex diseases are likely caused by the interplay of multiple genetic variants through complicated mechanisms. Considering gene-gene interactions in the joint association analysis of complex diseases could further increase our ability to discover genetic variants involving complex disease pathways. In this article, we propose a stepwise U-test for joint association analysis of multiple loci and multiple phenotypes. Through simulations, we demonstrated that testing multiple phenotypes simultaneously could attain higher power than testing one single phenotype at a time, especially when there are shared genes contributing to multiple phenotypes. We also illustrated the proposed method with an application to Nicotine Dependence (ND), using datasets from the Study of Addiction, Genetics and Environment (SAGE). The joint analysis of three ND phenotypes identified two SNPs, rs10508649 and rs2491397, and reached a nominal P-value of 3.79e-13. The association was further replicated in two independent datasets with P-values of 2.37e-05 and 7.46e-05.
Affiliation(s)
| | | | | | | | - Qing Lu
- Address correspondence to this author at the Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, P.R. China; Tel: 517.353.8623 x137; Fax: 517.432.1130
|
97
|
Manolio TA. Implementing genomics and pharmacogenomics in the clinic: The National Human Genome Research Institute's genomic medicine portfolio. Atherosclerosis 2016; 253:225-236. [PMID: 27612677 PMCID: PMC5064852 DOI: 10.1016/j.atherosclerosis.2016.08.034] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 05/31/2016] [Revised: 08/19/2016] [Accepted: 08/23/2016] [Indexed: 01/08/2023]
Abstract
Increasing knowledge about the influence of genetic variation on human health and growing availability of reliable, cost-effective genetic testing have spurred the implementation of genomic medicine in the clinic. As defined by the National Human Genome Research Institute (NHGRI), genomic medicine uses an individual's genetic information in his or her clinical care, and has begun to be applied effectively in areas such as cancer genomics, pharmacogenomics, and rare and undiagnosed diseases. In 2011 NHGRI published its strategic vision for the future of genomic research, including an ambitious research agenda to facilitate and promote the implementation of genomic medicine. To realize this agenda, NHGRI is consulting and facilitating collaborations with the external research community through a series of "Genomic Medicine Meetings," under the guidance and leadership of the National Advisory Council on Human Genome Research. These meetings have identified and begun to address significant obstacles to implementation, such as lack of evidence of efficacy, limited availability of genomics expertise and testing, lack of standards, and difficulties in integrating genomic results into electronic medical records. The six research and dissemination initiatives comprising NHGRI's genomic research portfolio are designed to speed the evaluation and incorporation, where appropriate, of genomic technologies and findings into routine clinical care. Actual adoption of successful approaches in clinical care will depend upon the willingness, interest, and energy of professional societies, practitioners, patients, and payers to promote their responsible use and share their experiences in doing so.
Affiliation(s)
- Teri A Manolio
- Division of Genomic Medicine, National Human Genome Research Institute, 5635 Fishers Lane, Room 4113, MSC 9305, Bethesda MD, USA.
|
98
|
Denny JC, Bastarache L, Roden DM. Phenome-Wide Association Studies as a Tool to Advance Precision Medicine. Annu Rev Genomics Hum Genet 2016; 17:353-73. [PMID: 27147087 PMCID: PMC5480096 DOI: 10.1146/annurev-genom-090314-024956] [Citation(s) in RCA: 164] [Impact Index Per Article: 18.2] [Indexed: 12/18/2022]
Abstract
Beginning in the early 2000s, the accumulation of biospecimens linked to electronic health records (EHRs) made possible genome-phenome studies (i.e., comparative analyses of genetic variants and phenotypes) using only data collected as a by-product of typical health care. In addition to disease and trait genetics, EHRs proved a valuable resource for analyzing pharmacogenetic traits and developing reverse genetics approaches such as phenome-wide association studies (PheWASs). PheWASs are designed to survey which of many phenotypes may be associated with a given genetic variant. PheWAS methods have been validated through replication of hundreds of known genotype-phenotype associations, and their use has differentiated between true pleiotropy and clinical comorbidity, added context to genetic discoveries, and helped define disease subtypes, and may also help repurpose medications. PheWAS methods have also proven to be useful with research-collected data. Future efforts that integrate broad, robust collection of phenotype data (e.g., EHR data) with purpose-collected research data in combination with a greater understanding of EHR data will create a rich resource for increasingly more efficient and detailed genome-phenome analysis to usher in new discoveries in precision medicine.
Affiliation(s)
- Joshua C Denny
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee 37203;
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee 37232
| | - Lisa Bastarache
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee 37203;
| | - Dan M Roden
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee 37203;
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee 37232
- Department of Pharmacology, Vanderbilt University School of Medicine, Nashville, Tennessee 37232
|
99
|
Joshi AD, Andersson C, Buch S, Stender S, Noordam R, Weng LC, Weeke PE, Auer PL, Boehm B, Chen C, Choi H, Curhan G, Denny JC, De Vivo I, Eicher JD, Ellinghaus D, Folsom AR, Fuchs C, Gala M, Haessler J, Hofman A, Hu F, Hunter DJ, Janssen HL, Kang JH, Kooperberg C, Kraft P, Kratzer W, Lieb W, Lutsey PL, Murad SD, Nordestgaard BG, Pasquale LR, Reiner AP, Ridker PM, Rimm E, Rose LM, Shaffer CM, Schafmayer C, Tamimi RM, Uitterlinden AG, Völker U, Völzke H, Wakabayashi Y, Wiggs JL, Zhu J, Roden DM, Stricker BH, Tang W, Teumer A, Hampe J, Tybjærg-Hansen A, Chasman DI, Chan AT, Johnson AD. Four Susceptibility Loci for Gallstone Disease Identified in a Meta-analysis of Genome-Wide Association Studies. Gastroenterology 2016; 151:351-363.e28. [PMID: 27094239 PMCID: PMC4959966 DOI: 10.1053/j.gastro.2016.04.007] [Citation(s) in RCA: 69] [Impact Index Per Article: 7.7] [Received: 07/07/2015] [Revised: 04/06/2016] [Accepted: 04/07/2016] [Indexed: 01/01/2023]
Abstract
BACKGROUND & AIMS A genome-wide association study (GWAS) of 280 cases identified the hepatic cholesterol transporter ABCG8 as a locus associated with risk for gallstone disease, but findings have not been reported from any other GWAS of this phenotype. We performed a large-scale meta-analysis of GWASs of individuals of European ancestry with available prior genotype data, to identify additional genetic risk factors for gallstone disease. METHODS We obtained per-allele odds ratio (OR) and standard error estimates using age- and sex-adjusted logistic regression models within each of the 10 discovery studies (8720 cases and 55,152 controls). We performed an inverse variance weighted, fixed-effects meta-analysis of study-specific estimates to identify single-nucleotide polymorphisms that were associated independently with gallstone disease. Associations were replicated in 6489 cases and 62,797 controls. RESULTS We observed independent associations for 2 single-nucleotide polymorphisms at the ABCG8 locus: rs11887534 (OR, 1.69; 95% confidence interval [CI], 1.54-1.86; P = 2.44 × 10^-60) and rs4245791 (OR, 1.27; P = 1.90 × 10^-34). We also identified and/or replicated associations for rs9843304 in TM4SF4 (OR, 1.12; 95% CI, 1.08-1.16; P = 6.09 × 10^-11), rs2547231 in SULT2A1 (encodes a sulfoconjugation enzyme that acts on hydroxysteroids and cholesterol-derived sterol bile acids) (OR, 1.17; 95% CI, 1.12-1.21; P = 2.24 × 10^-10), rs1260326 in glucokinase regulatory protein (OR, 1.12; 95% CI, 1.07-1.17; P = 2.55 × 10^-10), and rs6471717 near CYP7A1 (encodes an enzyme that catalyzes conversion of cholesterol to primary bile acids) (OR, 1.11; 95% CI, 1.08-1.15; P = 8.84 × 10^-9). Among individuals of African American and Hispanic American ancestry, rs11887534 and rs4245791 were associated positively with gallstone disease risk, whereas the association for the rs1260326 variant was inverse. CONCLUSIONS In this large-scale GWAS of gallstone disease, we identified 4 loci in genes that have putative functions in cholesterol metabolism and transport, and sulfonylation of bile acids or hydroxysteroids.
Affiliation(s)
- Amit D. Joshi
- Program in Genetic Epidemiology and Statistical Genetics, Harvard School of Public Health, Boston, MA; Division of Gastroenterology, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA; Clinical and Translational Epidemiology Unit, Massachusetts General Hospital, Boston, MA. To whom correspondence should be addressed: Amit D. Joshi, MBBS, PhD, Clinical and Translational Epidemiology Unit, Division of Gastroenterology, Massachusetts General Hospital, 55 Fruit Street, Boston, Massachusetts 02114, USA. Tel: +1 617 724 7558; Charlotte Andersson, MD, PhD, The Framingham Heart Study, 73 Mt Wayte Avenue, Framingham, Massachusetts 01702, USA; Andrew T. Chan, MD, MPH, Massachusetts General Hospital and Harvard Medical School, Clinical and Translational Epidemiology Unit, Division of Gastroenterology, GRJ-825C, Boston, Massachusetts 02114, USA. Tel: +1 617 724 0283; Fax: +1 617 726 3673; Andrew D. Johnson, PhD, Division of Intramural Research, National Heart, Lung and Blood Institute, Cardiovascular Epidemiology and Human Genomics Branch, The Framingham Heart Study, 73 Mt. Wayte Ave., Suite #2, Framingham, MA, 01702, USA. Tel: +1 508 663 4082; Fax: +1 508 626 1262
| | - Charlotte Andersson
- The National Heart, Lung, and Blood Institute's Framingham Heart Study, Framingham, Massachusetts.
| | - Stephan Buch
- Medical Department 1, University Hospital Dresden, TU Dresden, Dresden, Germany
| | - Stefan Stender
- Department of Clinical Biochemistry, Rigshospitalet, Copenhagen, Denmark
| | - Raymond Noordam
- Department of Internal Medicine, Erasmus Medical Center, Rotterdam, the Netherlands; Department of Epidemiology, Erasmus Medical Center, Rotterdam, the Netherlands
| | - Lu-Chen Weng
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, MN
| | - Peter E. Weeke
- Department of Medicine, Vanderbilt University, Nashville, TN; Department of Cardiology, The Heart Centre, Rigshospitalet, Copenhagen University Hospital, Denmark
| | - Paul L. Auer
- Joseph J. Zilber School of Public Health, University of Wisconsin, Milwaukee; Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA
| | - Bernhard Boehm
- Department of Internal Medicine I, Ulm University Hospital, Ulm, Germany
| | - Constance Chen
- Program in Genetic Epidemiology and Statistical Genetics, Harvard School of Public Health, Boston, MA
| | - Hyon Choi
- Division of Rheumatology, Allergy, and Immunology, Massachusetts General Hospital, Boston, MA
| | - Gary Curhan
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA; Renal Division, Department of Medicine, Brigham and Women’s Hospital, Boston, MA
| | - Joshua C. Denny
- Department of Medicine, Vanderbilt University, Nashville, TN; Department of Biomedical Informatics, Vanderbilt University, Nashville, TN
| | - Immaculata De Vivo
- Program in Genetic Epidemiology and Statistical Genetics, Harvard School of Public Health, Boston, MA; Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA; Department of Epidemiology, Harvard School of Public Health, Boston, MA
| | - John D. Eicher
- The National Heart, Lung, and Blood Institute’s Framingham Heart Study, Framingham, MA; Population Sciences Branch, National Heart, Lung, and Blood Institute, National Institutes of Health, Framingham, MA
| | - David Ellinghaus
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel, Germany
| | - Aaron R. Folsom
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, MN
| | - Charles Fuchs
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA; Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA
| | - Manish Gala
- Division of Gastroenterology, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA
| | - Jeffrey Haessler
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA
| | - Albert Hofman
- Department of Epidemiology, Erasmus Medical Center, Rotterdam, the Netherlands
| | - Frank Hu
- Department of Epidemiology, Harvard School of Public Health, Boston, MA; Department of Nutrition, Harvard School of Public Health, Boston, MA
| | - David J. Hunter
- Program in Genetic Epidemiology and Statistical Genetics, Harvard School of Public Health, Boston, MA; Department of Epidemiology, Harvard School of Public Health, Boston, MA
| | - Harry L.A. Janssen
- Department of Gastroenterology and Hepatology, Erasmus MC, Rotterdam, the Netherlands; Toronto Centre for Liver Disease, Toronto Western and General Hospital, University Health Network, Toronto, Canada
| | - Jae H. Kang
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
| | - Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA
| | - Peter Kraft
- Program in Genetic Epidemiology and Statistical Genetics, Harvard School of Public Health, Boston, MA; Department of Epidemiology, Harvard School of Public Health, Boston, MA
| | - Wolfgang Kratzer
- Department of Internal Medicine I, Ulm University Hospital, Ulm, Germany
| | - Wolfgang Lieb
- Institute of Epidemiology, Christian Albrechts Universität Kiel, Niemannsweg 11, Kiel, Germany
| | - Pamela L. Lutsey
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, MN
| | - Sarwa Darwish Murad
- Department of Gastroenterology and Hepatology, Erasmus MC, Rotterdam, the Netherlands
| | - Børge G. Nordestgaard
- The Copenhagen General Population Study and Department of Clinical Biochemistry, Herlev Hospital, Herlev, Denmark; Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Louis R. Pasquale
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA; Department of Ophthalmology, Harvard Medical School, Massachusetts Eye and Ear Infirmary, Boston, MA
| | - Alex P. Reiner
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA
| | - Paul M Ridker
- Division of Preventive Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
| | - Eric Rimm
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA; Department of Epidemiology, Harvard School of Public Health, Boston, MA; Department of Nutrition, Harvard School of Public Health, Boston, MA
| | - Lynda M. Rose
- Division of Preventive Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
| | | | - Clemens Schafmayer
- Department of General, Abdominal, Thoracic and Transplantation Surgery, University of Kiel, Kiel, Germany
| | - Rulla M. Tamimi
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA; Department of Epidemiology, Harvard School of Public Health, Boston, MA
| | - André G Uitterlinden
- Department of Internal Medicine, Erasmus Medical Center, Rotterdam, the Netherlands; Department of Epidemiology, Erasmus Medical Center, Rotterdam, the Netherlands
| | - Uwe Völker
- Department of Functional Genomics, Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Germany
| | - Henry Völzke
- Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany; German Center for Cardiovascular Research, Partner Site Greifswald; German Center for Diabetes Research, Site Greifswald
| | - Yoshiyuki Wakabayashi
- The National Heart, Lung, and Blood Institute, DNA Sequencing Core Laboratory, Bethesda, MD
| | - Janey L. Wiggs
- Department of Ophthalmology, Harvard Medical School, Massachusetts Eye and Ear Infirmary, Boston, MA
| | - Jun Zhu
- The National Heart, Lung, and Blood Institute, DNA Sequencing Core Laboratory, Bethesda, MD
| | - Dan M. Roden
- Department of Medicine, Vanderbilt University, Nashville, TN
| | - Bruno H. Stricker
- Department of Internal Medicine, Erasmus Medical Center, Rotterdam, the Netherlands; Department of Epidemiology, Erasmus Medical Center, Rotterdam, the Netherlands
| | - Weihong Tang
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, MN
| | - Alexander Teumer
- Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany
| | - Jochen Hampe
- Medical Department 1, University Hospital Dresden, TU Dresden, Dresden, Germany
| | - Anne Tybjærg-Hansen
- Department of Clinical Biochemistry, Rigshospitalet, Copenhagen, Denmark; Department of Clinical Biochemistry, Herlev Hospital, Herlev, Denmark
| | - Daniel I. Chasman
- Division of Preventive Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
| | - Andrew T. Chan
- Division of Gastroenterology, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA; Clinical and Translational Epidemiology Unit, Massachusetts General Hospital, Boston, MA; Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
| | - Andrew D. Johnson
- The National Heart, Lung, and Blood Institute’s Framingham Heart Study, Framingham, MA; Population Sciences Branch, National Heart, Lung, and Blood Institute, National Institutes of Health, Framingham, MA
|
100
|
Drake BF, Brown K, McGowan LD, Haslag-Minoff J, Kaphingst K. Secondary consent to biospecimen use in a prostate cancer biorepository. BMC Res Notes 2016; 9:346. [PMID: 27431491 PMCID: PMC4949745 DOI: 10.1186/s13104-016-2159-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/25/2016] [Accepted: 07/13/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Biorepository research has substantial societal benefits. This is one of the few studies to focus on male willingness to allow future research use of biospecimens. METHODS This study analyzed the future research consent questions from a prostate cancer biorepository study (N = 1931). The consent form asked two questions regarding use of samples in future studies (1) without and (2) with protected health information (PHI). Yes to both questions of use of samples was categorized as Yes-Always; Yes to without and No to with PHI was categorized as Yes-Conditional; No to without PHI was categorized as Never. We analyzed this outcome to determine significant predictors for consent to Yes-Always vs. Yes-Conditional. RESULTS 99.33% consented to future use of samples; 88.19% consented to future use without PHI, and among those men 10.2% consented to future use with PHI. Comparing Yes-Always and Yes-Conditional responses, bivariate analyses showed that race, family history, stage of cancer, and grade of cancer (Gleason) were significant at the α = 0.05 level. Using stepwise multivariable logistic regression, we found that African-American men were significantly more likely to respond Yes-Always when compared to White men (p < 0.001). Those with a family history of prostate cancer were significantly more likely to respond Yes-Always (p = 0.002). CONCLUSIONS There is general willingness to consent to future use of specimens without PHI among men.
Affiliation(s)
- Bettina F. Drake
- Division of Public Health Sciences, Washington University School of Medicine, 600 S. Taylor Ave, Campus Box 8100, St. Louis, MO 63110 USA
- Alvin J. Siteman Cancer Center, St. Louis, MO USA
| | - Katherine Brown
- Division of Public Health Sciences, Washington University School of Medicine, 600 S. Taylor Ave, Campus Box 8100, St. Louis, MO 63110 USA
| | | | | | - Kimberly Kaphingst
- Department of Communication, University of Utah, Salt Lake City, UT USA
- Huntsman Cancer Institute, Salt Lake City, UT USA
|