1
Lin YC, Zhang S, Vessels T, Bastarache L, Bejan CA, Hsie RS, Philips EJ, Ruderfer DM, Pulley JM, Edwards TL, Wells QS, Warner JL, Denny JC, Roden DM, Kang H, Xu Y. Overcome the Limitation of Phenome-Wide Association Studies (PheWAS): Extension of PheWAS to Efficient and Robust Large-Scale ICD Codes Analysis. medRxiv 2024:2024.04.15.24305098. [PMID: 38699370 PMCID: PMC11065011 DOI: 10.1101/2024.04.15.24305098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
Phenome-wide association studies (PheWAS) have become widely used for efficient, high-throughput evaluation of the relationship between a genetic factor and a large number of disease phenotypes, typically extracted from a DNA biobank linked with electronic medical records (EMR). Phecodes, billing code-derived disease case-control status, are usually used as outcome variables in PheWAS, and logistic regression has been the standard choice of analysis method. Since the clinical diagnoses in EMR are often inaccurate, with errors that can bias the odds ratio estimates, much effort has been put into accurately defining the cases and controls to ensure an accurate analysis. Specifically, in order to correctly classify controls in the population, an exclusion criteria list for each Phecode was manually compiled to obtain unbiased odds ratios. However, the accuracy of the list cannot be guaranteed without an extensive data curation process. The costly curation process limits the efficiency of large-scale analyses that take full advantage of all structured phenotypic information available in EMR. Here, we proposed to estimate relative risks (RR) instead. We first demonstrated, via a theoretical formula, the desirable property of RR that overcomes the inaccuracy in the controls. With simulation and real data application, we further confirmed that RR is unbiased without compiling exclusion criteria lists. With RR as estimates, we are able to efficiently extend PheWAS to a larger-scale, phenome construction agnostic analysis of phenotypes, using ICD 9/10 codes, which preserve much more disease-related clinical information than Phecodes.
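The contrast between RR and odds ratios under inaccurate controls can be illustrated with a small calculation. The sketch below is not the authors' derivation. It assumes one simple error model: true cases are missed (and so recorded as controls) with the same probability in carriers and non-carriers, and no non-case is ever recorded as a case. The function names and the numeric risks are hypothetical.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def observed_measures(risk_carrier, risk_noncarrier, sensitivity):
    """Return (RR, OR) computed from recorded diagnoses.

    Assumed error model (illustrative only): each true case is recorded
    with probability `sensitivity` regardless of genotype, and non-cases
    are never recorded as cases.
    """
    # Recorded risks shrink by the same factor in both genotype groups.
    recorded_carrier = sensitivity * risk_carrier
    recorded_noncarrier = sensitivity * risk_noncarrier
    rr = recorded_carrier / recorded_noncarrier  # the shared factor cancels
    or_ = odds(recorded_carrier) / odds(recorded_noncarrier)  # it does not cancel here
    return rr, or_


if __name__ == "__main__":
    # Hypothetical true risks: 30% in carriers, 15% in non-carriers.
    true_rr, true_or = observed_measures(0.30, 0.15, sensitivity=1.0)
    seen_rr, seen_or = observed_measures(0.30, 0.15, sensitivity=0.6)
    print(f"RR: true {true_rr:.3f}, recorded {seen_rr:.3f}")
    print(f"OR: true {true_or:.3f}, recorded {seen_or:.3f}")
```

With these hypothetical inputs the recorded RR stays at 2.0, while the recorded odds ratio moves from about 2.43 to about 2.22. Other error models (for example, imperfect specificity) would behave differently.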
Affiliation(s)
- Ya-Chen Lin
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN
- Siwei Zhang
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN
- Tess Vessels
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
- Lisa Bastarache
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Cosmin Adrian Bejan
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Ryan S Hsie
  - Department of Urology, Vanderbilt University Medical Center, Nashville, TN
- Elizabeth J Philips
  - Center for Drug Safety and Immunology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Doug M Ruderfer
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Jill M Pulley
  - Department of Allergy, Pulmonary and Critical Care Medicine, Vanderbilt University School of Medicine, Nashville, TN
- Todd L Edwards
  - Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Quinn S Wells
  - Department of Cardiovascular Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Jeremy L Warner
  - Division of Hematology and Oncology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Joshua C Denny
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Dan M Roden
  - Department of Pharmacology, Vanderbilt University Medical Center, Nashville, TN, USA
- Hakmook Kang
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN
- Yaomin Xu
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
2
Lo Barco T, Garcelon N, Neuraz A, Nabbout R. Natural history of rare diseases using natural language processing of narrative unstructured electronic health records: The example of Dravet syndrome. Epilepsia 2024; 65:350-361. [PMID: 38065926 DOI: 10.1111/epi.17855] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/04/2023] [Revised: 12/07/2023] [Accepted: 12/07/2023] [Indexed: 12/31/2023]
Abstract
OBJECTIVE The increasing implementation of electronic health records allows the use of advanced text-mining methods for establishing new patient phenotypes and stratification, and for revealing outcome correlations. In this study, we aimed to explore the electronic narrative clinical reports of a cohort of patients with Dravet syndrome (DS) longitudinally followed at our center, to identify the capacity of this methodology to retrace the natural history of DS during the early years. METHODS We used a document-based clinical data warehouse employing natural language processing to recognize the phenotype concepts in the narrative medical reports. We included patients with DS who had a medical report produced before the age of 2 years and a follow-up after the age of 3 years ("DS cohort," 56 individuals). We selected two control populations, a "general control cohort" (275 individuals) and a "neurological control cohort" (281 individuals), with similar characteristics in terms of gender, number of reports, and age at last report. To find concepts specifically associated with DS, we performed a phenome-wide association study using Cox regression, comparing the reports of the three cohorts. We then performed a qualitative analysis of the surviving concepts based on their median age at first appearance. RESULTS A total of 76 concepts were prevalent in the reports of children with DS. Concepts appearing during the first 2 years were mostly related to the epilepsy features at the onset of DS (convulsive and prolonged seizures triggered by fever, often requiring in-hospital care). Subsequently, concepts related to new types of seizures and to drug resistance appeared. A series of non-seizure-related concepts emerged after the age of 2-3 years, referring to the nonseizure comorbidities classically associated with DS.
SIGNIFICANCE The extraction of clinical terms from narrative reports of children with DS allows outlining the known natural history of this rare disease in early childhood. This original model of "longitudinal phenotyping" could be applied to other rare and very rare conditions with poor natural history description.
Affiliation(s)
- Tommaso Lo Barco
  - Department of Pediatric Neurology, Necker-Enfants Malades Hospital, Assistance Publique-Hôpitaux de Paris, Reference Center for Rare Epilepsies, Member of European Reference Network EpiCARE, Université Paris Cité, Paris, France
- Nicolas Garcelon
  - Data Science Platform, Institut National de la Santé et de la Recherche Médicale Unité Mixte de Recherche 1163, Imagine Institute, Université Paris Cité, Paris, France
- Antoine Neuraz
  - Data Science Platform, Institut National de la Santé et de la Recherche Médicale Unité Mixte de Recherche 1163, Imagine Institute, Université Paris Cité, Paris, France
- Rima Nabbout
  - Department of Pediatric Neurology, Necker-Enfants Malades Hospital, Assistance Publique-Hôpitaux de Paris, Reference Center for Rare Epilepsies, Member of European Reference Network EpiCARE, Université Paris Cité, Paris, France
  - Translational Research for Neurological Disorders, Institut National de la Santé et de la Recherche Médicale Unité Mixte de Recherche 1163, Imagine Institute, Université Paris Cité, Paris, France
3
Lerner I, Serret-Larmande A, Rance B, Garcelon N, Burgun A, Chouchana L, Neuraz A. Mining Electronic Health Records for Drugs Associated With 28-day Mortality in COVID-19: Pharmacopoeia-wide Association Study (PharmWAS). JMIR Med Inform 2022; 10:e35190. [PMID: 35275837 PMCID: PMC8970341 DOI: 10.2196/35190] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/25/2021] [Revised: 01/10/2022] [Accepted: 01/31/2022] [Indexed: 01/20/2023]
Abstract
BACKGROUND Patients hospitalized for a given condition may be receiving other treatments for other contemporary conditions or comorbidities. The use of such observational clinical data for pharmacological hypothesis generation is appealing in the context of an emerging disease but particularly challenging due to the presence of drug indication bias. OBJECTIVE With this study, our main objective was the development and validation of a fully data-driven pipeline that would address this challenge. Our secondary objective was to generate pharmacological hypotheses in patients with COVID-19 and demonstrate the clinical relevance of the pipeline. METHODS We developed a pharmacopeia-wide association study (PharmWAS) pipeline inspired by the PheWAS methodology, which systematically screens for associations between the whole pharmacopeia and a clinical phenotype. First, a fully data-driven procedure based on adaptive least absolute shrinkage and selection operator (LASSO) determined drug-specific adjustment sets. Second, we computed several measures of association, including robust methods based on propensity scores (PSs) to control indication bias. Finally, we applied the Benjamini and Hochberg procedure to control the false discovery rate (FDR). We applied this method in a multicenter retrospective cohort study using electronic medical records from 16 university hospitals of the Greater Paris area. We included all adult patients between 18 and 95 years old hospitalized in conventional wards for COVID-19 between February 1, 2020, and June 15, 2021. We investigated the association between drug prescription within 48 hours from admission and 28-day mortality. We validated our data-driven pipeline against a knowledge-based pipeline on 3 treatments of reference, for which experts agreed on the expected association with mortality. We then demonstrated its clinical relevance by screening all drugs prescribed in more than 100 patients to generate pharmacological hypotheses.
RESULTS A total of 5783 patients were included in the analysis. The median age at admission was 69.2 (IQR 56.7-81.1) years, and 3390 (58.62%) of the patients were male. The performance of our automated pipeline was comparable or better for controlling bias than the knowledge-based adjustment set for 3 reference drugs: dexamethasone, phloroglucinol, and paracetamol. After correction for multiple testing, 4 drugs were associated with increased in-hospital mortality. Among these, diazepam and tramadol were the only ones not discarded by automated diagnostics, with adjusted odds ratios of 2.51 (95% CI 1.52-4.16, Q=.1) and 1.94 (95% CI 1.32-2.85, Q=.02), respectively. CONCLUSIONS Our innovative approach proved useful in generating pharmacological hypotheses in an outbreak setting, without requiring a priori knowledge of the disease. Our systematic analysis of early prescribed treatments from patients hospitalized for COVID-19 showed that diazepam and tramadol are associated with increased 28-day mortality. Whether these drugs could worsen COVID-19 needs to be further assessed.
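The multiple-testing step named in the methods, the Benjamini and Hochberg procedure, can be sketched in a few lines. This is a generic illustration of the standard step-up adjustment, not the study's code. The function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (often reported as Q values).

    For the P value of rank r among m tests (rank 1 = smallest), the raw
    adjustment is p * m / r. Taking a running minimum from the largest
    rank downwards keeps the adjusted values monotone in the raw P values.
    """
    m = len(p_values)
    # Indices of the P values, ordered from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values are capped at 1
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical P values for four screened drugs.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"P={p:.3f}  Q={q:.3f}")
```

A drug would then be flagged when its adjusted value falls below the chosen FDR threshold. The threshold itself is a design choice that this sketch leaves open.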
Affiliation(s)
- Ivan Lerner
  - Inserm, Centre de Recherche des Cordeliers, Sorbonne Université, Paris, France
  - Informatique biomédicale, Hôpital Necker-Enfants Malades, Assistance Publique - Hôpitaux de Paris, Paris, France
  - HeKA Team, Inria, Paris, France
- Arnaud Serret-Larmande
  - Inserm, Centre de Recherche des Cordeliers, Sorbonne Université, Paris, France
  - Informatique biomédicale, Hôpital Necker-Enfants Malades, Assistance Publique - Hôpitaux de Paris, Paris, France
- Bastien Rance
  - Inserm, Centre de Recherche des Cordeliers, Sorbonne Université, Paris, France
  - HeKA Team, Inria, Paris, France
- Nicolas Garcelon
  - HeKA Team, Inria, Paris, France
  - Inserm UMR 1163, Data Science Platform, Université de Paris, Imagine Institute, Paris, France
- Anita Burgun
  - Inserm, Centre de Recherche des Cordeliers, Sorbonne Université, Paris, France
  - Informatique biomédicale, Hôpital Necker-Enfants Malades, Assistance Publique - Hôpitaux de Paris, Paris, France
  - HeKA Team, Inria, Paris, France
- Laurent Chouchana
  - Centre Régional de Pharmacovigilance, Service de Pharmacologie, Hôpital Cochin, Assistance Publique - Hôpitaux de Paris, Centre - Université de Paris, Paris, France
- Antoine Neuraz
  - Inserm, Centre de Recherche des Cordeliers, Sorbonne Université, Paris, France
  - Informatique biomédicale, Hôpital Necker-Enfants Malades, Assistance Publique - Hôpitaux de Paris, Paris, France
  - HeKA Team, Inria, Paris, France
4
Maturation and application of phenome-wide association studies. Trends Genet 2022; 38:353-363. [PMID: 34991903 DOI: 10.1016/j.tig.2021.12.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/11/2021] [Revised: 11/12/2021] [Accepted: 12/02/2021] [Indexed: 12/12/2022]
Abstract
In the past 10 years since its introduction, phenome-wide association studies (PheWAS) have uncovered novel genotype-phenotype relationships. Along the way, PheWAS have evolved in many aspects as a study design with the expanded availability of large data repositories with genome-wide data linked to detailed phenotypic data. Advancement in methods, including algorithms, software, and publicly available integrated resources, makes it feasible to more fully realize the potential of PheWAS, overcoming the previous computational and analytical limitations. We review here the most recent improvements and notable applications of PheWAS since the second half of the decade from its inception. We also note the challenges that remain embedded along the entire PheWAS analytical pipeline that necessitate further development of tools and resources to further advance the understanding of the complex genetic architecture underlying human diseases and traits.
5
Wang L, Zhang X, Meng X, Koskeridis F, Georgiou A, Yu L, Campbell H, Theodoratou E, Li X. Methodology in phenome-wide association studies: a systematic review. J Med Genet 2021; 58:720-728. [PMID: 34272311 DOI: 10.1136/jmedgenet-2021-107696] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/06/2021] [Accepted: 05/27/2021] [Indexed: 11/04/2022]
Abstract
Phenome-wide association study (PheWAS) has been increasingly used to identify novel genetic associations across a wide spectrum of phenotypes. This systematic review aims to summarise the PheWAS methodology, discuss the advantages and challenges of PheWAS, and provide potential implications for future PheWAS studies. Medical Literature Analysis and Retrieval System Online (MEDLINE) and Excerpta Medica Database (EMBASE) databases were searched to identify all published PheWAS studies up until 24 April 2021. The PheWAS methodology, including how to perform a PheWAS analysis and which software or tools could be used, was summarised based on the extracted information. A total of 1035 studies were identified and 195 eligible articles were finally included. Among them, 137 (77.0%) contained 10 000 or more study participants, 164 (92.1%) defined the phenome based on electronic medical records data, 140 (78.7%) used genetic variants as predictors, and 73 (41.0%) conducted replication analysis to validate PheWAS findings, almost all of which (94.5%) obtained consistent results. The methodology applied in these PheWAS studies was dissected into several critical steps, including quality control of the phenome, selecting predictors, phenotyping, statistical analysis, and interpretation and visualisation of PheWAS results, and a workflow for performing a PheWAS was established with detailed instructions on each step. This study provides a comprehensive overview of PheWAS methodology to help practitioners achieve a better understanding of the PheWAS design, to detect understudied or overstudied outcomes, and to direct their research by applying the most appropriate software and online tools for their study data structure.
Affiliation(s)
- Lijuan Wang
  - School of Public Health and the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Xiaomeng Zhang
  - Centre for Global Health, The University of Edinburgh Usher Institute of Population Health Sciences and Informatics, Edinburgh, UK
- Xiangrui Meng
  - Vanke School of Public Health, Tsinghua University, Beijing, China
- Fotios Koskeridis
  - Department of Hygiene and Epidemiology, University of Ioannina, Ioannina, Epirus, Greece
- Andrea Georgiou
  - Department of Hygiene and Epidemiology, University of Ioannina, Ioannina, Epirus, Greece
- Lili Yu
  - School of Public Health and the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Harry Campbell
  - Centre for Global Health, The University of Edinburgh Usher Institute of Population Health Sciences and Informatics, Edinburgh, UK
- Evropi Theodoratou
  - Centre for Global Health, The University of Edinburgh Usher Institute of Population Health Sciences and Informatics, Edinburgh, UK
  - Cancer Research UK Edinburgh Centre, The University of Edinburgh MRC Institute of Genetics and Molecular Medicine, Edinburgh, UK
- Xue Li
  - School of Public Health and the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
6
Barco TL, Kuchenbuch M, Garcelon N, Neuraz A, Nabbout R. Improving early diagnosis of rare diseases using Natural Language Processing in unstructured medical records: an illustration from Dravet syndrome. Orphanet J Rare Dis 2021; 16:309. [PMID: 34256808 PMCID: PMC8278630 DOI: 10.1186/s13023-021-01936-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 01/13/2021] [Accepted: 06/27/2021] [Indexed: 12/01/2022]
Abstract
Background The growing use of Electronic Health Records (EHRs) is promoting the application of data mining in health care. A promising use of big data in this field is to develop models to support early diagnosis and to establish natural history. Dravet Syndrome (DS) is a rare developmental and epileptic encephalopathy that commonly begins in the first year of life with febrile seizures (FS). Diagnosis is often delayed until after the age of 2 years, as it is difficult to differentiate DS at onset from FS. We aimed to explore whether some clinical terms (concepts) are used significantly more often in the electronic narrative medical reports of individuals with DS before the age of 2 years than in those of individuals with FS. These concepts would allow earlier detection of patients with DS, resulting in earlier referral to expert centers that can provide early diagnosis and care. Methods Data were collected from the Necker Enfants Malades Hospital using a document-based data warehouse, Dr Warehouse, which employs Natural Language Processing, a computer technology for processing written information. Using the Unified Medical Language System Metathesaurus, phenotype concepts can be recognized in medical reports. We selected individuals with DS (DS Cohort) and individuals with FS (FS Cohort) with confirmed diagnosis after the age of 4 years. A phenome-wide analysis was performed evaluating the statistical associations between the phenotypes of DS and FS, based on concepts found in the reports produced before 2 years and using a series of logistic regressions. Results We found a significantly higher representation of concepts related to seizure phenotypes distinguishing DS from FS in the first phases, namely the major recurrence of complex febrile convulsions (long-lasting and/or with focal signs) and other seizure types. Some typical early-onset non-seizure concepts also emerged, in relation to neurodevelopment and gait disorders.
Conclusions Narrative medical reports of individuals younger than 2 years with FS contain specific concepts linked to DS diagnosis, which can be automatically detected by software exploiting NLP. This approach could represent an innovative and sustainable methodology to decrease time to diagnosis of DS and could be transposed to other rare diseases.
Affiliation(s)
- Tommaso Lo Barco
  - Department of Pediatric Neurology, Necker-Enfants Malades Hospital, APHP, Centre de Référence Épilepsies Rares, Member of ERN EPICARE, Université de Paris, Paris, France
  - Child Neuropsychiatry, Department of Surgical Sciences, Dentistry, Gynecology and Pediatrics, University of Verona, Verona, Italy
- Mathieu Kuchenbuch
  - Department of Pediatric Neurology, Necker-Enfants Malades Hospital, APHP, Centre de Référence Épilepsies Rares, Member of ERN EPICARE, Université de Paris, Paris, France
  - Imagine Institute, INSERM, UMR 1163, Université de Paris, 75015, Paris, France
- Nicolas Garcelon
  - Imagine Institute, INSERM, UMR 1163, Université de Paris, 75015, Paris, France
- Antoine Neuraz
  - Université de Paris, Paris, France
  - INSERM, UMR1138, Centre de Recherche Des Cordeliers, Paris, France
  - Department of Medical Informatics, University Hospital Necker-Enfants Malades, APHP, Paris, France
- Rima Nabbout
  - Department of Pediatric Neurology, Necker-Enfants Malades Hospital, APHP, Centre de Référence Épilepsies Rares, Member of ERN EPICARE, Université de Paris, Paris, France
  - Imagine Institute, INSERM, UMR 1163, Université de Paris, 75015, Paris, France
  - Université de Paris, Paris, France
7
Zhao L, Batta I, Matloff W, O'Driscoll C, Hobel S, Toga AW. Neuroimaging PheWAS (Phenome-Wide Association Study): A Free Cloud-Computing Platform for Big-Data, Brain-Wide Imaging Association Studies. Neuroinformatics 2021; 19:285-303. [PMID: 32822005 DOI: 10.1007/s12021-020-09486-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Abstract
Large-scale, case-control genome-wide association studies (GWASs) have revealed genetic variations associated with diverse neurological and psychiatric disorders. Recent advances in neuroimaging and genomic databases of large healthy and diseased cohorts have empowered studies to characterize effects of the discovered genetic factors on brain structure and function, implicating neural pathways and genetic mechanisms in the underlying biology. However, the unprecedented scale and complexity of the imaging and genomic data require new advanced biomedical data science tools to manage, process and analyze the data. In this work, we introduce Neuroimaging PheWAS (phenome-wide association study): a web-based system for searching over a wide variety of brain-wide imaging phenotypes to discover true system-level gene-brain relationships using a unified genotype-to-phenotype strategy. This design features a user-friendly graphical user interface (GUI) for anonymous data uploading, study definition and management, and interactive result visualizations, as well as a cloud-based computational infrastructure and multiple state-of-the-art methods for statistical association analysis and multiple comparison correction. We demonstrated the potential of Neuroimaging PheWAS with a case study analyzing the influences of the apolipoprotein E (APOE) gene on various brain morphological properties across the brain in the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort. Benchmark tests were performed to evaluate the system's performance using data from UK Biobank. The Neuroimaging PheWAS system is freely available. It simplifies the execution of PheWAS on neuroimaging data and provides an opportunity for imaging genetics studies to elucidate routes at play for specific genetic variants on diseases in the context of detailed imaging phenotypic data.
Affiliation(s)
- Lu Zhao
  - Laboratory of Neuro Imaging, USC Mark and Mary Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, CA, USA
- Ishaan Batta
  - Laboratory of Neuro Imaging, USC Mark and Mary Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, CA, USA
- William Matloff
  - Laboratory of Neuro Imaging, USC Mark and Mary Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, CA, USA
- Caroline O'Driscoll
  - Laboratory of Neuro Imaging, USC Mark and Mary Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, CA, USA
- Samuel Hobel
  - Laboratory of Neuro Imaging, USC Mark and Mary Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, CA, USA
- Arthur W Toga
  - Laboratory of Neuro Imaging, USC Mark and Mary Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, CA, USA
8
Porcu E, Sjaarda J, Lepik K, Carmeli C, Darrous L, Sulc J, Mounier N, Kutalik Z. Causal Inference Methods to Integrate Omics and Complex Traits. Cold Spring Harb Perspect Med 2021; 11:a040493. [PMID: 32816877 PMCID: PMC8091955 DOI: 10.1101/cshperspect.a040493] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/24/2022]
Abstract
Major biotechnological advances have facilitated a tremendous boost to the collection of (gen-/transcript-/prote-/methyl-/metabol-)omics data in very large sample sizes worldwide. Coordinated efforts have yielded a deluge of studies associating diseases with genetic markers (genome-wide association studies) or with molecular phenotypes. Whereas omics-disease associations have led to biologically meaningful and coherent mechanisms, the identified (non-germline) disease biomarkers may simply be correlates or consequences of the explored diseases. To move beyond this realm, Mendelian randomization provides a principled framework to integrate information on omics- and disease-associated genetic variants to pinpoint molecular traits causally driving disease development. In this review, we show the latest advances in this field, flag up key challenges for the future, and propose potential solutions.
Affiliation(s)
- Eleonora Porcu
  - Center for Integrative Genomics, University of Lausanne, Lausanne 1015, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
  - University Center for Primary Care and Public Health, University of Lausanne, Lausanne 1010, Switzerland
- Jennifer Sjaarda
  - Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
  - University Center for Primary Care and Public Health, University of Lausanne, Lausanne 1010, Switzerland
- Kaido Lepik
  - University Center for Primary Care and Public Health, University of Lausanne, Lausanne 1010, Switzerland
  - Institute of Computer Science, University of Tartu, Tartu 50409, Estonia
- Cristian Carmeli
  - University Center for Primary Care and Public Health, University of Lausanne, Lausanne 1010, Switzerland
- Liza Darrous
  - Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
  - University Center for Primary Care and Public Health, University of Lausanne, Lausanne 1010, Switzerland
- Jonathan Sulc
  - Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
  - University Center for Primary Care and Public Health, University of Lausanne, Lausanne 1010, Switzerland
- Ninon Mounier
  - Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
  - University Center for Primary Care and Public Health, University of Lausanne, Lausanne 1010, Switzerland
- Zoltán Kutalik
  - Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
  - University Center for Primary Care and Public Health, University of Lausanne, Lausanne 1010, Switzerland
  - Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter EX2 5AX, United Kingdom
9
Electronic health records for the diagnosis of rare diseases. Kidney Int 2020; 97:676-686. [DOI: 10.1016/j.kint.2019.11.037] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 04/15/2019] [Revised: 11/15/2019] [Accepted: 11/22/2019] [Indexed: 01/13/2023]
10
Belciug S. Oncologist at work. Artif Intell Cancer 2020. [DOI: 10.1016/b978-0-12-820201-2.00005-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2022]
11
Wu P, Gifford A, Meng X, Li X, Campbell H, Varley T, Zhao J, Carroll R, Bastarache L, Denny JC, Theodoratou E, Wei WQ. Mapping ICD-10 and ICD-10-CM Codes to Phecodes: Workflow Development and Initial Evaluation. JMIR Med Inform 2019; 7:e14325. [PMID: 31553307 PMCID: PMC6911227 DOI: 10.2196/14325] [Citation(s) in RCA: 320] [Impact Index Per Article: 53.3] [Received: 04/09/2019] [Revised: 08/03/2019] [Accepted: 09/24/2019] [Indexed: 01/03/2023]
Abstract
BACKGROUND The phecode system was built upon the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) for phenome-wide association studies (PheWAS) using the electronic health record (EHR). OBJECTIVE The goal of this paper was to develop and perform an initial evaluation of maps from the International Classification of Diseases, 10th Revision (ICD-10) and the International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) codes to phecodes. METHODS We mapped ICD-10 and ICD-10-CM codes to phecodes using a number of methods and resources, such as concept relationships and explicit mappings from the Centers for Medicare & Medicaid Services, the Unified Medical Language System, Observational Health Data Sciences and Informatics, Systematized Nomenclature of Medicine-Clinical Terms, and the National Library of Medicine. We assessed the coverage of the maps in two databases: Vanderbilt University Medical Center (VUMC) using ICD-10-CM and the UK Biobank (UKBB) using ICD-10. We assessed the fidelity of the ICD-10-CM map in comparison to the gold-standard ICD-9-CM phecode map by investigating phenotype reproducibility and conducting a PheWAS. RESULTS We mapped >75% of ICD-10 and ICD-10-CM codes to phecodes. Of the unique codes observed in the UKBB (ICD-10) and VUMC (ICD-10-CM) cohorts, >90% were mapped to phecodes. We observed 70-75% reproducibility for chronic diseases and <10% for an acute disease for phenotypes sourced from the ICD-10-CM phecode map. Using the ICD-9-CM and ICD-10-CM maps, we conducted a PheWAS with a Lipoprotein(a) genetic variant, rs10455872, which replicated two known genotype-phenotype associations with similar effect sizes: coronary atherosclerosis (ICD-9-CM: P<.001; odds ratio (OR) 1.60 [95% CI 1.43-1.80] vs ICD-10-CM: P<.001; OR 1.60 [95% CI 1.43-1.80]) and chronic ischemic heart disease (ICD-9-CM: P<.001; OR 1.56 [95% CI 1.35-1.79] vs ICD-10-CM: P<.001; OR 1.47 [95% CI 1.22-1.77]). 
CONCLUSIONS This study introduces the beta versions of ICD-10 and ICD-10-CM to phecode maps that enable researchers to leverage accumulated ICD-10 and ICD-10-CM data for PheWAS in the EHR.
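As a rough illustration of the coverage figures reported above (the share of unique observed codes that have a phecode mapping), a minimal sketch follows. The function name `map_coverage` and the toy map entries are hypothetical placeholders for illustration only; they are not taken from the published ICD-10-CM to phecode map.

```python
def map_coverage(observed_codes, code_to_phecode):
    """Return (fraction of unique observed codes that are mapped, sorted unmapped codes)."""
    unique_codes = set(observed_codes)
    if not unique_codes:
        raise ValueError("observed_codes must contain at least one code")
    mapped = {code for code in unique_codes if code in code_to_phecode}
    unmapped = sorted(unique_codes - mapped)
    return len(mapped) / len(unique_codes), unmapped


# Hypothetical toy map and cohort; the entries are illustrative, not real map content.
toy_map = {"I25.10": "411.4", "E11.9": "250.2"}
toy_observed = ["I25.10", "E11.9", "E11.9", "Z99.89"]

coverage, unmapped_codes = map_coverage(toy_observed, toy_map)
print(f"coverage = {coverage:.2%}, unmapped = {unmapped_codes}")
```

Counting unique codes rather than code occurrences follows the abstract's wording ("of the unique codes observed"); a per-occurrence coverage would weight frequent codes more heavily.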
Affiliation(s)
- Patrick Wu
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
- Medical Scientist Training Program, Vanderbilt University School of Medicine, Nashville, TN, United States
| | - Aliya Gifford
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Xiangrui Meng
- Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, United Kingdom
| | - Xue Li
- Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, United Kingdom
| | - Harry Campbell
- Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, United Kingdom
| | - Tim Varley
- Public Health and Intelligence Strategic Business Unit, National Services Scotland, Edinburgh, United Kingdom
| | - Juan Zhao
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Robert Carroll
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Lisa Bastarache
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Joshua C Denny
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Evropi Theodoratou
- Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, United Kingdom
- Edinburgh Cancer Research Centre, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, United Kingdom
| | - Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
|
12
|
Nevoret C, Jannot AS, Pallet N. Clinical and Pharmacological Aspects of Hospital-Acquired Acute Kidney Injuries Outside the Intensive Care Unit: A Phenome-Wide Association Study. Kidney Diseases (Basel, Switzerland) 2019; 5:272-280. [PMID: 31768385 PMCID: PMC6872991 DOI: 10.1159/000501432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2019] [Revised: 06/08/2019] [Indexed: 06/10/2023]
Abstract
INTRODUCTION Acute kidney injury (AKI) occurring in the hospital in noncritically ill patients involves a broad spectrum of clinical conditions and medical scenarios that are better appreciated by systematic association studies. METHODS We extracted all diagnoses and drug prescriptions from an i2b2 clinical data warehouse for patients who stayed in an academic hospital between 2013 and 2017 and had at least two plasma creatinine measurements performed during the first week of their stay. We analyzed the association between AKI occurring outside the intensive care unit (ICU), as identified using the AKIN classification criteria, and International Classification of Diseases (ICD)-10 diagnosis codes and drug categories. RESULTS 16,662 hospital stays for unique individuals were extracted. The prevalence of AKI outside the ICU was 8%, with frequencies that varied greatly across departments. 4% of patients with AKI died during their hospital stay (OR 6.17, 95% CI [2.59-17.9]). The associated ICD-10 diagnosis codes were related to infections, kidney cancer, heart failure, respiratory failure, and chronic kidney disease. Drugs targeting the renin angiotensin system and loop diuretics had the largest effect sizes on AKI. The ICD-10 code N17/"Acute kidney failure" was recorded on average in only 16% of the cases with AKI, and its frequency ranged from 0 to 80% depending on the hospital department; the lack of encoding did not impact mortality. CONCLUSION A systematic search for associations of AKI with prescribed drugs and medical diagnoses using a phenome-wide approach allows an in-depth description of the epidemiology of AKI outside the ICU.
Affiliation(s)
- Camille Nevoret
- Department of Medical Informatics, Biostatistics and Public Health, Hôpital Européen Georges Pompidou, Paris, France
- Assistance Publique Hôpitaux de Paris, Paris, France
- Paris Descartes University, Paris, France
| | - Anne-Sophie Jannot
- Department of Medical Informatics, Biostatistics and Public Health, Hôpital Européen Georges Pompidou, Paris, France
- Assistance Publique Hôpitaux de Paris, Paris, France
- Paris Descartes University, Paris, France
- INSERM UMR_S 1138, Centre de Recherche des Cordeliers, Paris, France
| | - Nicolas Pallet
- Assistance Publique Hôpitaux de Paris, Paris, France
- Paris Descartes University, Paris, France
- Nephrology Department, Hôpital Européen Georges Pompidou, Paris, France
- Clinical Chemistry Department, Hôpital Européen Georges Pompidou, Paris, France
|
13
|
Choi L, Carroll RJ, Beck C, Mosley JD, Roden DM, Denny JC, Van Driest SL. Evaluating statistical approaches to leverage large clinical datasets for uncovering therapeutic and adverse medication effects. Bioinformatics 2019; 34:2988-2996. [PMID: 29912272 DOI: 10.1093/bioinformatics/bty306] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 11/12/2017] [Accepted: 04/16/2018] [Indexed: 12/31/2022]
Abstract
Motivation Phenome-wide association studies (PheWAS) have been used to discover many genotype-phenotype relationships and have the potential to identify therapeutic and adverse drug outcomes using longitudinal data within electronic health records (EHRs). However, the statistical methods for PheWAS applied to longitudinal EHR medication data have not been established. Results In this study, we developed methods to address two challenges faced with reuse of EHR for this purpose: confounding by indication, and low exposure and event rates. We used Monte Carlo simulation to assess propensity score (PS) methods, focusing on two of the most commonly used methods, PS matching and PS adjustment, to address confounding by indication. We also compared two logistic regression approaches (the default of Wald versus Firth's penalized maximum likelihood, PML) to address complete separation due to sparse data with low exposure and event rates. PS adjustment resulted in greater power than PS matching, while controlling Type I error at 0.05. The PML method provided reasonable P-values, even in cases with complete separation, with well controlled Type I error rates. Using PS adjustment and the PML method, we identify novel latent drug effects in pediatric patients exposed to two common antibiotic drugs, ampicillin and gentamicin. Availability and implementation R packages PheWAS and EHR are available at https://github.com/PheWAS/PheWAS and at CRAN (https://www.r-project.org/), respectively. The R script for data processing and the main analysis is available at https://github.com/choileena/EHR. Supplementary information Supplementary data are available at Bioinformatics online.
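To make the separation problem mentioned above concrete, here is a minimal NumPy sketch of Firth's penalized maximum likelihood for logistic regression. It is an illustrative re-implementation under simplifying assumptions (no P-values, no profile-likelihood intervals), not the code in the authors' R packages; the function name `firth_logistic` and the toy data are hypothetical.

```python
import numpy as np


def firth_logistic(X, y, max_iter=100, tol=1e-8, max_step=5.0):
    """Fit logistic regression with Firth's bias-reducing penalty (Jeffreys prior).

    X must already contain an intercept column. Returns the coefficient vector.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        info = X.T @ (X * w[:, None])          # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # Leverages: diagonal of W^(1/2) X (X'WX)^(-1) X' W^(1/2)
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        # Firth-modified score: the usual score plus the penalty term h * (0.5 - p)
        score = X.T @ (y - p + h * (0.5 - p))
        step = info_inv @ score
        largest = np.max(np.abs(step))
        if largest > max_step:                 # cap very large steps for stability
            step *= max_step / largest
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Toy data with complete separation: the exposure predicts the event perfectly,
# so ordinary maximum likelihood has no finite slope estimate.
x_toy = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
y_toy = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
X_toy = np.column_stack([np.ones_like(x_toy), x_toy])

beta_toy = firth_logistic(X_toy, y_toy)
print("Firth estimates (intercept, slope):", beta_toy)
```

For a single binary exposure, the penalized fit is equivalent to adding 0.5 to each cell of the 2x2 table, which is why the estimates stay finite here even though the unpenalized likelihood has no maximum.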
Affiliation(s)
- Leena Choi
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Robert J Carroll
- Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Cole Beck
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
| | | | - Dan M Roden
- Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.,Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.,Pharmacology, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Joshua C Denny
- Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.,Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Sara L Van Driest
- Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.,Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA
|
14
|
Monnin P, Legrand J, Husson G, Ringot P, Tchechmedjiev A, Jonquet C, Napoli A, Coulet A. PGxO and PGxLOD: a reconciliation of pharmacogenomic knowledge of various provenances, enabling further comparison. BMC Bioinformatics 2019; 20:139. [PMID: 30999867 PMCID: PMC6471679 DOI: 10.1186/s12859-019-2693-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/01/2022]
Abstract
Background Pharmacogenomics (PGx) studies how genomic variations impact variations in drug response phenotypes. Knowledge in pharmacogenomics is typically composed of units that take the form of ternary relationships: gene variant – drug – adverse event. Such a relationship states that an adverse event may occur for patients having the specified gene variant and being exposed to the specified drug. State-of-the-art knowledge in PGx is mainly available in reference databases such as PharmGKB and reported in the scientific biomedical literature. However, PGx knowledge can also be discovered from clinical data, such as Electronic Health Records (EHRs), and in this case may either correspond to new knowledge or confirm state-of-the-art knowledge that lacks a “clinical counterpart” or validation. For this reason, there is a need for automatic comparison of knowledge units from distinct sources. Results In this article, we propose an approach, based on Semantic Web technologies, to represent and compare PGx knowledge units. To this end, we developed PGxO, a simple ontology that represents PGx knowledge units and their components. Combined with PROV-O, an ontology developed by the W3C to represent provenance information, PGxO enables encoding provenance information and associating it with PGx relationships. Additionally, we introduce a set of rules to reconcile PGx knowledge, i.e. to identify when two relationships, potentially expressed using different vocabularies and levels of granularity, refer to the same or to different knowledge units. We evaluated our ontology and rules by populating PGxO with knowledge units extracted from PharmGKB (2701), the literature (65,720) and discoveries reported in EHR analysis studies (only 10, manually extracted), and by testing their similarity. We named the resulting knowledge base, which represents and reconciles knowledge units of those various origins, PGxLOD (PGx Linked Open Data).
Conclusions The proposed ontology and reconciliation rules constitute a first step toward a more complete framework for knowledge comparison in PGx. In this direction, the experimental instantiation of PGxO, named PGxLOD, illustrates the ability and difficulties of reconciling various existing knowledge sources. Electronic supplementary material The online version of this article (10.1186/s12859-019-2693-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Pierre Monnin
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France.
| | - Joël Legrand
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France
| | - Graziella Husson
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France
| | - Patrice Ringot
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France
| | | | - Clément Jonquet
- LIRMM, Université de Montpellier, CNRS, Montpellier, 34095, France.,Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, 94305, California, USA
| | - Amedeo Napoli
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France
| | - Adrien Coulet
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France.,Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, 94305, California, USA
|
15
|
James G, Reisberg S, Lepik K, Galwey N, Avillach P, Kolberg L, Mägi R, Esko T, Alexander M, Waterworth D, Loomis AK, Vilo J. An exploratory phenome wide association study linking asthma and liver disease genetic variants to electronic health records from the Estonian Biobank. PLoS One 2019; 14:e0215026. [PMID: 30978214 PMCID: PMC6461350 DOI: 10.1371/journal.pone.0215026] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 12/06/2018] [Accepted: 03/25/2019] [Indexed: 12/22/2022]
Abstract
The Estonian Biobank, governed by the Institute of Genomics at the University of Tartu (Biobank), has stored genetic material/DNA and continuously collected data since 2002 on a total of 52,274 individuals, representing ~5% of the Estonian adult population, and continues to grow. To explore the utility of data available in the Biobank, we conducted a phenome-wide association study (PheWAS) in two areas of interest to healthcare researchers: asthma and liver disease. We used 11 asthma and 13 liver disease-associated single nucleotide polymorphisms (SNPs), identified from published genome-wide association studies, to test our ability to detect established associations. We confirmed 2 asthma and 5 liver disease associated variants at nominal significance and directionally consistent with published results. We found 2 associations that were opposite to what was published before (rs4374383:AA increases risk of NASH/NAFLD, rs11597086 increases ALT level). Three SNP-diagnosis pairs passed the phenome-wide significance threshold: rs9273349 and E06 (thyroiditis, p = 5.50 × 10⁻⁸); rs9273349 and E10 (type-1 diabetes, p = 2.60 × 10⁻⁷); and rs2281135 and K76 (non-alcoholic liver diseases, including NAFLD, p = 4.10 × 10⁻⁷). We have validated our approach and confirmed the quality of the data for these conditions. Importantly, we demonstrate that the extensive amount of genetic and medical information from the Estonian Biobank can be successfully utilized for scientific research.
Affiliation(s)
- Glen James
- AstraZeneca, Global Medical Affairs, Cambridge, United Kingdom
| | - Sulev Reisberg
- Institute of Computer Science, University of Tartu, Tartu, Estonia
- STACC, Tartu, Estonia
- Quretec, Tartu, Estonia
| | - Kaido Lepik
- Institute of Computer Science, University of Tartu, Tartu, Estonia
| | - Nicholas Galwey
- GlaxoSmithKline, Research and Development, Stevenage, United Kingdom
| | - Paul Avillach
- Department of Biomedical Informatics, Harvard Medical School, Boston, United States of America
- Department of Medical Informatics, Erasmus University Medical Center Rotterdam, Rotterdam, Netherlands
| | - Liis Kolberg
- Institute of Computer Science, University of Tartu, Tartu, Estonia
| | - Reedik Mägi
- Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Tõnu Esko
- Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Myriam Alexander
- GlaxoSmithKline, Research and Development, Stevenage, United Kingdom
| | - Dawn Waterworth
- GlaxoSmithKline, Genetics, Collegeville, PA, United States of America
| | - A. Katrina Loomis
- Pfizer Worldwide Research and Development, Groton, CT, United States of America
| | - Jaak Vilo
- Institute of Computer Science, University of Tartu, Tartu, Estonia
|
16
|
Scott ER, Wallsten RL. A Look to the Future. Pharmacogenomics 2019. [DOI: 10.1016/b978-0-12-812626-4.00010-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
|
17
|
Linge J, Borga M, West J, Tuthill T, Miller MR, Dumitriu A, Thomas EL, Romu T, Tunón P, Bell JD, Dahlqvist Leinhard O. Body Composition Profiling in the UK Biobank Imaging Study. Obesity (Silver Spring) 2018; 26:1785-1795. [PMID: 29785727 PMCID: PMC6220857 DOI: 10.1002/oby.22210] [Citation(s) in RCA: 145] [Impact Index Per Article: 20.7] [Received: 03/12/2018] [Revised: 04/17/2018] [Accepted: 04/20/2018] [Indexed: 12/20/2022]
Abstract
OBJECTIVE This study aimed to investigate the value of imaging-based multivariable body composition profiling by describing its association with coronary heart disease (CHD), type 2 diabetes (T2D), and metabolic health on individual and population levels. METHODS The first 6,021 participants scanned by UK Biobank were included. Body composition profiles (BCPs) were calculated, including abdominal subcutaneous adipose tissue, visceral adipose tissue (VAT), thigh muscle volume, liver fat, and muscle fat infiltration (MFI), determined using magnetic resonance imaging. Associations between BCP and metabolic status were investigated using matching procedures and multivariable statistical modeling. RESULTS Matched control analysis showed that higher VAT and MFI were associated with CHD and T2D (P < 0.001). Higher liver fat was associated with T2D (P < 0.001) and lower liver fat with CHD (P < 0.05), matching on VAT. Multivariable modeling showed that lower VAT and MFI were associated with metabolic health (P < 0.001), and liver fat was nonsignificant. Associations remained significant adjusting for sex, age, BMI, alcohol, smoking, and physical activity. CONCLUSIONS Body composition profiling enabled an intuitive visualization of body composition and showed the complexity of associations between fat distribution and metabolic status, stressing the importance of a multivariable approach. Different diseases were linked to different BCPs, which could not be described by a single fat compartment alone.
Affiliation(s)
| | - Magnus Borga
- AMRA Medical AB, Linköping, Sweden
- Centre for Medical Image Science and Visualization, Linköping University, Linköping, Sweden
- Department of Biomedical Engineering, Linköping University, Linköping, Sweden
| | - Janne West
- AMRA Medical AB, Linköping, Sweden
- Centre for Medical Image Science and Visualization, Linköping University, Linköping, Sweden
- Department of Medical and Health Sciences, Linköping University, Linköping, Sweden
| | - Theresa Tuthill
- Imaging, Precision Medicine, Pfizer Inc., Cambridge, Massachusetts, USA
| | - Melissa R. Miller
- WRD Genome Sciences & Technologies, Pfizer Inc., Cambridge, Massachusetts, USA
| | - Alexandra Dumitriu
- WRD Genome Sciences & Technologies, Pfizer Inc., Cambridge, Massachusetts, USA
| | - E. Louise Thomas
- Research Centre for Optimal Health, School of Life Sciences, University of Westminster, London, UK
| | - Thobias Romu
- AMRA Medical AB, Linköping, Sweden
- Centre for Medical Image Science and Visualization, Linköping University, Linköping, Sweden
- Department of Biomedical Engineering, Linköping University, Linköping, Sweden
| | | | - Jimmy D. Bell
- Research Centre for Optimal Health, School of Life Sciences, University of Westminster, London, UK
| | - Olof Dahlqvist Leinhard
- AMRA Medical AB, Linköping, Sweden
- Centre for Medical Image Science and Visualization, Linköping University, Linköping, Sweden
- Department of Medical and Health Sciences, Linköping University, Linköping, Sweden
|
18
|
Coulet A, Shah NH, Wack M, Chawki MB, Jay N, Dumontier M. Predicting the need for a reduced drug dose, at first prescription. Sci Rep 2018; 8:15558. [PMID: 30349060 PMCID: PMC6197198 DOI: 10.1038/s41598-018-33980-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/04/2017] [Accepted: 10/06/2018] [Indexed: 01/21/2023]
Abstract
Prescribing the right drug with the right dose is a central tenet of precision medicine. We examined the use of patients’ prior Electronic Health Records to predict a reduction in drug dosage. We focus on drugs that interact with the P450 enzyme family, because their dosage is known to be sensitive and variable. We extracted diagnostic codes, conditions reported in clinical notes, and laboratory orders from Stanford’s clinical data warehouse to construct cohorts of patients that either did or did not need a dose change. After feature selection, we trained models to predict the patients who will (or will not) require a dose change after being prescribed one of 34 drugs across 23 drug classes. Overall, we can predict (AUC ≥ 0.70–0.95) a dose reduction for 23 drugs and 22 drug classes. Several of these drugs are associated with clinical guidelines that recommend dose reduction exclusively in the case of adverse reaction. For these cases, a reduction in dosage may be considered as a surrogate for an adverse reaction, which our system could indirectly help predict and prevent. Our study illustrates the role machine learning may take in providing guidance in setting the starting dose for drugs associated with response variability.
Affiliation(s)
- Adrien Coulet
- Université de Lorraine, CNRS, Inria, LORIA, 54000, Nancy, France.,Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA
| | - Nigam H Shah
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA
| | - Maxime Wack
- Service d'Evaluation et d'Information Médicales, University Hospital of Nancy (CHRU), Nancy, France
| | - Mohammad B Chawki
- Service d'Evaluation et d'Information Médicales, University Hospital of Nancy (CHRU), Nancy, France
| | - Nicolas Jay
- Université de Lorraine, CNRS, Inria, LORIA, 54000, Nancy, France.,Service d'Evaluation et d'Information Médicales, University Hospital of Nancy (CHRU), Nancy, France
| | - Michel Dumontier
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA.,Institute of Data Science, Maastricht University, Maastricht, Netherlands
|
19
|
Wu L, Wang J, Wu H, Chen J, Xiao Z, Qin X, Zhang Z, Lin W. Comparative Metagenomic Analysis of Rhizosphere Microbial Community Composition and Functional Potentials under Rehmannia glutinosa Consecutive Monoculture. Int J Mol Sci 2018; 19:ijms19082394. [PMID: 30110928 PMCID: PMC6121535 DOI: 10.3390/ijms19082394] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 07/21/2018] [Revised: 08/08/2018] [Accepted: 08/08/2018] [Indexed: 11/16/2022]
Abstract
Consecutive monoculture of Rehmannia glutinosa, highly valued in traditional Chinese medicine, leads to a severe decline in both quality and yield. Rhizosphere microbiome was reported to be closely associated with the soil health and plant performance. In this study, comparative metagenomics was applied to investigate the shifts in rhizosphere microbial structures and functional potentials under consecutive monoculture. The results showed R. glutinosa monoculture significantly decreased the relative abundances of Pseudomonadaceae and Burkholderiaceae, but significantly increased the relative abundances of Sphingomonadaceae and Streptomycetaceae. Moreover, the abundances of genera Pseudomonas, Azotobacter, Burkholderia, and Lysobacter, among others, were significantly lower in two-year monocultured soil than in one-year cultured soil. For potentially harmful/indicator microorganisms, the percentages of reads categorized to defense mechanisms (i.e., ATP-binding cassette (ABC) transporters, efflux transporter, antibiotic resistance) and biological metabolism (i.e., lipid transport and metabolism, secondary metabolites biosynthesis, transport and catabolism, nucleotide transport and metabolism, transcription) were significantly higher in two-year monocultured soil than in one-year cultured soil, but the opposite was true for potentially beneficial microorganisms, which might disrupt the equilibrium between beneficial and harmful microbes. Collectively, our results provide important insights into the shifts in genomic diversity and functional potentials of rhizosphere microbiome in response to R. glutinosa consecutive monoculture.
Affiliation(s)
- Linkun Wu
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
- Fujian Provincial Key Laboratory of Agroecological Processing and Safety Monitoring, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
| | - Juanying Wang
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
- Fujian Provincial Key Laboratory of Agroecological Processing and Safety Monitoring, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
| | - Hongmiao Wu
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
- Fujian Provincial Key Laboratory of Agroecological Processing and Safety Monitoring, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
| | - Jun Chen
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
- Fujian Provincial Key Laboratory of Agroecological Processing and Safety Monitoring, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
| | - Zhigang Xiao
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
- Fujian Provincial Key Laboratory of Agroecological Processing and Safety Monitoring, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
| | - Xianjin Qin
- Fujian Provincial Key Laboratory of Agroecological Processing and Safety Monitoring, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
- Key Laboratory of Crop Ecology and Molecular Physiology (Fujian Agriculture and Forestry University), Fujian Province University, Fuzhou 350002, China.
| | - Zhongyi Zhang
- Key Laboratory of Crop Ecology and Molecular Physiology (Fujian Agriculture and Forestry University), Fujian Province University, Fuzhou 350002, China.
- College of Crop Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
| | - Wenxiong Lin
- Fujian Provincial Key Laboratory of Agroecological Processing and Safety Monitoring, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
- Key Laboratory of Crop Ecology and Molecular Physiology (Fujian Agriculture and Forestry University), Fujian Province University, Fuzhou 350002, China.
|
20
|
Garcelon N, Neuraz A, Salomon R, Faour H, Benoit V, Delapalme A, Munnich A, Burgun A, Rance B. A clinician friendly data warehouse oriented toward narrative reports: Dr. Warehouse. J Biomed Inform 2018; 80:52-63. [DOI: 10.1016/j.jbi.2018.02.019] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 09/21/2017] [Revised: 02/22/2018] [Accepted: 02/28/2018] [Indexed: 01/26/2023]
|
21
|
Robinson JR, Denny JC, Roden DM, Van Driest SL. Genome-wide and Phenome-wide Approaches to Understand Variable Drug Actions in Electronic Health Records. Clin Transl Sci 2018; 11:112-122. [PMID: 29148204 PMCID: PMC5866959 DOI: 10.1111/cts.12522] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 09/13/2017] [Accepted: 10/14/2017] [Indexed: 12/24/2022]
Affiliation(s)
- Jamie R. Robinson
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Department of Surgery, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Joshua C. Denny
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Dan M. Roden
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Sara L. Van Driest
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Department of Pediatrics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
|
22
|
Abstract
PURPOSE OF REVIEW Over many decades, researchers have been designing studies to investigate the relationship between genotypes and phenotypes to gain an understanding of the effect of genetics on disease. Recently, a high-throughput approach called the phenome-wide association study (PheWAS) has been extensively used to identify associations between genetic variants and many diseases and traits simultaneously. In this review, we describe the value of PheWAS along with methodological issues and challenges in interpretation for current applications of PheWAS. RECENT FINDINGS PheWAS have uncovered a paradigm to identify new associations for genetic loci across many diseases. The application of PheWAS has been effective with phenotype data from electronic health records, epidemiological studies, and clinical trials. SUMMARY The key strength of a PheWAS is to identify the association of one or more genetic variants with multiple phenotypes, which can showcase interconnections among the phenotypes due to shared genetic associations. While the PheWAS approach appears promising, there are a number of challenges that need to be addressed to provide additional robustness to PheWAS findings.
Affiliation(s)
- Anurag Verma
- Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA
- The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA
| | - Marylyn D Ritchie
- Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA
- The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA
|
23
|
Girardeau Y, Jannot AS, Chatellier G, Saint-Jean O. Association between borderline dysnatremia and mortality insight into a new data mining approach. BMC Med Inform Decis Mak 2017; 17:152. [PMID: 29166900 PMCID: PMC5700671 DOI: 10.1186/s12911-017-0549-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 05/03/2017] [Accepted: 11/14/2017] [Indexed: 12/17/2022]
Abstract
BACKGROUND Even small variations of serum sodium concentration may be associated with mortality. Our objective was to confirm the impact of borderline dysnatremia on in-hospital mortality among patients admitted to hospital, using real-life care data from our electronic health record (EHR) and a phenome-wide association analysis (PheWAS). METHODS We conducted a retrospective observational study of patients admitted to Hôpital Européen Georges Pompidou between 01/01/2008 and 31/06/2014, including 45,834 patients with serum sodium determinations on admission. We analyzed the association between dysnatremia and in-hospital mortality, using a multivariate logistic regression model to adjust for classical potential confounders. We performed a PheWAS to identify new potential confounders. RESULTS Hyponatremia and hypernatremia were recorded for 12.0% and 1.0% of hospital stays, respectively. Adjusted odds ratios (ORa) for severe, moderate and borderline hyponatremia were 3.44 (95% CI, 2.41-4.86), 2.48 (95% CI, 1.96-3.13) and 1.98 (95% CI, 1.73-2.28), respectively. ORa for severe, moderate and borderline hypernatremia were 4.07 (95% CI, 2.92-5.62), 4.42 (95% CI, 2.04-9.20) and 3.72 (95% CI, 1.53-8.45), respectively. Borderline hyponatremia (ORa = 1.57; 95% CI, 1.35-1.81) and borderline hypernatremia (ORa = 3.47; 95% CI, 2.43-4.90) were still associated with in-hospital mortality after adjustment for classical confounding factors and the new ones identified through the PheWAS analysis. CONCLUSION Borderline dysnatremia on admission is independently associated with a higher risk of in-hospital mortality. By using medical data automatically collected in the EHR and a new data mining approach, we identified new potential confounding factors that were highly associated with both mortality and dysnatremia.
Affiliation(s)
- Yannick Girardeau
- Biomedical Informatics and Public Health Department, Hôpital Européen G. Pompidou, Assistance Publique-Hôpitaux de Paris, Paris, France; Sorbonne Universités, UPMC Univ Paris 06, UMR_S 1138, Centre de Recherche des Cordeliers, F-75006, Paris, France; Division of Geriatrics, Hôpital Européen G. Pompidou, Assistance Publique-Hôpitaux de Paris, Paris, France
- Anne-Sophie Jannot
- Biomedical Informatics and Public Health Department, Hôpital Européen G. Pompidou, Assistance Publique-Hôpitaux de Paris, Paris, France; Sorbonne Universités, UPMC Univ Paris 06, UMR_S 1138, Centre de Recherche des Cordeliers, F-75006, Paris, France
- Gilles Chatellier
- Biomedical Informatics and Public Health Department, Hôpital Européen G. Pompidou, Assistance Publique-Hôpitaux de Paris, Paris, France; Université Paris Descartes, Paris, France; Institut National de la Santé et de la Recherche Médicale (INSERM), Centre d'Investigations Cliniques, 1418, Paris, France
- Olivier Saint-Jean
- Division of Geriatrics, Hôpital Européen G. Pompidou, Assistance Publique-Hôpitaux de Paris, Paris, France
|
24
|
Doss J, Mo H, Carroll RJ, Crofford LJ, Denny JC. Phenome-Wide Association Study of Rheumatoid Arthritis Subgroups Identifies Association Between Seronegative Disease and Fibromyalgia. Arthritis Rheumatol 2017; 69:291-300. [PMID: 27589350 DOI: 10.1002/art.39851] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 04/23/2015] [Accepted: 08/11/2016] [Indexed: 01/10/2023]
Abstract
OBJECTIVE The differences between seronegative and seropositive rheumatoid arthritis (RA) have not been widely reported. We performed electronic health record (EHR)-based phenome-wide association studies (PheWAS) to identify disease associations in seropositive and seronegative RA. METHODS A validated algorithm identified RA subjects from the de-identified version of the Vanderbilt University Medical Center EHR. Serotypes were determined by rheumatoid factor (RF) and anti-cyclic citrullinated peptide antibody (ACPA) values. We tested EHR-derived phenotypes using PheWAS comparing seropositive RA and seronegative RA, yielding disease associations. PheWAS was also performed in RF-positive versus RF-negative subjects and ACPA-positive versus ACPA-negative subjects. Following PheWAS, select phenotypes were then manually reviewed, and fibromyalgia was specifically evaluated using a validated algorithm. RESULTS A total of 2,199 RA individuals with either RF or ACPA testing were identified. Of these, 1,382 patients (63%) were classified as seropositive. Seronegative RA was associated with myalgia and myositis (odds ratio [OR] 2.1, P = 3.7 × 10⁻¹⁰) and back pain. A manual review of the health record showed that among subjects coded for Myalgia and Myositis, ∼80% had fibromyalgia. Follow-up with a specific EHR algorithm for fibromyalgia confirmed that seronegative RA was associated with fibromyalgia (OR 1.8, P = 4.0 × 10⁻⁶). Seropositive RA was associated with chronic airway obstruction (OR 2.2, P = 1.4 × 10⁻⁴) and tobacco use (OR 2.2, P = 7.0 × 10⁻⁴). CONCLUSION This PheWAS of RA patients identifies a strong association between seronegativity and fibromyalgia. It also affirms relationships between seropositivity and chronic airway obstruction and between seropositivity and tobacco use. These findings demonstrate the utility of the PheWAS approach to discover novel phenotype associations within different subgroups of a disease.
Affiliation(s)
- Huan Mo
- Loma Linda University Medical Center, Loma Linda, California
|
25
|
Liao KP, Sparks JA, Hejblum BP, Kuo IH, Cui J, Lahey LJ, Cagan A, Gainer VS, Liu W, Cai TT, Sokolove J, Cai T. Phenome-Wide Association Study of Autoantibodies to Citrullinated and Noncitrullinated Epitopes in Rheumatoid Arthritis. Arthritis Rheumatol 2017; 69:742-749. [PMID: 27792870 PMCID: PMC5378622 DOI: 10.1002/art.39974] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 06/28/2016] [Accepted: 10/27/2016] [Indexed: 12/22/2022]
Abstract
OBJECTIVE Patients with rheumatoid arthritis (RA) develop autoantibodies against a spectrum of antigens, but the clinical significance of these autoantibodies is unclear. Using a phenome-wide association study (PheWAS) approach, we examined the association between autoantibodies and clinical subphenotypes of RA. METHODS This study was conducted in a cohort of RA patients identified from the electronic medical records (EMRs) of 2 tertiary care centers. Using a published multiplex bead assay, we measured 36 autoantibodies targeting epitopes implicated in RA. We extracted all International Classification of Diseases, Ninth Revision (ICD-9) codes for each subject and grouped them into disease categories (PheWAS codes), using a published method. We tested for the association of each autoantibody (grouped by the targeted protein) with PheWAS codes. To determine significant associations (at a false discovery rate [FDR] of ≤0.1), we reviewed the medical records of 50 patients with each PheWAS code to determine positive predictive values (PPVs). RESULTS We studied 1,006 RA patients; the mean ± SD age of the patients was 61.0 ± 12.9 years, and 79.0% were female. A total of 3,568 unique ICD-9 codes were grouped into 625 PheWAS codes; the 206 PheWAS codes with a prevalence of ≥3% were studied. Using the PheWAS method, we identified 24 significant associations of autoantibodies to epitopes at an FDR of ≤0.1. The associations that were strongest and had the highest PPV for the PheWAS code were autoantibodies against fibronectin and obesity (P = 6.1 × 10⁻⁴, PPV 100%), and that between fibrinogen and pneumonopathy (P = 2.7 × 10⁻⁴, PPV 96%). Pneumonopathy codes included diagnoses for cryptogenic organizing pneumonia and obliterative bronchiolitis. CONCLUSION We demonstrated application of a bioinformatics method, the PheWAS, to screen for the clinical significance of RA-related autoantibodies. Using the PheWAS approach, we identified potentially significant links between variations in the levels of autoantibodies and comorbidities of interest in RA.
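Illustrative note: the abstract reports associations selected at a false discovery rate of ≤0.1 but does not say which FDR procedure was used. The sketch below assumes the Benjamini-Hochberg step-up procedure, a common choice. The function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.1):
    """Return one boolean per p-value: True where the test is rejected at the given FDR.

    Assumes the Benjamini-Hochberg step-up procedure; this is an assumption,
    not a procedure stated in the abstract.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * fdr
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected

# Hypothetical p-values, one per tested autoantibody-phenotype pair
p_vals = [0.0006, 0.0003, 0.04, 0.20, 0.75]
flags = benjamini_hochberg(p_vals, fdr=0.1)
print(flags)
```

The step-up form matters: a p-value above its own rank threshold is still rejected if a larger p-value further down the sorted list meets its threshold.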
Affiliation(s)
- Katherine P Liao
- Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Jeffrey A Sparks
- Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Boris P Hejblum
- Harvard T. H. Chan School of Public Health, Boston, Massachusetts
- I-Hsin Kuo
- Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, and Biogen, Cambridge, Massachusetts
- Jing Cui
- Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Lauren J Lahey
- VA Palo Alto Healthcare System and Stanford University School of Medicine, Palo Alto, California
- Weidong Liu
- Shanghai Jiao Tong University, Shanghai, China
- T Tony Cai
- The Wharton School, University of Pennsylvania, Philadelphia
- Jeremy Sokolove
- VA Palo Alto Healthcare System and Stanford University School of Medicine, Palo Alto, California
- Tianxi Cai
- Harvard T. H. Chan School of Public Health, Boston, Massachusetts
|
26
|
Wei WQ, Bastarache LA, Carroll RJ, Marlo JE, Osterman TJ, Gamazon ER, Cox NJ, Roden DM, Denny JC. Evaluating phecodes, clinical classification software, and ICD-9-CM codes for phenome-wide association studies in the electronic health record. PLoS One 2017; 12:e0175508. [PMID: 28686612 PMCID: PMC5501393 DOI: 10.1371/journal.pone.0175508] [Citation(s) in RCA: 250] [Impact Index Per Article: 31.3] [Received: 09/13/2016] [Accepted: 03/27/2017] [Indexed: 12/20/2022]
Abstract
OBJECTIVE To compare three groupings of Electronic Health Record (EHR) billing codes for their ability to represent clinically meaningful phenotypes and to replicate known genetic associations. The three tested coding systems were the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes, the Agency for Healthcare Research and Quality Clinical Classification Software for ICD-9-CM (CCS), and manually curated "phecodes" designed to facilitate phenome-wide association studies (PheWAS) in EHRs. METHODS AND MATERIALS We selected 100 disease phenotypes and compared the ability of each coding system to accurately represent them without performing additional groupings. The 100 phenotypes included 25 randomly-chosen clinical phenotypes pursued in prior genome-wide association studies (GWAS) and another 75 common disease phenotypes mentioned across free-text problem lists from 189,289 individuals. We then evaluated the performance of each coding system to replicate known associations for 440 SNP-phenotype pairs. RESULTS Out of the 100 tested clinical phenotypes, phecodes exactly matched 83, compared to 53 for ICD-9-CM and 32 for CCS. ICD-9-CM codes were typically too detailed (requiring custom groupings) while CCS codes were often not granular enough. Among 440 tested known SNP-phenotype associations, use of phecodes replicated 153 SNP-phenotype pairs compared to 143 for ICD-9-CM and 139 for CCS. Phecodes also generally produced stronger odds ratios and lower p-values for known associations than ICD-9-CM and CCS. Finally, evaluation of several SNPs via PheWAS identified novel potential signals, some seen only when using the phecode approach. Among them, rs7318369 in PEPD was associated with gastrointestinal hemorrhage. CONCLUSION Our results suggest that the phecode groupings better align with clinical diseases mentioned in clinical practice or for genomic studies. ICD-9-CM, CCS, and phecode groupings all worked for PheWAS-type studies, though the phecode groupings produced superior results.
Affiliation(s)
- Wei-Qi Wei
- Departments of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Lisa A. Bastarache
- Departments of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Robert J. Carroll
- Departments of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Joy E. Marlo
- Departments of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Travis J. Osterman
- Departments of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Departments of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Eric R. Gamazon
- Vanderbilt Genetic Institute and the Division of Genetic Medicine, Vanderbilt University, Nashville, TN, United States of America
- Department of Clinical Epidemiology, Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands
- Department of Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands
- Department of Psychiatry, Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands
- Nancy J. Cox
- Vanderbilt Genetic Institute and the Division of Genetic Medicine, Vanderbilt University, Nashville, TN, United States of America
- Dan M. Roden
- Departments of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Departments of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Department of Clinical Pharmacology, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Joshua C. Denny
- Departments of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Departments of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America
|
27
|
Jannot AS, Burgun A, Thervet E, Pallet N. The Diagnosis-Wide Landscape of Hospital-Acquired AKI. Clin J Am Soc Nephrol 2017; 12:874-884. [PMID: 28495862 PMCID: PMC5460713 DOI: 10.2215/cjn.10981016] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 10/20/2016] [Accepted: 03/01/2017] [Indexed: 11/23/2022]
Abstract
BACKGROUND AND OBJECTIVES The exploration of electronic hospital records offers a unique opportunity to describe in-depth the prevalence of conditions associated with diagnoses at an unprecedented level of comprehensiveness. We used a diagnosis-wide approach, adapted from phenome-wide association studies (PheWAS), to perform an exhaustive analysis of all diagnoses associated with hospital-acquired AKI (HA-AKI) in a French urban tertiary academic hospital over a period of 10 years. DESIGN, SETTING, PARTICIPANTS, & MEASUREMENTS We retrospectively extracted all diagnoses from an i2b2 (Informatics for Integrating Biology and the Bedside) clinical data warehouse for patients who stayed in this hospital between 2006 and 2015 and had at least two plasma creatinine measurements performed during the first week of their stay. We then analyzed the association between HA-AKI and each International Classification of Diseases (ICD)-10 diagnostic category to draw a comprehensive picture of diagnoses associated with AKI. Hospital stays for 126,736 unique individuals were extracted. RESULTS Hemodynamic impairment and surgical procedures are the main factors associated with HA-AKI and five clusters of diagnoses were identified: sepsis, heart diseases, polytrauma, liver disease, and cardiovascular surgery. The ICD-10 code corresponding to AKI (N17) was recorded in 30% of the cases with HA-AKI identified, and in this situation, 20% of the diagnoses associated with HA-AKI corresponded to kidney diseases such as tubulointerstitial nephritis, necrotizing vasculitis, or myeloma cast nephropathy. Codes associated with HA-AKI that demonstrated the greatest increase in prevalence with time were related to influenza, polytrauma, and surgery of neoplasms of the genitourinary system. CONCLUSIONS Our approach, derived from PheWAS, is a valuable way to comprehensively identify and classify all of the diagnoses and clusters of diagnoses associated with HA-AKI. Our analysis delivers insights into how diagnoses associated with HA-AKI evolved over time. On the basis of ICD-10 codes, HA-AKI appears largely underestimated in this academic hospital.
Affiliation(s)
- Anne-Sophie Jannot
- Departments of Medical Informatics, Biostatistics and Public Health
- Assistance Publique Hôpitaux de Paris, Paris, France
- Paris Descartes University, Paris, France
- National Institute for Health and Research (INSERM) U1138, Centre de Recherche des Cordeliers, Paris, France
- Anita Burgun
- Departments of Medical Informatics, Biostatistics and Public Health
- Assistance Publique Hôpitaux de Paris, Paris, France
- Paris Descartes University, Paris, France
- National Institute for Health and Research (INSERM) U1138, Centre de Recherche des Cordeliers, Paris, France
- Eric Thervet
- Nephrology
- Assistance Publique Hôpitaux de Paris, Paris, France
- Paris Descartes University, Paris, France
- Nicolas Pallet
- Nephrology
- Clinical Chemistry, Hôpital Européen Georges Pompidou, Paris, France
- Assistance Publique Hôpitaux de Paris, Paris, France
- Paris Descartes University, Paris, France
|
28
|
Murphy SN, Avillach P, Bellazzi R, Phillips L, Gabetta M, Eran A, McDuffie MT, Kohane IS. Combining clinical and genomics queries using i2b2 - Three methods. PLoS One 2017; 12:e0172187. [PMID: 28388645 PMCID: PMC5384666 DOI: 10.1371/journal.pone.0172187] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 05/18/2016] [Accepted: 02/01/2017] [Indexed: 12/30/2022]
Abstract
We are fortunate to be living in an era of twin biomedical data surges: a burgeoning representation of human phenotypes in the medical records of our healthcare systems, and high-throughput sequencing making rapid technological advances. The difficulty representing genomic data and its annotations has almost by itself led to the recognition of a biomedical "Big Data" challenge, and the complexity of healthcare data only compounds the problem to the point that coherent representation of both systems on the same platform seems insuperably difficult. We investigated the capability for complex, integrative genomic and clinical queries to be supported in the Informatics for Integrating Biology and the Bedside (i2b2) translational software package. Three different data integration approaches were developed: The first is based on Sequence Ontology, the second is based on the tranSMART engine, and the third on CouchDB. These novel methods for representing and querying complex genomic and clinical data on the i2b2 platform are available today for advancing precision medicine.
Affiliation(s)
- Shawn N. Murphy
- Research IS and Computing, Partners HealthCare, Charlestown, Massachusetts, United States of America
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, United States of America
- Laboratory of Computer Science, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Paul Avillach
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, United States of America
- Children’s Hospital Informatics Program, Boston Children’s Hospital, Boston, Massachusetts, United States of America
- Riccardo Bellazzi
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- IRCCS Fondazione S. Maugeri, Pavia, Italy
- Centre for Health Technologies, University of Pavia, Pavia, Italy
- Lori Phillips
- Research IS and Computing, Partners HealthCare, Charlestown, Massachusetts, United States of America
- Matteo Gabetta
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- Biomeris s.r.l, Via Ferrata, Pavia, Italy
- Alal Eran
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, United States of America
- Children’s Hospital Informatics Program, Boston Children’s Hospital, Boston, Massachusetts, United States of America
- Michael T. McDuffie
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, United States of America
- Children’s Hospital Informatics Program, Boston Children’s Hospital, Boston, Massachusetts, United States of America
- Isaac S. Kohane
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, United States of America
- Children’s Hospital Informatics Program, Boston Children’s Hospital, Boston, Massachusetts, United States of America
|
29
|
Denny JC, Bastarache L, Roden DM. Phenome-Wide Association Studies as a Tool to Advance Precision Medicine. Annu Rev Genomics Hum Genet 2016; 17:353-73. [PMID: 27147087 PMCID: PMC5480096 DOI: 10.1146/annurev-genom-090314-024956] [Citation(s) in RCA: 164] [Impact Index Per Article: 18.2] [Indexed: 12/18/2022]
Abstract
Beginning in the early 2000s, the accumulation of biospecimens linked to electronic health records (EHRs) made possible genome-phenome studies (i.e., comparative analyses of genetic variants and phenotypes) using only data collected as a by-product of typical health care. In addition to disease and trait genetics, EHRs proved a valuable resource for analyzing pharmacogenetic traits and developing reverse genetics approaches such as phenome-wide association studies (PheWASs). PheWASs are designed to survey which of many phenotypes may be associated with a given genetic variant. PheWAS methods have been validated through replication of hundreds of known genotype-phenotype associations, and their use has differentiated between true pleiotropy and clinical comorbidity, added context to genetic discoveries, and helped define disease subtypes, and may also help repurpose medications. PheWAS methods have also proven to be useful with research-collected data. Future efforts that integrate broad, robust collection of phenotype data (e.g., EHR data) with purpose-collected research data in combination with a greater understanding of EHR data will create a rich resource for increasingly more efficient and detailed genome-phenome analysis to usher in new discoveries in precision medicine.
Affiliation(s)
- Joshua C Denny
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee 37203
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee 37232
- Lisa Bastarache
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee 37203
- Dan M Roden
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee 37203
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee 37232
- Department of Pharmacology, Vanderbilt University School of Medicine, Nashville, Tennessee 37232
|
30
|
Han Y, Li L, Zhang Y, Yuan H, Ye L, Zhao J, Duan DD. Phenomics of Vascular Disease: The Systematic Approach to the Combination Therapy. Curr Vasc Pharmacol 2016; 13:433-40. [PMID: 25313004 PMCID: PMC4397150 DOI: 10.2174/1570161112666141014144829] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 07/17/2013] [Revised: 02/15/2014] [Accepted: 05/21/2014] [Indexed: 12/28/2022]
Abstract
Vascular diseases are usually caused by multifactorial pathogeneses involving genetic and environmental factors. Our current understanding of vascular disease is, however, based on the focused genotype/phenotype studies driven by the “one-gene/one-phenotype” hypothesis. Drugs with “pure target” at individual molecules involved in the pathophysiological pathways are the mainstream of current clinical treatments and the basis of combination therapy of vascular diseases. Recently, the combination of genomics, proteomics, and metabolomics has unraveled the etiology and pathophysiology of vascular disease in a big-data fashion and also revealed unmatched relationships between the omic variability and the much narrower definition of various clinical phenotypes of vascular disease in individual patients. Here, we introduce the phenomics strategy that will change the conventional focused phenotype/genotype/genome study to a new systematic phenome/genome/proteome approach to the understanding of pathophysiology and combination therapy of vascular disease. A phenome is the sum total of an organism’s phenotypic traits that signify the expression of genome and specific environmental influence. Phenomics is the study of phenome to quantitatively correlate complex traits to variability not only in genome, but also in transcriptome, proteome, metabolome, interactome, and environmental factors by exploring the systems biology that links the genomic and phenomic spaces. The application of phenomics and the phenome-wide association study (PheWAS) will not only identify a systemically-integrated set of biomarkers for diagnosis and prognosis of vascular disease but also provide novel treatment targets for combination therapy and thus make a revolutionary paradigm shift in the clinical treatment of these devastating diseases.
Affiliation(s)
- Dayue Darrel Duan
- Laboratory of Cardiovascular Phenomics, Department of Pharmacology, University of Nevada School of Medicine, Center for Molecular Medicine 303F, 1664 N Virginia Street/MS 318, Reno, Nevada 89557-0318, USA.
|
31
|
Zhang YP, Zhang YY, Duan DD. From Genome-Wide Association Study to Phenome-Wide Association Study: New Paradigms in Obesity Research. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2016; 140:185-231. [PMID: 27288830 DOI: 10.1016/bs.pmbts.2016.02.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022]
Abstract
Obesity is a condition in which excess body fat has accumulated to an extent that increases the risk of many chronic diseases. The current clinical classification of obesity is based on measurement of body mass index (BMI), waist-hip ratio, and body fat percentage. However, these measurements do not account for the wide individual variations in fat distribution, degree of fatness or health risks, and genetic variants identified in the genome-wide association studies (GWAS). In this review, we will address this important issue with the introduction of phenome, phenomics, and phenome-wide association study (PheWAS). We will discuss the new paradigm shift from GWAS to PheWAS in obesity research. In the era of precision medicine, phenomics and PheWAS provide the required approaches to a better definition and classification of obesity according to the association of obese phenome with their unique molecular makeup, lifestyle, and environmental impact.
Affiliation(s)
- Y-P Zhang
- Pediatric Heart Center, Beijing Anzhen Hospital, Capital Medical University, Beijing, China
- Y-Y Zhang
- Department of Cardiology, Changzhou Second People's Hospital, Changzhou, Jiangsu, China
- D D Duan
- Laboratory of Cardiovascular Phenomics, Center for Cardiovascular Research, Department of Pharmacology, and Center for Molecular Medicine, University of Nevada School of Medicine, Reno, NV, United States.
|
32
|
Unravelling the human genome-phenome relationship using phenome-wide association studies. Nat Rev Genet 2016; 17:129-45. [PMID: 26875678 DOI: 10.1038/nrg.2015.36] [Citation(s) in RCA: 182] [Impact Index Per Article: 20.2] [Indexed: 12/24/2022]
Abstract
Advances in genotyping technology have, over the past decade, enabled the focused search for common genetic variation associated with human diseases and traits. With the recently increased availability of detailed phenotypic data from electronic health records and epidemiological studies, the impact of one or more genetic variants on the phenome is starting to be characterized both in clinical and population-based settings using phenome-wide association studies (PheWAS). These studies reveal a number of challenges that will need to be overcome to unlock the full potential of PheWAS for the characterization of the complex human genome-phenome relationship.
|
33
|
Pendergrass SA, Verma A, Okula A, Hall MA, Crawford DC, Ritchie MD. Phenome-Wide Association Studies: Embracing Complexity for Discovery. Hum Hered 2015. [PMID: 26201697 DOI: 10.1159/000381851] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 11/19/2022]
Abstract
The inherent complexity of biological systems can be leveraged for a greater understanding of the impact of genetic architecture on outcomes, traits, and pharmacological response. The genome-wide association study (GWAS) approach has well-developed methods and relatively straight-forward methodologies; however, the bigger picture of the impact of genetic architecture on phenotypic outcome still remains to be elucidated even with an ever-growing number of GWAS performed. Greater consideration of the complexity of biological processes, using more data from the phenome, exposome, and diverse -omic resources, including considering the interplay of pleiotropy and genetic interactions, may provide additional leverage for making the most of the incredible wealth of information available for study. Here, we describe how incorporating greater complexity into analyses through the use of additional phenotypic data and widespread deployment of phenome-wide association studies may provide new insights into genetic factors influencing diseases, traits, and pharmacological response.
Affiliation(s)
- Sarah A Pendergrass
- Biomedical and Translational Informatics Program, Geisinger Health System, Danville, Pa., USA
|
34
|
Syed-Abdul S, Moldovan M, Nguyen PA, Enikeev R, Jian WS, Iqbal U, Hsu MH, Li YC. Profiling phenome-wide associations: a population-based observational study. J Am Med Inform Assoc 2015; 22:896-9. [PMID: 25656518 PMCID: PMC11737641 DOI: 10.1093/jamia/ocu019] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 09/02/2014] [Revised: 10/23/2014] [Accepted: 11/02/2014] [Indexed: 12/31/2022]
Abstract
OBJECTIVES To objectively characterize phenome-wide associations observed in the entire Taiwanese population and represent them in a meaningful, interpretable way. STUDY DESIGN In this population-based observational study, we analyzed 782 million outpatient visits and 15 394 unique phenotypes that were observed in the entire Taiwanese population of over 22 million individuals. Our data were obtained from Taiwan's National Health Insurance Research Database. RESULTS We stratified the population into 20 gender-age groups and generated 28.8 million and 31.8 million pairwise odds ratios from male and female subpopulations, respectively. These associations can be accessed online at http://associations.phr.tmu.edu.tw. To demonstrate the database and validate the association estimates obtained, we used correlation analysis to analyze 100 phenotypes that were observed to have the strongest positive association estimates with respect to essential hypertension. The results indicated that association patterns tended to have a strong positive correlation between adjacent age groups, while correlation estimates tended to decline as groups became more distant in age, and they diverged when assessed across gender groups. CONCLUSIONS The correlation analysis of pairwise disease association patterns across different age and gender groups led to outcomes that were broadly predicted before the analysis, thus confirming the validity of the information contained in the presented database. More diverse individual disease-specific analyses would lead to a better understanding of phenome-wide associations and empower physicians to provide personalized care in terms of predicting, preventing, or initiating an early management of concomitant diseases.
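Illustrative note: the abstract describes generating pairwise odds ratios between phenotypes within each gender-age stratum. A minimal sketch of that idea for a single stratum follows, assuming each person is represented as a set of phenotype codes. The function name, the toy codes, and the 0.5 continuity correction are assumptions for illustration and are not details taken from the study.

```python
from itertools import combinations

def pairwise_odds_ratios(patients):
    """Odds ratio of co-occurrence for every pair of phenotype codes.

    patients: list of sets, one set of phenotype codes per person in one stratum.
    Returns {(code_x, code_y): odds_ratio}, with code_x < code_y alphabetically.
    """
    codes = sorted(set().union(*patients))
    n = len(patients)
    result = {}
    for x, y in combinations(codes, 2):
        both = sum(1 for p in patients if x in p and y in p)
        only_x = sum(1 for p in patients if x in p and y not in p)
        only_y = sum(1 for p in patients if y in p and x not in p)
        neither = n - both - only_x - only_y
        # Adding 0.5 to each cell (Haldane-Anscombe) keeps the ratio finite
        # when a cell count is zero; this correction is an assumption here.
        result[(x, y)] = ((both + 0.5) * (neither + 0.5)) / (
            (only_x + 0.5) * (only_y + 0.5)
        )
    return result

# Hypothetical stratum of eight people with made-up phenotype labels
stratum = [
    {"HTN", "T2D"}, {"HTN", "T2D"}, {"HTN"}, {"T2D"},
    set(), set(), {"ASTHMA"}, set(),
]
ors = pairwise_odds_ratios(stratum)
for pair, value in sorted(ors.items()):
    print(pair, round(value, 2))
```

Repeating this per stratum yields one association profile per gender-age group, which is the kind of input a correlation analysis across groups would compare.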
Affiliation(s)
- Shabbir Syed-Abdul
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Max Moldovan
- Centre for Clinical Governance Research, Australian Institute of Health Innovation, Faculty of Medicine, University of New South Wales, Sydney, Australia; School of Population Health, Sansom Institute for Health Research, University of South Australia, South Australian Health & Medical Research Institute (SAHMRI)
- Phung-Anh Nguyen
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Wen-Shan Jian
- School of Health Care Administration, Taipei Medical University, Taipei, Taiwan
- Usman Iqbal
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Min-Huei Hsu
- Bureau of International Cooperation, Department of Health, Taipei, Taiwan
- Yu-Chuan Li
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan; Department of Dermatology, Wan Fang Hospital, Taiwan
|
35
|
Pendergrass SA, Ritchie MD. Phenome-Wide Association Studies: Leveraging Comprehensive Phenotypic and Genotypic Data for Discovery. Current Genetic Medicine Reports 2015; 3:92-100. [PMID: 26146598 PMCID: PMC4489156 DOI: 10.1007/s40142-015-0067-9] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Indexed: 12/31/2022]
Abstract
With the large volume of clinical and epidemiological data being collected, increasingly linked to extensive genotypic data, coupled with expanding high-performance computational resources, there are considerable opportunities for comprehensively exploring the networks of connections that exist between the phenome and the genome. These networks can be identified through Phenome-Wide Association Studies (PheWAS) where the associations between a collection of genetic variants, or in some cases a particular clinical lab variable, and a wide and diverse range of phenotypes, diagnoses, traits, and/or outcomes are evaluated. This is a departure from the more familiar genome-wide association study (GWAS) approach, which has been used to identify single nucleotide polymorphisms (SNPs) associated with one outcome or a very limited phenotypic domain. In addition to highlighting novel connections between multiple phenotypes and elucidating more of the phenotype-genotype landscape, PheWAS can generate new hypotheses for further exploration, and can also be used to narrow the search space for research using comprehensive data collections. The complex results of PheWAS also have the potential for uncovering new mechanistic insights. We review here how the PheWAS approach has been used with data from epidemiological studies, clinical trials, and de-identified electronic health record data. We also review methodologies for the analyses underlying PheWAS, and emerging methods developed for evaluating the comprehensive results of PheWAS including genotype-phenotype networks. This review also highlights PheWAS as an important tool for identifying new biomarkers, elucidating the genetic architecture of complex traits, and uncovering pleiotropy. There are many directions and new methodologies for the future of PheWAS analyses, from the phenotypic data to the genetic data, and herein we also discuss some of these important future PheWAS developments.
|
36
|
Katsila T, Patrinos GP. Whole genome sequencing in pharmacogenomics. Front Pharmacol 2015; 6:61. [PMID: 25859217 PMCID: PMC4374451 DOI: 10.3389/fphar.2015.00061] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Received: 01/12/2015] [Accepted: 03/09/2015] [Indexed: 11/13/2022]
Abstract
Pharmacogenomics aims to shed light on the role of genes and genomic variants in clinical treatment response. Although several drug-gene relationships have been characterized to date, many challenges remain for the application of pharmacogenomics in the clinic: clinical guidelines for pharmacogenomic testing are still in their infancy, whereas emerging high-throughput genotyping technologies produce a tsunami of new findings. Herein, the potential of whole genome sequencing for pharmacogenomics research and clinical application is highlighted.
Affiliation(s)
- Theodora Katsila
  - Department of Pharmacy, School of Health Sciences, University of Patras, Patras, Greece
- George P Patrinos
  - Department of Pharmacy, School of Health Sciences, University of Patras, Patras, Greece
|
37
|
Hebbring SJ, Rastegar-Mojarad M, Ye Z, Mayer J, Jacobson C, Lin S. Application of clinical text data for phenome-wide association studies (PheWASs). Bioinformatics 2015; 31:1981-7. [PMID: 25657332 DOI: 10.1093/bioinformatics/btv076] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 06/10/2014] [Accepted: 02/02/2015] [Indexed: 01/01/2023]
Abstract
MOTIVATION Genome-wide association studies (GWASs) are effective for describing genetic complexities of common diseases. Phenome-wide association studies (PheWASs) offer an alternative and complementary approach to GWAS using data embedded in the electronic health record (EHR) to define the phenome. International Classification of Disease version 9 (ICD9) codes are used frequently to define the phenome, but using ICD9 codes alone misses other clinically relevant information from the EHR that can be used for PheWAS analyses and discovery. RESULTS As an alternative to ICD9 coding, a text-based phenome was defined by 23 384 clinically relevant terms extracted from Marshfield Clinic's EHR. Five single nucleotide polymorphisms (SNPs) with known phenotypic associations were genotyped in 4235 individuals and associated across the text-based phenome. All five SNPs genotyped were associated with expected terms (P<0.02), most at or near the top of their respective PheWAS ranking. Raw association results indicate that text data performed equivalently to ICD9 coding and demonstrate the utility of information beyond ICD9 coding for application in PheWAS.
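The scan-and-rank step described in the abstract above (one variant tested against every term in the phenome, with expected terms appearing near the top of the ranking) can be sketched as follows. This is only an illustration: the Pearson chi-square test, the function names, and the toy data are assumptions of the example, since the abstract does not say which test statistic was used.

```python
import math

def chi2_p_value(a, b, c, d):
    """P-value of the Pearson chi-square test (1 df) for a 2x2 table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # an empty margin carries no information
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2))

def phewas_scan(carriers, everyone, phenome):
    """Test one variant against every phenotype; return (term, p) sorted by p.

    carriers: set of person ids carrying the variant
    everyone: set of all genotyped person ids
    phenome:  dict mapping a term to the set of person ids having that term
    """
    non_carriers = everyone - carriers
    results = []
    for term, cases in phenome.items():
        a = len(carriers & cases)      # carriers with the term
        b = len(carriers - cases)      # carriers without it
        c = len(non_carriers & cases)  # non-carriers with the term
        d = len(non_carriers - cases)  # non-carriers without it
        results.append((term, chi2_p_value(a, b, c, d)))
    return sorted(results, key=lambda r: r[1])

# Toy data, entirely hypothetical: 40 people, 20 of them carriers.
everyone = set(range(40))
carriers = set(range(20))
phenome = {
    "term_A": set(range(0, 16)),   # seen only in carriers
    "term_B": set(range(10, 30)),  # evenly split between the groups
}
ranked = phewas_scan(carriers, everyone, phenome)
for term, p in ranked:
    print(term, f"{p:.3g}")
```

In a real analysis the per-term test would typically adjust for covariates and the ranking would be followed by a multiple-testing correction; both are left out here to keep the sketch short.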
Affiliation(s)
- Scott J Hebbring
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA
- Majid Rastegar-Mojarad
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA
- Zhan Ye
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA
- John Mayer
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA
- Crystal Jacobson
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA
- Simon Lin
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA
|
38
|
|
39
|
Namjou B, Marsolo K, Caroll RJ, Denny JC, Ritchie MD, Verma SS, Lingren T, Porollo A, Cobb BL, Perry C, Kottyan LC, Rothenberg ME, Thompson SD, Holm IA, Kohane IS, Harley JB. Phenome-wide association study (PheWAS) in EMR-linked pediatric cohorts, genetically links PLCL1 to speech language development and IL5-IL13 to Eosinophilic Esophagitis. Front Genet 2014; 5:401. [PMID: 25477900 PMCID: PMC4235428 DOI: 10.3389/fgene.2014.00401] [Citation(s) in RCA: 64] [Impact Index Per Article: 5.8] [Received: 05/29/2014] [Accepted: 10/31/2014] [Indexed: 02/06/2023]
Abstract
Objective: We report the first pediatric specific Phenome-Wide Association Study (PheWAS) using electronic medical records (EMRs). Given the early success of PheWAS in adult populations, we investigated the feasibility of this approach in pediatric cohorts in which associations between a previously known genetic variant and a wide range of clinical or physiological traits were evaluated. Although computationally intensive, this approach has potential to reveal disease mechanistic relationships between a variant and a network of phenotypes. Method: Data on 5049 samples of European ancestry were obtained from the EMRs of two large academic centers in five different genotyped cohorts. Recently, these samples have undergone whole genome imputation. After standard quality controls, removing missing data and outliers based on principal components analyses (PCA), 4268 samples were used for the PheWAS study. We scanned for associations between 2476 single-nucleotide polymorphisms (SNPs) with available genotyping data from previously published GWAS studies and 539 EMR-derived phenotypes. The false discovery rate was calculated and, for any new PheWAS findings, a permutation approach (with up to 1,000,000 trials) was implemented. Results: This PheWAS found a variety of common variants (MAF > 10%) with prior GWAS associations in our pediatric cohorts including Juvenile Rheumatoid Arthritis (JRA), Asthma, Autism and Pervasive Developmental Disorder (PDD) and Type 1 Diabetes with a false discovery rate < 0.05 and power of study above 80%. In addition, several new PheWAS findings were identified including a cluster of association near the NDFIP1 gene for mental retardation (best SNP rs10057309, p = 4.33 × 10⁻⁷, OR = 1.70, 95% CI = 1.38-2.09); association near PLCL1 gene for developmental delays and speech disorder (best SNP rs1595825, p = 1.13 × 10⁻⁸, OR = 0.65, 95% CI = 0.57-0.76); a cluster of associations in the IL5-IL13 region with Eosinophilic Esophagitis (EoE) (best at rs12653750, p = 3.03 × 10⁻⁹, OR = 1.73, 95% CI = 1.44-2.07), previously implicated in asthma, allergy, and eosinophilia; and association of variants in GCKR and JAZF1 with allergic rhinitis in our pediatric cohorts (best SNP rs780093, p = 2.18 × 10⁻⁵, OR = 1.39, 95% CI = 1.19-1.61), previously demonstrated in metabolic disease and diabetes in adults. Conclusion: The PheWAS approach with re-mapping ICD-9 structured codes for our European-origin pediatric cohorts, as with the previous adult studies, finds many previously reported associations as well as presents the discovery of associations with potentially important clinical implications.
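The abstract above reports findings at a false discovery rate below 0.05 but does not name the procedure used. As a hedged illustration only, the sketch below applies the Benjamini-Hochberg step-up adjustment, one common way to control the false discovery rate; the function name and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone: each one is capped by those ranked above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values from five SNP-phenotype tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
q = benjamini_hochberg(p_values)
significant = [p for p, adj in zip(p_values, q) if adj < 0.05]
print(q)
print(significant)
```

With these made-up inputs, only the two smallest p-values remain below the 0.05 threshold after adjustment, even though four of the five raw p-values are below 0.05.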
Affiliation(s)
- Bahram Namjou
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; College of Medicine, University of Cincinnati, Cincinnati, OH, USA
- Keith Marsolo
  - College of Medicine, University of Cincinnati, Cincinnati, OH, USA; Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Robert J Caroll
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA
- Joshua C Denny
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA; Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
- Marylyn D Ritchie
  - Center for Systems Genomics, The Pennsylvania State University, Philadelphia, PA, USA
- Shefali S Verma
  - Center for Systems Genomics, The Pennsylvania State University, Philadelphia, PA, USA
- Todd Lingren
  - College of Medicine, University of Cincinnati, Cincinnati, OH, USA; Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Aleksey Porollo
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; College of Medicine, University of Cincinnati, Cincinnati, OH, USA; Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Beth L Cobb
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Cassandra Perry
  - Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
- Leah C Kottyan
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; College of Medicine, University of Cincinnati, Cincinnati, OH, USA; Division of Allergy and Immunology, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Marc E Rothenberg
  - Division of Allergy and Immunology, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Susan D Thompson
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; College of Medicine, University of Cincinnati, Cincinnati, OH, USA
- Ingrid A Holm
  - Division of Genetics and Genomics, Department of Pediatrics, The Manton Center for Orphan Disease Research, Harvard Medical School, Boston Children's Hospital, Boston, MA, USA
- Isaac S Kohane
  - Children's Hospital Informatics Program, Center for Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- John B Harley
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; College of Medicine, University of Cincinnati, Cincinnati, OH, USA; U.S. Department of Veterans Affairs Medical Center, Cincinnati, OH, USA
|
40
|
Mooney SD. Progress towards the integration of pharmacogenomics in practice. Hum Genet 2014; 134:459-65. [PMID: 25238897 DOI: 10.1007/s00439-014-1484-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 04/15/2014] [Accepted: 08/20/2014] [Indexed: 12/12/2022]
Abstract
Understanding the role genes and genetic variants play in clinical treatment response continues to be an active area of research with the goal of common clinical use. This goal has developed into today's industry of pharmacogenomics, where new drug-gene relationships are discovered and further characterized, published and then curated into national and international resources for use by researchers and clinicians. These efforts have given us insight into what a pharmacogenomic variant is, and how it differs from human disease variants and common polymorphisms. While publications continue to reveal pharmacogenomic relationships between genes and specific classes of drugs, many challenges remain toward the goal of widespread use clinically. First, the clinical guidelines for pharmacogenomic testing are still in their infancy. Second, sequencing technologies are changing rapidly making it somewhat unclear what genetic data will be available to the clinician at the time of care. Finally, what and when to return data to a patient is an area under constant debate. New innovations such as PheWAS approaches and whole genome sequencing studies are enabling a tsunami of new findings. In this review, pharmacogenomic variants, pharmacogenomic resources, interpretation clinical guidelines and challenges, such as WGS approaches, and the impact of pharmacogenomics on drug development and regulatory approval are reviewed.
Affiliation(s)
- Sean D Mooney
  - Buck Institute for Research on Aging, 8001 Redwood Blvd, Novato, CA, 94945, USA
|
41
|
Monte AA, Brocker C, Nebert DW, Gonzalez FJ, Thompson DC, Vasiliou V. Improved drug therapy: triangulating phenomics with genomics and metabolomics. Hum Genomics 2014; 8:16. [PMID: 25181945 PMCID: PMC4445687 DOI: 10.1186/s40246-014-0016-9] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 05/14/2014] [Accepted: 08/05/2014] [Indexed: 12/23/2022]
Abstract
Embracing the complexity of biological systems is more likely to improve prediction of clinical drug response. Here we discuss limitations of a singular focus on genomics, epigenomics, proteomics, transcriptomics, metabolomics, or phenomics, highlighting the strengths and weaknesses of each individual technique. In contrast, 'systems biology' is proposed to allow clinicians and scientists to extract benefits from each technique, while limiting associated weaknesses by supplementing with other techniques when appropriate. Perfect predictive modeling is not possible, whereas modeling of intertwined phenomic responses using genomic stratification with metabolomic modifications may greatly improve predictive values for drug therapy. We thus propose a novel integrated approach to personalized medicine that begins with phenomic data, is stratified by genomics, and is ultimately refined by metabolomic pathway data. Whereas perfect prediction of efficacy and safety of drug therapy is not possible, improvements can be achieved by embracing the complexity of the biological system. Starting with phenomics, the combination of linking metabolomics to identify common biologic pathways and then stratifying by genomic architecture might increase predictive values. This systems biology approach has the potential, in specific subsets of patients, to avoid drug therapy that will be either ineffective or unsafe.
Affiliation(s)
- Andrew A Monte
  - University of Colorado Department of Emergency Medicine, Leprino Building, 7th Floor Campus Box B-215, 12401 E. 17th Avenue, Aurora, CO, 80045, USA.
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, Aurora, CO, 80045, USA.
  - Rocky Mountain Poison & Drug Center, Denver, CO, 80204, USA.
- Chad Brocker
  - Laboratory of Metabolism, Center for Cancer Research, National Institute of Cancer, Bethesda, MD, 20892, USA.
- Daniel W Nebert
  - Division of Human Genetics, Department of Pediatrics and Molecular Developmental Biology, University of Cincinnati Medical Center, Cincinnati, OH, 45220, USA.
  - Department of Environmental Health and Center for Environmental Genetics, University of Cincinnati Medical Center, Cincinnati, OH, 45220, USA.
- Frank J Gonzalez
  - Laboratory of Metabolism, Center for Cancer Research, National Institute of Cancer, Bethesda, MD, 20892, USA.
- David C Thompson
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, Aurora, CO, 80045, USA.
- Vasilis Vasiliou
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, Aurora, CO, 80045, USA.
|
42
|
Agúndez JAG, Esguevillas G, Amo G, García-Martín E. Clinical practice guidelines for translating pharmacogenomic knowledge to bedside. Focus on anticancer drugs. Front Pharmacol 2014; 5:188. [PMID: 25191268 PMCID: PMC4137539 DOI: 10.3389/fphar.2014.00188] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/03/2014] [Accepted: 07/24/2014] [Indexed: 11/13/2022]
Abstract
The development of clinical practice recommendations or guidelines for the clinical use of pharmacogenomics data is an essential issue for improving drug therapy, particularly for drugs with high toxicity and/or narrow therapeutic index such as anticancer drugs. Although pharmacogenomic-based recommendations have been formulated for over 40 anticancer drugs, the number of clinical practice guidelines available is very low. The guidelines already published indicate that pharmacogenomic testing is useful for patient selection, but final dosing adjustment should be carried out on the basis of clinical or analytical parameters rather than on pharmacogenomic information. Patient selection may seem a modest objective, but it constitutes a crucial improvement with regard to the pre-pharmacogenomics situation and it saves patients' lives. However, we should not overstate the current power of pharmacogenomics. At present the pharmacogenomics of anticancer drugs is not sufficiently developed for dose adjustments based on pharmacogenomics only, and no current guidelines recommend such adjustments without considering clinical and/or analytical parameters. This objective, if ever attained, would require the use of available guidelines, further implementation with clinical feedback, plus a combination of genomics and phenomics knowledge.
Affiliation(s)
- José A G Agúndez
  - Department of Pharmacology, University of Extremadura, Cáceres, Spain; ISCIII Research Network of Adverse Reactions to Allergens and Drugs, Madrid, Spain
- Gara Esguevillas
  - Department of Pharmacology, University of Extremadura, Cáceres, Spain; ISCIII Research Network of Adverse Reactions to Allergens and Drugs, Madrid, Spain
- Gemma Amo
  - Department of Pharmacology, University of Extremadura, Cáceres, Spain; ISCIII Research Network of Adverse Reactions to Allergens and Drugs, Madrid, Spain
- Elena García-Martín
  - Department of Pharmacology, University of Extremadura, Cáceres, Spain; ISCIII Research Network of Adverse Reactions to Allergens and Drugs, Madrid, Spain
|
43
|
Denny JC. Surveying Recent Themes in Translational Bioinformatics: Big Data in EHRs, Omics for Drugs, and Personal Genomics. Yearb Med Inform 2014; 9:199-205. [PMID: 25123743 PMCID: PMC4287076 DOI: 10.15265/iy-2014-0015] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 01/23/2023]
Abstract
OBJECTIVE To provide a survey of recent progress in the use of large-scale biologic data to impact clinical care, and the impact the reuse of electronic health record data has made in genomic discovery. METHOD Survey of key themes in translational bioinformatics, primarily from 2012 and 2013. RESULT This survey focuses on four major themes: the growing use of Electronic Health Records (EHRs) as a source for genomic discovery, adoption of genomics and pharmacogenomics in clinical practice, the possible use of genomic technologies for drug repurposing, and the use of personal genomics to guide care. CONCLUSION Reuse of abundant clinical data for research is speeding discovery, and implementation of genomic data into clinical medicine is impacting care with new classes of data rarely used previously in medicine.
Affiliation(s)
- J C Denny
  - Joshua C. Denny, MD, MS, 2525 West End Ave - Suite 672, Nashville, TN 37213, USA, E-mail:
|