1
Lingyu Z, Wanting Z, Ying G, Qianqian J, Chunfang W. Association of lipid-lowering drugs with venous thromboembolism outcomes: a phenome-wide association study and a drug-target Mendelian randomization study. J Thromb Thrombolysis 2025:10.1007/s11239-025-03108-z. [PMID: 40415139 DOI: 10.1007/s11239-025-03108-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/06/2025] [Indexed: 05/27/2025]
Abstract
Common anticoagulants can lead to potentially fatal internal bleeding, which restricts their extensive use in the prevention of venous thromboembolism (VTE). We aimed to use a PheWAS and MR analysis to find novel therapeutic targets for VTE events (comprising pulmonary embolism (PE) and deep vein thrombosis (DVT)) and offer new opportunities to develop safer and more effective preventative medications. The present study utilized a PheWAS analysis to examine 2005 health-related phenotypes from the MRC-IEU consortium on VTE risk genetic variants to pinpoint possible treatment targets. Subsequently, through summary-data-based MR (SMR) and inverse-variance-weighted MR (IVW-MR) analysis, we assessed the associations between lipid-lowering drug targets (including HMGCR inhibitor, PCSK9 inhibitor, and NPC1L1 inhibitor) and VTE events. We utilized two types of genetic instruments to represent the exposure to lipid-lowering drugs: eQTLs of drug target genes and genetic variants within or near drug target genes associated with LDL cholesterol from genome-wide association studies. PheWAS analysis identified 13 cholesterol-related traits significantly associated with VTE risk, indicating that lipid-lowering drugs might affect VTE outcomes. SMR analysis showed that higher NPC1L1 gene expression in the blood was a risk factor for PE (OR = 1.107, 95% CI = 1.026-1.195; p = 0.009). Additionally, an IVW-MR association was found between LDL mediated by NPC1L1 and blood clot in the lung (OR = 4.091, 95% CI = 1.375-12.173; p = 0.011). This study suggested a potential causal relationship between NPC1L1 inhibition and the reduced risk of PE.
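The inverse-variance-weighted MR step named in this abstract can be illustrated with a minimal sketch. It assumes the standard fixed-effect IVW estimator over per-variant Wald ratios with uncorrelated instruments; the function names and input numbers are hypothetical and are not taken from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-variant summary statistics.

    Each variant contributes a Wald ratio (beta_outcome / beta_exposure),
    weighted by beta_exposure**2 / se_outcome**2. Assumes independent variants.
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return beta, se


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds estimate into an odds ratio with a 95% interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical summary statistics for two instruments (illustration only).
beta, se = ivw_estimate([0.1, 0.2], [0.05, 0.1], [0.01, 0.02])
print(odds_ratio_with_ci(beta, se))
```

Published analyses typically rely on dedicated MR software and add sensitivity checks; this sketch only shows the arithmetic behind the pooled estimate.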
Affiliation(s)
- Zhang Lingyu
  - Shanxi Key Laboratory of Human Disease and Animal Models, Experimental Animal Center of Shanxi Medical University, Taiyuan, 030001, Shanxi, China
- Zhong Wanting
  - School of Medical Science, Shanxi Medical University, Taiyuan, 030001, Shanxi, China
- Guo Ying
  - College of Basic Medical, Shanxi Medical University, Taiyuan, 030001, Shanxi, China
- Jin Qianqian
  - Department of Forensic Pathology, Shanxi Medical University, Taiyuan, 030001, Shanxi, China
- Wang Chunfang
  - Shanxi Key Laboratory of Human Disease and Animal Models, Experimental Animal Center of Shanxi Medical University, Taiyuan, 030001, Shanxi, China
  - Department of Laboratory Animal Center, Shanxi Medical University, Taiyuan, 030001, Shanxi, China
2
Allaire P, Fox J, Kitchner T, Gabor R, Folz C, Bettadahalli S, Hebbring S. Familial Renal Glucosuria and Potential Pharmacogenetic Impact on Sodium-Glucose Cotransporter-2 Inhibitors. KIDNEY360 2025; 6:521-530. [PMID: 39412882 PMCID: PMC12045503 DOI: 10.34067/kid.0000000621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Accepted: 10/09/2024] [Indexed: 10/18/2024]
Abstract
Key Points A significant knowledge gap exists in SLC5A2's role in familial renal glycosuria and sodium-glucose cotransporter-2 inhibitors' efficacy. Two percent of individuals in the All-of-Us cohort harbor rare genetic variants in SLC5A2, potentially increasing the risk of familial renal glycosuria. Our trial suggests differential responses to sodium-glucose cotransporter-2 inhibitors in individuals with rare SLC5A2 alleles compared with wild types. Background Renal glucosuria is a rare inheritable trait caused by loss-of-function variants in the gene that encodes sodium-glucose cotransporter-2 (SGLT2) (i.e., SLC5A2). The genetics of renal glucosuria is poorly understood, and even less is known on how loss-of-function variants in SLC5A2 may affect response to SGLT2 inhibitors, a new class of medication gaining popularity to treat diabetes by artificially inducing glucosuria. Methods We used two biobanks that link genomic with electronic health record data to study the genetics of renal glucosuria. This included 245,394 participants enrolled in the All of Us Research Program and 11,011 enrolled in Marshfield Clinic's Personalized Research Project (PMRP). Association studies in All of Us and PMRP identified ten variants that reached an experiment-wise Bonferroni threshold in either cohort, of which nine were novel. PMRP was further used as a recruitment source for a prospective SGLT2 pharmacogenetic trial. During a glucose tolerance test, the trial measured urine glucose concentrations in 15 SLC5A2 variant–positive individuals and 15 matched wild types with and without an SGLT2 inhibitor. Results This trial demonstrated that carriers of SLC5A2 risk variants may be more sensitive to SGLT2 inhibitors compared with wild types (P = 0.075). On the basis of population data, 2% of an ethnically diverse population carried rare variants in SLC5A2 and are at risk of renal glucosuria.
Conclusions As a result, 2% of individuals being treated with SGLT2 inhibitors may respond differently to this new class of medication compared with the general population, suggesting that a larger investigation into SLC5A2 variants and SGLT2 inhibitors is needed.
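The experiment-wise Bonferroni threshold mentioned in the Methods can be sketched as follows. The abstract does not state how many tests were run, so the number of tests and the p-values below are hypothetical placeholders.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes_bonferroni(p_values, alpha=0.05):
    """Flag which p-values fall at or below the Bonferroni-adjusted threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Hypothetical p-values for three variant tests (not from the study).
print(passes_bonferroni([0.001, 0.02, 0.5]))
```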
Affiliation(s)
- Patrick Allaire
  - Center for Precision Medicine Research, Marshfield Clinic Health System, Marshfield, Wisconsin
3
Lin YH, Hung CC, Lin GC, Tsai IC, Lum CY, Hsiao TH. Utilizing polygenic risk score for breast cancer risk prediction in a Taiwanese population. Cancer Epidemiol 2025; 94:102701. [PMID: 39705763 DOI: 10.1016/j.canep.2024.102701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Revised: 10/25/2024] [Accepted: 11/04/2024] [Indexed: 12/23/2024]
Abstract
BACKGROUND Breast cancer has been the most frequently diagnosed cancer among women in Taiwan since 2003. While genetic variants play a significant role in the elevated risk of breast cancer, their implications have been less explored within Asian populations. Variant-based polygenic risk scores (PRS) have emerged as valuable tools for assessing the likelihood of developing breast cancer. In light of this, we attempted to establish a predictive breast cancer PRS tailored specifically for the Taiwanese population. METHODS The cohort analyzed in this study comprised 28,443 control subjects and 1501 breast cancer cases. These individuals were sourced from the Taiwan Precision Medicine Initiative (TPMI) array and the breast cancer registry lists at Taichung Veterans General Hospital (TCVGH). Utilizing the breast cancer-associated Polygenic Score (PGS) Catalog, we employed logistic regression to identify the most effective PRS for predicting breast cancer risk. Subsequently, we subjected the cohort of 1501 breast cancer patients to further analysis to investigate potential heterogeneity in breast cancer risk. RESULTS The Polygenic Score ID PGS000508 demonstrated a significant association with breast cancer risk in Taiwanese women, with a 1.498-fold increase in cancer risk (OR = 1.498, 95% CI: 1.431-1.567; p=5.38×10^-68). Individuals in the highest quartile exhibited a substantially elevated risk compared to those in the lowest quartile, with an odds ratio (OR) of 3.11 (95% CI: 2.70-3.59; p=1.15×10^-55). In a cohort of 1501 breast cancer cases stratified by PRS distribution, women in the highest quartile were diagnosed at a significantly younger age (p=0.003) compared to those in the lowest quartile. However, no significant differences were observed between PRS quartiles in relation to clinical stage (p=0.274), pathological stage (p=0.647), or tumor subtype distribution (p=0.244).
CONCLUSION In our study, we pinpointed PGS000508 as a significant predictive factor for breast cancer risk in Taiwanese women. Furthermore, we found that a higher PGS000508 score was associated with younger age at the time of first diagnosis among the breast cancer cases examined.
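A variant-based polygenic score and the quartile stratification described above can be sketched as below. This is an illustration only: the variant identifiers, weights, and dosages are hypothetical, the score is a plain weighted sum of effect-allele dosages, and the quartile cut points come from the sample itself rather than from the study's reference distribution.

```python
from statistics import quantiles


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages over variants present in the weight table."""
    return sum(weights[variant] * dose for variant, dose in dosages.items() if variant in weights)


def quartile_labels(scores):
    """Assign each score to quartile 1 (lowest) through 4 (highest) using sample cut points."""
    cuts = quantiles(scores, n=4)
    return [1 + sum(score > cut for cut in cuts) for score in scores]


# Hypothetical per-variant weights and one person's dosages (0, 1, or 2 copies).
weights = {"variant_a": 0.10, "variant_b": -0.20}
print(polygenic_score({"variant_a": 2, "variant_b": 1}, weights))
print(quartile_labels([1, 2, 3, 4, 5, 6, 7, 8]))
```

Comparing case and control frequencies between the top and bottom quartiles, for instance with logistic regression as in the abstract, would be the next step and is not shown here.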
Affiliation(s)
- Yi-Hsuan Lin
  - Division of Breast Surgery, Department of Surgery, Taichung Veterans General Hospital, Taichung 40705, Taiwan
- Chih-Chiang Hung
  - Division of Breast Surgery, Department of Surgery, Taichung Veterans General Hospital, Taichung 40705, Taiwan; Department of Applied Cosmetology, College of Human Science and Social Innovation, Hung Kuang University, Taichung 43302, Taiwan; Ph.D Program in Translational Medicine, College of Life Sciences, National Chung Hsing University, Taichung 40227, Taiwan
- Guan-Cheng Lin
  - Department of Medical Research, Taichung Veterans General Hospital, Taichung, Taiwan
- I-Chen Tsai
  - Division of Breast Surgery, Department of Surgery, Taichung Veterans General Hospital, Taichung 40705, Taiwan; College of Biomedical, China Medical University, Taichung, Taiwan
- Chih Yean Lum
  - Division of Breast Surgery, Department of Surgery, Taichung Veterans General Hospital, Taichung 40705, Taiwan
- Tzu-Hung Hsiao
  - Ph.D Program in Translational Medicine, College of Life Sciences, National Chung Hsing University, Taichung 40227, Taiwan; Department of Public Health, Fu Jen Catholic University, New Taipei City 24205, Taiwan; Institute of Genomics and Bioinformatics, National Chung Hsing University, Taichung 4022, Taiwan
4
Wu MH, Zhang MH, Hu XD, Fan HX. Proteome-wide Mendelian randomization and therapeutic targets for bladder cancer. BMC Urol 2024; 24:273. [PMID: 39707285 DOI: 10.1186/s12894-024-01677-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2024] [Accepted: 12/16/2024] [Indexed: 12/23/2024] Open
Abstract
OBJECTIVE To identify therapeutic protein targets for bladder cancer (BCa) using Mendelian randomization (MR) and assess potential adverse effects of these targets. METHODS A proteome-wide MR study was conducted to determine causal relationships between plasma proteins and BCa risk. In the discovery stage, the plasma proteins (Exposure) were sourced from the R10 of Finnish database, Olink (619 samples across 2925 proteins) and SomaScan (828 samples across 7596 proteins), and Iceland database. In the replication stage, plasma proteins (Exposure) were sourced from the UK-Biobank-PPP database (54,219 participants and 2940 proteins). Summary-level data for BCa (Outcome) were obtained from the UK Biobank (UKB-SAIGE: cancer of bladder) in the discovery phase and the FinnGen consortium (FinnGen R11: cancer of bladder) in the replication phase. Colocalization and fixed-effect meta-analyses were performed to validate MR findings. Finally, a phenome-wide association study (Phe-WAS) was conducted to explore the side effects of druggable proteins utilizing UKB-SAIGE encompassing 783 phenotypes. RESULTS The MR analysis identified PSCA, LY6D, and SLURP1 as proteins with a genetic association to BCa risk. SLURP1 was confirmed in the replication phase, with a meta-analysis showing an odds ratio of 1.50 (95% CI: 1.30-1.74, P < 0.001). Phe-WAS indicated potential side effects for these targets. CONCLUSION This study provides insights into the causal relationships of plasma proteins with BCa, identifying PSCA, LY6D, and SLURP1 as potential therapeutic targets, with implications for future BCa treatment strategies.
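The meta-analysis step that combines discovery and replication MR estimates can be sketched as a standard inverse-variance pooling of log odds ratios. This is a minimal illustration under that assumption; the abstract does not give the per-stage estimates, so the helper names and any inputs are hypothetical.

```python
import math

Z_95 = 1.959964  # normal quantile for a two-sided 95% interval


def log_or_and_se(odds_ratio, ci_lower, ci_upper):
    """Recover the log odds ratio and its standard error from a reported 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * Z_95)
    return beta, se


def fixed_effect_pool(estimates):
    """Inverse-variance-weighted pooled estimate from (beta, se) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    return beta, math.sqrt(1.0 / total_weight)


# The pooled OR reported in the abstract, converted back to the log scale.
print(log_or_and_se(1.50, 1.30, 1.74))
```

A fixed-effect model assumes both stages estimate the same underlying effect; a random-effects model would be the usual alternative when that is doubtful.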
Affiliation(s)
- Meng-Hua Wu
  - Department of Urology, Beijing Hospital of Traditional Chinese Medicine, Capital Medical University, Beijing, China
- Min-Heng Zhang
  - Department of Gerontology, the First People's Hospital of Jinzhong, Jinzhong, Shanxi Province, China
- Xiao-Dong Hu
  - First Hospital of Shanxi Medical University, No. 85 Jiefang South Road, Taiyuan, Shanxi Province, 030001, China
- Hai-Xia Fan
  - First Hospital of Shanxi Medical University, No. 85 Jiefang South Road, Taiyuan, Shanxi Province, 030001, China
5
Yang G, Schooling CM. Genetically mimicked effects of ASGR1 inhibitors on all-cause mortality and health outcomes: a drug-target Mendelian randomization study and a phenome-wide association study. BMC Med 2023; 21:235. [PMID: 37400795 DOI: 10.1186/s12916-023-02903-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/04/2023] [Accepted: 05/19/2023] [Indexed: 07/05/2023] Open
Abstract
BACKGROUND Asialoglycoprotein receptor 1 (ASGR1) is emerging as a potential drug target to reduce low-density lipoprotein (LDL)-cholesterol and coronary artery disease (CAD) risk. Here, we investigated genetically mimicked ASGR1 inhibitors on all-cause mortality and any possible adverse effects. METHODS We conducted a drug-target Mendelian randomization study to assess genetically mimicked effects of ASGR1 inhibitors on all-cause mortality and 25 a priori outcomes relevant to lipid traits, CAD, and possible adverse effects, i.e. liver function, cholelithiasis, adiposity and type 2 diabetes. We also performed a phenome-wide association study of 1951 health-related phenotypes to identify any novel effects. Associations found were compared with those for currently used lipid modifiers, assessed using colocalization, and replicated where possible. RESULTS Genetically mimicked ASGR1 inhibitors were associated with a longer lifespan (3.31 years per standard deviation reduction in LDL-cholesterol, 95% confidence interval 1.01 to 5.62). Genetically mimicked ASGR1 inhibitors were inversely associated with apolipoprotein B (apoB), triglycerides (TG) and CAD risk. Genetically mimicked ASGR1 inhibitors were positively associated with alkaline phosphatase, gamma glutamyltransferase, erythrocyte traits, insulin-like growth factor 1 (IGF-1) and C-reactive protein (CRP), but were inversely associated with albumin and calcium. Genetically mimicked ASGR1 inhibitors were not associated with cholelithiasis, adiposity or type 2 diabetes. Associations with apoB and TG were stronger for ASGR1 inhibitors compared with currently used lipid modifiers, and most non-lipid effects were specific to ASGR1 inhibitors. The probabilities for colocalization were > 0.80 for most of these associations, but were 0.42 for lifespan and 0.30 for CAD. These associations were replicated using alternative genetic instruments and other publicly available genetic summary statistics. 
CONCLUSIONS Genetically mimicked ASGR1 inhibitors reduced all-cause mortality. Beyond lipid-lowering, genetically mimicked ASGR1 inhibitors increased liver enzymes, erythrocyte traits, IGF-1 and CRP, but decreased albumin and calcium.
Affiliation(s)
- Guoyi Yang
  - School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- C Mary Schooling
  - School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
  - Graduate School of Public Health and Health Policy, City University of New York, New York, USA
6
Vukadinovic M, Kwan AC, Yuan V, Salerno M, Lee DC, Albert CM, Cheng S, Li D, Ouyang D, Clarke SL. Deep learning-enabled analysis of medical images identifies cardiac sphericity as an early marker of cardiomyopathy and related outcomes. MED 2023; 4:252-262.e3. [PMID: 36996817 PMCID: PMC10106428 DOI: 10.1016/j.medj.2023.02.009] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/07/2022] [Revised: 01/02/2023] [Accepted: 02/15/2023] [Indexed: 03/31/2023]
Abstract
BACKGROUND Quantification of chamber size and systolic function is a fundamental component of cardiac imaging. However, the human heart is a complex structure with significant uncharacterized phenotypic variation beyond traditional metrics of size and function. Examining variation in cardiac shape can add to our ability to understand cardiovascular risk and pathophysiology. METHODS We measured the left ventricle (LV) sphericity index (short axis length/long axis length) using deep learning-enabled image segmentation of cardiac magnetic resonance imaging data from the UK Biobank. Subjects with abnormal LV size or systolic function were excluded. The relationship between LV sphericity and cardiomyopathy was assessed using Cox analyses, genome-wide association studies, and two-sample Mendelian randomization. FINDINGS In a cohort of 38,897 subjects, we show that a one standard deviation increase in sphericity index is associated with a 47% increased incidence of cardiomyopathy (hazard ratio [HR]: 1.47, 95% confidence interval [CI]: 1.10-1.98, p = 0.01) and a 20% increased incidence of atrial fibrillation (HR: 1.20, 95% CI: 1.11-1.28, p < 0.001), independent of clinical factors and traditional magnetic resonance imaging (MRI) measurements. We identify four loci associated with sphericity at genome-wide significance, and Mendelian randomization supports non-ischemic cardiomyopathy as causal for LV sphericity. CONCLUSIONS Variation in LV sphericity in otherwise normal hearts predicts risk for cardiomyopathy and related outcomes and is caused by non-ischemic cardiomyopathy. FUNDING This study was supported by grants K99-HL157421 (D.O.) and KL2TR003143 (S.L.C.) from the National Institutes of Health.
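The sphericity index defined in the Methods (short axis length divided by long axis length) and the per-standard-deviation scaling used in the Findings can be sketched as follows. The axis lengths here are hypothetical; in the study they come from deep learning-based segmentation of cardiac MRI, which is not reproduced.

```python
from statistics import mean, stdev


def sphericity_index(short_axis_length, long_axis_length):
    """Left ventricle sphericity index: short axis length / long axis length."""
    if long_axis_length <= 0:
        raise ValueError("long_axis_length must be positive")
    return short_axis_length / long_axis_length


def standardize(values):
    """Convert raw values to z-scores so effects can be expressed per standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    return [(v - mu) / sd for v in values]


# Hypothetical (short axis, long axis) measurements in centimetres for three subjects.
indices = [sphericity_index(s, l) for s, l in [(4.5, 9.0), (5.0, 8.5), (4.8, 9.2)]]
print(standardize(indices))
```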
Affiliation(s)
- Milos Vukadinovic
  - Department of Bioengineering, University of California Los Angeles, Los Angeles, CA 90095, USA; Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA; Department of Cardiology, Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Alan C Kwan
  - Department of Cardiology, Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Victoria Yuan
  - School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Michael Salerno
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA 94306, USA
- Daniel C Lee
  - Department of Medicine and Radiology, Northwestern Medicine, Chicago, IL 60611, USA
- Christine M Albert
  - Department of Cardiology, Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Susan Cheng
  - Department of Cardiology, Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Debiao Li
  - Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- David Ouyang
  - Department of Cardiology, Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA; Division of Artificial Intelligence in Medicine, Department of Medicine, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Shoa L Clarke
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA 94306, USA
7
Zeng C, Bastarache LA, Tao R, Venner E, Hebbring S, Andujar JD, Bland ST, Crosslin DR, Pratap S, Cooley A, Pacheco JA, Christensen KD, Perez E, Zawatsky CLB, Witkowski L, Zouk H, Weng C, Leppig KA, Sleiman PMA, Hakonarson H, Williams MS, Luo Y, Jarvik GP, Green RC, Chung WK, Gharavi AG, Lennon NJ, Rehm HL, Gibbs RA, Peterson JF, Roden DM, Wiesner GL, Denny JC. Association of Pathogenic Variants in Hereditary Cancer Genes With Multiple Diseases. JAMA Oncol 2022; 8:835-844. [PMID: 35446370 PMCID: PMC9026237 DOI: 10.1001/jamaoncol.2022.0373] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 12/19/2022]
Abstract
Importance Knowledge about the spectrum of diseases associated with hereditary cancer syndromes may improve disease diagnosis and management for patients and help to identify high-risk individuals. Objective To identify phenotypes associated with hereditary cancer genes through a phenome-wide association study. Design, Setting, and Participants This phenome-wide association study used health data from participants in 3 cohorts. The Electronic Medical Records and Genomics Sequencing (eMERGEseq) data set recruited predominantly healthy individuals from 10 US medical centers from July 16, 2016, through February 18, 2018, with a mean follow-up through electronic health records (EHRs) of 12.7 (7.4) years. The UK Biobank (UKB) cohort recruited participants from March 15, 2006, through August 1, 2010, with a mean (SD) follow-up of 12.4 (1.0) years. The Hereditary Cancer Registry (HCR) recruited patients undergoing clinical genetic testing at Vanderbilt University Medical Center from May 1, 2012, through December 31, 2019, with a mean (SD) follow-up through EHRs of 8.8 (6.5) years. Exposures Germline variants in 23 hereditary cancer genes. Pathogenic and likely pathogenic variants for each gene were aggregated for association analyses. Main Outcomes and Measures Phenotypes in the eMERGEseq and HCR cohorts were derived from the linked EHRs. Phenotypes in UKB were from multiple sources of health-related data. Results A total of 214 020 participants were identified, including 23 544 in eMERGEseq cohort (mean [SD] age, 47.8 [23.7] years; 12 611 women [53.6%]), 187 234 in the UKB cohort (mean [SD] age, 56.7 [8.1] years; 104 055 [55.6%] women), and 3242 in the HCR cohort (mean [SD] age, 52.5 [15.5] years; 2851 [87.9%] women). All 38 established gene-cancer associations were replicated, and 19 new associations were identified. 
These included the following 7 associations with neoplasms: CHEK2 with leukemia (odds ratio [OR], 3.81 [95% CI, 2.64-5.48]) and plasma cell neoplasms (OR, 3.12 [95% CI, 1.84-5.28]), ATM with gastric cancer (OR, 4.27 [95% CI, 2.35-7.44]) and pancreatic cancer (OR, 4.44 [95% CI, 2.66-7.40]), MUTYH (biallelic) with kidney cancer (OR, 32.28 [95% CI, 6.40-162.73]), MSH6 with bladder cancer (OR, 5.63 [95% CI, 2.75-11.49]), and APC with benign liver/intrahepatic bile duct tumors (OR, 52.01 [95% CI, 14.29-189.29]). The remaining 12 associations with nonneoplastic diseases included BRCA1/2 with ovarian cysts (OR, 3.15 [95% CI, 2.22-4.46] and 3.12 [95% CI, 2.36-4.12], respectively), MEN1 with acute pancreatitis (OR, 33.45 [95% CI, 9.25-121.02]), APC with gastritis and duodenitis (OR, 4.66 [95% CI, 2.61-8.33]), and PTEN with chronic gastritis (OR, 15.68 [95% CI, 6.01-40.92]). Conclusions and Relevance The findings of this genetic association study analyzing the EHRs of 3 large cohorts suggest that these new phenotypes associated with hereditary cancer genes may facilitate early detection and better management of cancers. This study highlights the potential benefits of using EHR data in genomic medicine.
Affiliation(s)
- Chenjie Zeng
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland
- Lisa A Bastarache
  - Center for Precision Medicine, Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee
- Ran Tao
  - Department of Biostatistics, Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee
- Eric Venner
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas
- Scott Hebbring
  - Center for Human Genetics, Marshfield Clinic Research Institute, Marshfield, Wisconsin
- Justin D Andujar
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee; Clinical and Translational Hereditary Cancer Program, Division of Genetic Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University, Nashville, Tennessee
- Sarah T Bland
  - Center for Precision Medicine, Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee
- David R Crosslin
  - Department of Biomedical Informatics and Medical Education, University of Washington School of Medicine, Seattle
- Siddharth Pratap
  - School of Graduate Studies and Research, Meharry Medical College, Nashville, Tennessee
- Ayorinde Cooley
  - Department of Microbiology, Immunology and Physiology, Meharry Medical College, Nashville, Tennessee
- Jennifer A Pacheco
  - Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois
- Kurt D Christensen
  - PRecisiOn Medicine Translational Research (PROMoTeR) Center, Department of Population Medicine, Harvard Pilgrim Health Care Institute, Boston, Massachusetts; Department of Population Medicine, Harvard Medical School, Boston, Massachusetts
- Emma Perez
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Carrie L Blout Zawatsky
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Leora Witkowski
  - Centre Universitaire de Santé McGill, McGill University Health Centre, Montreal, Quebec, Canada
- Hana Zouk
  - Laboratory for Molecular Medicine, Partners Healthcare Personalized Medicine, Cambridge, Massachusetts; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York
- Kathleen A Leppig
  - Genetic Services and Kaiser Permanente Washington Health Research Institute, Kaiser Permanente of Washington, Seattle
- Patrick M A Sleiman
  - Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania; Division of Human Genetics, Department of Pediatrics, The University of Pennsylvania Perelman School of Medicine, Philadelphia
- Hakon Hakonarson
  - Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania; Division of Human Genetics, Department of Pediatrics, The University of Pennsylvania Perelman School of Medicine, Philadelphia
- Marc S Williams
  - Genomic Medicine Institute, Geisinger, Danville, Pennsylvania
- Yuan Luo
  - Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois
- Gail P Jarvik
  - Department of Medicine (Medical Genetics), University of Washington, Seattle; Department of Genome Sciences, University of Washington, Seattle
- Robert C Green
  - Brigham and Women's Hospital, Broad Institute, Ariadne Labs and Harvard Medical School, Boston, Massachusetts
- Wendy K Chung
  - Department of Pediatrics, Columbia University, New York, New York; Department of Medicine, Columbia University, New York, New York
- Ali G Gharavi
  - Division of Nephrology, Department of Medicine, Columbia University Irving Medical Center, New York, New York; Center for Precision Medicine and Genomics, Department of Medicine, Columbia University Irving Medical Center, New York, New York
- Niall J Lennon
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts
- Heidi L Rehm
  - Medical & Population Genetics Program and Genomics Platform, Broad Institute of MIT and Harvard Cambridge, Cambridge, Massachusetts; Center for Genomic Medicine, Massachusetts General Hospital, Boston; Department of Pathology, Harvard Medical School, Boston, Massachusetts
- Richard A Gibbs
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas
- Josh F Peterson
  - Center for Precision Medicine, Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee
- Dan M Roden
  - Center for Precision Medicine, Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee; Divisions of Cardiovascular Medicine and Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee; Department of Pharmacology, Vanderbilt University, Nashville, Tennessee
- Georgia L Wiesner
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee; Clinical and Translational Hereditary Cancer Program, Division of Genetic Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University, Nashville, Tennessee
- Joshua C Denny
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland
8
Abstract
Electronic health records (EHRs) are a rich source of data for researchers, but extracting meaningful information out of this highly complex data source is challenging. Phecodes represent one strategy for defining phenotypes for research using EHR data. They are a high-throughput phenotyping tool based on ICD (International Classification of Diseases) codes that can be used to rapidly define the case/control status of thousands of clinically meaningful diseases and conditions. Phecodes were originally developed to conduct phenome-wide association studies to scan for phenotypic associations with common genetic variants. Since then, phecodes have been used to support a wide range of EHR-based phenotyping methods, including the phenotype risk score. This review aims to comprehensively describe the development, validation, and applications of phecodes and suggest some future directions for phecodes and high-throughput phenotyping.
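The idea of using ICD-based phecodes to assign case/control status can be sketched as below. Everything in this example is a labeled assumption: the mapping table uses made-up placeholder codes rather than the real phecode map, and the rule of requiring at least two mapped codes for a case is a parameter chosen for illustration.

```python
from collections import Counter

# Hypothetical toy mapping; the real ICD-to-phecode map is far larger and is not reproduced here.
TOY_ICD_TO_PHECODE = {
    "ICD_A1": "PHE_100",
    "ICD_A2": "PHE_100",
    "ICD_B1": "PHE_200",
}


def phecode_counts(icd_codes, mapping):
    """Count how many of a patient's ICD codes roll up to each phecode."""
    return Counter(mapping[code] for code in icd_codes if code in mapping)


def case_status(icd_codes, phecode, mapping, min_count=2):
    """Label a patient as case, control, or excluded for one phecode.

    min_count is an assumed threshold: patients with some but too few
    mapped codes are set aside instead of being counted as controls.
    """
    n = phecode_counts(icd_codes, mapping)[phecode]
    if n >= min_count:
        return "case"
    if n == 0:
        return "control"
    return "excluded"


print(case_status(["ICD_A1", "ICD_A2"], "PHE_100", TOY_ICD_TO_PHECODE))
```

This simplified version ignores refinements such as excluding patients with related conditions from the control group.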
Affiliation(s)
- Lisa Bastarache
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
9
Abstract
BACKGROUND Beyond their success in cardiovascular disease prevention, statins are increasingly recognized to have sex-specific pleiotropic effects. To gain additional insight, we characterized associations of genetically mimicked statins across the phenotype sex-specifically. We also assessed whether any apparently non-lipid effects identified extended to genetically mimicking other widely used lipid modifiers (proprotein convertase subtilisin/kexin type 9 (PCSK9) inhibitors and ezetimibe) or were a consequence of low-density lipoprotein cholesterol (LDL-c). METHODS We performed a sex-specific phenome-wide association study assessing the association of genetic variants in HMGCR, mimicking statins, with 1701 phenotypes. We used Mendelian randomization (MR) to assess if any non-lipid effects found were evident for genetically mimicked PCSK9 inhibitors and ezetimibe or for LDL-c. RESULTS As expected, genetically mimicking statins was inversely associated with LDL-c, apolipoprotein B (ApoB), and total cholesterol (TC) and positively associated with glycated hemoglobin (HbA1c) and was related to body composition. Genetically mimicking statins was also inversely associated with serum calcium, sex hormone-binding globulin (SHBG), and platelet count and positively associated with basal metabolic rate (BMR) and mean platelet volume. Stronger associations with genetically mimicked statins were evident for women than men for lipid traits (LDL-c, ApoB, and TC), calcium, and SHBG, but not for platelet attributes, body composition, or BMR. Genetically mimicking PCSK9 inhibitors or ezetimibe was also associated with lower lipids, but was not related to calcium, SHBG, BMR, or body composition. Genetically higher LDL-c increased lipids and decreased BMR, but did not affect calcium, HbA1c, platelet attributes, or SHBG with minor effects on body composition. 
CONCLUSIONS Similar inverse associations were found for genetically mimicking statins on lipid traits in men and women as for other lipid modifiers. Besides the positive associations with HbA1c, BMI (which may explain the higher BMR), and aspects of body composition in men and women, genetically mimicking statins was additionally associated with platelet attributes in both sexes and was inversely associated with serum calcium and SHBG in women. This genetic evidence suggests potential pathways that contribute to the effects of statins particularly in women. Further investigation is needed to confirm these findings and their implications for clinical practice.
|
10
|
Abstract
BACKGROUND ABO blood group is associated with differences in lifespan, cardiovascular disease, and some cancers, for reasons which are incompletely understood. To gain additional sex-specific insight into potential mechanisms driving these common conditions for future interventions, we characterized sex-specific associations of ABO blood group antigens across the phenome. METHODS We performed a phenome-wide association study (PheWAS) assessing the association of tag single nucleotide polymorphisms (SNPs) for ABO blood group antigens (O, B, A1, and A2) with 3873 phenotypes. RESULTS The tag SNP for the O antigen was inversely associated with diseases of the circulatory system (particularly deep vein thrombosis (DVT)), total cholesterol, low-density lipoprotein cholesterol (LDL-C), and ovarian cancer, and positively associated with erythrocyte traits, leukocyte counts, diastolic blood pressure (DBP), and healthy body composition; the tag SNP for the A1 antigen tended to have associations in the opposite direction to O. Associations were stronger for men than women for DVT, DBP, leukocyte traits, and some body composition traits, whereas larger effect sizes were found for women than men for some erythrocyte and lipid traits. CONCLUSION Blood group has a complex association with cardiovascular diseases and their major risk factors, including blood pressure and lipids, as well as with blood cell traits and body composition, with some differences by sex. Lower LDL-C may underlie some of the benefits of blood group O, but the complexity of associations with blood group antigen suggests overlooked drivers of common chronic diseases.
Affiliation(s)
- Shun Li
  - School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, 7 Sassoon Rd, Pokfulam, Hong Kong, Special Administrative Region, China
- C M Schooling
  - School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, 7 Sassoon Rd, Pokfulam, Hong Kong, Special Administrative Region, China
  - School of Public Health and Health Policy, The City University of New York, 55 W 125 St, New York, NY, 10027, USA
|
11
|
Dueñas HR, Seah C, Johnson JS, Huckins LM. Implicit bias of encoded variables: frameworks for addressing structured bias in EHR-GWAS data. Hum Mol Genet 2020; 29:R33-R41. [PMID: 32879975 PMCID: PMC7530523 DOI: 10.1093/hmg/ddaa192] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 07/01/2020] [Revised: 08/17/2020] [Accepted: 08/18/2020] [Indexed: 12/20/2022]
Abstract
The 'discovery' stage of genome-wide association studies required amassing large, homogeneous cohorts. In order to attain clinically useful insights, we must now consider the presentation of disease within our clinics and, by extension, within our medical records. Large-scale use of electronic health record (EHR) data can help to understand phenotypes in a scalable manner, incorporating lifelong and whole-phenome context. However, extending analyses to incorporate EHR and biobank-based analyses will require careful consideration of phenotype definition. Judgements and clinical decisions that occur 'outside' the system inevitably contain some degree of bias and become encoded in EHR data. Any algorithmic approach to phenotypic characterization that assumes non-biased variables will generate compounded biased conclusions. Here, we discuss and illustrate potential biases inherent within EHR analyses, how these may be compounded across time and suggest frameworks for large-scale phenotypic analysis to minimize and uncover encoded bias.
Affiliation(s)
- Hillary R Dueñas
  - Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Carina Seah
  - Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Jessica S Johnson
  - Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Laura M Huckins
  - Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Mental Illness Research, Education and Clinical Centers, James J. Peters Department of Veterans Affairs Medical Center, Bronx, NY 10468, USA
|
12
|
Braun JM, Kalloo G, Kingsley SL, Li N. Using phenome-wide association studies to examine the effect of environmental exposures on human health. Environment International 2019; 130:104877. [PMID: 31200158 PMCID: PMC6682449 DOI: 10.1016/j.envint.2019.05.071] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 02/20/2019] [Revised: 05/10/2019] [Accepted: 05/27/2019] [Indexed: 05/04/2023]
Abstract
The field of environmental epidemiology has been using "-omics" technologies, including the exposome, metabolome, and methylome, to understand the potential effects and biological pathways of a number of environmental pollutants. However, the majority of studies have focused on a single disease or phenotype, and have not systematically considered patterns of multimorbidity and whether environmental pollutants have pleiotropic effects. These questions could be addressed by examining the relation between environmental exposures and the phenome - the patterns and profiles of human health that individuals experience from birth to death. By conducting Phenome Wide Association Studies (PheWAS), we can generate new hypotheses about new or poorly understood exposures, identify novel associations for established toxicants, and better understand biological pathways affected by environmental pollutants. In this article, we provide a conceptual framework for conducting PheWAS in environmental epidemiology and summarize some of the advantages and challenges to using the PheWAS to study environmental pollutant exposures. Ultimately, by adding the PheWAS to our "-omics" toolbox, we could substantially improve our understanding of the potential health effects of environmental pollutants.
Affiliation(s)
- Joseph M Braun
  - Department of Epidemiology, Brown University, Providence, RI, United States of America
- Geetika Kalloo
  - Department of Epidemiology, Brown University, Providence, RI, United States of America
- Samantha L Kingsley
  - Department of Epidemiology, Brown University, Providence, RI, United States of America
- Nan Li
  - Department of Epidemiology, Brown University, Providence, RI, United States of America
|
13
|
Abstract
RATIONALE Cystic fibrosis, like primary ciliary dyskinesia, is an autosomal recessive disorder characterized by abnormal mucociliary clearance and obstructive lung disease. We hypothesized that genes underlying the development or function of cilia may modify lung disease severity in persons with cystic fibrosis. OBJECTIVES To test this hypothesis, we compared variants in 93 candidate genes in both upper and lower tertiles of lung function in a large cohort of children and adults with cystic fibrosis with those of a population control dataset. METHODS Variants within candidate genes were tested for association using the SKAT-O test, comparing cystic fibrosis cases defined by poor (n = 127) or preserved (n = 127) lung function with population controls (n = 3,269 or 3,148, respectively). Associated variants were then tested for association with related phenotypes in independent datasets. RESULTS Variants in DNAH14 and DNAAF3 were associated with poor lung function in cystic fibrosis, whereas variants in DNAH14 and DNAH6 were associated with preserved lung function in cystic fibrosis. Associations between DNAH14 and lung function were replicated in disease-related phenotypes characterized by obstructive lung disease in adults. CONCLUSIONS Genetic variants within DNAH6, DNAH14, and DNAAF3 are associated with variation in lung function among persons with cystic fibrosis.
|
14
|
Huang X, Elston RC, Rosa GJ, Mayer J, Ye Z, Kitchner T, Brilliant MH, Page D, Hebbring SJ. Applying family analyses to electronic health records to facilitate genetic research. Bioinformatics 2018; 34:635-642. [PMID: 28968884 DOI: 10.1093/bioinformatics/btx569] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/24/2017] [Accepted: 09/13/2017] [Indexed: 12/20/2022]
Abstract
Motivation Pedigree analysis is a longstanding and powerful approach to gain insight into the underlying genetic factors in human health, but identifying, recruiting and genotyping families can be difficult, time consuming and costly. Development of high-throughput methods to identify families and foster downstream analyses is necessary. Results This paper describes simple methods that allowed us to identify 173 368 family pedigrees with high probability using basic demographic data available in most electronic health records (EHRs). We further developed and validated a novel statistical method that uses EHR data to identify families more likely to have a major genetic component to their disease risk. Lastly, we showed that incorporating EHR-linked family data into genetic association testing may provide added power for genetic mapping without additional recruitment or genotyping. The totality of these results suggests that EHR-linked families can enable classical genetic analyses in a high-throughput manner. Availability and implementation Pseudocode is provided as supplementary information. Contact HEBBRING.SCOTT@marshfieldresearch.org. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xiayuan Huang
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
- Robert C Elston
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH 44106, USA
- Guilherme J Rosa
  - Department of Animal Science, University of Wisconsin-Madison, Madison, WI 53706, USA
- Zhan Ye
  - Biomedical Informatics Research Center
- Terrie Kitchner
  - Center for Human Genetics, Marshfield Clinic Research Institute, Marshfield, WI 54449, USA
- Murray H Brilliant
  - Center for Human Genetics, Marshfield Clinic Research Institute, Marshfield, WI 54449, USA
  - Department of Medical Genetics
- David Page
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
  - Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
- Scott J Hebbring
  - Center for Human Genetics, Marshfield Clinic Research Institute, Marshfield, WI 54449, USA
  - Department of Medical Genetics
|
15
|
Genomic and Phenomic Research in the 21st Century. Trends Genet 2018; 35:29-41. [PMID: 30342790 DOI: 10.1016/j.tig.2018.09.007] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 04/23/2018] [Revised: 09/24/2018] [Accepted: 09/25/2018] [Indexed: 02/06/2023]
Abstract
The field of human genomics has changed dramatically over time. Initial genomic studies were predominantly restricted to rare disorders in small families. Over the past decade, researchers changed course from family-based studies and instead focused on common diseases and traits in populations of unrelated individuals. With further advancements in biobanking, computer science, electronic health record (EHR) data, and more affordable high-throughput genomics, we are experiencing a new paradigm in human genomic research. Rapidly changing technologies and resources now make it possible to study thousands of diseases simultaneously at the genomic level. This review will focus on these advancements as scientists begin to incorporate phenome-wide strategies in human genomic research to understand the etiology of human diseases and develop new drugs to treat them.
|
16
|
Huang L, Zhang X, Tam POS, Chen H, Hao F, Pang CP, Wen F, Yang Z. Association of coding and UTR variants in the known regions with wet age-related macular degeneration in Han Chinese population. J Hum Genet 2018; 63:1055-1070. [PMID: 30026504 DOI: 10.1038/s10038-018-0490-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/14/2018] [Revised: 07/04/2018] [Accepted: 07/04/2018] [Indexed: 11/09/2022]
Abstract
Age-related macular degeneration (AMD) is the leading cause worldwide of severe visual impairment among people older than 55 years of age. This study aimed to investigate the genetic association between coding and untranslated region (UTR) variants in previously reported loci and exudative age-related macular degeneration (wet AMD) in a Han Chinese population. Using our previously published whole exome sequencing dataset of 349 wet AMD patients and 1253 controls, we searched for associations between coding and UTR variants of the 72 genes located within the 47 reported wet AMD loci regions. From these, 25 variants in 18 of the 72 genes with P < 10 × 10⁻³ were selected for the first replication of Sequenom mass-array genotyping in 885 wet AMD subjects and 562 controls. Next, four SNPs were selected for further validation by SNaPshot genotyping in a third Chinese cohort with 456 wet AMD subjects and 211 controls. As a result, we identified two new potential coding and UTR variant SNPs (rs189132250 in BBX located in 3q12.1 and rs144351944 in FILIP1L located in 3q12.1) that showed weak associations with wet AMD in the Han Chinese population. These findings provide new information regarding the coding and UTR variants of the known wet AMD loci in the studied Chinese cohort.
Affiliation(s)
- Lulin Huang
  - Sichuan Provincial Key Laboratory for Human Disease Gene Study and the Department of Laboratory Medicine, Sichuan Academy of Medical Sciences & Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Xiongze Zhang
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Pancy O S Tam
  - Department of Ophthalmology and Visual Sciences, Chinese University of Hong Kong, Hong Kong, China
- Haoyu Chen
  - Joint Shantou International Eye Center, Shantou University and Chinese University of Hong Kong, Shantou, China
- Fang Hao
  - Sichuan Provincial Key Laboratory for Human Disease Gene Study and the Department of Laboratory Medicine, Sichuan Academy of Medical Sciences & Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Chi-Pui Pang
  - Department of Ophthalmology and Visual Sciences, Chinese University of Hong Kong, Hong Kong, China
- Fen Wen
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Zhenglin Yang
  - Sichuan Provincial Key Laboratory for Human Disease Gene Study and the Department of Laboratory Medicine, Sichuan Academy of Medical Sciences & Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
  - Institute of Chengdu Biology, and Sichuan Translational Medicine Hospital, Chinese Academy of Sciences, Chengdu, Sichuan, China
  - Center of Information in Biomedicine, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
|
17
|
Verma A, Lucas A, Verma SS, Zhang Y, Josyula N, Khan A, Hartzel DN, Lavage DR, Leader J, Ritchie MD, Pendergrass SA. PheWAS and Beyond: The Landscape of Associations with Medical Diagnoses and Clinical Measures across 38,662 Individuals from Geisinger. Am J Hum Genet 2018; 102:592-608. [PMID: 29606303 PMCID: PMC5985339 DOI: 10.1016/j.ajhg.2018.02.017] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 10/30/2017] [Accepted: 02/20/2018] [Indexed: 01/23/2023]
Abstract
Most phenome-wide association studies (PheWASs) to date have used a small to moderate number of SNPs for association with phenotypic data. We performed a large-scale single-cohort PheWAS, using electronic health record (EHR)-derived case-control status for 541 diagnoses using International Classification of Disease version 9 (ICD-9) codes and 25 median clinical laboratory measures. We calculated associations between these diagnoses and traits with ∼630,000 common frequency SNPs with minor allele frequency > 0.01 for 38,662 individuals. In this landscape PheWAS, we explored results within diseases and traits, comparing results to those previously reported in genome-wide association studies (GWASs), as well as previously published PheWASs. We further leveraged the context of functional impact from protein-coding to regulatory regions, providing a deeper interpretation of these associations. The comprehensive nature of this PheWAS allows for novel hypothesis generation, the identification of phenotypes for further study for future phenotypic algorithm development, and identification of cross-phenotype associations.
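To give a sense of the multiple-testing burden implied by the numbers in the abstract above (541 diagnoses plus 25 laboratory measures, each tested against roughly 630,000 SNPs), the short sketch below counts the tests and computes a Bonferroni-adjusted threshold. The abstract does not state which correction the authors applied; Bonferroni at a family-wise alpha of 0.05 is an assumption made here purely for illustration, and the SNP count is the approximate figure quoted in the abstract.

```python
# Counts taken from the abstract; the SNP count is approximate (~630,000).
n_diagnoses = 541
n_lab_measures = 25
n_snps = 630_000

n_phenotypes = n_diagnoses + n_lab_measures
n_tests = n_phenotypes * n_snps

# Assumed correction: Bonferroni at a family-wise alpha of 0.05.
alpha = 0.05
bonferroni_threshold = alpha / n_tests

print(f"phenotypes tested: {n_phenotypes}")
print(f"SNP-phenotype tests: {n_tests:,}")
print(f"Bonferroni threshold: {bonferroni_threshold:.3e}")
```

Because many phenotypes and neighbouring SNPs are correlated, a Bonferroni threshold of this kind is conservative; it is shown only to convey the scale of the testing problem.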
Affiliation(s)
- Anurag Verma
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
- Anastasia Lucas
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Shefali S Verma
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
- Yu Zhang
  - Department of Statistics, The Pennsylvania State University, University Park, PA 16802, USA
- Navya Josyula
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA 17822, USA
- Anqa Khan
  - Mount Holyoke College, South Hadley, MA 01075, USA
- Dustin N Hartzel
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA 17822, USA
- Daniel R Lavage
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA 17822, USA
- Joseph Leader
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA 17822, USA
- Marylyn D Ritchie
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Sarah A Pendergrass
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA 17822, USA
|
18
|
Verma A, Bradford Y, Dudek S, Lucas AM, Verma SS, Pendergrass SA, Ritchie MD. A simulation study investigating power estimates in phenome-wide association studies. BMC Bioinformatics 2018; 19:120. [PMID: 29618318 PMCID: PMC5885318 DOI: 10.1186/s12859-018-2135-0] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Received: 03/16/2017] [Accepted: 03/26/2018] [Indexed: 01/01/2023]
Abstract
Background Phenome-wide association studies (PheWAS) are a high-throughput approach to evaluate comprehensive associations between genetic variants and a wide range of phenotypic measures. PheWAS has varying sample sizes for quantitative traits, and variable numbers of cases and controls for binary traits across the many phenotypes of interest, which can affect the statistical power to detect associations. The motivation of this study is to investigate the various parameters which affect the estimation of statistical power in PheWAS, including sample size, case-control ratio, minor allele frequency, and disease penetrance. Results We performed a PheWAS simulation study, where we investigated variations in statistical power based on different parameters, such as overall sample size, number of cases, case-control ratio, minor allele frequency, and disease penetrance. The simulation was performed on both binary and quantitative phenotypic measures. Our simulation on binary traits suggests that the number of cases has more impact on statistical power than the case to control ratio; also, we found that a sample size of 200 cases or more maintains the statistical power to identify associations for common variants. For quantitative traits, a sample size of 1000 or more individuals performed best in the power calculations. We focused on common genetic variants (MAF > 0.01) in this study; however, in future studies, we will be extending this effort to perform similar simulations on rare variants. Conclusions This study provides a series of PheWAS simulation analyses that can be used to estimate statistical power for some potential scenarios. These results can be used to provide guidelines for appropriate study design for future PheWAS analyses. Electronic supplementary material The online version of this article (10.1186/s12859-018-2135-0) contains supplementary material, which is available to authorized users.
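The kind of power simulation described in the abstract above can be sketched in a few lines. This is a minimal illustration and not the authors' simulation: it parameterizes the effect with an allelic odds ratio rather than disease penetrance, uses a simple two-proportion z-test on allele counts, and treats the minor allele frequency as the allele frequency in controls. The function names and all parameter values are hypothetical.

```python
import math
import random


def case_allele_freq(maf, odds_ratio):
    """Allele frequency in cases implied by a control MAF and an allelic odds ratio."""
    odds = maf / (1.0 - maf) * odds_ratio
    return odds / (1.0 + odds)


def allelic_p_value(case_alt, n_case_alleles, ctrl_alt, n_ctrl_alleles):
    """Two-sided p-value from a two-proportion z-test on allele counts."""
    p_case = case_alt / n_case_alleles
    p_ctrl = ctrl_alt / n_ctrl_alleles
    pooled = (case_alt + ctrl_alt) / (n_case_alleles + n_ctrl_alleles)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_case_alleles + 1.0 / n_ctrl_alleles))
    if se == 0.0:
        return 1.0  # no variation observed, so no evidence of association
    z = (p_case - p_ctrl) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def estimate_power(n_cases, n_controls, maf, odds_ratio,
                   alpha=0.05, n_sims=200, seed=0):
    """Fraction of simulated datasets in which the association is detected."""
    rng = random.Random(seed)
    p_case = case_allele_freq(maf, odds_ratio)
    hits = 0
    for _ in range(n_sims):
        # Each individual carries two alleles; draw each allele independently.
        case_alt = sum(rng.random() < p_case for _ in range(2 * n_cases))
        ctrl_alt = sum(rng.random() < maf for _ in range(2 * n_controls))
        p_value = allelic_p_value(case_alt, 2 * n_cases, ctrl_alt, 2 * n_controls)
        if p_value < alpha:
            hits += 1
    return hits / n_sims


power_effect = estimate_power(n_cases=500, n_controls=500, maf=0.3, odds_ratio=2.0)
power_null = estimate_power(n_cases=500, n_controls=500, maf=0.3, odds_ratio=1.0)
print(f"estimated power with OR = 2.0: {power_effect:.2f}")
print(f"rejection rate under the null (OR = 1.0): {power_null:.2f}")
```

Varying `n_cases` while holding the total sample size fixed is one way to explore the abstract's observation that the number of cases matters more than the case-control ratio. In a real PheWAS, `alpha` would be replaced by a threshold corrected for multiple testing.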
Affiliation(s)
- Anurag Verma
  - Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
  - The Huck Institutes of the Life Science, Pennsylvania State University, University Park, PA, USA
- Yuki Bradford
  - Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
- Scott Dudek
  - Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
- Anastasia M Lucas
  - Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
- Shefali S Verma
  - Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
  - The Huck Institutes of the Life Science, Pennsylvania State University, University Park, PA, USA
- Marylyn D Ritchie
  - Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
  - The Huck Institutes of the Life Science, Pennsylvania State University, University Park, PA, USA
|
19
|
Verma SS, Josyula N, Verma A, Zhang X, Veturi Y, Dewey FE, Hartzel DN, Lavage DR, Leader J, Ritchie MD, Pendergrass SA. Rare variants in drug target genes contributing to complex diseases, phenome-wide. Sci Rep 2018; 8:4624. [PMID: 29545597 PMCID: PMC5854600 DOI: 10.1038/s41598-018-22834-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 11/22/2017] [Accepted: 03/01/2018] [Indexed: 12/30/2022]
Abstract
The DrugBank database consists of ~800 genes that are well characterized drug targets. This list of genes is a useful resource for association testing. For example, loss of function (LOF) genetic variation has the potential to mimic the effect of drugs, and high impact variation in these genes can impact downstream traits. Identifying novel associations between genetic variation in these genes and a range of diseases can also uncover new uses for the drugs that target these genes. Phenome Wide Association Studies (PheWAS) have been successful in identifying genetic associations across hundreds of thousands of diseases. We have conducted a novel gene based PheWAS to test the effect of rare variants in DrugBank genes, evaluating associations between these genes and more than 500 quantitative and dichotomous phenotypes. We used whole exome sequencing data from 38,568 samples in Geisinger MyCode Community Health Initiative. We evaluated the results of this study when binning rare variants using various filters based on potential functional impact. We identified multiple novel associations, and the majority of the significant associations were driven by functionally annotated variation. Overall, this study provides a sweeping exploration of rare variant associations within functionally relevant genes across a wide range of diagnoses.
Affiliation(s)
- Shefali Setia Verma
  - Perelman School of Medicine, Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Navya Josyula
  - Biomedical and Translational Informatics Institute, Geisinger, Danville, PA, 17221, USA
- Anurag Verma
  - Perelman School of Medicine, Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Xinyuan Zhang
  - Perelman School of Medicine, Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Yogasudha Veturi
  - Perelman School of Medicine, Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Dustin N Hartzel
  - Phenomic Analytics and Clinical Data Core, Geisinger, Danville, PA, USA
- Daniel R Lavage
  - Phenomic Analytics and Clinical Data Core, Geisinger, Danville, PA, USA
- Joe Leader
  - Biomedical and Translational Informatics Institute, Geisinger, Danville, PA, 17221, USA
  - Phenomic Analytics and Clinical Data Core, Geisinger, Danville, PA, USA
- Marylyn D Ritchie
  - Perelman School of Medicine, Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Sarah A Pendergrass
  - Biomedical and Translational Informatics Institute, Geisinger, Danville, PA, 17221, USA
|
20
|
Abstract
PURPOSE OF REVIEW Over many decades, researchers have been designing studies to investigate the relationship between genotypes and phenotypes to gain an understanding of the effect of genetics on disease. Recently, a high-throughput approach called the phenome-wide association study (PheWAS) has been extensively used to identify associations between genetic variants and many diseases and traits simultaneously. In this review, we describe the value of PheWAS along with methodological issues and challenges in interpretation for current applications of PheWAS. RECENT FINDINGS PheWAS have uncovered a paradigm to identify new associations for genetic loci across many diseases. The application of PheWAS has been effective with phenotype data from electronic health records, epidemiological studies, and clinical trials. SUMMARY The key strength of a PheWAS is to identify the association of one or more genetic variants with multiple phenotypes, which can showcase interconnections among the phenotypes due to shared genetic associations. While the PheWAS approach appears promising, there are a number of challenges that need to be addressed to provide additional robustness to PheWAS findings.
Affiliation(s)
- Anurag Verma
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA
  - The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA
- Marylyn D Ritchie
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA
  - The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA
|
21
|
Liu J, Zhao R, Ye Z, Frey AJ, Schriver ER, Snyder NW, Hebbring SJ. Relationship of SULT1A1 copy number variation with estrogen metabolism and human health. J Steroid Biochem Mol Biol 2017; 174:169-175. [PMID: 28867356 PMCID: PMC5675753 DOI: 10.1016/j.jsbmb.2017.08.017] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 05/03/2017] [Revised: 07/28/2017] [Accepted: 08/30/2017] [Indexed: 11/30/2022]
Abstract
Human cytosolic sulfotransferase 1A1 (SULT1A1) is considered to be one of the most important SULT isoforms for metabolism, detoxification, and carcinogenesis. This theory is driven by observations that SULT1A1 is widely expressed in multiple tissues and acts on a wide range of phenolic substrates. SULT1A1 is subject to functional common copy number variation (CNV) including deletions or duplications. However, it is less clear how SULT1A1 CNV impacts health and disease. To better understand the biological role of SULT1A1 in human health, we genotyped CNV in 14,275 Marshfield Clinic patients linked to an extensive electronic health record. Since SULT1A1 is linked to steroid metabolism, select serum steroid hormones were measured in 100 individuals with a wide spectrum of SULT1A1 CNV genotypes. Furthermore, comprehensive phenome-wide association studies (PheWAS) were conducted using diagnostic codes and clinical text data. For the first time, individuals homozygous null for SULT1A1 were identified in a human population. Thirty-six percent of the population carried >2 copies of SULT1A1 whereas 4% had ≤1 copy. Results indicate SULT1A1 CNV was negatively correlated with estrone-sulfate to estrone ratio predominantly in males (E1S/E1; p=0.03, r=-0.21) and may be associated with increased risk for common allergies. The effect of SULT1A1 CNV on circulating estrogen metabolites was opposite to the predicted CNV-metabolite trend based on enzymatic function. This finding, and the potential association with common allergies reported herein, warrants future studies.
Affiliation(s)
- Jixia Liu
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI, USA
- Ran Zhao
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI, USA
- Zhan Ye
  - Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI, USA
- Alexander J Frey
  - A.J. Drexel Autism Institute, Drexel University, Philadelphia, PA, USA
- Emily R Schriver
  - A.J. Drexel Autism Institute, Drexel University, Philadelphia, PA, USA
  - Division of Infectious Diseases, Children's Hospital of Philadelphia, PA, USA
- Scott J Hebbring
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI, USA
|
22
|
Kim T, Havighurst T, Kim K, Hebbring SJ, Ye Z, Aylward J, Keles S, Xu YG, Spiegelman VS. RNA-Binding Protein IGF2BP1 in Cutaneous Squamous Cell Carcinoma. J Invest Dermatol 2016; 137:772-775. [PMID: 27856289 DOI: 10.1016/j.jid.2016.10.042] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 07/11/2016] [Revised: 10/17/2016] [Accepted: 10/25/2016] [Indexed: 11/24/2022]
Affiliation(s)
- TaeWon Kim
  - Department of Dermatology, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA; Molecular and Environmental Toxicology Center, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Thomas Havighurst
  - Department of Biostatistics and Medical Informatics, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- KyungMann Kim
  - Department of Biostatistics and Medical Informatics, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA; Carbone Cancer Center, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Scott J Hebbring
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, USA
- Zhan Ye
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, USA
- Juliet Aylward
  - Department of Dermatology, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Sunduz Keles
  - Department of Biostatistics and Medical Informatics, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Yaohui G Xu
  - Department of Dermatology, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA; Carbone Cancer Center, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Vladimir S Spiegelman
  - Department of Dermatology, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA; Department of Pediatrics, Pennsylvania State University, Hershey, Pennsylvania, USA
23
Denny JC, Bastarache L, Roden DM. Phenome-Wide Association Studies as a Tool to Advance Precision Medicine. Annu Rev Genomics Hum Genet 2016; 17:353-73. [PMID: 27147087 PMCID: PMC5480096 DOI: 10.1146/annurev-genom-090314-024956] [Citation(s) in RCA: 164] [Impact Index Per Article: 18.2] [Indexed: 12/18/2022]
Abstract
Beginning in the early 2000s, the accumulation of biospecimens linked to electronic health records (EHRs) made possible genome-phenome studies (i.e., comparative analyses of genetic variants and phenotypes) using only data collected as a by-product of typical health care. In addition to disease and trait genetics, EHRs proved a valuable resource for analyzing pharmacogenetic traits and developing reverse genetics approaches such as phenome-wide association studies (PheWASs). PheWASs are designed to survey which of many phenotypes may be associated with a given genetic variant. PheWAS methods have been validated through replication of hundreds of known genotype-phenotype associations, and their use has differentiated between true pleiotropy and clinical comorbidity, added context to genetic discoveries, and helped define disease subtypes, and may also help repurpose medications. PheWAS methods have also proven to be useful with research-collected data. Future efforts that integrate broad, robust collection of phenotype data (e.g., EHR data) with purpose-collected research data in combination with a greater understanding of EHR data will create a rich resource for increasingly more efficient and detailed genome-phenome analysis to usher in new discoveries in precision medicine.
Affiliation(s)
- Joshua C Denny
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee 37203
  - Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee 37232
- Lisa Bastarache
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee 37203
- Dan M Roden
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee 37203
  - Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee 37232
  - Department of Pharmacology, Vanderbilt University School of Medicine, Nashville, Tennessee 37232
24
eMERGE Phenome-Wide Association Study (PheWAS) identifies clinical associations and pleiotropy for stop-gain variants. BMC Med Genomics 2016; 9 Suppl 1:32. [PMID: 27535653 PMCID: PMC4989894 DOI: 10.1186/s12920-016-0191-8] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 12/21/2022]
Abstract
BACKGROUND We explored premature stop-gain variants to test the hypothesis that variants, which are likely to have a consequence on protein structure and function, will reveal important insights with respect to the phenotypes associated with them. We performed a phenome-wide association study (PheWAS) exploring the association between a selected list of functional stop-gain genetic variants (variation resulting in truncated proteins or in nonsense-mediated decay) and an extensive group of diagnoses to identify novel associations and uncover potential pleiotropy. RESULTS In this study, we selected 25 stop-gain variants: 5 stop-gain variants with previously reported phenotypic associations, and a set of 20 putative stop-gain variants identified using dbSNP. For the PheWAS, we used data from the electronic MEdical Records and GEnomics (eMERGE) Network across 9 sites with a total of 41,057 unrelated patients. We divided all these samples into two datasets by equal proportion of eMERGE site, sex, race, and genotyping platform. We calculated single effect associations between these 25 stop-gain variants and ICD-9 defined case-control diagnoses. We also performed stratified analyses for samples of European and African ancestry. Associations were adjusted for sex, site, genotyping platform and the first three principal components to account for global ancestry. We identified previously known associations, such as variants in LPL associated with hyperglyceridemia indicating that our approach was robust. We also found a total of three significant associations with p < 0.01 in both datasets, with the most significant replicating result being LPL SNP rs328 and ICD-9 code 272.1 "Disorder of Lipoid metabolism" (p_discovery = 2.59 × 10⁻⁶, p_replicating = 2.7 × 10⁻⁴). The other two significant replicated associations identified by this study are: variant rs1137617 in KCNH2 gene associated with ICD-9 code category 244 "Acquired Hypothyroidism" (p_discovery = 5.31 × 10⁻³, p_replicating = 1.15 × 10⁻³) and variant rs12060879 in DPT gene associated with ICD-9 code category 996 "Complications peculiar to certain specified procedures" (p_discovery = 8.65 × 10⁻³, p_replicating = 4.16 × 10⁻³). CONCLUSION In conclusion, this PheWAS revealed novel associations of stop-gained variants with interesting phenotypes (ICD-9 codes) along with pleiotropic effects.
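The replication rule stated in this abstract (an association counts only if p < 0.01 in both halves of the split cohort) can be illustrated with a minimal Python sketch. The function name `replicated_associations` and every entry marked hypothetical are assumptions made for illustration; only the rs328 p-values are taken from the abstract, and this is not the authors' pipeline.

```python
def replicated_associations(discovery, replication, alpha=0.01):
    """Return (variant, phenotype) pairs with p < alpha in both datasets."""
    return sorted(
        pair
        for pair, p_disc in discovery.items()
        if p_disc < alpha and replication.get(pair, 1.0) < alpha
    )


# p-values keyed by (variant, ICD-9 code).
discovery = {
    ("rs328", "272.1"): 2.59e-6,        # reported in the abstract
    ("rs_example_a", "401.1"): 4.0e-3,  # hypothetical
    ("rs_example_b", "250.0"): 0.2,     # hypothetical
}
replication = {
    ("rs328", "272.1"): 2.7e-4,         # reported in the abstract
    ("rs_example_a", "401.1"): 0.35,    # hypothetical
    ("rs_example_b", "250.0"): 1.0e-3,  # hypothetical
}

print(replicated_associations(discovery, replication))
# [('rs328', '272.1')]
```

Requiring significance in both halves trades power for a lower false-positive rate; a pair missing from the replication set is treated here as not replicated.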
25
Liu J, Ye Z, Mayer JG, Hoch BA, Green C, Rolak L, Cold C, Khor SS, Zheng X, Miyagawa T, Tokunaga K, Brilliant MH, Hebbring SJ. Phenome-wide association study maps new diseases to the human major histocompatibility complex region. J Med Genet 2016; 53:681-9. [PMID: 27287392 DOI: 10.1136/jmedgenet-2016-103867] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 02/25/2016] [Accepted: 05/19/2016] [Indexed: 12/19/2022]
Abstract
BACKGROUND Over 160 disease phenotypes have been mapped to the major histocompatibility complex (MHC) region on chromosome 6 by genome-wide association study (GWAS), suggesting that the MHC region as a whole may be involved in the aetiology of many phenotypes, including unstudied diseases. The phenome-wide association study (PheWAS), a powerful and complementary approach to GWAS, has demonstrated its ability to discover and rediscover genetic associations. The objective of this study is to comprehensively investigate the MHC region by PheWAS to identify new phenotypes mapped to this genetically important region. METHODS In the current study, we systematically explored the MHC region using PheWAS to associate 2692 MHC-linked variants (minor allele frequency ≥0.01) with 6221 phenotypes in a cohort of 7481 subjects from the Marshfield Clinic Personalized Medicine Research Project. RESULTS Findings showed that expected associations previously identified by GWAS could be identified by PheWAS (eg, psoriasis, ankylosing spondylitis, type I diabetes and coeliac disease) with some having strong cross-phenotype associations potentially driven by pleiotropic effects. Importantly, novel associations with eight diseases not previously assessed by GWAS (eg, lichen planus) were also identified and replicated in an independent population. Many of these associated diseases appear to be immune-related disorders. Further assessment of these diseases in 16 484 Marshfield Clinic twins suggests that some of these diseases, including lichen planus, may have genetic aetiologies. CONCLUSIONS These results demonstrate that the PheWAS approach is a powerful and novel method to discover SNP-disease associations, and is ideal when characterising cross-phenotype associations, and further emphasise the importance of the MHC region in human health and disease.
Affiliation(s)
- Jixia Liu
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, USA
- Zhan Ye
  - Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, USA
- John G Mayer
  - Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, USA
- Brian A Hoch
  - Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, USA
- Clayton Green
  - Department of Dermatology, Marshfield Clinic, Marshfield, Wisconsin, USA
- Loren Rolak
  - Department of Neurology, Marshfield Clinic, Marshfield, Wisconsin, USA
- Christopher Cold
  - Department of Pathology, Marshfield Clinic, Marshfield, Wisconsin, USA
- Seik-Soon Khor
  - Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Xiuwen Zheng
  - Department of Biostatistics, University of Washington, Seattle, Washington, USA
- Taku Miyagawa
  - Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan; Sleep Disorders Project, Department of Psychiatry and Behavioral Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- Katsushi Tokunaga
  - Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Murray H Brilliant
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, USA
- Scott J Hebbring
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, USA
26
Biological findings from the PheWAS catalog: focus on connective tissue-related disorders (pelvic floor dysfunction, abdominal hernia, varicose veins and hemorrhoids). Hum Genet 2016; 135:779-95. [DOI: 10.1007/s00439-016-1672-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 01/11/2016] [Accepted: 04/17/2016] [Indexed: 01/31/2023]
27
Mosley JD, Witte JS, Larkin EK, Bastarache L, Shaffer CM, Karnes JH, Stein CM, Phillips E, Hebbring SJ, Brilliant MH, Mayer J, Ye Z, Roden DM, Denny JC. Identifying genetically driven clinical phenotypes using linear mixed models. Nat Commun 2016; 7:11433. [PMID: 27109359 PMCID: PMC4848547 DOI: 10.1038/ncomms11433] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 10/13/2015] [Accepted: 03/24/2016] [Indexed: 01/06/2023]
Abstract
We hypothesized that generalized linear mixed models (GLMMs), which estimate the additive genetic variance underlying phenotype variability, would facilitate rapid characterization of clinical phenotypes from an electronic health record. We evaluated 1,288 phenotypes in 29,349 subjects of European ancestry with single-nucleotide polymorphism (SNP) genotyping on the Illumina Exome Beadchip. We show that genetic liability estimates are primarily driven by SNPs identified by prior genome-wide association studies and SNPs within the human leukocyte antigen (HLA) region. We identify 44 (false discovery rate q<0.05) phenotypes associated with HLA SNP variation and show that hypothyroidism is genetically correlated with Type I diabetes (rG=0.31, s.e. 0.12, P=0.003). We also report novel SNP associations for hypothyroidism near HLA-DQA1/HLA-DQB1 at rs6906021 (combined odds ratio (OR)=1.2 (95% confidence interval (CI): 1.1-1.2), P=9.8 × 10⁻¹¹) and for polymyalgia rheumatica near C6orf10 at rs6910071 (OR=1.5 (95% CI: 1.3-1.6), P=1.3 × 10⁻¹⁰). Phenome-wide application of GLMMs identifies phenotypes with important genetic drivers, and focusing on these phenotypes can identify novel genetic associations.
Affiliation(s)
- Jonathan D. Mosley
  - Department of Medicine, Vanderbilt University, Nashville, Tennessee 37232, USA
- John S. Witte
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, California 94158, USA
- Emma K. Larkin
  - Department of Medicine, Vanderbilt University, Nashville, Tennessee 37232, USA
- Lisa Bastarache
  - Biomedical Informatics, Vanderbilt University, Nashville, Tennessee 37203, USA
- Jason H. Karnes
  - Department of Medicine, Vanderbilt University, Nashville, Tennessee 37232, USA
- C. Michael Stein
  - Department of Medicine, Vanderbilt University, Nashville, Tennessee 37232, USA
- Elizabeth Phillips
  - Department of Medicine, Vanderbilt University, Nashville, Tennessee 37232, USA
- Scott J. Hebbring
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wisconsin 54449, USA
- Murray H. Brilliant
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wisconsin 54449, USA
- John Mayer
  - Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, Wisconsin 54449, USA
- Zhan Ye
  - Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, Wisconsin 54449, USA
- Dan M. Roden
  - Department of Medicine, Vanderbilt University, Nashville, Tennessee 37232, USA
- Joshua C. Denny
  - Department of Medicine, Vanderbilt University, Nashville, Tennessee 37232, USA
  - Biomedical Informatics, Vanderbilt University, Nashville, Tennessee 37203, USA
28
Heatherly R, Rasmussen LV, Peissig PL, Pacheco JA, Harris P, Denny JC, Malin BA. A multi-institution evaluation of clinical profile anonymization. J Am Med Inform Assoc 2016; 23:e131-7. [PMID: 26567325 PMCID: PMC4954623 DOI: 10.1093/jamia/ocv154] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 01/29/2015] [Revised: 08/17/2015] [Accepted: 09/09/2015] [Indexed: 11/13/2022]
Abstract
BACKGROUND AND OBJECTIVE There is an increasing desire to share de-identified electronic health records (EHRs) for secondary uses, but there are concerns that clinical terms can be exploited to compromise patient identities. Anonymization algorithms mitigate such threats while enabling novel discoveries, but their evaluation has been limited to single institutions. Here, we study how an existing clinical profile anonymization fares at multiple medical centers. METHODS We apply a state-of-the-art k-anonymization algorithm, with k set to the standard value 5, to the International Classification of Disease, ninth edition codes for patients in a hypothyroidism association study at three medical centers: Marshfield Clinic, Northwestern University, and Vanderbilt University. We assess utility when anonymizing at three population levels: all patients in 1) the EHR system; 2) the biorepository; and 3) a hypothyroidism study. We evaluate utility using 1) changes to the number included in the dataset, 2) number of codes included, and 3) regions where generalization and suppression were required. RESULTS Our findings yield several notable results. First, we show that anonymizing in the context of the entire EHR yields a significantly greater quantity of data by reducing the amount of generalized regions from ∼15% to ∼0.5%. Second, ∼70% of codes that needed generalization only generalized two or three codes in the largest anonymization. CONCLUSIONS Sharing large volumes of clinical data in support of phenome-wide association studies is possible while safeguarding privacy to the underlying individuals.
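The k = 5 setting mentioned in the methods can be made concrete with a small Python sketch of the simplest k-anonymity condition: every distinct set of diagnosis codes must be shared by at least k patients. This is an illustrative check only, not the anonymization algorithm evaluated in the study (which also generalizes and suppresses codes); the function names and the toy records are hypothetical.

```python
from collections import Counter


def rare_profiles(profiles, k=5):
    """Return the code sets shared by fewer than k patients."""
    counts = Counter(frozenset(codes) for codes in profiles.values())
    return {codes for codes, n in counts.items() if n < k}


def is_k_anonymous(profiles, k=5):
    """True when every distinct code set is shared by at least k patients."""
    return not rare_profiles(profiles, k)


# Hypothetical toy data: patient identifier -> ICD-9 codes on record.
toy_profiles = {f"patient_{i}": ["244.9", "272.1"] for i in range(5)}
toy_profiles["patient_5"] = ["244.9", "696.1"]  # profile held by one patient only

print(is_k_anonymous(toy_profiles))  # False
print(sorted(sorted(codes) for codes in rare_profiles(toy_profiles)))
```

A real anonymizer would respond to the rare profile by generalizing or suppressing codes until the check passes, which is where the utility loss measured in the study comes from.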
Affiliation(s)
- Raymond Heatherly
  - Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
- Luke V Rasmussen
  - Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Peggy L Peissig
  - Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI, USA
- Paul Harris
  - Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA; Department of Biomedical Engineering, Vanderbilt University, Nashville, TN, USA
- Joshua C Denny
  - Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA; Department of Medicine, Vanderbilt University, Nashville, TN, USA
- Bradley A Malin
  - Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA; Department of Electrical Engineering & Computer Science, Vanderbilt University, Nashville, TN, USA
29
Unravelling the human genome-phenome relationship using phenome-wide association studies. Nat Rev Genet 2016; 17:129-45. [PMID: 26875678 DOI: 10.1038/nrg.2015.36] [Citation(s) in RCA: 182] [Impact Index Per Article: 20.2] [Indexed: 12/24/2022]
Abstract
Advances in genotyping technology have, over the past decade, enabled the focused search for common genetic variation associated with human diseases and traits. With the recently increased availability of detailed phenotypic data from electronic health records and epidemiological studies, the impact of one or more genetic variants on the phenome is starting to be characterized both in clinical and population-based settings using phenome-wide association studies (PheWAS). These studies reveal a number of challenges that will need to be overcome to unlock the full potential of PheWAS for the characterization of the complex human genome-phenome relationship.
30
Abstract
The combination of next-generation sequencing technologies and high-throughput genotyping platforms has revolutionized the pursuit of genetic variants that contribute towards disease. Furthermore, these technologies have provided invaluable insight into the genetic factors that prevent individuals from developing disease. Exploiting the evolutionary mechanisms that were designed by nature to help prevent disease is an attractive line of enquiry. Such efforts have the potential to generate a therapeutic target roadmap and rejuvenate the current drug-discovery pathway. By delineating the genomic factors that are protective against disease, there is potential to derive highly effective, genomically anchored medicines that assist in maintaining health.
31
Opportunities for drug repositioning from phenome-wide association studies. Nat Biotechnol 2015; 33:342-5. [PMID: 25850054 DOI: 10.1038/nbt.3183] [Citation(s) in RCA: 91] [Impact Index Per Article: 9.1] [Indexed: 11/09/2022]
32
Tyler AL, Crawford DC, Pendergrass SA. The detection and characterization of pleiotropy: discovery, progress, and promise. Brief Bioinform 2015. [PMID: 26223525 DOI: 10.1093/bib/bbv050] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 01/10/2023]
Abstract
The impact of a single genetic locus on multiple phenotypes, or pleiotropy, is an important area of research. Biological systems are dynamic complex networks, and these networks exist within and between cells. In humans, the consideration of multiple phenotypes such as physiological traits, clinical outcomes and drug response, in the context of genetic variation, can provide ways of developing a more complete understanding of the complex relationships between genetic architecture and how biological systems function in health and disease. In this article, we describe recent studies exploring the relationships between genetic loci and more than one phenotype. We also cover methodological developments incorporating pleiotropy applied to model organisms as well as humans, and discuss how stepping beyond the analysis of a single phenotype leads to a deeper understanding of complex genetic architecture.
33
Aggarwal S, Gheware A, Agrawal A, Ghosh S, Prasher B, Mukerji M. Combined genetic effects of EGLN1 and VWF modulate thrombotic outcome in hypoxia revealed by Ayurgenomics approach. J Transl Med 2015; 13:184. [PMID: 26047609 PMCID: PMC4457985 DOI: 10.1186/s12967-015-0542-9] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 05/15/2015] [Accepted: 05/18/2015] [Indexed: 02/04/2023]
Abstract
BACKGROUND Extreme constitution "Prakriti" types of Ayurveda exhibit systemic physiological attributes. Our earlier genetic study has revealed differences in EGLN1, key modulator of hypoxia axis between Prakriti types. This was associated with differences in high altitude adaptation and susceptibility to high altitude pulmonary edema (HAPE). In this study we investigate other molecular differences that contribute to systemic attributes of Prakriti that would be relevant in predictive marker discovery. METHODS Genotyping of 96 individuals of the earlier cohort was carried out in a panel of 2,800 common genic SNPs represented in Indian Genomic Variation Consortium (IGVC) panel from 24 diverse populations. Frequency distribution patterns of Prakriti differentiating variations (FDR correction P < 0.05) were studied in IGVC and 55 global populations (HGDP-CEPH) panels. Genotypic interactions between VWF, identified from the present analysis, and EGLN1 were analyzed using multinomial logistic regression in Prakriti and Indian populations from contrasting altitudes. Spearman's Rank correlation was used to study this genotypic interaction with respect to altitude in HGDP-CEPH panel. Validation of functional link between EGLN1 and VWF was carried out in a mouse model using chemical inhibition and siRNA studies. RESULT Significant differences in allele frequencies were observed in seven genes (SPTA1, VWF, OLR1, UCP2, OR6K3, LEPR, and OR10Z1) after FDR correction (P < 0.05). A non synonymous variation (C/T, rs1063856) associated with thrombosis/bleeding susceptibility respectively, differed significantly between Kapha (C-allele) and Pitta (T-allele) constitution types. A combination of derived EGLN1 allele (HAPE associated) and ancestral VWF allele (thrombosis associated) was significantly high in Kapha group compared to Pitta (p < 10⁻⁵). The combination of risk-associated Kapha alleles was nearly absent in natives of high altitude. Inhibition of EGLN1 using (DHB) and an EGLN1 specific siRNA in a mouse model led to a marked increase in vWF levels as well as pro-thrombotic phenotype viz. reduced bleeding time and enhanced platelet count and activation. CONCLUSION We demonstrate for the first time a genetic link between EGLN1 and VWF in a constitution specific manner which could modulate thrombosis/bleeding susceptibility and outcomes of hypoxia. Integration of Prakriti in population stratification may help assemble common variations in key physiological axes that confer differences in disease occurrence and patho-phenotypic outcomes.
Affiliation(s)
- Shilpi Aggarwal
  - Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, India
- Atish Gheware
  - CSIR's Ayurgenomics Unit-TRISUTRA (Translational Research and Innovative Science ThRough Ayurgenomics), CSIR-Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, 110 020, India; Academy of Scientific and Innovative Research (AcSIR), New Delhi, India
- Anurag Agrawal
  - Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, India; Academy of Scientific and Innovative Research (AcSIR), New Delhi, India
- Bhavana Prasher
  - CSIR's Ayurgenomics Unit-TRISUTRA (Translational Research and Innovative Science ThRough Ayurgenomics), CSIR-Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, 110 020, India; Academy of Scientific and Innovative Research (AcSIR), New Delhi, India
- Mitali Mukerji
  - Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, India; CSIR's Ayurgenomics Unit-TRISUTRA (Translational Research and Innovative Science ThRough Ayurgenomics), CSIR-Institute of Genomics and Integrative Biology, Sukhdev Vihar, Mathura Road, New Delhi, 110 020, India; Academy of Scientific and Innovative Research (AcSIR), New Delhi, India
34
Pendergrass SA, Ritchie MD. Phenome-Wide Association Studies: Leveraging Comprehensive Phenotypic and Genotypic Data for Discovery. Curr Genet Med Rep 2015; 3:92-100. [PMID: 26146598 PMCID: PMC4489156 DOI: 10.1007/s40142-015-0067-9] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Indexed: 12/31/2022]
Abstract
With the large volume of clinical and epidemiological data being collected, increasingly linked to extensive genotypic data, coupled with expanding high-performance computational resources, there are considerable opportunities for comprehensively exploring the networks of connections that exist between the phenome and the genome. These networks can be identified through Phenome-Wide Association Studies (PheWAS) where the association between a collection of genetic variants, or in some cases a particular clinical lab variable, and a wide and diverse range of phenotypes, diagnoses, traits, and/or outcomes are evaluated. This is a departure from the more familiar genome-wide association study (GWAS) approach, which has been used to identify single nucleotide polymorphisms (SNPs) associated with one outcome or a very limited phenotypic domain. In addition to highlighting novel connections between multiple phenotypes and elucidating more of the phenotype-genotype landscape, PheWAS can generate new hypotheses for further exploration, and can also be used to narrow the search space for research using comprehensive data collections. The complex results of PheWAS also have the potential for uncovering new mechanistic insights. We review here how the PheWAS approach has been used with data from epidemiological studies, clinical trials, and de-identified electronic health record data. We also review methodologies for the analyses underlying PheWAS, and emerging methods developed for evaluating the comprehensive results of PheWAS including genotype-phenotype networks. This review also highlights PheWAS as an important tool for identifying new biomarkers, elucidating the genetic architecture of complex traits, and uncovering pleiotropy. There are many directions and new methodologies for the future of PheWAS analyses, from the phenotypic data to the genetic data, and herein we also discuss some of these important future PheWAS developments.
35
Fangjiomics: revealing adaptive omics pharmacological mechanisms of the myriad combination therapies to achieve personalized medicine. Acta Pharmacol Sin 2015; 36:651-3. [PMID: 26036241 DOI: 10.1038/aps.2015.33] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 12/19/2022]
36
Extracting research-quality phenotypes from electronic health records to support precision medicine. Genome Med 2015; 7:41. [PMID: 25937834 PMCID: PMC4416392 DOI: 10.1186/s13073-015-0166-y] [Citation(s) in RCA: 158] [Impact Index Per Article: 15.8] [Indexed: 12/16/2022]
Abstract
The convergence of two rapidly developing technologies - high-throughput genotyping and electronic health records (EHRs) - gives scientists an unprecedented opportunity to utilize routine healthcare data to accelerate genomic discovery. Institutions and healthcare systems have been building EHR-linked DNA biobanks to enable such a vision. However, the precise extraction of detailed disease and drug-response phenotype information hidden in EHRs is not an easy task. EHR-based studies have successfully replicated known associations, made new discoveries for diseases and drug response traits, rapidly contributed cases and controls to large meta-analyses, and demonstrated the potential of EHRs for broad-based phenome-wide association studies. In this review, we summarize the advantages and challenges of repurposing EHR data for genetic research. We also highlight recent notable studies and novel approaches to provide an overview of advanced EHR-based phenotyping.
37
Hebbring SJ, Rastegar-Mojarad M, Ye Z, Mayer J, Jacobson C, Lin S. Application of clinical text data for phenome-wide association studies (PheWASs). Bioinformatics 2015; 31:1981-7. [PMID: 25657332 DOI: 10.1093/bioinformatics/btv076] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 06/10/2014] [Accepted: 02/02/2015] [Indexed: 01/01/2023]
Abstract
MOTIVATION Genome-wide association studies (GWASs) are effective for describing genetic complexities of common diseases. Phenome-wide association studies (PheWASs) offer an alternative and complementary approach to GWAS using data embedded in the electronic health record (EHR) to define the phenome. International Classification of Disease version 9 (ICD9) codes are used frequently to define the phenome, but using ICD9 codes alone misses other clinically relevant information from the EHR that can be used for PheWAS analyses and discovery. RESULTS As an alternative to ICD9 coding, a text-based phenome was defined by 23 384 clinically relevant terms extracted from Marshfield Clinic's EHR. Five single nucleotide polymorphisms (SNPs) with known phenotypic associations were genotyped in 4235 individuals and associated across the text-based phenome. All five SNPs genotyped were associated with expected terms (P<0.02), most at or near the top of their respective PheWAS ranking. Raw association results indicate that text data performed equivalently to ICD9 coding and demonstrate the utility of information beyond ICD9 coding for application in PheWAS.
Affiliation(s)
- Scott J Hebbring
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA
- Majid Rastegar-Mojarad
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA
- Zhan Ye
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA
- John Mayer
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA
- Crystal Jacobson
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA
- Simon Lin
  - Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA and Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI 54449, USA