1
Susmitha P, Kumar P, Yadav P, Sahoo S, Kaur G, Pandey MK, Singh V, Tseng TM, Gangurde SS. Genome-wide association study as a powerful tool for dissecting competitive traits in legumes. Frontiers in Plant Science 2023; 14:1123631. [PMID: 37645459 PMCID: PMC10461012 DOI: 10.3389/fpls.2023.1123631] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 12/14/2022] [Accepted: 06/08/2023] [Indexed: 08/31/2023]
Abstract
Legumes are extremely valuable because of their high protein content and several other nutritional components. The major challenge lies in maintaining the quantity and quality of protein and other nutritional compounds under changing climate conditions. The global need for plant-based proteins has increased the demand for seeds with a high protein content that includes essential amino acids. Genome-wide association studies (GWAS) have evolved as a standard approach in agricultural genetics for examining such complex traits. Recent developments in machine learning methods show promising applications for dimensionality reduction, which is a major challenge in GWAS. With advances in biotechnology, sequencing, and bioinformatics tools, estimation of linkage disequilibrium (LD) based associations between a genome-wide collection of single-nucleotide polymorphisms (SNPs) and desired phenotypic traits has become accessible. The markers from GWAS could be utilized for genomic selection (GS) to predict superior lines by calculating genomic estimated breeding values (GEBVs). For prediction accuracy, an assortment of statistical models could be utilized, such as ridge regression best linear unbiased prediction (rrBLUP), genomic best linear unbiased predictor (gBLUP), Bayesian, and random forest (RF). Both naturally diverse germplasm panels and family-based breeding populations can be used for association mapping, depending on the breeding system (inbred or outbred) of the plant species. MAGIC, MCILs, RIAILs, NAM, and ROAM are being used for association mapping in several crops. Several modifications of NAM, such as doubled haploid NAM (DH-NAM), backcross NAM (BC-NAM), and advanced backcross NAM (AB-NAM), have also been used in crops like rice, wheat, maize, barley, and mustard. For reliable marker-trait associations (MTAs), phenotyping accuracy is as important as genotyping accuracy.
High-throughput genotyping, phenomics, and computational techniques have advanced during the past few years, making it possible to explore such enormous datasets. Each population has unique virtues and flaws at the genomics and phenomics levels, which are covered in more detail in this review. The review also covers utilizing elite breeding lines as association mapping populations, optimizing the choice of GWAS selection, population size, hurdles in phenotyping, and statistical methods for analyzing competitive traits in legume breeding.
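The abstract mentions predicting genomic estimated breeding values (GEBVs) with ridge regression models such as rrBLUP. As a rough illustration only, the sketch below fits ridge-penalized marker effects and sums them into GEBVs. It assumes markers coded -1/0/1 and a fixed shrinkage parameter `lam`; a real rrBLUP analysis would estimate the shrinkage from variance components, which this sketch does not do. The function names `solve`, `ridge_marker_effects`, and `gebv` are hypothetical and do not come from the cited paper.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_marker_effects(Z, y, lam):
    """Return (phenotype mean, marker effects) from (Z'Z + lam*I) u = Z'(y - mean).

    Z   : list of rows, one per line, markers coded -1/0/1 (assumed coding)
    y   : phenotypes of the training lines
    lam : fixed ridge penalty (assumed known here, not estimated)
    """
    n, p = len(Z), len(Z[0])
    mu = sum(y) / n
    yc = [v - mu for v in y]
    ztz = [[sum(Z[k][i] * Z[k][j] for k in range(n)) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    zty = [sum(Z[k][i] * yc[k] for k in range(n)) for i in range(p)]
    return mu, solve(ztz, zty)


def gebv(Z_new, mu, effects):
    """GEBV of each candidate line: mean plus the sum of its marker effects."""
    return [mu + sum(z * u for z, u in zip(row, effects)) for row in Z_new]


# Toy data: four training lines, two markers.
Z_train = [[1, 0], [-1, 0], [0, 1], [0, -1]]
y_train = [3.0, 1.0, 2.5, 1.5]
mu, effects = ridge_marker_effects(Z_train, y_train, lam=2.0)
print(gebv([[1, 1], [-1, 0]], mu, effects))
```

Candidate lines would then be ranked by GEBV. The penalty shrinks all marker effects toward zero by the same amount, which is the defining assumption of ridge-type models.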
Affiliation(s)
- Pusarla Susmitha
- Regional Agricultural Research Station, Acharya N.G. Ranga Agricultural University, Andhra Pradesh, India
- Pawan Kumar
- Department of Genetics and Plant Breeding, College of Agriculture, Chaudhary Charan Singh (CCS) Haryana Agricultural University, Hisar, India
- Pankaj Yadav
- Department of Bioscience and Bioengineering, Indian Institute of Technology, Rajasthan, India
- Smrutishree Sahoo
- Department of Genetics and Plant Breeding, School of Agriculture, Gandhi Institute of Engineering and Technology (GIET) University, Odisha, India
- Gurleen Kaur
- Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Manish K. Pandey
- Department of Genomics, Prebreeding and Bioinformatics, International Crops Research Institute for the Semi-Arid Tropics, Hyderabad, India
- Varsha Singh
- Department of Plant and Soil Sciences, Mississippi State University, Starkville, MS, United States
- Te Ming Tseng
- Department of Plant and Soil Sciences, Mississippi State University, Starkville, MS, United States
- Sunil S. Gangurde
- Department of Plant Pathology, University of Georgia, Tifton, GA, United States
2
Pagadala M, Sears TJ, Wu VH, Pérez-Guijarro E, Kim H, Castro A, Talwar JV, Gonzalez-Colin C, Cao S, Schmiedel BJ, Goudarzi S, Kirani D, Au J, Zhang T, Landi T, Salem RM, Morris GP, Harismendy O, Patel SP, Alexandrov LB, Mesirov JP, Zanetti M, Day CP, Fan CC, Thompson WK, Merlino G, Gutkind JS, Vijayanand P, Carter H. Germline modifiers of the tumor immune microenvironment implicate drivers of cancer risk and immunotherapy response. Nat Commun 2023; 14:2744. [PMID: 37173324 PMCID: PMC10182072 DOI: 10.1038/s41467-023-38271-5] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 05/12/2022] [Accepted: 04/24/2023] [Indexed: 05/15/2023]
Abstract
With the continued promise of immunotherapy for treating cancer, understanding how host genetics contributes to the tumor immune microenvironment (TIME) is essential to tailoring cancer screening and treatment strategies. Here, we study 1084 eQTLs affecting the TIME found through analysis of The Cancer Genome Atlas and literature curation. These TIME eQTLs are enriched in areas of active transcription, and associate with gene expression in specific immune cell subsets, such as macrophages and dendritic cells. Polygenic score models built with TIME eQTLs reproducibly stratify cancer risk, survival and immune checkpoint blockade (ICB) response across independent cohorts. To assess whether an eQTL-informed approach could reveal potential cancer immunotherapy targets, we inhibit CTSS, a gene implicated by cancer risk and ICB response-associated polygenic models; CTSS inhibition results in slowed tumor growth and extended survival in vivo. These results validate the potential of integrating germline variation and TIME characteristics for uncovering potential targets for immunotherapy.
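The abstract refers to polygenic score models built from TIME eQTLs. A polygenic score is, in its simplest form, a weighted sum of allele dosages. The sketch below shows that general idea only; the variant identifiers, weights, and the function name `polygenic_score` are hypothetical placeholders and are not taken from the cited study, whose actual model construction is not described in this abstract.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages for one individual.

    dosages : dict mapping variant id -> effect-allele count (0, 1 or 2)
    weights : dict mapping variant id -> per-allele effect size
    Variants without a genotype for this individual are skipped.
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Hypothetical weights and genotypes, for illustration only.
weights = {"variant_a": 0.20, "variant_b": -0.10, "variant_c": 0.50}
cohort = {
    "person_1": {"variant_a": 2, "variant_b": 1},
    "person_2": {"variant_a": 0, "variant_b": 2, "variant_c": 1},
}

scores = {pid: polygenic_score(g, weights) for pid, g in cohort.items()}
# Rank individuals from highest to lowest score, e.g. to form risk strata.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Skipping missing variants keeps the sketch short; in practice missing genotypes are usually imputed so that scores remain comparable across individuals.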
Affiliation(s)
- Meghana Pagadala
- Biomedical Sciences Program, University of California San Diego, La Jolla, CA, 92093, USA
- Timothy J Sears
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, 92093, USA
- Victoria H Wu
- Department of Pharmacology, UCSD Moores Cancer Center, La Jolla, CA, 92093, USA
- Eva Pérez-Guijarro
- Laboratory of Cancer Biology and Genetics, National Cancer Institute, National Institutes of Health (NIH), Bethesda, MD, 20892, USA
- Hyo Kim
- Undergraduate Bioengineering Program, Jacobs School of Engineering, University of California San Diego, La Jolla, CA, 92093, USA
- Andrea Castro
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, 92093, USA
- James V Talwar
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, 92093, USA
- Steven Cao
- Division of Epidemiology, Herbert Wertheim School of Public Health and Human Longevity Science, University of California San Diego, La Jolla, CA, 92093, USA
- Divya Kirani
- Undergraduate Biology and Bioinformatics Program, University of California San Diego, La Jolla, CA, 92093, USA
- Jessica Au
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, 92093, USA
- Tongwu Zhang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health (NIH), Bethesda, MD, 20892, USA
- Teresa Landi
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health (NIH), Bethesda, MD, 20892, USA
- Rany M Salem
- Division of Epidemiology, Herbert Wertheim School of Public Health and Human Longevity Science, University of California San Diego, La Jolla, CA, 92093, USA
- Gerald P Morris
- Department of Pathology, University of California San Diego, La Jolla, CA, 92093, USA
- Olivier Harismendy
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, 92093, USA
- Division of Biomedical Informatics, Department of Medicine, University of California San Diego School of Medicine, La Jolla, CA, 92093, USA
- Sandip Pravin Patel
- Center for Personalized Cancer Therapy, Division of Hematology and Oncology, UC San Diego Moores Cancer Center, San Diego, CA, 92037, USA
- Ludmil B Alexandrov
- Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
- Jill P Mesirov
- Moores Cancer Center, University of California San Diego, La Jolla, CA, 92093, USA
- Department of Medicine, Division of Medical Genetics, University of California San Diego, La Jolla, CA, 92093, USA
- Maurizio Zanetti
- Moores Cancer Center, University of California San Diego, La Jolla, CA, 92093, USA
- The Laboratory of Immunology and Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Chi-Ping Day
- Laboratory of Cancer Biology and Genetics, National Cancer Institute, National Institutes of Health (NIH), Bethesda, MD, 20892, USA
- Chun Chieh Fan
- Center for Population Neuroscience and Genetics, Laureate Institute for Brain Research, Tulsa, OK, 74136, USA
- Department of Radiology, University of California San Diego, La Jolla, CA, 92093, USA
- Wesley K Thompson
- Division of Biostatistics, Herbert Wertheim School of Public Health and Human Longevity Science, University of California San Diego, La Jolla, CA, 92093, USA
- Glenn Merlino
- Laboratory of Cancer Biology and Genetics, National Cancer Institute, National Institutes of Health (NIH), Bethesda, MD, 20892, USA
- J Silvio Gutkind
- Department of Pharmacology, UCSD Moores Cancer Center, La Jolla, CA, 92093, USA
- Hannah Carter
- Moores Cancer Center, University of California San Diego, La Jolla, CA, 92093, USA
- Department of Medicine, Division of Medical Genetics, University of California San Diego, La Jolla, CA, 92093, USA
3
Zhang X, Lucas AM, Veturi Y, Drivas TG, Bone WP, Verma A, Chung WK, Crosslin D, Denny JC, Hebbring S, Jarvik GP, Kullo I, Larson EB, Rasmussen-Torvik LJ, Schaid DJ, Smoller JW, Stanaway IB, Wei WQ, Weng C, Ritchie MD. Large-scale genomic analyses reveal insights into pleiotropy across circulatory system diseases and nervous system disorders. Nat Commun 2022; 13:3428. [PMID: 35701404 PMCID: PMC9198016 DOI: 10.1038/s41467-022-30678-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/30/2020] [Accepted: 05/10/2022] [Indexed: 01/18/2023]
Abstract
Clinical and epidemiological studies have shown that circulatory system diseases and nervous system disorders often co-occur in patients. However, genetic susceptibility factors shared between these disease categories remain largely unknown. Here, we characterized pleiotropy across 107 circulatory system and 40 nervous system traits using an ensemble of methods in the eMERGE Network and UK Biobank. Using a formal test of pleiotropy, five genomic loci demonstrated statistically significant evidence of pleiotropy. We observed region-specific patterns of direction of genetic effects for the two disease categories, suggesting potential antagonistic and synergistic pleiotropy. Our findings provide insights into the relationship between circulatory system diseases and nervous system disorders which can provide context for future prevention and treatment strategies.
Affiliation(s)
- Xinyuan Zhang
- Department of Genetics and Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Anastasia M Lucas
- Department of Genetics and Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Yogasudha Veturi
- Department of Genetics and Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Theodore G Drivas
- Department of Genetics and Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- William P Bone
- Department of Genetics and Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Anurag Verma
- Department of Genetics and Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Wendy K Chung
- Department of Pediatrics and Medicine, Columbia University, New York, NY, 10032, USA
- David Crosslin
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
- Joshua C Denny
- Department of Medicine, Vanderbilt University, Nashville, TN, 37235, USA
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, 37230, USA
- Scott Hebbring
- Center for Human Genetics, Marshfield Clinic, Marshfield, WI, 54449, USA
- Gail P Jarvik
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
- Iftikhar Kullo
- Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN, 55905, USA
- Eric B Larson
- Kaiser Permanente Washington Health Research Institute, Seattle, WA, 98101, USA
- Laura J Rasmussen-Torvik
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Daniel J Schaid
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, 55905, USA
- Jordan W Smoller
- Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
- Ian B Stanaway
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
- Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, 37230, USA
- Chunhua Weng
- Department of Biomedical Informatics, Columbia University, New York, NY, 10032, USA
- Marylyn D Ritchie
- Department of Genetics and Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
4
Choe EK, Shivakumar M, Verma A, Verma SS, Choi SH, Kim JS, Kim D. Leveraging deep phenotyping from health check-up cohort with 10,000 Korean individuals for phenome-wide association study of 136 traits. Sci Rep 2022; 12:1930. [PMID: 35121771 PMCID: PMC8817039 DOI: 10.1038/s41598-021-04580-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/16/2021] [Accepted: 12/17/2021] [Indexed: 11/09/2022]
Abstract
The expanding use of the phenome-wide association study (PheWAS) faces challenges in the context of using International Classification of Diseases billing codes for phenotype definition, imbalanced study population ethnicity, and constrained application of the results in research. We performed a PheWAS utilizing 136 deep phenotypes corroborated by comprehensive health check-ups in a Korean population, along with trans-ethnic comparisons using the UK Biobank and Biobank Japan Project. A meta-analysis of the Korean and Japanese populations was also performed. The PheWAS associated 65 phenotypes with 14,101 significant variants (P < 4.92 × 10⁻¹⁰). Network analysis, visualization of cross-phenotype mapping, and causal inference mapping with Mendelian randomization were conducted. Among phenotype pairs from the genotype-driven cross-phenotype associations, we evaluated penetrance in correlation analysis using a clinical database. We focused on the application of PheWAS in order to make it robust and to aid the derivation of biological meaning post-PheWAS. This comprehensive analysis of PheWAS results based on a health check-up database will provide researchers and clinicians with a panoramic overview of the networks among multiple phenotypes and genetic variants, laying groundwork for the practical application of precision medicine.
Affiliation(s)
- Eun Kyung Choe
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, B304 Richards Building, 3700 Hamilton Walk, Philadelphia, PA, 19104-6116, USA
- Department of Surgery, Seoul National University Hospital Healthcare System Gangnam Center, Seoul, 06236, South Korea
- Manu Shivakumar
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, B304 Richards Building, 3700 Hamilton Walk, Philadelphia, PA, 19104-6116, USA
- Anurag Verma
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Shefali Setia Verma
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Seung Ho Choi
- Department of Internal Medicine, Seoul National University Hospital Healthcare System Gangnam Center, Seoul, 06236, South Korea
- Joo Sung Kim
- Department of Internal Medicine, Seoul National University Hospital Healthcare System Gangnam Center, Seoul, 06236, South Korea
- Department of Internal Medicine and Liver Research Institute, Seoul National University College of Medicine, Seoul, 03080, South Korea
- Dokyoon Kim
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, B304 Richards Building, 3700 Hamilton Walk, Philadelphia, PA, 19104-6116, USA
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, 19104, USA
5
Wang L, Zhang X, Meng X, Koskeridis F, Georgiou A, Yu L, Campbell H, Theodoratou E, Li X. Methodology in phenome-wide association studies: a systematic review. J Med Genet 2021; 58:720-728. [PMID: 34272311 DOI: 10.1136/jmedgenet-2021-107696] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/06/2021] [Accepted: 05/27/2021] [Indexed: 11/04/2022]
Abstract
Phenome-wide association study (PheWAS) has been increasingly used to identify novel genetic associations across a wide spectrum of phenotypes. This systematic review aims to summarise the PheWAS methodology, discuss the advantages and challenges of PheWAS, and provide potential implications for future PheWAS studies. Medical Literature Analysis and Retrieval System Online (MEDLINE) and Excerpta Medica Database (EMBASE) databases were searched to identify all published PheWAS studies up until 24 April 2021. The PheWAS methodology incorporating how to perform PheWAS analysis and which software/tool could be used, were summarised based on the extracted information. A total of 1035 studies were identified and 195 eligible articles were finally included. Among them, 137 (77.0%) contained 10 000 or more study participants, 164 (92.1%) defined the phenome based on electronic medical records data, 140 (78.7%) used genetic variants as predictors, and 73 (41.0%) conducted replication analysis to validate PheWAS findings and almost all of them (94.5%) received consistent results. The methodology applied in these PheWAS studies was dissected into several critical steps, including quality control of the phenome, selecting predictors, phenotyping, statistical analysis, interpretation and visualisation of PheWAS results, and the workflow for performing a PheWAS was established with detailed instructions on each step. This study provides a comprehensive overview of PheWAS methodology to help practitioners achieve a better understanding of the PheWAS design, to detect understudied or overstudied outcomes, and to direct their research by applying the most appropriate software and online tools for their study data structure.
Affiliation(s)
- Lijuan Wang
- School of Public Health and the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Xiaomeng Zhang
- Centre for Global Health, The University of Edinburgh Usher Institute of Population Health Sciences and Informatics, Edinburgh, UK
- Xiangrui Meng
- Vanke School of Public Health, Tsinghua University, Beijing, China
- Fotios Koskeridis
- Department of Hygiene and Epidemiology, University of Ioannina, Ioannina, Epirus, Greece
- Andrea Georgiou
- Department of Hygiene and Epidemiology, University of Ioannina, Ioannina, Epirus, Greece
- Lili Yu
- School of Public Health and the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Harry Campbell
- Centre for Global Health, The University of Edinburgh Usher Institute of Population Health Sciences and Informatics, Edinburgh, UK
- Evropi Theodoratou
- Centre for Global Health, The University of Edinburgh Usher Institute of Population Health Sciences and Informatics, Edinburgh, UK
- Cancer Research UK Edinburgh Centre, The University of Edinburgh MRC Institute of Genetics and Molecular Medicine, Edinburgh, UK
- Xue Li
- School of Public Health and the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
6
Hall MA, Wallace J, Lucas AM, Bradford Y, Verma SS, Müller-Myhsok B, Passero K, Zhou J, McGuigan J, Jiang B, Pendergrass SA, Zhang Y, Peissig P, Brilliant M, Sleiman P, Hakonarson H, Harley JB, Kiryluk K, Van Steen K, Moore JH, Ritchie MD. Novel EDGE encoding method enhances ability to identify genetic interactions. PLoS Genet 2021; 17:e1009534. [PMID: 34086673 PMCID: PMC8208534 DOI: 10.1371/journal.pgen.1009534] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/07/2020] [Revised: 06/16/2021] [Accepted: 04/06/2021] [Indexed: 11/26/2022]
Abstract
Assumptions are made about the genetic model of single nucleotide polymorphisms (SNPs) when choosing a traditional genetic encoding: additive, dominant, and recessive. Furthermore, SNPs across the genome are unlikely to demonstrate identical genetic models. However, running SNP-SNP interaction analyses with every combination of encodings raises the multiple testing burden. Here, we present a novel and flexible encoding for genetic interactions, the elastic data-driven genetic encoding (EDGE), in which SNPs are assigned a heterozygous value based on the genetic model they demonstrate in a dataset prior to interaction testing. We assessed the power of EDGE to detect genetic interactions using 29 combinations of simulated genetic models and found it outperformed the traditional encoding methods across 10%, 30%, and 50% minor allele frequencies (MAFs). Further, EDGE maintained a low false-positive rate, while additive and dominant encodings demonstrated inflation. We evaluated EDGE and the traditional encodings with genetic data from the Electronic Medical Records and Genomics (eMERGE) Network for five phenotypes: age-related macular degeneration (AMD), age-related cataract, glaucoma, type 2 diabetes (T2D), and resistant hypertension. A multi-encoding genome-wide association study (GWAS) for each phenotype was performed using the traditional encodings, and the top results of the multi-encoding GWAS were considered for SNP-SNP interaction using the traditional encodings and EDGE. EDGE identified a novel SNP-SNP interaction for age-related cataract that no other method identified: rs7787286 (MAF: 0.041; intergenic region of chromosome 7)–rs4695885 (MAF: 0.34; intergenic region of chromosome 4) with a Bonferroni LRT p of 0.018. A SNP-SNP interaction was found in data from the UK Biobank within 25 kb of these SNPs using the recessive encoding: rs60374751 (MAF: 0.030) and rs6843594 (MAF: 0.34) (Bonferroni LRT p: 0.026). 
We recommend using EDGE to flexibly detect interactions between SNPs exhibiting diverse action. Although traditional genetic encodings are widely implemented in genetics research, including in genome-wide association studies (GWAS) and epistasis, each method makes assumptions that may not reflect the underlying etiology. Here, we introduce a novel encoding method that estimates and assigns an individualized data-driven encoding for each single nucleotide polymorphism (SNP): the elastic data-driven genetic encoding (EDGE). With simulations, we demonstrate that this novel method is more accurate and robust than traditional encoding methods in estimating heterozygous genotype values, reducing the type I error, and detecting SNP-SNP interactions. We further applied the traditional encodings and EDGE to biomedical data from the Electronic Medical Records and Genomics (eMERGE) Network for five phenotypes, and EDGE identified a novel interaction for age-related cataract not detected by traditional methods, which replicated in data from the UK Biobank. EDGE provides an alternative approach to understanding and modeling diverse SNP models and is recommended for studying complex genetics in common human phenotypes.
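The abstract describes EDGE as assigning each SNP a data-driven heterozygous value before interaction testing. One way to picture this, offered here as a simplified assumption rather than the authors' implementation, is to code homozygous-reference as 0, homozygous-alternate as 1, and the heterozygote as a value `alpha` estimated from the data. In that scheme the traditional encodings are special cases: recessive (`alpha = 0`), additive (`alpha = 0.5`), and dominant (`alpha = 1`). The sketch below estimates `alpha` for a quantitative trait from genotype-class means, which is what an unadjusted codominant linear regression reduces to; the published method may use covariates or other regression models. The names `estimate_alpha` and `encode` are hypothetical.

```python
from statistics import mean


def estimate_alpha(genotypes, phenotypes):
    """Estimate the heterozygous value as beta_het / beta_hom_alt.

    genotypes  : alternate-allele counts per sample (0, 1 or 2)
    phenotypes : quantitative trait values, same order as genotypes
    With no covariates, the codominant regression coefficients are
    differences between genotype-class means and the reference mean.
    """
    groups = {0: [], 1: [], 2: []}
    for g, y in zip(genotypes, phenotypes):
        groups[g].append(y)
    if not all(groups.values()):
        raise ValueError("each genotype class needs at least one sample")
    ref = mean(groups[0])
    beta_het = mean(groups[1]) - ref
    beta_hom = mean(groups[2]) - ref
    if beta_hom == 0:
        raise ValueError("homozygous-alternate effect is zero; alpha undefined")
    return beta_het / beta_hom


def encode(genotypes, alpha):
    """Map allele counts to 0 / alpha / 1 for downstream interaction models."""
    values = {0: 0.0, 1: alpha, 2: 1.0}
    return [values[g] for g in genotypes]


# Toy example: heterozygotes look like alternate homozygotes (dominant-like).
geno = [0, 0, 1, 1, 2, 2]
pheno = [1.0, 1.0, 1.5, 2.5, 2.0, 2.0]
alpha = estimate_alpha(geno, pheno)
print(alpha, encode(geno, alpha))
```

Estimating one `alpha` per SNP avoids testing every combination of additive, dominant, and recessive encodings for each SNP pair, which is the multiple-testing concern raised in the abstract.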
Affiliation(s)
- Molly A. Hall
- Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Penn State Cancer Institute, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- John Wallace
- Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Anastasia M. Lucas
- Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Yuki Bradford
- Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Shefali S. Verma
- Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Bertram Müller-Myhsok
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Munich, Germany
- Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
- Institute of Translational Medicine, University of Liverpool, Liverpool, United Kingdom
- Kristin Passero
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Jiayan Zhou
- Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- John McGuigan
- Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Beibei Jiang
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Munich, Germany
- Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
- Institute of Translational Medicine, University of Liverpool, Liverpool, United Kingdom
- Yanfei Zhang
- Genomic Medicine Institute, Geisinger Health System, Danville, Pennsylvania, United States of America
- Peggy Peissig
- Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield, Wisconsin, United States of America
- Murray Brilliant
- Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield, Wisconsin, United States of America
- Patrick Sleiman
- Department of Pediatrics, Center for Applied Genomics, Children’s Hospital of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Hakon Hakonarson
- Department of Pediatrics, Center for Applied Genomics, Children’s Hospital of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- John B. Harley
- Center for Autoimmune Genomics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
- United States Department of Veterans Affairs Medical Center, Cincinnati, Ohio, United States of America
- Krzysztof Kiryluk
- Division of Nephrology, Department of Medicine, College of Physicians and Surgeons, Columbia University, New York, New York, United States of America
- Kristel Van Steen
- WELBIO, GIGA-R Medical Genomics-BIO3, University of Liège, Liège, Belgium
- Department of Human Genetics, University of Leuven, Leuven, Belgium
- Jason H. Moore
- Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Marylyn D. Ritchie
- Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
7
Li B, Veturi Y, Verma A, Bradford Y, Daar ES, Gulick RM, Riddler SA, Robbins GK, Lennox JL, Haas DW, Ritchie MD. Tissue specificity-aware TWAS (TSA-TWAS) framework identifies novel associations with metabolic, immunologic, and virologic traits in HIV-positive adults. PLoS Genet 2021; 17:e1009464. [PMID: 33901188 PMCID: PMC8102009 DOI: 10.1371/journal.pgen.1009464] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/08/2020] [Revised: 05/06/2021] [Accepted: 03/03/2021] [Indexed: 01/01/2023]
Abstract
As a relatively new methodology, the transcriptome-wide association study (TWAS) has gained interest due to its capacity for gene-level association testing. However, the development of TWAS has outpaced statistical evaluation of TWAS gene prioritization performance. Current TWAS methods vary in underlying biological assumptions about tissue specificity of transcriptional regulatory mechanisms. In a previous study from our group, this may have affected whether TWAS methods better identified associations in single tissues versus multiple tissues. We therefore designed simulation analyses to examine how the interplay between particular TWAS methods and tissue specificity of gene expression affects power and type I error rates for gene prioritization. We found that cross-tissue identification of expression quantitative trait loci (eQTLs) improved TWAS power. Single-tissue TWAS (i.e., PrediXcan) had robust power to identify genes expressed in single tissues, but often found significant associations in the wrong tissues as well (and therefore had high false positive rates). Cross-tissue TWAS (i.e., UTMOST) had overall equal or greater power and controlled type I error rates for genes expressed in multiple tissues. Based on these simulation results, we applied a tissue specificity-aware TWAS (TSA-TWAS) analytic framework to look for gene-based associations with pre-treatment laboratory values from AIDS Clinical Trial Group (ACTG) studies. We replicated several proof-of-concept transcriptionally regulated gene-trait associations, including UGT1A1 (encoding bilirubin uridine diphosphate glucuronosyltransferase enzyme) and total bilirubin levels (p = 3.59 × 10⁻¹²), and CETP (cholesteryl ester transfer protein) with high-density lipoprotein cholesterol (p = 4.49 × 10⁻¹²). We also identified several novel genes associated with metabolic and virologic traits, as well as pleiotropic genes that linked plasma viral load, absolute basophil count, and/or triglyceride levels.
By highlighting the advantages of different TWAS methods, our simulation study promotes a tissue specificity-aware TWAS analytic framework that revealed novel aspects of HIV-related traits.
Affiliation(s)
- Binglan Li
- Department of Biomedical Data Science, Stanford University, Stanford, California, United States of America
| | - Yogasudha Veturi
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Anurag Verma
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Yuki Bradford
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Eric S. Daar
- Lundquist Institute at Harbor-UCLA Medical Center, Torrance, California, United States of America
| | - Roy M. Gulick
- Weill Cornell Medicine, New York City, New York, United States of America
| | - Sharon A. Riddler
- Department of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Gregory K. Robbins
- Division of Infectious Diseases, Massachusetts General Hospital, Boston, Massachusetts, United States of America
| | - Jeffrey L. Lennox
- Emory University School of Medicine, Atlanta, Georgia, United States of America
| | - David W. Haas
- Departments of Medicine, Pharmacology, Pathology, Microbiology & Immunology, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Department of Internal Medicine, Meharry Medical College, Nashville, Tennessee, United States of America
| | - Marylyn D. Ritchie
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
|
8
|
Drivas TG, Lucas A, Zhang X, Ritchie MD. Mendelian pathway analysis of laboratory traits reveals distinct roles for ciliary subcompartments in common disease pathogenesis. Am J Hum Genet 2021; 108:482-501. [PMID: 33636100 PMCID: PMC8008498 DOI: 10.1016/j.ajhg.2021.02.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/24/2020] [Accepted: 02/05/2021] [Indexed: 12/17/2022]
Abstract
Rare monogenic disorders of the primary cilium, termed ciliopathies, are characterized by extreme presentations of otherwise common diseases, such as diabetes, hepatic fibrosis, and kidney failure. However, despite a recent revolution in our understanding of the cilium's role in rare disease pathogenesis, the organelle's contribution to common disease remains largely unknown. Hypothesizing that common genetic variants within Mendelian ciliopathy genes might contribute to the pathogenesis of common complex diseases, we performed association studies of 16,874 common genetic variants across 122 ciliary genes with 12 quantitative laboratory traits characteristic of ciliopathy syndromes in 452,593 individuals in the UK Biobank. We incorporated tissue-specific gene expression analysis, expression quantitative trait loci (eQTLs), and Mendelian disease phenotype information into our analysis and replicated our findings in meta-analysis. We identified 101 statistically significant associations across 42 of the 122 examined ciliary genes (including eight novel replicating associations). These ciliary genes were widely expressed in tissues relevant to the phenotypes being studied, and eQTL analysis revealed strong evidence for correlation between ciliary gene expression levels and laboratory traits. Perhaps most interestingly, our analysis identified different ciliary subcompartments as being specifically associated with distinct sets of phenotypes. Taken together, our data demonstrate the utility of a Mendelian pathway-based approach to genomic association studies, challenge the widely held belief that the cilium is an organelle important mainly in development and in rare syndromic disease pathogenesis, and provide a framework for the continued integration of common and rare disease genetics to provide insight into the pathophysiology of human diseases of immense public health burden.
Affiliation(s)
- Theodore George Drivas
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19194, USA; Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
| | - Anastasia Lucas
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19194, USA
| | - Xinyuan Zhang
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19194, USA
| | - Marylyn DeRiggi Ritchie
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19194, USA; Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19194, USA.
|
9
|
Klarin D, Verma SS, Judy R, Dikilitas O, Wolford BN, Paranjpe I, Levin MG, Pan C, Tcheandjieu C, Spin JM, Lynch J, Assimes TL, Åldstedt Nyrønning L, Mattsson E, Edwards TL, Denny J, Larson E, Lee MTM, Carrell D, Zhang Y, Jarvik GP, Gharavi AG, Harley J, Mentch F, Pacheco JA, Hakonarson H, Skogholt AH, Thomas L, Gabrielsen ME, Hveem K, Nielsen JB, Zhou W, Fritsche L, Huang J, Natarajan P, Sun YV, DuVall SL, Rader DJ, Cho K, Chang KM, Wilson PWF, O'Donnell CJ, Kathiresan S, Scali ST, Berceli SA, Willer C, Jones GT, Bown MJ, Nadkarni G, Kullo IJ, Ritchie M, Damrauer SM, Tsao PS. Genetic Architecture of Abdominal Aortic Aneurysm in the Million Veteran Program. Circulation 2020; 142:1633-1646. [PMID: 32981348 PMCID: PMC7580856 DOI: 10.1161/circulationaha.120.047544] [Citation(s) in RCA: 83] [Impact Index Per Article: 16.6] [Indexed: 11/22/2022]
Abstract
Abdominal aortic aneurysm (AAA) is an important cause of cardiovascular mortality; however, its genetic determinants remain incompletely defined. In total, 10 previously identified risk loci explain a small fraction of AAA heritability.
Affiliation(s)
- Derek Klarin
- Malcolm Randall VA Medical Center, Gainesville, FL (D.K., S.T.S., S.A.B.).,Division of Vascular Surgery and Endovascular Therapy, University of Florida College of Medicine, Gainesville (D.K., S.T.S., S.A.B.).,Center for Genomic Medicine (D.K., W.Z., P.N.), Massachusetts General Hospital, Harvard Medical School, Boston.,Program in Medical and Population Genetics (D.K.), Broad Institute of MIT and Harvard, Cambridge, MA
| | - Shefali Setia Verma
- Department of Genetics (S.S.V., M.R.), Perelman School of Medicine, University of Pennsylvania, Philadelphia
| | - Renae Judy
- Department of Surgery (R.J., S.M.D.), Perelman School of Medicine, University of Pennsylvania, Philadelphia.,Corporal Michael J. Crescenz VA Medical Center, Philadelphia, PA (R.J., M.G.L., K.-M.C., S.M.D.)
| | - Ozan Dikilitas
- Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN (O.D., I.J.K.)
| | - Brooke N Wolford
- Department of Computational Medicine and Bioinformatics (B.N.W., C.W.), University of Michigan Medical School, Ann Arbor
| | - Ishan Paranjpe
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY (I.P., G.N.)
| | - Michael G Levin
- Division of Cardiovascular Medicine (M.G.L.), Perelman School of Medicine, University of Pennsylvania, Philadelphia.,Department of Medicine (M.G.L., D.J.R., K.-M.C.), Perelman School of Medicine, University of Pennsylvania, Philadelphia.,Corporal Michael J. Crescenz VA Medical Center, Philadelphia, PA (R.J., M.G.L., K.-M.C., S.M.D.)
| | - Cuiping Pan
- Palo Alto Epidemiology Research and Information Center for Genomics (C.P.), CA
| | - Catherine Tcheandjieu
- VA Palo Alto Health Care System (C.T., J.M.S., T.L.A., P.S.T.), CA.,Division of Cardiovascular Medicine, Department of Medicine (C.T., J.M.S., T.L.A., P.S.T.), Stanford University School of Medicine, CA.,Department of Pediatric Cardiology (C.T.), Stanford University School of Medicine, CA
| | - Joshua M Spin
- VA Palo Alto Health Care System (C.T., J.M.S., T.L.A., P.S.T.), CA.,Division of Cardiovascular Medicine, Department of Medicine (C.T., J.M.S., T.L.A., P.S.T.), Stanford University School of Medicine, CA
| | - Julie Lynch
- Edith Nourse VA Medical Center, Bedford, MA (J.L.).,VA Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, UT (J.L., S.L.D.)
| | - Themistocles L Assimes
- VA Palo Alto Health Care System (C.T., J.M.S., T.L.A., P.S.T.), CA.,Division of Cardiovascular Medicine, Department of Medicine (C.T., J.M.S., T.L.A., P.S.T.), Stanford University School of Medicine, CA
| | - Linn Åldstedt Nyrønning
- Department of Vascular Surgery, St. Olavs Hospital, Trondheim, Norway (L.Å.N., E.M.).,Department of Circulation and Medical Imaging (L.Å.N., E.M.), Norwegian University of Science and Technology, Trondheim, Norway
| | - Erney Mattsson
- Department of Vascular Surgery, St. Olavs Hospital, Trondheim, Norway (L.Å.N., E.M.).,Department of Circulation and Medical Imaging (L.Å.N., E.M.), Norwegian University of Science and Technology, Trondheim, Norway
| | - Todd L Edwards
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center (T.L.E.), Vanderbilt University Medical Center, Nashville, TN.,Vanderbilt Genetics Institute (T.L.E., J.D.), Vanderbilt University Medical Center, Nashville, TN
| | - Josh Denny
- Vanderbilt Genetics Institute (T.L.E., J.D.), Vanderbilt University Medical Center, Nashville, TN.,Department of Biomedical Informatics (J.D., E.L., D.C.), Vanderbilt University Medical Center, Nashville, TN.,Kaiser Permanente Washington Health Research Institute, Seattle (J.D., E.L., D.C.)
| | - Eric Larson
- Department of Biomedical Informatics (J.D., E.L., D.C.), Vanderbilt University Medical Center, Nashville, TN.,Kaiser Permanente Washington Health Research Institute, Seattle (J.D., E.L., D.C.).,Departments of Medicine and Health Services (E.L.), University of Washington, Seattle
| | - Ming Ta Michael Lee
- Genomic Medicine Institute, Geisinger Health System, Danville, PA (M.T.M.L., Y.Z.)
| | - David Carrell
- Department of Biomedical Informatics (J.D., E.L., D.C.), Vanderbilt University Medical Center, Nashville, TN.,Kaiser Permanente Washington Health Research Institute, Seattle (J.D., E.L., D.C.)
| | - Yanfei Zhang
- Genomic Medicine Institute, Geisinger Health System, Danville, PA (M.T.M.L., Y.Z.)
| | - Gail P Jarvik
- Division of Medical Genetics, Departments of Medicine and Genome Sciences (G.P.J.), University of Washington, Seattle
| | - Ali G Gharavi
- Division of Nephrology and Center for Precision Medicine and Genomics, Columbia University, New York, NY (A.G.G.)
| | - John Harley
- Center for Autoimmune Genomics and Etiology (CAGE), Cincinnati Children's Hospital Medical Center, OH (J.H.).,Department of Pediatrics, University of Cincinnati College of Medicine, OH (J.H.).,US Department of Veterans Affairs, Cincinnati, OH (J.H.)
| | - Frank Mentch
- Center for Applied Genomics, The Children's Hospital of Philadelphia, PA (F.M., H.H.)
| | - Jennifer A Pacheco
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL (J.A.P.)
| | - Hakon Hakonarson
- Department of Pediatrics (H.H.), Perelman School of Medicine, University of Pennsylvania, Philadelphia.,Center for Applied Genomics, The Children's Hospital of Philadelphia, PA (F.M., H.H.)
| | - Anne Heidi Skogholt
- Faculty of Medicine and Health Sciences (A.H.S., L.T., M.E.G., K.H., J.B.N.), Norwegian University of Science and Technology, Trondheim, Norway
| | - Laurent Thomas
- Faculty of Medicine and Health Sciences (A.H.S., L.T., M.E.G., K.H., J.B.N.), Norwegian University of Science and Technology, Trondheim, Norway.,Department of Clinical and Molecular Medicine (L.T.), Norwegian University of Science and Technology, Trondheim, Norway
| | - Maiken Elvestad Gabrielsen
- Faculty of Medicine and Health Sciences (A.H.S., L.T., M.E.G., K.H., J.B.N.), Norwegian University of Science and Technology, Trondheim, Norway
| | - Kristian Hveem
- Faculty of Medicine and Health Sciences (A.H.S., L.T., M.E.G., K.H., J.B.N.), Norwegian University of Science and Technology, Trondheim, Norway
| | - Jonas Bille Nielsen
- Faculty of Medicine and Health Sciences (A.H.S., L.T., M.E.G., K.H., J.B.N.), Norwegian University of Science and Technology, Trondheim, Norway.,K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, Department of Epidemiology Research, Statens Serum Institute, Copenhagen, Denmark (J.B.N.)
| | - Wei Zhou
- Center for Genomic Medicine (D.K., W.Z., P.N.), Massachusetts General Hospital, Harvard Medical School, Boston.,Stanley Center for Psychiatric Research (W.Z.), Broad Institute of MIT and Harvard, Cambridge, MA.,Analytic and Translational Genetics Unit (W.Z.), Massachusetts General Hospital, Boston
| | - Lars Fritsche
- Department of Biostatistics (L.F.), University of Michigan Medical School, Ann Arbor
| | - Jie Huang
- Boston VA Healthcare System, MA (J.H., P.N., K.C., C.J.O.)
| | - Pradeep Natarajan
- Center for Genomic Medicine (D.K., W.Z., P.N.), Massachusetts General Hospital, Harvard Medical School, Boston.,Department of Medicine (P.N.), Massachusetts General Hospital, Harvard Medical School, Boston.,Cardiovascular Research Center (P.N.), Massachusetts General Hospital, Boston.,Boston VA Healthcare System, MA (J.H., P.N., K.C., C.J.O.)
| | - Yan V Sun
- Department of Epidemiology, Emory University Rollins School of Public Health, Atlanta, GA (Y.V.S.).,Atlanta VA Health Care System, Decatur, GA (Y.V.S., P.W.F.W.)
| | - Scott L DuVall
- VA Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, UT (J.L., S.L.D.).,Division of Epidemiology, Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City (S.L.D.)
| | - Daniel J Rader
- Department of Medicine (M.G.L., D.J.R., K.-M.C.), Perelman School of Medicine, University of Pennsylvania, Philadelphia
| | - Kelly Cho
- Boston VA Healthcare System, MA (J.H., P.N., K.C., C.J.O.)
| | - Kyong-Mi Chang
- Department of Medicine (M.G.L., D.J.R., K.-M.C.), Perelman School of Medicine, University of Pennsylvania, Philadelphia.,Corporal Michael J. Crescenz VA Medical Center, Philadelphia, PA (R.J., M.G.L., K.-M.C., S.M.D.)
| | - Peter W F Wilson
- Atlanta VA Health Care System, Decatur, GA (Y.V.S., P.W.F.W.).,Emory Clinical Cardiovascular Research Institute, Atlanta, GA (P.W.F.W.)
| | - Christopher J O'Donnell
- Boston VA Healthcare System, MA (J.H., P.N., K.C., C.J.O.).,Cardiovascular Medicine Division, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA (C.J.O.)
| | | | - Salvatore T Scali
- Malcolm Randall VA Medical Center, Gainesville, FL (D.K., S.T.S., S.A.B.).,Division of Vascular Surgery and Endovascular Therapy, University of Florida College of Medicine, Gainesville (D.K., S.T.S., S.A.B.)
| | - Scott A Berceli
- Malcolm Randall VA Medical Center, Gainesville, FL (D.K., S.T.S., S.A.B.).,Division of Vascular Surgery and Endovascular Therapy, University of Florida College of Medicine, Gainesville (D.K., S.T.S., S.A.B.)
| | - Cristen Willer
- Department of Computational Medicine and Bioinformatics (B.N.W., C.W.), University of Michigan Medical School, Ann Arbor.,Department of Internal Medicine, Division of Cardiology (C.W.), University of Michigan Medical School, Ann Arbor.,Department of Human Genetics (C.W.), University of Michigan Medical School, Ann Arbor
| | - Gregory T Jones
- Department of Surgical Sciences, Dunedin School of Medicine, University of Otago, New Zealand (G.T.J.)
| | - Matthew J Bown
- Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, University of Leicester, United Kingdom (M.J.B.)
| | - Girish Nadkarni
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY (I.P., G.N.)
| | - Iftikhar J Kullo
- Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN (O.D., I.J.K.)
| | - Marylyn Ritchie
- Department of Genetics (S.S.V., M.R.), Perelman School of Medicine, University of Pennsylvania, Philadelphia
| | - Scott M Damrauer
- Department of Surgery (R.J., S.M.D.), Perelman School of Medicine, University of Pennsylvania, Philadelphia.,Corporal Michael J. Crescenz VA Medical Center, Philadelphia, PA (R.J., M.G.L., K.-M.C., S.M.D.)
| | - Philip S Tsao
- VA Palo Alto Health Care System (C.T., J.M.S., T.L.A., P.S.T.), CA.,Division of Cardiovascular Medicine, Department of Medicine (C.T., J.M.S., T.L.A., P.S.T.), Stanford University School of Medicine, CA
|
10
|
Investigation of gene-gene interactions in cardiac traits and serum fatty acid levels in the LURIC Health Study. PLoS One 2020; 15:e0238304. [PMID: 32915819 PMCID: PMC7485803 DOI: 10.1371/journal.pone.0238304] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/13/2019] [Accepted: 08/13/2020] [Indexed: 01/25/2023]
Abstract
Epistasis analysis elucidates the effects of gene-gene interactions (G×G) between multiple loci for complex traits. However, the large computational demands and the high multiple testing burden impede the discovery of such interactions. Here, we illustrate the utilization of two methods, main effect filtering based on individual GWAS results and biological knowledge-based modeling through Biofilter software, to reduce the number of interactions tested among single nucleotide polymorphisms (SNPs) for 15 cardiac-related traits and 14 fatty acids. We performed interaction analyses using the two filtering methods, adjusting for age, sex, body mass index (BMI), waist-hip ratio, and the first three principal components from genetic data, among 2,824 samples from the Ludwigshafen Risk and Cardiovascular (LURIC) Health Study. Using Biofilter, one interaction nearly met Bonferroni significance: an interaction between rs7735781 in XRCC4 and rs10804247 in XRCC5 was identified for venous thrombosis with a Bonferroni-adjusted likelihood ratio test (LRT) p = 0.0627. A total of 57 interactions were identified from main effect filtering for the cardiac traits G×G (10) and fatty acids G×G (47) at Bonferroni-adjusted LRT p < 0.05. For cardiac traits, the top interaction involved SNPs rs1383819 in SNTG1 and rs1493939 (138 kb from the 5’ end of SAMD12) and was significantly associated with history of arterial hypertension (Bonferroni-adjusted LRT p = 0.0228). For fatty acids, the top interaction, between rs4839193 in KCND3 and rs10829717 in LOC107984002, was associated with 9-trans 12-trans octadecanoic acid, an omega-6 trans fatty acid (Bonferroni-adjusted LRT p = 2.28×10⁻⁵). The model inflation factor for the interactions under different filtering methods was evaluated from the standard median and the linear regression approach. Here, we applied filtering approaches to identify numerous genetic interactions related to cardiac-related outcomes as potential targets for therapy. The approaches described offer ways to detect epistasis in complex traits and to improve precision medicine capability.
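The Bonferroni-adjusted likelihood ratio test reported in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a single additive SNP×SNP interaction term (so the LRT has 1 degree of freedom), and the function names and log-likelihood values are hypothetical, chosen only for illustration.

```python
import math


def lrt_pvalue_1df(loglik_reduced: float, loglik_full: float) -> float:
    """P-value of a likelihood ratio test with 1 degree of freedom.

    Assumption: the full model adds exactly one interaction term to the
    reduced (main-effects-only) model, so the statistic is chi-square(1).
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def bonferroni_adjust(p: float, n_tests: int) -> float:
    """Bonferroni-adjusted p-value: multiply by the number of tests, cap at 1."""
    return min(1.0, p * n_tests)


if __name__ == "__main__":
    # Hypothetical log-likelihoods for one SNP pair (not data from the study).
    p_raw = lrt_pvalue_1df(loglik_reduced=-1250.0, loglik_full=-1238.0)
    # Hypothetical number of SNP pairs that survived filtering.
    p_adj = bonferroni_adjust(p_raw, n_tests=5000)
    print(f"raw p = {p_raw:.3e}, Bonferroni-adjusted p = {p_adj:.3e}")
```

Filtering candidate SNP pairs before testing lowers `n_tests`, which is why the same raw p-value yields a smaller adjusted p-value under a stricter filter.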
|
11
|
Verma SS, Bergmeijer TO, Gong L, Reny JL, Lewis JP, Mitchell BD, Alexopoulos D, Aradi D, Altman RB, Bliden K, Bradford Y, Campo G, Chang K, Cleator JH, Déry JP, Dridi NP, Fernandez-Cadenas I, Fontana P, Gawaz M, Geisler T, Gensini GF, Giusti B, Gurbel PA, Hochholzer W, Holmvang L, Kim EY, Kim HS, Marcucci R, Montaner J, Backman JD, Pakyz RE, Roden DM, Schaeffeler E, Schwab M, Shin JG, Siller-Matula JM, Ten Berg JM, Trenk D, Valgimigli M, Wallace J, Wen MS, Kubo M, Lee MTM, Whaley R, Winter S, Klein TE, Shuldiner AR, Ritchie MD. Genomewide Association Study of Platelet Reactivity and Cardiovascular Response in Patients Treated With Clopidogrel: A Study by the International Clopidogrel Pharmacogenomics Consortium. Clin Pharmacol Ther 2020; 108:1067-1077. [PMID: 32472697 PMCID: PMC7689744 DOI: 10.1002/cpt.1911] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 02/25/2020] [Accepted: 05/08/2020] [Indexed: 01/07/2023]
Abstract
Antiplatelet response to clopidogrel shows wide variation, and poor response is correlated with adverse clinical outcomes. CYP2C19 loss‐of‐function alleles play an important role in this response, but account for only a small proportion of variability in response to clopidogrel. An aim of the International Clopidogrel Pharmacogenomics Consortium (ICPC) is to identify other genetic determinants of clopidogrel pharmacodynamics and clinical response. A genomewide association study (GWAS) was performed using DNA from 2,750 European ancestry individuals, using adenosine diphosphate‐induced platelet reactivity and major cardiovascular and cerebrovascular events as outcome parameters. GWAS for platelet reactivity revealed a strong signal for CYP2C19*2 (P value = 1.67e−33). After correction for CYP2C19*2 no other single‐nucleotide polymorphism reached genomewide significance. GWAS for a combined clinical end point of cardiovascular death, myocardial infarction, or stroke (5.0% event rate), or a combined end point of cardiovascular death or myocardial infarction (4.7% event rate) showed no significant results, although in coronary artery disease, percutaneous coronary intervention, and acute coronary syndrome subgroups, mutations in SCOS5P1, CDC42BPA, and CTRAC1 showed genomewide significance (lowest P values: 1.07e−09, 4.53e−08, and 2.60e−10, respectively). CYP2C19*2 is the strongest genetic determinant of on‐clopidogrel platelet reactivity. We identified three novel associations in clinical outcome subgroups, suggestive for each of these outcomes.
Affiliation(s)
- Shefali Setia Verma
- Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Thomas O Bergmeijer
- Department of Cardiology, St. Antonius Center for Platelet Function Research, Nieuwegein, The Netherlands
| | - Li Gong
- Department of Biomedical Data Science, Stanford University, Stanford, California, USA
| | - Jean-Luc Reny
- Internal Medicine, Béziers Hospital, Béziers, France.,Geneva Platelet Group, School of Medicine, University of Geneva, Geneva, Switzerland.,Department of Internal Medicine, Rehabilitation and Geriatrics, University Hospitals of Geneva, Geneva, Switzerland.,Geneva Platelet Group and Division of Angiology and Haemostasis, University Hospitals of Geneva, Geneva, Switzerland
| | - Joshua P Lewis
- Department of Medicine and Program for Personalized and Genomic Medicine, University of Maryland, Baltimore, Maryland, USA
| | - Braxton D Mitchell
- Department of Medicine and Program for Personalized and Genomic Medicine, University of Maryland, Baltimore, Maryland, USA.,Geriatrics Research and Education Clinical Center, Baltimore Veterans Administration Medical Center, Baltimore, Maryland, USA
| | - Dimitrios Alexopoulos
- National and Kapodistrian University of Athens Medical School, Attikon University Hospital, Athens, Greece
| | - Daniel Aradi
- Department of Cardiology, Heart Center Balatonfüred, Balatonfüred, Hungary
| | - Russ B Altman
- Department of Bioengineering, Genetics and Medicine, Stanford University, Stanford, California, USA
| | - Kevin Bliden
- Sinai Center for Thrombosis Research and Drug Development, Baltimore, Maryland, USA
| | - Yuki Bradford
- Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Gianluca Campo
- Cardiology Unit, Azienda Ospedaliero-Universitaria di Ferrara, Ferrara and Maria Cecilia Hospital, GVM Care and Research, Cotignola, Italy
| | - Kiyuk Chang
- Department of Internal Medicine, Cardiology Division, Seoul St. Mary's Hospital, The Catholic University of Korea, Seoul, South Korea
| | - John H Cleator
- Division of Cardiology and Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Jean-Pierre Déry
- Quebec Heart and Lung Institute, University Laval, Quebec City, QC, Canada
| | - Nadia P Dridi
- Department of Cardiology, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
| | - Israel Fernandez-Cadenas
- Neurology, Stroke Pharmacogenomics and Genetics Group, Sant Pau Institute of Research, Barcelona, Spain
| | - Pierre Fontana
- Department of Biomedical Data Science, Stanford University, Stanford, California, USA.,Geneva Platelet Group and Division of Angiology and Haemostasis, University Hospitals of Geneva, Geneva, Switzerland
| | - Meinrad Gawaz
- Department of Cardiology and Angiology, University of Tübingen, Tübingen, Germany
| | - Tobias Geisler
- Department of Cardiology and Angiology, Medizinische Klinik III, University Hospital Tübingen, Tübingen, Germany
| | - Gian Franco Gensini
- Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
| | - Betti Giusti
- Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
| | - Paul A Gurbel
- Sinai Center for Thrombosis Research and Drug Development, Baltimore, Maryland, USA
| | - Willibald Hochholzer
- Department of Cardiology and Angiology II, University Heart Center Freiburg Bad Krozingen, Bad Krozingen, Germany
| | - Lene Holmvang
- Department of Cardiology, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
| | - Eun-Young Kim
- Department of Clinical Pharmacology, Inje University, Busan Paik Hospital, Busan, South Korea
| | - Ho-Sook Kim
- Department of Clinical Pharmacology, Inje University, Busan Paik Hospital, Busan, South Korea
| | - Rossella Marcucci
- Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
| | - Joan Montaner
- Neurovascular Research Laboratory, Vall d'Hebron Institute of Research, Barcelona, Spain
| | - Joshua D Backman
- Department of Medicine and Program for Personalized and Genomic Medicine, University of Maryland, Baltimore, Maryland, USA
| | - Ruth E Pakyz
- Department of Medicine and Program for Personalized and Genomic Medicine, University of Maryland, Baltimore, Maryland, USA
| | - Dan M Roden
- Medicine, Pharmacology, and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Elke Schaeffeler
- Dr. Margarete Fischer-Bosch Institute of Clinical Pharmacology and University of Tübingen, Tübingen, Germany
| | - Matthias Schwab
- Dr. Margarete Fischer-Bosch Institute of Clinical Pharmacology and University of Tübingen, Tübingen, Germany.,Department of Clinical Pharmacology, and Pharmacy and Biochemistry, University of Tübingen, Tübingen, Germany
| | - Jae Gook Shin
- Department of Clinical Pharmacology, Inje University, Busan Paik Hospital, Busan, South Korea.,Department of Pharmacology and Pharmacogenomics Research Center, Inje University, Busan Paik Hospital, Busan, South Korea
| | - Jolanta M Siller-Matula
- Department of Internal Medicine II, Division of Cardiology, Medical University of Vienna, Vienna, Austria.,Department of Experimental and Clinical Pharmacology, Centre for Preclinical Research and Technology (CEPT), Medical University of Warsaw, Warsaw, Poland
| | - Jurriën M Ten Berg
- Department of Cardiology, St. Antonius Center for Platelet Function Research, Nieuwegein, The Netherlands
| | - Dietmar Trenk
- Department of Cardiology and Angiology II, University Heart Center Freiburg Bad Krozingen, Bad Krozingen, Germany.,Department of Clinical Pharmacology, University Heart Centre Freiburg, Bad Krozingen, Germany
| | - Marco Valgimigli
- Department of Cardiology, Swiss Cardiovascular Center Bern, Bern University Hospital, Bern, Switzerland
| | - John Wallace
- Department of Biochemistry and Molecular Biology, Penn State University, University Park, Pennsylvania, USA
| | - Ming-Shien Wen
- Division of Cardiology, Department of Internal Medicine, Chang Gung Memorial Hospital, Linkou and School of Medicine, Chang Gung University, Taoyuan City, Taiwan
| | - Michiaki Kubo
- Center for Integrative Medical Sciences, RIKEN, Yokohama, Japan
| | | | - Ryan Whaley
- Department of Biomedical Data Science, Stanford University, Stanford, California, USA
| | - Stefan Winter
- Dr. Margarete Fischer-Bosch Institute of Clinical Pharmacology and University of Tübingen, Tübingen, Germany
| | - Teri E Klein
- Department of Biomedical Data Science, Stanford University, Stanford, California, USA.,Department of Medicine, Stanford University, Stanford, California, USA
| | - Alan R Shuldiner
- Department of Medicine and Program for Personalized and Genomic Medicine, University of Maryland, Baltimore, Maryland, USA
| | - Marylyn D Ritchie
- Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
|
12
|
Lucas AM, Palmiero NE, McGuigan J, Passero K, Zhou J, Orie D, Ritchie MD, Hall MA. CLARITE Facilitates the Quality Control and Analysis Process for EWAS of Metabolic-Related Traits. Front Genet 2019; 10:1240. [PMID: 31921293 PMCID: PMC6930237 DOI: 10.3389/fgene.2019.01240] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/29/2018] [Accepted: 11/08/2019] [Indexed: 02/03/2023]
Abstract
While genome-wide association studies are an established method of identifying genetic variants associated with disease, environment-wide association studies (EWAS) highlight the contribution of nongenetic components to complex phenotypes. However, the lack of high-throughput quality control (QC) pipelines for EWAS data lends itself to analysis plans where the data are cleaned after a first-pass analysis, which can lead to bias, or are cleaned manually, which is arduous and susceptible to user error. We offer novel software, CLeaning to Analysis: Reproducibility-based Interface for Traits and Exposures (CLARITE), as a tool to efficiently clean environmental data, perform regression analysis, and visualize results on a single platform through user-guided automation. It exists as both an R package and a Python package. Though CLARITE focuses on EWAS, it is intended to also improve the QC process for phenotypes and clinical lab measures for a variety of downstream analyses, including phenome-wide association studies and gene-environment interaction studies. With the goal of demonstrating the utility of CLARITE, we performed a novel EWAS in the National Health and Nutrition Examination Survey (NHANES) (overall N: discovery = 9,063; replication = 9,874) for body mass index (BMI) and over 300 environment variables post-QC, adjusting for sex, age, race, socioeconomic status, and survey year. The analysis used survey weights along with cluster and strata information in order to account for the complex survey design. Sixteen BMI results replicated at a Bonferroni-corrected p < 0.05. The top replicating results were serum levels of γ-tocopherol (vitamin E) (discovery Bonferroni p = 8.67×10⁻¹², replication Bonferroni p = 2.70×10⁻⁹) and iron (discovery Bonferroni p = 1.09×10⁻⁸, replication Bonferroni p = 1.73×10⁻¹⁰). Results of this EWAS are important to consider for metabolic trait analysis, as BMI is tightly associated with these phenotypes. As such, exposures predictive of BMI may be useful for covariate and/or interaction assessment of metabolic-related traits. CLARITE allows improved data quality for EWAS, gene-environment interactions, and phenome-wide association studies by establishing a high-throughput quality control infrastructure. Thus, CLARITE is recommended for studying the environmental factors underlying complex disease.
Affiliation(s)
- Anastasia M Lucas
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, United States
- Nicole E Palmiero
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, PA, United States
- John McGuigan
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, PA, United States
- Kristin Passero
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, PA, United States
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, United States
- Jiayan Zhou
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, PA, United States
- Deven Orie
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, PA, United States
- Marylyn D Ritchie
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, United States
- Molly A Hall
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, PA, United States
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, United States
|
13
|
Manduchi E, Orzechowski PR, Ritchie MD, Moore JH. Exploration of a diversity of computational and statistical measures of association for genome-wide genetic studies. BioData Min 2019; 12:14. [PMID: 31320928 PMCID: PMC6617598 DOI: 10.1186/s13040-019-0201-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 04/02/2019] [Accepted: 06/14/2019] [Indexed: 01/03/2023] Open
Abstract
BACKGROUND The principal line of investigation in Genome Wide Association Studies (GWAS) is the identification of main effects, that is individual Single Nucleotide Polymorphisms (SNPs) which are associated with the trait of interest, independent of other factors. A variety of methods have been proposed to this end, mostly statistical in nature and differing in assumptions and type of model employed. Moreover, for a given model, there may be multiple choices for the SNP genotype encoding. As an alternative to statistical methods, machine learning methods are often applicable. Typically, for a given GWAS, a single approach is selected and utilized to identify potential SNPs of interest. Even when multiple GWAS are combined through meta-analyses within a consortium, each GWAS is typically analyzed with a single approach and the resulting summary statistics are then utilized in meta-analyses. RESULTS In this work we use as case studies a Type 2 Diabetes (T2D) and a breast cancer GWAS to explore a diversity of applicable approaches spanning different methods and encoding choices. We assess similarity of these approaches based on the derived ranked lists of SNPs and, for each GWAS, we identify a subset of representative approaches that we use as an ensemble to derive a union list of top SNPs. Among these are SNPs which are identified by multiple approaches as well as several SNPs identified by only one or a few of the less frequently used approaches. The latter include SNPs from established loci and SNPs which have other supporting lines of evidence in terms of their potential relevance to the traits. CONCLUSIONS Not every main effect analysis method is suitable for every GWAS, but for each GWAS there are typically multiple applicable methods and encoding options. 
We suggest a workflow for a single GWAS, extensible to multiple GWAS from consortia, where representative approaches are selected among a pool of suitable options, to yield a more comprehensive set of SNPs, potentially including SNPs that would typically be missed with the most popular analyses, but that could provide additional valuable insights for follow-up.
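The ensemble idea in this abstract (several approaches each produce a ranked SNP list, and the top SNPs are pooled into a union list) can be sketched as follows. This is only an illustration under stated assumptions: the function name `union_top_snps`, the approach labels, and the fixed cutoff `k` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def union_top_snps(ranked_lists, k):
    """Pool the top-k SNPs of every approach.

    ranked_lists maps an approach label to SNP IDs ordered from most to
    least associated. Returns {snp_id: number of approaches that placed
    the SNP in their top k}.
    """
    support = Counter()
    for snps in ranked_lists.values():
        # set() guards against an ID being counted twice within one approach
        support.update(set(snps[:k]))
    return dict(support)


# Toy rankings from three hypothetical approaches
ranked = {
    "logistic_additive": ["rs1", "rs2", "rs3"],
    "random_forest": ["rs2", "rs4", "rs1"],
    "chi_square": ["rs5", "rs2", "rs1"],
}
support = union_top_snps(ranked, k=2)
print(support)
```

SNPs with a support count of 1 correspond to those found by a single approach, which the abstract notes may still belong to established loci.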
Affiliation(s)
- Elisabetta Manduchi
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA USA
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA USA
- Patryk R. Orzechowski
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA USA
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA USA
- Marylyn D. Ritchie
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA USA
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA USA
- Jason H. Moore
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA USA
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA USA
|
14
|
Zhou H, Sinsheimer JS, Bates DM, Chu BB, German CA, Ji SS, Keys KL, Kim J, Ko S, Mosher GD, Papp JC, Sobel EM, Zhai J, Zhou JJ, Lange K. OPENMENDEL: a cooperative programming project for statistical genetics. Hum Genet 2019; 139:61-71. [PMID: 30915546 DOI: 10.1007/s00439-019-02001-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 10/26/2018] [Accepted: 03/15/2019] [Indexed: 01/06/2023]
Abstract
Statistical methods for genome-wide association studies (GWAS) continue to improve. However, the increasing volume and variety of genetic and genomic data make computational speed and ease of data manipulation mandatory in future software. In our view, a collaborative effort of statistical geneticists is required to develop open source software targeted to genetic epidemiology. Our attempt to meet this need is called the OPENMENDEL project (https://openmendel.github.io). It aims to (1) enable interactive and reproducible analyses with informative intermediate results, (2) scale to big data analytics, (3) embrace parallel and distributed computing, (4) adapt to rapid hardware evolution, (5) allow cloud computing, (6) allow integration of varied genetic data types, and (7) foster easy communication between clinicians, geneticists, statisticians, and computer scientists. This article reviews and makes recommendations to the genetic epidemiology community in the context of the OPENMENDEL project.
Affiliation(s)
- Hua Zhou
  - Department of Biostatistics, UCLA Fielding School of Public Health, Los Angeles, USA
- Janet S Sinsheimer
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, USA
- Douglas M Bates
  - Department of Statistics, University of Wisconsin, Madison, USA
- Benjamin B Chu
  - Department of Biomathematics, David Geffen School of Medicine at UCLA, Los Angeles, USA
- Christopher A German
  - Department of Biostatistics, UCLA Fielding School of Public Health, Los Angeles, USA
- Sarah S Ji
  - Department of Biostatistics, UCLA Fielding School of Public Health, Los Angeles, USA
- Kevin L Keys
  - Department of Medicine, University of California, San Francisco, USA
- Juhyun Kim
  - Department of Biostatistics, UCLA Fielding School of Public Health, Los Angeles, USA
- Seyoon Ko
  - Department of Statistics, Seoul National University, Seoul, South Korea
- Gordon D Mosher
  - Departments of Statistics and Computer Science, University of California, Riverside, USA
- Jeanette C Papp
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, USA
- Eric M Sobel
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, USA
- Jing Zhai
  - Department of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health, University of Arizona, Tucson, USA
- Jin J Zhou
  - Department of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health, University of Arizona, Tucson, USA
- Kenneth Lange
  - Department of Biomathematics, David Geffen School of Medicine at UCLA, Los Angeles, USA
|
15
|
Influence of tissue context on gene prioritization for predicted transcriptome-wide association studies. Pacific Symposium on Biocomputing 2019; 24:296-307. [PMID: 30864331 PMCID: PMC6417797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Abstract
Transcriptome-wide association studies (TWAS) have recently gained great attention due to their ability to prioritize complex trait-associated genes and promote potential therapeutics development for complex human diseases. TWAS integrates genotypic data with expression quantitative trait loci (eQTLs) to predict genetically regulated gene expression components and associates predictions with a trait of interest. As such, TWAS can prioritize genes whose differential expressions contribute to the trait of interest and provide mechanistic explanation of complex trait(s). Tissue-specific eQTL information grants TWAS the ability to perform association analysis on tissues whose gene expression profiles are otherwise hard to obtain, such as liver and heart. However, as eQTLs are tissue context-dependent, whether and how the tissue-specificity of eQTLs influences TWAS gene prioritization has not been fully investigated. In this study, we addressed this question by adopting two distinct TWAS methods, PrediXcan and UTMOST, which assume single tissue and integrative tissue effects of eQTLs, respectively. Thirty-eight baseline laboratory traits in 4,360 antiretroviral treatment-naïve individuals from the AIDS Clinical Trials Group (ACTG) studies comprised the input dataset for TWAS. We performed TWAS in a tissue-specific manner and obtained a total of 430 significant gene-trait associations (q-value < 0.05) across multiple tissues. Single tissue-based analysis by PrediXcan contributed 116 of the 430 associations including 64 unique gene-trait pairs in 28 tissues. Integrative tissue-based analysis by UTMOST found the other 314 significant associations that include 50 unique gene-trait pairs across all 44 tissues. Both analyses were able to replicate some associations identified in past variant-based genome-wide association studies (GWAS), such as high-density lipoprotein (HDL) and CETP (PrediXcan, q-value = 3.2e-16). Both analyses also identified novel associations. 
Moreover, single tissue-based and integrative tissue-based analyses shared 11 of 103 unique gene-trait pairs, for example, PSRC1-low-density lipoprotein (PrediXcan's lowest q-value = 8.5e-06; UTMOST's lowest q-value = 1.8e-05). This study suggests that single tissue-based analysis may have performed better at discovering gene-trait associations when combining results from all tissues. Integrative tissue-based analysis was better at prioritizing genes in multiple tissues and in trait-related tissue. Additional exploration is needed to confirm this conclusion. Finally, although single tissue-based and integrative tissue-based analyses shared significant novel discoveries, tissue context-dependency of eQTLs impacted TWAS gene prioritization. This study provides preliminary data to support continued work on tissue context-dependency of eQTL studies and TWAS.
|
16
|
Zhang X, Veturi Y, Verma S, Bone W, Verma A, Lucas A, Hebbring S, Denny JC, Stanaway IB, Jarvik GP, Crosslin D, Larson EB, Rasmussen-Torvik L, Pendergrass SA, Smoller JW, Hakonarson H, Sleiman P, Weng C, Fasel D, Wei WQ, Kullo I, Schaid D, Chung WK, Ritchie MD. Detecting potential pleiotropy across cardiovascular and neurological diseases using univariate, bivariate, and multivariate methods on 43,870 individuals from the eMERGE network. Pacific Symposium on Biocomputing 2019; 24:272-283. [PMID: 30864329 PMCID: PMC6457436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
The link between cardiovascular diseases and neurological disorders has been widely observed in the aging population. Disease prevention and treatment rely on understanding the potential genetic nexus of multiple diseases in these categories. In this study, we were interested in detecting pleiotropy, or the phenomenon in which a genetic variant influences more than one phenotype. Marker-phenotype association approaches can be grouped into univariate, bivariate, and multivariate categories based on the number of phenotypes considered at one time. Here we applied one statistical method per category followed by an eQTL colocalization analysis to identify potential pleiotropic variants that contribute to the link between cardiovascular and neurological diseases. We performed our analyses on ~530,000 common SNPs coupled with 65 electronic health record (EHR)-based phenotypes in 43,870 unrelated European adults from the Electronic Medical Records and Genomics (eMERGE) network. There were 31 variants identified by all three methods that showed significant associations across late onset cardiac- and neurologic- diseases. We further investigated functional implications of gene expression on the detected "lead SNPs" via colocalization analysis, providing a deeper understanding of the discovered associations. In summary, we present the framework and landscape for detecting potential pleiotropy using univariate, bivariate, multivariate, and colocalization methods. Further exploration of these potentially pleiotropic genetic variants will work toward understanding disease causing mechanisms across cardiovascular and neurological diseases and may assist in considering disease prevention as well as drug repositioning in future research.
Affiliation(s)
- Xinyuan Zhang
  - Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA (*Authors contributed equally to this work)
|
17
|
Manduchi E, Williams SM, Chesi A, Johnson ME, Wells AD, Grant SFA, Moore JH. Leveraging epigenomics and contactomics data to investigate SNP pairs in GWAS. Hum Genet 2018; 137:413-425. [PMID: 29797095 PMCID: PMC5996751 DOI: 10.1007/s00439-018-1893-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/06/2018] [Accepted: 05/20/2018] [Indexed: 12/29/2022]
Abstract
Although Genome Wide Association Studies (GWAS) have led to many valuable insights into the genetic bases of common diseases over the past decade, the issue of missing heritability has surfaced, as the discovered main effect genetic variants found to date do not account for much of a trait's predicted genetic component. We present a workflow, integrating epigenomics and topologically associating domain data, aimed at discovering trait-associated SNP pairs from GWAS where neither SNP achieved independent genome-wide significance. Each analyzed SNP pair consists of one SNP in a putative active enhancer and another SNP in a putative physically interacting gene promoter in a trait-relevant tissue. As a proof-of-principle case study, we used this approach to identify focused collections of SNP pairs that we analyzed in three independent Type 2 diabetes (T2D) GWAS. This approach led us to discover 35 significant SNP pairs, encompassing both novel signals and signals for which we have found orthogonal support from other sources. Nine of these pairs are consistent with eQTL results, two are consistent with our own capture C experiments, and seven involve signals supported by recent T2D literature.
Affiliation(s)
- Elisabetta Manduchi
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA
  - Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Scott M Williams
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA
- Alessandra Chesi
  - Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Matthew E Johnson
  - Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Andrew D Wells
  - Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
- Struan F A Grant
  - Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Genetics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
- Jason H Moore
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA
|
18
|
Verma SS, Lucas A, Zhang X, Veturi Y, Dudek S, Li B, Li R, Urbanowicz R, Moore JH, Kim D, Ritchie MD. Collective feature selection to identify crucial epistatic variants. BioData Min 2018; 11:5. [PMID: 29713383 PMCID: PMC5907720 DOI: 10.1186/s13040-018-0168-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 11/29/2017] [Accepted: 04/04/2018] [Indexed: 01/17/2023] Open
Abstract
Background Machine learning methods have gained popularity and practicality in identifying linear and non-linear effects of variants associated with complex disease/traits. Detection of epistatic interactions still remains a challenge due to the large number of features and relatively small sample size as input, thus leading to the so-called "short fat data" problem. The efficiency of machine learning methods can be increased by limiting the number of input features. Thus, it is very important to perform variable selection before searching for epistasis. Many methods have been evaluated and proposed to perform feature selection, but no single method works best in all scenarios. We demonstrate this by conducting two separate simulation analyses to evaluate the proposed collective feature selection approach. Results Through our simulation study we propose a collective feature selection approach to select features that are in the "union" of the best performing methods. We explored various parametric, non-parametric, and data mining approaches to perform feature selection. We choose our top performing methods to select the union of the resulting variables based on a user-defined percentage of variants selected from each method to take to downstream analysis. Our simulation analysis shows that non-parametric data mining approaches, such as MDR, may work best under one simulation criteria for the high effect size (penetrance) datasets, while non-parametric methods designed for feature selection, such as Ranger and Gradient boosting, work best under other simulation criteria. Thus, using a collective approach proves to be more beneficial for selecting variables with epistatic effects also in low effect size datasets and different genetic architectures. 
Following this, we applied our proposed collective feature selection approach to select the top 1% of variables to identify potential interacting variables associated with Body Mass Index (BMI) in ~44,000 samples obtained from Geisinger's MyCode Community Health Initiative (on behalf of DiscovEHR collaboration). Conclusions In simulation studies, we showed that selecting variables with a collective feature selection approach identified true positive epistatic variables more frequently than applying any single feature selection method. We demonstrated the effectiveness of collective feature selection alongside a comparison of many methods in our simulation analysis. We also applied our method to identify non-linear networks associated with obesity.
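The "union of the top user-defined percentage from each method" step described in this abstract can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: the function name `collective_select`, the method labels, and the importance scores are hypothetical, and it assumes higher scores mean more important variants.

```python
import math


def collective_select(scores_by_method, top_fraction):
    """Union of the top `top_fraction` of variants from each method.

    scores_by_method maps a method label to {variant: importance score}.
    Assumption: a higher score means a more important variant.
    """
    selected = set()
    for scores in scores_by_method.values():
        # Keep at least one variant per method, rounding the cutoff up
        n_keep = max(1, math.ceil(len(scores) * top_fraction))
        ranked = sorted(scores, key=scores.get, reverse=True)
        selected.update(ranked[:n_keep])
    return selected


# Toy importance scores from two hypothetical feature selection methods
scores = {
    "mdr": {"v1": 0.9, "v2": 0.1, "v3": 0.5, "v4": 0.2},
    "gradient_boosting": {"v1": 0.2, "v2": 0.8, "v3": 0.3, "v4": 0.1},
}
top_quarter = collective_select(scores, top_fraction=0.25)
top_half = collective_select(scores, top_fraction=0.5)
print(sorted(top_quarter), sorted(top_half))
```

Because the two toy methods disagree on the best variant, the union retains both candidates, which is the behavior the abstract argues helps across different genetic architectures.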
Affiliation(s)
- Shefali S Verma
  - Biomedical and Translational Bioinformatics Institute, Geisinger Health System, 100 N Academy Avenue, Danville, PA 17822 USA
  - Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
- Anastasia Lucas
  - Biomedical and Translational Bioinformatics Institute, Geisinger Health System, 100 N Academy Avenue, Danville, PA 17822 USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
- Xinyuan Zhang
  - Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
- Yogasudha Veturi
  - Biomedical and Translational Bioinformatics Institute, Geisinger Health System, 100 N Academy Avenue, Danville, PA 17822 USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
- Scott Dudek
  - Biomedical and Translational Bioinformatics Institute, Geisinger Health System, 100 N Academy Avenue, Danville, PA 17822 USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
- Binglan Li
  - Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
- Ruowang Li
  - Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
- Ryan Urbanowicz
  - Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
- Jason H Moore
  - Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
- Dokyoon Kim
  - Biomedical and Translational Bioinformatics Institute, Geisinger Health System, 100 N Academy Avenue, Danville, PA 17822 USA
- Marylyn D Ritchie
  - Biomedical and Translational Bioinformatics Institute, Geisinger Health System, 100 N Academy Avenue, Danville, PA 17822 USA
  - Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Richards Building, 3700 Hamilton Walk, Philadelphia, PA 19104 USA
|
19
|
Verma A, Lucas A, Verma SS, Zhang Y, Josyula N, Khan A, Hartzel DN, Lavage DR, Leader J, Ritchie MD, Pendergrass SA. PheWAS and Beyond: The Landscape of Associations with Medical Diagnoses and Clinical Measures across 38,662 Individuals from Geisinger. Am J Hum Genet 2018; 102:592-608. [PMID: 29606303 PMCID: PMC5985339 DOI: 10.1016/j.ajhg.2018.02.017] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 10/30/2017] [Accepted: 02/20/2018] [Indexed: 01/23/2023] Open
Abstract
Most phenome-wide association studies (PheWASs) to date have used a small to moderate number of SNPs for association with phenotypic data. We performed a large-scale single-cohort PheWAS, using electronic health record (EHR)-derived case-control status for 541 diagnoses using International Classification of Disease version 9 (ICD-9) codes and 25 median clinical laboratory measures. We calculated associations between these diagnoses and traits with ∼630,000 common frequency SNPs with minor allele frequency > 0.01 for 38,662 individuals. In this landscape PheWAS, we explored results within diseases and traits, comparing results to those previously reported in genome-wide association studies (GWASs), as well as previously published PheWASs. We further leveraged the context of functional impact from protein-coding to regulatory regions, providing a deeper interpretation of these associations. The comprehensive nature of this PheWAS allows for novel hypothesis generation, the identification of phenotypes for further study for future phenotypic algorithm development, and identification of cross-phenotype associations.
Affiliation(s)
- Anurag Verma
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
- Anastasia Lucas
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Shefali S Verma
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
- Yu Zhang
  - Department of Statistics, The Pennsylvania State University, University Park, PA 16802, USA
- Navya Josyula
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA 17822, USA
- Anqa Khan
  - Mount Holyoke College, South Hadley, MA 01075, USA
- Dustin N Hartzel
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA 17822, USA
- Daniel R Lavage
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA 17822, USA
- Joseph Leader
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA 17822, USA
- Marylyn D Ritchie
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Sarah A Pendergrass
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA 17822, USA
|
20
|
Verma A, Bradford Y, Dudek S, Lucas AM, Verma SS, Pendergrass SA, Ritchie MD. A simulation study investigating power estimates in phenome-wide association studies. BMC Bioinformatics 2018; 19:120. [PMID: 29618318 PMCID: PMC5885318 DOI: 10.1186/s12859-018-2135-0] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Received: 03/16/2017] [Accepted: 03/26/2018] [Indexed: 01/01/2023] Open
Abstract
Background Phenome-wide association studies (PheWAS) are a high-throughput approach to evaluate comprehensive associations between genetic variants and a wide range of phenotypic measures. PheWAS has varying sample sizes for quantitative traits, and variable numbers of cases and controls for binary traits across the many phenotypes of interest, which can affect the statistical power to detect associations. The motivation of this study is to investigate the various parameters which affect the estimation of statistical power in PheWAS, including sample size, case-control ratio, minor allele frequency, and disease penetrance. Results We performed a PheWAS simulation study, where we investigated variations in statistical power based on different parameters, such as overall sample size, number of cases, case-control ratio, minor allele frequency, and disease penetrance. The simulation was performed on both binary and quantitative phenotypic measures. Our simulation on binary traits suggests that the number of cases has more impact on statistical power than the case to control ratio; also, we found that a sample size of 200 cases or more maintains the statistical power to identify associations for common variants. For quantitative traits, a sample size of 1000 or more individuals performed best in the power calculations. We focused on common genetic variants (MAF > 0.01) in this study; however, in future studies, we will be extending this effort to perform similar simulations on rare variants. Conclusions This study provides a series of PheWAS simulation analyses that can be used to estimate statistical power for some potential scenarios. These results can be used to provide guidelines for appropriate study design for future PheWAS analyses. Electronic supplementary material The online version of this article (10.1186/s12859-018-2135-0) contains supplementary material, which is available to authorized users.
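A Monte Carlo power estimate of the kind this abstract describes can be sketched with the standard library alone. This is a deliberately simplified illustration, not the study's simulation design: it uses an allelic odds ratio in place of disease penetrance, a 1-df allelic chi-square test in place of whatever association model the authors used, and the function names `allelic_chi2_p` and `simulate_power` are hypothetical.

```python
import math
import random


def allelic_chi2_p(case_alt, case_total, ctrl_alt, ctrl_total):
    """P-value of a 1-df chi-square test on a 2x2 allele-count table."""
    a, b = case_alt, case_total - case_alt
    c, d = ctrl_alt, ctrl_total - ctrl_alt
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # degenerate table, no evidence of association
        return 1.0
    chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / denom
    # Survival function of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2.0))


def simulate_power(n_cases, n_controls, maf, odds_ratio,
                   alpha=0.05, n_sims=200, seed=0):
    """Fraction of simulated studies that reject the null at `alpha`."""
    rng = random.Random(seed)
    # Assumption: controls carry the allele at frequency `maf`; the allele
    # frequency in cases follows from the allelic odds ratio.
    odds = maf / (1.0 - maf) * odds_ratio
    case_freq = odds / (1.0 + odds)
    rejections = 0
    for _ in range(n_sims):
        case_alt = sum(rng.random() < case_freq for _ in range(2 * n_cases))
        ctrl_alt = sum(rng.random() < maf for _ in range(2 * n_controls))
        p = allelic_chi2_p(case_alt, 2 * n_cases, ctrl_alt, 2 * n_controls)
        rejections += p < alpha
    return rejections / n_sims


strong = simulate_power(500, 500, maf=0.3, odds_ratio=3.0, n_sims=50)
null = simulate_power(200, 200, maf=0.3, odds_ratio=1.0, n_sims=200)
print(strong, null)
```

Varying `n_cases` while holding the case-control ratio fixed, and vice versa, is one way to explore the abstract's observation that the number of cases matters more than the ratio. Under the null (`odds_ratio=1.0`) the estimate should sit near `alpha`.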
Affiliation(s)
- Anurag Verma
  - Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
  - The Huck Institutes of the Life Science, Pennsylvania State University, University Park, PA, USA
- Yuki Bradford
  - Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
- Scott Dudek
  - Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
- Anastasia M Lucas
  - Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
- Shefali S Verma
  - Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
  - The Huck Institutes of the Life Science, Pennsylvania State University, University Park, PA, USA
- Marylyn D Ritchie
  - Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
  - The Huck Institutes of the Life Science, Pennsylvania State University, University Park, PA, USA
|
21
|
Ritchie MD, Van Steen K. The search for gene-gene interactions in genome-wide association studies: challenges in abundance of methods, practical considerations, and biological interpretation. Annals of Translational Medicine 2018; 6:157. [PMID: 29862246 DOI: 10.21037/atm.2018.04.05] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Indexed: 12/18/2022]
Abstract
One of the primary goals in this era of precision medicine is to understand the biology of human diseases and their treatment, such that each individual patient receives the best possible treatment for their disease based on their genetic and environmental exposures. One way to work towards achieving this goal is to identify the environmental exposures and genetic variants that are relevant to each disease in question, as well as the complex interplay between genes and environment. Genome-wide association studies (GWAS) have allowed for a greater understanding of the genetic component of many complex traits. However, these genetic effects are largely small and thus, our ability to use these GWAS findings for precision medicine is limited. As more and more GWAS have been performed, rather than focusing only on common single nucleotide polymorphisms (SNPs) and additive genetic models, many researchers have begun to explore alternative heritable components of complex traits including rare variants, structural variants, epigenetics, and genetic interactions. Genetic interactions are a plausible reality that could explain some of the heritability that has not yet been identified, especially when one considers the identification of genetic interactions in model organisms as well as our understanding of biological complexity; still, there are significant challenges and considerations in identifying these genetic interactions. Broadly, these can be summarized in three categories: abundance of methods, practical considerations, and biological interpretation. In this review, we will discuss these important elements in the search for genetic interactions along with some potential solutions. While genetic interactions are theoretically understood to be important for complex human disease, the body of evidence is still building to support this component of the underlying genetic architecture of complex human traits. Our hope is that more sophisticated modeling approaches and more robust computational techniques will enable the community to identify these important genetic interactions and improve our ability to implement precision medicine in the future.
Affiliation(s)
- Marylyn D Ritchie
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Kristel Van Steen
- WELBIO, GIGA-R Medical Genomics Unit - BIO3, University of Liège, Liège, Belgium; Department of Human Genetics, University of Leuven, Leuven, Belgium
|
22
|
Cha EDK, Veturi Y, Agarwal C, Patel A, Arbabshirani MR, Pendergrass SA. Using Adipose Measures from Health Care Provider-Based Imaging Data for Discovery. J Obes 2018; 2018:3253096. [PMID: 30363675 PMCID: PMC6180992 DOI: 10.1155/2018/3253096] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 04/02/2018] [Accepted: 07/18/2018] [Indexed: 12/13/2022]
Abstract
The location and type of adipose tissue are important factors in metabolic syndrome. A database of picture archiving and communication system (PACS) derived abdominal computerized tomography (CT) images from a large health care provider, Geisinger, was used for large-scale research of the relationship of volume of subcutaneous adipose tissue (SAT) and visceral adipose tissue (VAT) with obesity-related diseases and clinical laboratory measures. Using a "greedy snake" algorithm and 2,545 CT images from the Geisinger PACS, we measured levels of VAT, SAT, total adipose tissue (TAT), and adipose ratio volumes. Sex-combined and sex-stratified association testing was done between adipose measures and 1,233 disease diagnoses and 37 clinical laboratory measures. A genome-wide association study (GWAS) for adipose measures was also performed. SAT was strongly associated with obesity and morbid obesity. VAT levels were strongly associated with type 2 diabetes-related diagnoses (p = 1.5 × 10^-58), obstructive sleep apnea (p = 7.7 × 10^-37), high-density lipoprotein (HDL) levels (p = 1.42 × 10^-36), triglyceride levels (p = 1.44 × 10^-43), and white blood cell (WBC) counts (p = 7.37 × 10^-9). Sex-stratified tests revealed stronger associations among women, indicating the increased influence of VAT on obesity-related disease outcomes particularly among women. The GWAS identified some suggestive associations. This study supports the utility of pursuing future clinical and genetic discoveries with existing imaging data-derived adipose tissue measures deployed at a larger scale.
Affiliation(s)
- Elliot D. K. Cha
- Biomedical and Translational Informatics Institute, Geisinger Research, Danville, PA, USA
- Yogasudha Veturi
- Biomedical and Translational Informatics Institute, Geisinger Research, Danville, PA, USA
- Chirag Agarwal
- Department of Imaging Science and Innovation, Geisinger Research, Danville, PA, USA
- Department of Electrical & Computer Engineering, University of Illinois at Chicago, Chicago, IL, USA
- Department of Radiology, Geisinger, Danville, PA, USA
- Aalpen Patel
- Department of Imaging Science and Innovation, Geisinger Research, Danville, PA, USA
- Department of Radiology, Geisinger, Danville, PA, USA
- Mohammad R. Arbabshirani
- Biomedical and Translational Informatics Institute, Geisinger Research, Danville, PA, USA
- Department of Imaging Science and Innovation, Geisinger Research, Danville, PA, USA
- Sarah A. Pendergrass
- Biomedical and Translational Informatics Institute, Geisinger Research, Danville, PA, USA
|
23
|
Manduchi E, Chesi A, Hall MA, Grant SFA, Moore JH. Leveraging putative enhancer-promoter interactions to investigate two-way epistasis in Type 2 Diabetes GWAS. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2018; 23:548-558. [PMID: 29218913 PMCID: PMC5728670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
We utilized evidence for enhancer-promoter interactions from functional genomics data in order to build biological filters to narrow down the search space for two-way Single Nucleotide Polymorphism (SNP) interactions in Type 2 Diabetes (T2D) Genome Wide Association Studies (GWAS). This has led us to the identification of a reproducible, statistically significant SNP pair associated with T2D. As more functional genomics data are being generated that can help identify potentially interacting enhancer-promoter pairs in larger collections of tissues/cells, this approach has implications for investigation of epistasis from GWAS in general.
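The idea of a biological filter that narrows the two-way SNP search space can be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: the region coordinates, SNP identifiers, and function names are hypothetical, and the sketch covers only the construction of candidate SNP pairs (one SNP in an enhancer, the other in its putatively interacting promoter), not the downstream interaction test.

```python
from itertools import product


def snps_in_region(snp_positions, region):
    """Return SNP ids whose position falls inside (chrom, start, end), inclusive."""
    chrom, start, end = region
    return sorted(
        snp for snp, (c, pos) in snp_positions.items()
        if c == chrom and start <= pos <= end
    )


def filtered_snp_pairs(snp_positions, enhancer_promoter_pairs):
    """Candidate SNP pairs: one SNP in an enhancer, one in its linked promoter."""
    pairs = set()
    for enhancer, promoter in enhancer_promoter_pairs:
        enh_snps = snps_in_region(snp_positions, enhancer)
        prom_snps = snps_in_region(snp_positions, promoter)
        for a, b in product(enh_snps, prom_snps):
            if a != b:
                pairs.add(tuple(sorted((a, b))))  # unordered pair
    return pairs


def exhaustive_pair_count(n_snps):
    """Number of unordered SNP pairs without any filter: n choose 2."""
    return n_snps * (n_snps - 1) // 2


if __name__ == "__main__":
    # Toy, made-up data: SNP id -> (chromosome, position).
    snps = {
        "snpA": ("chr1", 150),
        "snpB": ("chr1", 5200),
        "snpC": ("chr1", 5300),
        "snpD": ("chr2", 900),
        "snpE": ("chr2", 40000),
    }
    # Hypothetical enhancer-promoter links as (enhancer, promoter) regions.
    links = [
        (("chr1", 100, 200), ("chr1", 5000, 5500)),
        (("chr2", 800, 1000), ("chr2", 39000, 41000)),
    ]
    candidates = filtered_snp_pairs(snps, links)
    print(len(candidates), "of", exhaustive_pair_count(len(snps)), "pairs kept")
```

Testing only the filtered pairs reduces the multiple-testing burden relative to the exhaustive n(n-1)/2 search, which is the motivation the abstract gives for using functional genomics evidence as a filter.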
Affiliation(s)
- Elisabetta Manduchi
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, 3700 Hamilton Walk, Philadelphia, PA 19104, USA; Division of Human Genetics and Endocrinology, The Children's Hospital of Philadelphia, 3615 Civic Center Boulevard, Philadelphia, PA 19104, USA
|
24
|
Abstract
PURPOSE OF REVIEW: Over many decades, researchers have been designing studies to investigate the relationship between genotypes and phenotypes to gain an understanding of the effect of genetics on disease. Recently, a high-throughput approach called phenome-wide association studies (PheWAS) has been extensively used to identify associations between genetic variants and many diseases and traits simultaneously. In this review, we describe the value of PheWAS along with methodological issues and challenges in interpretation for current applications of PheWAS. RECENT FINDINGS: PheWAS have uncovered a paradigm to identify new associations for genetic loci across many diseases. The application of PheWAS has been effective with phenotype data from electronic health records, epidemiological studies, and clinical trial data. SUMMARY: The key strength of a PheWAS is to identify the association of one or more genetic variants with multiple phenotypes, which can showcase interconnections among the phenotypes due to shared genetic associations. While the PheWAS approach appears promising, there are a number of challenges that need to be addressed to provide additional robustness to PheWAS findings.
Affiliation(s)
- Anurag Verma
- Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA
- The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA
- Marylyn D Ritchie
- Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA
- The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA
|