1
Gregg JT, Himes BE, Asselbergs FW, Moore JH. Improving Genetic Association Studies with a Novel Methodology that Unveils the Hidden Complexity of All-Cause Heart Failure. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.08.02.23293567. [PMID: 37577697 PMCID: PMC10418568 DOI: 10.1101/2023.08.02.23293567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
Motivation: Genome-Wide Association Studies (GWAS) commonly assume phenotypic and genetic homogeneity that is not present in complex conditions. We designed Transformative Regression Analysis of Combined Effects (TRACE), a GWAS methodology that better accounts for clinical phenotype heterogeneity and identifies gene-by-environment (GxE) interactions. We demonstrated with UK Biobank (UKB) data that TRACE increased the variance explained in All-Cause Heart Failure (AHF) via the discovery of novel single nucleotide polymorphism (SNP) and SNP-by-environment (i.e., GxE) interaction associations. First, we transformed 312 AHF-related ICD10 codes (including AHF) into continuous low-dimensional features (i.e., latent phenotypes) for a more nuanced disease representation. Then, we ran a standard GWAS on our latent phenotypes to discover main effects and identified GxE interactions with target encoding. Genes near associated SNPs subsequently underwent enrichment analysis to explore potential functional mechanisms underlying associations. Latent phenotypes were regressed against their SNP hits and the estimated latent phenotype values were used to measure the amount of AHF variance explained.
Results: Our method identified over 100 main GWAS effects that were consistent with prior studies and hundreds of novel gene-by-smoking interactions, which collectively accounted for approximately 10% of AHF variance. This represents an improvement over traditional GWAS whose results account for a negligible proportion of AHF variance. Enrichment analyses suggested that hundreds of miRNAs mediated the SNP effect on various AHF-related biological pathways. The TRACE framework can be applied to decode the genetics of other complex diseases.
Availability: All code is available at https://github.com/EpistasisLab/latent_phenotype_project.
Affiliation(s)
- John T. Gregg
  - Department of Biostatistics Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Blanca E. Himes
  - Department of Biostatistics Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Jason H. Moore
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
2
Woodward AA, Urbanowicz RJ, Naj AC, Moore JH. Genetic heterogeneity: Challenges, impacts, and methods through an associative lens. Genet Epidemiol 2022; 46:555-571. [PMID: 35924480 PMCID: PMC9669229 DOI: 10.1002/gepi.22497] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 05/18/2022] [Revised: 07/06/2022] [Accepted: 07/19/2022] [Indexed: 01/07/2023]
Abstract
Genetic heterogeneity describes the occurrence of the same or similar phenotypes through different genetic mechanisms in different individuals. Robustly characterizing and accounting for genetic heterogeneity is crucial to pursuing the goals of precision medicine, for discovering novel disease biomarkers, and for identifying targets for treatments. Failure to account for genetic heterogeneity may lead to missed associations and incorrect inferences. Thus, it is critical to review the impact of genetic heterogeneity on the design and analysis of population level genetic studies, aspects that are often overlooked in the literature. In this review, we first contextualize our approach to genetic heterogeneity by proposing a high-level categorization of heterogeneity into "feature," "outcome," and "associative" heterogeneity, drawing on perspectives from epidemiology and machine learning to illustrate distinctions between them. We highlight the unique nature of genetic heterogeneity as a heterogeneous pattern of association that warrants specific methodological considerations. We then focus on the challenges that preclude effective detection and characterization of genetic heterogeneity across a variety of epidemiological contexts. Finally, we discuss systems heterogeneity as an integrated approach to using genetic and other high-dimensional multi-omic data in complex disease research.
Affiliation(s)
- Alexa A. Woodward
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Ryan J. Urbanowicz
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Adam C. Naj
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Jason H. Moore
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
3
Keith MH, Flinn MV, Durbin HJ, Rowan TN, Blomquist GE, Taylor KH, Taylor JF, Decker JE. Genetic ancestry, admixture, and population structure in rural Dominica. PLoS One 2021; 16:e0258735. [PMID: 34731205 PMCID: PMC8565749 DOI: 10.1371/journal.pone.0258735] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/31/2021] [Accepted: 10/04/2021] [Indexed: 12/23/2022]
Abstract
The Caribbean is a genetically diverse region with heterogeneous admixture compositions influenced by local island ecologies, migrations, colonial conflicts, and demographic histories. The Commonwealth of Dominica is a mountainous island in the Lesser Antilles historically known to harbor communities with unique patterns of migration, mixture, and isolation. This community-based population genetic study adds biological evidence to inform post-colonial narrative histories in a Dominican horticultural village. High density single nucleotide polymorphism data paired with a previously compiled genealogy provide the first genome-wide insights on genetic ancestry and population structure in Dominica. We assessed family-based clustering, inferred global ancestry, and dated recent admixture by implementing the fastSTRUCTURE clustering algorithm, modeling graph-based migration with TreeMix, assessing patterns of linkage disequilibrium decay with ALDER, and visualizing data from Dominica with Human Genome Diversity Panel references. These analyses distinguish family-based genetic structure from variation in African, European, and indigenous Amerindian admixture proportions, and analyses of linkage disequilibrium decay estimate admixture dates 5–6 generations (~160 years) ago. African ancestry accounts for the largest mixture components, followed by European and then indigenous components; however, our global ancestry inferences are consistent with previous mitochondrial, Y chromosome, and ancestry marker data from Dominica that show uniquely higher proportions of indigenous ancestry and lower proportions of African ancestry relative to known admixture in other French- and English-speaking Caribbean islands. Our genetic results support local narratives about the community’s history and founding, which indicate that newly emancipated people settled in the steep, dense vegetation along Dominica’s eastern coast in the mid-19th century. 
Strong genetic signals of post-colonial admixture and family-based structure highlight the localized impacts of colonial forces and island ecologies in this region, and more data from other groups are needed to more broadly inform on Dominica’s complex history and present diversity.
Affiliation(s)
- Monica H. Keith
  - Department of Anthropology, University of Missouri, Columbia, Missouri, United States of America
  - * E-mail: (MHK); (JED)
- Mark V. Flinn
  - Department of Anthropology, University of Missouri, Columbia, Missouri, United States of America
- Harly J. Durbin
  - Division of Animal Sciences, University of Missouri, Columbia, Missouri, United States of America
- Troy N. Rowan
  - Division of Animal Sciences, University of Missouri, Columbia, Missouri, United States of America
  - Genomics Center for the Advancement of Agriculture, University of Tennessee Institute for Agriculture, Knoxville, Tennessee, United States of America
- Gregory E. Blomquist
  - Department of Anthropology, University of Missouri, Columbia, Missouri, United States of America
- Kristen H. Taylor
  - Department of Anatomy and Pathological Sciences, University of Missouri, Columbia, Missouri, United States of America
- Jeremy F. Taylor
  - Division of Animal Sciences, University of Missouri, Columbia, Missouri, United States of America
- Jared E. Decker
  - Division of Animal Sciences, University of Missouri, Columbia, Missouri, United States of America
  - * E-mail: (MHK); (JED)
4
Strom NI, Grove J, Meier SM, Bækvad-Hansen M, Becker Nissen J, Damm Als T, Halvorsen M, Nordentoft M, Mortensen PB, Hougaard DM, Werge T, Mors O, Børglum AD, Crowley JJ, Bybjerg-Grauholm J, Mattheisen M. Polygenic Heterogeneity Across Obsessive-Compulsive Disorder Subgroups Defined by a Comorbid Diagnosis. Front Genet 2021; 12:711624. [PMID: 34531895 PMCID: PMC8438210 DOI: 10.3389/fgene.2021.711624] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/18/2021] [Accepted: 07/27/2021] [Indexed: 01/20/2023]
Abstract
Among patients with obsessive-compulsive disorder (OCD), 65-85% manifest another psychiatric disorder concomitantly or at some other time point during their life. OCD is highly heritable, as are many of its comorbidities. A possible genetic heterogeneity of OCD in relation to its comorbid conditions, however, has not yet been exhaustively explored. We used a framework of different approaches to study the genetic relationship of OCD with three commonly observed comorbidities, namely major depressive disorder (MDD), attention-deficit hyperactivity disorder (ADHD), and autism spectrum disorder (ASD). First, using publicly available summary statistics from large-scale genome-wide association studies, we compared genetic correlation patterns for OCD, MDD, ADHD, and ASD with 861 somatic and mental health phenotypes. Secondly, we examined how polygenic risk scores (PRS) of eight traits that showed heterogeneous correlation patterns with OCD, MDD, ADHD, and ASD partitioned across comorbid subgroups in OCD using independent unpublished data from the Lundbeck Foundation Initiative for Integrative Psychiatric Research (iPSYCH). The comorbid subgroups comprised of patients with only OCD (N = 366), OCD and MDD (N = 1,052), OCD and ADHD (N = 443), OCD and ASD (N = 388), and OCD with more than 1 comorbidity (N = 429). We found that PRS of all traits but BMI were significantly associated with OCD across all subgroups (neuroticism: p = 1.19 × 10⁻³², bipolar disorder: p = 7.51 × 10⁻⁸, anorexia nervosa: p = 3.52 × 10⁻²⁰, age at first birth: p = 9.38 × 10⁻⁵, educational attainment: p = 1.56 × 10⁻⁴, OCD: p = 1.87 × 10⁻⁶, insomnia: p = 2.61 × 10⁻⁵, BMI: p = 0.15). For age at first birth, educational attainment, and insomnia PRS estimates significantly differed across comorbid subgroups (p = 2.29 × 10⁻⁴, p = 1.63 × 10⁻⁴, and p = 0.045, respectively).
Especially for anorexia nervosa, age at first birth, educational attainment, insomnia, and neuroticism the correlation patterns that emerged from genetic correlation analysis of OCD, MDD, ADHD, and ASD were mirrored in the PRS associations with the respective comorbid OCD groups. Dissecting the polygenic architecture, we found both quantitative and qualitative polygenic heterogeneity across OCD comorbid subgroups.
Affiliation(s)
- Nora I. Strom
  - Department of Psychology, Humboldt Universität zu Berlin, Berlin, Germany
  - Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
  - Department of Biomedicine and the iSEQ Center, Aarhus University, Aarhus, Denmark
  - Institute of Psychiatric Phenomics and Genomics (IPPG), University Hospital, LMU Munich, Munich, Germany
- Jakob Grove
  - Department of Biomedicine and the iSEQ Center, Aarhus University, Aarhus, Denmark
  - The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Copenhagen, Denmark
  - Center for Genomics and Personalized Medicine, Aarhus, Denmark
- Sandra M. Meier
  - Department of Psychiatry, Dalhousie University, Halifax, NS, Canada
- Marie Bækvad-Hansen
  - The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Copenhagen, Denmark
  - Center for Neonatal Screening, Department for Congenital Disorders, Statens Serum Institut, Copenhagen, Denmark
- Judith Becker Nissen
  - Center for Child and Adolescent Psychiatry, Aarhus University Hospital Risskov, Risskov, Denmark
- Thomas Damm Als
  - Department of Biomedicine and the iSEQ Center, Aarhus University, Aarhus, Denmark
  - The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Copenhagen, Denmark
  - Center for Genomics and Personalized Medicine, Aarhus, Denmark
- Matthew Halvorsen
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, United States
- Merete Nordentoft
  - The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Copenhagen, Denmark
  - Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
  - Copenhagen Research Centre for Mental Health (CORE), Mental Health Centre Copenhagen, Copenhagen University Hospital, Copenhagen, Denmark
- Preben B. Mortensen
  - The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Copenhagen, Denmark
  - Center for Genomics and Personalized Medicine, Aarhus, Denmark
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Centre for Integrated Register-based Research, Aarhus University, Aarhus, Denmark
- David M. Hougaard
  - The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Copenhagen, Denmark
  - Center for Neonatal Screening, Department for Congenital Disorders, Statens Serum Institut, Copenhagen, Denmark
- Thomas Werge
  - The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Copenhagen, Denmark
  - Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
  - Institute of Biological Psychiatry, Mental Health Services, Copenhagen University Hospital, Copenhagen, Denmark
  - Lundbeck Foundation Center for GeoGenetics, GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
- Ole Mors
  - The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Copenhagen, Denmark
  - Psychosis Research Unit, Aarhus University Hospital, Aarhus, Denmark
- Anders D. Børglum
  - Department of Biomedicine and the iSEQ Center, Aarhus University, Aarhus, Denmark
  - The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Copenhagen, Denmark
  - Center for Genomics and Personalized Medicine, Aarhus, Denmark
- James J. Crowley
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, United States
- Jonas Bybjerg-Grauholm
  - The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Copenhagen, Denmark
  - Center for Neonatal Screening, Department for Congenital Disorders, Statens Serum Institut, Copenhagen, Denmark
- Manuel Mattheisen
  - Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
  - Department of Biomedicine and the iSEQ Center, Aarhus University, Aarhus, Denmark
  - Department of Psychiatry, Dalhousie University, Halifax, NS, Canada
5
Wendt FR, Pathak GA, Overstreet C, Tylee DS, Gelernter J, Atkinson EG, Polimanti R. Characterizing the effect of background selection on the polygenicity of brain-related traits. Genomics 2021; 113:111-119. [PMID: 33278486 PMCID: PMC7855394 DOI: 10.1016/j.ygeno.2020.11.032] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 08/06/2020] [Revised: 11/20/2020] [Accepted: 11/30/2020] [Indexed: 01/10/2023]
Abstract
BACKGROUND Genome-wide association studies (GWAS) have demonstrated that psychopathology phenotypes are affected by many risk alleles with small effect (polygenicity). It is unclear how ubiquitously evolutionary pressures influence the genetic architecture of these traits. METHODS We partitioned SNP heritability to assess the contribution of background (BGS) and positive selection, Neanderthal local ancestry, functional significance, and genotype networks in 75 brain-related traits (8411 ≤ N ≤ 1,131,181, mean N = 205,289). We applied binary annotations by dichotomizing each measure based on top 2%, 1%, and 0.5% of all scores genome-wide. Effect size distribution features were calculated using GENESIS. We tested the relationship between effect size distribution descriptive statistics and natural selection. In a subset of traits, we explore the inclusion of diagnostic heterogeneity (e.g., number of diagnostic combinations and total symptoms) in the tested relationship. RESULTS SNP-heritability was enriched (false discovery rate q < 0.05) for loci with elevated BGS (7 phenotypes) and in genic (34 phenotypes) and loss-of-function (LoF)-intolerant regions (67 phenotypes). These effects were strongest in GWAS of schizophrenia (1.90-fold BGS, 1.16-fold genic, and 1.92-fold LoF), educational attainment (1.86-fold BGS, 1.12-fold genic, and 1.79-fold LoF), and cognitive performance (2.29-fold BGS, 1.12-fold genic, and 1.79-fold LoF). BGS (top 2%) significantly predicted effect size variance for trait-associated loci (σ² parameter) in 75 brain-related traits (β = 4.39 × 10⁻⁵, p = 1.43 × 10⁻⁵, model r² = 0.548). Considering the number of DSM-5 diagnostic combinations per psychiatric disorder improved model fit (σ² ~ BTop2% × Genic × diagnostic combinations; model r² = 0.661). CONCLUSIONS Brain-related phenotypes with larger variance in risk locus effect sizes are associated with loci under BGS.
We show exploratory results suggesting that diagnostic complexity may also contribute to the increased polygenicity of psychiatric disorders.
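The dichotomization step named in the METHODS (binary annotations from the top 2%, 1%, and 0.5% of genome-wide scores) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `top_fraction_annotation` is hypothetical, and the sketch assumes that larger scores are the more extreme ones and that the number of flagged entries is the rounded product of list length and fraction. For a measure where smaller values are more extreme, the scores would need to be negated first.

```python
def top_fraction_annotation(scores, fraction):
    """Return 0/1 flags marking the highest-scoring `fraction` of entries.

    Assumption (not from the abstract): higher score means more extreme,
    and at least one entry is always flagged.
    """
    if not scores:
        return []
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    n_top = max(1, round(len(scores) * fraction))
    # Rank positions by descending score; ties keep their original order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = set(ranked[:n_top])
    return [1 if i in top else 0 for i in range(len(scores))]


# One annotation per cut-off mentioned in the abstract.
scores = list(range(1000))
annotations = {f: top_fraction_annotation(scores, f) for f in (0.02, 0.01, 0.005)}
```

Each resulting list is a binary annotation of the kind that a heritability-partitioning tool would take as input, one per cut-off.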
Affiliation(s)
- Frank R Wendt
  - Department of Psychiatry, Yale School of Medicine and VA CT Healthcare System, West Haven, CT 06516, USA
- Gita A Pathak
  - Department of Psychiatry, Yale School of Medicine and VA CT Healthcare System, West Haven, CT 06516, USA
- Cassie Overstreet
  - National Center for Posttraumatic Stress Disorder, Clinical Neurosciences Division, VA CT Healthcare System and Department of Psychiatry, Yale University School of Medicine, USA
- Daniel S Tylee
  - Department of Psychiatry, Yale School of Medicine and VA CT Healthcare System, West Haven, CT 06516, USA
- Joel Gelernter
  - Department of Psychiatry, Yale School of Medicine and VA CT Healthcare System, West Haven, CT 06516, USA; Departments of Genetics and Neuroscience, Yale University School of Medicine, New Haven, CT 06510, USA
- Elizabeth G Atkinson
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Renato Polimanti
  - Department of Psychiatry, Yale School of Medicine and VA CT Healthcare System, West Haven, CT 06516, USA
6
Zhang Q, Cai Z, Lhomme M, Sahana G, Lesnik P, Guerin M, Fredholm M, Karlskov-Mortensen P. Inclusion of endophenotypes in a standard GWAS facilitate a detailed mechanistic understanding of genetic elements that control blood lipid levels. Sci Rep 2020; 10:18434. [PMID: 33116219 PMCID: PMC7595098 DOI: 10.1038/s41598-020-75612-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/26/2020] [Accepted: 10/15/2020] [Indexed: 12/13/2022]
Abstract
Dyslipidemia is the primary cause of cardiovascular disease, which is a serious human health problem in large parts of the world. Therefore, it is important to understand the genetic and molecular mechanisms that regulate blood levels of cholesterol and other lipids. Discovery of genetic elements in the regulatory machinery is often based on genome-wide association studies (GWAS) focused on end-point phenotypes such as total cholesterol level or a disease diagnosis. In the present study, we add endophenotypes, such as serum levels of intermediate metabolites in the cholesterol synthesis pathways, to a GWAS analysis and use the pig as an animal model. We do this to increase statistical power and to facilitate biological interpretation of results. Although the study population was limited to ~300 individuals, we identify two genome-wide significant associations and ten suggestive associations. Furthermore, we identify 28 tentative associations to loci previously associated with blood lipids or dyslipidemia-associated diseases. The associations with endophenotypes may inspire future studies that can dissect the biological mechanisms underlying these previously identified associations and add a new level of understanding to them.
Affiliation(s)
- Qianqian Zhang
  - Bioinformatics Research Centre (BiRC), Aarhus University, C.F.Møllers Allé 8, 8000, Aarhus C, Denmark
- Zexi Cai
  - Center for Quantitative Genetics and Genomics, Aarhus University, Blichers Allé 20, 8830, Tjele, Denmark
- Marie Lhomme
  - ICANalytics, Institute of Cardiometabolism and Nutrition (ICAN), 47-83 boulevard de l'hôpital, 75013, Paris, France
- Goutam Sahana
  - Center for Quantitative Genetics and Genomics, Aarhus University, Blichers Allé 20, 8830, Tjele, Denmark
- Philippe Lesnik
  - Unité de Recherche sur les maladies cardiovasculaires, le métabolisme et la nutrition, INSERM UMR_S 1166, ICAN Institute of Cardiometabolism & Nutrition, Faculté de Médecine Sorbonne Université, Sorbonne Université, 4ème étage, Bureau 421, 91, boulevard de l'Hôpital, 75634, Paris Cedex 13, France
- Maryse Guerin
  - Unité de Recherche sur les maladies cardiovasculaires, le métabolisme et la nutrition, INSERM UMR_S 1166, ICAN Institute of Cardiometabolism & Nutrition, Faculté de Médecine Sorbonne Université, Sorbonne Université, 4ème étage, Bureau 421, 91, boulevard de l'Hôpital, 75634, Paris Cedex 13, France
- Merete Fredholm
  - Animal Genetics, Bioinformatics and Breeding, Department of Veterinary and Animal Sciences, University of Copenhagen, Gronnegaardsvej 3, 1870, Frederiksberg C, Denmark
- Peter Karlskov-Mortensen
  - Animal Genetics, Bioinformatics and Breeding, Department of Veterinary and Animal Sciences, University of Copenhagen, Gronnegaardsvej 3, 1870, Frederiksberg C, Denmark
7
Kulminski AM, Loika Y, Nazarian A, Culminskaya I. Quantitative and Qualitative Role of Antagonistic Heterogeneity in Genetics of Blood Lipids. J Gerontol A Biol Sci Med Sci 2020; 75:1811-1819. [PMID: 31566214 DOI: 10.1093/gerona/glz225] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/14/2019] [Indexed: 12/18/2022]
Abstract
Prevailing strategies in genome-wide association studies (GWAS) mostly rely on principles of medical genetics emphasizing the one gene, one function, one phenotype concept. Here, we performed GWAS of blood lipids leveraging a new systemic concept emphasizing the complexity of genetic predisposition to such phenotypes. We focused on total cholesterol, low- and high-density lipoprotein cholesterols, and triglycerides available for 29,902 individuals of European ancestry from seven independent studies, men and women combined. To implement the new concept, we leveraged the inherent heterogeneity in genetic predisposition to such complex phenotypes and emphasized a new counterintuitive phenomenon of antagonistic genetic heterogeneity, which is characterized by misalignment of the directions of genetic effects and the phenotype correlation. This analysis identified 37 loci associated with blood lipids but only one locus, FBXO33, was not reported in previous top GWAS. We, however, found a strong effect of antagonistic heterogeneity that led to profound (quantitative and qualitative) changes in the associations with blood lipids in most, 25 of 37 or 68%, loci. These changes suggested new roles for some genes whose functions were considered well established, such as GCKR, SIK3 (APOA1 locus), LIPC, and LIPG, among others. The antagonistic heterogeneity highlighted a new class of genetic associations emphasizing beneficial and adverse trade-offs in predisposition to lipids. Our results argue that rigorous analyses dissecting heterogeneity in genetic predisposition to complex traits such as lipids beyond those implemented in current GWAS are required to facilitate translation of genetic discoveries into health care.
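The definition of antagonistic genetic heterogeneity given in the abstract (directions of genetic effects misaligned with the phenotype correlation) can be made concrete with a small Python sketch. It is an illustration of that definition only, not the authors' analysis; the function name `is_antagonistic` and the two-phenotype setting are assumptions made here.

```python
def is_antagonistic(beta_1, beta_2, phenotype_corr):
    """Flag a SNP whose effect directions disagree with the phenotype correlation.

    beta_1, beta_2: estimated SNP effects on two phenotypes.
    phenotype_corr: correlation between those two phenotypes.

    Aligned case: positively correlated phenotypes with same-sign effects,
    or negatively correlated phenotypes with opposite-sign effects.
    Anything else with non-zero inputs is treated as antagonistic.
    A zero effect or zero correlation has no direction, so it is not flagged.
    """
    return beta_1 * beta_2 * phenotype_corr < 0


# Hypothetical numbers: two positively correlated lipid traits on which
# one SNP has opposite-sign effects.
print(is_antagonistic(0.2, -0.1, 0.5))
```

A real analysis would also need to account for the uncertainty of each effect estimate, which this sign check ignores.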
Affiliation(s)
- Alexander M Kulminski
  - Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, North Carolina
- Yury Loika
  - Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, North Carolina
- Alireza Nazarian
  - Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, North Carolina
- Irina Culminskaya
  - Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, North Carolina
8
Lawson DJ, Davies NM, Haworth S, Ashraf B, Howe L, Crawford A, Hemani G, Davey Smith G, Timpson NJ. Is population structure in the genetic biobank era irrelevant, a challenge, or an opportunity? Hum Genet 2020; 139:23-41. [PMID: 31030318 PMCID: PMC6942007 DOI: 10.1007/s00439-019-02014-8] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 11/24/2018] [Accepted: 04/12/2019] [Indexed: 12/11/2022]
Abstract
Replicable genetic association signals have consistently been found through genome-wide association studies in recent years. The recent dramatic expansion of study sizes improves power of estimation of effect sizes, genomic prediction, causal inference, and polygenic selection, but it simultaneously increases susceptibility of these methods to bias due to subtle population structure. Standard methods using genetic principal components to correct for structure might not always be appropriate and we use a simulation study to illustrate when correction might be ineffective for avoiding biases. New methods such as trans-ethnic modeling and chromosome painting allow for a richer understanding of the relationship between traits and population structure. We illustrate the arguments using real examples (stroke and educational attainment) and provide a more nuanced understanding of population structure, which is set to be revisited as a critical aspect of future analyses in genetic epidemiology. We also make simple recommendations for how problems can be avoided in the future. Our results have particular importance for the implementation of GWAS meta-analysis, for prediction of traits, and for causal inference.
Affiliation(s)
- Daniel John Lawson
  - MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK
- Neil Martin Davies
  - MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK
- Simon Haworth
  - MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK
- Bilal Ashraf
  - MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK
- Laurence Howe
  - Institute of Cardiovascular Science, Faculty of Population Health Sciences, University College London, Gower Street, London, WC1E 6BT, UK
- Andrew Crawford
  - MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK
- Gibran Hemani
  - MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK
- George Davey Smith
  - MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK
- Nicholas John Timpson
  - MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK
9
The impact of disregarding family structure on genome-wide association analysis of complex diseases in cohorts with simple pedigrees. J Appl Genet 2019; 61:75-86. [PMID: 31755004 DOI: 10.1007/s13353-019-00526-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/21/2019] [Revised: 09/19/2019] [Accepted: 10/10/2019] [Indexed: 12/12/2022]
Abstract
The generalized linear mixed models (GLMMs) methodology is the standard framework for genome-wide association studies (GWAS) of complex diseases in family-based cohorts. Fitting GLMMs in very large cohorts, however, can be computationally demanding. Also, the modified versions of GLMM using faster algorithms may underperform, for instance when a single nucleotide polymorphism (SNP) is correlated with fixed-effects covariates. We investigated the extent to which disregarding family structure may compromise GWAS in cohorts with simple pedigrees by contrasting logistic regression models (i.e., with no family structure) to three LMMs-based ones. Our analyses showed that the logistic regression models in general resulted in smaller P values compared with the LMMs-based models; however, the differences in P values were mostly minor. Disregarding family structure had little impact on determining disease-associated SNPs at genome-wide level of significance (i.e., P < 5E-08) as the four P values resulted from the tested methods for any SNP were all below or all above 5E-08. Nevertheless, larger discrepancies were detected between logistic regression and LMMs-based models at suggestive level of significance (i.e., of 5E-08 ≤ P < 5E-06). The SNP effects estimated by the logistic regression models were not statistically different from those estimated by GLMMs that implemented Wald's test. However, several SNP effects were significantly different from their counterparts in LMMs analyses. We suggest that fitting GLMMs with Wald's test on a pre-selected subset of SNPs obtained from logistic regression models can ensure the balance between the speed of analyses and the accuracy of parameters.
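The two significance levels quoted in the abstract, and the check of whether several methods place a SNP on the same side of them, can be written out as a minimal Python sketch. This is not the authors' code: the names `significance_tier` and `methods_agree` are hypothetical, and only the cut-offs (P < 5E-08 for genome-wide, 5E-08 ≤ P < 5E-06 for suggestive) come from the abstract.

```python
GENOME_WIDE = 5e-8   # genome-wide level: P < 5E-08
SUGGESTIVE = 5e-6    # suggestive level: 5E-08 <= P < 5E-06


def significance_tier(p):
    """Return the tier of a single P value using the abstract's cut-offs."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"P value out of range: {p}")
    if p < GENOME_WIDE:
        return "genome-wide"
    if p < SUGGESTIVE:
        return "suggestive"
    return "none"


def methods_agree(p_values):
    """True when every method places the SNP in the same tier."""
    return len({significance_tier(p) for p in p_values}) == 1


# Hypothetical P values for one SNP from four models
# (one logistic regression and three mixed-model variants).
print(methods_agree([1e-9, 3e-8, 2e-10, 4e-8]))
```

Applied per SNP, a check of this kind separates the variants on which all models agree from those near a threshold, where the abstract reports the larger discrepancies.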
|
10
|
Abstract
Risk of disease is multifactorial and can be shaped by socio-economic, demographic, cultural, environmental and genetic factors. Our understanding of the genetic determinants of disease risk has greatly advanced with the advent of genome-wide association studies (GWAS), which detect associations between genetic variants and complex traits or diseases by comparing populations of cases and controls. However, much of this discovery has occurred through GWAS of individuals of European ancestry, with limited representation of other populations, including those from Africa, the Americas, Asia and Oceania. Population demography, genetic drift and adaptation to environments over thousands of years have led globally to the diversification of populations. This global genomic diversity can provide new opportunities for discovery and translation into therapies, as well as a better understanding of population disease risk. Large-scale multi-ethnic and representative biobanks and population health resources provide unprecedented opportunities to understand the genetic determinants of disease on a global scale.
|
11
|
Kulminski AM, Loika Y, Huang J, Arbeev KG, Bagley O, Ukraintseva S, Yashin AI, Culminskaya I. Pleiotropic Meta-Analysis of Age-Related Phenotypes Addressing Evolutionary Uncertainty in Their Molecular Mechanisms. Front Genet 2019; 10:433. [PMID: 31134135 PMCID: PMC6524409 DOI: 10.3389/fgene.2019.00433] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 12/13/2018] [Accepted: 04/24/2019] [Indexed: 12/21/2022]
Abstract
Age-related phenotypes are characterized by genetic heterogeneity attributed to an uncertain role of evolution in establishing their molecular mechanisms. Here, we performed univariate and pleiotropic meta-analyses of 24 age-related phenotypes dealing with such evolutionary uncertainty and leveraging longitudinal information. Our analysis identified 237 novel single nucleotide polymorphisms (SNPs) in 199 loci with phenotype-specific (61 SNPs) and pleiotropic (176 SNPs) associations and replicated associations for 160 SNPs in 68 loci in a modest sample of 26,371 individuals from five longitudinal studies. Most pleiotropic associations (65.3%, 115 of 176 SNPs) were impacted by heterogeneity, with the natural-selection-free genetic heterogeneity as its inevitable component. This pleiotropic heterogeneity was dominated (93%, 107 of 115 SNPs) by antagonistic genetic heterogeneity, a phenomenon that is characterized by antagonistic directions of genetic effects for directly correlated phenotypes. Genetic association studies of age-related phenotypes addressing the evolutionary uncertainty in establishing their molecular mechanisms have power to substantially improve the efficiency of the analyses. A dominant form of heterogeneous pleiotropy, antagonistic genetic heterogeneity, provides unprecedented insight into the genetic origin of age-related phenotypes and side effects in medical care that is counter-intuitive in medical genetics but naturally expected when molecular mechanisms of age-related phenotypes are not due to direct evolutionary selection.
Affiliation(s)
- Alexander M Kulminski
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Yury Loika
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Jian Huang
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Konstantin G Arbeev
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Olivia Bagley
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Svetlana Ukraintseva
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Anatoliy I Yashin
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Irina Culminskaya
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
|
12
|
Nazarian A, Yashin AI, Kulminski AM. Genome-wide analysis of genetic predisposition to Alzheimer's disease and related sex disparities. Alzheimers Res Ther 2019; 11:5. [PMID: 30636644 PMCID: PMC6330399 DOI: 10.1186/s13195-018-0458-8] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Received: 03/22/2018] [Accepted: 12/06/2018] [Indexed: 12/21/2022]
Abstract
BACKGROUND Alzheimer's disease (AD) is the most common cause of dementia in the elderly and the sixth leading cause of death in the United States. AD is mainly considered a complex disorder with polygenic inheritance. Despite discovering many susceptibility loci, a major proportion of AD genetic variance remains to be explained. METHODS We investigated the genetic architecture of AD in four publicly available independent datasets through genome-wide association, transcriptome-wide association, and gene-based and pathway-based analyses. To explore differences in the genetic basis of AD between males and females, analyses were performed on three samples in each dataset: males and females combined, only males, or only females. RESULTS Our genome-wide association analyses corroborated the associations of several previously detected AD loci and revealed novel significant associations of 35 single-nucleotide polymorphisms (SNPs) outside the chromosome 19q13 region at the suggestive significance level of p < 5E-06. These SNPs were mapped to 21 genes in 19 chromosomal regions. Of these, 17 genes were not associated with AD at genome-wide or suggestive levels of associations by previous genome-wide association studies. Also, the chromosomal regions corresponding to 8 genes did not contain any previously detected AD-associated SNPs with p < 5E-06. Our transcriptome-wide association and gene-based analyses revealed that 26 genes located in 20 chromosomal regions outside chromosome 19q13 had evidence of potential associations with AD at a false discovery rate of 0.05. Of these, 13 genes/regions did not contain any previously AD-associated SNPs at genome-wide or suggestive levels of associations. Most of the newly detected AD-associated SNPs and genes were sex specific, indicating sex disparities in the genetic basis of AD. Also, 7 of 26 pathways that showed evidence of associations with AD in our pathway-based analyses were significant only in females.
CONCLUSIONS Our findings, particularly the newly discovered sex-specific genetic contributors, provide novel insight into the genetic architecture of AD and can advance our understanding of its pathogenesis.
Affiliation(s)
- Alireza Nazarian
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Erwin Mill Building, 2024 W. Main St., Durham, NC, 27705, USA
- Anatoliy I Yashin
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Erwin Mill Building, 2024 W. Main St., Durham, NC, 27705, USA
- Alexander M Kulminski
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Erwin Mill Building, 2024 W. Main St., Durham, NC, 27705, USA
|
13
|
Marigorta UM, Rodríguez JA, Gibson G, Navarro A. Replicability and Prediction: Lessons and Challenges from GWAS. Trends Genet 2018; 34:504-517. [PMID: 29716745 PMCID: PMC6003860 DOI: 10.1016/j.tig.2018.03.005] [Citation(s) in RCA: 111] [Impact Index Per Article: 15.9] [Received: 01/24/2018] [Revised: 03/12/2018] [Accepted: 03/26/2018] [Indexed: 12/29/2022]
Abstract
Since the publication of the Wellcome Trust Case Control Consortium (WTCCC) landmark study a decade ago, genome-wide association studies (GWAS) have led to the discovery of thousands of risk variants involved in disease etiology. This success story has two angles that are often overlooked. First, GWAS findings are highly replicable. This is an unprecedented phenomenon in complex trait genetics, and indeed in many areas of science, which in past decades have been plagued by false positives. At a time of increasing concerns about the lack of reproducibility, we examine the biological and methodological reasons that account for the replicability of GWAS and identify the challenges ahead. In contrast to the exemplary success of disease gene discovery, at present GWAS findings are not useful for predicting phenotypes. We close with an overview of the prospects for individualized prediction of disease risk and its foreseeable impact in clinical practice.
Affiliation(s)
- Urko M Marigorta
- Center for Integrative Genomics, Georgia Institute of Technology, Atlanta, GA, USA; These authors contributed equally
- Juan Antonio Rodríguez
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Catalonia, Spain; Gene Regulation, Stem Cells and Cancer Program, Centre for Genomic Regulation (CRG), Barcelona, Catalonia, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Catalonia, Spain; These authors contributed equally. https://twitter.com/jrotwitguez
- Greg Gibson
- Center for Integrative Genomics, Georgia Institute of Technology, Atlanta, GA, USA
- Arcadi Navarro
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Catalonia, Spain; Institute of Evolutionary Biology (UPF-CSIC), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain; National Institute for Bioinformatics (INB), Barcelona, Catalonia, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), PRBB, Barcelona, Catalonia, Spain
|
14
|
Kulminski AM, Huang J, Loika Y, Arbeev KG, Bagley O, Yashkin A, Duan M, Culminskaya I. Strong impact of natural-selection-free heterogeneity in genetics of age-related phenotypes. Aging (Albany NY) 2018; 10:492-514. [PMID: 29615537 PMCID: PMC5892700 DOI: 10.18632/aging.101407] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 12/12/2017] [Accepted: 03/24/2018] [Indexed: 11/25/2022]
Abstract
A conceptual difficulty in the genetics of age-related phenotypes that make individuals vulnerable to disease in post-reproductive life is genetic heterogeneity attributed to an undefined role of evolution in establishing their molecular mechanisms. Here, we performed univariate and pleiotropic genome-wide meta-analyses of 20 age-related phenotypes, leveraging longitudinal information in a sample of 33,431 individuals and dealing with the natural-selection-free genetic heterogeneity. We identified 142 non-proxy single nucleotide polymorphisms (SNPs) with phenotype-specific (18 SNPs) and pleiotropic (124 SNPs) associations at the genome-wide level. Univariate meta-analysis identified two novel (11.1%) and replicated 16 SNPs, whereas pleiotropic meta-analysis identified 115 novel (92.7%) and nine replicated SNPs. Pleiotropic associations for most novel (93.9%) and all replicated SNPs were strongly impacted by the natural-selection-free genetic heterogeneity in its unconventional form of antagonistic heterogeneity, implying antagonistic directions of genetic effects for directly correlated phenotypes. Our results show that the common genome-wide approach is well adapted to handle homogeneous univariate associations within the Mendelian framework, whereas most associations with age-related phenotypes are more complex and well beyond that framework. Dissecting the natural-selection-free genetic heterogeneity is critical for gaining insights into the genetics of age-related phenotypes and has substantial and as yet unexplored potential for improving the efficiency of genome-wide analysis.
Affiliation(s)
- Alexander M. Kulminski
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
- Jian Huang
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
- Yury Loika
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
- Konstantin G. Arbeev
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
- Olivia Bagley
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
- Arseniy Yashkin
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
- Matt Duan
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
- Irina Culminskaya
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
|