1
Kontou PI, Bagos PG. The goldmine of GWAS summary statistics: a systematic review of methods and tools. BioData Min 2024; 17:31. [PMID: 39238044 PMCID: PMC11375927 DOI: 10.1186/s13040-024-00385-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Accepted: 08/27/2024] [Indexed: 09/07/2024] Open
Abstract
Genome-wide association studies (GWAS) have revolutionized our understanding of the genetic architecture of complex traits and diseases. GWAS summary statistics have become essential tools for various genetic analyses, including meta-analysis, fine-mapping, and risk prediction. However, the increasing number of GWAS summary statistics and the diversity of software tools available for their analysis can make it challenging for researchers to select the most appropriate tools for their specific needs. This systematic review aims to provide a comprehensive overview of the currently available software tools and databases for GWAS summary statistics analysis. We conducted a comprehensive literature search to identify relevant software tools and databases. We categorized the tools and databases by their functionality, including data management, quality control, single-trait analysis, and multiple-trait analysis. We also compared the tools and databases based on their features, limitations, and user-friendliness. Our review identified a total of 305 functioning software tools and databases dedicated to GWAS summary statistics, each with unique strengths and limitations. We provide descriptions of the key features of each tool and database, including their input/output formats, data types, and computational requirements. We also discuss the overall usability and applicability of each tool for different research scenarios. This comprehensive review will serve as a valuable resource for researchers who are interested in using GWAS summary statistics to investigate the genetic basis of complex traits and diseases. By providing a detailed overview of the available tools and databases, we aim to facilitate informed tool selection and maximize the effectiveness of GWAS summary statistics analysis.
Affiliation(s)
- Pantelis G Bagos
- Department of Computer Science and Biomedical Informatics, University of Thessaly, 35131, Lamia, Greece.
2
Reinert S. Quantitative genetics of pleiotropy and its potential for plant sciences. JOURNAL OF PLANT PHYSIOLOGY 2022; 276:153784. [PMID: 35944292 DOI: 10.1016/j.jplph.2022.153784] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/30/2022] [Revised: 07/14/2022] [Accepted: 07/18/2022] [Indexed: 06/15/2023]
Affiliation(s)
- Stephan Reinert
- Friedrich-Alexander-University Erlangen-Nürnberg, Department of Biology, Division of Biochemistry, Biocomputing Lab, Staudtstraße 5, 91058, Erlangen, Germany.
3
Aguate FM, Vazquez AI, Merriman TR, de Los Campos G. Mapping pleiotropic loci using a fast-sequential testing algorithm. Eur J Hum Genet 2021; 29:1762-1773. [PMID: 34145383 PMCID: PMC8633382 DOI: 10.1038/s41431-021-00911-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/15/2021] [Revised: 04/27/2021] [Accepted: 05/19/2021] [Indexed: 02/07/2023] Open
Abstract
Pleiotropy (i.e., genes with effects on multiple traits) leads to genetic correlations between traits and contributes to the development of many syndromes. Identifying variants with pleiotropic effects on multiple health-related traits can improve the biological understanding of gene action and disease etiology, and can help to advance disease-risk prediction. Sequential testing is a powerful approach for mapping genes with pleiotropic effects. However, the existing methods and the available software do not scale to analyses involving millions of SNPs and large datasets. This has limited the adoption of sequential testing for pleiotropy mapping at large scale. In this study, we present a sequential test and software that can be used to test pleiotropy in large systems of traits with biobank-sized data. Using simulations, we show that the methods implemented in the software are powerful and have adequate type-I error rate control. To demonstrate the use of the methods and software, we present a whole-genome scan in search of loci with pleiotropic effects on seven traits related to metabolic syndrome (MetS) using UK-Biobank data (n~300 K distantly related white European participants). We found abundant pleiotropy and report 170, 44, and 18 genomic regions harboring SNPs with pleiotropic effects in at least two, three, and four of the seven traits, respectively. We validate our results using previous studies documented in the GWAS-catalog and using data from GTEx. Our results confirm previously reported loci and lead to several novel discoveries that link MetS-related traits through plausible biological pathways.
Affiliation(s)
- Fernando M Aguate
- Department of Epidemiology & Biostatistics, IQ - Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, USA.
- Ana I Vazquez
- Department of Epidemiology & Biostatistics, IQ - Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, USA
- Tony R Merriman
- Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Gustavo de Los Campos
- Department of Epidemiology & Biostatistics, IQ - Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, USA.
- Department of Statistics & Probability, Michigan State University, East Lansing, MI, USA.
4
Di Narzo A, Frades I, Crane HM, Crane PK, Hulot JS, Kasarskis A, Hart A, Argmann C, Dubinsky M, Peter I, Hao K. Meta-analysis of sample-level dbGaP data reveals novel shared genetic link between body height and Crohn's disease. Hum Genet 2021; 140:865-877. [PMID: 33452914 DOI: 10.1007/s00439-020-02250-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/02/2020] [Accepted: 12/19/2020] [Indexed: 12/29/2022]
Abstract
To further explore genetic links between complex traits, we developed a comprehensive framework to harmonize and integrate extensive genotype and phenotype data from the four well-characterized cohorts with the focus on cardiometabolic diseases deposited to the database of Genotypes and Phenotypes (dbGaP). We generated a series of polygenic risk scores (PRS) to investigate pleiotropic effects of loci that confer genetic risk for 19 common diseases and traits on body height, type 2 diabetes (T2D), and myocardial infarction (MI). In a meta-analysis of 20,021 subjects, we identified shared genetic determinants of Crohn's Disease (CD), a type of inflammatory bowel disease, and body height (p = 5.5 × 10⁻⁵). The association of PRS-CD with height was replicated in UK Biobank (p = 1.1 × 10⁻⁵) and an independent cohort of 510 CD cases and controls (1.57 cm shorter height per PRS-CD interquartile increase, p = 5.0 × 10⁻³ and a 28% reduction in CD risk per interquartile increase in PRS-height, p = 1.1 × 10⁻³, with the effect independent of CD diagnosis). A pathway analysis of the variants overlapping between PRS-height and PRS-CD detected significant enrichment of genes from the inflammatory, immune-mediated and growth factor regulation pathways. This finding supports the clinical observation of growth failure in patients with childhood-onset CD and demonstrates the value of using individual-level data from dbGaP in searching for shared genetic determinants. This information can help provide a refined insight into disease pathogenesis and may have major implications for novel therapies and drug repurposing.
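As a rough illustration of the polygenic risk scores mentioned in this abstract: a PRS is commonly computed as a weighted sum of risk-allele dosages. The sketch below assumes that general form only. The function name, the example dosages, and the weights are hypothetical and are not taken from the study.

```python
def polygenic_risk_score(dosages, weights):
    """Return the weighted sum of risk-allele dosages for one individual.

    dosages: risk-allele counts per variant (0, 1 or 2, or imputed fractions).
    weights: per-variant effect sizes, e.g. GWAS betas (assumed, illustrative).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical individual genotyped at three variants.
example_score = polygenic_risk_score([0, 1, 2], [0.5, -0.2, 0.1])
```

In practice the weights would come from an external GWAS and the score would usually be standardized across the cohort before testing its association with a second trait.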
Affiliation(s)
- Antonio Di Narzo
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Ave, New York, NY, 10029, USA; Icahn School of Medicine at Mount Sinai, Icahn Institute for Data Science and Genomic Technology, New York, NY, USA
- Itziar Frades
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Ave, New York, NY, 10029, USA; Computational Biology and Systems Biomedicine Research Group, Biodonostia Health Research Institute, San Sebastián, Spain
- Heidi M Crane
- Department of Medicine, University of Washington, Seattle, WA, USA; Center for AIDS Research, University of Washington, Seattle, WA, USA
- Paul K Crane
- Department of Medicine, University of Washington, Seattle, WA, USA
- Jean-Sebastian Hulot
- Université de Paris, INSERM, PARCC, CIC1418, F-75015, Paris, France; Cardiovascular Research Center, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Andrew Kasarskis
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Ave, New York, NY, 10029, USA; Icahn School of Medicine at Mount Sinai, Icahn Institute for Data Science and Genomic Technology, New York, NY, USA; Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Amy Hart
- Janssen R&D, LLC, 1400 McKean Road, Spring House, PA, USA
- Carmen Argmann
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Ave, New York, NY, 10029, USA; Icahn School of Medicine at Mount Sinai, Icahn Institute for Data Science and Genomic Technology, New York, NY, USA
- Marla Dubinsky
- Department of Pediatric Gastroenterology and Nutrition, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Inga Peter
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Ave, New York, NY, 10029, USA; Icahn School of Medicine at Mount Sinai, Icahn Institute for Data Science and Genomic Technology, New York, NY, USA
- Ke Hao
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Ave, New York, NY, 10029, USA; Icahn School of Medicine at Mount Sinai, Icahn Institute for Data Science and Genomic Technology, New York, NY, USA.
5
Knutson KA, Deng Y, Pan W. Implicating causal brain imaging endophenotypes in Alzheimer's disease using multivariable IWAS and GWAS summary data. Neuroimage 2020; 223:117347. [PMID: 32898681 PMCID: PMC7778364 DOI: 10.1016/j.neuroimage.2020.117347] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 04/04/2020] [Revised: 08/24/2020] [Accepted: 08/28/2020] [Indexed: 02/06/2023] Open
Abstract
Recent evidence suggests the existence of many undiscovered heritable brain phenotypes involved in Alzheimer's Disease (AD) pathogenesis. This finding necessitates methods for the discovery of causal brain changes in AD that integrate Magnetic Resonance Imaging measures and genotypic data. However, existing approaches for causal inference in this setting, such as the univariate Imaging Wide Association Study (UV-IWAS), suffer from inconsistent effect estimation and inflated Type I errors in the presence of genetic pleiotropy, the phenomenon in which a variant affects multiple causal intermediate risk phenotypes. In this study, we implement a multivariate extension to the IWAS model, namely MV-IWAS, to consistently estimate and test for the causal effects of multiple brain imaging endophenotypes from the Alzheimer's Disease Neuroimaging Initiative (ADNI) in the presence of pleiotropic and possibly correlated SNPs. We further extend MV-IWAS to incorporate variant-specific direct effects on AD, analogous to the existing Egger regression Mendelian Randomization approach, which allows for testing of remaining pleiotropy after adjusting for multiple intermediate pathways. We propose a convenient approach for implementing MV-IWAS that solely relies on publicly available GWAS summary data and a reference panel. Through simulations with either individual-level or summary data, we demonstrate the well controlled Type I errors and superior power of MV-IWAS over UV-IWAS in the presence of pleiotropic SNPs. We apply the summary statistic based tests to 1578 heritable imaging derived phenotypes (IDPs) from the UK Biobank. MV-IWAS detected numerous IDPs as possible false positives by UV-IWAS while uncovering many additional causal neuroimaging phenotypes in AD which are strongly supported by the existing literature.
Affiliation(s)
- Katherine A Knutson
- Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota, United States
- Yangqing Deng
- Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota, United States
- Wei Pan
- Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota, United States.
6
Abstract
PURPOSE OF REVIEW: Disturbances in mineral metabolism are common among individuals with chronic kidney disease and have consistently been associated with cardiovascular and bone disease. The current review aims to describe the current knowledge of the genetic aspects of mineral metabolism disturbances and to suggest directions for future studies to uncover the cause and pathogenesis of chronic kidney disease - mineral bone disorder. RECENT FINDINGS: The most severe disorders of mineral metabolism are caused by highly penetrant, rare, single-gene disruptive mutations. More recently, genome-wide association studies (GWAS) have made an important contribution to our understanding of the genetic determinants of circulating levels of 25-hydroxyvitamin D, calcium, phosphorus, fibroblast growth factor-23, parathyroid hormone, fetuin-A and osteoprotegerin. Although the majority of these genes are known members of mineral homeostasis pathways, GWAS with larger sample sizes have enabled the discovery of many genes not known to be involved in the regulation of mineral metabolism. SUMMARY: GWAS have enabled remarkable developments in our ability to discover the genetic basis of mineral metabolism disturbances. Although we are far from using these findings to inform clinical practice, we are gaining understanding of novel biological mechanisms and providing insight into ethnic variation in these traits.
7
Hui Y, Zhang Y, Wang K, Pan C, Chen H, Qu L, Song X, Lan X. Goat DNMT3B: An indel mutation detection, association analysis with litter size and mRNA expression in gonads. Theriogenology 2020; 147:108-115. [PMID: 32122684 DOI: 10.1016/j.theriogenology.2020.02.025] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 07/31/2019] [Revised: 02/10/2020] [Accepted: 02/16/2020] [Indexed: 12/20/2022]
Abstract
DNA methyltransferase 3β (DNMT3B) is a gene encoding a de novo methylation enzyme that is required for DNA methylation during mammalian embryo development. A previous genome-wide association analysis suggested that DNMT3B is a candidate gene for goat fertility, but no study has examined the effect of DNMT3B on litter size in goats. The aim of this study was to identify possible insertion/deletion (indel) mutations associated with litter size. Seven putative indels were designed to study their association with litter size, but only one 11-bp insertion variant of intron 22 (the last intron) was found in healthy female Shaanbei white cashmere goats (SBWC goats) (n = 1534). Statistical analysis showed that the 11-bp insertion was related to the first-born litter size (P < 0.01) and the goats with the deletion/deletion genotype had a higher average first-born litter size (P < 0.01). In addition, the expression profile of DNMT3B mRNA in goats was determined, which revealed significant differences in DNMT3B mRNA expression in the gonads. Additionally, western blotting revealed that the ovaries of mothers of multi-lamb (MML) had a higher level of DNMT3B protein than the ovaries of mothers of single-lamb (MSL). Furthermore, DNMT3B mRNA was widely expressed in male goats. Differences in mRNA expression levels were observed between the ovaries of MSL and MML. These findings indicated that the 11-bp indel in DNMT3B was significantly associated with first-born litter size and can be used for marker-assisted selection (MAS) in goat breeding.
Affiliation(s)
- Yiqing Hui
- College of Animals Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling, Shaanxi, 712100, P.R. China.
- Yanghai Zhang
- College of Animals Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling, Shaanxi, 712100, P.R. China.
- Ke Wang
- College of Animals Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling, Shaanxi, 712100, P.R. China.
- Chuanying Pan
- College of Animals Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling, Shaanxi, 712100, P.R. China.
- Hong Chen
- College of Animals Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling, Shaanxi, 712100, P.R. China.
- Lei Qu
- Shaanxi Provincial Engineering and Technology Research Center of Cashmere Goats, Yulin University, Yulin, Shaanxi, 719000, PR China; College of Life Sciences, Yulin University, Yulin, Shaanxi, 719000, PR China.
- Xiaoyue Song
- Shaanxi Provincial Engineering and Technology Research Center of Cashmere Goats, Yulin University, Yulin, Shaanxi, 719000, PR China; College of Life Sciences, Yulin University, Yulin, Shaanxi, 719000, PR China.
- Xianyong Lan
- College of Animals Science and Technology, Northwest A&F University, No.22 Xinong Road, Yangling, Shaanxi, 712100, P.R. China.
8
Deng Y, Pan W. A powerful and versatile colocalization test. PLoS Comput Biol 2020; 16:e1007778. [PMID: 32275709 PMCID: PMC7176287 DOI: 10.1371/journal.pcbi.1007778] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/16/2019] [Revised: 04/22/2020] [Accepted: 03/08/2020] [Indexed: 12/17/2022] Open
Abstract
Transcriptome-wide association studies (TWAS and PrediXcan) have been increasingly applied to detect associations between genetically predicted gene expression and GWAS traits. Such associations may suggest, but do not completely determine, causal genes for GWAS traits, because the strong assumptions these methods impose for causal inference are likely to be violated. Testing colocalization moves closer to establishing causal relationships: if a GWAS trait and a gene's expression share the same associated SNP, it may suggest a regulatory (and thus putative causal) role of the SNP mediated through the gene on the GWAS trait. Accordingly, it is of interest to develop and apply various colocalization testing approaches. The existing approaches may each have some severe limitations. For instance, some methods test the null hypothesis that there is colocalization, which is not ideal because often the null hypothesis cannot be rejected simply due to limited statistical power (with too small sample sizes). Some other methods arbitrarily restrict the maximum number of causal SNPs in a locus, which may lead to loss of power in the presence of widespread allelic heterogeneity. Importantly, most methods cannot be applied to either GWAS/eQTL summary statistics or cases with more than two possibly correlated traits. Here we present a simple and general approach based on conditional analysis of a locus on multiple traits, overcoming the above and other shortcomings of the existing methods. We demonstrate that, compared with other methods, our new method can be applied to a wider range of scenarios and often performs better. We showcase its applications to both simulated and real data, including a large-scale Alzheimer's disease GWAS summary dataset and a gene expression dataset, and a large-scale blood lipid GWAS summary association dataset. An R package "jointsum" implementing the proposed method is publicly available on GitHub.
Affiliation(s)
- Yangqing Deng
- Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota, United States of America
- Wei Pan
- Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota, United States of America
9
Neyhart JL, Lorenz AJ, Smith KP. Multi-trait Improvement by Predicting Genetic Correlations in Breeding Crosses. G3 (BETHESDA, MD.) 2019; 9:3153-3165. [PMID: 31358561 PMCID: PMC6778794 DOI: 10.1534/g3.119.400406] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 03/27/2019] [Accepted: 07/25/2019] [Indexed: 12/22/2022]
Abstract
The many quantitative traits of interest to plant breeders are often genetically correlated, which can complicate progress from selection. Improving multiple traits may be enhanced by identifying parent combinations - an important breeding step - that will deliver more favorable genetic correlations (rG). Modeling the segregation of genomewide markers with estimated effects may be one method of predicting rG in a cross, but this approach remains untested. Our objectives were to: (i) use simulations to assess the accuracy of genomewide predictions of rG and the long-term response to selection when selecting crosses on the basis of such predictions; and (ii) empirically measure the ability to predict genetic correlations using data from a barley (Hordeum vulgare L.) breeding program. Using simulations, we found that the accuracy to predict rG was generally moderate and influenced by trait heritability, population size, and genetic correlation architecture (i.e., pleiotropy or linkage disequilibrium). Among 26 barley breeding populations, the empirical prediction accuracy of rG was low (-0.012) to moderate (0.42), depending on trait complexity. Within a simulated plant breeding program employing indirect selection, choosing crosses based on predicted rG increased multi-trait genetic gain by 11-27% compared to selection on the predicted cross mean. Importantly, when the starting genetic correlation was negative, such cross selection mitigated or prevented an unfavorable response in the trait under indirect selection. Prioritizing crosses based on predicted genetic correlation can be a feasible and effective method of improving unfavorably correlated traits in breeding programs.
Affiliation(s)
- Jeffrey L Neyhart
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108
- Aaron J Lorenz
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108
- Kevin P Smith
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108
10
Zhang Y, Zhou Y, van der Mei IAF, Simpson S, Ponsonby AL, Lucas RM, Tettey P, Charlesworth J, Kostner K, Taylor BV. Lipid-related genetic polymorphisms significantly modulate the association between lipids and disability progression in multiple sclerosis. J Neurol Neurosurg Psychiatry 2019; 90:636-641. [PMID: 30782980 DOI: 10.1136/jnnp-2018-319870] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 10/24/2018] [Revised: 12/14/2018] [Accepted: 12/24/2018] [Indexed: 12/31/2022]
Abstract
OBJECTIVE: To investigate whether lipid-related or body mass index (BMI)-related common genetic polymorphisms modulate the associations between serum lipid levels, BMI and disability progression in multiple sclerosis (MS). METHODS: The association between disability progression (annualised Expanded Disability Status Scale (EDSS) change over 5 years, ΔEDSS) and lipid-related or BMI-related genetic polymorphisms was evaluated in a longitudinal cohort (n = 184), diagnosed with MS. We constructed a cumulative genetic risk score (CGRS) of associated polymorphisms (p < 0.05) and examined the interactions between the CGRS and lipid levels (measured at baseline) in predicting ΔEDSS. All analyses were conducted using linear regression. RESULTS: Five lipid polymorphisms (rs2013208, rs9488822, rs17173637, rs10401969 and rs2277862) and one BMI polymorphism (rs2033529) were nominally associated with ΔEDSS. The constructed lipid CGRS showed a significant, dose-dependent association with ΔEDSS (p_trend = 1.4 × 10⁻⁶), such that participants having ≥6 risk alleles progressed 0.38 EDSS points per year faster compared with those having ≤3. This CGRS model explained 16% of the variance in ΔEDSS. We also found significant interactions between the CGRS and lipid levels in modulating ΔEDSS, including high-density lipoprotein (HDL; p_interaction = 0.005) and total cholesterol:high-density lipoprotein ratio (TC:HDL; p_interaction = 0.030). The combined model (combination of CGRS and the lipid parameter) explained 26% of the disability variance for HDL and 27% for TC:HDL. INTERPRETATION: In this prospective cohort study, both lipid levels and lipid-related polymorphisms individually and jointly were associated with significantly increased disability progression in MS. These results indicate that these polymorphisms and tagged genes might be potential points of intervention to moderate disability progression.
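The score-plus-interaction analysis described in this abstract can be sketched as follows, under stated assumptions: the CGRS is taken to be an unweighted count of risk alleles, and the interaction is modelled as a product term in an ordinary least-squares regression. All function names and the toy data are hypothetical; the study's actual covariates and software are not given in the abstract.

```python
def cumulative_genetic_risk_score(genotypes):
    """Sum risk-allele counts (0, 1 or 2 per SNP) for each participant."""
    return [sum(row) for row in genotypes]


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_interaction_model(cgrs, lipid, outcome):
    """Least-squares fit of outcome ~ 1 + cgrs + lipid + cgrs*lipid.

    Returns [intercept, b_cgrs, b_lipid, b_interaction].
    """
    rows = [[1.0, g, l, g * l] for g, l in zip(cgrs, lipid)]
    k = 4
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Toy data (hypothetical): six participants, three SNPs each.
toy_genotypes = [[0, 1, 1], [1, 1, 1], [2, 1, 1], [2, 2, 1], [2, 2, 2], [1, 0, 0]]
toy_cgrs = cumulative_genetic_risk_score(toy_genotypes)
toy_lipid = [1.0, 2.0, 0.5, 1.5, 3.0, 2.5]
toy_outcome = [0.1 + 0.05 * g - 0.02 * l + 0.01 * g * l
               for g, l in zip(toy_cgrs, toy_lipid)]
toy_coefficients = fit_interaction_model(toy_cgrs, toy_lipid, toy_outcome)
```

A non-zero coefficient on the product term means the slope of the outcome on the lipid measure changes with the number of risk alleles, which is what a test of p_interaction assesses.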
Affiliation(s)
- Yan Zhang
- Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
- Yuan Zhou
- Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
- Ingrid A F van der Mei
- Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
- Steve Simpson
- Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia; Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, Victoria, Australia
- Anne-Louise Ponsonby
- Murdoch Children's Research Institute, The University of Melbourne, Melbourne, Victoria, Australia
- Robyn M Lucas
- National Centre for Epidemiology and Population Health, Research School of Population Health, College of Medicine, Biology and Environment, Australian National University, Canberra, Australian Capital Territory, Australia
- Prudence Tettey
- Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia; School of Public Health, University of Ghana, Accra, Ghana
- Jac Charlesworth
- Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
- Karam Kostner
- Mater Hospital, University of Queensland, Brisbane, Queensland, Australia
- Bruce V Taylor
- Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
11
Abstract
It is useful to detect allelic heterogeneity (AH), i.e., the presence of multiple causal SNPs in a locus, which, for example, may guide the development of new methods for fine mapping and determine how to interpret an appearing epistasis. In contrast to Mendelian traits, the existence and extent of AH for complex traits had been largely unknown until Hormozdiari et al. proposed a Bayesian method, called causal variants identification in associated regions (CAVIAR), and uncovered widespread AH in complex traits. However, there are several limitations with CAVIAR. First, it assumes a maximum number of causal SNPs in a locus, typically up to six, to save computing time; this assumption, as will be shown, may influence the outcome. Second, its computational time can be too demanding to be feasible since it examines all possible combinations of causal SNPs (under the assumed upper bound). Finally, it outputs a posterior probability of AH, which may be difficult to calibrate with a commonly used nominal significance level. Here, we introduce an intersection-union test (IUT) based on a joint/conditional regression model with all the SNPs in a locus to infer AH. We also propose two sequential IUT-based testing procedures to estimate the number of causal SNPs. Our proposed methods are applicable to not only individual-level genotypic and phenotypic data, but also genome-wide association study (GWAS) summary statistics. We provide numerical examples based on both simulated and real data, including large-scale schizophrenia (SCZ) and high-density lipoprotein (HDL) GWAS summary data sets, to demonstrate the effectiveness of the new methods. In particular, for both the SCZ and HDL data, our proposed IUT not only was faster, but also detected more AH loci than CAVIAR. Our proposed methods are expected to be useful in further uncovering the extent of AH in complex traits.
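To make the intersection-union idea concrete, here is a minimal sketch of one generic way to build such a test from joint-model z-scores. It is an illustrative assumption, not necessarily the authors' exact statistic. The null "at most one causal SNP" is written as a union over SNPs j of "every SNP other than j has zero effect"; each component is tested with a Bonferroni-corrected minimum p-value, and the overall null is rejected only if every component is rejected. The function names and the default `alpha` are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided normal p-value for a z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def iut_allelic_heterogeneity(joint_z, alpha=0.05):
    """Intersection-union test of 'at most one causal SNP' in a locus.

    joint_z: z-scores from a joint (conditional) regression on all SNPs.
    Returns (p_value, reject), where p_value is the largest component p-value.
    """
    m = len(joint_z)
    if m < 2:
        raise ValueError("need at least two SNPs in the locus")
    p = [two_sided_p(z) for z in joint_z]
    component_p = []
    for j in range(m):
        # Component null: all SNPs except j have zero effect.
        others = p[:j] + p[j + 1:]
        component_p.append(min(1.0, (m - 1) * min(others)))
    p_iut = max(component_p)
    return p_iut, p_iut <= alpha


# Hypothetical loci: two strong joint signals versus a single one.
p_two, reject_two = iut_allelic_heterogeneity([5.0, 4.5, 0.3])
p_one, reject_one = iut_allelic_heterogeneity([5.0, 0.5, 0.3])
```

Under this construction the overall p-value reduces to (m - 1) times the second-smallest per-SNP p-value, capped at 1, so it yields a decision at a nominal significance level rather than a posterior probability.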
Affiliation(s)
- Yangqing Deng
- Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota 55455
- Wei Pan
- Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota 55455