1
Weiner DJ, Nadig A, Jagadeesh KA, Dey KK, Neale BM, Robinson EB, Karczewski KJ, O'Connor LJ. Polygenic architecture of rare coding variation across 394,783 exomes. Nature 2023; 614:492-499. [PMID: 36755099 PMCID: PMC10614218 DOI: 10.1038/s41586-022-05684-z] [Citation(s) in RCA: 89] [Impact Index Per Article: 44.5] [Received: 06/29/2022] [Accepted: 12/22/2022] [Indexed: 02/10/2023]
Abstract
Both common and rare genetic variants influence complex traits and common diseases. Genome-wide association studies have identified thousands of common-variant associations, and more recently, large-scale exome sequencing studies have identified rare-variant associations in hundreds of genes [1-3]. However, rare-variant genetic architecture is not well characterized, and the relationship between common-variant and rare-variant architecture is unclear [4]. Here we quantify the heritability explained by the gene-wise burden of rare coding variants across 22 common traits and diseases in 394,783 UK Biobank exomes [5]. Rare coding variants (allele frequency < 1 × 10⁻³) explain 1.3% (s.e. = 0.03%) of phenotypic variance on average, much less than common variants, and most burden heritability is explained by ultrarare loss-of-function variants (allele frequency < 1 × 10⁻⁵). Common and rare variants implicate the same cell types, with similar enrichments, and they have pleiotropic effects on the same pairs of traits, with similar genetic correlations. They partially colocalize at individual genes and loci, but not to the same extent: burden heritability is strongly concentrated in significant genes, while common-variant heritability is more polygenic, and burden heritability is also more strongly concentrated in constrained genes. Finally, we find that burden heritability for schizophrenia and bipolar disorder [6,7] is approximately 2%. Our results indicate that rare coding variants will implicate a tractable number of large-effect genes, that common and rare associations are mechanistically convergent, and that rare coding variants will contribute only modestly to missing heritability and population risk stratification.
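As a rough illustration of what a "gene-wise burden" of rare variants means, the sketch below sums each individual's minor-allele counts over the variants in a gene whose allele frequency falls below a cutoff. It is not the heritability estimator used in the study. The function name, the data layout, and the variant and gene identifiers are hypothetical; only the two frequency cutoffs (1 × 10⁻³ and 1 × 10⁻⁵) come from the abstract.

```python
from collections import defaultdict


def gene_burden(genotypes, variant_gene, allele_freq, max_af=1e-3):
    """Sum minor-allele counts per gene over variants rarer than max_af.

    genotypes:    dict variant_id -> list of allele counts (0, 1 or 2),
                  one entry per individual (hypothetical layout)
    variant_gene: dict variant_id -> gene name
    allele_freq:  dict variant_id -> allele frequency
    Returns a dict gene -> list of per-individual burden counts.
    """
    if not genotypes:
        return {}
    n_individuals = len(next(iter(genotypes.values())))
    burden = defaultdict(lambda: [0] * n_individuals)
    for variant, counts in genotypes.items():
        # Skip variants that are not rare under the chosen cutoff.
        if allele_freq[variant] >= max_af:
            continue
        gene = variant_gene[variant]
        for i, count in enumerate(counts):
            burden[gene][i] += count
    return dict(burden)


# Toy data: three variants, three individuals (all identifiers are made up).
genotypes = {"v1": [0, 1, 0], "v2": [1, 0, 0], "v3": [2, 2, 1]}
variant_gene = {"v1": "GENE_A", "v2": "GENE_A", "v3": "GENE_B"}
allele_freq = {"v1": 5e-4, "v2": 2e-6, "v3": 0.3}

rare = gene_burden(genotypes, variant_gene, allele_freq)                     # AF < 1e-3
ultrarare = gene_burden(genotypes, variant_gene, allele_freq, max_af=1e-5)  # AF < 1e-5
```

With this toy input, only `GENE_A` receives a burden, because `v3` is a common variant. Tightening the cutoff to the ultrarare range keeps only `v2`.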
Affiliation(s)
- Daniel J Weiner
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ajay Nadig
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Karthik A Jagadeesh
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Kushal K Dey
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Benjamin M Neale
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Elise B Robinson
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Konrad J Karczewski
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Luke J O'Connor
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
2
Kreiner JM, Tranel PJ, Weigel D, Stinchcombe JR, Wright SI. The genetic architecture and population genomic signatures of glyphosate resistance in Amaranthus tuberculatus. Mol Ecol 2021; 30:5373-5389. [PMID: 33853196 DOI: 10.1111/mec.15920] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Revised: 03/15/2021] [Accepted: 04/06/2021] [Indexed: 01/04/2023]
Abstract
Much of what we know about the genetic basis of herbicide resistance has come from detailed investigations of monogenic adaptation at known target-sites, despite the increasingly recognized importance of polygenic resistance. Little work has been done to characterize the broader genomic basis of herbicide resistance, including the number and distribution of genes involved, their effect sizes, allele frequencies and signatures of selection. In this work, we implemented genome-wide association (GWA) and population genomic approaches to examine the genetic architecture of glyphosate (Round-up) resistance in the problematic agricultural weed Amaranthus tuberculatus. A GWA was able to correctly identify the known target-gene but statistically controlling for two causal target-site mechanisms revealed an additional 250 genes across all 16 chromosomes associated with non-target-site resistance (NTSR). The encoded proteins had functions that have been linked to NTSR, the most significant of which is response to chemicals, but also showed pleiotropic roles in reproduction and growth. Compared to an empirical null that accounts for complex population structure, the architecture of NTSR was enriched for large effect sizes and low allele frequencies, suggesting the role of pleiotropic constraints on its evolution. The enrichment of rare alleles also suggested that the genetic architecture of NTSR may be population-specific and heterogeneous across the range. Despite their rarity, we found signals of recent positive selection on NTSR-alleles by both window- and haplotype-based statistics, and an enrichment of amino acid changing variants. 
In our samples, genome-wide single nucleotide polymorphisms explain a comparable amount of the total variation in glyphosate resistance to monogenic mechanisms, even in a collection of individuals where 80% of resistant individuals have large-effect TSR mutations, indicating an underappreciated polygenic contribution to the evolution of herbicide resistance in weed populations.
Affiliation(s)
- Julia M Kreiner
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada
- Patrick J Tranel
  - Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Detlef Weigel
  - Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
- John R Stinchcombe
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada
  - Koffler Scientific Reserve, University of Toronto, King City, ON, Canada
- Stephen I Wright
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada
  - Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, ON, Canada
3
Horvath G, Knopik VS, Marceau K. Polygenic Influences on Pubertal Timing and Tempo and Depressive Symptoms in Boys and Girls. Journal of Research on Adolescence: the Official Journal of the Society for Research on Adolescence 2020; 30:78-94. [PMID: 31008555 PMCID: PMC6810710 DOI: 10.1111/jora.12502] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 06/09/2023]
Abstract
This study used polygenic scoring (PGS) to test whether puberty-related genes were correlated with depressive symptoms, and whether there were indirect effects through pubertal maturation. The sample included 8,795 adolescents from the Avon Longitudinal Study of Parents and Children (measures of puberty drawn ages 8-17 years; of depressive symptoms at age 16.5 years). The PGS (derived from a genome-wide meta-analysis of later age at menarche) predicted boys' and girls' later pubertal timing, boys' slower gonadal development, and girls' faster breast development. Earlier perceived breast development timing predicted more depressive symptoms in girls. Findings support shared genetic underpinnings for boys' and girls' puberty, contributing to multiple pubertal phenotypes with differences in how these genetic variants affect boys' and girls' development.
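Polygenic scoring is commonly computed as a weighted sum of effect-allele dosages, with weights taken from a genome-wide association analysis. The sketch below shows that arithmetic only. It is not the study's pipeline: the variant identifiers and weights are invented, and skipping variants with no genotype is an assumption made here for simplicity.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages for one individual.

    dosages: dict variant_id -> effect-allele dosage (0, 1 or 2)
    weights: dict variant_id -> per-allele effect size from a GWAS
    Variants without a dosage are skipped (an assumption of this sketch).
    """
    return sum(weight * dosages[variant]
               for variant, weight in weights.items()
               if variant in dosages)


# Hypothetical weights and genotypes, for illustration only.
example_weights = {"rs_a": 0.10, "rs_b": -0.05, "rs_c": 0.20}
example_dosages = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

score = polygenic_score(example_dosages, example_weights)
print(f"polygenic score: {score:.2f}")
```

In practice a score like this is usually standardized within the sample before it is used as a predictor.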
4
Melamud E, Taylor DL, Sethi A, Cule M, Baryshnikova A, Saleheen D, van Bruggen N, FitzGerald GA. The promise and reality of therapeutic discovery from large cohorts. J Clin Invest 2020; 130:575-581. [PMID: 31929188 PMCID: PMC6994121 DOI: 10.1172/jci129196] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 01/08/2023]
Abstract
Technological advances in rapid data acquisition have transformed medical biology into a data mining field, where new data sets are routinely dissected and analyzed by statistical models of ever-increasing complexity. Many hypotheses can be generated and tested within a single large data set, and even small effects can be statistically discriminated from a sea of noise. On the other hand, the development of therapeutic interventions moves at a much slower pace. They are determined from carefully randomized and well-controlled experiments with explicitly stated outcomes as the principal mechanism by which a single hypothesis is tested. In this paradigm, only a small fraction of interventions can be tested, and an even smaller fraction are ultimately deemed therapeutically successful. In this Review, we propose strategies to leverage large-cohort data to inform the selection of targets and the design of randomized trials of novel therapeutics. Ultimately, the incorporation of big data and experimental medicine approaches should aim to reduce the failure rate of clinical trials as well as expedite and lower the cost of drug development.
Affiliation(s)
- Eugene Melamud
  - Calico Life Sciences LLC, South San Francisco, California, USA
- Anurag Sethi
  - Calico Life Sciences LLC, South San Francisco, California, USA
- Madeleine Cule
  - Calico Life Sciences LLC, South San Francisco, California, USA
- Garret A. FitzGerald
  - Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
5
Yang T, Kim J, Wu C, Ma Y, Wei P, Pan W. An adaptive test for meta-analysis of rare variant association studies. Genet Epidemiol 2020; 44:104-116. [PMID: 31830326 PMCID: PMC6980317 DOI: 10.1002/gepi.22273] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/15/2019] [Revised: 11/12/2019] [Accepted: 11/25/2019] [Indexed: 01/02/2023]
Abstract
Single genome-wide studies may be underpowered to detect trait-associated rare variants with moderate or weak effect sizes. As a viable alternative, meta-analysis is widely used to increase power by combining different studies. The power of meta-analysis critically depends on the underlying association patterns and heterogeneity levels, which are unknown and vary from locus to locus. However, existing methods mainly focus on one or only a few combinations of the association pattern and heterogeneity level, thus may lose power in many situations. To address this issue, we propose a general and unified framework by combining a class of tests including and beyond some existing ones, leading to high power across a wide range of scenarios. We demonstrate that the proposed test is more powerful than some existing methods in simulation studies, then show their performance with the NHLBI Exome-Sequencing Project (ESP) data. One gene (B4GALNT2) was found by our proposed test, but not by others, to be statistically significantly associated with plasma triglyceride. The signal was driven by African-ancestry subjects but it was previously reported to be associated with coronary artery disease among European-ancestry subjects. We implemented our method in an R package aSPUmeta, publicly available at https://github.com/ytzhong/metaRV and will be on CRAN soon.
Affiliation(s)
- Tianzhong Yang
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Junghi Kim
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Chong Wu
  - Department of Statistics, Florida State University, Tallahassee, FL, USA
- Yiding Ma
  - Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Peng Wei
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Wei Pan
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
6
Verma SS, Ritchie MD. Another Round of "Clue" to Uncover the Mystery of Complex Traits. Genes (Basel) 2018; 9:E61. [PMID: 29370075 PMCID: PMC5852557 DOI: 10.3390/genes9020061] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/01/2017] [Revised: 12/19/2017] [Accepted: 01/15/2018] [Indexed: 12/13/2022]
Abstract
A plethora of genetic association analyses have identified several genetic risk loci. Technological and statistical advancements have now led to the identification of not only common genetic variants, but also low-frequency variants, structural variants, and environmental factors, as well as multi-omics variations that affect the phenotypic variance of complex traits in a population, thus referred to as complex trait architecture. The concept of heritability, or the proportion of phenotypic variance due to genetic inheritance, has been studied for several decades, but its application is mainly in addressing the narrow sense heritability (or additive genetic component) from Genome-Wide Association Studies (GWAS). In this commentary, we reflect on our perspective on the complexity of understanding heritability for human traits in comparison to model organisms, highlighting another round of clues beyond GWAS and an alternative approach, investigating these clues comprehensively to help in elucidating the genetic architecture of complex traits.
Affiliation(s)
- Shefali Setia Verma
  - The Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Marylyn D Ritchie
  - The Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
7
Grinde KE, Arbet J, Green A, O'Connell M, Valcarcel A, Westra J, Tintle N. Illustrating, Quantifying, and Correcting for Bias in Post-hoc Analysis of Gene-Based Rare Variant Tests of Association. Front Genet 2017; 8:117. [PMID: 28959274 PMCID: PMC5603735 DOI: 10.3389/fgene.2017.00117] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/24/2017] [Accepted: 08/25/2017] [Indexed: 11/13/2022]
Abstract
To date, gene-based rare variant testing approaches have focused on aggregating information across sets of variants to maximize statistical power in identifying genes showing significant association with diseases. Beyond identifying genes that are associated with diseases, the identification of causal variant(s) in those genes and estimation of their effect is crucial for planning replication studies and characterizing the genetic architecture of the locus. However, we illustrate that straightforward single-marker association statistics can suffer from substantial bias introduced by conditioning on gene-based test significance, due to the phenomenon often referred to as "winner's curse." We illustrate the ramifications of this bias on variant effect size estimation and variant prioritization/ranking approaches, outline parameters of genetic architecture that affect this bias, and propose a bootstrap resampling method to correct for this bias. We find that our correction method significantly reduces the bias due to winner's curse (average two-fold decrease in bias, p < 2.2 × 10⁻⁶) and, consequently, substantially improves mean squared error and variant prioritization/ranking. The method is particularly helpful in adjustment for winner's curse effects when the initial gene-based test has low power and for relatively more common, non-causal variants. Adjustment for winner's curse is recommended for all post-hoc estimation and ranking of variants after a gene-based test. Further work is necessary to continue seeking ways to reduce bias and improve inference in post-hoc analysis of gene-based tests under a wide variety of genetic architectures.
Affiliation(s)
- Kelsey E Grinde
  - Department of Biostatistics, University of Washington, Seattle, WA, United States
- Jaron Arbet
  - Department of Biostatistics, University of Minnesota, Minneapolis, MN, United States
- Alden Green
  - Department of Statistics, Carnegie Mellon University, Pittsburgh, PA, United States
- Michael O'Connell
  - Department of Biostatistics, University of Minnesota, Minneapolis, MN, United States
- Alessandra Valcarcel
  - Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA, United States
- Jason Westra
  - Department of Statistics, Iowa State University, Ames, IA, United States
  - Department of Mathematics, Statistics, and Computer Science, Dordt College, Sioux Center, IA, United States
- Nathan Tintle
  - Department of Mathematics, Statistics, and Computer Science, Dordt College, Sioux Center, IA, United States
8
Wu Y, Zheng Z, Visscher PM, Yang J. Quantifying the mapping precision of genome-wide association studies using whole-genome sequencing data. Genome Biol 2017; 18:86. [PMID: 28506277 PMCID: PMC5432979 DOI: 10.1186/s13059-017-1216-0] [Citation(s) in RCA: 67] [Impact Index Per Article: 8.4] [Received: 12/28/2016] [Accepted: 04/20/2017] [Indexed: 01/21/2023]
Abstract
Background: Understanding the mapping precision of genome-wide association studies (GWAS), that is, the physical distances between the top associated single-nucleotide polymorphisms (SNPs) and the causal variants, is essential to design fine-mapping experiments for complex traits and diseases. Results: Using simulations based on whole-genome sequencing (WGS) data from 3642 unrelated individuals of European descent, we show that the association signals at rare causal variants (minor allele frequency ≤ 0.01) are very unlikely to be mapped to common variants in GWAS using either WGS data or imputed data and vice versa. We predict that at least 80% of the common variants identified from published GWAS using imputed data are within 33.5 Kbp of the causal variants, a resolution that is comparable with that using WGS data. Mapping precision at these loci will improve with increasing sample sizes of GWAS in the future. For rare variants, the mapping precision of GWAS using WGS data is extremely high, suggesting WGS is an efficient strategy to detect and fine-map rare variants simultaneously. We further assess the mapping precision by linkage disequilibrium between GWAS hits and causal variants and develop an online tool (gwasMP) to query our results with different thresholds of physical distance and/or linkage disequilibrium (http://cnsgenomics.com/shiny/gwasMP). Conclusions: Our findings provide a benchmark to inform future design and development of fine-mapping experiments and technologies to pinpoint the causal variants at GWAS loci.
Affiliation(s)
- Yang Wu
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, 4072, Australia
  - Queensland Brain Institute, The University of Queensland, Brisbane, QLD, 4072, Australia
- Zhili Zheng
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, 4072, Australia
  - The Eye Hospital, School of Ophthalmology & Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, 325027, China
- Peter M Visscher
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, 4072, Australia
  - Queensland Brain Institute, The University of Queensland, Brisbane, QLD, 4072, Australia
- Jian Yang
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, 4072, Australia
  - Queensland Brain Institute, The University of Queensland, Brisbane, QLD, 4072, Australia
9
Abstract
In human genome research, genetic association studies of rare variants have been widely pursued since the advent of high-throughput DNA sequencing platforms. However, detection of outcome-related rare variants remains a statistically challenging problem because observed genetic mutations are extremely rare. Recently, a power set-based statistical selection procedure has been proposed to locate both risk and protective rare variants within outcome-related genes or genetic regions. Although it can select rare variants individually, the procedure cannot measure the certainty of the selected rare variants. In this article, we propose a selection probability for individual rare variants, where selection frequencies of rare variants are computed based on bootstrap resampling. It can therefore quantify the certainty of both selected and unselected rare variants. A new selection approach using a threshold on the selection probability is also introduced and compared with some existing selection procedures through extensive simulation studies and real sequencing data analysis. We demonstrate that the proposed approach outperforms the existing methods in terms of selection power.
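The idea of a bootstrap-based selection probability can be sketched as follows: resample individuals with replacement, rerun a variant-selection procedure on each resample, and record how often each variant is selected. This is a minimal illustration, not the authors' procedure. `toy_selector` is a hypothetical stand-in for the power set-based selector, and the data layout, the 0.2 carrier-rate gap, and the 0.5 probability threshold are all assumptions made here.

```python
import random


def toy_selector(rows, min_gap=0.2):
    """Hypothetical selector: keep variants whose carrier rate differs
    between cases and controls by at least min_gap."""
    cases = [g for g, y in rows if y == 1]
    controls = [g for g, y in rows if y == 0]
    if not cases or not controls:
        return set()
    selected = set()
    for j in range(len(rows[0][0])):
        case_rate = sum(g[j] > 0 for g in cases) / len(cases)
        control_rate = sum(g[j] > 0 for g in controls) / len(controls)
        if abs(case_rate - control_rate) >= min_gap:
            selected.add(j)
    return selected


def selection_probability(rows, selector, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each variant is selected.

    rows: list of (genotype_tuple, outcome) pairs, outcome 1 = case, 0 = control
    """
    rng = random.Random(seed)
    counts = [0] * len(rows[0][0])
    for _ in range(n_boot):
        # Resample individuals with replacement, keeping the sample size.
        sample = [rng.choice(rows) for _ in rows]
        for j in selector(sample):
            counts[j] += 1
    return [c / n_boot for c in counts]


# Toy data: variant 0 is carried only by cases (risk), variant 1 only by
# controls (protective), and variant 2 is never observed.
rows = (
    [((1, 0, 0), 1)] * 5 + [((0, 0, 0), 1)] * 5
    + [((0, 1, 0), 0)] * 5 + [((0, 0, 0), 0)] * 5
)

probs = selection_probability(rows, toy_selector)
# Thresholding the selection probability gives the final variant set.
chosen = [j for j, p in enumerate(probs) if p >= 0.5]
```

Because every variant receives a probability, variants that were not selected can also be ranked by how close they came to selection.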
Affiliation(s)
- Gira Lee
  - Department of Statistics, Pusan National University, Busan, Korea
- Hokeun Sun
  - Department of Statistics, Pusan National University, Busan, Korea
10
He Z, Zhang D, Renton AE, Li B, Zhao L, Wang GT, Goate AM, Mayeux R, Leal SM. The Rare-Variant Generalized Disequilibrium Test for Association Analysis of Nuclear and Extended Pedigrees with Application to Alzheimer Disease WGS Data. Am J Hum Genet 2017; 100:193-204. [PMID: 28065470 PMCID: PMC5294711 DOI: 10.1016/j.ajhg.2016.12.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 10/04/2016] [Accepted: 12/06/2016] [Indexed: 01/10/2023]
Abstract
Whole-genome and exome sequence data can be cost-effectively generated for the detection of rare-variant (RV) associations in families. Causal variants that aggregate in families usually have larger effect sizes than those found in sporadic cases, so family-based designs can be a more powerful approach than population-based designs. Moreover, some family-based designs are robust to confounding due to population admixture or substructure. We developed a RV extension of the generalized disequilibrium test (GDT) to analyze sequence data obtained from nuclear and extended families. The GDT utilizes genotype differences of all discordant relative pairs to assess associations within a family, and the RV extension combines the single-variant GDT statistic over a genomic region of interest. The RV-GDT has increased power by efficiently incorporating information beyond first-degree relatives and allows for the inclusion of covariates. Using simulated genetic data, we demonstrated that the RV-GDT method has well-controlled type I error rates, even when applied to admixed populations and populations with substructure. It is more powerful than existing family-based RV association methods, particularly for the analysis of extended pedigrees and pedigrees with missing data. We analyzed whole-genome sequence data from families affected by Alzheimer disease to illustrate the application of the RV-GDT. Given the capability of the RV-GDT to adequately control for population admixture or substructure and analyze pedigrees with missing genotype data and its superior power over other family-based methods, it is an effective tool for elucidating the involvement of RVs in the etiology of complex traits.
Affiliation(s)
- Zongxiao He
  - Center for Statistical Genetics, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Di Zhang
  - Center for Statistical Genetics, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Alan E. Renton
  - Department of Neuroscience and Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, NY 10029, USA
- Biao Li
  - Center for Statistical Genetics, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Linhai Zhao
  - Center for Statistical Genetics, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Gao T. Wang
  - Center for Statistical Genetics, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Alison M. Goate
  - Department of Neuroscience and Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, NY 10029, USA
- Richard Mayeux
  - Department of Neurology, Taub Institute on Alzheimer’s Disease and the Aging Brain and Gertrude H. Sergievsky Center, Columbia University, New York, NY 10027, USA
- Suzanne M. Leal (corresponding author)
  - Center for Statistical Genetics, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
11
Rare Variants in Transcript and Potential Regulatory Regions Explain a Small Percentage of the Missing Heritability of Complex Traits in Cattle. PLoS One 2015; 10:e0143945. [PMID: 26642058 PMCID: PMC4671594 DOI: 10.1371/journal.pone.0143945] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 08/26/2015] [Accepted: 11/11/2015] [Indexed: 11/19/2022]
Abstract
The proportion of genetic variation in complex traits explained by rare variants is a key question for genomic prediction, and for identifying the basis of “missing heritability”–the proportion of additive genetic variation not captured by common variants on SNP arrays. Sequence variants in transcript and regulatory regions from 429 sequenced animals were used to impute high density SNP genotypes of 3311 Holstein sires to sequence. There were 675,062 common variants (MAF>0.05), 102,549 uncommon variants (0.01<MAF<0.05), and 83,856 rare variants (MAF<0.01). We describe a novel method for estimating the proportion of the rare variants that are sequencing errors using parent-progeny duos. We then used mixed model methodology to estimate the proportion of variance captured by these different classes of variants for fat, milk and protein yields, as well as for fertility. Common sequence variants captured 83%, 77%, 76% and 84% of the total genetic variance for fat, milk, and protein yields and fertility, respectively. This was between 2 and 5% more variance than that captured from 600k SNPs on a high density chip, although the difference was not significant. Rare variants captured 3%, 0%, 1% and 14% of the genetic variance for fat, milk and protein yields, and fertility respectively, whereas pedigree explained the remaining amount of genetic variance (none for fertility). The proportion of variation explained by rare variants is likely to be under-estimated due to reduced accuracies of imputation for this class of variants. Using common sequence variants slightly improved accuracy of genomic predictions for fat and milk yield, compared to high density SNP array genotypes. However, including rare variants from transcript regions did not increase the accuracy of genomic predictions. 
These results suggest that rare variants recover a small percentage of the missing heritability for complex traits; however, very large reference sets will be required to exploit this to improve the accuracy of genomic predictions. Our results do suggest the contribution of rare variants to genetic variation may be greater for fitness traits.
12
Kim S, Lee K, Sun H. Statistical selection strategy for risk and protective rare variants associated with complex traits. J Comput Biol 2015; 22:1034-43. [PMID: 26469994 DOI: 10.1089/cmb.2015.0091] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/05/2023]
Abstract
In genetic association studies with deep sequencing data, it is a challenging statistical problem to precisely locate rare variants associated with complex diseases or traits due to the limited number of observed genetic mutations. In particular, both risk and protective rare variants can be present in the same gene or genetic region. There currently exist very few statistical methods to separate causal rare variants from noncausal variants within a disease/trait-related gene or a genetic region, while there are relatively many statistical tests to detect a phenotypic association of a group of rare variants such as a gene or a genetic region. In this article, we propose a new statistical selection strategy that is able to locate causal rare variants within the disease/trait-related gene or a genetic region. The proposed procedure is to linearly combine potential risk and protective variants in order to find the optimal combination of rare variants that can have the strongest association signal. It is also computationally very efficient since the procedure is based on forward selection. In simulation studies we demonstrate that the selection performance of the proposed procedure is more powerful than other existing methods when both risk and protective variants are present. We also applied it to the real sequencing data on the ANGPTL gene family from the Dallas Heart Study.
Affiliation(s)
- Sera Kim: Department of Statistics, Pusan National University, Busan, Korea
- Kyeongjun Lee: Department of Statistics, Pusan National University, Busan, Korea
- Hokeun Sun: Department of Statistics, Pusan National University, Busan, Korea
13
Abstract
The contribution of rare and low-frequency variants to human traits is largely unexplored. Here we describe insights from sequencing whole genomes (low read depth, 7×) or exomes (high read depth, 80×) of nearly 10,000 individuals from population-based and disease collections. In extensively phenotyped cohorts we characterize over 24 million novel sequence variants, generate a highly accurate imputation reference panel and identify novel alleles associated with levels of triglycerides (APOB), adiponectin (ADIPOQ) and low-density lipoprotein cholesterol (LDLR and RGAG1) from single-marker and rare variant aggregation tests. We describe population structure and functional annotation of rare and low-frequency variants, use the data to estimate the benefits of sequencing for association studies, and summarize lessons from disease-specific collections. Finally, we make available an extensive resource, including individual-level genetic and phenotypic data and web-based tools to facilitate the exploration of association results.
Low read depth sequencing of whole genomes and high read depth exomes of nearly 10,000 extensively phenotyped individuals are combined to help characterize novel sequence variants, generate a highly accurate imputation reference panel and identify novel alleles associated with lipid-related traits; in addition to describing population structure and providing functional annotation of rare and low-frequency variants the authors use the data to estimate the benefits of sequencing for association studies.
This paper, combining data and initial findings from the different arms of the UK10K project, describes insights from low-read-depth sequencing of whole genomes or high-read-depth exome sequencing of nearly 10,000 individuals sampled from a range of disease collections, as well as participants from healthy population-based cohorts.
The authors characterize novel sequence variants, generate a highly accurate imputation reference panel and identify novel alleles associated with lipid-related traits. In addition to describing population structure and providing functional annotation of rare and low frequency variants, they use the data to estimate the benefits of sequencing for association studies.
14
Tada H, Kawashiri MA, Konno T, Yamagishi M, Hayashi K. Common and Rare Variant Association Study for Plasma Lipids and Coronary Artery Disease. J Atheroscler Thromb 2015; 23:241-56. [PMID: 26347050 DOI: 10.5551/jat.31393] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 11/11/2022]
Abstract
Blood lipid levels are highly heritable and modifiable risk factors for coronary artery disease (CAD), which is the leading cause of death worldwide. These facts have motivated human genetic association studies that have substantial potential to define the risk factors that are causal and to identify pathways and therapeutic targets for lipids and CAD. The success of the HapMap project, which provided an extensive catalog of human genetic variation, and the development of microarray-based genotyping chips (typically containing variants with allele frequencies > 5%) facilitated the common variant association study (CVAS; formerly termed genome-wide association study, GWAS), which identifies disease-associated variants in a genome-wide manner. To date, 157 loci associated with blood lipids and 46 loci with CAD have been successfully identified, accounting for approximately 12%-14% of heritability for lipids and 10% of heritability for CAD. However, a major challenge remains, termed the "missing heritability problem": the observation that loci detected by CVAS explain only a small fraction of the inferred genetic variation. To explain this missing portion, the focus of genetic association studies has shifted from common to rare variants. However, it is challenging to apply rare variant association study (RVAS) in an unbiased manner because such variants typically occur in too few individuals to be detected statistically. In this review, we provide a current understanding of the genetic architecture mostly derived from CVAS, and several updates on the progress and limitations of RVAS for lipids and CAD.
Affiliation(s)
- Hayato Tada: Division of Cardiovascular Medicine, Kanazawa University Graduate School of Medicine
15
Zhang Q. Associating rare genetic variants with human diseases. Front Genet 2015; 6:133. [PMID: 25904936 PMCID: PMC4389536 DOI: 10.3389/fgene.2015.00133] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/15/2015] [Accepted: 03/19/2015] [Indexed: 11/20/2022]
Affiliation(s)
- Qunyuan Zhang: Division of Statistical Genomics, Washington University School of Medicine, St. Louis, MO, USA
16
Reynolds CA, Finkel D. A meta-analysis of heritability of cognitive aging: minding the "missing heritability" gap. Neuropsychol Rev 2015; 25:97-112. [PMID: 25732892 DOI: 10.1007/s11065-015-9280-2] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 10/26/2014] [Accepted: 01/26/2015] [Indexed: 12/19/2022]
Abstract
The etiologies underlying variation in adult cognitive performance and cognitive aging have enjoyed much attention in the literature, but much of that attention has focused on broad factors, principally general cognitive ability. The current review provides meta-analyses of age trends in heritability of specific cognitive abilities and considers the profile of genetic and environmental factors contributing to cognitive aging to address the 'missing heritability' issue. Our findings, based upon evaluating 27 reports in the literature, indicate that verbal ability demonstrated declining heritability, after about age 60, as did spatial ability and perceptual speed more modestly. Trends for general memory, working memory, and spatial ability generally indicated stability, or small increases in heritability in mid-life. Equivocal results were found for executive function. A second meta-analysis then considered the gap between twin-based versus SNP-based heritability derived from population-based GWAS studies. Specifically, we considered twin correlation ratios to agnostically re-evaluate biometrical models across age and by cognitive domain. Results modestly suggest that nonadditive genetic variance may become increasingly important with age, especially for verbal ability. If so, this would support arguments that lower SNP-based heritability estimates result in part from uncaptured non-additive influences (e.g., dominance, gene-gene interactions), and possibly gene-environment (GE) correlations. Moreover, consistent with longitudinal twin studies of aging, as rearing environment becomes a distal factor, increasing genetic variance may result in part from nonadditive genetic influences or possible GE correlations. Sensitivity to life course dynamics is crucial to understanding etiological contributions to adult cognitive performance and cognitive aging.
Affiliation(s)
- Chandra A Reynolds: Department of Psychology, University of California Riverside, Riverside, CA, 92521, USA
17
Gusev A, Lee S, Trynka G, Finucane H, Vilhjálmsson B, Xu H, Zang C, Ripke S, Bulik-Sullivan B, Stahl E, Kähler AK, Hultman CM, Purcell SM, McCarroll SA, Daly M, Pasaniuc B, Sullivan PF, Neale BM, Wray NR, Raychaudhuri S, Price AL, Ripke S, Neale B, Corvin A, Walters J, Farh KH, Holmans P, Lee P, Bulik-Sullivan B, Collier D, Huang H, Pers T, Agartz I, Agerbo E, Albus M, Alexander M, Amin F, Bacanu S, Begemann M, Belliveau R, Bene J, Bergen S, Bevilacqua E, Bigdeli T, Black D, Børglum A, Bruggeman R, Buccola N, Buckner R, Byerley W, Cahn W, Cai G, Campion D, Cantor R, Carr V, Carrera N, Catts S, Chambert K, Chan R, Chen R, Chen E, Cheng W, Cheung E, Chong S, Cloninger C, Cohen D, Cohen N, Cormican P, Craddock N, Crowley J, Curtis D, Davidson M, Davis K, Degenhardt F, Del Favero J, DeLisi L, Demontis D, Dikeos D, Dinan T, Djurovic S, Donohoe G, Drapeau E, Duan J, Dudbridge F, Durmishi N, Eichhammer P, Eriksson J, Escott-Price V, Essioux L, Fanous A, Farrell M, Frank J, Franke L, Freedman R, Freimer N, Friedl M, Friedman J, Fromer M, Genovese G, Georgieva L, Gershon E, Giegling I, Giusti-Rodríguez P, Godard S, Goldstein J, Golimbet V, Gopal S, Gratten J, Grove J, de Haan L, Hammer C, Hamshere M, Hansen M, Hansen T, Haroutunian V, Hartmann A, Henskens F, Herms S, Hirschhorn J, Hoffmann P, Hofman A, Hollegaard M, Hougaard D, Ikeda M, Joa I, Julià A, Kahn R, Kalaydjieva L, Karachanak-Yankova S, Karjalainen J, Kavanagh D, Keller M, Kelly B, Kennedy J, Khrunin A, Kim Y, Klovins J, Knowles J, Konte B, Kucinskas V, Kucinskiene Z, Kuzelova-Ptackova H, Kähler A, Laurent C, Keong J, Lee S, Legge S, Lerer B, Li M, Li T, Liang KY, Lieberman J, Limborska S, Loughland C, Lubinski J, Lönnqvist J, Macek M, Magnusson P, Maher B, Maier W, Mallet J, Marsal S, Mattheisen M, Mattingsdal M, McCarley R, McDonald C, McIntosh A, Meier S, Meijer C, Melegh B, Melle I, Mesholam-Gately R, Metspalu A, Michie P, Milani L, Milanova V, Mokrab Y, Morris D, Mors O, Mortensen P, Murphy K, Murray R, Myin-Germeys I, Müller-Myhsok B, Nelis M, Nenadic I, Nertney D, Nestadt G, Nicodemus K, Nikitina-Zake L, Nisenbaum L, Nordin A, O’Callaghan E, O’Dushlaine C, O’Neill F, Oh SY, Olincy A, Olsen L, Van Os J, Pantelis C, Papadimitriou G, Papiol S, Parkhomenko E, Pato M, Paunio T, Pejovic-Milovancevic M, Perkins D, Pietiläinen O, Pimm J, Pocklington A, Powell J, Price A, Pulver A, Purcell S, Quested D, Rasmussen H, Reichenberg A, Reimers M, Richards A, Roffman J, Roussos P, Ruderfer D, Salomaa V, Sanders A, Schall U, Schubert C, Schulze T, Schwab S, Scolnick E, Scott R, Seidman L, Shi J, Sigurdsson E, Silagadze T, Silverman J, Sim K, Slominsky P, Smoller J, So HC, Spencer C, Stahl E, Stefansson H, Steinberg S, Stogmann E, Straub R, Strengman E, Strohmaier J, Stroup T, Subramaniam M, Suvisaari J, Svrakic D, Szatkiewicz J, Söderman E, Thirumalai S, Toncheva D, Tooney P, Tosato S, Veijola J, Waddington J, Walsh D, Wang D, Wang Q, Webb B, Weiser M, Wildenauer D, Williams N, Williams S, Witt S, Wolen A, Wong E, Wormley B, Wu J, Xi H, Zai C, Zheng X, Zimprich F, Wray N, Stefansson K, Visscher P, Adolfsson R, Andreassen O, Blackwood D, Bramon E, Buxbaum J, Børglum A, Cichon S, Darvasi A, Domenici E, Ehrenreich H, Esko T, Gejman P, Gill M, Gurling H, Hultman C, Iwata N, Jablensky A, Jönsson E, Kendler K, Kirov G, Knight J, Lencz T, Levinson D, Li Q, Liu J, Malhotra A, McCarroll S, McQuillin A, Moran J, Mortensen P, Mowry B, Nöthen M, Ophoff R, Owen M, Palotie A, Pato C, Petryshen T, Posthuma D, Rietschel M, Riley B, Rujescu D, Sham P, Sklar P, St. Clair D, Weinberger D, Wendland J, Werge T, Daly M, Sullivan P, O’Donovan M, Ripke S, O’Dushlaine C, Chambert K, Moran JL, Kähler AK, Akterin S, Bergen S, Magnusson PK, Neale BM, Ruderfer D, Scolnick E, Purcell S, McCarroll S, Sklar P, Hultman CM, Sullivan PF. Partitioning heritability of regulatory and cell-type-specific variants across 11 common diseases. Am J Hum Genet 2014; 95:535-52. [PMID: 25439723 PMCID: PMC4225595 DOI: 10.1016/j.ajhg.2014.10.004] [Citation(s) in RCA: 440] [Impact Index Per Article: 40.0] [Received: 07/28/2014] [Accepted: 10/02/2014] [Indexed: 10/25/2022]
Abstract
Regulatory and coding variants are known to be enriched with associations identified by genome-wide association studies (GWASs) of complex disease, but their contributions to trait heritability are currently unknown. We applied variance-component methods to imputed genotype data for 11 common diseases to partition the heritability explained by genotyped SNPs ($h_g^2$) across functional categories (while accounting for shared variance due to linkage disequilibrium). Extensive simulations showed that in contrast to current estimates from GWAS summary statistics, the variance-component approach partitions heritability accurately under a wide range of complex-disease architectures. Across the 11 diseases DNaseI hypersensitivity sites (DHSs) from 217 cell types spanned 16% of imputed SNPs (and 24% of genotyped SNPs) but explained an average of 79% (SE = 8%) of $h_g^2$ from imputed SNPs (5.1× enrichment; $p = 3.7 \times 10^{-17}$) and 38% (SE = 4%) of $h_g^2$ from genotyped SNPs (1.6× enrichment; $p = 1.0 \times 10^{-4}$). Further enrichment was observed at enhancer DHSs and cell-type-specific DHSs. In contrast, coding variants, which span 1% of the genome, explained <10% of $h_g^2$ despite having the highest enrichment. We replicated these findings but found no significant contribution from rare coding variants in independent schizophrenia cohorts genotyped on GWAS and exome chips. Our results highlight the value of analyzing components of heritability to unravel the functional architecture of common disease.
18
den Hollander AI, de Jong EK. Highly penetrant alleles in age-related macular degeneration. Cold Spring Harb Perspect Med 2014; 5:a017202. [PMID: 25377141 DOI: 10.1101/cshperspect.a017202] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/19/2022]
Abstract
Age-related macular degeneration (AMD) is a complex disease caused by a combination of genetic and environmental factors. Genome-wide association studies have identified several common genetic variants associated with AMD, which together account for 15%-65% of the heritability of AMD. Multiple hypotheses to clarify the unexplained portion of genetic variance have been proposed, such as gene-gene interactions, gene-environment interactions, structural variations, epigenetics, and rare variants. Several studies support a role for rare variants with large effect sizes in the pathogenesis of AMD. In this work, we review the methods that can be used to detect rare variants in common diseases, as well as the recent progress that has been made in the identification of rare variants in AMD. In addition, the relevance of these rare variants for diagnosis, prognosis, and treatment of AMD is highlighted.
Affiliation(s)
- Anneke I den Hollander: Department of Ophthalmology and Department of Human Genetics, Radboud University Medical Centre, Nijmegen, The Netherlands
- Eiko K de Jong: Department of Ophthalmology and Department of Human Genetics, Radboud University Medical Centre, Nijmegen, The Netherlands
19
Sadee W, Hartmann K, Seweryn M, Pietrzak M, Handelman SK, Rempala GA. Missing heritability of common diseases and treatments outside the protein-coding exome. Hum Genet 2014; 133:1199-1215. [PMID: 25107510 PMCID: PMC4169001 DOI: 10.1007/s00439-014-1476-7] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Received: 03/27/2014] [Accepted: 07/23/2014] [Indexed: 02/07/2023]
Abstract
Genetic factors strongly influence risk of common human diseases and treatment outcomes but the causative variants remain largely unknown; this gap has been called the 'missing heritability'. We propose several hypotheses that in combination have the potential to narrow the gap. First, given a multi-stage path from wellness to disease, we propose that common variants under positive evolutionary selection represent normal variation and gate the transition between wellness and an 'off-well' state, revealing adaptations to changing environmental conditions. In contrast, genome-wide association studies (GWAS) focus on deleterious variants conveying disease risk, accelerating the path from off-well to illness and finally specific diseases, while common 'normal' variants remain hidden in the noise. Second, epistasis (dynamic gene-gene interactions) likely assumes a central role in adaptations and evolution; yet, GWAS analyses currently are poorly designed to reveal epistasis. As gene regulation is germane to adaptation, we propose that epistasis among common normal regulatory variants, or between common variants and less frequent deleterious variants, can have strong protective or deleterious phenotypic effects. These gene-gene interactions can be highly sensitive to environmental stimuli and could account for large differences in drug response between individuals. Residing largely outside the protein-coding exome, common regulatory variants affect either transcription of coding and non-coding RNAs (regulatory SNPs, or rSNPs) or RNA functions and processing (structural RNA SNPs, or srSNPs). Third, with the vast majority of causative variants yet to be discovered, GWAS rely on surrogate markers, a confounding factor aggravated by the presence of more than one causative variant per gene and by epistasis. We propose that the confluence of these factors may be responsible to a large extent for the observed heritability gap.
Affiliation(s)
- Wolfgang Sadee: Department of Pharmacology, Center for Pharmacogenomics, College of Medicine, The Ohio State University Wexner Medical Center, 5184A Graves Hall, 333 West 10th Avenue, Columbus, OH, 43210, USA
20
Sun H, Wang S. A power set-based statistical selection procedure to locate susceptible rare variants associated with complex traits with sequencing data. Bioinformatics 2014; 30:2317-23. [PMID: 24755303 DOI: 10.1093/bioinformatics/btu207] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 02/05/2023]
Abstract
MOTIVATION: Existing association methods for rare variants from sequencing data have focused on aggregating variants in a gene or a genetic region because analysing individual rare variants is underpowered. However, these existing rare variant detection methods are not able to identify which of the rare variants in a gene or a genetic region are associated with the complex diseases or traits. Once phenotypic associations of a gene or a genetic region are identified, the natural next step in the association study with sequencing data is to locate the susceptible rare variants within the gene or the genetic region. RESULTS: In this article, we propose a power set-based statistical selection procedure that is able to identify the locations of the potentially susceptible rare variants within a disease-related gene or a genetic region. The selection performance of the proposed selection procedure was evaluated through simulation studies, where we demonstrated the feasibility and superior power over several comparable existing methods. In particular, the proposed method is able to handle the mixed effects when both risk and protective variants are present in a gene or a genetic region. The proposed selection procedure was also applied to the sequence data on the ANGPTL gene family from the Dallas Heart Study to identify potentially susceptible rare variants within the trait-related genes. AVAILABILITY AND IMPLEMENTATION: An R package 'rvsel' can be downloaded from http://www.columbia.edu/~sw2206/ and http://statsun.pusan.ac.kr.
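The idea of searching a power set of variants can be illustrated with a minimal sketch. This is not the 'rvsel' package or the authors' procedure: the function name `best_subset`, the score statistic n·r², and the toy data are all assumptions chosen for illustration, and the sketch ignores protective variants and the multiple-testing correction that a search over subsets would require.

```python
from itertools import combinations

import numpy as np


def best_subset(genotypes, phenotype):
    """Exhaustively score every non-empty subset of variants.

    Hypothetical helper: each subset is collapsed into a burden score
    (sum of minor-allele counts) and scored with n * r^2, where r is the
    correlation between burden and phenotype. Returns (statistic, subset).
    """
    G = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    n, m = G.shape
    best = (0.0, ())
    for k in range(1, m + 1):
        for subset in combinations(range(m), k):
            burden = G[:, list(subset)].sum(axis=1)
            if burden.std() == 0:  # no variation, correlation undefined
                continue
            r = np.corrcoef(burden, y)[0, 1]
            stat = n * r ** 2
            if stat > best[0]:
                best = (stat, subset)
    return best


# Toy data (invented): 6 individuals, 3 variants; the phenotype is driven
# by variants 0 and 2 only.
G = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
])
y = G[:, 0] + G[:, 2]
stat, subset = best_subset(G, y)
print(subset, stat)
```

The search visits 2^m − 1 subsets, so this brute-force form is only practical for a handful of variants per gene.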
Affiliation(s)
- Hokeun Sun: Department of Statistics, Pusan National University, Pusan 609-735, Korea and Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10032, USA
- Shuang Wang: Department of Statistics, Pusan National University, Pusan 609-735, Korea and Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10032, USA
21
Okazaki S, Watanabe Y, Hishimoto A, Sasada T, Mouri K, Shiroiwa K, Eguchi N, Ratta-Apha W, Otsuka I, Nunokawa A, Kaneko N, Shibuya M, Someya T, Shirakawa O, Sora I. Association analysis of putative cis-acting polymorphisms of interleukin-19 gene with schizophrenia. Prog Neuropsychopharmacol Biol Psychiatry 2014; 50:151-6. [PMID: 24361379 DOI: 10.1016/j.pnpbp.2013.12.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 08/30/2013] [Revised: 12/10/2013] [Accepted: 12/10/2013] [Indexed: 11/17/2022]
Abstract
BACKGROUND: Genome-wide association studies (GWAS) and gene expression analyses have revealed that single nucleotide polymorphisms (SNPs) associated with multifactorial diseases, such as schizophrenia, are significantly more likely to be associated with expression quantitative trait loci (eQTL). It was recently suggested that an immune system imbalance plays an important role in the pathogenesis of schizophrenia. Interleukin-19 is a novel cytokine that may play multiple roles in immune regulation and various diseases. METHOD: We selected eight tag SNPs in the eQTL of the IL-19 gene. Seven of the SNPs are putative cis-acting SNPs. Then, we conducted a case-control study using two independent samples. The first sample comprised 567 schizophrenia patients and 710 controls, and the second sample comprised 677 schizophrenia patients and 667 controls. RESULT: We identified the TGAA haplotype as being significantly associated with schizophrenia (p=0.0036 and corrected p=0.0264), although a combined analysis of the TGAA haplotype with the replication samples exhibited a nominally significant difference (p=0.022 and corrected p=0.235). CONCLUSIONS: These results suggest that the IL-19 gene might slightly contribute to the genetic risk of schizophrenia. Thus, further research on the association of eQTL SNPs with schizophrenia is warranted.
Affiliation(s)
- Satoshi Okazaki: Department of Psychiatry, Kobe University Graduate School of Medicine, Kobe, Japan
- Yuichiro Watanabe: Department of Psychiatry, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Akitoyo Hishimoto: Department of Psychiatry, Kobe University Graduate School of Medicine, Kobe, Japan
- Toru Sasada: Department of Psychiatry, Kobe University Graduate School of Medicine, Kobe, Japan
- Kentaro Mouri: Department of Psychiatry, Kobe University Graduate School of Medicine, Kobe, Japan
- Kyoichi Shiroiwa: Department of Psychiatry, Kobe University Graduate School of Medicine, Kobe, Japan
- Noriomi Eguchi: Department of Psychiatry, Kobe University Graduate School of Medicine, Kobe, Japan
- Woraphat Ratta-Apha: Department of Psychiatry, Kobe University Graduate School of Medicine, Kobe, Japan
- Ikuo Otsuka: Department of Psychiatry, Kobe University Graduate School of Medicine, Kobe, Japan
- Ayako Nunokawa: Department of Psychiatry, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Naoshi Kaneko: Department of Psychiatry, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Masako Shibuya: Department of Psychiatry, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Toshiyuki Someya: Department of Psychiatry, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
- Osamu Shirakawa: Department of Neuropsychiatry, Kinki University School of Medicine, Osaka, Japan
- Ichiro Sora: Department of Psychiatry, Kobe University Graduate School of Medicine, Kobe, Japan
22
Li B, Liu DJ, Leal SM. Identifying rare variants associated with complex traits via sequencing. Curr Protoc Hum Genet 2014; Chapter 1:Unit 1.26. [PMID: 23853079 DOI: 10.1002/0471142905.hg0126s78] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Indexed: 12/12/2022]
Abstract
Although genome-wide association studies have been successful in detecting associations with common variants, there is currently an increasing interest in identifying low-frequency and rare variants associated with complex traits. Next-generation sequencing technologies make it feasible to survey the full spectrum of genetic variation in coding regions or the entire genome. However, association analysis for rare variants is challenging, and traditional methods are ineffective, owing to the low frequency of rare variants coupled with allelic heterogeneity. Recently a battery of new statistical methods has been proposed for identifying rare variants associated with complex traits. These methods test for associations by aggregating multiple rare variants across a gene or a genomic region or among a group of variants in the genome. In this unit, we describe key concepts for rare variant association for complex traits, survey some of the recent methods, discuss their statistical power under various scenarios, and provide practical guidance on analyzing next-generation sequencing data for identifying rare variants associated with complex traits.
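The aggregation idea described in this abstract can be sketched as a simple burden test. The following is an illustrative sketch only, not a method taken from the unit itself: the function names, the frequency cutoff of 0.01, and the toy data are assumptions, and the statistic n·r² is used as a basic score statistic that is approximately chi-square with one degree of freedom under the null for a quantitative trait.

```python
import numpy as np


def burden_scores(genotypes, maf, maf_cutoff=0.01):
    """Collapse a gene's rare variants into one score per individual.

    genotypes: (individuals x variants) matrix of minor-allele counts.
    maf: minor-allele frequency of each variant.
    maf_cutoff: hypothetical threshold below which a variant counts as rare.
    """
    G = np.asarray(genotypes, dtype=float)
    rare = np.asarray(maf, dtype=float) < maf_cutoff
    return G[:, rare].sum(axis=1)


def burden_statistic(burden, phenotype):
    """Score statistic n * r^2 for association between burden and trait."""
    b = np.asarray(burden, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    r = np.corrcoef(b, y)[0, 1]
    return len(y) * r ** 2


# Toy data (invented): 4 individuals, 3 variants; variant 1 is common
# and is therefore excluded from the burden.
G = [[1, 2, 0],
     [0, 1, 0],
     [0, 0, 1],
     [0, 2, 0]]
maf = [0.005, 0.30, 0.002]
y = [2.0, 0.0, 2.0, 0.0]

b = burden_scores(G, maf)
print(b, burden_statistic(b, y))
```

A plain sum assumes all rare variants act in the same direction; when risk and protective variants are mixed, their effects can cancel in the burden score, which is the situation variance-component and selection-based methods are meant to address.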
Affiliation(s)
- Bingshan Li: Department of Molecular Physiology and Biophysics, Center for Human Genetics Research, Vanderbilt University, Nashville, Tennessee, USA
23
Schulte EC, Stahl I, Czamara D, Ellwanger DC, Eck S, Graf E, Mollenhauer B, Zimprich A, Lichtner P, Haubenberger D, Pirker W, Brücke T, Bereznai B, Molnar MJ, Peters A, Gieger C, Müller-Myhsok B, Trenkwalder C, Winkelmann J. Rare variants in PLXNA4 and Parkinson's disease. PLoS One 2013; 8:e79145. [PMID: 24244438 PMCID: PMC3823607 DOI: 10.1371/journal.pone.0079145] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 07/31/2013] [Accepted: 09/18/2013] [Indexed: 11/18/2022]
Abstract
Approximately 20% of individuals with Parkinson's disease (PD) report a positive family history. Yet, a large portion of causal and disease-modifying variants is still unknown. We used exome sequencing in two affected individuals from a family with late-onset familial PD followed by frequency assessment in 975 PD cases and 1014 ethnically-matched controls and linkage analysis to identify potentially causal variants. Based on the predicted penetrance and the frequencies, a variant in PLXNA4 proved to be the best candidate and PLXNA4 was screened for additional variants in 862 PD cases and 940 controls, revealing an excess of rare non-synonymous coding variants in PLXNA4 in individuals with PD. Although we cannot conclude that the variant in PLXNA4 is indeed the causative variant, these findings are interesting in the light of a surfacing role of axonal guidance mechanisms in neurodegenerative disorders but, at the same time, highlight the difficulties encountered in the study of rare variants identified by next-generation sequencing in diseases with autosomal dominant or complex patterns of inheritance.
Affiliation(s)
- Eva C. Schulte: Neurologische Klinik und Poliklinik, Klinikum rechts der Isar, Technische Universität, München, Munich, Germany; Institut für Humangenetik, Helmholtz Zentrum München, Munich, Germany
- Immanuel Stahl: Neurologische Klinik und Poliklinik, Klinikum rechts der Isar, Technische Universität, München, Munich, Germany; Institut für Humangenetik, Helmholtz Zentrum München, Munich, Germany
- Darina Czamara: Max-Planck Institut für Psychiatrie, Munich, Germany; Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
- Daniel C. Ellwanger: Chair for Genome-Oriented Bioinformatics, Technische Universität München, Life and Food Science Center Weihenstephan, Freising-Weihenstephan, Germany
- Sebastian Eck: Institut für Humangenetik, Helmholtz Zentrum München, Munich, Germany
- Elisabeth Graf: Institut für Humangenetik, Helmholtz Zentrum München, Munich, Germany
- Brit Mollenhauer: Paracelsus Elena Klinik, Kassel, Germany; Neurochirurgische Klinik, Georg August Universität, Göttingen, Germany
- Peter Lichtner: Institut für Humangenetik, Helmholtz Zentrum München, Munich, Germany; Institut für Humangenetik, Technische Universität München, Munich, Germany
- Walter Pirker: Department of Neurology, Medical University of Vienna, Vienna, Austria
- Thomas Brücke: Department of Neurology, Wilhelminenspital, Vienna, Austria
- Benjamin Bereznai: Center for Molecular Neurology, Department of Neurology, Semmelweis University, Budapest, Hungary
- Maria J. Molnar: Center for Molecular Neurology, Department of Neurology, Semmelweis University, Budapest, Hungary
- Annette Peters: Institute for Epidemiology II, Helmholtz Zentrum München, Munich, Germany
- Christian Gieger: Institute for Genetic Epidemiology, Helmholtz Zentrum München, Munich, Germany
- Bertram Müller-Myhsok: Max-Planck Institut für Psychiatrie, Munich, Germany; Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
- Claudia Trenkwalder: Paracelsus Elena Klinik, Kassel, Germany; Neurochirurgische Klinik, Georg August Universität, Göttingen, Germany
- Juliane Winkelmann: Neurologische Klinik und Poliklinik, Klinikum rechts der Isar, Technische Universität, München, Munich, Germany; Institut für Humangenetik, Helmholtz Zentrum München, Munich, Germany; Munich Cluster for Systems Neurology (SyNergy), Munich, Germany; Institut für Humangenetik, Technische Universität München, Munich, Germany; Department of Neurology and Neurosciences, Stanford University, Palo Alto, California, United States of America
24
Seneviratne C, Franklin J, Beckett K, Ma JZ, Ait-Daoud N, Payne TJ, Johnson BA, Li MD. Association, interaction, and replication analysis of genes encoding serotonin transporter and 5-HT3 receptor subunits A and B in alcohol dependence. Hum Genet 2013; 132:1165-76. [PMID: 23757001 PMCID: PMC3775919 DOI: 10.1007/s00439-013-1319-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 01/17/2013] [Accepted: 05/26/2013] [Indexed: 12/12/2022]
Abstract
On the basis of the converging evidence showing regulation of drinking behavior by 5-HT3AB receptors and the serotonin transporter, we hypothesized that the interactive effects of genetic variations in the genes HTR3A, HTR3B, and SLC6A4 confer greater susceptibility to alcohol dependence (AD) than do their effects individually. We examined the associations of AD with 22 SNPs across HTR3A, HTR3B, and two functional variants in SLC6A4 in 500 AD and 280 healthy control individuals of European descent. We found that the alleles of the low-frequency SNPs rs33940208:T in HTR3A and rs2276305:A in HTR3B were inversely and nominally significantly associated with AD, with odds ratios (OR) and 95% confidence intervals of 0.212 (0.073, 0.616; P = 0.004) and 0.261 (0.088, 0.777; P = 0.016), respectively. Further, our gene-by-gene interaction analysis revealed that two four-variant models that differed by only one SNP carried a risk for AD (empirical P < 1 × 10⁻⁶ for prediction accuracy of the two models based on 10⁶ permutations). Subsequent analysis of these two interaction models revealed an OR of 2.71 and 2.80, respectively, for AD (P < 0.001) in carriers of genotype combinations 5'-HTTLPR:LL/LS(SLC6A4)-rs1042173:TT/TG(SLC6A4)-rs1176744:AC(HTR3B)-rs3782025:AG(HTR3B) and 5'-HTTLPR:LL/LS(SLC6A4)-rs10160548:GT/TT(HTR3A)-rs1176744:AC(HTR3B)-rs3782025:AG(HTR3B). Combining all five genotypes resulted in an OR of 3.095 (P = 2.0 × 10⁻⁴) for AD. Inspired by these findings, we conducted the analysis in an independent sample, OZ-ALC-GWAS (N = 6699), obtained from the NIH dbGAP database, which confirmed the findings not only for all three risk genotype combinations (Z = 4.384, P = 1.0 × 10⁻⁵; Z = 3.155, P = 1.6 × 10⁻³; and Z = 3.389, P = 7.0 × 10⁻⁴, respectively) but also for the protective effects of rs33940208:T (χ² = 3.316, P = 0.0686) and rs2276305:A (χ² = 7.224, P = 0.007).
These findings reveal significant interactive effects among variants in SLC6A4-HTR3A-HTR3B affecting AD. Further studies are needed to confirm these findings and characterize the molecular mechanisms underlying these effects.
Affiliation(s)
- Chamindi Seneviratne
- Department of Psychiatry and Neurobehavioral Sciences, University of Virginia, 1670 Discovery Drive, Charlottesville, VA 22911, USA
- Jason Franklin
- Department of Psychiatry and Neurobehavioral Sciences, University of Virginia, 1670 Discovery Drive, Charlottesville, VA 22911, USA
- Katherine Beckett
- Department of Psychiatry and Neurobehavioral Sciences, University of Virginia, 1670 Discovery Drive, Charlottesville, VA 22911, USA
- Jennie Z. Ma
- Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
- Nassima Ait-Daoud
- Department of Psychiatry and Neurobehavioral Sciences, University of Virginia, 1670 Discovery Drive, Charlottesville, VA 22911, USA
- Thomas J. Payne
- ACT Center for Tobacco Treatment, Education and Research, Department of Otolaryngology, University of Mississippi Medical Center, Jackson, USA
- Bankole A. Johnson
- Department of Psychiatry and Neurobehavioral Sciences, University of Virginia, 1670 Discovery Drive, Charlottesville, VA 22911, USA
- Ming D. Li
- Department of Psychiatry and Neurobehavioral Sciences, University of Virginia, 1670 Discovery Drive, Charlottesville, VA 22911, USA
|
25
|
Hunt KA, Mistry V, Bockett NA, Ahmad T, Ban M, Barker JN, Barrett JC, Blackburn H, Brand O, Burren O, Capon F, Compston A, Gough SCL, Jostins L, Kong Y, Lee JC, Lek M, MacArthur DG, Mansfield JC, Mathew CG, Mein CA, Mirza M, Nutland S, Onengut-Gumuscu S, Papouli E, Parkes M, Rich SS, Sawcer S, Satsangi J, Simmonds MJ, Trembath RC, Walker NM, Wozniak E, Todd JA, Simpson MA, Plagnol V, van Heel DA. Negligible impact of rare autoimmune-locus coding-region variants on missing heritability. Nature 2013; 498:232-5. [PMID: 23698362 PMCID: PMC3736321 DOI: 10.1038/nature12170] [Citation(s) in RCA: 145] [Impact Index Per Article: 12.1] [Received: 02/27/2013] [Accepted: 04/08/2013] [Indexed: 02/06/2023]
Abstract
Genome-wide association studies (GWAS) have identified common variants of modest-effect size at hundreds of loci for common autoimmune diseases; however, a substantial fraction of heritability remains unexplained, to which rare variants may contribute. To discover rare variants and test them for association with a phenotype, most studies re-sequence a small initial sample size and then genotype the discovered variants in a larger sample set. This approach fails to analyse a large fraction of the rare variants present in the entire sample set. Here we perform simultaneous amplicon-sequencing-based variant discovery and genotyping for coding exons of 25 GWAS risk genes in 41,911 UK residents of white European origin, comprising 24,892 subjects with six autoimmune disease phenotypes and 17,019 controls, and show that rare coding-region variants at known loci have a negligible role in common autoimmune disease susceptibility. These results do not support the rare-variant synthetic genome-wide-association hypothesis (in which unobserved rare causal variants lead to association detected at common tag variants). Many known autoimmune disease risk loci contain multiple, independently associated, common and low-frequency variants, and so genes at these loci are a priori stronger candidates for harbouring rare coding-region variants than other genes. Our data indicate that the missing heritability for common autoimmune diseases may not be attributable to the rare coding-region variant portion of the allelic spectrum, but perhaps, as others have proposed, may be a result of many common-variant loci of weak effect.
Affiliation(s)
- Karen A Hunt
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AT, UK
|
26
|
Next-generation sequencing diagnostics for neurological diseases/disorders: from a clinical perspective. Hum Genet 2013; 132:721-34. [PMID: 23525706 DOI: 10.1007/s00439-013-1287-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 12/03/2012] [Accepted: 03/02/2013] [Indexed: 12/13/2022]
Abstract
Neurological diseases encompass a broad, heterogeneous group of disorders ranging from pediatric neurodevelopmental diseases to late-onset neurodegenerative diseases, most of which are poorly understood and few of which are curable. Most of these diseases have a genetic basis and thus are expected to be amenable to genetic or genomic analysis by next-generation sequencing (NGS). While the advancement of contemporary technologies (such as NGS) is exciting, translating this tool into actual benefit for patients and clinicians can be challenging. In a clinical setting, a sequencing test that is fast, non-invasive, cheap and perfectly specific would be ideal. In practice, however, there are several hurdles and caveats to consider before NGS diagnostic testing can be optimally applied. Proper definition of the clinical phenotype, selection of the most appropriate subjects and clinical setting, optimization of both the sensitivity and the specificity of the test, evaluation of the available infrastructure and expertise, and consideration of economic, ethical and legal issues are vital to the final application of NGS diagnostic screening in the clinic.
|
27
|
Bahcall O. Rare-variant association methods. Nat Genet 2012. [DOI: 10.1038/ng.2458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|