201
Sham PC, Purcell SM. Statistical power and significance testing in large-scale genetic studies. Nat Rev Genet 2014; 15:335-46. [PMID: 24739678 DOI: 10.1038/nrg3706] [Citation(s) in RCA: 383] [Impact Index Per Article: 34.8] [Indexed: 12/13/2022]
Abstract
Significance testing was developed as an objective method for summarizing statistical evidence for a hypothesis. It has been widely adopted in genetic studies, including genome-wide association studies and, more recently, exome sequencing studies. However, significance testing in both genome-wide and exome-wide studies must adopt stringent significance thresholds to allow for multiple testing, and it is useful only when studies have adequate statistical power, which depends on the characteristics of the phenotype and the putative genetic variant, as well as the study design. Here, we review the principles and applications of significance testing and power calculation, including recently proposed gene-based tests for rare variants.
Affiliation(s)
- Pak C Sham
- Centre for Genomic Sciences, Jockey Club Building for Interdisciplinary Research; State Key Laboratory of Brain and Cognitive Sciences, and Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Shaun M Purcell
- [1] Center for Statistical Genetics, Icahn School of Medicine at Mount Sinai, New York 10029-6574, USA. [2] Center for Human Genetic Research, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts 02114, USA
202
Liu JZ, Anderson CA. Genetic studies of Crohn's disease: past, present and future. Best Pract Res Clin Gastroenterol 2014; 28:373-86. [PMID: 24913378 PMCID: PMC4075408 DOI: 10.1016/j.bpg.2014.04.009] [Citation(s) in RCA: 70] [Impact Index Per Article: 6.4] [Received: 02/24/2014] [Revised: 04/14/2014] [Accepted: 04/24/2014] [Indexed: 01/31/2023]
Abstract
The exact aetiology of Crohn's disease is unknown, though it is clear from early epidemiological studies that a combination of genetic and environmental risk factors contributes to an individual's disease susceptibility. Here, we review the history of gene-mapping studies of Crohn's disease, from the linkage-based studies that first implicated the NOD2 locus, through to modern-day genome-wide association studies that have discovered over 140 loci associated with Crohn's disease and yielded novel insights into the biological pathways underlying pathogenesis. We describe on-going and future gene-mapping studies that utilise next generation sequencing technology to pinpoint causal variants and identify rare genetic variation underlying Crohn's disease risk. We comment on the utility of genetic markers for predicting an individual's disease risk and discuss their potential for identifying novel drug targets and influencing disease management. Finally, we describe how these studies have shaped and continue to shape our understanding of the genetic architecture of Crohn's disease.
Affiliation(s)
- Jimmy Z Liu
- The Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK
203
Thomson PA, Parla JS, McRae AF, Kramer M, Ramakrishnan K, Yao J, Soares DC, McCarthy S, Morris SW, Cardone L, Cass S, Ghiban E, Hennah W, Evans KL, Rebolini D, Millar JK, Harris SE, Starr JM, MacIntyre DJ, McIntosh AM, Watson JD, Deary IJ, Visscher PM, Blackwood DH, McCombie WR, Porteous DJ. 708 Common and 2010 rare DISC1 locus variants identified in 1542 subjects: analysis for association with psychiatric disorder and cognitive traits. Mol Psychiatry 2014; 19:668-75. [PMID: 23732877 PMCID: PMC4031635 DOI: 10.1038/mp.2013.68] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.9] [Received: 10/29/2012] [Revised: 04/22/2013] [Accepted: 04/23/2013] [Indexed: 12/16/2022]
Abstract
A balanced t(1;11) translocation that transects the Disrupted in schizophrenia 1 (DISC1) gene shows genome-wide significant linkage for schizophrenia and recurrent major depressive disorder (rMDD) in a single large Scottish family, but genome-wide and exome sequencing-based association studies have not supported a role for DISC1 in psychiatric illness. To explore DISC1 in more detail, we sequenced 528 kb of the DISC1 locus in 653 cases and 889 controls. We report 2718 validated single-nucleotide polymorphisms (SNPs) of which 2010 have a minor allele frequency of <1%. Only 38% of these variants are reported in the 1000 Genomes Project European subset. This suggests that many DISC1 SNPs remain undiscovered and are essentially private. Rare coding variants identified exclusively in patients were found in likely functional protein domains. Significant region-wide association was observed between rs16856199 and rMDD (P=0.026, unadjusted P=6.3 × 10⁻⁵, OR=3.48). This was not replicated in additional recurrent major depression samples (replication P=0.11). Combined analysis of both the original and replication set supported the original association (P=0.0058, OR=1.46). Evidence for segregation of this variant with disease in families was limited to those of rMDD individuals referred from primary care. Burden analysis for coding and non-coding variants gave nominal associations with diagnosis and measures of mood and cognition. Together, these observations are likely to generalise to other candidate genes for major mental illness and may thus provide guidelines for the design of future studies.
Affiliation(s)
- P A Thomson
- Medical Genetics Section, University of Edinburgh Molecular Medicine Centre, MRC Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, UK
- Centre for Cognitive Ageing and Cognitive Epidemiology, Edinburgh, UK
- J S Parla
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- A F McRae
- University of Queensland Diamantina Institute, The University of Queensland, Princess Alexandra Hospital, Brisbane, QLD, Australia
- M Kramer
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- K Ramakrishnan
- Medical Genetics Section, University of Edinburgh Molecular Medicine Centre, MRC Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, UK
- J Yao
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- D C Soares
- Medical Genetics Section, University of Edinburgh Molecular Medicine Centre, MRC Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, UK
- S McCarthy
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- S W Morris
- Medical Genetics Section, University of Edinburgh Molecular Medicine Centre, MRC Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, UK
- L Cardone
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- S Cass
- Medical Genetics Section, University of Edinburgh Molecular Medicine Centre, MRC Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, UK
- E Ghiban
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- W Hennah
- Medical Genetics Section, University of Edinburgh Molecular Medicine Centre, MRC Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, UK
- Institute for Molecular Medicine, Finland FIMM, University of Helsinki, Helsinki, Finland
- K L Evans
- Medical Genetics Section, University of Edinburgh Molecular Medicine Centre, MRC Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, UK
- Centre for Cognitive Ageing and Cognitive Epidemiology, Edinburgh, UK
- D Rebolini
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- J K Millar
- Medical Genetics Section, University of Edinburgh Molecular Medicine Centre, MRC Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, UK
- S E Harris
- Medical Genetics Section, University of Edinburgh Molecular Medicine Centre, MRC Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, UK
- Centre for Cognitive Ageing and Cognitive Epidemiology, Edinburgh, UK
- J M Starr
- Centre for Cognitive Ageing and Cognitive Epidemiology, Edinburgh, UK
- D J MacIntyre
- Division of Psychiatry, University of Edinburgh, Royal Edinburgh Hospital, Edinburgh, UK
- Generation Scotland
- Medical Genetics Section, University of Edinburgh Molecular Medicine Centre, MRC Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, UK
- Centre for Cognitive Ageing and Cognitive Epidemiology, Edinburgh, UK
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- University of Queensland Diamantina Institute, The University of Queensland, Princess Alexandra Hospital, Brisbane, QLD, Australia
- Institute for Molecular Medicine, Finland FIMM, University of Helsinki, Helsinki, Finland
- Division of Psychiatry, University of Edinburgh, Royal Edinburgh Hospital, Edinburgh, UK
- Generation Scotland, A Collaboration between the University Medical Schools and NHS, Aberdeen, Dundee, Edinburgh and Glasgow, UK
- Queensland Brain Institute, The University of Queensland, Brisbane, QLD, Australia
- A M McIntosh
- Division of Psychiatry, University of Edinburgh, Royal Edinburgh Hospital, Edinburgh, UK
- J D Watson
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- I J Deary
- Centre for Cognitive Ageing and Cognitive Epidemiology, Edinburgh, UK
- P M Visscher
- University of Queensland Diamantina Institute, The University of Queensland, Princess Alexandra Hospital, Brisbane, QLD, Australia
- Queensland Brain Institute, The University of Queensland, Brisbane, QLD, Australia
- D H Blackwood
- Division of Psychiatry, University of Edinburgh, Royal Edinburgh Hospital, Edinburgh, UK
- W R McCombie
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- D J Porteous
- Medical Genetics Section, University of Edinburgh Molecular Medicine Centre, MRC Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, UK
- Centre for Cognitive Ageing and Cognitive Epidemiology, Edinburgh, UK
204
Lohmueller KE. The impact of population demography and selection on the genetic architecture of complex traits. PLoS Genet 2014; 10:e1004379. [PMID: 24875776 PMCID: PMC4038606 DOI: 10.1371/journal.pgen.1004379] [Citation(s) in RCA: 105] [Impact Index Per Article: 9.5] [Received: 06/21/2013] [Accepted: 03/28/2014] [Indexed: 02/06/2023] Open
Abstract
Population genetic studies have found evidence for dramatic population growth in recent human history. It is unclear how this recent population growth, combined with the effects of negative natural selection, has affected patterns of deleterious variation, as well as the number, frequency, and effect sizes of mutations that contribute risk to complex traits. Because researchers are performing exome sequencing studies aimed at uncovering the role of low-frequency variants in the risk of complex traits, this topic is of critical importance. Here I use simulations under population genetic models where a proportion of the heritability of the trait is accounted for by mutations in a subset of the exome. I show that recent population growth increases the proportion of nonsynonymous variants segregating in the population, but does not affect the genetic load relative to a population that did not expand. Under a model where a mutation's effect on a trait is correlated with its effect on fitness, rare variants explain a greater portion of the additive genetic variance of the trait in a population that has recently expanded than in a population that did not recently expand. Further, when using a single-marker test, for a given false-positive rate and sample size, recent population growth decreases the expected number of significant associations with the trait relative to the number detected in a population that did not expand. However, in a model where there is no correlation between a mutation's effect on fitness and the effect on the trait, common variants account for much of the additive genetic variance, regardless of demography. Moreover, here demography does not affect the number of significant associations detected. These findings suggest recent population history may be an important factor influencing the power of association tests and in accounting for the missing heritability of certain complex traits.
Affiliation(s)
- Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California, United States of America
205
Gao F, Keinan A. High burden of private mutations due to explosive human population growth and purifying selection. BMC Genomics 2014; 15 Suppl 4:S3. [PMID: 25056720 PMCID: PMC4083409 DOI: 10.1186/1471-2164-15-s4-s3] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 01/15/2023] Open
Abstract
Background: Recent studies have shown that human populations have experienced a complex demographic history, including a recent epoch of rapid population growth that led to an excess in the proportion of rare genetic variants in humans today. This excess can impact the burden of private mutations for each individual, defined here as the proportion of heterozygous variants in each newly sequenced individual that are novel compared to another large sample of sequenced individuals. Results: We calculated the burden of private mutations predicted by different demographic models, and compared it with empirical estimates based on data from the NHLBI Exome Sequencing Project and data from the Neutral Regions (NR) dataset. We observed a significant excess in the proportion of private mutations in the empirical data compared with models of demographic history without a recent epoch of population growth. Incorporating recent growth into the model provides a much improved fit to empirical observations. This phenomenon becomes more marked for larger sample sizes: extrapolating to a scenario in which 10,000 individuals from the same population have been sequenced with perfect accuracy, about 1 in 400 heterozygous sites (or about 6,000 variants) in the 10,001st individual are still predicted to be novel, 18 times as many as predicted in the absence of recent population growth. The proportion of private mutations is additionally increased by purifying selection, which differentially affects mutations of different functional annotations. Conclusions: The burden of private mutations for each individual, which are singletons (i.e. appearing in a single copy) in a larger sample that includes this individual, is predicted to be greatly increased by recent population growth, as well as by purifying selection. Comparison with empirical data supports that European populations have experienced recent rapid population growth, consistent with previous studies. These results have important implications for the design and analysis of sequencing-based association studies of complex human disease as they pertain to private and very rare variants. They also imply that personalized genomics will indeed have to be very personal in accounting for the large number of private mutations.
206
Abstract
BACKGROUND: Genome-wide association studies (GWAS) have revealed a large number of links between genome variation and complex disease. Among other benefits, it is expected that these insights will lead to new therapeutic strategies, particularly the identification of new drug targets. In this paper, we evaluate the power of GWAS to find drug targets by examining how many existing drug targets have been directly 'rediscovered' by this technique, and the extent to which GWAS results may be leveraged by network information to discover known and new drug targets. RESULTS: We find that only a very small fraction of drug targets are directly detected in the relevant GWAS. We investigate two possible explanations for this observation. First, we find evidence of negative selection acting on drug target genes as a consequence of strong coupling with the disease phenotype, thereby reducing the incidence of SNPs linked to the disease. Second, we find that GWAS genes are substantially longer on average than drug targets and than all genes, suggesting there is a length-related bias in GWAS results. In spite of the low direct relationship between drug targets and GWAS-reported genes, we found these two sets of genes are closely coupled in the human protein network. As a consequence, machine-learning methods are able to recover known drug targets based on network context and the set of GWAS-reported genes for the same disease. We show the approach is potentially useful for identifying drug repurposing opportunities. CONCLUSIONS: Although GWAS do not directly identify most existing drug targets, there are several reasons to expect that new targets will nevertheless be discovered using these data. Initial results on drug repurposing studies using network analysis are encouraging and suggest directions for future development.
207
Fan R, Wang Y, Mills JL, Wilson AF, Bailey-Wilson JE, Xiong M. Functional linear models for association analysis of quantitative traits. Genet Epidemiol 2014; 37:726-42. [PMID: 24130119 DOI: 10.1002/gepi.21757] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Received: 04/24/2013] [Revised: 07/15/2013] [Accepted: 08/14/2013] [Indexed: 12/19/2022]
Abstract
Functional linear models are developed in this paper for testing associations between quantitative traits and genetic variants, which can be rare variants, common variants, or a combination of the two. By treating multiple genetic variants of an individual in a human population as a realization of a stochastic process, the genome of an individual in a chromosome region is a continuum of sequence data rather than discrete observations. The genome of an individual is viewed as a stochastic function that contains both linkage and linkage disequilibrium (LD) information of the genetic markers. By using techniques of functional data analysis, both fixed and mixed effect functional linear models are built to test the association between quantitative traits and genetic variants, adjusting for covariates. After extensive simulation analysis, it is shown that the F-distributed tests of the proposed fixed effect functional linear models have higher power than the sequence kernel association test (SKAT) and its optimal unified test (SKAT-O) in most cases for three scenarios: (1) the causal variants are all rare, (2) the causal variants are both rare and common, and (3) the causal variants are common. The superior performance of the fixed effect functional linear models is most likely due to their optimal utilization of both genetic linkage and LD information of multiple genetic variants in a genome and similarity among different individuals, while SKAT and SKAT-O only model the similarities and pairwise LD but do not model linkage and higher order LD information sufficiently. In addition, the proposed fixed effect models generate accurate type I error rates in simulation studies. We also show that the functional kernel score tests of the proposed mixed effect functional linear models are preferable in candidate gene analysis and small sample problems. The methods are applied to analyze three biochemical traits in data from the Trinity Students Study.
Affiliation(s)
- Ruzong Fan
- Biostatistics and Bioinformatics Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Rockville, Maryland, United States of America
208
Zielinski D, Markus B, Sheikh M, Gymrek M, Chu C, Zaks M, Srinivasan B, Hoffman JD, Aizenbud D, Erlich Y. OTX2 duplication is implicated in hemifacial microsomia. PLoS One 2014; 9:e96788. [PMID: 24816892 PMCID: PMC4016008 DOI: 10.1371/journal.pone.0096788] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 12/05/2013] [Accepted: 04/11/2014] [Indexed: 12/21/2022] Open
Abstract
Hemifacial microsomia (HFM) is the second most common facial anomaly after cleft lip and palate. The phenotype is highly variable and most cases are sporadic. We investigated the disorder in a large pedigree with five affected individuals spanning eight meioses. Whole-exome sequencing results indicated the absence of a pathogenic coding point mutation. A genome-wide survey of segmental variations identified a 1.3 Mb duplication of chromosome 14q22.3 in all affected individuals that was absent in more than 1000 chromosomes of ethnically matched controls. The duplication was absent in seven additional sporadic HFM cases, which is consistent with the known heterogeneity of the disorder. To find the critical gene in the duplicated region, we analyzed signatures of human craniofacial disease networks, mouse expression data, and predictions of dosage sensitivity. All of these approaches implicated OTX2 as the most likely causal gene. Moreover, OTX2 is a known oncogenic driver in medulloblastoma, a condition that was diagnosed in the proband during the course of the study. Our findings suggest a role for OTX2 dosage sensitivity in human craniofacial development and raise the possibility of a shared etiology between a subtype of hemifacial microsomia and medulloblastoma.
Affiliation(s)
- Dina Zielinski
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
- Barak Markus
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
- Mona Sheikh
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
- Melissa Gymrek
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
- Harvard-MIT Division of Health Sciences and Technology, MIT, Cambridge, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Department of Molecular Biology and Diabetes Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Clement Chu
- Counsyl, South San Francisco, California, United States of America
- Marta Zaks
- Rambam Health Care Campus, Haifa, Israel
- Jodi D. Hoffman
- Division of Genetics, Tufts Medical Center, Boston, Massachusetts, United States of America
- Yaniv Erlich
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
209
Effective filtering strategies to improve data quality from population-based whole exome sequencing studies. BMC Bioinformatics 2014; 15:125. [PMID: 24884706 PMCID: PMC4098776 DOI: 10.1186/1471-2105-15-125] [Citation(s) in RCA: 96] [Impact Index Per Article: 8.7] [Received: 10/25/2013] [Accepted: 04/16/2014] [Indexed: 12/12/2022] Open
Abstract
Background: Genotypes generated in next generation sequencing studies contain errors which can significantly impact the power to detect signals in common and rare variant association tests. These genotyping errors are not explicitly filtered by the standard GATK Variant Quality Score Recalibration (VQSR) tool and thus remain a source of errors in whole exome sequencing (WES) projects that follow GATK’s recommended best practices. Therefore, additional data filtering methods are required to effectively remove these errors before performing association analyses with complex phenotypes. Here we empirically derive thresholds for genotype and variant filters that, when used in conjunction with the VQSR tool, achieve higher data quality than when using VQSR alone. Results: The detailed filtering strategies improve the concordance of sequenced genotypes with array genotypes from 99.33% to 99.77%; improve the percent of discordant genotypes removed from 10.5% to 69.5%; and improve the Ti/Tv ratio from 2.63 to 2.75. We also demonstrate that managing batch effects by separating samples based on different target capture and sequencing chemistry protocols results in a final data set containing 40.9% more high-quality variants. In addition, imputation is an important component of WES studies and is used to estimate common variant genotypes to generate additional markers for association analyses. As such, we demonstrate filtering methods for imputed data that improve genotype concordance from 79.3% to 99.8% while removing 99.5% of discordant genotypes. Conclusions: The described filtering methods are advantageous for large population-based WES studies designed to identify common and rare variation associated with complex diseases. Compared to data processed through standard practices, these strategies result in substantially higher quality data for common and rare association analyses.
210
Li A, Meyre D. Jumping on the Train of Personalized Medicine: A Primer for Non-Geneticist Clinicians: Part 3. Clinical Applications in the Personalized Medicine Area. Current Psychiatry Reviews 2014; 10:118-132. [PMID: 25598768 PMCID: PMC4287884 DOI: 10.2174/1573400510666140630170549] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 04/29/2014] [Revised: 05/27/2014] [Accepted: 05/29/2014] [Indexed: 12/17/2022]
Abstract
The rapid decline of sequencing costs brings hope that personal genome sequencing will become a common feature of medical practice. This series of three reviews aims to help non-geneticist clinicians jump into the fast-moving field of personalized genetic medicine. In the first two articles, we covered the fundamental concepts of molecular genetics and the methodologies used in genetic epidemiology. In this third article, we discuss the evolution of personalized medicine and illustrate the most recent successes in the fields of Mendelian and complex human diseases. We also address the challenges that currently limit the use of personalized medicine to its full potential.
Affiliation(s)
- David Meyre
- Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON L8N 3Z5, Canada
211
Simulation of Finnish population history, guided by empirical genetic data, to assess power of rare-variant tests in Finland. Am J Hum Genet 2014; 94:710-20. [PMID: 24768551 DOI: 10.1016/j.ajhg.2014.03.019] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 11/27/2013] [Accepted: 03/27/2014] [Indexed: 12/18/2022] Open
Abstract
Finnish samples have been extensively utilized in studying single-gene disorders, where the founder effect has clearly aided in discovery, and more recently in genome-wide association studies of complex traits, where the founder effect has had less obvious impacts. As the field starts to explore rare variants' contribution to polygenic traits, it is of great importance to characterize and confirm the Finnish founder effect in sequencing data and to assess its implications for rare-variant association studies. Here, we employ forward simulation, guided by empirical deep resequencing data, to model the genetic architecture of quantitative polygenic traits in both the general European and the Finnish populations simultaneously. We demonstrate that power of rare-variant association tests is higher in the Finnish population, especially when variants' phenotypic effects are tightly coupled with fitness effects and therefore reflect a greater contribution of rarer variants. SKAT-O, variable-threshold tests, and single-variant tests are more powerful than other rare-variant methods in the Finnish population across a range of genetic models. We also compare the relative power and efficiency of exome array genotyping to those of high-coverage exome sequencing. At a fixed cost, less expensive genotyping strategies have far greater power than sequencing; in a fixed number of samples, however, genotyping arrays miss a substantial portion of genetic signals detected in sequencing, even in the Finnish founder population. As genetic studies probe sequence variation at greater depth in more diverse populations, our simulation approach provides a framework for evaluating various study designs for gene discovery.
212
Setta-Kaffetzi N, Simpson MA, Navarini AA, Patel VM, Lu HC, Allen MH, Duckworth M, Bachelez H, Burden AD, Choon SE, Griffiths CEM, Kirby B, Kolios A, Seyger MMB, Prins C, Smahi A, Trembath RC, Fraternali F, Smith CH, Barker JN, Capon F. AP1S3 mutations are associated with pustular psoriasis and impaired Toll-like receptor 3 trafficking. Am J Hum Genet 2014; 94:790-7. [PMID: 24791904 PMCID: PMC4067562 DOI: 10.1016/j.ajhg.2014.04.005] [Citation(s) in RCA: 138] [Impact Index Per Article: 12.5] [Received: 12/19/2013] [Accepted: 04/09/2014] [Indexed: 11/26/2022] Open
Abstract
Adaptor protein complex 1 (AP-1) is an evolutionary conserved heterotetramer that promotes vesicular trafficking between the trans-Golgi network and the endosomes. The knockout of most murine AP-1 complex subunits is embryonically lethal, so the identification of human disease-associated alleles has the unique potential to deliver insights into gene function. Here, we report two founder mutations (c.11T>G [p.Phe4Cys] and c.97C>T [p.Arg33Trp]) in AP1S3, the gene encoding AP-1 complex subunit σ1C, in 15 unrelated individuals with a severe autoinflammatory skin disorder known as pustular psoriasis. Because the variants are predicted to destabilize the 3D structure of the AP-1 complex, we generated AP1S3-knockdown cell lines to investigate the consequences of AP-1 deficiency in skin keratinocytes. We found that AP1S3 silencing disrupted the endosomal translocation of the innate pattern-recognition receptor TLR-3 (Toll-like receptor 3) and resulted in a marked inhibition of downstream signaling. These findings identify pustular psoriasis as an autoinflammatory phenotype caused by defects in vesicular trafficking and demonstrate a requirement of AP-1 for Toll-like receptor homeostasis.
Affiliation(s)
- Niovi Setta-Kaffetzi
- Division of Genetics and Molecular Medicine, King's College London, London SE1 9RT, UK
| | - Michael A Simpson
- Division of Genetics and Molecular Medicine, King's College London, London SE1 9RT, UK
| | - Alexander A Navarini
- Division of Genetics and Molecular Medicine, King's College London, London SE1 9RT, UK
| | - Varsha M Patel
- Division of Genetics and Molecular Medicine, King's College London, London SE1 9RT, UK
| | - Hui-Chun Lu
- Randall Division of Cell and Molecular Biophysics, King's College London, London SE1 9RT, UK
| | - Michael H Allen
- Division of Genetics and Molecular Medicine, King's College London, London SE1 9RT, UK
| | - Michael Duckworth
- Division of Genetics and Molecular Medicine, King's College London, London SE1 9RT, UK
| | - Hervé Bachelez
- Institut National de la Santé et de la Recherche Médicale Unité 781, Institut Imagine, Hopital Necker - Enfant Malades, Paris 75015, France; Department of Dermatology, Sorbonne Paris Cité Université Paris Diderot and Hôpital Saint-Louis, Assistance Publique - Hôpitaux de Paris, Paris 75010, France
| | - A David Burden
- Department of Dermatology, University of Glasgow, Glasgow G11 6NT, UK
| | - Siew-Eng Choon
- Department of Dermatology, Hospital Sultanah Aminah, Johor Bahru 80100, Malaysia
| | | | - Brian Kirby
- Department of Dermatology, St. Vincent University Hospital, Dublin 4, Ireland
| | - Antonios Kolios
- Department of Dermatology, Zurich University Hospital, Zurich 8091, Switzerland
| | - Marieke M B Seyger
- Department of Dermatology, Radboud University Nijmegen Medical Centre, 6500HB Nijmegen, the Netherlands
| | - Christa Prins
- Dermatology Service, Geneva University Hospital, 1211 Geneva 14, Switzerland
| | - Asma Smahi
- Institut National de la Santé et de la Recherche Médicale Unité 781, Institut Imagine, Hopital Necker - Enfant Malades, Paris 75015, France
| | - Richard C Trembath
- Division of Genetics and Molecular Medicine, King's College London, London SE1 9RT, UK; Queen Mary University of London, Barts and The London School of Medicine and Dentistry, London EC1M 6QB, UK
| | - Franca Fraternali
- Randall Division of Cell and Molecular Biophysics, King's College London, London SE1 9RT, UK
| | - Catherine H Smith
- Division of Genetics and Molecular Medicine, King's College London, London SE1 9RT, UK
| | - Jonathan N Barker
- Division of Genetics and Molecular Medicine, King's College London, London SE1 9RT, UK
| | - Francesca Capon
- Division of Genetics and Molecular Medicine, King's College London, London SE1 9RT, UK.
|
213
|
Holmen OL, Zhang H, Zhou W, Schmidt E, Hovelson DH, Langhammer A, Løchen ML, Ganesh SK, Mathiesen EB, Vatten L, Platou C, Wilsgaard T, Chen J, Skorpen F, Dalen H, Boehnke M, Abecasis GR, Njølstad I, Hveem K, Willer CJ. No large-effect low-frequency coding variation found for myocardial infarction. Hum Mol Genet 2014; 23:4721-8. [PMID: 24728188 DOI: 10.1093/hmg/ddu175] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 12/23/2022]
Abstract
Genome-wide association studies have identified variants, primarily common, that are associated with coronary artery disease or myocardial infarction (MI), but have not tested the majority of the low-frequency and rare variation in the genome. We explored the hypothesis that previously untested low-frequency (1-5% minor allele frequency) and rare (<1% minor allele frequency) coding variants are associated with MI. We genotyped 2906 MI cases and 6738 non-MI controls from Norway using the Illumina HumanExome Beadchip, allowing for direct genotyping of 85 972 polymorphic coding variants as well as 48 known GWAS SNPs. We followed up 34 coding variants in an additional 2350 MI cases and 2318 controls from Norway. We evaluated exome array coverage in a subset of these samples using whole exome sequencing (N = 151). The exome array provided successful genotyping for an estimated 72.5% of Norwegian loss-of-function or missense variants with frequency >1% and 66.2% of variants <1% frequency observed more than once. Despite 80% power in the two-stage study (N = 14 312) to detect association with low-frequency variants with high effect sizes [odds ratio (OR) >1.86 and >1.36 for 1 and 5% frequency, respectively], we did not identify any novel genes or single variants that reached significance. This suggests that low-frequency coding variants with large effect sizes (OR >2) may not exist for MI. Larger sample sizes may identify coding variants with more moderate effects.
Affiliation(s)
- Oddgeir L Holmen
- HUNT Research Centre, Department of Public Health and General Practice, Norwegian University of Science and Technology, 7600 Levanger, Norway; St. Olav Hospital, Trondheim University Hospital, Trondheim, Norway
| | - He Zhang
- Department of Internal Medicine, Division of Cardiology
| | - Wei Zhou
- Department of Internal Medicine, Division of Cardiology
| | - Ellen Schmidt
- Department of Internal Medicine, Division of Cardiology, Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Daniel H Hovelson
- Department of Internal Medicine, Division of Cardiology, Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Arnulf Langhammer
- HUNT Research Centre, Department of Public Health and General Practice, Norwegian University of Science and Technology, 7600 Levanger, Norway
| | - Maja-Lisa Løchen
- Epidemiology of Chronic Diseases Research Group, Department of Community Medicine, Faculty of Health Sciences
| | | | - Ellisiv B Mathiesen
- Brain and Circulation Research Group, Department of Clinical Medicine, Faculty of Health Sciences, UiT The Arctic University of Norway, 9037 Tromsø, Norway; Department of Neurology and Neurophysiology, University Hospital of North Norway, 9037 Tromsø, Norway
| | | | - Carl Platou
- HUNT Research Centre, Department of Public Health and General Practice, Norwegian University of Science and Technology, 7600 Levanger, Norway
| | - Tom Wilsgaard
- Epidemiology of Chronic Diseases Research Group, Department of Community Medicine, Faculty of Health Sciences
| | - Jin Chen
- Department of Internal Medicine, Division of Cardiology
| | - Frank Skorpen
- Department of Laboratory Medicine, Children's and Women's Health Faculty of Medicine
| | - Håvard Dalen
- Medical Imaging Laboratory for Innovative Future Healthcare, Lab and Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, 7030 Trondheim, Norway; Department of Medicine, Levanger Hospital, Nord-Trøndelag Health Trust, 7600 Levanger, Norway
| | - Michael Boehnke
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Goncalo R Abecasis
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Inger Njølstad
- Epidemiology of Chronic Diseases Research Group, Department of Community Medicine, Faculty of Health Sciences
| | - Kristian Hveem
- HUNT Research Centre, Department of Public Health and General Practice, Norwegian University of Science and Technology, 7600 Levanger, Norway; Department of Medicine, Levanger Hospital, Nord-Trøndelag Health Trust, 7600 Levanger, Norway
| | - Cristen J Willer
- Department of Internal Medicine, Division of Cardiology, Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
|
214
|
Hellwege JN, Palmer ND, Raffield LM, Ng MCY, Hawkins GA, Long J, Lorenzo C, Norris JM, Ida Chen YD, Speliotes EK, Rotter JI, Langefeld CD, Wagenknecht LE, Bowden DW. Genome-wide family-based linkage analysis of exome chip variants and cardiometabolic risk. Genet Epidemiol 2014; 38:345-52. [PMID: 24719370 DOI: 10.1002/gepi.21801] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 11/22/2013] [Revised: 02/14/2014] [Accepted: 02/28/2014] [Indexed: 01/31/2023]
Abstract
Linkage analysis of complex traits has had limited success in identifying trait-influencing loci. Recently, coding variants have been implicated as the basis for some biomedical associations. We tested whether coding variants are the basis for linkage peaks of complex traits in 42 African-American (n = 596) and 90 Hispanic (n = 1,414) families in the Insulin Resistance Atherosclerosis Family Study (IRASFS) using Illumina HumanExome Beadchips. A total of 92,157 variants in African Americans (34%) and 81,559 (31%) in Hispanics were polymorphic and tested using two-point linkage and association analyses with 37 cardiometabolic phenotypes. In African Americans, 77 LOD scores greater than 3 were observed. The highest LOD score was 4.91 with the APOE SNP rs7412 (MAF = 0.13) with plasma apolipoprotein B (ApoB). This SNP was associated with ApoB (P-value = 4 × 10⁻¹⁹) and accounted for 16.2% of the variance in African Americans. In Hispanic families, 104 LOD scores were greater than 3. The strongest evidence of linkage (LOD = 4.29) was with rs5882 (MAF = 0.46) in CETP with HDL. CETP variants were strongly associated with HDL (4.6 × 10⁻¹² < P-value < 0.00049), accounting for up to 4.5% of the variance. These loci have previously been shown to have effects on the biomedical traits evaluated here. Thus, evidence of strong linkage in this genome-wide survey of primarily coding variants was uncommon. Loci with strong evidence of linkage were characterized by large contributions to the variance and, in these cases, are common variants. Less compelling evidence of linkage and association was observed with additional loci that may require larger family sets to confirm.
Affiliation(s)
- Jacklyn N Hellwege
- Molecular Genetics and Genomics Program, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America; Center for Genomics and Personalized Medicine Research, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America; Center for Diabetes Research, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
|
215
|
Egalite N, Groisman IJ, Godard B. Genetic Counseling Practice in Next Generation Sequencing Research: Implications for the Ethical Oversight of the Informed Consent Process. J Genet Couns 2014; 23:661-70. [DOI: 10.1007/s10897-014-9703-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 05/31/2013] [Accepted: 02/14/2014] [Indexed: 12/20/2022]
|
216
|
Pulit SL, Leusink M, Menelaou A, de Bakker PIW. Association claims in the sequencing era. Genes (Basel) 2014; 5:196-213. [PMID: 24705293 PMCID: PMC3978519 DOI: 10.3390/genes5010196] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 12/26/2013] [Revised: 02/24/2014] [Accepted: 02/24/2014] [Indexed: 12/13/2022]
Abstract
Since the completion of the Human Genome Project, the field of human genetics has been in great flux, largely due to technological advances in studying DNA sequence variation. Although community-wide adoption of statistical standards was key to the success of genome-wide association studies, similar standards have not yet been globally applied to the processing and interpretation of sequencing data. It has proven particularly challenging to pinpoint unequivocally disease variants in sequencing studies of polygenic traits. Here, we comment on a number of factors that may contribute to irreproducible claims of association in scientific literature and discuss possible steps that we can take towards cultural change.
Affiliation(s)
- Sara L Pulit
- Department of Medical Genetics, Institute for Molecular Medicine, University Medical Center Utrecht, Universiteitsweg 100, 3584 CG, Utrecht, The Netherlands.
| | - Maarten Leusink
- Department of Medical Genetics, Institute for Molecular Medicine, University Medical Center Utrecht, Universiteitsweg 100, 3584 CG, Utrecht, The Netherlands.
| | - Androniki Menelaou
- Department of Medical Genetics, Institute for Molecular Medicine, University Medical Center Utrecht, Universiteitsweg 100, 3584 CG, Utrecht, The Netherlands.
| | - Paul I W de Bakker
- Department of Medical Genetics, Institute for Molecular Medicine, University Medical Center Utrecht, Universiteitsweg 100, 3584 CG, Utrecht, The Netherlands.
|
217
|
Jensen KP, Kranzler HR, Stein MB, Gelernter J. The effects of a MAP2K5 microRNA target site SNP on risk for anxiety and depressive disorders. Am J Med Genet B Neuropsychiatr Genet 2014; 165B:175-83. [PMID: 24436253 PMCID: PMC4174417 DOI: 10.1002/ajmg.b.32219] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 03/11/2013] [Accepted: 12/18/2013] [Indexed: 12/16/2022]
Abstract
Functional variants that contribute to genome-wide association study (GWAS) signals are difficult to identify. MicroRNAs could contribute to some of these gene-trait relationships. We compiled a set of GWAS trait gene SNPs that were predicted to affect microRNA regulation of mRNA. Trait associations were tested in a sample of 6725 European-American (EA) and African-American (AA) subjects who were interviewed using the polydiagnostic SSADDA to diagnose major psychiatric disorders. A predicted miR-330-3p target site SNP (rs41305272) in mitogen-activated protein kinase kinase 5 (MAP2K5) mRNA was in LD (D′ = 1.0, r² = 0.02) with a reported GWAS-identified variant for restless legs syndrome (RLS), a disorder frequently comorbid with anxiety and depression, possibly because of a shared pathophysiology. We examined the SNP's association with mood and anxiety-related disorders. Rs41305272 was associated with agoraphobia (Ag) in EAs (odds ratio [OR] = 1.95, P = 0.007; 195 cases) and AAs (OR = 3.2, P = 0.03; 148 cases) and major depressive disorder (MDD) in AAs (OR = 2.64, P = 0.01; 427 cases), but not EAs (465 cases). Rs41305272*T carrier frequency was correlated with the number of anxiety and depressive disorders diagnosed per subject. RLS was not evaluated in our subjects. Predicted miR-330-3p target genes were enriched in pathways relevant to psychiatric disorders. These findings suggest that microRNA target site information may be useful in the analysis of GWAS signals for complex traits. MiR-330-3p and MAP2K5 are potentially important contributors to mood and anxiety-related traits. With support from additional studies, these findings could add to the large number of risk genes identified through association to medical disorders that have primary psychiatric effects.
Affiliation(s)
- Kevin P. Jensen
- Department of Psychiatry, Division of Human Genetics, Yale University School of Medicine, New Haven, CT, USA and VA CT Health Care Center, West Haven, CT, USA
| | - Henry R. Kranzler
- Department of Psychiatry, University of Pennsylvania Perelman School of Medicine and the VISN4 MIRECC, Philadelphia VA Medical Center, Philadelphia, PA, USA
| | - Murray B. Stein
- Departments of Psychiatry and Family & Preventive Medicine, University of California San Diego, La Jolla, CA, USA
| | - Joel Gelernter
- Department of Psychiatry, Division of Human Genetics, Yale University School of Medicine, New Haven, CT, USA and VA CT Health Care Center, West Haven, CT, USA; Departments of Genetics and Neurobiology, Yale University School of Medicine, New Haven, Connecticut, USA
|
218
|
Purcell SM, Moran JL, Fromer M, Ruderfer D, Solovieff N, Roussos P, O’Dushlaine C, Chambert K, Bergen SE, Kähler A, Duncan L, Stahl E, Genovese G, Fernández E, Collins MO, Komiyama NH, Choudhary JS, Magnusson PKE, Banks E, Shakir K, Garimella K, Fennell T, de Pristo M, Grant SG, Haggarty S, Gabriel S, Scolnick EM, Lander ES, Hultman C, Sullivan PF, McCarroll SA, Sklar P. A polygenic burden of rare disruptive mutations in schizophrenia. Nature 2014; 506:185-90. [PMID: 24463508 PMCID: PMC4136494 DOI: 10.1038/nature12975] [Citation(s) in RCA: 1040] [Impact Index Per Article: 94.5] [Received: 06/25/2013] [Accepted: 12/24/2013] [Indexed: 12/11/2022]
Abstract
Schizophrenia is a common disease with a complex aetiology, probably involving multiple and heterogeneous genetic factors. Here, by analysing the exome sequences of 2,536 schizophrenia cases and 2,543 controls, we demonstrate a polygenic burden primarily arising from rare (less than 1 in 10,000), disruptive mutations distributed across many genes. Particularly enriched gene sets include the voltage-gated calcium ion channel and the signalling complex formed by the activity-regulated cytoskeleton-associated scaffold protein (ARC) of the postsynaptic density, sets previously implicated by genome-wide association and copy-number variation studies. Similar to reports in autism, targets of the fragile X mental retardation protein (FMRP, product of FMR1) are enriched for case mutations. No individual gene-based test achieves significance after correction for multiple testing and we do not detect any alleles of moderately low frequency (approximately 0.5 to 1 per cent) and moderately large effect. Taken together, these data suggest that population-based exome sequencing can discover risk alleles and complements established gene-mapping paradigms in neuropsychiatric disease.
Affiliation(s)
- Shaun M. Purcell
- Stanley Center for Psychiatric Research, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
- Division of Psychiatric Genomics in the Department of Psychiatry, and Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Analytic and Translational Genetics Unit, Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
- Medical & Population Genetics Program, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
| | - Jennifer L. Moran
- Stanley Center for Psychiatric Research, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
| | - Menachem Fromer
- Stanley Center for Psychiatric Research, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
- Division of Psychiatric Genomics in the Department of Psychiatry, and Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Analytic and Translational Genetics Unit, Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Douglas Ruderfer
- Division of Psychiatric Genomics in the Department of Psychiatry, and Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Nadia Solovieff
- Analytic and Translational Genetics Unit, Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Panos Roussos
- Division of Psychiatric Genomics in the Department of Psychiatry, and Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Colm O’Dushlaine
- Stanley Center for Psychiatric Research, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
| | - Kimberly Chambert
- Stanley Center for Psychiatric Research, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
| | - Sarah E. Bergen
- Stanley Center for Psychiatric Research, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, SE-171 77, Sweden
| | - Anna Kähler
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, SE-171 77, Sweden
| | - Laramie Duncan
- Analytic and Translational Genetics Unit, Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Eli Stahl
- Division of Psychiatric Genomics in the Department of Psychiatry, and Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Giulio Genovese
- Stanley Center for Psychiatric Research, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
| | - Esperanza Fernández
- Center for Human Genetics, KU Leuven, 3000 Leuven, Belgium; VIB Center for Biology of Disease, 3000 Leuven, Belgium
| | - Mark O Collins
- Proteomic Mass Spectrometry, The Wellcome Trust Sanger Institute, Cambridge, UK
| | - Noboru H. Komiyama
- Proteomic Mass Spectrometry, The Wellcome Trust Sanger Institute, Cambridge, UK
| | - Jyoti S. Choudhary
- Proteomic Mass Spectrometry, The Wellcome Trust Sanger Institute, Cambridge, UK
| | - Patrik K. E. Magnusson
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, SE-171 77, Sweden
| | - Eric Banks
- Medical & Population Genetics Program, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
| | - Khalid Shakir
- Medical & Population Genetics Program, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
| | - Kiran Garimella
- Medical & Population Genetics Program, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
| | - Tim Fennell
- Medical & Population Genetics Program, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
| | - Mark de Pristo
- Medical & Population Genetics Program, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
| | - Seth G.N. Grant
- Genes to Cognition Programme, Centre for Clinical Brain Sciences and Centre for Neuroregeneration, The University of Edinburgh, Edinburgh, UK
| | - Stephen Haggarty
- Stanley Center for Psychiatric Research, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
- Analytic and Translational Genetics Unit, Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
- Department of Neurology, Harvard Medical School, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Stacey Gabriel
- Medical & Population Genetics Program, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
| | - Edward M. Scolnick
- Stanley Center for Psychiatric Research, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
| | - Eric S. Lander
- Medical & Population Genetics Program, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
| | - Christina Hultman
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, SE-171 77, Sweden
| | - Patrick F. Sullivan
- Department of Genetics, University of North Carolina, CB# 7264, Chapel Hill, NC, 27599-7264, USA
| | - Steven A. McCarroll
- Stanley Center for Psychiatric Research, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
- Medical & Population Genetics Program, Broad Institute of MIT & Harvard, Cambridge, MA, 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
| | - Pamela Sklar
- Division of Psychiatric Genomics in the Department of Psychiatry, and Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
|
219
|
Okada Y, Diogo D, Greenberg JD, Mouassess F, Achkar WAL, Fulton RS, Denny JC, Gupta N, Mirel D, Gabriel S, Li G, Kremer JM, Pappas DA, Carroll RJ, Eyler AE, Trynka G, Stahl EA, Cui J, Saxena R, Coenen MJH, Guchelaar HJ, Huizinga TWJ, Dieudé P, Mariette X, Barton A, Canhão H, Fonseca JE, de Vries N, Tak PP, Moreland LW, Bridges SL, Miceli-Richard C, Choi HK, Kamatani Y, Galan P, Lathrop M, Raj T, De Jager PL, Raychaudhuri S, Worthington J, Padyukov L, Klareskog L, Siminovitch KA, Gregersen PK, Mardis ER, Arayssi T, Kazkaz LA, Plenge RM. Integration of sequence data from a Consanguineous family with genetic data from an outbred population identifies PLB1 as a candidate rheumatoid arthritis risk gene. PLoS One 2014; 9:e87645. [PMID: 24520335 PMCID: PMC3919745 DOI: 10.1371/journal.pone.0087645] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 09/23/2013] [Accepted: 12/19/2013] [Indexed: 12/30/2022]
Abstract
Integrating genetic data from families with highly penetrant forms of disease together with genetic data from outbred populations represents a promising strategy to uncover the complete frequency spectrum of risk alleles for complex traits such as rheumatoid arthritis (RA). Here, we demonstrate that rare, low-frequency and common alleles at one gene locus, phospholipase B1 (PLB1), might contribute to risk of RA in a 4-generation consanguineous pedigree (Middle Eastern ancestry) and also in unrelated individuals from the general population (European ancestry). Through identity-by-descent (IBD) mapping and whole-exome sequencing, we identified a non-synonymous c.2263G>C (p.G755R) mutation at the PLB1 gene on 2q23, which significantly co-segregated with RA in family members with a dominant mode of inheritance (P = 0.009). We further evaluated PLB1 variants and risk of RA using a GWAS meta-analysis of 8,875 RA cases and 29,367 controls of European ancestry. We identified significant contributions of two independent non-coding variants near PLB1 to risk of RA (rs116018341 [MAF = 0.042] and rs116541814 [MAF = 0.021], combined P = 3.2×10⁻⁶). Finally, we performed deep exon sequencing of PLB1 in 1,088 RA cases and 1,088 controls (European ancestry), and identified suggestive dispersion of rare protein-coding variant frequencies between cases and controls (P = 0.049 for C-alpha test and P = 0.055 for SKAT). Together, these data suggest that PLB1 is a candidate risk gene for RA. Future studies to characterize the full spectrum of genetic risk in the PLB1 genetic locus are warranted.
Affiliation(s)
- Yukinori Okada
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
- Department of Human Genetics and Disease Diversity, Tokyo Medical and Dental University Graduate School of Medical and Dental Sciences, Tokyo, Japan
- Laboratory for Statistical Analysis, Center for Integrative Medical Sciences, RIKEN, Yokohama, Japan
| | - Dorothee Diogo
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
| | - Jeffrey D. Greenberg
- New York University Hospital for Joint Diseases, New York, New York, United States of America
| | - Faten Mouassess
- Molecular Biology and Biotechnology Department, Human Genetics Division, Damascus, Syria
| | - Walid A. L. Achkar
- Molecular Biology and Biotechnology Department, Human Genetics Division, Damascus, Syria
| | - Robert S. Fulton
- The Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
| | - Joshua C. Denny
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
| | - Namrata Gupta
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
| | - Daniel Mirel
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
| | - Stacy Gabriel
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
| | - Gang Li
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Joel M. Kremer
- Department of Medicine, Albany Medical Center and The Center for Rheumatology, Albany, New York, United States of America
| | - Dimitrios A. Pappas
- Division of Rheumatology, Department of Medicine, New York, Presbyterian Hospital, College of Physicians and Surgeons, Columbia University, New York, New York, United States of America
| | - Robert J. Carroll
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
| | - Anne E. Eyler
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
| | - Gosia Trynka
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
| | - Eli A. Stahl
- The Department of Psychiatry at Mount Sinai School of Medicine, New York, New York, United States of America
| | - Jing Cui
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Richa Saxena
- Center for Human Genetics Research, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Marieke J. H. Coenen
- Department of Human Genetics, Radboud University Medical Centre, Nijmegen, The Netherlands
| | - Henk-Jan Guchelaar
- Department of Clinical Pharmacy and Toxicology, Leiden University Medical Center, Leiden, The Netherlands
| | - Tom W. J. Huizinga
- Department of Rheumatology, Leiden University Medical Centre, Leiden, The Netherlands
| | - Philippe Dieudé
- Service de Rhumatologie et INSERM U699 Hôpital Bichat Claude Bernard, Assistance Publique des Hôpitaux de Paris, Paris, France
- Université Paris 7-Diderot, Paris, France
| | - Xavier Mariette
- Institut National de la Santé et de la Recherche Médicale (INSERM) U1012, Université Paris-Sud, Rhumatologie, Hôpitaux Universitaires Paris-Sud, Assistance Publique-Hôpitaux de Paris (AP-HP), Le Kremlin Bicêtre, France
| | - Anne Barton
- Arthritis Research UK Epidemiology Unit, Centre for Musculoskeletal Research, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
| | - Helena Canhão
- Rheumatology Research Unit, Instituto de Medicina Molecular, Faculdade de Medicina da Universidade de Lisboa, Lisbon, Portugal
- Rheumatology Department, Santa Maria Hospital–CHLN, Lisbon, Portugal
| | - João E. Fonseca
- Rheumatology Research Unit, Instituto de Medicina Molecular, Faculdade de Medicina da Universidade de Lisboa, Lisbon, Portugal
- Rheumatology Department, Santa Maria Hospital–CHLN, Lisbon, Portugal
| | - Niek de Vries
- Department of Clinical Immunology and Rheumatology & Department of Genome Analysis, Academic Medical Center/University of Amsterdam, Amsterdam, The Netherlands
| | - Paul P. Tak
- Department of Clinical Immunology and Rheumatology, Academic Medical Center/University of Amsterdam, Amsterdam, The Netherlands
- GlaxoSmithKline, Stevenage, United Kingdom
| | - Larry W. Moreland
- Division of Rheumatology and Clinical Immunology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - S. Louis Bridges
- Division of Clinical Immunology and Rheumatology, Department of Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States of America
| | - Corinne Miceli-Richard
- Institut National de la Santé et de la Recherche Médicale (INSERM) U1012, Université Paris-Sud, Rhumatologie, Hôpitaux Universitaires Paris-Sud, Assistance Publique-Hôpitaux de Paris (AP-HP), Le Kremlin Bicêtre, France
| | - Hyon K. Choi
- Channing Laboratory, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Section of Rheumatology, Boston University School of Medicine, Boston, Massachusetts, United States of America
- Clinical Epidemiology Research and Training Unit, Boston University School of Medicine, Boston, Massachusetts, United States of America
| | - Yoichiro Kamatani
- Laboratory for Statistical Analysis, Center for Integrative Medical Sciences, RIKEN, Yokohama, Japan
- Centre d'Etude du Polymorphisme Humain (CEPH), Paris, France
| | - Pilar Galan
- Université Paris 13 Sorbonne Paris Cité, UREN (Nutritional Epidemiology Research Unit), Inserm (U557), Inra (U1125), Cnam, Bobigny, France
| | - Mark Lathrop
- McGill University and Génome Québec Innovation Centre, Montréal, Canada
| | - Towfique Raj
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
- Program in Translational NeuroPsychiatric Genomics, Institute for the Neurosciences, Department of Neurology, Brigham and Women's Hospital, Boston, Massachusetts, United States of America
| | - Philip L. De Jager
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
- Program in Translational NeuroPsychiatric Genomics, Institute for the Neurosciences, Department of Neurology, Brigham and Women's Hospital, Boston, Massachusetts, United States of America
| | - Soumya Raychaudhuri
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
- NIHR Manchester Musculoskeletal Biomedical Research Unit, Central Manchester NHS Foundation Trust, Manchester Academic Health Sciences Centre, Manchester, United Kingdom
| | - Jane Worthington
- Arthritis Research UK Epidemiology Unit, Centre for Musculoskeletal Research, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
- National Institute for Health Research, Manchester Musculoskeletal Biomedical Research Unit, Central Manchester University Hospitals National Health Service Foundation Trust, Manchester Academic Health Sciences Centre, Manchester, United Kingdom
| | - Leonid Padyukov
- Rheumatology Unit, Department of Medicine (Solna), Karolinska Institutet, Stockholm, Sweden
| | - Lars Klareskog
- Rheumatology Unit, Department of Medicine (Solna), Karolinska Institutet, Stockholm, Sweden
| | - Katherine A. Siminovitch
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Canada
- Toronto General Research Institute, Toronto, Canada
- Department of Medicine, University of Toronto, Toronto, Canada
| | - Peter K. Gregersen
- The Feinstein Institute for Medical Research, North Shore–Long Island Jewish Health System, Manhasset, New York, United States of America
| | - Elaine R. Mardis
- The Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
| | - Thurayya Arayssi
- Weill Cornell Medical College-Qatar, Education City, Doha, Qatar
| | - Layla A. Kazkaz
- Tishreen Hospital, Damascus, Syria
- Syrian Association for Rheumatology, Damascus, Syria
| | - Robert M. Plenge
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
| |
|
220
|
Peloso G, Auer P, Bis J, Voorman A, Morrison A, Stitziel N, Brody J, Khetarpal S, Crosby J, Fornage M, Isaacs A, Jakobsdottir J, Feitosa M, Davies G, Huffman J, Manichaikul A, Davis B, Lohman K, Joon A, Smith A, Grove M, Zanoni P, Redon V, Demissie S, Lawson K, Peters U, Carlson C, Jackson R, Ryckman K, Mackey R, Robinson J, Siscovick D, Schreiner P, Mychaleckyj J, Pankow J, Hofman A, Uitterlinden A, Harris T, Taylor K, Stafford J, Reynolds L, Marioni R, Dehghan A, Franco O, Patel A, Lu Y, Hindy G, Gottesman O, Bottinger E, Melander O, Orho-Melander M, Loos R, Duga S, Merlini P, Farrall M, Goel A, Asselta R, Girelli D, Martinelli N, Shah S, Kraus W, Li M, Rader D, Reilly M, McPherson R, Watkins H, Ardissino D, Zhang Q, Wang J, Tsai M, Taylor H, Correa A, Griswold M, Lange L, Starr J, Rudan I, Eiriksdottir G, Launer L, Ordovas J, Levy D, Chen YD, Reiner A, Hayward C, Polasek O, Deary I, Borecki I, Liu Y, Gudnason V, Wilson J, van Duijn C, Kooperberg C, Rich S, Psaty B, Rotter J, O’Donnell C, Rice K, Boerwinkle E, Kathiresan S, Cupples L, Cupples LA. Association of low-frequency and rare coding-sequence variants with blood lipids and coronary heart disease in 56,000 whites and blacks. Am J Hum Genet 2014; 94:223-32. [PMID: 24507774 DOI: 10.1016/j.ajhg.2014.01.009] [Citation(s) in RCA: 267] [Impact Index Per Article: 24.3] [Received: 11/08/2013] [Accepted: 01/09/2014] [Indexed: 10/25/2022]
Abstract
Low-frequency coding DNA sequence variants in the proprotein convertase subtilisin/kexin type 9 gene (PCSK9) lower plasma low-density lipoprotein cholesterol (LDL-C), protect against risk of coronary heart disease (CHD), and have prompted the development of a new class of therapeutics. It is uncertain whether the PCSK9 example represents a paradigm or an isolated exception. We used the "Exome Array" to genotype >200,000 low-frequency and rare coding sequence variants across the genome in 56,538 individuals (42,208 European ancestry [EA] and 14,330 African ancestry [AA]) and tested these variants for association with LDL-C, high-density lipoprotein cholesterol (HDL-C), and triglycerides. Although we did not identify new genes associated with LDL-C, we did identify four low-frequency (frequencies between 0.1% and 2%) variants (ANGPTL8 rs145464906 [c.361C>T; p.Gln121*], PAFAH1B2 rs186808413 [c.482C>T; p.Ser161Leu], COL18A1 rs114139997 [c.331G>A; p.Gly111Arg], and PCSK7 rs142953140 [c.1511G>A; p.Arg504His]) with large effects on HDL-C and/or triglycerides. None of these four variants was associated with risk for CHD, suggesting that examples of low-frequency coding variants with robust effects on both lipids and CHD will be limited.
Affiliation(s)
- L Adrienne Cupples
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA; National Heart, Lung, and Blood Institute (NHLBI) Framingham Heart Study, Framingham, MA 01702, USA.
| |
|
221
|
Liu DJ, Peloso GM, Zhan X, Holmen OL, Zawistowski M, Feng S, Nikpay M, Auer PL, Goel A, Zhang H, Peters U, Farrall M, Orho-Melander M, Kooperberg C, McPherson R, Watkins H, Willer CJ, Hveem K, Melander O, Kathiresan S, Abecasis GR. Meta-analysis of gene-level tests for rare variant association. Nat Genet 2014; 46:200-4. [PMID: 24336170 PMCID: PMC3939031 DOI: 10.1038/ng.2852] [Citation(s) in RCA: 144] [Impact Index Per Article: 13.1] [Received: 03/18/2013] [Accepted: 11/20/2013] [Indexed: 12/14/2022]
Abstract
The majority of reported complex disease associations for common genetic variants have been identified through meta-analysis, a powerful approach that enables the use of large sample sizes while protecting against common artifacts due to population structure and repeated small-sample analyses sharing individual-level data. As the focus of genetic association studies shifts to rare variants, genes and other functional units are becoming the focus of analysis. Here we propose and evaluate new approaches for performing meta-analysis of rare variant association tests, including burden tests, weighted burden tests, variable-threshold tests and tests that allow variants with opposite effects to be grouped together. We show that our approach retains useful features from single-variant meta-analysis approaches and demonstrate its use in a study of blood lipid levels in ∼18,500 individuals genotyped with exome arrays.
Affiliation(s)
- Dajiang J. Liu
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109
| | - Gina M. Peloso
- Broad Institute of Harvard and MIT, Cambridge, MA
- Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
| | - Xiaowei Zhan
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109
| | - Oddgeir L. Holmen
- Department of Public Health and General Practice, Norwegian University of Science and Technology, Trondheim 7489, Norway
- St. Olav Hospital, Trondheim University Hospital, Trondheim, Norway
| | - Matthew Zawistowski
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109
| | - Shuang Feng
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109
| | - Majid Nikpay
- University of Ottawa Heart Institute, Ottawa, Ontario, Canada
| | - Paul L. Auer
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle WA 98109, USA
- School of Public Health, University of Wisconsin-Milwaukee
| | - Anuj Goel
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, United Kingdom
- Department of Cardiovascular Medicine, University of Oxford, Oxford, UK
| | - He Zhang
- Division of Cardiology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI 48109
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109
| | - Ulrike Peters
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle WA 98109, USA
- Department of Epidemiology, University of Washington School of Public Health, Seattle, WA
| | - Martin Farrall
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, United Kingdom
- Department of Cardiovascular Medicine, University of Oxford, Oxford, UK
| | - Marju Orho-Melander
- Department of Cardiovascular Medicine, University of Oxford, Oxford, UK
- Department of Clinical Sciences, Lund University, Malmö, Sweden
| | - Charles Kooperberg
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle WA 98109, USA
- Department of Biostatistics, University of Washington School of Public Health, Seattle, WA
| | - Ruth McPherson
- University of Ottawa Heart Institute, Ottawa, Ontario, Canada
| | - Hugh Watkins
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, United Kingdom
- Department of Cardiovascular Medicine, University of Oxford, Oxford, UK
| | - Cristen J. Willer
- Division of Cardiology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI 48109
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109
| | - Kristian Hveem
- Department of Public Health and General Practice, Norwegian University of Science and Technology, Trondheim 7489, Norway
- Levanger Hospital, Levanger, Norway
| | - Olle Melander
- Department of Cardiovascular Medicine, University of Oxford, Oxford, UK
- Department of Clinical Sciences, Lund University, Malmö, Sweden
| | - Sekar Kathiresan
- Broad Institute of Harvard and MIT, Cambridge, MA
- Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
- Harvard Medical School, Cambridge, MA
| | - Gonçalo R. Abecasis
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109
| |
|
222
|
Gala MK, Mizukami Y, Le LP, Moriichi K, Austin T, Yamamoto M, Lauwers GY, Bardeesy N, Chung DC. Germline mutations in oncogene-induced senescence pathways are associated with multiple sessile serrated adenomas. Gastroenterology 2014; 146:520-9. [PMID: 24512911 PMCID: PMC3978775 DOI: 10.1053/j.gastro.2013.10.045] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.8] [Received: 03/16/2013] [Revised: 10/17/2013] [Accepted: 10/21/2013] [Indexed: 12/20/2022]
Abstract
BACKGROUND & AIMS: Little is known about the genetic factors that contribute to the development of sessile serrated adenomas (SSAs). SSAs contain somatic mutations in BRAF or KRAS early in development. However, evidence from humans and mouse models indicates that these mutations result in oncogene-induced senescence (OIS) of intestinal crypt cells. Progression to serrated neoplasia requires cells to escape OIS via inactivation of tumor suppressor pathways. We investigated whether subjects with multiple SSAs carry germline loss-of-function mutations (nonsense and splice site) in genes that regulate OIS: the p16-Rb and ATM-ATR DNA damage response pathways. METHODS: Through a bioinformatic analysis of the literature, we identified a set of genes that function at the main nodes of the p16-Rb and ATM-ATR DNA damage response pathways. We performed whole-exome sequencing of 20 unrelated subjects with multiple SSAs; most had features of serrated polyposis. We compared sequences with those from 4300 subjects matched for ethnicity (controls). We also used an integrative genomics approach to identify additional genes involved in senescence mechanisms. RESULTS: We identified mutations in genes that regulate senescence (ATM, PIF1, TELO2, XAF1, and RBL1) in 5 of 20 subjects with multiple SSAs (odds ratio, 3.0; 95% confidence interval, 0.9–8.9; P = .04). In 2 subjects, we found nonsense mutations in RNF43, indicating that it is also associated with multiple serrated polyps (odds ratio, 460; 95% confidence interval, 23.1–16,384; P = 6.8 × 10⁻⁵). In knockdown experiments with pancreatic duct cells exposed to UV light, RNF43 appeared to function as a regulator of the ATM-ATR DNA damage response. CONCLUSIONS: We associated germline loss-of-function variants in genes that regulate senescence pathways with the development of multiple SSAs. We identified RNF43 as a regulator of the DNA damage response and associated nonsense variants in this gene with a high risk of developing SSAs.
Affiliation(s)
- Manish K. Gala
- Massachusetts General Hospital Department of Medicine, G.I. Unit and Harvard Medical School, Boston, MA
| | - Yusuke Mizukami
- Massachusetts General Hospital Department of Medicine, G.I. Unit and Harvard Medical School, Boston, MA
- Massachusetts General Hospital Cancer Center and Harvard Medical School, Boston, MA
- Center for Clinical and Biomedical Research, Sapporo Higashi Tokushukai Hospital, Sapporo, Japan
| | - Long P. Le
- Massachusetts General Hospital Department of Pathology and Harvard Medical School, Boston, MA
| | - Kentaro Moriichi
- Massachusetts General Hospital Department of Medicine, G.I. Unit and Harvard Medical School, Boston, MA
| | - Thomas Austin
- Massachusetts General Hospital Department of Medicine, G.I. Unit and Harvard Medical School, Boston, MA
| | - Masayoshi Yamamoto
- Massachusetts General Hospital Department of Medicine, G.I. Unit and Harvard Medical School, Boston, MA
| | - Gregory Y. Lauwers
- Massachusetts General Hospital Department of Pathology and Harvard Medical School, Boston, MA
| | - Nabeel Bardeesy
- Massachusetts General Hospital Cancer Center and Harvard Medical School, Boston, MA
| | - Daniel C. Chung
- Massachusetts General Hospital Department of Medicine, G.I. Unit and Harvard Medical School, Boston, MA
- Massachusetts General Hospital Cancer Center and Harvard Medical School, Boston, MA
| |
|
223
|
Zakharov S, Teoh GHK, Salim A, Thalamuthu A. A method to incorporate prior information into score test for genetic association studies. BMC Bioinformatics 2014; 15:24. [PMID: 24450486 PMCID: PMC3904928 DOI: 10.1186/1471-2105-15-24] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/03/2012] [Accepted: 01/17/2014] [Indexed: 12/13/2022]
Abstract
Background: The interest of the scientific community in investigating the impact of rare variants on complex traits has stimulated the development of novel statistical methodologies for association studies. The fact that many of the recently proposed methods for association studies suffer from low power to identify a genetic association motivates the incorporation of prior knowledge into statistical tests. Results: In this article we propose a methodology to incorporate prior information into the region-based score test. Within our framework, prior information is used to partition variants within a region into several groups, following which asymptotically independent group statistics are constructed and then combined into a global test statistic. Under the null hypothesis, the distribution of our test statistic has fewer degrees of freedom than that of the region-based score statistic. Theoretical power comparison, population genetics simulations and results from analysis of the GAW17 sequencing data set suggest that under some scenarios our method may perform as well as or outperform the score test and other competing methods. Conclusions: An approach which uses prior information to improve the power of the region-based score test is proposed. Theoretical power comparison, population genetics simulations and the results of GAW17 data analysis showed that, in some scenarios, the power of our method equals or exceeds that of the score test and other methods.
Affiliation(s)
- Sergii Zakharov
- Human Genetics, Genome Institute of Singapore, 60 Biopolis Street, #02-01 Genome, Singapore 138672, Singapore.
| |
|
224
|
Bendjilali N, Hsueh WC, He Q, Willcox DC, Nievergelt CM, Donlon TA, Kwok PY, Suzuki M, Willcox BJ. Who are the Okinawans? Ancestry, genome diversity, and implications for the genetic study of human longevity from a geographically isolated population. J Gerontol A Biol Sci Med Sci 2014; 69:1474-84. [PMID: 24444611 DOI: 10.1093/gerona/glt203] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 11/13/2022]
Abstract
Isolated populations have advantages for genetic studies of longevity from decreased haplotype diversity and long-range linkage disequilibrium. This permits smaller sample sizes without loss of power, among other utilities. Little is known about the genome of the Okinawans, a potential population isolate, recognized for longevity. Therefore, we assessed genetic diversity, structure, and admixture in Okinawans, and compared this with Caucasians, Chinese, Japanese, and Africans from HapMap II, genotyped on the same Affymetrix GeneChip Human Mapping 500K array. Principal component analysis, haplotype coverage, and linkage disequilibrium decay revealed a distinct Okinawan genome, with more homogeneity, less haplotype diversity, and longer-range linkage disequilibrium. Population structure and admixture analyses utilizing 52 global reference populations from the Human Genome Diversity Cell Line Panel demonstrated that Okinawans clustered almost exclusively with East Asians. Sibling relative risk (λs) analysis revealed that siblings of Okinawan centenarians have 3.11 times (females) and 3.77 times (males) the likelihood of centenarianism. These findings suggest that Okinawans are genetically distinct and share several characteristics of population isolates, which are prone to develop extreme phenotypes (e.g., longevity) from genetic drift, natural selection, and population bottlenecks. These data support further exploration of genetic influence on longevity in the Okinawans.
Affiliation(s)
- Wen-Chi Hsueh
- Departments of Medicine and Epidemiology & Biostatistics, University of California, San Francisco
| | - Qimei He
- Pacific Health Research and Education Institute, Honolulu, Hawaii. Department of Research, Kuakini Medical Center, Honolulu, Hawaii
| | - Timothy A Donlon
- Pacific Health Research and Education Institute, Honolulu, Hawaii. Ohana Genetics, Honolulu, Hawaii
| | - Pui-Yan Kwok
- Department of Dermatology, Institute for Human Genetics, and Cardiovascular Research Institute, University of California, San Francisco
| | - Makoto Suzuki
- Okinawa Research Center for Longevity Science, Urasoe, Okinawa, Japan. Faculty of Medicine, University of the Ryukyus, Nishihara, Okinawa, Japan
| | - Bradley J Willcox
- Pacific Health Research and Education Institute, Honolulu, Hawaii. Department of Research, Kuakini Medical Center, Honolulu, Hawaii. Okinawa Research Center for Longevity Science, Urasoe, Okinawa, Japan. Department of Geriatric Medicine, John A Burns School of Medicine, University of Hawaii, Honolulu, Hawaii
| |
|
225
|
Searching for missing heritability: designing rare variant association studies. Proc Natl Acad Sci U S A 2014; 111:E455-64. [PMID: 24443550 DOI: 10.1073/pnas.1322563111] [Citation(s) in RCA: 440] [Impact Index Per Article: 40.0] [Indexed: 12/13/2022]
Abstract
Genetic studies have revealed thousands of loci predisposing to hundreds of human diseases and traits, revealing important biological pathways and defining novel therapeutic hypotheses. However, the genes discovered to date typically explain less than half of the apparent heritability. Because efforts have largely focused on common genetic variants, one hypothesis is that much of the missing heritability is due to rare genetic variants. Studies of common variants are typically referred to as genomewide association studies, whereas studies of rare variants are often simply called sequencing studies. Because they are actually closely related, we use the terms common variant association study (CVAS) and rare variant association study (RVAS). In this paper, we outline the similarities and differences between RVAS and CVAS and describe a conceptual framework for the design of RVAS. We apply the framework to address key questions about the sample sizes needed to detect association, the relative merits of testing disruptive alleles vs. missense alleles, frequency thresholds for filtering alleles, the value of predictors of the functional impact of missense alleles, the potential utility of isolated populations, the value of gene-set analysis, and the utility of de novo mutations. The optimal design depends critically on the selection coefficient against deleterious alleles and thus varies across genes. The analysis shows that common variant and rare variant studies require similarly large sample collections. In particular, a well-powered RVAS should involve discovery sets with at least 25,000 cases, together with a substantial replication set.
|
226
|
Rare variant association testing by adaptive combination of P-values. PLoS One 2014; 9:e85728. [PMID: 24454922 PMCID: PMC3893264 DOI: 10.1371/journal.pone.0085728] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 08/23/2013] [Accepted: 12/02/2013] [Indexed: 01/21/2023]
Abstract
With the development of next-generation sequencing technology, there is a great demand for powerful statistical methods to detect rare variants (minor allele frequencies (MAFs) < 1%) associated with diseases. Testing for each variant site individually is known to be underpowered, and therefore many methods have been proposed to test for the association of a group of variants with phenotypes, by pooling signals of the variants in a chromosomal region. However, this pooling strategy inevitably leads to the inclusion of a large proportion of neutral variants, which may compromise the power of association tests. To address this issue, we extend the -MidP method (Cheung et al., 2012, Genet Epidemiol 36: 675–685) and propose an approach (named ‘adaptive combination of P-values for rare variant association testing’, abbreviated as ‘ADA’) that adaptively combines per-site P-values with the weights based on MAFs. Before combining P-values, we first imposed a truncation threshold upon the per-site P-values, to guard against the noise caused by the inclusion of neutral variants. This ADA method is shown to outperform popular burden tests and non-burden tests under many scenarios. ADA is recommended for next-generation sequencing data analysis where many neutral variants may be included in a functional region.
|
227
|
Gazave E, Ma L, Chang D, Coventry A, Gao F, Muzny D, Boerwinkle E, Gibbs RA, Sing CF, Clark AG, Keinan A. Neutral genomic regions refine models of recent rapid human population growth. Proc Natl Acad Sci U S A 2014; 111:757-62. [PMID: 24379384 PMCID: PMC3896169 DOI: 10.1073/pnas.1310398110] [Citation(s) in RCA: 64] [Impact Index Per Article: 5.8] [Indexed: 11/18/2022]
Abstract
Human populations have experienced dramatic growth since the Neolithic revolution. Recent studies that sequenced a very large number of individuals observed an extreme excess of rare variants and provided clear evidence of recent rapid growth in effective population size, although estimates have varied greatly among studies. All these studies were based on protein-coding genes, in which variants are also impacted by natural selection. In this study, we introduce targeted sequencing data for studying recent human history with minimal confounding by natural selection. We sequenced loci far from genes that meet a wide array of additional criteria such that mutations in these loci are putatively neutral. As population structure also skews allele frequencies, we sequenced 500 individuals of relatively homogeneous ancestry by first analyzing the population structure of 9,716 European Americans. We used very high coverage sequencing to reliably call rare variants and fit an extensive array of models of recent European demographic history to the site frequency spectrum. The best-fit model estimates ∼3.4% growth per generation during the last ∼140 generations, resulting in a population size increase of two orders of magnitude. This model fits the data very well, largely due to our observation that assumptions of more ancient demography can impact estimates of recent growth. This observation and results also shed light on the discrepancy in demographic estimates among recent studies.
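The growth figures quoted in this abstract can be checked with simple arithmetic. The sketch below is an illustration only, not the authors' demographic model: it assumes constant compound growth at the quoted rate over the quoted number of generations, and the variable names are hypothetical.

```python
import math

# Approximate values quoted in the abstract (both are given as "~").
growth_per_generation = 0.034   # ~3.4% growth per generation
generations = 140               # ~140 generations

# Assumption: constant compound growth, N_t = N_0 * (1 + r)^t.
fold_increase = (1 + growth_per_generation) ** generations
orders_of_magnitude = math.log10(fold_increase)

print(f"fold increase: {fold_increase:.0f}")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

Under these assumptions the fold increase comes out at roughly a hundred, which is consistent with the "two orders of magnitude" stated in the abstract.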
Affiliation(s)
- Elodie Gazave
- Departments of Biological Statistics and Computational Biology and
| | - Li Ma
- Departments of Biological Statistics and Computational Biology and
| | - Diana Chang
- Departments of Biological Statistics and Computational Biology and
| | - Alex Coventry
- Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853
| | - Feng Gao
- Departments of Biological Statistics and Computational Biology and
| | - Donna Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030; and
| | - Eric Boerwinkle
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030; and
| | - Richard A. Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030; and
| | - Charles F. Sing
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48105
| | - Andrew G. Clark
- Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853
| | - Alon Keinan
- Departments of Biological Statistics and Computational Biology and
| |
|
228
|
McCarthy JJ, McLeod HL, Ginsburg GS. Genomic medicine: a decade of successes, challenges, and opportunities. Sci Transl Med 2014; 5:189sr4. [PMID: 23761042 DOI: 10.1126/scitranslmed.3005785] [Citation(s) in RCA: 156] [Impact Index Per Article: 14.2] [Indexed: 12/17/2022]
Abstract
Genomic medicine, an aspirational term 10 years ago, is gaining momentum across the entire clinical continuum from risk assessment in healthy individuals to genome-guided treatment in patients with complex diseases. We review the latest achievements in genome research and their impact on medicine, primarily in the past decade. In most cases, genomic medicine tools remain in the realm of research, but some tools are crossing over into clinical application, where they have the potential to markedly alter the clinical care of patients. In this State of the Art Review, we highlight notable examples including the use of next-generation sequencing in cancer pharmacogenomics, in the diagnosis of rare disorders, and in the tracking of infectious disease outbreaks. We also discuss progress in dissecting the molecular basis of common diseases, the role of the host microbiome, the identification of drug response biomarkers, and the repurposing of drugs. The significant challenges of implementing genomic medicine are examined, along with the innovative solutions being sought. These challenges include the difficulty in establishing clinical validity and utility of tests, how to increase awareness and promote their uptake by clinicians, a changing regulatory and coverage landscape, the need for education, and addressing the ethical aspects of genomics for patients and society. Finally, we consider the future of genomics in medicine and offer a glimpse of the forces shaping genomic medicine, such as fundamental shifts in how we define disease, how medicine is delivered to patients, and how consumers are managing their own health and effecting change.
Affiliation(s)
- Jeanette J McCarthy
- Institute for Genome Sciences & Policy, Duke University, Durham, NC 27708, USA
|
229
|
Strom SP, Lee H, Das K, Vilain E, Nelson SF, Grody WW, Deignan JL. Assessing the necessity of confirmatory testing for exome-sequencing results in a clinical molecular diagnostic laboratory. Genet Med 2014; 16:510-5. [PMID: 24406459 PMCID: PMC4079763 DOI: 10.1038/gim.2013.183] [Citation(s) in RCA: 101] [Impact Index Per Article: 9.2] [Received: 07/10/2013] [Accepted: 10/18/2013] [Indexed: 02/07/2023]
Abstract
Purpose: Sanger sequencing is currently considered the gold standard methodology for clinical molecular diagnostic testing. However, next generation sequencing (NGS) has already emerged as a much more efficient means to identify genetic variants within gene panels, the exome, or the genome. We sought to assess the accuracy of NGS variant identification in our clinical genomics laboratory with the goal of establishing a quality score threshold for confirmatory Sanger-based testing. Methods: Confirmation data for reported results from 144 sequential clinical exome sequencing cases (94 unique variants) and an additional set of 16 variants from comparable research samples were analyzed. Results: Of 110 total SNVs analyzed, 103 had a quality score ≥Q500, all of which (100%) were confirmed by Sanger sequencing. Of the remaining 7 variants with quality scores <Q500, 6 were confirmed by Sanger sequencing (85%). Conclusions: For single nucleotide variants, we predict we will be able to reduce our Sanger confirmation workload going forward by 70–80%. This serves as a proof of principle that as long as sufficient validation and quality control measures are implemented, the volume of Sanger confirmation can be reduced, alleviating a significant amount of the labor and cost burden on clinical laboratories wishing to utilize NGS technology. However, Sanger confirmation of low quality single nucleotide variants and all indels (insertions or deletions less than 10 bp) remains necessary at this time in our laboratory.
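The percentages in this abstract can be re-derived from the counts it reports. The following is a minimal Python sketch of that arithmetic; the helper name `percent` is an illustrative assumption and is not taken from the study. Only the counts (110 SNVs in total, 103 at or above Q500, 6 of the remaining 7 confirmed) come from the abstract.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts as stated in the abstract.
total_snvs = 110
high_quality = 103                        # quality score >= Q500
low_quality = total_snvs - high_quality   # quality score < Q500

high_rate = percent(103, high_quality)    # confirmed / tested at >= Q500
low_rate = percent(6, low_quality)        # confirmed / tested at < Q500
share_high = percent(high_quality, total_snvs)

print(f"Confirmed at >=Q500: {high_rate:.1f}%")
print(f"Confirmed at <Q500:  {low_rate:.1f}%")
print(f"SNVs at >=Q500:      {share_high:.1f}%")
```

Six of seven is about 85.7%, which the abstract reports as 85%.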
Affiliation(s)
- Samuel P Strom
- [1] Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA [2] Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA
| | - Hane Lee
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA
| | - Kingshuk Das
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA
| | - Eric Vilain
- [1] Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA [2] Department of Pediatrics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA
| | - Stanley F Nelson
- [1] Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA [2] Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA
| | - Wayne W Grody
- [1] Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA [2] Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA [3] Department of Pediatrics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA
| | - Joshua L Deignan
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, USA
|
230
|
Zawistowski M, Reppell M, Wegmann D, St Jean PL, Ehm MG, Nelson MR, Novembre J, Zöllner S. Analysis of rare variant population structure in Europeans explains differential stratification of gene-based tests. Eur J Hum Genet 2014; 22:1137-44. [PMID: 24398795 DOI: 10.1038/ejhg.2013.297] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 08/30/2013] [Revised: 11/27/2013] [Accepted: 11/28/2013] [Indexed: 11/09/2022]
Abstract
There is substantial interest in the role of rare genetic variants in the etiology of complex human diseases. Several gene-based tests have been developed to simultaneously analyze multiple rare variants for association with phenotypic traits. The tests can largely be partitioned into two classes - 'burden' tests and 'joint' tests - based on how they accumulate evidence of association across sites. We used the empirical joint site frequency spectra of rare, nonsynonymous variation from a large multi-population sequencing study to explore the effect of realistic rare variant population structure on gene-based tests. We observed an important difference between the two test classes: their susceptibility to population stratification. Focusing on European samples, we found that joint tests, which allow variants to have opposite directions of effect, consistently showed higher levels of P-value inflation than burden tests. We determined that the differential stratification was caused by two specific patterns in the interpopulation distribution of rare variants, each correlating with inflation in one of the test classes. The pattern that inflates joint tests is more prevalent in real data, explaining the higher levels of inflation in these tests. Furthermore, we show that the different sources of inflation between tests lead to heterogeneous responses to genomic control correction and the number of variants analyzed. Our results indicate that care must be taken when interpreting joint and burden analyses of the same set of rare variants, in particular, to avoid mistaking inflated P-values in joint tests for stronger signals of true associations.
Affiliation(s)
- Matthew Zawistowski
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
| | - Mark Reppell
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
| | - Daniel Wegmann
- Department of Biology, University of Fribourg, Fribourg, Switzerland
| | - Pamela L St Jean
- Quantitative Sciences, GlaxoSmithKline, Research Triangle Park, NC, USA
| | - Margaret G Ehm
- Quantitative Sciences, GlaxoSmithKline, Research Triangle Park, NC, USA
| | - Matthew R Nelson
- Quantitative Sciences, GlaxoSmithKline, Research Triangle Park, NC, USA
| | - John Novembre
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
| | - Sebastian Zöllner
- [1] Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA [2] Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
|
231
|
Lange K, Papp JC, Sinsheimer JS, Sobel EM. Next Generation Statistical Genetics: Modeling, Penalization, and Optimization in High-Dimensional Data. Annual Review of Statistics and Its Application 2014; 1:279-300. [PMID: 24955378 PMCID: PMC4062304 DOI: 10.1146/annurev-statistics-022513-115638] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 05/30/2023]
Abstract
Statistical genetics is undergoing the same transition to big data that all branches of applied statistics are experiencing. With the advent of inexpensive DNA sequencing, the transition is only accelerating. This brief review highlights some modern techniques with recent successes in statistical genetics. These include: (a) lasso penalized regression and association mapping, (b) ethnic admixture estimation, (c) matrix completion for genotype and sequence data, (d) the fused lasso and copy number variation, (e) haplotyping, (f) estimation of relatedness, (g) variance components models, and (h) rare variant testing. For more than a century, genetics has been both a driver and beneficiary of statistical theory and practice. This symbiotic relationship will persist for the foreseeable future.
Affiliation(s)
- Kenneth Lange
- Depts of Biomathematics, Human Genetics, and Statistics, UCLA
| | - Janet S. Sinsheimer
- Depts of Biomathematics, Human Genetics, Statistics, and Biostatistics, UCLA
|
232
|
Liao XX, Zhan ZX, Luo YY, Li K, Wang JL, Guo JF, Yan XX, Xia K, Tang BS, Shen L. Association study between SNP rs150689919 in the DNA demethylation gene, TET1, and Parkinson's disease in Chinese Han population. BMC Neurol 2013; 13:196. [PMID: 24325350 PMCID: PMC4028872 DOI: 10.1186/1471-2377-13-196] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/08/2013] [Accepted: 12/02/2013] [Indexed: 12/17/2022]
Abstract
Background: Recent studies suggest that epigenetic factors may play an important role in the pathogenesis of Parkinson’s disease (PD). In our previous work, we sequenced the exomes of sixteen patients from eight Chinese PD families using whole exome sequencing; three patients from different pedigrees were found to share the variant c.1460C>T (rs150689919) in the coding region of the Tet methyl cytosine dioxygenase 1 (TET1) gene. Methods: To evaluate the possible association between sporadic PD and the single nucleotide polymorphism (SNP) rs150689919 in TET1, a case–control cohort study was conducted in 514 sporadic PD patients and 529 normal controls. Genotyping was performed by PCR and direct sequencing. Statistical significance was assessed with the chi-squared test. Results: There was no statistically significant difference in TET1 rs150689919 genotype or allele frequencies between the PD cases and healthy controls, even after stratification by gender and age at onset. Conclusions: Our findings suggest that rs150689919 in TET1 may not be associated with PD in the Chinese population. However, because of the limited data in this study, replication studies in larger samples and other populations are required.
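The allele-frequency comparison described in this abstract is a chi-squared test on a 2x2 table (group by allele). The following is a minimal Python sketch of such a test, not the authors' analysis code. The function name `chi2_2x2` and the minor-allele counts are hypothetical, because the abstract reports no counts; only the sample sizes (514 cases, 529 controls, so 1028 and 1058 alleles) come from the abstract.

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-squared test (1 df, no continuity correction) on a 2x2 table.

    Rows are groups (cases, controls); columns are allele counts (minor, major).
    Returns (statistic, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Each diploid individual contributes two alleles.
case_alleles, control_alleles = 2 * 514, 2 * 529
# HYPOTHETICAL minor-allele counts, for illustration only.
case_minor, control_minor = 12, 10

stat, p = chi2_2x2(case_minor, case_alleles - case_minor,
                   control_minor, control_alleles - control_minor)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```

For a variant this rare, expected cell counts can be small, in which case an exact test may be a safer choice than the chi-squared approximation.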
Affiliation(s)
| | - Lu Shen
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, P.R. China.
|
233
|
Lohmueller KE, Sparsø T, Li Q, Andersson E, Korneliussen T, Albrechtsen A, Banasik K, Grarup N, Hallgrimsdottir I, Kiil K, Kilpeläinen TO, Krarup NT, Pers TH, Sanchez G, Hu Y, Degiorgio M, Jørgensen T, Sandbæk A, Lauritzen T, Brunak S, Kristiansen K, Li Y, Hansen T, Wang J, Nielsen R, Pedersen O. Whole-exome sequencing of 2,000 Danish individuals and the role of rare coding variants in type 2 diabetes. Am J Hum Genet 2013; 93:1072-86. [PMID: 24290377 DOI: 10.1016/j.ajhg.2013.11.005] [Citation(s) in RCA: 116] [Impact Index Per Article: 9.7] [Received: 06/07/2013] [Revised: 10/16/2013] [Accepted: 11/04/2013] [Indexed: 12/15/2022]
Abstract
It has been hypothesized that, in aggregate, rare variants in coding regions of genes explain a substantial fraction of the heritability of common diseases. We sequenced the exomes of 1,000 Danish cases with common forms of type 2 diabetes (including body mass index > 27.5 kg/m² and hypertension) and 1,000 healthy controls to an average depth of 56×. Our simulations suggest that our study had the statistical power to detect at least one causal gene (a gene containing causal mutations) if the heritability of these common diseases was explained by rare variants in the coding regions of a limited number of genes. We applied a series of gene-based tests to detect such susceptibility genes. However, no gene showed a significant association with disease risk after we corrected for the number of genes analyzed. Thus, we could reject a model for the genetic architecture of type 2 diabetes where rare nonsynonymous variants clustered in a modest number of genes (fewer than 20) are responsible for the majority of disease risk.
Affiliation(s)
- Kirk E Lohmueller
- Department of Integrative Biology, University of California, Berkeley, Berkeley, CA 94720, USA
|
234
|
Tenney JR, Prada CE, Hopkin RJ, Hallinan BE. Early spinal cord and brainstem involvement in infantile Leigh syndrome possibly caused by a novel variant. J Child Neurol 2013; 28:1681-5. [PMID: 23143729 DOI: 10.1177/0883073812464273] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022]
Abstract
Leigh syndrome, due to a dysfunction of mitochondrial energy metabolism, is a genetically heterogeneous and progressive neurologic disorder that usually occurs in infancy and childhood. Its clinical presentation and neuroimaging findings can be variable, especially early in the course of the disease. This report presents a patient with infantile Leigh syndrome who had atypical radiologic findings on serial neuroimaging studies with early and severe involvement of the cervical spinal cord and brainstem and injury to the thalami and basal ganglia occurring only late in the clinical course. Postmortem microscopic examination supported this timing of injury within the central nervous system. In addition, mitochondrial deoxyribonucleic acid sequencing showed a novel homoplasmic variant that could be responsible for this unique lethal form of Leigh syndrome.
Affiliation(s)
- Jeffrey R Tenney
- Department of Pediatrics, Division of Neurology, Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, OH, USA
|
235
|
Nourmohammad A, Held T, Lässig M. Universality and predictability in molecular quantitative genetics. Curr Opin Genet Dev 2013; 23:684-93. [PMID: 24291213 DOI: 10.1016/j.gde.2013.11.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 09/12/2013] [Revised: 10/14/2013] [Accepted: 11/01/2013] [Indexed: 12/15/2022]
Abstract
Molecular traits, such as gene expression levels or protein binding affinities, are increasingly accessible to quantitative measurement by modern high-throughput techniques. Such traits measure molecular functions and, from an evolutionary point of view, are important as targets of natural selection. We review recent developments in evolutionary theory and experiments that are expected to become building blocks of a quantitative genetics of molecular traits. We focus on universal evolutionary characteristics: these are largely independent of a trait's genetic basis, which is often at least partially unknown. We show that universal measurements can be used to infer selection on a quantitative trait, which determines its evolutionary mode of conservation or adaptation. Furthermore, universality is closely linked to predictability of trait evolution across lineages. We argue that universal trait statistics extends over a range of cellular scales and opens new avenues of quantitative evolutionary systems biology.
Affiliation(s)
- Armita Nourmohammad
- Joseph Henry Laboratories of Physics and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, United States
|
236
|
Affiliation(s)
- Evan Z Macosko
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
|
237
|
Genetics of psychiatric disorders in the GWAS era: an update on schizophrenia. Eur Arch Psychiatry Clin Neurosci 2013; 263 Suppl 2:S147-54. [PMID: 24071914 DOI: 10.1007/s00406-013-0450-z] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 09/06/2013] [Accepted: 09/16/2013] [Indexed: 01/21/2023]
Abstract
The influence of genetic factors in the development of schizophrenia has been convincingly demonstrated by family, twin, and adoption studies. The statistical construct of heritability is generally used for estimating the liability due to genetic factors. Heritability estimates for schizophrenia are reported to be between 60 and 80%. Due to the technical achievements in whole genome-wide association studies, dissection of the underlying genetic factors was intensified recently, resulting in the conclusion that schizophrenia is essentially a polygenic, complex disorder. Most likely more than 100 genes, each with small effect size, contribute to disease risk. A most recent multi-stage genome-wide association study (Ripke et al. in Nat Genet 2013) identified 22 risk loci and estimated that 8,300 independent single-nucleotide polymorphisms contributed to the risk accounting collectively for 32% in liability. In addition to this polygenic, complex inheritance, there is also strong indication that in some patients a deletion or insertion of a larger chromosomal region [so-called copy number variation (CNV)] might play a crucial role in pathogenesis. This could be specifically important in sporadic cases with schizophrenia, since a higher frequency of de novo mutations has been associated with these CNVs. Further studies, combining much larger sample sizes as well as application of newer technology, such as deep sequencing technologies will be necessary in order to obtain a more comprehensive understanding of the genetic foundations of schizophrenia.
|
238
|
Cardinale CJ, Kelsen JR, Baldassano RN, Hakonarson H. Impact of exome sequencing in inflammatory bowel disease. World J Gastroenterol 2013; 19:6721-9. [PMID: 24187447 PMCID: PMC3812471 DOI: 10.3748/wjg.v19.i40.6721] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 08/11/2013] [Revised: 09/11/2013] [Accepted: 09/16/2013] [Indexed: 02/06/2023]
Abstract
Approaches to understanding the genetic contribution to inflammatory bowel disease (IBD) have continuously evolved from family- and population-based epidemiology, to linkage analysis, and most recently, to genome-wide association studies (GWAS). The next stage in this evolution seems to be the sequencing of the exome, that is, the regions of the human genome which encode proteins. The GWAS approach has been very fruitful in identifying at least 163 loci as being associated with IBD, and now, exome sequencing promises to take our genetic understanding to the next level. In this review we will discuss the possible contributions that can be made by an exome sequencing approach both at the individual patient level to aid with disease diagnosis and future therapies, as well as in advancing knowledge of the pathogenesis of IBD.
|
239
|
Ratnapriya R, Swaroop A. Genetic architecture of retinal and macular degenerative diseases: the promise and challenges of next-generation sequencing. Genome Med 2013; 5:84. [PMID: 24112618 PMCID: PMC4066589 DOI: 10.1186/gm488] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 12/14/2022]
Abstract
Inherited retinal degenerative diseases (RDDs) display wide variation in their mode of inheritance, underlying genetic defects, age of onset, and phenotypic severity. Molecular mechanisms have not been delineated for many retinal diseases, and treatment options are limited. In most instances, genotype-phenotype correlations have not been elucidated because of extensive clinical and genetic heterogeneity. Next-generation sequencing (NGS) methods, including exome, genome, transcriptome and epigenome sequencing, provide novel avenues towards achieving comprehensive understanding of the genetic architecture of RDDs. Whole-exome sequencing (WES) has already revealed several new RDD genes, whereas RNA-Seq and ChIP-Seq analyses are expected to uncover novel aspects of gene regulation and biological networks that are involved in retinal development, aging and disease. In this review, we focus on the genetic characterization of retinal and macular degeneration using NGS technology and discuss the basic framework for further investigations. We also examine the challenges of NGS application in clinical diagnosis and management.
Affiliation(s)
- Rinki Ratnapriya
- Neurobiology-Neurodegeneration and Repair Laboratory, National Eye Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Anand Swaroop
- Neurobiology-Neurodegeneration and Repair Laboratory, National Eye Institute, National Institutes of Health, Bethesda, MD 20892, USA
|
240
|
Niculescu AB. Convergent functional genomics of psychiatric disorders. Am J Med Genet B Neuropsychiatr Genet 2013; 162B:587-94. [PMID: 23728881 DOI: 10.1002/ajmg.b.32163] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 03/06/2013] [Accepted: 03/19/2013] [Indexed: 12/27/2022]
Abstract
Genetic and gene expression studies, in humans and animal models of psychiatric and other medical disorders, are becoming increasingly integrated. Particularly for genomics, the convergence and integration of data across species, experimental modalities and technical platforms is providing a fit-to-disease way of extracting reproducible and biologically important signal, in contrast to the fit-to-cohort effect and limited reproducibility of human genetic analyses alone. With the advent of whole-genome sequencing and the realization that a major portion of the non-coding genome may contain regulatory variants, Convergent Functional Genomics (CFG) approaches are going to be essential to identify disease-relevant signal from the tremendous polymorphic variation present in the general population. Such work in psychiatry can provide an example of how to address other genetically complex disorders, and in turn will benefit by incorporating concepts from other areas, such as cancer, cardiovascular diseases, and diabetes.
Affiliation(s)
- Alexander B Niculescu
- Department of Psychiatry, Indiana University School of Medicine, Indianapolis, Indiana; Indianapolis VA Medical Center, Indianapolis, Indiana
|
241
|
A practical method to detect SNVs and indels from whole genome and exome sequencing data. Sci Rep 2013; 3:2161. [PMID: 23831772 PMCID: PMC3703611 DOI: 10.1038/srep02161] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 02/20/2013] [Accepted: 06/21/2013] [Indexed: 12/21/2022]
Abstract
The recent development of massively parallel sequencing technology has allowed the creation of comprehensive catalogs of genetic variation. However, due to the relatively high sequencing error rate for short read sequence data, sophisticated analysis methods are required to obtain high-quality variant calls. Here, we developed a probabilistic multinomial method for the detection of single nucleotide variants (SNVs) as well as short insertions and deletions (indels) in whole genome sequencing (WGS) and whole exome sequencing (WES) data for single sample calling. Evaluation with DNA genotyping arrays revealed a concordance rate of 99.98% for WGS calls and 99.99% for WES calls. Sanger sequencing of the discordant calls determined the false positive and false negative rates for the WGS (0.0068% and 0.17%) and WES (0.0036% and 0.0084%) datasets. Furthermore, short indels were identified with high accuracy (WGS: 94.7%, WES: 97.3%). We believe our method can contribute to the greater understanding of human diseases.
|
242
|
Abstract
The development of novel technologies for high-throughput DNA sequencing is having a major impact on our ability to measure and define normal and pathologic variation in humans. This review discusses advances in DNA sequencing that have been applied to benign hematologic disorders, including those affecting the red blood cell, the neutrophil, and other white blood cell lineages. Relevant examples of how these approaches have been used for disease diagnosis, gene discovery, and studying complex traits are provided. High-throughput DNA sequencing technology holds significant promise for impacting clinical care. This includes development of improved disease detection and diagnosis, better understanding of disease progression and stratification of risk of disease-specific complications, and development of improved therapeutic strategies, particularly patient-specific pharmacogenomics-based therapy, with monitoring of therapy by genomic biomarkers.
|
243
|
Farrer RA, Henk DA, MacLean D, Studholme DJ, Fisher MC. Using false discovery rates to benchmark SNP-callers in next-generation sequencing projects. Sci Rep 2013; 3:1512. [PMID: 23518929 PMCID: PMC3604800 DOI: 10.1038/srep01512] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 11/09/2012] [Accepted: 02/25/2013] [Indexed: 12/16/2022]
Abstract
Sequence alignments form the basis for many comparative and population genomic studies. Alignment tools provide a range of accuracies dependent on the divergence between the sequences and the alignment methods. Despite widespread use, there is no standard method for assessing the accuracy of a dataset and alignment strategy after resequencing. We present a framework and tool for determining the overall accuracies of an input read dataset, alignment and SNP-calling method, provided an isolate in that dataset has a corresponding, or closely related, reference sequence available. In addition to this tool for comparing False Discovery Rates (FDR), we include a method for determining homozygous and heterozygous positions from an alignment using binomial probabilities for an expected error rate. We benchmark this method against other SNP callers using our FDR method with three fungal genomes, finding that it was able to achieve a high level of accuracy. These tools are available at http://cfdr.sourceforge.net/.
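The idea of calling homozygous and heterozygous positions from binomial probabilities under an expected error rate can be illustrated as follows. This is a minimal Python sketch under stated assumptions, not the implementation distributed by the authors: the three-genotype diploid model, the function names `binom_pmf` and `call_genotype`, and the default 1% error rate are all illustrative choices.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def call_genotype(depth: int, alt_reads: int, error_rate: float = 0.01) -> str:
    """Pick the genotype whose binomial likelihood of the read counts is highest.

    Assumed model (hypothetical): the expected fraction of alternate-allele
    reads is error_rate for hom_ref, 0.5 for het, and 1 - error_rate for hom_alt.
    """
    if depth <= 0 or not 0 <= alt_reads <= depth:
        raise ValueError("need depth > 0 and 0 <= alt_reads <= depth")
    if not 0.0 < error_rate < 0.5:
        raise ValueError("error_rate must be between 0 and 0.5")
    likelihoods = {
        "hom_ref": binom_pmf(alt_reads, depth, error_rate),
        "het": binom_pmf(alt_reads, depth, 0.5),
        "hom_alt": binom_pmf(alt_reads, depth, 1.0 - error_rate),
    }
    return max(likelihoods, key=likelihoods.get)


# Example pileups at 20x depth: (depth, reads supporting the alternate allele).
for depth, alt in [(20, 0), (20, 10), (20, 20)]:
    print(depth, alt, call_genotype(depth, alt))
```

A production caller would also weigh base and mapping qualities and apply a minimum-depth filter; those are left out here to keep the binomial step visible.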
Affiliation(s)
- Rhys A Farrer
- Department of Infectious Disease Epidemiology, St Mary's Hospital, Imperial College London, London, UK.
|
244
|
He X, Sanders SJ, Liu L, De Rubeis S, Lim ET, Sutcliffe JS, Schellenberg GD, Gibbs RA, Daly MJ, Buxbaum JD, State MW, Devlin B, Roeder K. Integrated model of de novo and inherited genetic variants yields greater power to identify risk genes. PLoS Genet 2013; 9:e1003671. [PMID: 23966865 PMCID: PMC3744441 DOI: 10.1371/journal.pgen.1003671] [Citation(s) in RCA: 195] [Impact Index Per Article: 16.3] [Received: 01/22/2013] [Accepted: 06/10/2013] [Indexed: 01/31/2023]
Abstract
De novo mutations affect risk for many diseases and disorders, especially those with early-onset. An example is autism spectrum disorders (ASD). Four recent whole-exome sequencing (WES) studies of ASD families revealed a handful of novel risk genes, based on independent de novo loss-of-function (LoF) mutations falling in the same gene, and found that de novo LoF mutations occurred at a twofold higher rate than expected by chance. However successful these studies were, they used only a small fraction of the data, excluding other types of de novo mutations and inherited rare variants. Moreover, such analyses cannot readily incorporate data from case-control studies. An important research challenge in gene discovery, therefore, is to develop statistical methods that accommodate a broader class of rare variation. We develop methods that can incorporate WES data regarding de novo mutations, inherited variants present, and variants identified within cases and controls. TADA, for Transmission And De novo Association, integrates these data by a gene-based likelihood model involving parameters for allele frequencies and gene-specific penetrances. Inference is based on a Hierarchical Bayes strategy that borrows information across all genes to infer parameters that would be difficult to estimate for individual genes. In addition to theoretical development we validated TADA using realistic simulations mimicking rare, large-effect mutations affecting risk for ASD and show it has dramatically better power than other common methods of analysis. Thus TADA's integration of various kinds of WES data can be a highly effective means of identifying novel risk genes. Indeed, application of TADA to WES data from subjects with ASD and their families, as well as from a study of ASD subjects and controls, revealed several novel and promising ASD candidate genes with strong statistical support. 
The genetic underpinnings of autism spectrum disorder (ASD) have proven difficult to determine, despite a wealth of evidence for genetic causes and ongoing effort to identify genes. Recently investigators sequenced the coding regions of the genomes from ASD children along with their unaffected parents (ASD trios) and identified numerous new candidate genes by pinpointing spontaneously occurring (de novo) mutations in the affected offspring. A gene with a severe (de novo) mutation observed in more than one individual is immediately implicated in ASD; however, the majority of severe mutations are observed only once per gene. These genes create a short list of candidates, and our results suggest about 50% are true risk genes. To strengthen our inferences, we develop a novel statistical method (TADA) that utilizes inherited variation transmitted to affected offspring in conjunction with (de novo) mutations to identify risk genes. Through simulations we show that TADA dramatically increases power. We apply this approach to nearly 1000 ASD trios and 2000 subjects from a case-control study and identify several promising genes. Through simulations and application we show that TADA's integration of sequencing data can be a highly effective means of identifying risk genes.
Affiliation(s)
- Xin He
- Lane Center of Computational Biology, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
| | - Stephan J. Sanders
- Departments of Psychiatry and Genetics, Yale University School of Medicine, New Haven, Connecticut, United States of America
| | - Li Liu
- Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
| | - Silvia De Rubeis
- Seaver Autism Center for Research and Treatment, Icahn Mount Sinai School of Medicine, New York, New York, United States of America
- Department of Psychiatry, Icahn Mount Sinai School of Medicine, New York, New York, United States of America
| | - Elaine T. Lim
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
| | - James S. Sutcliffe
- Vanderbilt Brain Institute, Departments of Molecular Physiology & Biophysics and Psychiatry, Vanderbilt University, Nashville, Tennessee, United States of America
| | - Gerard D. Schellenberg
- Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Richard A. Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
| | - Mark J. Daly
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
| | - Joseph D. Buxbaum
- Seaver Autism Center for Research and Treatment, Icahn Mount Sinai School of Medicine, New York, New York, United States of America
- Department of Psychiatry, Icahn Mount Sinai School of Medicine, New York, New York, United States of America
- Department of Genetics and Genomic Sciences, Icahn Mount Sinai School of Medicine, New York, New York, United States of America
- Friedman Brain Institute, Icahn Mount Sinai School of Medicine, New York, New York, United States of America
| | - Matthew W. State
- Departments of Psychiatry and Genetics, Yale University School of Medicine, New Haven, Connecticut, United States of America
| | - Bernie Devlin
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, United States of America
| | - Kathryn Roeder
- Lane Center of Computational Biology, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
|
245
|
Neutral and weakly nonneutral sequence variants may define individuality. Proc Natl Acad Sci U S A 2013; 110:14255-60. [PMID: 23940345 DOI: 10.1073/pnas.1216613110] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 12/24/2022] Open
Abstract
Large-scale computational analyses of the growing wealth of genome-variation data consistently tell two distinct stories. The first is expected: coding variants reported in disease-related databases significantly alter the function of affected proteins. The second is surprising: the genomes of healthy individuals appear to carry many variants that are predicted to have some effect on function. As long as the complete experimental analysis of all human genome variants remains impossible, computational methods, such as PolyPhen, SNAP, and SIFT, might provide important insights. These methods capture the effects of particular variants very well and can highlight trends in populations of variants. Diseases are, arguably, extreme phenotypic variations and are often attributable to one or a few severely functionally disruptive variants. Our findings suggest a genomic basis of the different nondisease phenotypes. Prediction methods indicate that variants in seemingly healthy individuals tend to be neutral or weakly disruptive for protein molecular function. These variant effects are predicted to be largely either experimentally undetectable or are not deemed significant enough to be published. This may suggest that nondisease phenotypes arise through combinations of many variants whose effects are weakly nonneutral (damaging or enhancing) to the molecular protein function but fall within the wild-type range of overall physiological function.
|
246
|
Hecht M, Bromberg Y, Rost B. News from the protein mutability landscape. J Mol Biol 2013; 425:3937-48. [PMID: 23896297 DOI: 10.1016/j.jmb.2013.07.028] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Received: 04/30/2013] [Revised: 07/08/2013] [Accepted: 07/19/2013] [Indexed: 12/16/2022]
Abstract
Some mutations of protein residues matter more than others, and these are often conserved evolutionarily. The explosion of deep sequencing and genotyping increasingly requires the distinction between effect and neutral variants. The simplest approach predicts all mutations of conserved residues to have an effect; however, this works poorly, at best. Many computational tools that are optimized to predict the impact of point mutations provide more detail. Here, we expand the perspective from the view of single variants to the level of sketching the entire mutability landscape. This landscape is defined by the impact of substituting every residue at each position in a protein by each of the 19 non-native amino acids. We review some of the powerful conclusions about protein function, stability and their robustness to mutation that can be drawn from such an analysis. Large-scale experimental and computational mutagenesis experiments are increasingly furthering our understanding of protein function and of the genotype-phenotype associations. We also discuss how these can be used to improve predictions of protein function and pathogenicity of missense variants.
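The landscape described in this abstract has one entry per position and per non-native amino acid, so a protein of length L yields L × 19 single-residue substitutions. The following is a minimal Python sketch of that enumeration only; the function names, the toy sequence, and the empty-score container are illustrative assumptions and are not taken from the paper or from any prediction tool.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def enumerate_substitutions(sequence):
    """Yield (position, native, substitute) for every single-residue substitution.

    Positions are 1-based. Each position contributes the 19 non-native residues.
    """
    for position, native in enumerate(sequence, start=1):
        if native not in AMINO_ACIDS:
            raise ValueError(f"non-standard residue {native!r} at position {position}")
        for substitute in AMINO_ACIDS:
            if substitute != native:
                yield position, native, substitute


def empty_landscape(sequence):
    """Return a dict keyed by (position, substitute) with placeholder scores.

    A real analysis would fill each slot with a predicted or measured effect;
    here every value is None because no scoring method is assumed.
    """
    return {(pos, sub): None for pos, _, sub in enumerate_substitutions(sequence)}


# Hypothetical three-residue toy sequence, used only to show the grid size.
toy_sequence = "MKT"
landscape = empty_landscape(toy_sequence)
print(len(landscape))  # 3 positions x 19 substitutions = 57
```

Keying the grid by (position, substitute) keeps the native residue implicit, which is enough for a per-protein table; a multi-protein analysis would need a protein identifier in the key as well.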
Affiliation(s)
- Maximilian Hecht
- Department of Bioinformatics and Computational Biology I12, Technische Universität München, Boltzmannstrasse 3, 85748 Garching, Germany.
|
247
|
Abstract
Genomic technologies are reaching the point of being able to detect genetic variation in patients at high accuracy and reduced cost, offering the promise of fundamentally altering medicine. Still, although scientists and policy advisers grapple with how to interpret and how to handle the onslaught and ambiguity of genome-wide data, established and well-validated molecular technologies continue to have an important role, especially in regions of the world that have more limited access to next-generation sequencing capabilities. Here we review the range of methods currently available in a clinical setting as well as emerging approaches in clinical molecular diagnostics. In parallel, we outline implementation challenges that will be necessary to address to ensure the future of genetic medicine.
|
248
|
Gao Q, Sun W, You X, Froehler S, Chen W. A systematic evaluation of hybridization-based mouse exome capture system. BMC Genomics 2013; 14:492. [PMID: 23870319 PMCID: PMC3722074 DOI: 10.1186/1471-2164-14-492] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 01/30/2013] [Accepted: 07/19/2013] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND: Exome sequencing is increasingly used to search for phenotypically relevant sequence variants in the mouse genome. All current hybridization-based mouse exome capture systems are designed from the genome reference sequence of the C57BL/6J strain. Given that substantial sequence divergence exists between C57BL/6J and other, distantly related strains, the impact of sequence divergence on the efficiency of such capture systems needs to be systematically evaluated before they can be widely applied to the study of those strains. RESULTS: Using the Agilent SureSelect mouse exome capture system, we performed exome sequencing on F1 generation hybrid mice that were derived by crossing two divergent strains, C57BL/6J and SPRET/EiJ. Our results showed that the C57BL/6J-based probes captured the sequences derived from C57BL/6J alleles more efficiently and that the bias was higher for the target regions with greater sequence divergence. At low sequencing depths, the bias also affected the efficiency of variant detection. However, the effects became negligible when sufficient sequencing depth was achieved. CONCLUSION: When the C57BL/6J-based Agilent SureSelect exome capture system is to be used, sufficient sequencing depth needs to be planned to match the sequence divergence between C57BL/6J and the strain to be studied.
|
249
|
Sverdlov S, Thompson EA. Correlation between relatives given complete genotypes: from identity by descent to identity by function. Theor Popul Biol 2013; 88:57-67. [PMID: 23851163 DOI: 10.1016/j.tpb.2013.06.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/27/2012] [Revised: 04/18/2013] [Accepted: 06/12/2013] [Indexed: 02/06/2023]
Abstract
In classical quantitative genetics, the correlation between the phenotypes of individuals with unknown genotypes and a known pedigree relationship is expressed in terms of probabilities of IBD states. In existing approaches to the inverse problem where genotypes are observed but pedigree relationships are not, dependence between phenotypes is either modeled as Bayesian uncertainty or mapped to an IBD model via inferred relatedness parameters. Neither approach yields a relationship between genotypic similarity and phenotypic similarity with a probabilistic interpretation corresponding to a generative model. We introduce a generative model for diploid allele effect based on the classic infinite allele mutation process. This approach motivates the concept of IBF (Identity by Function). The phenotypic covariance between two individuals given their diploid genotypes is expressed in terms of functional identity states. The IBF parameters define a genetic architecture for a trait without reference to specific alleles or population. Given full genome sequences, we treat a gene-scale functional region, rather than a SNP, as a QTL, modeling patterns of dominance for multiple alleles. Applications demonstrated by simulation include phenotype and effect prediction and association, and estimation of heritability and classical variance components. A simulation case study of the Missing Heritability problem illustrates a decomposition of heritability under the IBF framework into Explained and Unexplained components.
Affiliation(s)
- Serge Sverdlov
- Department of Statistics, University of Washington, Box 354322, Seattle, WA 98195, USA.
|
250
|
Indap AR, Cole R, Runge CL, Marth GT, Olivier M. Variant discovery in targeted resequencing using whole genome amplified DNA. BMC Genomics 2013; 14:468. [PMID: 23837845 PMCID: PMC3716764 DOI: 10.1186/1471-2164-14-468] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/26/2012] [Accepted: 06/21/2013] [Indexed: 01/20/2023] Open
Abstract
Background: Next-generation sequencing and advances in genomic enrichment technologies have enabled the discovery of the full spectrum of variants, from common to rare alleles, in the human population. The application of such technologies can be limited by the amount of DNA available. Whole genome amplification (WGA) can overcome such limitations. Here we investigate the applicability of WGA by comparing SNP and INDEL variant calls from a single genomic/WGA sample pair in two separate capture experiments: a 50 Mbp whole-exome capture and a custom capture array of a 4 Mbp region on chr12. Results: Our results comparing variant calls derived from genomic and WGA DNA show that the majority of SNP and INDEL calls are common to both callsets, at both the site and genotype level, and suggest that allele bias plays a minimal role when using WGA DNA in resequencing studies. Conclusions: Although the results of this study are based on a limited sample size, they suggest that using WGA DNA allows the discovery of the vast majority of variants and achieves high concordance metrics when compared with genomic DNA calls.
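Comparing two callsets at the site level and at the genotype level can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the dictionary representation of a callset, the toy variants, and the specific definitions (site concordance as shared sites over the union of sites, genotype concordance as matching genotypes over shared sites). The abstract does not state which concordance metrics the study used.

```python
def callset_concordance(calls_a, calls_b):
    """Compare two callsets given as {(chrom, pos, ref, alt): genotype}.

    Site concordance: shared sites divided by the union of sites (assumed definition).
    Genotype concordance: shared sites with identical genotypes divided by shared sites.
    """
    sites_a, sites_b = set(calls_a), set(calls_b)
    shared = sites_a & sites_b
    union = sites_a | sites_b
    matching = sum(1 for site in shared if calls_a[site] == calls_b[site])
    return {
        "shared_sites": len(shared),
        "only_a": len(sites_a - sites_b),
        "only_b": len(sites_b - sites_a),
        "site_concordance": len(shared) / len(union) if union else float("nan"),
        "genotype_concordance": matching / len(shared) if shared else float("nan"),
    }


# Hypothetical toy callsets; coordinates and genotypes are invented.
genomic_calls = {
    ("chr12", 100, "A", "G"): "0/1",
    ("chr12", 200, "C", "T"): "1/1",
    ("chr12", 300, "G", "A"): "0/1",
}
wga_calls = {
    ("chr12", 100, "A", "G"): "0/1",
    ("chr12", 200, "C", "T"): "0/1",  # same site, different genotype
    ("chr12", 400, "T", "C"): "0/1",  # site seen only in this callset
}

result = callset_concordance(genomic_calls, wga_calls)
print(result["site_concordance"], result["genotype_concordance"])
```

Keying on (chrom, pos, ref, alt) treats different alternate alleles at one position as different sites, which is one reasonable convention among several for INDELs and multi-allelic sites.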
Affiliation(s)
- Amit R Indap
- Department of Biology, Boston College, Chestnut Hill, MA, USA.
|