1
|
The road less traveled: from genotype to phenotype in flies and humans. Mamm Genome 2017; 29:5-23. [DOI: 10.1007/s00335-017-9722-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 06/27/2017] [Accepted: 10/05/2017] [Indexed: 12/20/2022]
|
2
|
Zheng J, Rodriguez S, Laurin C, Baird D, Trela-Larsen L, Erzurumluoglu MA, Zheng Y, White J, Giambartolomei C, Zabaneh D, Morris R, Kumari M, Casas JP, Hingorani AD, Evans DM, Gaunt TR, Day INM. HAPRAP: a haplotype-based iterative method for statistical fine mapping using GWAS summary statistics. Bioinformatics 2017; 33:79-86. [PMID: 27591082 PMCID: PMC5544112 DOI: 10.1093/bioinformatics/btw565] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/18/2014] [Revised: 04/29/2016] [Accepted: 08/26/2016] [Indexed: 11/21/2022]
Abstract
MOTIVATION Fine mapping is a widely used approach for identifying the causal variant(s) at disease-associated loci. Standard methods (e.g. multiple regression) require individual-level genotypes. Recent fine mapping methods using summary-level data require the pairwise correlation coefficients (r²) of the variants. However, haplotypes, rather than pairwise r², are the true biological representation of linkage disequilibrium (LD) among multiple loci. In this article, we present an empirical iterative method, HAPlotype Regional Association analysis Program (HAPRAP), that enables fine mapping using summary statistics and haplotype information from an individual-level reference panel. RESULTS Simulations with individual-level genotypes show that the results of HAPRAP and multiple regression are highly consistent. In simulation with summary-level data, we demonstrate that HAPRAP is less sensitive to poor LD estimates. In a parametric simulation using Genetic Investigation of ANthropometric Traits height data, HAPRAP performs well with a small training sample size (N < 2000) while other methods become suboptimal. Moreover, HAPRAP's performance is not affected substantially by single nucleotide polymorphisms (SNPs) with low minor allele frequencies. We applied the method to existing quantitative trait and binary outcome meta-analyses (human height, QTc interval and gallbladder disease); all previously reported association signals were replicated and two additional variants were independently associated with human height. Due to the growing availability of summary-level data, the value of HAPRAP is likely to increase markedly for future analyses (e.g. functional prediction and identification of instruments for Mendelian randomization).
AVAILABILITY AND IMPLEMENTATION The HAPRAP package and documentation are available at http://apps.biocompute.org.uk/haprap/ CONTACT: jie.zheng@bristol.ac.uk or tom.gaunt@bristol.ac.uk SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
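The relationship between haplotypes and pairwise LD mentioned in this abstract can be illustrated with a minimal sketch. It is not part of HAPRAP; the function name `pairwise_r2` and the haplotype labels `AB`, `Ab`, `aB`, `ab` are assumptions made for illustration. It shows that the pairwise r² is a summary computed from two-locus haplotype frequencies, so the haplotype frequencies carry at least as much information as r².

```python
def pairwise_r2(hap_freqs):
    """Squared correlation (r^2) between two biallelic loci.

    hap_freqs maps the four two-locus haplotypes 'AB', 'Ab', 'aB', 'ab'
    (hypothetical labels) to their population frequencies, which sum to 1.
    """
    p_ab = hap_freqs["AB"]
    p_a = hap_freqs["AB"] + hap_freqs["Ab"]   # frequency of allele A at locus 1
    p_b = hap_freqs["AB"] + hap_freqs["aB"]   # frequency of allele B at locus 2
    d = p_ab - p_a * p_b                      # classical LD coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Illustrative (made-up) haplotype frequencies.
example_haplotypes = {"AB": 0.4, "Ab": 0.1, "aB": 0.1, "ab": 0.4}
example_r2 = pairwise_r2(example_haplotypes)
```

With the made-up frequencies above, both allele frequencies are 0.5 and D is 0.15, giving r² = 0.36.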
Affiliation(s)
- Jie Zheng
  - MRC Integrative Epidemiology Unit, School of Social and Community Medicine, Bristol, UK
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
- Santiago Rodriguez
  - MRC Integrative Epidemiology Unit, School of Social and Community Medicine, Bristol, UK
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
- Charles Laurin
  - MRC Integrative Epidemiology Unit, School of Social and Community Medicine, Bristol, UK
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
- Denis Baird
  - MRC Integrative Epidemiology Unit, School of Social and Community Medicine, Bristol, UK
- Lea Trela-Larsen
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
- Mesut A Erzurumluoglu
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
  - Department of Health Sciences, Genetic Epidemiology Group, University of Leicester, Leicester, UK
- Yi Zheng
  - Dedman College of Humanities and Sciences, Southern Methodist University, Dallas, TX, USA
- Jon White
  - Department of Genetics, Environment and Evolution, University College London Genetics Institute, London, UK
- Claudia Giambartolomei
  - Department of Genetics, Environment and Evolution, University College London Genetics Institute, London, UK
- Delilah Zabaneh
  - Department of Genetics, Environment and Evolution, University College London Genetics Institute, London, UK
- Richard Morris
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
- Meena Kumari
  - Department of Genetics, Environment and Evolution, University College London Genetics Institute, London, UK
- Juan P Casas
  - Department of Genetics, Environment and Evolution, University College London Genetics Institute, London, UK
  - Department of Primary Care & Population Health, University College London, Royal Free Campus, London, UK
- Aroon D Hingorani
  - Department of Genetics, Environment and Evolution, University College London Genetics Institute, London, UK
  - Centre for Clinical Pharmacology, Division of Medicine, University College London, London, UK
- David M Evans
  - MRC Integrative Epidemiology Unit, School of Social and Community Medicine, Bristol, UK
  - University of Queensland Diamantina Institute, Translational Research Institute, Brisbane, QLD, Australia
- Tom R Gaunt
  - MRC Integrative Epidemiology Unit, School of Social and Community Medicine, Bristol, UK
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
- Ian N M Day
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
|
3
|
Abstract
During the 1990s and the first several years of this century, microsatellites or short tandem repeats were the workhorse genetic markers for hypothesis-independent studies in human genetics, facilitating genome-wide linkage studies and allelic imbalance studies. However, the rise of higher throughput and cost-effective single-nucleotide polymorphism (SNP) platforms led to the era of the SNP for genome scans. Nevertheless, it is important to note that microsatellites remain highly informative and useful measures of genomic variation for linkage and association studies. Their continued advantage in complementing SNPs lies in their greater allelic diversity than biallelic SNPs as well as in their population history, in which single-step expansion or contraction of the tandem repeat on the background of ancestral SNP haplotypes can break up common haplotypes, leading to greater haplotype diversity within the linkage disequilibrium block of interest. In fact, microsatellites have starred in association studies leading to widely replicated discoveries of type 2 diabetes (TCF7L2) and prostate cancer genes (the 8q21 region). At the end of the day, it will be important to catalog all variation, including SNPs, microsatellites, copy number variations, and polymorphic inversions in human genetic studies. This article describes the utilities of microsatellites and experimental approaches in their use.
|
4
|
Yang J, Ferreira T, Morris AP, Medland SE, Madden PAF, Heath AC, Martin NG, Montgomery GW, Weedon MN, Loos RJ, Frayling TM, McCarthy MI, Hirschhorn JN, Goddard ME, Visscher PM. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat Genet 2012; 44:369-75, S1-3. [PMID: 22426310 DOI: 10.1038/ng.2213] [Citation(s) in RCA: 973] [Impact Index Per Article: 81.1] [Received: 08/19/2011] [Accepted: 02/06/2012] [Indexed: 12/14/2022]
Abstract
We present an approximate conditional and joint association analysis that can use summary-level statistics from a meta-analysis of genome-wide association studies (GWAS) and estimated linkage disequilibrium (LD) from a reference sample with individual-level genotype data. Using this method, we analyzed meta-analysis summary data from the GIANT Consortium for height and body mass index (BMI), with the LD structure estimated from genotype data in two independent cohorts. We identified 36 loci with multiple associated variants for height (38 leading and 49 additional SNPs, 87 in total) via a genome-wide SNP selection procedure. The 49 new SNPs explain approximately 1.3% of variance, nearly doubling the heritability explained at the 36 loci. We did not find any locus showing multiple associated SNPs for BMI. The method we present is computationally fast and is also applicable to case-control data, which we demonstrate in an example from meta-analysis of type 2 diabetes by the DIAGRAM Consortium.
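The core idea of recovering joint effects from marginal (single-SNP) summary statistics and LD can be sketched for the simplest case of two SNPs. This is an illustrative simplification, not the method's published implementation: it assumes standardized genotypes and phenotype, so that the marginal effects equal the LD correlation matrix times the joint effects, and the function name `joint_from_marginal` is hypothetical.

```python
def joint_from_marginal(b1, b2, r):
    """Approximate joint effects of two SNPs from their marginal effects.

    Assumes standardized genotypes and phenotype, so that
        b1 = beta1 + r * beta2
        b2 = r * beta1 + beta2
    where r is the LD correlation taken from a reference sample.
    Solving this 2x2 system gives the joint effects (beta1, beta2).
    """
    denom = 1.0 - r * r
    if denom <= 0.0:
        raise ValueError("SNPs in perfect LD cannot be separated")
    beta1 = (b1 - r * b2) / denom
    beta2 = (b2 - r * b1) / denom
    return beta1, beta2


# Made-up example: only SNP 1 has a true effect (0.3); SNP 2 shows a
# marginal signal of 0.18 purely because of LD (r = 0.6) with SNP 1.
example_joint = joint_from_marginal(0.3, 0.18, 0.6)
```

In the made-up example, the joint analysis attributes the whole signal to SNP 1 and returns an effect near zero for SNP 2, which is the behaviour a conditional analysis is meant to expose.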
Affiliation(s)
- Jian Yang
- Queensland Institute of Medical Research, Brisbane, Queensland, Australia
|
5
|
Wray NR. Allele Frequencies and the r² Measure of Linkage Disequilibrium: Impact on Design and Interpretation of Association Studies. Twin Res Hum Genet 2012. [DOI: 10.1375/twin.8.2.87] [Citation(s) in RCA: 81] [Impact Index Per Article: 6.8] [Indexed: 11/05/2022]
Abstract
The design and interpretation of genetic association studies depends on the relationship between the genotyped variants and the underlying functional variant, often parameterized as the squared correlation or r² measure of linkage disequilibrium between two loci. While it has long been recognized that placing a constraint on the r² between two loci also places a constraint on the difference in frequencies between the coupled alleles, this constraint has not been quantified. Here, quantification of this severe constraint is presented. For example, for r² ≥ .8, the maximum difference in allele frequency is ± .06, which occurs when one locus has allele frequency .5. For r² ≥ .8 and allele frequency at one locus of .1, the maximum difference in allele frequency at the second locus is only ± .02. The impact on the design and interpretation of association studies is discussed.
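The two numerical examples in this abstract can be reproduced from the standard bound on r² given the allele frequencies at two biallelic loci. The sketch below is an illustration derived from the definition r² = D² / (p(1-p)q(1-q)) and the maximum attainable D for coupled alleles; the function names are hypothetical and the code is not taken from the paper.

```python
def max_r2(p, q):
    """Largest r^2 attainable between coupled alleles with frequencies p and q.

    For lo = min(p, q) and hi = max(p, q), the maximum D is lo * (1 - hi),
    which gives r^2_max = lo * (1 - hi) / (hi * (1 - lo)).
    """
    lo, hi = sorted((p, q))
    return lo * (1 - hi) / (hi * (1 - lo))


def coupled_freq_range(p, r2_min):
    """Range of allele frequencies q at a second locus for which
    r^2 >= r2_min is still attainable, given frequency p at the first."""
    low = r2_min * p / (1 - p + r2_min * p)    # solves max_r2(p, q) = r2_min for q <= p
    high = p / (p + r2_min * (1 - p))          # solves max_r2(p, q) = r2_min for q >= p
    return low, high


range_at_half = coupled_freq_range(0.5, 0.8)    # roughly (0.444, 0.556)
range_at_tenth = coupled_freq_range(0.1, 0.8)   # roughly (0.082, 0.122)
```

For p = .5 the permitted frequencies lie within about .056 of .5, and for p = .1 within about .02 of .1, in line with the rounded values of ± .06 and ± .02 quoted in the abstract.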
|
6
|
Ke X, Kennedy LJ, Short AD, Seppälä EH, Barnes A, Clements DN, Wood SH, Carter SD, Happ GM, Lohi H, Ollier WER. Assessment of the functionality of genome-wide canine SNP arrays and implications for canine disease association studies. Anim Genet 2010; 42:181-90. [DOI: 10.1111/j.1365-2052.2010.02132.x] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 01/22/2023]
|
7
|
Möckelmann N, von Schönfels W, Buch S, von Kampen O, Sipos B, Egberts JH, Rosenstiel P, Franke A, Brosch M, Hinz S, Röder C, Kalthoff H, Fölsch UR, Krawczak M, Schreiber S, Bröring CD, Tepel J, Schafmayer C, Hampe J. Investigation of innate immunity genes CARD4, CARD8 and CARD15 as germline susceptibility factors for colorectal cancer. BMC Gastroenterol 2009; 9:79. [PMID: 19843337 PMCID: PMC2776017 DOI: 10.1186/1471-230x-9-79] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Received: 03/26/2009] [Accepted: 10/20/2009] [Indexed: 02/08/2023]
Abstract
Background Variation in genes involved in the innate immune response may play a role in the predisposition to colorectal cancer (CRC). Several polymorphisms of the CARD15 gene (caspase activating recruitment domain, member 15) have been reported to be associated with an increased susceptibility to Crohn disease. Since the CARD15 gene product and other CARD proteins function in innate immunity, we investigated the impact of germline variation at the CARD4, CARD8 and CARD15 loci on the risk for sporadic CRC, using a large patient sample from Northern Germany. Methods A total of 1044 patients who had undergone surgery for sporadic colorectal carcinoma (median age at diagnosis: 59 years) were recruited and compared to 724 sex-matched, population-based control individuals (median age: 68 years). Genetic investigation was carried out following both a coding SNP and haplotype tagging approach. Subgroup analyses for N = 143 patients with early manifestation of CRC (age at diagnosis ≤50 years) were performed for all CARD loci and subgroup analyses for diverse age strata were carried out for CARD15 mutations R702W, G908R and L1007fs. In addition, all SNPs were tested for association with disease presentation and family history of CRC. Results No significant differences were observed between the patient and control allelic or haplotypic spectra of the three genes under study for the total cohort (N = 1044 patients). None of the analysed SNPs was significantly associated with tumour location or yielded a significant association in the familial or non-familial CRC patient subgroups. However, in a patient subgroup with early disease manifestation (age at diagnosis ≤45 years), the mutant allele of CARD15 R702W was found to be significantly associated with disease susceptibility (9.7% in cases vs 4.6% in controls; P_allelic = 0.008, P_genotypic = 0.0008, OR_allelic = 2.22 (1.21-4.05), OR_recessive = 21.9 (1.96-245.4)).
Conclusion Variation in the innate immunity genes CARD4, CARD8 and CARD15 is unlikely to play a major role in the susceptibility to CRC in the German population. However, we report a significant disease contribution of CARD15 for CRC patients with very early disease manifestation, mainly driven by variant R702W.
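Allelic odds ratios with confidence intervals, as reported in the Results above, are commonly obtained from a 2x2 table of allele counts. The sketch below uses the standard log-odds (Woolf) interval; it is an assumption that a comparable calculation underlies the reported values, since the abstract does not state the method, and the allele counts in the example are invented for illustration only.

```python
import math


def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table of allele counts.

    Uses the Woolf (log odds ratio) standard error:
        SE = sqrt(1/a + 1/b + 1/c + 1/d)
    All four counts must be positive.
    """
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    se = math.sqrt(1 / case_risk + 1 / case_other + 1 / ctrl_risk + 1 / ctrl_other)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts (not from the study): 20 risk alleles among 200 case
# chromosomes versus 10 risk alleles among 200 control chromosomes.
example_or, example_lower, example_upper = allelic_odds_ratio(20, 180, 10, 190)
```

The very wide interval reported for the recessive model (1.96-245.4) is what this kind of formula produces when one cell of the table contains very few observations, because the reciprocal of a small count dominates the standard error.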
Affiliation(s)
- Nikolaus Möckelmann
- Department of General Internal Medicine Christian-Albrechts-University, Kiel, Germany.
|
8
|
Xiong S, Hao Y, Rao S, Huang W, Hu B, Labu, Pubuzhuoma, Gesangzhuogab, Wang Y. Effects of cutoff thresholds for minor allele frequencies on HapMap resolution: A real dataset-based evaluation of the Chinese Han and Tibetan populations. Sci Bull (Beijing) 2009. [DOI: 10.1007/s11434-009-0302-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
9
|
Hong KW, Jin HS, Lim JE, Ryu HJ, Go MJ, Lee JY, Woo JT, Park HK, Oh B. RAPGEF1 gene variants associated with type 2 diabetes in the Korean population. Diabetes Res Clin Pract 2009; 84:117-22. [PMID: 19297053 DOI: 10.1016/j.diabres.2009.02.019] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 12/01/2008] [Revised: 02/04/2009] [Accepted: 02/18/2009] [Indexed: 10/21/2022]
Abstract
Under the activation of insulin receptors, glucose transporter 4 (Glut4) translocation is regulated by two signal transduction pathways. These pathways are the PI 3-kinase-dependent pathway and the CAP/TC10 pathway. The adaptor protein Rap guanine exchange factor 1 (RAPGEF1), also known as C3G, is a component of the CAP/TC10 pathway. Defects in the RAPGEF1 protein may contribute to insulin resistance and type 2 diabetes. Recently, the RAPGEF1 gene was suggested to be involved in the development of type 2 diabetes by the FUSION study. To investigate this association in the Korean population, we sequenced the RAPGEF1 gene in 24 unrelated individuals and identified 39 sequence variants. Eleven single nucleotide polymorphisms (SNPs) were selected and genotyped in 1122 Korean patients with type 2 diabetes and 1138 non-diabetic controls. Using a logistic regression analysis, a significant association was found between SNP rs11243444 in the RAPGEF1 gene and type 2 diabetes [OR=0.490 (95% CI 0.296-0.813), p=0.006] in the recessive model, indicating a protective effect of the GG genotype against development of the disease. The present study examines genetic polymorphisms in the RAPGEF1 gene, and reports an association between one polymorphism and type 2 diabetes in the Korean population.
Affiliation(s)
- Kyung-Won Hong
- Department of Biomedical Engineering, School of Medicine, Kyung Hee University, 1 Hoeki-dong, Dongdaemun-gu, Seoul 130-702, Republic of Korea
|
10
|
Xing J, Witherspoon DJ, Watkins WS, Zhang Y, Tolpinrud W, Jorde LB. HapMap tagSNP transferability in multiple populations: general guidelines. Genomics 2008; 92:41-51. [PMID: 18482828 DOI: 10.1016/j.ygeno.2008.03.011] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Received: 02/20/2008] [Revised: 03/26/2008] [Accepted: 03/28/2008] [Indexed: 11/30/2022]
Abstract
Linkage disequilibrium (LD) has received much attention recently because of its value in localizing disease-causing genes. Due to the extensive LD between neighboring loci in the human genome, it is believed that a subset of the single nucleotide polymorphisms in a region (tagSNPs) can be selected to capture most of the remaining SNP variants. In this study, we examined LD patterns and HapMap tagSNP transferability in more than 300 individuals. A South Indian sample and an African Mbuti Pygmy population sample were included to evaluate the performance of HapMap tagSNPs in geographically distinct and genetically isolated populations. Our results show that HapMap tagSNPs selected with r² ≥ 0.8 can capture more than 85% of the SNPs in populations that are from the same continental group. Combined tagSNPs from HapMap CEU and CHB+JPT serve as the best reference for the Indian sample. The HapMap YRI are a sufficient reference for tagSNP selection in the Pygmy sample. In addition to our findings, we reviewed over 25 recent studies of tagSNP transferability and propose a general guideline for selecting tagSNPs from HapMap populations.
Affiliation(s)
- Jinchuan Xing
- Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
|
11
|
Constantine CC, Gurrin LC, McLaren CE, Bahlo M, Anderson GJ, Vulpe CD, Forrest SM, Allen KJ, Gertig DM. SNP selection for genes of iron metabolism in a study of genetic modifiers of hemochromatosis. BMC Med Genet 2008; 9:18. [PMID: 18366708 PMCID: PMC2289803 DOI: 10.1186/1471-2350-9-18] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 08/10/2007] [Accepted: 03/20/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND We report our experience of selecting tag SNPs in 35 genes involved in iron metabolism in a cohort study seeking to discover genetic modifiers of hereditary hemochromatosis. METHODS We combined our own and publicly available resequencing data with HapMap to maximise our coverage to select 384 SNPs in candidate genes suitable for typing on the Illumina platform. RESULTS Validation/design scores above 0.6 were not strongly correlated with SNP performance as estimated by Gentrain score. We contrasted results from two tag SNP selection algorithms, LDselect and Tagger. Varying r² from 0.5 to 1.0 produced a near linear correlation with the number of tag SNPs required. We examined the pattern of linkage disequilibrium of three levels of resequencing coverage for the transferrin gene and found HapMap phase 1 tag SNPs capture 45% of the SNPs with MAF ≥ 3% found in SeattleSNPs where there is nearly complete resequencing. Resequencing can reveal adjacent SNPs (within 60 bp) which may affect assay performance. We report the number of SNPs present within the region of six of our larger candidate genes, for different versions of stock genotyping assays. CONCLUSION A candidate gene approach should seek to maximise coverage, and this can be improved by adding to HapMap data any available sequencing data. Tag SNP software must be fast and flexible to data changes, since tag SNP selection involves iteration as investigators seek to satisfy the competing demands of coverage within and between populations, and typability on the technology platform chosen.
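The dependence of the number of tag SNPs on the r² threshold can be illustrated with a simple greedy selection over a pairwise r² matrix. This sketch is written in the general spirit of r²-bin tagging and is not the actual algorithm of LDselect or Tagger; the function name `greedy_tag_snps` and the small r² matrix are assumptions made for illustration.

```python
def greedy_tag_snps(r2, threshold=0.8):
    """Greedy tag SNP selection on a symmetric pairwise r^2 matrix.

    r2 is a square list of lists with 1.0 on the diagonal. At each step the
    untagged SNP that captures the most untagged SNPs (r^2 >= threshold) is
    chosen as a tag; ties are broken by the lowest index.
    """
    untagged = set(range(len(r2)))
    tags = []
    while untagged:
        best = max(
            sorted(untagged),
            key=lambda i: sum(1 for j in untagged if r2[i][j] >= threshold),
        )
        tags.append(best)
        # Everything the new tag captures (including itself) is now covered.
        untagged -= {j for j in untagged if r2[best][j] >= threshold}
    return tags


# Made-up r^2 values for four SNPs; SNP 3 is nearly independent of the rest.
EXAMPLE_R2 = [
    [1.00, 0.90, 0.60, 0.10],
    [0.90, 1.00, 0.85, 0.10],
    [0.60, 0.85, 1.00, 0.10],
    [0.10, 0.10, 0.10, 1.00],
]

tags_by_threshold = {t: greedy_tag_snps(EXAMPLE_R2, t) for t in (0.5, 0.8, 1.0)}
```

On the made-up matrix, two tags suffice at thresholds of 0.5 and 0.8, while a threshold of 1.0 requires all four SNPs, which mirrors the trade-off between stringency and genotyping effort described in the abstract.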
Affiliation(s)
- Clare C Constantine
- The Centre for Molecular, Environmental, Genetic and Analytic (MEGA) Epidemiology, School of Population Health, The University of Melbourne, Melbourne, Australia.
|
12
|
Windelinckx A, Vlietinck R, Aerssens J, Beunen G, Thomis MAI. Selection of genes and single nucleotide polymorphisms for fine mapping starting from a broad linkage region. Twin Res Hum Genet 2008; 10:871-85. [PMID: 18179400 DOI: 10.1375/twin.10.6.871] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/05/2022]
Abstract
Fine mapping of linkage peaks is one of the great challenges facing researchers who try to identify genes and genetic variants responsible for the variation in a certain trait or complex disease. Once the trait is linked to a certain chromosomal region, most studies use a candidate gene approach followed by a selection of polymorphisms within these genes, either based on their possibility to be functional, or based on the linkage disequilibrium between adjacent markers. For both candidate gene selection and SNP selection, several approaches have been described, and different software tools are available. However, mastering all these information sources and choosing between the different approaches can be difficult and time-consuming. Therefore, this article lists several of these in silico procedures, and the authors describe an empirical two-step fine mapping approach, in which candidate genes are prioritized using a bioinformatics approach (ENDEAVOUR), and the top genes are chosen for further SNP selection with a linkage disequilibrium based method (Tagger). The authors present the different actions that were applied within this approach on two previously identified linkage regions for muscle strength. This resulted in the selection of 331 polymorphisms located in 112 different candidate genes out of an initial set of 23,300 SNPs.
Affiliation(s)
- An Windelinckx
- Research Center for Exercise and Health, Department of Biomedical Kinesiology, Faculty of Kinesiology and Rehabilitation Sciences, Katholieke Universiteit Leuven, Leuven, Belgium
|
13
|
Christoforou A, Le Hellard S, Thomson PA, Morris SW, Tenesa A, Pickard BS, Wray NR, Muir WJ, Blackwood DH, Porteous DJ, Evans KL. Association analysis of the chromosome 4p15-p16 candidate region for bipolar disorder and schizophrenia. Mol Psychiatry 2007; 12:1011-25. [PMID: 17457313 DOI: 10.1038/sj.mp.4002003] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Indexed: 11/09/2022]
Abstract
Several independent linkage studies have identified chromosome 4p15-p16 as a putative region of susceptibility for bipolar disorder (BP), schizophrenia (SCZ) and related phenotypes. Previously, we identified two subregions (B and D) of the 4p15-p16 region that are shared by three of four 4p-linked families examined. Here, we describe a large-scale association analysis of regions B and D (3.8 and 4.5 Mb, respectively). We selected 408 haplotype-tagging single nucleotide polymorphisms (SNPs) on a block-by-block basis from the International HapMap project and tested them in 368 BP, 386 SCZ and 458 control individuals. Nominal significance thresholds were determined using principal component analysis as implemented in the program SNPSpD. In region B, overlapping SNPs and haplotypes met the region-wide threshold (P ≤ 0.0005) at the global and individual haplotype test level and clustered in two regions. In region D, no individual SNPs were nominally significant, but multiple global and individual haplotypes were associated with BP and/or SCZ (region-wide threshold, P ≤ 0.0003). These overlapping haplotypes fell into two regions. Within each of these four clusters, at least one globally significant haplotype withstood permutation testing (P(gp) ≤ 0.05). Five predicted genes were found within these associated regions, while Known/RefSeq genes, including KIAA0746 and PPARGC1A, mapped nearby. There were also nine other clusters within regions B and D with nominally significant haplotypes, but only at the individual haplotype level. KIAA0746, PPARGC1A, GPR125, CCKAR and DKFZp761B107 overlapped with these regions. This study has identified significant associations between BP and SCZ within the chromosome 4p linkage region, resulting in candidate regions worthy of further investigation.
Affiliation(s)
- A Christoforou
- Medical Genetics Section, Molecular Medicine Centre, Western General Hospital, University of Edinburgh, Edinburgh, UK.
|
14
|
Schafmayer C, Völzke H, Buch S, Egberts J, Spille A, von Eberstein H, Franke A, Seeger M, Hinz S, Elsharawy A, Rosskopf D, Brosch M, Krawczak M, Foelsch UR, Schafmayer A, Lammert F, Schreiber S, Faendrich F, Hampe J, Tepel J. Investigation of the Lith6 candidate genes APOBEC1 and PPARG in human gallstone disease. Liver Int 2007; 27:910-9. [PMID: 17696929 DOI: 10.1111/j.1478-3231.2007.01536.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 01/06/2023]
Abstract
BACKGROUND Genetic susceptibility contributes to the aetiology of gallbladder diseases as shown by multiple epidemiological studies. A major gallstone susceptibility locus (Lith6) was identified in 2003 by quantitative trait locus mapping in mice. Two attractive positional and functional candidate genes in apolipoprotein B mRNA-editing protein (APOBEC1) and peroxisome proliferator-activated receptor gamma (PPARG) are located in this interval. AIMS To investigate APOBEC1 and PPARG as candidate genes for common symptomatic gallstone disease in humans. PATIENTS AND METHODS Eight hundred and ten patients who underwent cholecystectomy for symptomatic gallstone disease (median age of onset 50) were compared with 718 sex-matched control individuals. An independent additional sample included 368 gallstone patients and 368 controls. Control individuals were sonographically free of gallstones. Haplotype tagging and all known coding single nucleotide polymorphisms were genotyped for PPARG (N=32) and APOBEC1 (N=11). RESULTS The investigated high-risk patient sample provides a power of greater than 80% for the detection of odds ratios down to 1.45. No evidence of association of the two genes in the single-point tagging markers, coding variants and in the sliding window haplotype analysis was detected (all nominal single point P-values >0.04). A logistic regression analysis including age, sex and BMI as covariates was also negative (nominal P-values ≥ 0.08). CONCLUSIONS In the investigated German samples, no evidence of association of APOBEC1 and PPARG with gallstone susceptibility was detected. Systematic fine mapping of the complete Lith6 region is required to identify the causative genetic variants for gallstone in mice and humans.
Affiliation(s)
- Clemens Schafmayer
- Department of General Surgery and Thoracic Surgery, Christian-Albrechts-University Kiel, Kiel, Germany
|
15
|
Wollstein A, Herrmann A, Wittig M, Nothnagel M, Franke A, Nürnberg P, Schreiber S, Krawczak M, Hampe J. Efficacy assessment of SNP sets for genome-wide disease association studies. Nucleic Acids Res 2007; 35:e113. [PMID: 17726055 PMCID: PMC2034459 DOI: 10.1093/nar/gkm621] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 12/21/2022]
Abstract
The power of a genome-wide disease association study depends critically upon the properties of the marker set used, particularly the number and physical spacing of markers, and the level of inter-marker association due to linkage disequilibrium. Extending our previously devised theoretical framework for the entropy-based selection of genetic markers, we have developed a local measure of the efficacy of a marker set, relative to including a maximally polymorphic single nucleotide polymorphism (SNP) at the map position of interest. Using this quantitative criterion, we evaluated five currently available SNP sets, namely Affymetrix 100K and 500K, and Illumina 100K, 300K and 550K in the CEU, YRI and JPT + CHB HapMap populations. At 50% relative efficacy, the commercial marker sets cover between 19 and 68% of the human genome, depending upon the population under study. An optimal technology-independent 500K marker set constructed from HapMap for Caucasians, in contrast, would achieve 73% coverage at the same relative efficacy.
Affiliation(s)
- Andreas Wollstein
- Cologne Center for Genomics, Cologne, Institute of Clinical Molecular Biology, Christian-Albrechts University, Ist Department of Medicine and Institute of Medical Informatics and Statistics, Christian-Albrechts University, University Hospital Schleswig-Holstein Campus Kiel, Kiel, Germany
| | - Alexander Herrmann
- Cologne Center for Genomics, Cologne, Institute of Clinical Molecular Biology, Christian-Albrechts University, Ist Department of Medicine and Institute of Medical Informatics and Statistics, Christian-Albrechts University, University Hospital Schleswig-Holstein Campus Kiel, Kiel, Germany
| | - Michael Wittig
- Cologne Center for Genomics, Cologne, Institute of Clinical Molecular Biology, Christian-Albrechts University, Ist Department of Medicine and Institute of Medical Informatics and Statistics, Christian-Albrechts University, University Hospital Schleswig-Holstein Campus Kiel, Kiel, Germany
| | - Michael Nothnagel
- Cologne Center for Genomics, Cologne, Institute of Clinical Molecular Biology, Christian-Albrechts University, Ist Department of Medicine and Institute of Medical Informatics and Statistics, Christian-Albrechts University, University Hospital Schleswig-Holstein Campus Kiel, Kiel, Germany
| | - Andre Franke
- Cologne Center for Genomics, Cologne, Institute of Clinical Molecular Biology, Christian-Albrechts University, Ist Department of Medicine and Institute of Medical Informatics and Statistics, Christian-Albrechts University, University Hospital Schleswig-Holstein Campus Kiel, Kiel, Germany
| | - Peter Nürnberg
- Cologne Center for Genomics, Cologne, Institute of Clinical Molecular Biology, Christian-Albrechts University, Ist Department of Medicine and Institute of Medical Informatics and Statistics, Christian-Albrechts University, University Hospital Schleswig-Holstein Campus Kiel, Kiel, Germany
| | - Stefan Schreiber
- Cologne Center for Genomics, Cologne, Institute of Clinical Molecular Biology, Christian-Albrechts University, Ist Department of Medicine and Institute of Medical Informatics and Statistics, Christian-Albrechts University, University Hospital Schleswig-Holstein Campus Kiel, Kiel, Germany
| | - Michael Krawczak
- Cologne Center for Genomics, Cologne, Institute of Clinical Molecular Biology, Christian-Albrechts University, Ist Department of Medicine and Institute of Medical Informatics and Statistics, Christian-Albrechts University, University Hospital Schleswig-Holstein Campus Kiel, Kiel, Germany
- Jochen Hampe
- Cologne Center for Genomics, Cologne, Institute of Clinical Molecular Biology, Christian-Albrechts University, Ist Department of Medicine and Institute of Medical Informatics and Statistics, Christian-Albrechts University, University Hospital Schleswig-Holstein Campus Kiel, Kiel, Germany
- *To whom correspondence should be addressed. +49 431 597 1246; +49 431 597 1842
|
16
|
Schafmayer C, Buch S, Egberts JH, Franke A, Brosch M, El Sharawy A, Conring M, Koschnick M, Schwiedernoch S, Katalinic A, Kremer B, Fölsch UR, Krawczak M, Fändrich F, Schreiber S, Tepel J, Hampe J. Genetic investigation of DNA-repair pathway genes PMS2, MLH1, MSH2, MSH6, MUTYH, OGG1 and MTH1 in sporadic colon cancer. Int J Cancer 2007; 121:555-8. [PMID: 17417778 DOI: 10.1002/ijc.22735] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.4] [Indexed: 01/03/2023]
Abstract
Mutations in DNA repair genes have previously been identified as causative factors for hereditary nonpolyposis colon cancer (HNPCC). Recent evidence also supports an association between DNA sequence variation in these genes and sporadic colorectal carcinoma (CRC). Genetic investigation of DNA repair genes PMS2, MLH1, MSH2, MSH6, MUTYH, OGG1 and MTH1, as possible susceptibility factors for sporadic CRC, was done using both a haplotype tagging and a candidate (i.e. coding) single nucleotide polymorphism (SNP) approach. Some 1,068 patients with operated CRC (median age at diagnosis: 59 years) were compared to 738 sex-matched control individuals (median age: 67 years). Haplotype tagging SNPs, previously reported risk variants and all known coding SNPs with a minor allele frequency >0.005 were genotyped in PMS2 (N = 10), MLH1 (N = 11), MSH2 (N = 18), MSH6 (N = 15), MUTYH (N = 7), OGG1 (N = 11) and MTH1 (N = 3). No evidence for an association between CRC and any of the 7 genes was detected, neither with the tagging or coding SNPs nor in a sliding window haplotype analysis (all nominal p-values >0.05). The previously reported risk variants D132H in MLH1 and R154H in OGG1 were not even observed in the German population. Genetic CRC risk factors so far identified in DNA repair genes seem to be rare and population-specific. Their association with the disease could not be replicated in German CRC samples. It remains to be elucidated by more systematic, large-scale experiments whether common variants in the same genes, but present across populations, represent risk factors for sporadic CRC.
Affiliation(s)
- Clemens Schafmayer
- Department of General and Thoracic Surgery, Christian-Albrechts-University, Kiel, Germany
|
17
|
Montpetit A, Chagnon F. [The Haplotype Map of the human genome: a revolution in the genetics of complex diseases]. Med Sci (Paris) 2007; 22:1061-7. [PMID: 17156727 DOI: 10.1051/medsci/200622121061] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]
Abstract
More than 99.9 % of the sequence is identical when comparing the DNA from two individuals. The remaining 0.1 % is responsible, along with other factors such as the environment, for the risk level of developing complex diseases (such as asthma, diabetes or cancer) or for the different pharmacological response to drugs. Despite the incredible advances in genomics in the past few years, identifying the variants involved remains difficult because of the prodigious amount of information to process. The recent completion of the Haplotype Map of the human genome has raised great hopes in the field as it is expected to help reduce the complexity of association studies and thus accelerate the discovery of genes associated with complex diseases. This review details the rationale behind the HapMap project, gives a summary of the results and also describes potential applications of the Haplotype Map.
Affiliation(s)
- Alexandre Montpetit
- Centre d'Innovation de Génome Québec et de l'Université McGill, 740 avenue Dr Penfield, Montréal, Québec, H3A 1A4 Canada.
|
18
|
Amundsen SS, Adamovic S, Hellqvist A, Nilsson S, Gudjónsdóttir AH, Ascher H, Ek J, Larsson K, Wahlström J, Lie BA, Sollid LM, Naluai AT. A comprehensive screen for SNP associations on chromosome region 5q31–33 in Swedish/Norwegian celiac disease families. Eur J Hum Genet 2007; 15:980-7. [PMID: 17551518 DOI: 10.1038/sj.ejhg.5201870] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 12/17/2022]
Abstract
Celiac disease (CD) is a gluten-induced enteropathy, which results from the interplay between environmental and genetic factors. There is a strong human leukocyte antigen (HLA) association with the disease, and HLA-DQ alleles represent a major genetic risk factor. In addition to HLA-DQ, non-HLA genes appear to be crucial for CD development. Chromosomal region 5q31-33 has demonstrated linkage with CD in several genome-wide studies, including in our Swedish/Norwegian cohort. In a European meta-analysis 5q31-33 was the only region that reached a genome-wide level of significance except for the HLA region. To identify the genetic variant(s) responsible for this linkage signal, we performed a comprehensive single nucleotide polymorphism (SNP) association screen in 97 Swedish/Norwegian multiplex families who demonstrate linkage to the region. We selected tag SNPs from a 16 Mb region representing the 95% confidence interval of the linkage peak. A total of 1,404 SNPs were used for the association analysis. We identified several regions with SNPs demonstrating moderate single- or multipoint associations. However, the isolated association signals appeared insufficient to account for the linkage signal seen in our cohort. Collective effects of multiple risk genes within the region, incomplete genetic coverage or effects related to copy number variation are possible explanations for our findings.
Affiliation(s)
- Silja Svanstrøm Amundsen
- Institute of Immunology, University of Oslo, Rikshospitalet-Radiumhospitalet Medical Centre, Oslo, Norway.
|
19
|
Barnes KC, Grant AV, Hansel NN, Gao P, Dunston GM. African Americans with asthma: genetic insights. Ann Am Thorac Soc 2007; 4:58-68. [PMID: 17202293 PMCID: PMC2647616 DOI: 10.1513/pats.200607-146jg] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.8] [Indexed: 12/20/2022]
Abstract
It has been well established that genetic factors strongly affect susceptibility to asthma and its associated traits. It is less clear to what extent genetic variation contributes to the ethnic disparities observed for asthma morbidity and mortality. Individuals of African descent with asthma have more severe asthma, higher IgE levels, a higher degree of steroid dependency, and more severe clinical symptoms than individuals of European descent with asthma but relatively few studies have focused on this particularly vulnerable ethnic group. Similar underrepresentation exists for other minorities, including Hispanics. In this review, a summary of linkage and association studies in populations of African descent is presented, and the role of linkage disequilibrium in the dissection of a complex trait such as asthma is discussed. Consideration for the impact of population stratification in recently admixed populations (i.e., European, African) is essential in genetic association studies focusing on African ancestry groups. With the most recent update on the International HapMap Project, efficient selection of haplotype tagging single nucleotide polymorphisms (htSNPs) for African Americans has accelerated and efficiency of htSNPs chosen from one population to represent other continental groups (e.g., African) has been demonstrated. Cutting-edge approaches, such as genomewide association studies, admixture mapping, and phylogenetic analyses, offer new opportunities for dissecting the genetic basis for asthma in populations of African descent.
Affiliation(s)
- Kathleen C Barnes
- Division of Allergy and Clinical Immunology, Department of Medicine, The Johns Hopkins University, Baltimore, Maryland, USA.
|
20
|
Carlton VEH, Ireland JS, Useche F, Faham M. Functional single nucleotide polymorphism-based association studies. Hum Genomics 2006; 2:391-402. [PMID: 16848977 PMCID: PMC3525158 DOI: 10.1186/1479-7364-2-6-391] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022]
Abstract
Association studies hold great promise for the elucidation of the genetic basis of diseases. Studies based on functional single nucleotide polymorphisms (SNPs) or on linkage disequilibrium (LD) represent two main types of designs. LD-based association studies can be comprehensive for common causative variants, but they perform poorly for rare alleles. Conversely, functional SNP-based studies are efficient because they focus on the SNPs with the highest a priori chance of being associated. Our poor ability to predict the functional effect of SNPs, however, hampers attempts to make these studies comprehensive. Recent progress in comparative genomics, and evidence that functional elements tend to lie in conserved regions, promises to change the landscape, permitting functional SNP association studies to be carried out that comprehensively assess common and rare alleles. SNP genotyping technologies are already sufficient for such studies, but studies will require continued genomic sequencing of multiple species, research on the functional role of conserved sequences and additional SNP discovery and validation efforts (including targeted SNP discovery to identify the rare alleles in functional regions). With these resources, we expect that comprehensive functional SNP association studies will soon be possible.
Affiliation(s)
- Victoria EH Carlton
- ParAllele BioScience (Now Affymetrix, Inc), 7300 Shoreline Boulevard, South San Francisco, CA 94080, USA
- James S Ireland
- ParAllele BioScience (Now Affymetrix, Inc), 7300 Shoreline Boulevard, South San Francisco, CA 94080, USA
- Francisco Useche
- ParAllele BioScience (Now Affymetrix, Inc), 7300 Shoreline Boulevard, South San Francisco, CA 94080, USA
- Malek Faham
- ParAllele BioScience (Now Affymetrix, Inc), 7300 Shoreline Boulevard, South San Francisco, CA 94080, USA
|
21
|
Ding K, Kullo IJ. Methods for the selection of tagging SNPs: a comparison of tagging efficiency and performance. Eur J Hum Genet 2006; 15:228-36. [PMID: 17164795 DOI: 10.1038/sj.ejhg.5201755] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/08/2022]
Abstract
There is great interest in the use of tagging single nucleotide polymorphisms (tSNPs) to facilitate association studies of complex diseases. This is based on the premise that a minimum set of tSNPs may be sufficient to capture most of the variation in certain regions of the human genome. Several methods have been described to select tSNPs, based on either haplotype-block structure or independent of the underlying block structure. In this paper, we compare eight methods for choosing tSNPs in 10 representative resequenced candidate genes (a total of 194.2 kb) with different levels of linkage disequilibrium (LD) in a sample of European-Americans. We compared tagging efficiency (TE) and prediction accuracy of tSNPs identified by these methods, as a function of several factors, including LD level, minor allele frequency, and tagging criteria. We also assessed tagging consistency between each method. We found that tSNPs selected based on the methods Haplotype Diversity and Haplotype r² provided the highest TE, whereas the prediction accuracy was comparable among different methods. Tagging consistency between different methods of tSNPs selection was poor. This work demonstrates that when tSNPs-based association studies are undertaken, the choice of method for selecting tSNPs requires careful consideration.
Affiliation(s)
- Keyue Ding
- Division of Cardiovascular Diseases, Mayo Clinic and Foundation, Rochester, MN 55905, USA
|
22
|
Paschou P, Mahoney MW, Javed A, Kidd JR, Pakstis AJ, Gu S, Kidd KK, Drineas P. Intra- and interpopulation genotype reconstruction from tagging SNPs. Genome Res 2006; 17:96-107. [PMID: 17151345 PMCID: PMC1716273 DOI: 10.1101/gr.5741407] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/24/2022]
Abstract
The optimal method to be used for tSNP selection, the applicability of a reference LD map to unassayed populations, and the scalability of these methods to genome-wide analysis, all remain subjects of debate. We propose novel, scalable matrix algorithms that address these issues and we evaluate them on genotypic data from 38 populations and four genomic regions (248 SNPs typed for approximately 2000 individuals). We also evaluate these algorithms on a second data set consisting of genotypes available from the HapMap database (1336 SNPs for four populations) over the same genomic regions. Furthermore, we test these methods in the setting of a real association study using a publicly available family data set. The algorithms we use for tSNP selection and unassayed SNP reconstruction do not require haplotype inference and they are, in principle, scalable even to genome-wide analysis. Moreover, they are greedy variants of recently developed matrix algorithms with provable performance guarantees. Using a small set of carefully selected tSNPs, we achieve very good reconstruction accuracy of "untyped" genotypes for most of the populations studied. Additionally, we demonstrate in a quantitative manner that the chosen tSNPs exhibit substantial transferability, both within and across different geographic regions. Finally, we show that reconstruction can be applied to retrieve significant SNP associations with disease, with important genotyping savings.
Affiliation(s)
- Peristera Paschou
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06511, USA.
|
23
|
de Bakker PIW, Burtt NP, Graham RR, Guiducci C, Yelensky R, Drake JA, Bersaglieri T, Penney KL, Butler J, Young S, Onofrio RC, Lyon HN, Stram DO, Haiman CA, Freedman ML, Zhu X, Cooper R, Groop L, Kolonel LN, Henderson BE, Daly MJ, Hirschhorn JN, Altshuler D. Transferability of tag SNPs in genetic association studies in multiple populations. Nat Genet 2006; 38:1298-303. [PMID: 17057720 DOI: 10.1038/ng1899] [Citation(s) in RCA: 201] [Impact Index Per Article: 11.2] [Received: 03/23/2006] [Accepted: 09/12/2006] [Indexed: 11/08/2022]
Abstract
A general question for linkage disequilibrium-based association studies is how power to detect an association is compromised when tag SNPs are chosen from data in one population sample and then deployed in another sample. Specifically, it is important to know how well tags picked from the HapMap DNA samples capture the variation in other samples. To address this, we collected dense data uniformly across the four HapMap population samples and eleven other population samples. We picked tag SNPs using genotype data we collected in the HapMap samples and then evaluated the effective coverage of these tags in comparison to the entire set of common variants observed in the other samples. We simulated case-control association studies in the non-HapMap samples under a disease model of modest risk, and we observed little loss in power. These results demonstrate that the HapMap DNA samples can be used to select tags for genome-wide association studies in many samples around the world.
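As a rough illustration of the "effective coverage" evaluation described in this abstract, the sketch below counts the fraction of variants in a target sample that are captured by at least one tag SNP. It is not the authors' procedure: the function names `r_squared` and `tag_coverage`, the use of genotype correlation as the r² estimate, and the 0.8 threshold are all assumptions made for the example.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors coded 0/1/2.

    Assumption: genotype correlation stands in for haplotype-based r^2.
    Monomorphic SNPs (zero variance) are given r^2 = 0.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def tag_coverage(snps, tag_idx, threshold=0.8):
    """Fraction of SNPs whose best r^2 with any tag reaches the threshold.

    snps    : list of per-SNP genotype lists from the target sample
    tag_idx : indices of tags (chosen elsewhere, e.g. in a reference sample)
    """
    captured = sum(
        1 for s in snps
        if max(r_squared(s, snps[t]) for t in tag_idx) >= threshold
    )
    return captured / len(snps)


# Toy data (hypothetical): SNP 1 duplicates SNP 0, SNP 2 is SNP 0 with the
# alleles swapped, and SNP 3 is uncorrelated with SNP 0.
toy = [
    [0, 1, 2, 1, 0, 2],
    [0, 1, 2, 1, 0, 2],
    [2, 1, 0, 1, 2, 0],
    [0, 0, 1, 2, 1, 0],
]
print(tag_coverage(toy, tag_idx=[0]))
```

With SNP 0 as the only tag, three of the four toy SNPs are captured, so the call prints 0.75. Running the same function on genotypes from a second sample with the same tag indices gives a simple measure of how well the tags transfer.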
Affiliation(s)
- Paul I W de Bakker
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Seven Cambridge Center, Cambridge, Massachusetts, 02142, USA
|
24
|
Schafmayer C, Tepel J, Franke A, Buch S, Lieb S, Seeger M, Lammert F, Kremer B, Fölsch UR, Fändrich F, Schreiber S, Hampe J. Investigation of the Lith1 candidate genes ABCB11 and LXRA in human gallstone disease. Hepatology 2006; 44:650-7. [PMID: 16941683 DOI: 10.1002/hep.21289] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.4] [Indexed: 12/14/2022]
Abstract
Genetic susceptibility in the causation of gallbladder diseases was recognized as early as 1937. A major gallstone susceptibility locus (Lith1) was identified in 1995 by quantitative trait locus mapping in mice. Two attractive positional and functional candidate genes in LXRA and ABCB11 are located in this interval. ABCB11 is associated with progressive familial cholestasis. This study was undertaken to investigate LXRA and ABCB11 as candidate genes for gallstone disease in humans. Eight hundred and ten patients who underwent cholecystectomy for symptomatic gallstone disease (median age of onset, 50 years) were compared with 718 sex-matched control individuals. Control individuals were sonographically free of gallstones. Haplotype tagging and all known coding single nucleotide polymorphisms (SNPs) were genotyped for ABCB11 (n=29) and LXRA (n=10). The investigated high-risk patient sample provides a power of greater than 80% for the detection of odds ratios down to 1.55. No evidence of association of the two genes in the single point tagging markers, coding variants or in the sliding window haplotype analysis was detected (all nominal single-point P values ≥ .08). In conclusion, in the investigated German sample, no evidence of association of ABCB11 and LXRA to gallstone susceptibility was detected. The gallstone trait is not allelic to progressive familial cholestasis at the ABCB11 locus. Systematic fine mapping of the Lith1 region is required to identify the causative genetic variants for gallstone in mice and humans.
Affiliation(s)
- Clemens Schafmayer
- Department of General and Thoracic Surgery, Christian-Albrechts-University, Kiel, and Department of Internal Medicine I, University Hospital Bonn, Germany
|
25
|
Kuntsi J, Neale BM, Chen W, Faraone SV, Asherson P. The IMAGE project: methodological issues for the molecular genetic analysis of ADHD. Behav Brain Funct 2006; 2:27. [PMID: 16887023 PMCID: PMC1559631 DOI: 10.1186/1744-9081-2-27] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.2] [Received: 02/07/2006] [Accepted: 08/03/2006] [Indexed: 01/25/2023]
Abstract
The genetic mechanisms involved in attention deficit hyperactivity disorder (ADHD) are being studied with considerable success by several centres worldwide. These studies confirm prior hypotheses about the role of genetic variation within genes involved in the regulation of dopamine, norepinephrine and serotonin neurotransmission in susceptibility to ADHD. Despite the importance of these findings, uncertainties remain due to the very small effect sizes that are observed. We discuss possible reasons why the true strength of the associations may have been underestimated in research to date, considering the effects of linkage disequilibrium, allelic heterogeneity, population differences and gene by environment interactions. With the identification of genes associated with ADHD, the goal of ADHD genetics is now shifting from gene discovery towards gene functionality – the study of intermediate phenotypes ('endophenotypes'). We discuss methodological issues relating to quantitative genetic data from twin and family studies on candidate endophenotypes and how such data can inform attempts to link molecular genetic data to cognitive, affective and motivational processes in ADHD. The International Multi-centre ADHD Gene (IMAGE) project exemplifies current collaborative research efforts on the genetics of ADHD. This European multi-site project is well placed to take advantage of the resources that are emerging following the sequencing of the human genome and the development of international resources for whole genome association analysis. As a result of IMAGE and other molecular genetic investigations of ADHD, we envisage a rapid increase in the number of identified genetic variants and the promise of identifying novel gene systems that we are not currently investigating, opening further doors in the study of gene functionality.
Affiliation(s)
- Jonna Kuntsi
- MRC Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College London, De Crespigny Park, London SE5 8AF, UK
- Benjamin M Neale
- MRC Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College London, De Crespigny Park, London SE5 8AF, UK
- Wai Chen
- MRC Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College London, De Crespigny Park, London SE5 8AF, UK
- Stephen V Faraone
- SUNY Upstate Medical University, 750 East Adams St., Syracuse, NY 13210, USA
- Philip Asherson
- MRC Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College London, De Crespigny Park, London SE5 8AF, UK
|
26
|
Abstract
Similar to other classical science disciplines, immunology has been embracing novel technologies and approaches giving rise to specialised sub-disciplines such as immunogenetics and, more recently, immunogenomics, which, in many ways, is the genome-wide application of immunogenetic approaches. Here, recent progress in the understanding of the immune sub-genome will be reviewed, and the ways in which immunogenomic datasets consisting of genetic and epigenetic variation, linkage disequilibrium and recombination can be harnessed for disease association and evolutionary studies will be discussed. The discussion will focus on data available for the major histocompatibility complex and the leukocyte receptor complex, the two most polymorphic regions of the human immune sub-genome.
Affiliation(s)
- Marcos M Miretti
- Immunogenomics Laboratory, The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Stephan Beck
- Immunogenomics Laboratory, The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
|
27
|
van Belzen MJ, Heutink P. Genetic analysis of psychiatric disorders in humans. Genes Brain Behav 2006; 5 Suppl 2:25-33. [PMID: 16681798 DOI: 10.1111/j.1601-183x.2006.00223.x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 01/15/2023]
Abstract
Psychiatric disorders place a large burden not only on affected individuals and their families but also on societies and health services. Current treatment is only effective in a proportion of the patients, so considerable effort has been put into the development of new medications. The susceptibility to all major psychiatric disorders is, at least in part, genetic. Knowledge of the genes that underlie this susceptibility may lead to the identification of new drug targets and the development of more effective treatments. Therefore, numerous genetic studies in search for the genes involved in psychiatric disorders have been performed. Although results of both linkage and association studies have been inconsistent, several promising gene regions and candidate genes have been identified recently. In this article, we will review the strategies that proved to be successful in detecting genes for psychiatric disorders and we will provide some recommendations to increase the probability of detecting susceptibility genes in genetic studies of different designs.
Affiliation(s)
- M J van Belzen
- Department of Medical Genomics, Center for Neurogenomics and Cognitive Research, VU University Medical Center and VU University, Amsterdam, The Netherlands
|
28
|
Abstract
Recent advances in high throughput genotyping technologies will allow large-scale association studies to disentangle the genetic basis of human common diseases. Currently, a large-scale genotyping effort is being carried out by the HapMap project and the outcome of this project is expected to help researchers in their efforts to understand how genetic variation influences susceptibility to disease. However, there is some controversy on whether this huge public effort will be of value for those populations not studied in the HapMap project. Here, we present simulation results based on the empirical distribution of linkage disequilibrium (LD) on a large chromosomal region (10 Mb) on human chromosome 20 (1,2) for two European and two Asian populations. These results show that statistical power to detect associations does not depend on the population where SNP tagging was performed.
Affiliation(s)
- Albert Tenesa
- Colon Cancer Genetics Group, University of Edinburgh, School of Molecular and Clinical Medicine, Western General Hospital, Crewe Road, Edinburgh, UK.
|
29
|
Yoo YK, Ke X, Hong S, Jang HY, Park K, Kim S, Ahn T, Lee YD, Song O, Rho NY, Lee MS, Lee YS, Kim J, Kim YJ, Yang JM, Song K, Kimm K, Weir B, Cardon LR, Lee JE, Hwang JJ. Fine-scale map of encyclopedia of DNA elements regions in the Korean population. Genetics 2006; 174:491-7. [PMID: 16702437 PMCID: PMC1569806 DOI: 10.1534/genetics.105.052225] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022]
Abstract
The International HapMap Project aims to generate detailed human genome variation maps by densely genotyping single-nucleotide polymorphisms (SNPs) in CEPH, Chinese, Japanese, and Yoruba samples. This will undoubtedly become an important facility for genetic studies of diseases and complex traits in the four populations. To address how the genetic information contained in such variation maps is transferable to other populations, the Korean government, industries, and academics have launched the Korean HapMap project to genotype high-density Encyclopedia of DNA Elements (ENCODE) regions in 90 Korean individuals. Here we show that the LD pattern, block structure, haplotype diversity, and recombination rate are highly concordant between Korean and the two HapMap Asian samples, particularly Japanese. The availability of information from both Chinese and Japanese samples helps to predict more accurately the possible performance of HapMap markers in Korean disease-gene studies. Tagging SNPs selected from the two HapMap Asian maps, especially the Japanese map, were shown to be very effective for Korean samples. These results demonstrate that the HapMap variation maps are robust in related populations and will serve as an important resource for the studies of the Korean population in particular.
|
30
|
Howie BN, Carlson CS, Rieder MJ, Nickerson DA. Efficient selection of tagging single-nucleotide polymorphisms in multiple populations. Hum Genet 2006; 120:58-68. [PMID: 16680432 DOI: 10.1007/s00439-006-0182-5] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.6] [Received: 02/20/2006] [Accepted: 03/30/2006] [Indexed: 10/24/2022]
Abstract
Common genetic polymorphism may explain a portion of the heritable risk for common diseases, so considerable effort has been devoted to finding and typing common single-nucleotide polymorphisms (SNPs) in the human genome. Many SNPs show correlated genotypes, or linkage disequilibrium (LD), suggesting that only a subset of all SNPs (known as tagging SNPs, or tagSNPs) need to be genotyped for disease association studies. Based on the genetic differences that exist among human populations, most tagSNP sets are defined in a single population and applied only in populations that are closely related. To improve the efficiency of multi-population analyses, we have developed an algorithm called MultiPop-TagSelect that finds a near-minimal union of population-specific tagSNP sets across an arbitrary number of populations. We present this approach as an extension of LD-select, a tagSNP selection method that uses a greedy algorithm to group SNPs into bins based on their pairwise association patterns, although the MultiPop-TagSelect algorithm could be used with any SNP tagging approach that allows choices between nearly equivalent SNPs. We evaluate the algorithm by considering tagSNP selection in candidate-gene resequencing data and lower density whole-chromosome data. Our analysis reveals that an exhaustive search is often intractable, while the developed algorithm can quickly and reliably find near-optimal solutions even for difficult tagSNP selection problems. Using populations of African, Asian, and European ancestry, we also show that an optimal multi-population set of tagSNPs can be substantially smaller (up to 44%) than a typical set obtained through independent or sequential selection.
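The greedy binning idea that this abstract attributes to LD-select can be sketched in a few lines. This is a simplified illustration, not the published LD-select or MultiPop-TagSelect code: the function name `greedy_ld_bins`, the 0.8 threshold, and the tie-breaking rule are assumptions, and the sketch does not handle multiple populations or alternative tags within a bin.

```python
def greedy_ld_bins(r2, threshold=0.8):
    """Greedy binning of SNPs by pairwise r^2 (illustrative, hypothetical helper).

    r2 is a symmetric list-of-lists of pairwise r^2 values. Returns a list of
    (tag_index, member_indices) pairs that covers every SNP exactly once.
    """
    unbinned = list(range(len(r2)))
    bins = []
    while unbinned:
        # Candidate bin for SNP i: itself plus every unbinned SNP in strong LD with it.
        candidates = {
            i: [j for j in unbinned if j == i or r2[i][j] >= threshold]
            for i in unbinned
        }
        # Pick the SNP whose candidate bin is largest; ties go to the lowest index.
        tag = max(unbinned, key=lambda i: len(candidates[i]))
        bins.append((tag, candidates[tag]))
        unbinned = [j for j in unbinned if j not in candidates[tag]]
    return bins


# Toy r^2 matrix (hypothetical values): SNP 1 is in strong LD with SNPs 0 and 2,
# while SNP 3 is independent of the rest.
toy_r2 = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.85, 0.0],
    [0.1, 0.85, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
print(greedy_ld_bins(toy_r2))
```

On the toy matrix, SNP 1 tags SNPs 0, 1 and 2, and SNP 3 forms its own bin, so two tags cover four SNPs. A multi-population extension would have to choose among near-equivalent tags so that the union across populations stays small, which this sketch does not attempt.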
Affiliation(s)
- Bryan N Howie
- Department of Genome Sciences, University of Washington, Box 357730, Seattle, WA 98195, USA
|
31
|
Smith AV, Thomas DJ, Munro HM, Abecasis GR. Sequence features in regions of weak and strong linkage disequilibrium. Genome Res 2006; 15:1519-34. [PMID: 16251462 PMCID: PMC1310640 DOI: 10.1101/gr.4421405] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.2] [Indexed: 01/12/2023]
Abstract
We use genotype data generated by the International HapMap Project to dissect the relationship between sequence features and the degree of linkage disequilibrium in the genome. We show that variation in linkage disequilibrium is broadly similar across populations and examine sequence landscape in regions of strong and weak disequilibrium. Linkage disequilibrium is generally low within approximately 15 Mb of the telomeres of each chromosome and noticeably elevated in large, duplicated regions of the genome as well as within approximately 5 Mb of centromeres and other heterochromatic regions. At a broad scale (100-1000 kb resolution), our results show that regions of strong linkage disequilibrium are typically GC poor and have reduced polymorphism. In addition, these regions are enriched for LINE repeats, but have fewer SINE, DNA, and simple repeats than the rest of the genome. At a fine scale, we examine the sequence composition of "hotspots" for the rapid breakdown of linkage disequilibrium and show that they are enriched in SINEs, in simple repeats, and in sequences that are conserved between species. Regions of high and low linkage disequilibrium (the top and bottom quartiles of the genome) have a higher density of genes and coding bases than the rest of the genome. Closer examination of the data shows that whereas some types of genes (including genes involved in immune response and sensory perception) are typically located in regions of low linkage disequilibrium, other genes (including those involved in DNA and RNA metabolism, response to DNA damage, and the cell cycle) are preferentially located in regions of strong linkage disequilibrium. Our results provide a detailed analysis of the relationship between sequence features and linkage disequilibrium and suggest an evolutionary justification for the heterogeneity in linkage disequilibrium in the genome.
Affiliation(s)
- Albert V Smith
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
|
32
|
Lawrence R, Evans DM, Morris AP, Ke X, Hunt S, Paolucci M, Ragoussis J, Deloukas P, Bentley D, Cardon LR. Genetically indistinguishable SNPs and their influence on inferring the location of disease-associated variants. Genome Res 2006; 15:1503-10. [PMID: 16251460 PMCID: PMC1310638 DOI: 10.1101/gr.4217605] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Indexed: 01/25/2023]
Abstract
As part of a recent high-density linkage disequilibrium (LD) study of chromosome 20, we obtained genotypes for approximately 30,000 SNPs at a density of 1 SNP/2 kb on four different population samples (47 CEPH founders; 91 UK unrelateds [unrelated white individuals of western European ancestry]; 97 African Americans; 42 East Asians). We observed that approximately 50% of SNPs had at least one genetically indistinguishable partner; i.e., for every individual considered, their genotype at the first locus was identical to their genotype at the second locus, or in LD terms, the SNPs were in "perfect" LD (r2 = 1.0). These "genetically indistinguishable SNPs" (giSNPs) formed into clusters of varying size. The larger the cluster, the greater the tendency to be located within genes and to overlap with giSNP clusters in other population samples. As might be expected for this map density, many giSNPs were located close to one another, thus reflecting local regions of undetected recombination or haplotype blocks. However, approximately 1/3 of giSNP clusters had intermingled, non-indistinguishable SNPs with incomplete LD (D' and r2 <1), sometimes spanning hundreds of kilobases, comprising up to 70 indistinguishable markers and overlapping multiple haplotype blocks. These long-range, nonconsecutive giSNPs have implications for disease gene localization by allelic association as evidence for association at one locus will be indistinguishable from that at another locus, even though both loci may be situated far apart. We describe the distribution of giSNPs on this map of chromosome 20 and illustrate the potential impact they can have on association mapping.
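The definition of giSNPs above (identical genotypes in every individual, i.e. r2 = 1) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `gi_snp_clusters`, the 0/1/2 allele-count coding, and the absence of missing data are all assumptions made for illustration.

```python
from collections import defaultdict


def gi_snp_clusters(genotypes):
    """Group SNPs whose genotype vectors are identical in every individual.

    genotypes: dict mapping a SNP id to a tuple of allele counts (0, 1 or 2),
    one entry per individual, with no missing data (an assumption).
    Vectors that match after swapping the counted allele (g -> 2 - g) are
    also grouped together, because their r2 is still 1.
    """
    groups = defaultdict(list)
    for snp, vec in genotypes.items():
        vec = tuple(vec)
        if len(set(vec)) == 1:
            # No variation in this sample: r2 is undefined, so skip the SNP.
            continue
        flipped = tuple(2 - g for g in vec)
        key = min(vec, flipped)  # canonical form, independent of allele coding
        groups[key].append(snp)
    # Only clusters with at least two members contain giSNPs.
    return [sorted(members) for members in groups.values() if len(members) > 1]


# Hypothetical toy data: rs1, rs2 and rs3 (coded on the other allele) coincide.
example = {
    "rs1": (0, 1, 2, 1),
    "rs2": (0, 1, 2, 1),
    "rs3": (2, 1, 0, 1),
    "rs4": (0, 0, 1, 2),
    "rs5": (1, 1, 1, 1),
}
clusters = gi_snp_clusters(example)
```

Because grouping depends only on the genotype vectors and not on map position, such a cluster can contain SNPs that are far apart, which is the situation the abstract flags as a problem for localizing an association signal.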
Affiliation(s)
- Robert Lawrence
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
|
33
|
Crawford DC, Yi Q, Smith JD, Shephard C, Wong M, Witrak L, Livingston RJ, Rieder MJ, Nickerson DA. Allelic spectrum of the natural variation in CRP. Hum Genet 2006; 119:496-504. [PMID: 16550411 PMCID: PMC1449912 DOI: 10.1007/s00439-006-0160-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 11/10/2005] [Accepted: 01/29/2006] [Indexed: 12/03/2022]
Abstract
With the recent completion of the International HapMap Project, many tools are in hand for genetic association studies seeking to test the common variant/common disease hypothesis. In contrast, very few tools and resources are in place for genotype–phenotype studies hypothesizing that rare variation has a large impact on the phenotype of interest. To create these tools for rare variant/common disease studies, much interest is being generated towards investing in re-sequencing either large sample sizes of random chromosomes or smaller sample sizes of patients with extreme phenotypes. As a case study for rare variant discovery in random chromosomes, we have re-sequenced ~1,000 chromosomes representing diverse populations for the gene C-reactive protein (CRP). CRP is an important gene in the fields of cardiovascular and inflammation genetics, and its size (~2 kb) makes it particularly amenable to medical or deep re-sequencing. With these data, we explore several issues related to the present-day candidate gene association study, including the benefits of complete SNP discovery, the effects of tagSNP selection across diverse populations, and the completeness of dbSNP for CRP. Also, we show that while deep re-sequencing uncovers potentially medically relevant coding SNPs, these SNPs are fleetingly rare when genotyped in a population-based survey of 7,000 Americans (NHANES III). Collectively, these data suggest that several different types of re-sequencing and genotyping approaches may be required to fully understand the complete spectrum of alleles that impact human phenotypes.
Affiliation(s)
- Dana C Crawford
- Department of Genome Sciences, University of Washington, 1705 NE Pacific, Seattle, WA 98195-7730, USA.
|
34
|
Montpetit A, Nelis M, Laflamme P, Magi R, Ke X, Remm M, Cardon L, Hudson TJ, Metspalu A. An evaluation of the performance of tag SNPs derived from HapMap in a Caucasian population. PLoS Genet 2006; 2:e27. [PMID: 16532062 PMCID: PMC1391920 DOI: 10.1371/journal.pgen.0020027] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.2] [Received: 10/06/2005] [Accepted: 01/23/2006] [Indexed: 11/18/2022] Open
Abstract
The Haplotype Map (HapMap) project recently generated genotype data for more than 1 million single-nucleotide polymorphisms (SNPs) in four population samples. The main application of the data is in the selection of tag single-nucleotide polymorphisms (tSNPs) to use in association studies. The usefulness of this selection process needs to be verified in populations outside those used for the HapMap project. In addition, it is not known how well the data represent the general population, as only 90–120 chromosomes were used for each population and since the genotyped SNPs were selected so as to have high frequencies. In this study, we analyzed more than 1,000 individuals from Estonia. The population of this northern European country has been influenced by many different waves of migrations from Europe and Russia. We genotyped 1,536 randomly selected SNPs from two 500-kbp ENCODE regions on Chromosome 2. We observed that the tSNPs selected from the CEPH (Centre d'Etude du Polymorphisme Humain) from Utah (CEU) HapMap samples (derived from US residents with northern and western European ancestry) captured most of the variation in the Estonia sample. (Between 90% and 95% of the SNPs with a minor allele frequency of more than 5% have an r2 of at least 0.8 with one of the CEU tSNPs.) Using the reverse approach, tags selected from the Estonia sample could almost equally well describe the CEU sample. Finally, we observed that the sample size, the allelic frequency, and the SNP density in the dataset used to select the tags each have important effects on the tagging performance. Overall, our study supports the use of HapMap data in other Caucasian populations, but the SNP density and the bias towards high-frequency SNPs have to be taken into account when designing association studies. The recent completion of the Haplotype Map (HapMap) project of the human genome provides considerable information on the patterns of variation in the genome of four populations. 
One of the applications is a description of a set of tags that act as proxies for many other surrounding variants. This will greatly help researchers in their quest to find complex disease genes by reducing the number of genetic variants to test in association studies. To evaluate its usefulness, several aspects of the map, including its transferability to other populations, still needed to be verified experimentally. Using genomic regions where variants had been thoroughly documented in Caucasian samples from Estonia, the researchers found that the transferability of tags is extremely good. The researchers also found that variants with low frequency in the general population (i.e., less than 5%) could not be accurately captured with tags, and that the regional density of variants in the HapMap project had a major impact on the performance of the tags. This research indicates that the HapMap project will be useful, but that careful consideration of hypotheses and study design will be essential for the success of association studies.
Affiliation(s)
- Alexandre Montpetit
- McGill University and Genome Quebec Innovation Centre, Montreal, Quebec, Canada
- Mari Nelis
- Institute of Molecular and Cell Biology of the University of Tartu, Tartu, Estonia
- Estonian Biocentre, Tartu, Estonia
- Philippe Laflamme
- McGill University and Genome Quebec Innovation Centre, Montreal, Quebec, Canada
- Reedik Magi
- Institute of Molecular and Cell Biology of the University of Tartu, Tartu, Estonia
- Xiayi Ke
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
- Maido Remm
- Institute of Molecular and Cell Biology of the University of Tartu, Tartu, Estonia
- Lon Cardon
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
- Thomas J Hudson
- McGill University and Genome Quebec Innovation Centre, Montreal, Quebec, Canada
- Andres Metspalu
- Institute of Molecular and Cell Biology of the University of Tartu, Tartu, Estonia
- Estonian Biocentre, Tartu, Estonia
- The Estonian Genome Project Foundation, Tartu, Estonia
|
35
|
Tang NLS, Pharoah PDP, Ma SL, Easton DF. Evaluation of an algorithm of tagging SNPs selection by linkage disequilibrium. Clin Biochem 2006; 39:240-3. [PMID: 16427037 DOI: 10.1016/j.clinbiochem.2005.11.014] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 07/27/2005] [Revised: 10/30/2005] [Accepted: 11/25/2005] [Indexed: 11/25/2022]
Abstract
BACKGROUND Single nucleotide polymorphisms (SNPs) are the most abundant kind of genetic polymorphism in the human genome. They are important in both genetic research and genetic testing in a clinical setting, such as in the area of pharmacogenetics. In order to improve efficiency, tagging SNPs (tagSNPs) are selected in genes of interest to represent other correlated SNPs in linkage disequilibrium (LD) with the tagSNPs. Various algorithms have been proposed to identify a subset of single nucleotide polymorphisms as tagSNPs. Most tagSNP selection algorithms are haplotype-based, in which the spatial relationship between SNPs is considered. Recently, a more efficient cluster-based algorithm has been proposed, which clusters SNPs solely by an LD parameter, such as r². Here, we evaluated the sample distribution of r² and its effect on cluster-based tagSNP selection. DESIGN AND METHODS The genotype data of 198 individuals within a 500-kb region on 5q31 were used to evaluate the sample distribution of r² and its effect on cluster-based tagSNP selection. RESULTS It was found that the degree of variation of LD depends on the LD structure of genes. CONCLUSION As a cluster-based tagSNP selection algorithm does not take into account the spatial position of SNPs, a more stringent r² threshold is required to achieve more reliable tagSNP selection.
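A cluster-based selection of the kind described above can be sketched as a greedy procedure that ignores SNP position and uses only pairwise r². This is a generic illustration, not the specific algorithm evaluated in the paper. The names `r_squared` and `greedy_tag_snps`, the 0/1/2 genotype coding, and the use of the squared genotype correlation as a stand-in for haplotype-based r² are assumptions.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 counts)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return sxy * sxy / (sxx * syy)


def greedy_tag_snps(genotypes, threshold=0.8):
    """Pick tags greedily: each round, take the untagged SNP that captures
    the most still-untagged SNPs at r2 >= threshold (position is ignored)."""
    snps = sorted(genotypes)
    captures = {
        s: {t for t in snps
            if t == s or r_squared(genotypes[s], genotypes[t]) >= threshold}
        for s in snps
    }
    untagged = set(snps)
    tags = {}
    while untagged:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(untagged), key=lambda s: len(captures[s] & untagged))
        tags[best] = sorted(captures[best] & untagged)
        untagged -= captures[best]
    return tags


# Hypothetical toy data for six individuals.
toy = {
    "a": (0, 1, 2, 0, 1, 2),
    "b": (0, 1, 2, 0, 1, 2),
    "c": (0, 1, 2, 0, 1, 1),
    "d": (2, 0, 1, 0, 2, 0),
}
strict = greedy_tag_snps(toy, threshold=0.8)
loose = greedy_tag_snps(toy, threshold=0.7)
```

In this toy data, SNP `c` has r² of about 0.79 with `a`, so it needs its own tag at a threshold of 0.8 but is absorbed into the cluster of `a` at 0.7. With small samples the estimate of r² is itself noisy, which is one way to read the abstract's call for a more stringent threshold.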
Affiliation(s)
- Nelson L S Tang
- Department of Chemical Pathology, Faculty of Medicine, The Chinese University of Hong Kong, Shatin, Hong Kong.
|
36
|
Lim J, Kim YJ, Yoon Y, Kim SO, Kang H, Park J, Han AR, Han B, Oh B, Kimm K, Yoon B, Song K. Comparative study of the linkage disequilibrium of an ENCODE region, chromosome 7p15, in Korean, Japanese, and Han Chinese samples. Genomics 2006; 87:392-8. [PMID: 16376517 DOI: 10.1016/j.ygeno.2005.11.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 07/08/2005] [Revised: 10/19/2005] [Accepted: 11/12/2005] [Indexed: 10/25/2022]
Abstract
The extent and pattern of linkage disequilibrium (LD) in the human genome provide important information for disease gene mapping. Previous studies have shown that LDs vary depending on chromosomal regions and populations. As the Asian samples of the International HapMap Project consisted of Japanese and Chinese populations, it was of interest whether we could use the HapMap data as a reference to carry out association studies of common complex diseases in a closely related population, such as Koreans. We have compared the LD and recombination patterns defined by single-nucleotide polymorphisms (SNPs) in ENCODE region ENm010, chromosome 7p15.2, in Korean, Japanese, and Chinese samples and further tested the robustness of tagSNPs among the Asian samples. We genotyped 792 SNPs in 500 kb (chromosome 7: 26699793-27199792, NCBI build 34) from 90 unrelated Koreans by fluorescence polarization detection and compared the data with Asian data from the HapMap project. Despite some differences in the position of high LD region boundaries, the overall patterns of LD were remarkably similar across the three samples, reflecting strong genetic affinities among them. Furthermore, the haplotype tag SNP transferability across the three samples was greater than 90%. Our results support the initial suggestion that the populations genotyped in the HapMap project might serve as reference populations for the selection of tagSNPs in association studies.
Affiliation(s)
- Jiyoung Lim
- Department of Biochemistry and Molecular Biology, University of Ulsan College of Medicine, 388-1 Poongnap-Dong, Songpa-Gu, Seoul 138-736, Korea
|
37
|
Traherne JA, Horton R, Roberts AN, Miretti MM, Hurles ME, Stewart CA, Ashurst JL, Atrazhev AM, Coggill P, Palmer S, Almeida J, Sims S, Wilming LG, Rogers J, de Jong PJ, Carrington M, Elliott JF, Sawcer S, Todd JA, Trowsdale J, Beck S. Genetic analysis of completely sequenced disease-associated MHC haplotypes identifies shuffling of segments in recent human history. PLoS Genet 2006; 2:e9. [PMID: 16440057 PMCID: PMC1331980 DOI: 10.1371/journal.pgen.0020009] [Citation(s) in RCA: 145] [Impact Index Per Article: 8.1] [Received: 09/14/2005] [Accepted: 12/13/2005] [Indexed: 11/23/2022] Open
Abstract
The major histocompatibility complex (MHC) is recognised as one of the most important genetic regions in relation to common human disease. Advancement in identification of MHC genes that confer susceptibility to disease requires greater knowledge of sequence variation across the complex. Highly duplicated and polymorphic regions of the human genome such as the MHC are, however, somewhat refractory to some whole-genome analysis methods. To address this issue, we are employing a bacterial artificial chromosome (BAC) cloning strategy to sequence entire MHC haplotypes from consanguineous cell lines as part of the MHC Haplotype Project. Here we present 4.25 Mb of the human haplotype QBL (HLA-A26-B18-Cw5-DR3-DQ2) and compare it with the MHC reference haplotype and with a second haplotype, COX (HLA-A1-B8-Cw7-DR3-DQ2), that shares the same HLA-DRB1, -DQA1, and -DQB1 alleles. We have defined the complete gene, splice variant, and sequence variation contents of all three haplotypes, comprising over 259 annotated loci and over 20,000 single nucleotide polymorphisms (SNPs). Certain coding sequences vary significantly between different haplotypes, making them candidates for functional and disease-association studies. Analysis of the two DR3 haplotypes allowed delineation of the shared sequence between two HLA class II-related haplotypes differing in disease associations and the identification of at least one of the sites that mediated the original recombination event. The levels of variation across the MHC were similar to those seen for other HLA-disparate haplotypes, except for a 158-kb segment that contained the HLA-DRB1, -DQA1, and -DQB1 genes and showed very limited polymorphism compatible with identity-by-descent and relatively recent common ancestry (<3,400 generations). 
These results indicate that the differential disease associations of these two DR3 haplotypes are due to sequence variation outside this central 158-kb segment, and that shuffling of ancestral blocks via recombination is a potential mechanism whereby certain DR-DQ allelic combinations, which presumably have favoured immunological functions, can spread across haplotypes and populations.
Affiliation(s)
- James A Traherne
- Department of Pathology, Immunology Division, University of Cambridge, Cambridge, United Kingdom
- Roger Horton
- Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, United Kingdom
- Anne N Roberts
- Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/MRC Building, Addenbrooke's Hospital, Cambridge, United Kingdom
- Marcos M Miretti
- Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, United Kingdom
- Matthew E Hurles
- Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, United Kingdom
- C. Andrew Stewart
- Department of Pathology, Immunology Division, University of Cambridge, Cambridge, United Kingdom
- Jennifer L Ashurst
- Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, United Kingdom
- Alexey M Atrazhev
- Alberta Diabetes Institute (ADI), Department of Medical Microbiology and Immunology, Division of Dermatology and Cutaneous Sciences, University of Alberta, Edmonton, Canada
- Penny Coggill
- Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, United Kingdom
- Sophie Palmer
- Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, United Kingdom
- Jeff Almeida
- Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, United Kingdom
- Sarah Sims
- Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, United Kingdom
- Laurens G Wilming
- Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, United Kingdom
- Jane Rogers
- Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, United Kingdom
- Pieter J. de Jong
- Children's Hospital Oakland Research Institute, Oakland, California, United States of America
- Mary Carrington
- Basic Research Program, SAIC-Frederick, Inc., Laboratory of Genomic Diversity, National Cancer Institute, Frederick, Maryland, United States of America
- John F Elliott
- Alberta Diabetes Institute (ADI), Department of Medical Microbiology and Immunology, Division of Dermatology and Cutaneous Sciences, University of Alberta, Edmonton, Canada
- Stephen Sawcer
- Department of Clinical Neurosciences, University of Cambridge, Addenbrooke's Hospital, Cambridge, United Kingdom
- John A Todd
- Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/MRC Building, Addenbrooke's Hospital, Cambridge, United Kingdom
- John Trowsdale
- Department of Pathology, Immunology Division, University of Cambridge, Cambridge, United Kingdom
- Stephan Beck
- Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, United Kingdom
|
38
|
Franke A, Wollstein A, Teuber M, Wittig M, Lu T, Hoffmann K, Nürnberg P, Krawczak M, Schreiber S, Hampe J. GENOMIZER: an integrated analysis system for genome-wide association data. Hum Mutat 2006; 27:583-8. [PMID: 16652332 DOI: 10.1002/humu.20306] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/09/2022]
Abstract
Genome-wide association analysis appears to be a promising way to identify heritable susceptibility factors for complex human disorders. However, the feasibility of large-scale genotyping experiments is currently limited by an incomplete marker coverage of the genome, a restricted understanding of the functional role of given genomic regions, and the small sample sizes used. Thus, genome-wide association analysis will be a screening tool to facilitate subsequent gene discovery rather than a means to completely resolve individual genetic risk profiles. The validation of association findings will continue to rely upon the replication of "leads" in independent samples from either the same or different populations. Even under such pragmatic conditions, the timely analysis of the large data sets in question poses serious technical challenges. We have therefore developed public-domain software, GENOMIZER, that implements the workflow of an association experiment, including data management, single-point and haplotype analysis, "lead" definition, and data visualization. GENOMIZER (www.ikmb.uni-kiel.de/genomizer) comes with a complete user manual, and is open-source software licensed under the GNU Lesser General Public License. We suggest that the use of this software will facilitate the handling and interpretation of the currently emerging genome-wide association data.
Affiliation(s)
- Andre Franke
- Institute of Clinical Molecular Biology, Kiel Center of the German National Genotyping Platform, Christian-Albrechts-University, Kiel, Germany
|
39
|
Abstract
Currently, more than 10 million DNA sequence variations have been uncovered in the human genome. The most detailed variation discovery efforts have focused on candidate genes involved in cardiovascular disease or in susceptibilities associated with exposure to environmental agents. Here we provide an overview of natural genetic variation from the literature and in 510 human candidate genes resequenced for variation discovery. The average human gene contains 126 biallelic polymorphisms, 46 of which are common (≥5% minor allele frequency) and 5 of which are found in coding regions. Using this complete picture of genetic diversity, we explore conservation, signatures of selection, and historical recombination to mine information useful for candidate gene association studies. In general, we find that the patterns of human gene variation suggest that no one approach will be appropriate for genetic association studies across all genes. Therefore, many different approaches may be required to identify the elusive genotypes associated with common human phenotypes.
Affiliation(s)
- Dana C Crawford
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA.
|
40
|
Zeggini E, Rayner W, Morris AP, Hattersley AT, Walker M, Hitman GA, Deloukas P, Cardon LR, McCarthy MI. An evaluation of HapMap sample size and tagging SNP performance in large-scale empirical and simulated data sets. Nat Genet 2005; 37:1320-2. [PMID: 16258542 DOI: 10.1038/ng1670] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.3] [Received: 06/30/2005] [Accepted: 10/04/2005] [Indexed: 11/09/2022]
Abstract
A substantial investment has been made in the generation of large public resources designed to enable the identification of tag SNP sets, but data establishing the adequacy of the sample sizes used are limited. Using large-scale empirical and simulated data sets, we found that the sample sizes used in the HapMap project are sufficient to capture common variation, but that performance declines substantially for variants with minor allele frequencies of <5%.
Affiliation(s)
- Eleftheria Zeggini
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK.
|
41
|
Zhang K, Sun F. Assessing the power of tag SNPs in the mapping of quantitative trait loci (QTL) with extremal and random samples. BMC Genet 2005; 6:51. [PMID: 16236175 PMCID: PMC1274312 DOI: 10.1186/1471-2156-6-51] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 06/08/2005] [Accepted: 10/19/2005] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Recent studies have indicated that the human genome could be divided into regions with low haplotype diversity interspersed with regions of high haplotype diversity. In regions of low haplotype diversity, a small fraction of SNPs (tag SNPs) are sufficient to account for most of the haplotype diversity of the human genome. These tag SNPs can be extremely useful for testing the association of a marker locus with a qualitative or quantitative trait locus in that it may not be necessary to genotype all the SNPs. When tag SNPs are used to reduce the genotyping effort in association studies, it is important to know how much power is lost. It is also important to know how much power is gained when tag SNPs instead of the same number of randomly chosen SNPs are used. RESULTS We design a simulation study to tackle these problems for a variety of quantitative association tests using either case-parent samples or unrelated population samples. First, the samples are generated based on the quantitative trait model with the assumption of either an extremal sampling scheme or a random sampling scheme. Second, a small number of samples are selected to determine the haplotype blocks and the tag SNPs. Third, the statistical power of the tests is evaluated using four kinds of data: (1) all the SNPs and the corresponding haplotypes, (2) the tag SNPs and the corresponding haplotypes, (3) the same number of evenly spaced SNPs with minor allele frequency greater than a threshold and the corresponding haplotypes, (4) the same number of randomly chosen SNPs and their corresponding haplotypes. CONCLUSION Our results suggest that in most situations genotyping efforts can be significantly reduced by using tag SNPs for mapping the QTL in association studies without much loss of power, which is consistent with previous studies on association mapping of qualitative traits. 
For all situations considered, two-locus haplotype analyses using tag SNPs are more powerful than those using the same number of randomly selected SNPs, but the size of the power difference depends upon the sampling scheme and the population history.
Affiliation(s)
- Kui Zhang
- Section on Statistical Genetics, Department of Biostatistics, School of Public Health, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Fengzhu Sun
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
|
42
|
Abstract
Much effort and expense are being spent internationally to detect genetic polymorphisms contributing to susceptibility to complex human disease. Concomitantly, the technology for detecting and genotyping single nucleotide polymorphisms (SNPs) has undergone rapid development, yielding extensive catalogues of these polymorphisms across the genome. Population-based maps of the correlations amongst SNPs (linkage disequilibrium) are now being developed to accelerate the discovery of genes for complex human diseases. These genomic advances coincide with an increasing recognition of the importance of very large sample sizes for studying genetic effects. Together, these new genetic and epidemiological data hold renewed promise for the identification of susceptibility genes for complex traits. We review the state of knowledge about the structure of the human genome as related to SNPs and linkage disequilibrium, discuss the potential applications of this knowledge to mapping complex disease genes, and consider the issues facing whole genome association scanning using SNPs.
Affiliation(s)
- Lyle J Palmer
- Western Australian Institute for Medical Research and University of Western Australia Centre for Medical Research, University of Western Australia.
|
43
|
Ke X, Miretti MM, Broxholme J, Hunt S, Beck S, Bentley DR, Deloukas P, Cardon LR. A comparison of tagging methods and their tagging space. Hum Mol Genet 2005; 14:2757-67. [PMID: 16103130 DOI: 10.1093/hmg/ddi309] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Indexed: 12/15/2022] Open
Abstract
Single-nucleotide polymorphism (SNP) tagging is widely used as a way of saving genotyping costs in association studies. A number of different tagging methods have been developed to reduce the number of markers to be genotyped while maintaining power for detecting effects on non-assayed SNPs. How the different methods perform in different settings, the degree to which they overlap and share common tags and how they differ are important questions. We investigated these questions by comparing three widely used tagging methods/algorithms: one haplotype r²-based method, one pair-wise r²-based method and one method which was based on haplotype diversity but focused on major haplotypes. Tagging efficiency was defined as the number of genotyped markers divided by the number of tagging SNPs. Tagging effectiveness was defined as the proportion of un-genotyped or 'hidden' SNPs being detected (having a pair-wise or haplotype r² with a set of tagging SNPs over a threshold, e.g. haplotype r² ≥ 0.80). The ENCODE regions genotyped on the HapMap CEPH individuals were examined in this study. Tagging effectiveness was generally poorer for rare SNPs than for common SNPs, for all three tagging methods. Inclusion of rare SNPs in the initial HapMap scheme could enhance the performance of tags on rare hidden SNPs at the expense of increased genotyping cost. At a moderate tagging efficiency, more than 90% of hidden SNPs detected by tagging SNPs selected by one method were also detected by tagging SNPs selected by another method, and this figure could be increased to 100% if tagging efficiency was allowed to drop. These results indicate that the tagging space is highly concordant between different tagging methods, despite the fact that they often involve different sets of tagging SNPs.
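The two metrics defined in this abstract reduce to simple ratios, sketched below under stated assumptions. Only the pair-wise variant of effectiveness is shown. The function names, the 0/1/2 genotype coding, and the use of the squared genotype correlation as the pair-wise r² are illustrative choices, not taken from the paper.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 counts)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP cannot be tagged
    return sxy * sxy / (sxx * syy)


def tagging_efficiency(n_genotyped, n_tags):
    """Number of genotyped markers divided by the number of tagging SNPs."""
    return n_genotyped / n_tags


def tagging_effectiveness(hidden, tags, threshold=0.8):
    """Proportion of hidden SNPs with pair-wise r2 >= threshold to any tag.

    hidden, tags: dicts mapping a SNP id to its genotype vector.
    """
    detected = sum(
        1 for h in hidden.values()
        if any(r_squared(h, t) >= threshold for t in tags.values())
    )
    return detected / len(hidden)


# Hypothetical toy data: one tag, two hidden SNPs, six individuals.
tags = {"t1": (0, 1, 2, 0, 1, 2)}
hidden = {"h1": (0, 1, 2, 0, 1, 2), "h2": (2, 0, 1, 0, 2, 0)}
efficiency = tagging_efficiency(n_genotyped=10, n_tags=4)
effectiveness = tagging_effectiveness(hidden, tags)
```

The two quantities pull against each other: adding tags lowers efficiency but can only raise or maintain effectiveness, which matches the abstract's remark that concordance between methods rises when efficiency is allowed to drop.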
Affiliation(s)
- Xiayi Ke
- Wellcome Trust Centre for Human Genetics, University of Oxford, UK.
|
44
|
Abstract
The current enthusiasm for pharmacogenetics draws much of its inspiration from the relatively few examples of polymorphisms that have marked and seemingly clinically relevant effects on drug response. In this regard, pharmacogenetic research has paralleled the study of human disease, which has enjoyed success in identifying mutations underlying mendelian conditions. Efforts to decipher the genetics of complex diseases, which involve the interaction of multiple genes with each other and with the environment, have been considerably less successful. In most instances, drug responses will probably also prove to be complex, influenced by both the environment and multiple genetic factors. For pharmacogenetics to deliver on its potential, this complexity will need to be recognized and accommodated, both in basic research and in clinical application of pharmacogenetics. As the attention of researchers begins to shift toward more systematic pharmacogenetic investigations, we suggest some priorities and standards for pharmacogenetic research.
Affiliation(s)
- Anna C Need
- Institute for Genome Sciences & Policy, Center for Population Genomics & Pharmacogenetics, Duke University, 103 Research Drive, DUMC Box 3471, Durham, North Carolina 27710, USA
|
45
|
Miretti MM, Walsh EC, Ke X, Delgado M, Griffiths M, Hunt S, Morrison J, Whittaker P, Lander ES, Cardon LR, Bentley DR, Rioux JD, Beck S, Deloukas P. A high-resolution linkage-disequilibrium map of the human major histocompatibility complex and first generation of tag single-nucleotide polymorphisms. Am J Hum Genet 2005; 76:634-46. [PMID: 15747258 PMCID: PMC1199300 DOI: 10.1086/429393] [Citation(s) in RCA: 194] [Impact Index Per Article: 10.2] [Received: 12/03/2004] [Accepted: 02/02/2005] [Indexed: 11/03/2022] Open
Abstract
Autoimmune, inflammatory, and infectious diseases present a major burden to human health and are frequently associated with loci in the human major histocompatibility complex (MHC). Here, we report a high-resolution (1.9 kb) linkage-disequilibrium (LD) map of a 4.46-Mb fragment containing the MHC in U.S. pedigrees with northern and western European ancestry collected by the Centre d'Etude du Polymorphisme Humain (CEPH) and the first generation of haplotype tag single-nucleotide polymorphisms (tagSNPs) that provide up to a fivefold increase in genotyping efficiency for all future MHC-linked disease-association studies. The data confirm previously identified recombination hotspots in the class II region and allow the prediction of numerous novel hotspots in the class I and class III regions. The region of longest LD maps outside the classic MHC to the extended class I region spanning the MHC-linked olfactory-receptor gene cluster. The extended haplotype homozygosity analysis for recent positive selection shows that all 14 outlying haplotype variants map to a single extended haplotype, which most commonly bears HLA-DRB1*1501. The SNP data, haplotype blocks, and tagSNPs analysis reported here have been entered into a multidimensional Web-based database (GLOVAR), where they can be accessed and viewed in the context of relevant genome annotation. This LD map allowed us to give coordinates for the extremely variable LD structure underlying the MHC.
Affiliation(s)
- Marcos M. Miretti, Emily C. Walsh, Xiayi Ke, Marcos Delgado, Mark Griffiths, Sarah Hunt, Jonathan Morrison, Pamela Whittaker, Eric S. Lander, Lon R. Cardon, David R. Bentley, John D. Rioux, Stephan Beck, Panos Deloukas
- Wellcome Trust Sanger Institute, Hinxton, United Kingdom; Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA; and Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
|
46
|
Wang WYS, Barratt BJ, Clayton DG, Todd JA. Genome-wide association studies: theoretical and practical concerns. Nat Rev Genet 2005; 6:109-18. [PMID: 15716907 DOI: 10.1038/nrg1522] [Citation(s) in RCA: 747] [Impact Index Per Article: 39.3] [Indexed: 12/17/2022]
Abstract
To fully understand the allelic variation that underlies common diseases, complete genome sequencing for many individuals with and without disease is required. This is still not technically feasible. However, recently it has become possible to carry out partial surveys of the genome by genotyping large numbers of common SNPs in genome-wide association studies. Here, we outline the main factors - including models of the allelic architecture of common diseases, sample size, map density and sample-collection biases - that need to be taken into account in order to optimize the cost efficiency of identifying genuine disease-susceptibility loci.
Affiliation(s)
- William Y S Wang
- Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 2XY, UK
|