1
|
Aminbakhsh AP, Théberge ET, Burden E, Adejumo CK, Gravely AK, Lehman A, Sedlak TL. Exploring associations between estrogen and gene candidates identified by coronary artery disease genome-wide association studies. Front Cardiovasc Med 2025; 12:1502985. [PMID: 40182431 PMCID: PMC11965610 DOI: 10.3389/fcvm.2025.1502985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2024] [Accepted: 03/04/2025] [Indexed: 04/05/2025] Open
Abstract
Introduction: Coronary artery disease (CAD) is the leading cause of death worldwide, with epidemiological sex and gender differences in prevalence, pathophysiology and outcomes. It has been hypothesized that sex steroids, such as estrogen, may contribute to these sex differences. CAD has a relatively large genetic component, with heritability estimates ranging from 40% to 60%. In the last two decades, genome-wide association studies (GWAS) have contributed substantially to the understanding of genetic candidates contributing to CAD. The aim of this study was to determine whether genes discovered in CAD GWASs are affected by estrogen via direct modulation or indirect downstream targets. Methods: A scoping review was conducted using MEDLINE and EMBASE for studies of atherosclerotic coronary artery disease with a GWAS design. Analysis was limited to candidate genes whose corresponding single nucleotide polymorphisms (SNPs) surpassed genome-wide significance and had been mapped to genes by the study authors. The number of studies that conducted sex-stratified analyses with significant genes was quantified. A literature search of the final gene lists was done to examine any evidence suggesting that estrogen may modulate the genes and/or gene products. Results: There were 60 eligible CAD GWASs meeting inclusion criteria for data extraction. Of these 60, only 36 reported genome-wide significant SNPs, and only 3 of these had significant SNPs from sex-stratified analyses mapped to genes. From these 36 studies, a total of 61 genes were curated, of which 26 (43%) were found to be modulated by estrogen. All 26 were discovered in studies that adjusted for sex, and 12 of the 26 were also discovered in studies that conducted sex-stratified analyses. Of the 26 genes, 12 were classified as having a role in lipid synthesis, metabolism and/or lipoprotein mechanisms, 11 as having a role in vascular integrity, and 3 as having a role in thrombosis. Discussion: This study provides further evidence of the relationship between estrogen, genetic risk and the development of CAD. More sex-stratified research will need to be conducted to further characterize estrogen's relation to sex differences in the pathology and progression of CAD.
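The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the numbers are copied from the abstract, while the variable names and category labels are hypothetical shorthand, not the authors' own.

```python
# Counts copied from the abstract; names and labels are illustrative assumptions.
total_genes = 61
estrogen_modulated = 26
functional_classes = {
    "lipid synthesis/metabolism/lipoprotein": 12,
    "vascular integrity": 11,
    "thrombosis": 3,
}

# Share of curated genes with evidence of estrogen modulation, in percent.
share_pct = round(100 * estrogen_modulated / total_genes)

# The three functional classes should account for every estrogen-modulated gene.
classes_total = sum(functional_classes.values())

print(f"{estrogen_modulated}/{total_genes} = {share_pct}%")
print(f"functional classes sum to {classes_total}")
```

The rounded share matches the reported 43%, and the three functional classes add up to the 26 estrogen-modulated genes.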
Affiliation(s)
- Ava P. Aminbakhsh
- Department of Medicine, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
| | - Emilie T. Théberge
- Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
| | - Elizabeth Burden
- Division of Internal Medicine, Department of Medicine, University of British Columbia, Vancouver, BC, Canada
- Vancouver Coastal Health, Vancouver, BC, Canada
| | - Cindy Kalenga Adejumo
- Division of Internal Medicine, Department of Medicine, University of British Columbia, Vancouver, BC, Canada
- Vancouver Coastal Health, Vancouver, BC, Canada
| | - Annabel K. Gravely
- Department of Medicine, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
| | - Anna Lehman
- Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
- Vancouver Coastal Health, Vancouver, BC, Canada
| | - Tara L. Sedlak
- Vancouver Coastal Health, Vancouver, BC, Canada
- Division of Cardiology, Department of Medicine, University of British Columbia, Vancouver, BC, Canada
|
2
|
Graça M, Nobre R, Sousa L, Ilic A. Distributed transformer for high order epistasis detection in large-scale datasets. Sci Rep 2024; 14:14579. [PMID: 38918413 PMCID: PMC11199512 DOI: 10.1038/s41598-024-65317-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Accepted: 06/19/2024] [Indexed: 06/27/2024] Open
Abstract
Understanding the genetic basis of complex diseases is one of the most important challenges in current precision medicine. To this end, Genome-Wide Association Studies aim to correlate Single Nucleotide Polymorphisms (SNPs) to the presence or absence of certain traits. However, these studies do not consider interactions between several SNPs, known as epistasis, which explain most genetic diseases. Analyzing SNP combinations to detect epistasis is a major computational task, due to the enormous search space. A possible solution is to employ deep learning strategies for genomic prediction, but the lack of explainability derived from the black-box nature of neural networks is a challenge yet to be addressed. Herein, a novel, flexible, portable, and scalable framework for network interpretation based on transformers is proposed to tackle any-order epistasis. The results on various epistasis scenarios show that the proposed framework outperforms state-of-the-art methods for explainability, while being scalable to large datasets and portable to various deep learning accelerators. The proposed framework is validated on three WTCCC datasets, identifying SNPs related to genes known in the literature that have direct relationships with the studied diseases.
Affiliation(s)
- Miguel Graça
- INESC-ID, Instituto Superior Técnico, 1000-029, Lisbon, Portugal
| | - Ricardo Nobre
- INESC-ID, Instituto Superior Técnico, 1000-029, Lisbon, Portugal
| | - Leonel Sousa
- INESC-ID, Instituto Superior Técnico, 1000-029, Lisbon, Portugal
| | - Aleksandar Ilic
- INESC-ID, Instituto Superior Técnico, 1000-029, Lisbon, Portugal
|
3
|
Giri P, Parmar M, Ezhuthachan DD, Desai T, Dwivedi M. Promoter polymorphisms of neuropeptide Y, interleukin-1B and increased IL-1β levels are associated with rheumatoid arthritis susceptibility in South Gujarat population. Human Gene 2024; 39:201251. [DOI: 10.1016/j.humgen.2023.201251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/06/2024]
|
4
|
Duan J, Zhang J, Liu L, Wen Y. A guidance of model selection for genomic prediction based on linear mixed models for complex traits. Front Genet 2022; 13:1017380. [PMID: 36276959 PMCID: PMC9581223 DOI: 10.3389/fgene.2022.1017380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Accepted: 09/20/2022] [Indexed: 11/27/2022] Open
Abstract
Brain imaging outcomes are important for Alzheimer's disease (AD) detection, and their prediction based on both genetic and demographic risk factors can facilitate the ongoing prevention and treatment of AD. Existing studies have identified numerous significantly AD-associated SNPs. However, how to make the best use of them for prediction analyses remains unknown. In this research, we first explored the relationship between genetic architecture and prediction accuracy of linear mixed models via visualizing the Manhattan plots generated based on the data obtained from the Wellcome Trust Case Control Consortium, and then constructed prediction models for eleven AD-related brain imaging outcomes using data from United Kingdom Biobank and Alzheimer's Disease Neuroimaging Initiative studies. We found that the simple Manhattan plots can be informative for the selection of prediction models. For traits that do not exhibit any significant signals from the Manhattan plots, the simple genomic best linear unbiased prediction (gBLUP) model is recommended due to its robust and accurate prediction performance as well as its computational efficiency. For diseases and traits that show spiked signals on the Manhattan plots, the latent Dirichlet process regression is preferred, as it can flexibly accommodate both the oligogenic and omnigenic models. For the prediction of AD-related traits, the Manhattan plots suggest their polygenic nature, and gBLUP has achieved robust performance for all these traits. We found that for these AD-related traits, genetic factors themselves only explain a very small proportion of the heritability, and the well-known AD risk factors can substantially improve the prediction model.
Affiliation(s)
- Jiefang Duan
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, China
| | - Jiayu Zhang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, China
| | - Long Liu
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, China
| | - Yalu Wen
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, China; Department of Statistics, University of Auckland, Auckland, New Zealand
|
5
|
Yamasaki M, Makino T, Khor SS, Toyoda H, Miyagawa T, Liu X, Kuwabara H, Kano Y, Shimada T, Sugiyama T, Nishida H, Sugaya N, Tochigi M, Otowa T, Okazaki Y, Kaiya H, Kawamura Y, Miyashita A, Kuwano R, Kasai K, Tanii H, Sasaki T, Honda M, Tokunaga K. Sensitivity to gene dosage and gene expression affects genes with copy number variants observed among neuropsychiatric diseases. BMC Med Genomics 2020; 13:55. [PMID: 32223758 PMCID: PMC7104509 DOI: 10.1186/s12920-020-0699-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 10/25/2018] [Accepted: 02/24/2020] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND: Copy number variants (CNVs) have been reported to be associated with diseases, traits, and evolution. However, it is hard to determine which gene should have priority as a target for further functional experiments if a CNV is rare or a singleton. In this study, we attempted to overcome this issue by using two approaches: assessing the influence of gene dosage sensitivity and of gene expression sensitivity. Dosage-sensitive genes were defined as those derived from two-round whole-genome duplication in previous studies. In addition, we proposed a cross-sectional omics approach that utilizes open data from GTEx to assess the effect of whole-genome CNVs on gene expression. METHODS: Affymetrix Genome-Wide SNP Array 6.0 was used to detect CNVs by PennCNV and CNV Workshop. After quality controls for population stratification, family relationship and CNV detection, 287 patients with narcolepsy, 133 patients with essential hypersomnia, 380 patients with panic disorders, 164 patients with autism, 784 patients with Alzheimer disease and 1280 healthy individuals remained for the enrichment analysis. RESULTS: Overall, significant enrichment of dosage-sensitive genes was found across patients with narcolepsy, panic disorders and autism. In particular, significant enrichment of dosage-sensitive genes in duplications was observed across all diseases except Alzheimer disease. For deletions, little or no enrichment of dosage-sensitive genes was seen in the patients when compared to the healthy individuals. Interestingly, significant enrichment of genes with expression sensitivity in brain was observed in patients with panic disorder and autism. While duplications presented a higher burden, deletions did not cause significant differences when compared to the healthy individuals. When we assessed the effect of sensitivity to gene dosage and gene expression at the same time, the highest ratio of enrichment was observed in the group including dosage-sensitive genes and genes with expression sensitivity only in brain. In addition, shared CNV regions among the five neuropsychiatric diseases were also investigated. CONCLUSIONS: This study contributes evidence that dosage-sensitive genes are associated with CNVs among neuropsychiatric diseases. In addition, we utilized open data from GTEx to assess the effect of whole-genome CNVs on gene expression. We also investigated shared CNV regions among neuropsychiatric diseases.
Affiliation(s)
- Maria Yamasaki
- Department of Health Data Science Research, Healthy Aging Innovation Center, Tokyo Metropolitan Geriatric Medical Center, Tokyo, Japan
| | - Takashi Makino
- Laboratory of Evolutionary Genomics, Graduate School of Life Sciences, Tohoku University, Sendai, Japan
| | - Seik-Soon Khor
- Genome Medical Science Project (Toyama), National Center for Global Health and Medicine, Tokyo, Japan
| | - Hiromi Toyoda
- Genome Medical Science Project (Toyama), National Center for Global Health and Medicine, Tokyo, Japan
- Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
| | - Taku Miyagawa
- Department of Psychiatry and Behavioral Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
| | - Xiaoxi Liu
- RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan
| | - Hitoshi Kuwabara
- Department of Psychiatry, Hamamatsu University School of Medicine, Hamamatsu, Japan
| | - Yukiko Kano
- Department of Child and Adolescent Psychiatry, Hamamatsu University School of Medicine, Shizuoka, Japan
- Department of Child Psychiatry, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
| | - Takafumi Shimada
- Division for Counseling and Support, The University of Tokyo, Tokyo, Japan
| | - Toshiro Sugiyama
- Department of Child and Adolescent Psychiatry, Hamamatsu University School of Medicine, Shizuoka, Japan
| | - Hisami Nishida
- Asunaro Hospital for Child and Adolescent Psychiatry, Mie, Japan
| | - Nagisa Sugaya
- Unit of Public Health and Preventive Medicine, School of Medicine, Yokohama City University, Kanagawa, Japan
| | - Mamoru Tochigi
- Department of Neuropsychiatry, Teikyo University Hospital, Tokyo, Japan
| | - Takeshi Otowa
- Department of Neuropsychiatry, NTT Medical Center Tokyo, Tokyo, Japan
| | - Yuji Okazaki
- Department of Psychiatry, Koseikai Michinoo Hospital, Nagasaki, Japan
| | - Hisanobu Kaiya
- Panic Disorder Research Center, Warakukai Med Corp, Tokyo, Japan
| | - Yoshiya Kawamura
- Department of Psychiatry, Shonan Kamakura General Hospital, Kanagawa, Japan
| | - Akinori Miyashita
- Department of Molecular Genetics, Bioresource Science Branch, Center for Bioresources, Brain Research Institute, Niigata University, Niigata, Japan
| | - Ryozo Kuwano
- Department of Molecular Genetics, Bioresource Science Branch, Center for Bioresources, Brain Research Institute, Niigata University, Niigata, Japan
- Asahigawaso Research Institute, Asahigawaso Medical-Welfare Center, Okayama, Japan
| | - Kiyoto Kasai
- Department of Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
| | - Hisashi Tanii
- Center for Physical and Mental Health, Mie University, Tsu, Mie, Japan
| | - Tsukasa Sasaki
- Division of Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
| | - Makoto Honda
- Department of Psychiatry and Behavioral Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
| | - Katsushi Tokunaga
- Genome Medical Science Project (Toyama), National Center for Global Health and Medicine, Tokyo, Japan
- Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
|
6
|
Cai M, Chen LS, Liu J, Yang C. IGREX for quantifying the impact of genetically regulated expression on phenotypes. NAR Genom Bioinform 2020; 2:lqaa010. [PMID: 32118202 PMCID: PMC7034630 DOI: 10.1093/nargab/lqaa010] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/14/2019] [Revised: 01/08/2020] [Accepted: 02/05/2020] [Indexed: 12/20/2022] Open
Abstract
By leveraging existing GWAS and eQTL resources, transcriptome-wide association studies (TWAS) have achieved many successes in identifying trait-associations of genetically regulated expression (GREX) levels. TWAS analysis relies on the shared GREX variation across GWAS and the reference eQTL data, which depends on the cellular conditions of the eQTL data. Considering the increasing availability of eQTL data from different conditions and the often unknown trait-relevant cell/tissue-types, we propose a method and tool, IGREX, for precisely quantifying the proportion of phenotypic variation attributed to the GREX component. IGREX takes as input a reference eQTL panel and individual-level or summary-level GWAS data. Using eQTL data of 48 tissue types from the GTEx project as a reference panel, we evaluated the tissue-specific IGREX impact on a wide spectrum of phenotypes. We observed strong GREX effects on immune-related protein biomarkers. By incorporating trans-eQTLs and analyzing genetically regulated alternative splicing events, we evaluated new potential directions for TWAS analysis.
Affiliation(s)
- Mingxuan Cai
- Department of Mathematics, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
| | - Lin S Chen
- Department of Public Health Sciences, The University of Chicago, IL 60637, USA
| | - Jin Liu
- Center for Quantitative Medicine, Duke-NUS Medical School, 169856, Singapore
| | - Can Yang
- Department of Mathematics, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
|
7
|
Saad MN, Mabrouk MS, Eldeib AM, Shaker OG. Comparative study for haplotype block partitioning methods - Evidence from chromosome 6 of the North American Rheumatoid Arthritis Consortium (NARAC) dataset. PLoS One 2018; 13:e0209603. [PMID: 30596705 PMCID: PMC6312333 DOI: 10.1371/journal.pone.0209603] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/08/2018] [Accepted: 12/07/2018] [Indexed: 11/19/2022] Open
Abstract
Haplotype-based methods compete with "one-SNP-at-a-time" approaches as the preferred choice for association studies. Chromosome 6 contains most of the known genetic biomarkers for rheumatoid arthritis (RA). Therefore, chromosome 6 serves as a benchmark for testing haplotype methods. The aim of this study is to test the North American Rheumatoid Arthritis Consortium (NARAC) dataset to find out whether haplotype block methods or single-locus approaches alone can sufficiently identify the significant single nucleotide polymorphisms (SNPs) associated with RA, and whether a single haplotype block method is enough for partitioning chromosome 6 of the NARAC dataset. In the NARAC dataset, chromosome 6 comprises 35,574 SNPs for 2,062 individuals (868 cases, 1,194 controls). The individual SNP approach and three haplotype block methods were applied to the NARAC dataset to identify the RA biomarkers. The three haplotype partitioning methods were the confidence interval test (CIT), the four gamete test (FGT), and the solid spine of linkage disequilibrium (SSLD). P-values after stringent Bonferroni correction for multiple testing were used to assess the strength of association between the genetic variants and RA susceptibility. Moreover, the block size (in base pairs (bp) and number of SNPs included), number of blocks, percentage of SNPs left uncovered by the block method, percentage of significant blocks out of the total number of blocks, and number of significant haplotypes and SNPs were used to compare the three haplotype block methods. The individual SNP, CIT, FGT, and SSLD methods detected 432, 1,086, 1,099, and 1,322 associated SNPs, respectively. Each method identified significant SNPs that were not detected by any other method (individual SNP: 12, FGT: 37, CIT: 55, and SSLD: 189 SNPs). A total of 916 SNPs were discovered by all three haplotype block methods, and 367 SNPs were discovered by both the haplotype block methods and the individual SNP approach. The P-values of these 367 SNPs were lower than those of the SNPs uniquely detected by only one method. The 367 SNPs detected by all the methods represent promising candidates for RA susceptibility. They should be further investigated for the European population. A hybrid technique including the four methods should be applied to detect the significant SNPs associated with RA for chromosome 6 of the NARAC dataset. Moreover, if only one method is to be selected, the SSLD method may be preferred.
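The Bonferroni correction mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the family-wise error rate of 0.05 is an assumption (the abstract does not state it), the number of tests is taken to be the 35,574 chromosome 6 SNPs of the single-SNP analysis, and the function names and p-values are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_significant(p_values, alpha, n_tests):
    """Flag each raw p-value that falls below the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [p < threshold for p in p_values]


N_SNPS = 35_574  # SNPs on chromosome 6 in the NARAC dataset (from the abstract)
ALPHA = 0.05     # assumed family-wise error rate; not stated in the abstract

threshold = bonferroni_threshold(ALPHA, N_SNPS)
flags = is_significant([1e-7, 1e-5], ALPHA, N_SNPS)  # made-up p-values

print(f"per-SNP threshold: {threshold:.3e}")
print(flags)
```

Under these assumptions a SNP must reach roughly p < 1.4e-6 to be called significant. For the haplotype block methods the number of tests would presumably be the number of blocks or haplotypes tested rather than the number of SNPs.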
Affiliation(s)
- Mohamed N. Saad
- Biomedical Engineering Department, Faculty of Engineering, Minia University, Minia, Egypt
| | - Mai S. Mabrouk
- Biomedical Engineering Department, Faculty of Engineering, Misr University for Science and Technology (MUST), 6th of October City, Egypt
| | - Ayman M. Eldeib
- Systems and Biomedical Engineering Department, Faculty of Engineering, Cairo University, Giza, Egypt
| | - Olfat G. Shaker
- Medical Biochemistry and Molecular Biology Department, Faculty of Medicine, Cairo University, Cairo, Egypt
|
8
|
Strand NS, Allen JM, Ghulam M, Taylor MR, Munday RK, Carrillo M, Movsesyan A, Zayas RM. Dissecting the function of Cullin-RING ubiquitin ligase complex genes in planarian regeneration. Dev Biol 2018; 433:210-217. [PMID: 29291974 DOI: 10.1016/j.ydbio.2017.10.011] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 06/06/2017] [Revised: 09/25/2017] [Accepted: 10/11/2017] [Indexed: 12/26/2022]
Abstract
The ubiquitin system plays a role in nearly every aspect of eukaryotic cell biology. The enzymes responsible for transferring ubiquitin onto specific substrates are the E3 ubiquitin ligases, a large and diverse family of proteins whose biological roles and target substrates remain largely undefined. Studies using model organisms indicate that ubiquitin signaling mediates key steps in developmental processes and tissue regeneration. Here, we used the freshwater planarian, Schmidtea mediterranea, to investigate the role of Cullin-RING ubiquitin ligase (CRL) complexes in stem cell regulation during regeneration. We identified six S. mediterranea cullin genes, and used RNAi to uncover roles for homologs of Cullin-1, -3 and -4 in planarian regeneration. The cullin-1 RNAi phenotype included defects in blastema formation, organ regeneration, lesions, and lysis. To further investigate the function of cullin-1-mediated cellular processes in planarians, we examined genes encoding the adaptor protein Skp1 and F-box substrate-recognition proteins that are predicted to partner with Cullin-1. RNAi against skp1 resulted in phenotypes similar to cullin-1 RNAi, and an RNAi screen of the F-box genes identified 19 genes that recapitulated aspects of cullin-1 RNAi, including ones that in mammals are involved in stem cell regulation and cancer biology. Our data provide evidence that CRLs play discrete roles in regenerative processes and provide a platform to investigate how CRLs regulate stem cells in vivo.
Affiliation(s)
- Nicholas S Strand
- Department of Biology, San Diego State University, San Diego, CA 92182, USA
| | - John M Allen
- Department of Biology, San Diego State University, San Diego, CA 92182, USA
| | - Mahjoobah Ghulam
- Department of Biology, San Diego State University, San Diego, CA 92182, USA
| | - Matthew R Taylor
- Department of Biology, San Diego State University, San Diego, CA 92182, USA
| | - Roma K Munday
- Department of Biology, San Diego State University, San Diego, CA 92182, USA
| | - Melissa Carrillo
- Department of Biology, San Diego State University, San Diego, CA 92182, USA
| | - Artem Movsesyan
- Department of Biology, San Diego State University, San Diego, CA 92182, USA
| | - Ricardo M Zayas
- Department of Biology, San Diego State University, San Diego, CA 92182, USA
|
9
|
Hsieh AR, Chen DP, Chattopadhyay AS, Li YJ, Chang CC, Fann CSJ. A non-threshold region-specific method for detecting rare variants in complex diseases. PLoS One 2017; 12:e0188566. [PMID: 29190701 PMCID: PMC5708778 DOI: 10.1371/journal.pone.0188566] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/07/2017] [Accepted: 11/09/2017] [Indexed: 11/23/2022] Open
Abstract
A region-specific method, the NTR (non-threshold rare) variant detection method, was developed; it does not use a threshold for defining rare variants and accounts for directions of effects. NTR also considers linkage disequilibrium within the region and accommodates common and rare variants simultaneously. NTR weighs variants according to minor allele frequency and odds ratio to combine the effects of common and rare variants on disease occurrence into a single score and provides a test statistic to assess the significance of the score. In the simulations, the power of NTR increased as the effect size increased, and the type I error of the method was well controlled. Moreover, NTR was compared with several other existing methods, including the combined multivariate and collapsing method (CMC), the weighted sum statistic method (WSS), the sequence kernel association test (SKAT), and its modification, SKAT-O. NTR yielded comparable or better power in simulations, especially when the effects of linkage disequilibrium between variants were at least moderate. In an analysis of diabetic nephropathy data, NTR detected more confirmed disease-related genes than the other aforementioned methods. NTR can thus be used as a complementary tool to help in dissecting the etiology of complex diseases.
Affiliation(s)
- Ai-Ru Hsieh
- Graduate Institute of Biostatistics, China Medical University, Taichung, Taiwan
| | - Dao-Peng Chen
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei, Taiwan
| | | | - Ying-Ju Li
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei, Taiwan
| | - Chien-Ching Chang
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei, Taiwan
| | - Cathy S. J. Fann
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei, Taiwan
|
10
|
Cheng W, Guo Z, Zhang X, Wang W. CGC: A Flexible and Robust Approach to Integrating Co-Regularized Multi-Domain Graph for Clustering. ACM Transactions on Knowledge Discovery from Data 2016; 10:46. [PMID: 29081726 PMCID: PMC5658064 DOI: 10.1145/2903147] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/01/2014] [Accepted: 03/01/2016] [Indexed: 06/07/2023]
Abstract
Multi-view graph clustering aims to enhance clustering performance by integrating heterogeneous information collected in different domains. Each domain provides a different view of the data instances. Leveraging cross-domain information has been demonstrated to be an effective way to achieve better clustering results. Despite the previous success, existing multi-view graph clustering methods usually assume that different views are available for the same set of instances. Thus, instances in different domains can be treated as having a strict one-to-one relationship. In many real-life applications, however, data instances in one domain may correspond to multiple instances in another domain. Moreover, relationships between instances in different domains may be associated with weights based on prior (partial) knowledge. In this paper, we propose a flexible and robust framework, CGC (Co-regularized Graph Clustering), based on non-negative matrix factorization (NMF), to tackle these challenges. CGC has several advantages over the existing methods. First, it supports many-to-many cross-domain instance relationships. Second, it incorporates weights on cross-domain relationships. Third, it allows partial cross-domain mapping so that graphs in different domains may have different sizes. Finally, it provides users with the extent to which the cross-domain instance relationship violates the in-domain clustering structure, and thus enables users to re-evaluate the consistency of the relationship. We develop an efficient optimization method that is guaranteed to find the global optimal solution with a given confidence requirement. The proposed method can automatically identify noisy domains and assign smaller weights to them. This helps to obtain an optimal graph partition for the focused domain. Extensive experimental results on UCI benchmark data sets, newsgroup data sets and biological interaction networks demonstrate the effectiveness of our approach.
Affiliation(s)
| | | | | | - Wei Wang
- University of California, Los Angeles
|
11
|
Repnik K, Potočnik U. eQTL analysis links inflammatory bowel disease associated 1q21 locus to ECM1 gene. J Appl Genet 2016; 57:363-72. [PMID: 26738999 DOI: 10.1007/s13353-015-0334-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/30/2015] [Revised: 12/16/2015] [Accepted: 12/18/2015] [Indexed: 12/11/2022]
Abstract
Genome-wide association studies (GWAS) have been highly successful in inflammatory bowel disease (IBD), with 163 confirmed associations so far. We used expression quantitative trait loci (eQTL) mapping to analyze IBD-associated regions for which the causative gene is still unknown. First, we performed an extensive literature search and in silico analysis of published GWAS in IBD and eQTL studies and extracted 402 IBD-associated SNPs assigned to 208 candidate loci, and 9562 eQTL correlations. When crossing GWAS and eQTL data we found that for 50% of loci there is no eQTL gene, while for 31.2% we can determine one gene, for 11.1% two genes and for the remaining 7.7% three or more genes. Based on that, we selected loci with one, two, and three or more eQTL genes and analyzed them in peripheral blood lymphocytes and intestinal tissue samples of 606 Slovene patients with IBD and in 449 controls. Association analysis of the selected SNPs showed statistical significance with at least one phenotype for three (rs2631372 and rs1050152 on the 5q locus and rs13294 on the 1q locus) out of six selected SNPs. Furthermore, with eQTL analysis of selected chromosomal regions, we confirmed a link between SNP and gene for four (SLC22A5 on 5q, ECM1 on 1q, ORMDL3 on 17q, and PUS10 on the 2p locus) out of five selected regions. For the 1q21 locus, we confirmed ECM1 as the most plausible gene from this region to be involved in the pathogenesis of IBD and thereby contributed a new eQTL correlation from this genomic region.
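The percentage breakdown of loci by number of eQTL genes can be turned back into approximate locus counts. The sketch below is a consistency check under stated assumptions: the 208 loci and the four percentages come from the abstract, while the implied counts are an inference obtained by rounding and are not figures reported by the authors; all names are hypothetical.

```python
TOTAL_LOCI = 208  # candidate loci reported in the abstract

# Percentage of loci by number of eQTL genes, as reported in the abstract.
reported_pct = {
    "no eQTL gene": 50.0,
    "one gene": 31.2,
    "two genes": 11.1,
    "three or more genes": 7.7,
}

# Implied locus counts: an inference by rounding, not figures from the authors.
implied_counts = {
    label: round(TOTAL_LOCI * pct / 100) for label, pct in reported_pct.items()
}

pct_sum = sum(reported_pct.values())
count_sum = sum(implied_counts.values())

for label, count in implied_counts.items():
    print(f"{label}: ~{count} loci")
print(f"percentages sum to {pct_sum:.1f}; counts sum to {count_sum}")
```

The implied counts (104, 65, 23 and 16) sum to 208, so the reported percentages are mutually consistent with the stated number of loci.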
Affiliation(s)
- Katja Repnik
- Faculty of Medicine, Center for Human Molecular Genetics and Pharmacogenomics, University of Maribor, Taborska ulica 8, 2000, Maribor, Slovenia; Faculty for Chemistry and Chemical Engineering, University of Maribor, Maribor, Slovenia
| | - Uroš Potočnik
- Faculty of Medicine, Center for Human Molecular Genetics and Pharmacogenomics, University of Maribor, Taborska ulica 8, 2000, Maribor, Slovenia. .,Faculty for Chemistry and Chemical Engineering, University of Maribor, Maribor, Slovenia.
| |
Collapse
|
12
|
Zeng T, Zhang W, Yu X, Liu X, Li M, Chen L. Big-data-based edge biomarkers: study on dynamical drug sensitivity and resistance in individuals. Brief Bioinform 2015; 17:576-92. [DOI: 10.1093/bib/bbv078] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 04/15/2015] [Indexed: 12/21/2022]
|
13
|
Koder S, Repnik K, Ferkolj I, Pernat C, Skok P, Weersma RK, Potočnik U. Genetic polymorphism in ATG16L1 gene influences the response to adalimumab in Crohn's disease patients. Pharmacogenomics 2015; 16:191-204. [PMID: 25712183 DOI: 10.2217/pgs.14.172] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.1] [Received: 09/24/2014] [Accepted: 12/01/2014] [Indexed: 02/07/2023]
Abstract
AIM To see if SNPs could help predict response to biological therapy using adalimumab (ADA) in Crohn's disease (CD). MATERIALS & METHODS IBDQ index and CRP levels were used to monitor therapy response. We genotyped 31 CD-associated genes in 102 Slovenian CD patients. RESULTS The strongest association for treatment response defined as decrease in CRP levels was found for ATG16L1 SNP rs10210302. Additional SNPs in 7 out of 31 tested CD-associated genes (PTGER4, CASP9, IL27, C11orf30, CCNY, IL13, NR1I2) showed suggestive association with ADA response. CONCLUSION Our results suggest ADA response in CD patients is genetically predisposed by SNPs in CD risk genes and suggest ATG16L1 as most promising candidate gene for drug response in ADA treatment. Original submitted 24 September 2014; Revision submitted 1 December 2014.
Affiliation(s)
- Silvo Koder
- University Medical Centre Maribor, Ljubljanska 5, Maribor, Slovenia
|
14
|
Wen SH, Yeh JI. Cohen's h for detection of disease association with rare genetic variants. BMC Genomics 2014; 15:875. [PMID: 25294186 PMCID: PMC4198687 DOI: 10.1186/1471-2164-15-875] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/20/2014] [Accepted: 10/03/2014] [Indexed: 11/16/2022]
Abstract
Background The power of genome-wide association studies declines when the minor allele frequency (MAF) is below 0.05. Here, we propose the use of Cohen’s h for detecting disease-associated rare variants. The variance-stabilizing effect of the arcsine square root transformation of MAFs used to generate Cohen’s h contributes to statistical power in rare variant analysis. We re-analyzed published datasets, one microarray-based and one sequencing-based, and used simulation to compare the performance of Cohen’s h with the risk difference (RD) and odds ratio (OR). Results The analysis showed that the type 1 error rate of Cohen’s h was as expected, and that Cohen’s h and RD were both less biased and had higher power than OR. The advantage of Cohen’s h was more pronounced when MAF was less than 0.01. Conclusions Cohen’s h can increase the power to find genetic associations between rare variants and diseases, especially when MAF is less than 0.01.
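For readers unfamiliar with the statistic, Cohen's h is conventionally defined as the difference between two arcsine-square-root-transformed proportions. The sketch below uses that standard textbook definition; it is not the authors' code, and the allele frequencies are hypothetical values chosen only for illustration.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference of arcsine-square-root-transformed proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


def risk_difference(p1: float, p2: float) -> float:
    """Plain difference in proportions, for comparison."""
    return p1 - p2


# Hypothetical case/control MAFs with the same risk difference (0.005).
rare_cases, rare_controls = 0.010, 0.005
common_cases, common_controls = 0.305, 0.300

h_rare = cohens_h(rare_cases, rare_controls)
h_common = cohens_h(common_cases, common_controls)

print("RD (rare):  ", risk_difference(rare_cases, rare_controls))
print("RD (common):", risk_difference(common_cases, common_controls))
print("h (rare):   ", h_rare)
print("h (common): ", h_common)
```

With these made-up frequencies the risk difference is identical in both scenarios, while h is larger for the rare variant, which illustrates how the transformation stretches differences near zero.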
Affiliation(s)
- Jih-I Yeh
- Department of Molecular Biology and Human Genetics, Tzu-Chi University, 701, Sec 3, Chung-Yang Rd, Hualien 97004, Taiwan.
|
15
|
Korczowska I. Rheumatoid arthritis susceptibility genes: An overview. World J Orthop 2014; 5:544-549. [PMID: 25232530 PMCID: PMC4133460 DOI: 10.5312/wjo.v5.i4.544] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Received: 01/22/2014] [Revised: 05/29/2014] [Accepted: 06/16/2014] [Indexed: 02/06/2023]
Abstract
Rheumatoid arthritis (RA) is a chronic, inflammatory autoimmune disease sustained by genetic factors. Various aspects of the genetic contribution to the pathogenesis and outcome of RA are still unknown. Several genes have been implicated so far in the pathogenesis of RA. Apart from human leukocyte antigen, large genome-wide association studies have identified many loci involved in RA pathogenesis. These genes include protein tyrosine phosphatase, nonreceptor type 22; peptidyl arginine deiminase type IV; signal transducer and activator of transcription 4; cytotoxic T-lymphocyte-associated protein 4; tumor necrosis factor-receptor associated factor 1/complement component 5; tumor necrosis factor; and others. It is important to determine whether a combination of RA risk alleles is able to identify patients who will develop certain clinical outcomes, such as myocardial infarction, severe infection or lymphoma, as well as to identify patients who will respond to biological medication therapy.
|
16
|
Gonzalez S, Camarillo C, Rodriguez M, Ramirez M, Zavala J, Armas R, Contreras SA, Contreras J, Dassori A, Almasy L, Flores D, Jerez A, Raventós H, Ontiveros A, Nicolini H, Escamilla M. A genome-wide linkage scan of bipolar disorder in Latino families identifies susceptibility loci at 8q24 and 14q32. Am J Med Genet B Neuropsychiatr Genet 2014; 165B:479-91. [PMID: 25044503 DOI: 10.1002/ajmg.b.32251] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 01/28/2014] [Accepted: 05/27/2014] [Indexed: 12/14/2022]
Abstract
A genome-wide nonparametric linkage screen was performed to localize Bipolar Disorder (BP) susceptibility loci in a sample of 3757 individuals of Latino ancestry. The sample included 963 individuals with BP phenotype (704 relative pairs) from 686 families recruited from the US, Mexico, Costa Rica, and Guatemala. Non-parametric analyses were performed over a 5 cM grid with an average genetic coverage of 0.67 cM. Multipoint analyses were conducted across the genome using non-parametric Kong & Cox LOD scores along with S_all statistics for all relative pairs. Suggestive and significant genome-wide thresholds were calculated based on 1000 simulations. Single-marker association tests in the presence of linkage were performed assuming a multiplicative model with a population prevalence of 2%. We identified two genome-wide significant susceptibility loci for BP at 8q24 and 14q32, and a third suggestive locus at 2q13-q14. Within these three linkage regions, the top associated single marker (rs1847694, P = 2.40 × 10^-5) is located 195 kb upstream of DPP10 on chromosome 2. DPP10 is prominently expressed in brain neuronal populations, where it has been shown to bind and regulate Kv4-mediated A-type potassium channels. Taken together, these results provide additional evidence that 8q24, 14q32, and 2q13-q14 are susceptibility loci for BP and that these regions may be involved in the pathogenesis of BP in the Latino population.
Affiliation(s)
- Suzanne Gonzalez
- Center of Excellence for Neurosciences, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, Texas; Department of Psychiatry, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, Texas
|
17
|
Abstract
A novel haplotype association method is presented, and its power is demonstrated. Relying on a statistical model for linkage disequilibrium (LD), the method first infers ancestral haplotypes and their loadings at each marker for each individual. The loadings are then used to quantify local haplotype sharing between individuals at each marker. A statistical model was developed to link the local haplotype sharing and phenotypes to test for association. We devised a novel method to fit the LD model, reducing the complexity from putatively quadratic to linear (in the number of ancestral haplotypes). Therefore, the LD model can be fitted to all study samples simultaneously, and, consequently, our method is applicable to big data sets. Compared to existing haplotype association methods, our method integrated out phase uncertainty, avoided arbitrariness in specifying haplotypes, and had the same number of tests as the single-SNP analysis. We applied our method to data from the Wellcome Trust Case Control Consortium and discovered eight novel associations between seven gene regions and five disease phenotypes. Among these, GRIK4, which encodes a protein that belongs to the glutamate-gated ionic channel family, is strongly associated with both coronary artery disease and rheumatoid arthritis. A software package implementing methods described in this article is freely available at http://www.haplotype.org.
|
18
|
Abstract
The "Bermuda triangle" of genetics, environment and autoimmunity is involved in the pathogenesis of rheumatoid arthritis (RA). Various aspects of the genetic contribution to the etiology, pathogenesis and outcome of RA are discussed in this review. The heritability of RA has been estimated to be about 60%, while the contribution of HLA to heritability has been estimated to be 11-37%. Apart from known shared epitope (SE) alleles, such as HLA-DRB1*01 and DRB1*04, other HLA alleles, such as HLA-DRB1*13 and DRB1*15, have been linked to RA susceptibility. A novel SE classification divides SE alleles into S1, S2, S3P and S3D groups, where primarily the S2 and S3P groups have been associated with predisposition to seropositive RA. The most relevant non-HLA genes carrying single nucleotide polymorphisms (SNPs) associated with RA include PTPN22, IL23R, TRAF1, CTLA4, IRF5, STAT4, CCR6 and PADI4. Large genome-wide association studies (GWAS) have identified more than 30 loci involved in RA pathogenesis. HLA and some non-HLA genes may differentiate between anti-citrullinated protein antibody (ACPA) seropositive and seronegative RA. Genetic susceptibility has also been associated with environmental factors, primarily smoking. Some GWAS carried out in rodent models of arthritis have confirmed the role of human genes. For example, in the collagen-induced arthritis (CIA) and proteoglycan-induced arthritis (PgIA) models, two important loci (Pgia26/Cia5 and Pgia2/Cia2/Cia3, corresponding to the human PTPN22/CD2 and TRAF1/C5 loci, respectively) have been identified. Finally, pharmacogenomics has identified SNPs or multiple genetic signatures that may be associated with responses to traditional disease-modifying drugs and biologics.
|
19
|
Zeng P, Zhao Y, Zhang L, Huang S, Chen F. Rare variants detection with kernel machine learning based on likelihood ratio test. PLoS One 2014; 9:e93355. [PMID: 24675868 PMCID: PMC3968153 DOI: 10.1371/journal.pone.0093355] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/16/2013] [Accepted: 03/03/2014] [Indexed: 11/18/2022]
Abstract
This paper mainly utilizes likelihood-based tests to detect rare variants associated with a continuous phenotype under the framework of kernel machine learning. Both the likelihood ratio test (LRT) and the restricted likelihood ratio test (ReLRT) are investigated. The relationship between the kernel machine learning and the mixed effects model is discussed. By using the eigenvalue representation of LRT and ReLRT, their exact finite sample distributions are obtained in a simulation manner. Numerical studies are performed to evaluate the performance of the proposed approaches under the contexts of standard mixed effects model and kernel machine learning. The results have shown that the LRT and ReLRT can control the type I error correctly at the given α level. The LRT and ReLRT consistently outperform the SKAT, regardless of the sample size and the proportion of the negative causal rare variants, and suffer from fewer power reductions compared to the SKAT when both positive and negative effects of rare variants are present. The LRT and ReLRT performed under the context of kernel machine learning have slightly higher powers than those performed under the context of standard mixed effects model. We use the Genetic Analysis Workshop 17 exome sequencing SNP data as an illustrative example. Some interesting results are observed from the analysis. Finally, we give the discussion.
Affiliation(s)
- Ping Zeng
- Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical College, Xuzhou, Jiangsu, China
- Yang Zhao
- Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Liwei Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Shuiping Huang
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical College, Xuzhou, Jiangsu, China
- Feng Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
|
20
|
Test of rare variant association based on affected sib-pairs. Eur J Hum Genet 2014; 23:229-37. [PMID: 24667785 DOI: 10.1038/ejhg.2014.43] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/07/2013] [Revised: 11/06/2013] [Accepted: 12/30/2013] [Indexed: 11/08/2022]
Abstract
With the development of sequencing techniques, there is increasing interest in detecting associations between rare variants and complex traits. Quite a few statistical methods for this purpose have been developed for unrelated individuals, whereas methods for detecting rare variant associations under family-based designs have received less attention. Recent studies show that rare disease variants will be enriched in family data, and thus family-based designs may improve power to detect rare variant associations. In this article, we propose a novel test of association between the optimally weighted combination of variants and traits of interest for affected sib-pairs. The optimal weights are analytically derived and can be calculated from sampled genotypes and phenotypes. Based on the optimal weights, the proposed method is robust to the directions of the effects of causal variants and is less affected by neutral variants than existing methods are. Our simulation results show that, in all the cases, the proposed method is substantially more powerful than existing methods based on unrelated individuals and existing methods based on affected sib-pairs.
|
21
|
Verma SK, Deshmukh V, Liu P, Nutter CA, Espejo R, Hung ML, Wang GS, Yeo GW, Kuyumcu-Martinez MN. Reactivation of fetal splicing programs in diabetic hearts is mediated by protein kinase C signaling. J Biol Chem 2013; 288:35372-86. [PMID: 24151077 DOI: 10.1074/jbc.m113.507426] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.9] [Indexed: 12/25/2022]
Abstract
Diabetic cardiomyopathy is one of the complications of diabetes that eventually leads to heart failure and death. Aberrant activation of PKC signaling contributes to diabetic cardiomyopathy by mechanisms that are poorly understood. Previous reports indicate that PKC is implicated in alternative splicing regulation. Therefore, we wanted to test whether PKC activation in diabetic hearts induces alternative splicing abnormalities. Here, using RNA sequencing we identified a set of 22 alternative splicing events that undergo a developmental switch in splicing, and we confirmed that splicing reverts to an embryonic pattern in adult diabetic hearts. This network of genes has important functions in RNA metabolism and in developmental processes such as differentiation. Importantly, PKC isozymes α/β control alternative splicing of these genes via phosphorylation and up-regulation of the RNA-binding proteins CELF1 and Rbfox2. Using a mutant of CELF1, we show that phosphorylation of CELF1 by PKC is necessary for regulation of splicing events altered in diabetes. In summary, our studies indicate that activation of PKCα/β in diabetic hearts contributes to the genome-wide splicing changes through phosphorylation and up-regulation of CELF1/Rbfox2 proteins. These findings provide a basis for PKC-mediated cardiac pathogenesis under diabetic conditions.
Affiliation(s)
- Sunil K Verma
- From the Departments of Biochemistry and Molecular Biology and
|
22
|
Clarke GM, Rivas MA, Morris AP. A flexible approach for the analysis of rare variants allowing for a mixture of effects on binary or quantitative traits. PLoS Genet 2013; 9:e1003694. [PMID: 23966874 PMCID: PMC3744430 DOI: 10.1371/journal.pgen.1003694] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 12/21/2012] [Accepted: 06/19/2013] [Indexed: 11/18/2022]
Abstract
Multiple rare variants either within or across genes have been hypothesised to collectively influence complex human traits. The increasing availability of high throughput sequencing technologies offers the opportunity to study the effect of rare variants on these traits. However, appropriate and computationally efficient analytical methods are required to account for collections of rare variants that display a combination of protective, deleterious and null effects on the trait. We have developed a novel method for the analysis of rare genetic variation in a gene, region or pathway that, by simply aggregating summary statistics at each variant, can: (i) test for the presence of a mixture of effects on a trait; (ii) be applied to both binary and quantitative traits in population-based and family-based data; (iii) adjust for covariates to allow for non-genetic risk factors and; (iv) incorporate imputed genetic variation. In addition, for preliminary identification of promising genes, the method can be applied to association summary statistics, available from meta-analysis of published data, for example, without the need for individual level genotype data. Through simulation, we show that our method is immune to the presence of bi-directional effects, with no apparent loss in power across a range of different mixtures, and can achieve greater power than existing approaches as long as summary statistics at each variant are robust. We apply our method to investigate association of type-1 diabetes with imputed rare variants within genes in the major histocompatibility complex using genotype data from the Wellcome Trust Case Control Consortium.
Affiliation(s)
- Geraldine M Clarke
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom.
|
23
|
Lin WY, Yi N, Lou XY, Zhi D, Zhang K, Gao G, Tiwari HK, Liu N. Haplotype kernel association test as a powerful method to identify chromosomal regions harboring uncommon causal variants. Genet Epidemiol 2013; 37:560-70. [PMID: 23740760 DOI: 10.1002/gepi.21740] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 12/13/2012] [Revised: 05/01/2013] [Accepted: 05/06/2013] [Indexed: 01/09/2023]
Abstract
For most complex diseases, the fraction of heritability that can be explained by the variants discovered from genome-wide association studies is minor. Although the so-called "rare variants" (minor allele frequency [MAF] < 1%) have attracted increasing attention, they are unlikely to account for much of the "missing heritability" because very few people may carry these rare variants. The genetic variants that are likely to fill in the "missing heritability" include uncommon causal variants (MAF < 5%), which are generally untyped in association studies using tagging single-nucleotide polymorphisms (SNPs) or commercial SNP arrays. Developing powerful statistical methods can help to identify chromosomal regions harboring uncommon causal variants, while bypassing the genome-wide or exome-wide next-generation sequencing. In this work, we propose a haplotype kernel association test (HKAT) that is equivalent to testing the variance component of random effects for distinct haplotypes. With an appropriate weighting scheme given to haplotypes, we can further enhance the ability of HKAT to detect uncommon causal variants. With scenarios simulated according to the population genetics theory, HKAT is shown to be a powerful method for detecting chromosomal regions harboring uncommon causal variants.
Affiliation(s)
- Wan-Yu Lin
- Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
|
24
|
Génin E, Sahbatou M, Gazal S, Babron MC, Perdry H, Leutenegger AL. Could inbred cases identified in GWAS data succeed in detecting rare recessive variants where affected sib-pairs have failed? Hum Hered 2013; 74:142-52. [PMID: 23594492 DOI: 10.1159/000346790] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/19/2022]
Abstract
To detect fully penetrant rare recessive variants that could constitute Mendelian subentities of complex diseases, we propose a novel strategy, the HBD-GWAS strategy, which can be applied to genome-wide association study (GWAS) data. This strategy first involves the identification of inbred individuals among cases using the genome-wide SNP data and then focuses on these inbred affected individuals and searches for genomic regions of shared homozygosity by descent that could harbor rare recessive disease-causing variants. In this second step, analogous to homozygosity mapping, a heterogeneity lod-score, HFLOD, is computed to quantify the evidence of linkage provided by the data. In this paper, we evaluate this strategy theoretically under different scenarios and compare its performances with those of linkage analysis using affected sib-pair (ASP) data. If cases affected by these Mendelian subentities are not enriched in the sample of cases, the HBD-GWAS strategy has almost no power to detect them, unless they explain an important part of the disease prevalence. The HBD-GWAS strategy outperforms the ASP linkage strategy only in a very limited number of situations where there exists a strong allelic heterogeneity. When several rare recessive variants within the same gene are involved, the ASP design indeed often fails to detect the gene, whereas, by focusing on inbred individuals using the HBD-GWAS strategy, the gene might be detected provided very large samples of cases are available.
Affiliation(s)
- Emmanuelle Génin
- Inserm UMR-946, Genetic Variability and Human Diseases, Paris, France.
|
25
|
Quilter CR, Sargent CA, Bauer J, Bagga MR, Reiter CP, Hutchinson EL, Southwood OI, Evans G, Mileham A, Griffin DK, Affara NA. An association and haplotype analysis of porcine maternal infanticide: a model for human puerperal psychosis? Am J Med Genet B Neuropsychiatr Genet 2012; 159B:908-27. [PMID: 22976950 DOI: 10.1002/ajmg.b.32097] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 03/13/2012] [Accepted: 08/09/2012] [Indexed: 12/16/2022]
Abstract
An association analysis using the Illumina porcine SNP60 beadchip was performed to identify SNPs significantly associated with porcine maternal infanticide. We previously hypothesised that this was a good animal model for human puerperal psychosis, an extreme form of postnatal mood disorder. Animals were selected from carefully phenotyped unrelated infanticide and control groups (representing extremes of the phenotypic spectrum), from four different lines. Permutation and sliding window analyses and an analysis to see which haplotypes were in linkage disequilibrium (LD) were compared to identify concordant regions. Across all analyses, intervals on SSCs 1, 3, 4, 10, and 13 were constant, contained genes associated with psychiatric or neurological disorders and were significant in multiple lines. The strongest consistent candidate region (near genome-wide significance, GWS) across all analyses and all breeds was located on SSC3, with one peak at 23.4 Mb, syntenic to a candidate region for bipolar disorder, and another at 31.9 Mb, syntenic to a candidate region for human puerperal psychosis (16p13). From the haplotype/LD analysis, two regions reached GWS: the first on SSC4 (KHDRBS3 to FAM135B), which was significant (-logP 5.57) in one Duroc-based breed and is syntenic to a region in humans associated with cognition and neuroticism; the second on SSC15, which was significant (-log10P 5.68) in two breeds and contained PAX3, which is expressed in the brain.
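The significance values above are given on a negative-log scale. As a quick aid to reading them, the sketch below converts them back to P-values, assuming both figures are base-10 logarithms (the abstract writes "-logP" for the first and "-log10P" for the second, so base 10 for the first is an assumption). The function name is illustrative.

```python
import math


def p_from_neg_log10(neg_log10_p: float) -> float:
    """Convert a -log10(P) score back to the underlying P-value."""
    return 10 ** (-neg_log10_p)


# Scores reported in the abstract (base 10 assumed for both).
reported_scores = {"SSC4 region": 5.57, "SSC15 region": 5.68}

for region, score in reported_scores.items():
    p_value = p_from_neg_log10(score)
    # Round-trip check: taking -log10 of the result recovers the score.
    assert math.isclose(-math.log10(p_value), score)
    print(f"{region}: -log10(P) = {score}, P = {p_value:.2e}")
```

Under that assumption the two scores correspond to P-values of roughly 2.7 × 10^-6 and 2.1 × 10^-6.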
Affiliation(s)
- C R Quilter
- Human Molecular Genetics Group, Department of Pathology, University of Cambridge, Cambridge, UK.
|
26
|
Zhu X, Feng T, Elston R. Linkage-disequilibrium-based binning misleads the interpretation of genome-wide association studies. Am J Hum Genet 2012; 91:965-8; author reply 969-70. [PMID: 23122590 PMCID: PMC3487138 DOI: 10.1016/j.ajhg.2012.05.029] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/10/2012] [Revised: 05/10/2012] [Accepted: 05/10/2012] [Indexed: 11/26/2022]
Affiliation(s)
- Xiaofeng Zhu
- Department of Epidemiology and Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Tao Feng
- Department of Epidemiology and Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Robert C. Elston
- Department of Epidemiology and Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, USA
|
27
|
Zhang Y, Guan W, Pan W. Adjustment for population stratification via principal components in association analysis of rare variants. Genet Epidemiol 2012; 37:99-109. [PMID: 23065775 DOI: 10.1002/gepi.21691] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Received: 04/12/2012] [Revised: 09/11/2012] [Accepted: 09/13/2012] [Indexed: 11/07/2022]
Abstract
For unrelated samples, principal component (PC) analysis has been established as a simple and effective approach to adjusting for population stratification in association analysis of common variants (CVs, with minor allele frequencies MAF > 5%). However, it is less clear how it would perform in analysis of low-frequency variants (LFVs, MAF between 1% and 5%), or of rare variants (RVs, MAF < 1%). Furthermore, with next-generation sequencing data, it is unknown whether PCs should be constructed based on CVs, LFVs, or RVs. In this study, we used the 1000 Genomes Project sequence data to explore the construction of PCs and their use in association analysis of LFVs or RVs for unrelated samples. It is shown that a few top PCs based on either CVs or LFVs could separate two continental groups, European and African samples, but those based on only RVs performed less well. When applied to several association tests in simulated data with population stratification, using PCs based on either CVs or LFVs was effective in controlling Type I error rates, while nonadjustment led to inflated Type I error rates. Perhaps the most interesting observation is that, although the PCs based on LFVs could better separate the two continental groups than those based on CVs, the use of the former could lead to overadjustment in the sense of substantial power loss in the absence of population stratification; in contrast, we did not see any problem with the use of the PCs based on CVs in all our examples.
Affiliation(s)
- Yiwei Zhang
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota 55455-0392, USA
|
28
|
Michou L, Cornélis F, Levesque JM, Bombardieri S, Balsa A, Westhovens R, Barrera P, Alves H, van de Putte L, Migliorini P, Bardin T, Petit-Teixeira E, Fernandes MJ. A genetic association study of the CLEC12A gene in rheumatoid arthritis. Joint Bone Spine 2012; 79:451-6. [DOI: 10.1016/j.jbspin.2011.12.012] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 10/05/2011] [Accepted: 12/17/2011] [Indexed: 12/14/2022]
|
29
|
Hu X, Daly M. What have we learned from six years of GWAS in autoimmune diseases, and what is next? Curr Opin Immunol 2012; 24:571-5. [PMID: 23017373 DOI: 10.1016/j.coi.2012.09.001] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Received: 08/13/2012] [Revised: 08/30/2012] [Accepted: 09/04/2012] [Indexed: 01/03/2023]
Abstract
Genome-wide association studies (GWAS) have discovered hundreds of common genetic variants that predispose humans to autoimmune diseases, opening up unprecedented potential for elucidating the pathways and processes of disease. To understand the role of these variants in susceptibility, we need to derive mechanistic insight by integration of genetic results with other biological data types and also with careful functional studies. In many cases, such studies have highlighted coherent biological processes at a high level and elucidated specific mechanisms that contribute to autoimmunity and inflammation. The understanding of the genetic component of autoimmune etiology will become more complete as fine-mapping and sequencing data become readily available. A comprehensive catalog of human immune phenotypes could provide a functional basis for assessing genetic influence on immune function and variation in response to therapeutic interventions, as well as for rationally designing new targeted therapeutics.
Affiliation(s)
- Xinli Hu
- Harvard Medical School, Harvard-MIT Division of Health Sciences and Technology, Boston, MA 02114, USA
|
30
|
Lin WY, Yi N, Zhi D, Zhang K, Gao G, Tiwari HK, Liu N. Haplotype-based methods for detecting uncommon causal variants with common SNPs. Genet Epidemiol 2012; 36:572-82. [PMID: 22706849 DOI: 10.1002/gepi.21650] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 01/29/2012] [Revised: 04/19/2012] [Accepted: 05/09/2012] [Indexed: 01/01/2023]
Abstract
Detecting uncommon causal variants (minor allele frequency [MAF] < 5%) is difficult with commercial single-nucleotide polymorphism (SNP) arrays that are designed to capture common variants (MAF > 5%). Haplotypes can provide insights into underlying linkage disequilibrium (LD) structure and can tag uncommon variants that are not well tagged by common variants. In this work, we propose a wei-SIMc-matching test that inversely weights haplotype similarities with the estimated standard deviation of haplotype counts to boost the power of similarity-based approaches for detecting uncommon causal variants. We then compare the power of the wei-SIMc-matching test with that of several popular haplotype-based tests, including four other similarity-based tests, a global score test for haplotypes (global), a test based on the maximum score statistic over all haplotypes (max), and two newly proposed haplotype-based tests for rare variant detection. With systematic simulations under a wide range of LD patterns, the results show that wei-SIMc-matching and global are the two most powerful tests. Among these two tests, wei-SIMc-matching has reliable asymptotic P-values, whereas global needs permutations to obtain reliable P-values when the frequencies of some haplotype categories are low or when the trait is skewed. Therefore, we recommend wei-SIMc-matching for detecting uncommon causal variants with surrounding common SNPs, in light of its power and computational feasibility.
Affiliation(s)
- Wan-Yu Lin
- Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
|
31
|
Weatherford ET, Liu X, Sigmund CD. Regulation of renin expression by the orphan nuclear receptors Nr2f2 and Nr2f6. Am J Physiol Renal Physiol 2012; 302:F1025-33. [PMID: 22278040 PMCID: PMC3330716 DOI: 10.1152/ajprenal.00362.2011] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 07/02/2011] [Accepted: 01/21/2012] [Indexed: 01/13/2023] Open
Abstract
Understanding the transcriptional mechanisms of renin expression is key to understanding the regulation of the renin-angiotensin system. We previously identified the nuclear receptors RAR/RXR and Nr2f6 (EAR2) as positive and negative transcriptional regulators of renin expression, respectively (Liu X, Huang X, Sigmund CD. Circ Res 92: 1033-1040, 2003). Both mediate their effects through a hormone response element (HRE) within the renin enhancer. Here, we determined whether another nuclear receptor, Nr2f2 (Coup-TFII, Arp-1), identified in a screen of proteins that bind the HRE, also regulates renin expression. Luciferase assays indicate that Nr2f2 negatively regulates the renin promoter more potently than Nr2f6. Gel-shift and chromatin immunoprecipitation (ChIP) indicate that Nr2f2 and Nr2f6 can bind directly to the renin enhancer through the HRE. Surprisingly, baseline expression of endogenous renin was not affected when Nr2f2 was knocked down in As4.1 cells, whereas knockdown of Nr2f6 increased renin expression twofold. Interestingly, however, knockdown of Nr2f2 augmented the induction of renin expression caused by retinoic acid. These data indicate that both Nr2f6 and Nr2f2 can negatively regulate the renin promoter, under baseline conditions and in response to physiological cues, respectively. Therefore, Nr2f2 may require an initiating signal that results in a change at the chromatin level or activation of another transcription factor to exert its effects. We conclude that both Nr2f2 and Nr2f6 negatively regulate renin promoter activity, but may do so by divergent mechanisms.
Affiliation(s)
- Eric T Weatherford
- Department of Pharmacology, Roy J. and Lucille A. Carver College of Medicine, University of Iowa, Iowa City, IA 52242, USA
|
32
|
Zintzaras E, Song YB, Zheng WL, Jiang L, Ma WL. Is there evidence to claim or deny association between variants of the multidrug resistance gene (MDR1 or ABCB1) and inflammatory bowel disease? Inflamm Bowel Dis 2012; 18:562-72. [PMID: 21887726 DOI: 10.1002/ibd.21728] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 03/14/2011] [Accepted: 03/16/2011] [Indexed: 12/14/2022]
Abstract
BACKGROUND Inflammatory bowel disease (IBD) is a complex disease with a genetic background. Crohn's disease (CD) and ulcerative colitis (UC) are the two main types of IBD. There is indication that variants in the MDR1 gene are associated with development of IBD. However, the 20 published genetic association studies (GAS) for the three most popular variants in the MDR1 gene (C3435T, G2677T/A, and C1236T) have produced inconclusive results. METHODS In order to decrease the uncertainty of pooled risk effects and to explore the trend and stability of the risk effects, a meticulous meta-analysis, including cumulative and recursive cumulative meta-analysis, of the GAS related to the MDR1 gene with susceptibility to IBD was conducted. The risk effects were estimated based on the odds ratio (OR) of the allele contrast and the generalized odds ratio (OR(G)). RESULTS The analysis showed a marginally significant association for the C3435T variant in UC: the risk estimate for the allele contrast was OR = 1.11 (1.00-1.22) and OR(G) = 1.12 (1.01-1.27), indicating that a subject with high mutational load has a 12% higher probability of being diseased. The respective cumulative meta-analysis indicated a downward trend of association as evidence accumulates, with the association being significant during the whole published period. The cumulative meta-analysis for the other variants showed lack of any trend of association. However, the recursive cumulative meta-analysis showed that there is no sufficient evidence for denying or claiming an association for all variants. CONCLUSIONS More evidence is needed to draw safe conclusions regarding the association of MDR1 variants and development of IBD.
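The allele-contrast odds ratio and the idea of a cumulative meta-analysis mentioned in this abstract can be illustrated with a small sketch. It is not the authors' procedure: the function names are hypothetical, the confidence interval is a plain Wald interval, and the pooling uses a fixed-effect inverse-variance model, which is an assumption because the abstract does not state which model was used.

```python
import math

def allele_contrast_or(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds ratio of the allele contrast with a 95% Wald interval.

    Arguments are allele counts (not genotype counts) in cases and controls.
    """
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    se = math.sqrt(1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se)

def cumulative_pooled_or(studies):
    """Pooled OR after each study is added, in the order given.

    `studies` is a list of (case_alt, case_ref, ctrl_alt, ctrl_ref) tuples,
    assumed to be sorted by publication date. Fixed-effect inverse-variance
    weighting is an illustrative assumption.
    """
    sum_weights = 0.0
    sum_weighted_log_or = 0.0
    pooled = []
    for case_alt, case_ref, ctrl_alt, ctrl_ref in studies:
        log_or = math.log((case_alt * ctrl_ref) / (case_ref * ctrl_alt))
        variance = 1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref
        weight = 1 / variance
        sum_weights += weight
        sum_weighted_log_or += weight * log_or
        pooled.append(math.exp(sum_weighted_log_or / sum_weights))
    return pooled
```

Plotting the returned list against publication order would show whether the pooled estimate drifts toward or away from 1 as studies accumulate, which is the kind of trend the abstract describes.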
Affiliation(s)
- Elias Zintzaras
- Department of Biomathematics, University of Thessaly School of Medicine, Larissa, Greece.
|
33
|
Abstract
The limitations of genome-wide association (GWA) studies that are based on the common disease common variants (CDCV) hypothesis have motivated geneticists to test the hypothesis that rare variants contribute to the variation of common diseases, i.e., common disease/rare variants (CDRV). The newly developed high-throughput sequencing technologies have made the studies of rare variants practicable. Statistical approaches to test associations between a phenotype and rare variants are quickly developing. The central idea of these methods is to test a set of rare variants in a defined region or regions by collapsing or aggregating rare variants, thereby improving the statistical power. In this chapter, we introduce these methods as well as their applications in practice.
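The collapsing idea described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not a method taken from the chapter: the function names, the 1% frequency threshold, and the choice of a 2x2 chi-square test on carrier status are all hypothetical choices made for the example.

```python
import math

def collapse_carriers(genotypes, mafs, maf_threshold=0.01):
    """Collapse rare variants in a region into one carrier indicator per person.

    genotypes: one list per individual of minor-allele counts (0, 1 or 2).
    mafs: minor allele frequency of each variant, in the same column order.
    """
    rare = [j for j, freq in enumerate(mafs) if freq < maf_threshold]
    return [int(any(row[j] > 0 for j in rare)) for row in genotypes]

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom

def burden_test(genotypes, phenotypes, mafs, maf_threshold=0.01):
    """Compare rare-allele carrier frequency between cases (1) and controls (0)."""
    carriers = collapse_carriers(genotypes, mafs, maf_threshold)
    a = sum(1 for c, p in zip(carriers, phenotypes) if c == 1 and p == 1)
    b = sum(1 for c, p in zip(carriers, phenotypes) if c == 1 and p == 0)
    c_ = sum(1 for c, p in zip(carriers, phenotypes) if c == 0 and p == 1)
    d = sum(1 for c, p in zip(carriers, phenotypes) if c == 0 and p == 0)
    chi2 = chi_square_2x2(a, b, c_, d)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Testing one collapsed indicator per region, rather than each rare variant separately, is what gives these approaches their power; the asymptotic p-value here would be unreliable for small counts, where an exact or permutation test is the safer choice.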
Affiliation(s)
- Tao Feng
- Department of Epidemiology and Biostatistics, Case Western Reserve University School of Medicine, Cleveland, OH, USA.
|
34
|
Wang X, Prins BP, Sõber S, Laan M, Snieder H. Beyond genome-wide association studies: new strategies for identifying genetic determinants of hypertension. Curr Hypertens Rep 2011; 13:442-51. [PMID: 21953487 PMCID: PMC3212682 DOI: 10.1007/s11906-011-0230-y] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 12/29/2022]
Abstract
Genetic linkage and association methods have long been the most important tools for gene identification in humans. These approaches can either be hypothesis-based (i.e., candidate-gene studies) or hypothesis-free (i.e., genome-wide studies). The first part of this review offers an overview of the latest successes in gene finding for blood pressure (BP) and essential hypertension using these DNA sequence-based discovery techniques. We further emphasize the importance of post-genome-wide association study (post-GWAS) analysis, which aims to prioritize genetic variants for functional follow-up. Whole-genome next-generation sequencing will eventually be necessary to provide a more comprehensive picture of all DNA variants affecting BP and hypertension. The second part of this review discusses promising novel approaches that move beyond the DNA sequence and aim to discover BP genes that are differentially regulated by epigenetic mechanisms, including microRNAs, histone modification, and methylation.
Affiliation(s)
- Xiaoling Wang
- Georgia Prevention Institute, Department of Pediatrics, Medical College of Georgia, Augusta, GA USA
- Bram P. Prins
- Unit of Genetic Epidemiology & Bioinformatics, Department of Epidemiology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, PO Box 30.001, 9700 RB Groningen, The Netherlands
- Siim Sõber
- Human Molecular Genetics group, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
- Maris Laan
- Human Molecular Genetics group, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
- Harold Snieder
- Unit of Genetic Epidemiology & Bioinformatics, Department of Epidemiology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, PO Box 30.001, 9700 RB Groningen, The Netherlands
|
35
|
Feng T, Elston RC, Zhu X. A novel method to detect rare variants using both family and unrelated case-control data. BMC Proc 2011; 5 Suppl 9:S80. [PMID: 22373319 PMCID: PMC3287921 DOI: 10.1186/1753-6561-5-s9-s80] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022] Open
Abstract
To detect rare variants associated with a phenotype, we develop a novel statistical method that can use both family and unrelated case-control data. Unlike the currently existing methods, we first use family data to calculate weights to be given to rare variants, differentiating between concordantly affected and discordant sib pairs. These weights are then used in an association test applied to the unrelated case-control data. We applied the proposed method to the simulated sequencing data in Genetic Analysis Workshop 17 and identified two genes associated with the disease.
Affiliation(s)
- Tao Feng
- Department of Epidemiology and Biostatistics, School of Medicine, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Cleveland, OH 44106, USA.
|
36
|
Aschard H, Qiu W, Pasaniuc B, Zaitlen N, Cho MH, Carey V. Combining effects from rare and common genetic variants in an exome-wide association study of sequence data. BMC Proc 2011; 5 Suppl 9:S44. [PMID: 22373328 PMCID: PMC3287881 DOI: 10.1186/1753-6561-5-s9-s44] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022] Open
Abstract
Recent breakthroughs in next-generation sequencing technologies allow cost-effective methods for measuring a growing list of cellular properties, including DNA sequence and structural variation. Next-generation sequencing has the potential to revolutionize complex trait genetics by directly measuring common and rare genetic variants within a genome-wide context. Because for a given gene both rare and common causal variants can coexist and have independent effects on a trait, strategies that model the effects of both common and rare variants could enhance the power of identifying disease-associated genes. To date, little work has been done on integrating signals from common and rare variants into powerful statistics for finding disease genes in genome-wide association studies. In this analysis of the Genetic Analysis Workshop 17 data, we evaluate various strategies for association of rare, common, or a combination of both rare and common variants on quantitative phenotypes in unrelated individuals. We show that the analysis of common variants only using classical approaches can achieve higher power to detect causal genes than recently proposed rare variant methods and that strategies that combine association signals derived independently in rare and common variants can slightly increase the power compared to strategies that focus on the effect of either the rare variants or the common variants.
Affiliation(s)
- Hugues Aschard
- Department of Epidemiology, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA.
|
37
|
Gui H, Li M, Sham PC, Cherny SS. Comparisons of seven algorithms for pathway analysis using the WTCCC Crohn's Disease dataset. BMC Res Notes 2011; 4:386. [PMID: 21981765 PMCID: PMC3199264 DOI: 10.1186/1756-0500-4-386] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.4] [Received: 06/10/2011] [Accepted: 10/07/2011] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Though rooted in genomic expression studies, pathway analysis for genome-wide association studies (GWAS) has gained increasing popularity, since it has the potential to discover hidden disease pathogenic mechanisms by combining statistical methods with biological knowledge. Generally, algorithms or programs proposed recently can be categorized by different types of input data, null hypothesis or counts of analysis stages. Due to complexity caused by SNP, gene and pathway relationships, re-sampling strategies like permutation are always utilized to derive an empirical distribution for test statistics for evaluating the significance of candidate pathways. However, evaluation of these algorithms on real GWAS datasets and real biological pathway databases needs to be addressed before we apply them widely with confidence. FINDINGS Two algorithms which use summary statistics from GWAS as input were implemented in KGG, a novel and user-friendly software tool for GWAS pathway analysis. Comparisons of these two algorithms as well as the other five selected algorithms were conducted by analyzing the WTCCC Crohn's Disease dataset utilizing the MsigDB canonical pathways. As a result of using permutation to obtain empirical p-value, most of these methods could control Type I error rate well, although some are conservative. However, the methods varied greatly in terms of power and running time, with the PLINK truncated set-based test being the most powerful and KGG being the fastest. CONCLUSIONS Raw data-based algorithms, such as those implemented in PLINK, are preferable for GWAS pathway analysis as long as computational capacity is available. It may be worthwhile to apply two or more pathway analysis algorithms on the same GWAS dataset, since the methods differ greatly in their outputs and might provide complementary findings for the studied complex disease.
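The permutation step that this abstract refers to, shuffling phenotype labels to build an empirical null distribution for a test statistic, can be sketched generically. The code below is an illustration only: it does not reproduce any of the seven pathway algorithms or KGG, and the function names and the toy mean-difference statistic are assumptions made for the example.

```python
import random

def empirical_p_value(statistic, values, labels, n_perm=1000, seed=0):
    """Empirical p-value of `statistic` under random relabelling.

    statistic: callable taking (values, labels) and returning a number,
               where larger means more extreme.
    The +1 in numerator and denominator keeps the estimate above zero.
    """
    rng = random.Random(seed)
    observed = statistic(values, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if statistic(values, shuffled) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

def mean_difference(values, labels):
    """Toy statistic: absolute difference in means between cases and controls."""
    cases = [v for v, lab in zip(values, labels) if lab == 1]
    controls = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))
```

In a pathway setting the statistic would summarise all SNPs mapped to the pathway's genes, and shuffling phenotypes rather than genotypes preserves the linkage disequilibrium between SNPs, which is one reason permutation is popular despite its running time.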
Affiliation(s)
- Hongsheng Gui
- Department of Psychiatry, The University of Hong Kong, Hong Kong, SAR, China.
|
38
|
Abstract
PURPOSE OF REVIEW To review recent progress in the genetics of rheumatoid arthritis (RA) and discuss the implications for understanding the pathogenesis of the disease as well as clinical application. RECENT FINDINGS Protection against anticitrullinated protein antibody (ACPA) positive RA was shown to be associated with DRB1*1301. Genome-wide association studies (GWASs) added about 10 new loci to the list of already more than 20 loci associated with RA, so the list is now over 30. Typing for the known risk loci is not helpful for prediction of the risk for RA. It is remarkable how few functional studies have been published. SUMMARY Known genetic factors explain 50-60% of the genetic variance for susceptibility to ACPA-positive and 30-50% for ACPA-negative RA. Searching for the remaining missing or hidden heritability is in all probability not going to yield much for prediction and/or targeted intervention. Therefore, I conclude that if you want to find more genes you should have a lot of patience, time and money, stop with conventional GWAS and invest in large-scale sequencing of selected patients and controls. I have a better suggestion, however: use the information that is already available to perform functional studies in order to understand the mechanism of the known associations!
|
39
|
Basu S, Pan W. Comparison of statistical tests for disease association with rare variants. Genet Epidemiol 2011; 35:606-19. [PMID: 21769936 DOI: 10.1002/gepi.20609] [Citation(s) in RCA: 188] [Impact Index Per Article: 13.4] [Received: 11/30/2010] [Revised: 03/23/2011] [Accepted: 06/03/2011] [Indexed: 01/31/2023]
Abstract
In anticipation of the availability of next-generation sequencing data, there is increasing interest in investigating association between complex traits and rare variants (RVs). In contrast to association studies for common variants (CVs), due to the low frequencies of RVs, common wisdom suggests that existing statistical tests for CVs might not work, motivating the recent development of several new tests for analyzing RVs, most of which are based on the idea of pooling/collapsing RVs. However, there is a lack of evaluations of, and thus guidance on the use of, existing tests. Here we provide a comprehensive comparison of various statistical tests using simulated data. We consider both independent and correlated rare mutations, and representative tests for both CVs and RVs. As expected, if there are no or few non-causal (i.e. neutral or non-associated) RVs in a locus of interest while the effects of causal RVs on the trait are all (or mostly) in the same direction (i.e. either protective or deleterious, but not both), then the simple pooled association tests (without selecting RVs and their association directions) and a new test called kernel-based adaptive clustering (KBAC) perform similarly and are most powerful; KBAC is more robust than simple pooled association tests in the presence of non-causal RVs; however, as the number of non-causal CVs increases and/or in the presence of opposite association directions, the winners are two methods originally proposed for CVs and a new test called C-alpha test proposed for RVs, each of which can be regarded as testing on a variance component in a random-effects model. Interestingly, several methods based on sequential model selection (i.e. selecting causal RVs and their association directions), including two new methods proposed here, perform robustly and often have statistical power between those of the above two classes.
Affiliation(s)
- Saonli Basu
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota 55455-0392, USA
|
40
|
Gusev A, Kenny EE, Lowe JK, Salit J, Saxena R, Kathiresan S, Altshuler DM, Friedman JM, Breslow JL, Pe'er I. DASH: a method for identical-by-descent haplotype mapping uncovers association with recent variation. Am J Hum Genet 2011; 88:706-717. [PMID: 21620352 DOI: 10.1016/j.ajhg.2011.04.023] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Received: 02/21/2011] [Revised: 04/13/2011] [Accepted: 04/26/2011] [Indexed: 02/01/2023] Open
Abstract
Rare variants affecting phenotype pose a unique challenge for human genetics. Although genome-wide association studies have successfully detected many common causal variants, they are underpowered in identifying disease variants that are too rare or population-specific to be imputed from a general reference panel and thus are poorly represented on commercial SNP arrays. We set out to overcome these challenges and detect association between disease and rare alleles using SNP arrays by relying on long stretches of genomic sharing that are identical by descent. We have developed an algorithm, DASH, which builds upon pairwise identical-by-descent shared segments to infer clusters of individuals likely to be sharing a single haplotype. DASH constructs a graph with nodes representing individuals and links on the basis of such segments spanning a locus and uses an iterative minimum cut algorithm to identify densely connected components. We have applied DASH to simulated data and diverse GWAS data sets by constructing haplotype clusters and testing them for association. In simulations we show this approach to be significantly more powerful than single-marker testing in an isolated population that is from Kosrae, Federated States of Micronesia and has abundant IBD, and we provide orthogonal information for rare, recent variants in the outbred Wellcome Trust Case-Control Consortium (WTCCC) data. In both cohorts, we identified a number of haplotype associations, five such loci in the WTCCC data and ten in the isolated, that were conditionally significant beyond any individual nearby markers. We have replicated one of these loci in an independent European cohort and identified putative structural changes in low-pass whole-genome sequence of the cluster carriers.
Affiliation(s)
- Alexander Gusev
- Department of Computer Science, Columbia University, New York, NY 10027, USA
- Eimear E Kenny
- Department of Computer Science, Columbia University, New York, NY 10027, USA; Medical Sciences and Human Genetics, Rockefeller University, New York, NY 10065, USA
- Jennifer K Lowe
- Department of Molecular Biology, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, The Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Jaqueline Salit
- Medical Sciences and Human Genetics, Rockefeller University, New York, NY 10065, USA
- Richa Saxena
- Program in Medical and Population Genetics, The Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Sekar Kathiresan
- Program in Medical and Population Genetics, The Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Cardiovascular Disease Prevention Center, Cardiology Division, Department of Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- David M Altshuler
- Program in Medical and Population Genetics, The Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Center for Human Genetic Research and Department of Molecular Biology, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Jeffrey M Friedman
- Medical Sciences and Human Genetics, Rockefeller University, New York, NY 10065, USA
- Jan L Breslow
- Medical Sciences and Human Genetics, Rockefeller University, New York, NY 10065, USA
- Itsik Pe'er
- Department of Computer Science, Columbia University, New York, NY 10027, USA.
|
41
|
Feng T, Elston RC, Zhu X. Detecting rare and common variants for complex traits: sibpair and odds ratio weighted sum statistics (SPWSS, ORWSS). Genet Epidemiol 2011; 35:398-409. [PMID: 21594893 DOI: 10.1002/gepi.20588] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Received: 01/28/2011] [Revised: 03/25/2011] [Accepted: 03/30/2011] [Indexed: 01/04/2023]
Abstract
It is generally known that risk variants segregate together with a disease within families, but this information has not been used in the existing statistical methods for detecting rare variants. Here we introduce two weighted sum statistics that can apply to either genome-wide association data or resequencing data for identifying rare disease variants: weights calculated based on sibpairs and odds ratios, respectively. We evaluated the two methods via extensive simulations under different disease models. We compared the proposed methods with the weighted sum statistic (WSS) proposed by Madsen and Browning, keeping the same genotyping or resequencing cost. Our methods clearly demonstrate more statistical power than the WSS. In addition, we found that using sibpair information can increase power over using only unrelated samples by more than 40%. We applied our methods to the Framingham Heart Study (FHS) and Wellcome Trust Case Control Consortium (WTCCC) hypertension datasets. Although we did not identify any genes as reaching a genome-wide significance level, we found variants in the candidate gene angiotensinogen significantly associated with hypertension at P = 6.9 × 10^-4, whereas the most significant single SNP association evidence is P = 0.063. We further applied the odds ratio weighted method to the IFIH1 gene for type-1 diabetes in the WTCCC data. Our method yielded a P-value of 4.82 × 10^-4, much more significant than that obtained by haplotype-based methods. We demonstrated that family data are extremely informative in searching for rare variants underlying complex traits, and the odds ratio weighted sum statistic is more efficient than currently existing methods.
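The general shape of a weighted sum statistic, one weight per variant and one summed score per individual, can be sketched as below. This is not the SPWSS or ORWSS defined in the paper: the log-odds-ratio weight with a 0.5 continuity correction and both function names are assumptions chosen only to make the idea concrete.

```python
import math

def log_odds_ratio_weights(case_genotypes, control_genotypes):
    """One weight per variant: log odds ratio of the minor allele.

    Genotypes are lists of minor-allele counts (0, 1 or 2) per individual.
    The 0.5 added to every cell is an illustrative continuity correction
    so that variants absent from one group still get a finite weight.
    """
    n_variants = len(case_genotypes[0])
    weights = []
    for j in range(n_variants):
        case_alt = sum(row[j] for row in case_genotypes)
        ctrl_alt = sum(row[j] for row in control_genotypes)
        case_ref = 2 * len(case_genotypes) - case_alt
        ctrl_ref = 2 * len(control_genotypes) - ctrl_alt
        odds_ratio = ((case_alt + 0.5) * (ctrl_ref + 0.5)) / (
            (case_ref + 0.5) * (ctrl_alt + 0.5)
        )
        weights.append(math.log(odds_ratio))
    return weights

def weighted_sum_scores(genotypes, weights):
    """Per-individual score: weighted sum of minor-allele counts."""
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]
```

Scores would then be compared between cases and controls. Weights estimated from the same individuals that are later tested bias the comparison, so in practice they would come from a separate sample (the abstract's sibpair data play that role for SPWSS) or significance would be assessed by permutation.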
Affiliation(s)
- Tao Feng
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH 44106, USA
|
42
|
Abstract
We provide an overview of ongoing discovery efforts in the genetics of blood pressure (BP) and hypertension (HTN) traits. Two large genome-wide association meta-analyses of individuals of European descent were recently published, revealing ~13 new loci for BP traits. Only two of these loci harbor genes in a pathway known to affect BP (CYP17A1 and NPPA/NPPB). Functional variants in these loci are still unknown. Few genome-wide association studies (GWAS) of complex diseases have been published from non-European populations. The study of populations with different evolutionary history and linkage disequilibrium (LD) structure, such as individuals of African ancestry, may provide an opportunity to further narrow these regions to identify the causal gene(s). Several collaborative efforts toward discovery of low-frequency variants and copy number variation for BP traits are currently underway. As evidence for new loci for complex diseases accumulates, the assessment of the epidemiologic architecture of these variants in populations assumes higher priority. The impact of public health-relevant contexts such as diet, physical activity, psychosocial factors, and aging has not been examined for most common variants associated with BP.
|
43
|
Current world literature. Curr Opin Rheumatol 2011; 23:317-24. [PMID: 21448013 DOI: 10.1097/bor.0b013e328346809c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
44
|
Fransen K, Mitrovic M, van Diemen CC, Weersma RK. The quest for genetic risk factors for Crohn's disease in the post-GWAS era. Genome Med 2011; 3:13. [PMID: 21392414 PMCID: PMC3092098 DOI: 10.1186/gm227] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/15/2022] Open
Abstract
Multiple genome-wide association studies (GWASs) and two large scale meta-analyses have been performed for Crohn's disease and have identified 71 susceptibility loci. These findings have contributed greatly to our current understanding of the disease pathogenesis. Yet, these loci only explain approximately 23% of the disease heritability. One of the future challenges in this post-GWAS era is to identify potential sources of the remaining heritability. Such sources may include common variants with limited effect size, rare variants with higher effect sizes, structural variations, or even more complicated mechanisms such as epistatic, gene-environment and epigenetic interactions. Here, we outline potential sources of this hidden heritability, focusing on Crohn's disease and the currently available data. We also discuss future strategies to determine more about the heritability; these strategies include expanding current GWAS, fine-mapping, whole genome sequencing or exome sequencing, and using family-based approaches. Despite the current limitations, such strategies may help to transfer research achievements into clinical practice and guide the improvement of preventive and therapeutic measures.
Affiliation(s)
- Karin Fransen
- Department of Genetics, University Medical Centre Groningen and University of Groningen, Groningen, the Netherlands
- Department of Gastroenterology and Hepatology, University Medical Centre Groningen, University of Groningen, Groningen, the Netherlands
- Mitja Mitrovic
- Department of Genetics, University Medical Centre Groningen and University of Groningen, Groningen, the Netherlands
- Center for Human Molecular Genetics and Pharmacogenomics, Medical Faculty, University of Maribor, Maribor, Slovenia
- Cleo C van Diemen
- Department of Genetics, University Medical Centre Groningen and University of Groningen, Groningen, the Netherlands
- Rinse K Weersma
- Department of Gastroenterology and Hepatology, University Medical Centre Groningen, University of Groningen, Groningen, the Netherlands
|
45
|
Scherag A, Jarick I, Grothe J, Biebermann H, Scherag S, Volckmar AL, Vogel CIG, Greene B, Hebebrand J, Hinney A. Investigation of a genome wide association signal for obesity: synthetic association and haplotype analyses at the melanocortin 4 receptor gene locus. PLoS One 2010; 5:e13967. [PMID: 21085626 PMCID: PMC2981522 DOI: 10.1371/journal.pone.0013967] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Received: 07/16/2010] [Accepted: 10/24/2010] [Indexed: 01/17/2023] Open
Abstract
Background Independent genome-wide association studies (GWAS) showed an obesogenic effect of two single nucleotide polymorphisms (SNP; rs12970134 and rs17782313) more than 150 kb downstream of the melanocortin 4 receptor gene (MC4R). It is unclear if the SNPs directly influence MC4R function or expression, or if the SNPs are on a haplotype that predisposes to obesity or includes functionally relevant genetic variation (synthetic association). As both exist, functionally relevant mutations and polymorphisms in the MC4R coding region and a robust association downstream of the gene, MC4R is an ideal model to explore synthetic association. Methodology/Principal Findings We analyzed a genomic region (364.9 kb) encompassing the MC4R in GWAS data of 424 obesity trios (extremely obese child/adolescent and both parents). SNP rs12970134 showed the lowest p-value (p = 0.004; relative risk for the obesity effect allele: 1.37); conditional analyses on this SNP revealed that 7 of 78 analyzed SNPs provided independent signals (p≤0.05). These 8 SNPs were used to derive two-marker haplotypes. The three best (according to p-value) haplotype combinations were chosen for confirmation in 363 independent obesity trios. The confirmed obesity effect haplotype includes SNPs 3′ and 5′ of the MC4R. Including MC4R coding variants in a joint model had almost no impact on the effect size estimators expected under synthetic association. Conclusions/Significance A haplotype reaching from a region 5′ of the MC4R to a region at least 150 kb from the 3′ end of the gene showed a stronger association to obesity than single SNPs. Synthetic association analyses revealed that MC4R coding variants had almost no impact on the association signal. Carriers of the haplotype should be enriched for relevant mutations outside the MC4R coding region and could thus be used for re-sequencing approaches. Our data also underscore the problems underlying the identification of relevant mutations depicted by GWAS derived SNPs.
Affiliation(s)
- André Scherag
- Institute of Medical Informatics, Biometry and Epidemiology, University of Duisburg-Essen, Essen, Germany
- Ivonne Jarick
- Institute of Medical Biometry and Epidemiology, Philipps-University of Marburg, Marburg, Germany
- Jessica Grothe
- Institute of Experimental Paediatric Endocrinology, Charité Universitätsmedizin Berlin, Berlin, Germany
- Heike Biebermann
- Institute of Experimental Paediatric Endocrinology, Charité Universitätsmedizin Berlin, Berlin, Germany
- Susann Scherag
- Department of Child and Adolescent Psychiatry and Psychotherapy, University of Duisburg-Essen, Essen, Germany
- Anna-Lena Volckmar
- Department of Child and Adolescent Psychiatry and Psychotherapy, University of Duisburg-Essen, Essen, Germany
- Carla Ivane Ganz Vogel
- Department of Child and Adolescent Psychiatry and Psychotherapy, University of Duisburg-Essen, Essen, Germany
- Brandon Greene
- Institute of Medical Biometry and Epidemiology, Philipps-University of Marburg, Marburg, Germany
- Johannes Hebebrand
- Department of Child and Adolescent Psychiatry and Psychotherapy, University of Duisburg-Essen, Essen, Germany
- Anke Hinney
- Department of Child and Adolescent Psychiatry and Psychotherapy, University of Duisburg-Essen, Essen, Germany
|