1
Saadat M. Enrichment analysis of loci associated with psoriasis susceptibility identified in genome-wide association studies. Arch Dermatol Res 2025; 317:564. [PMID: 40088290 DOI: 10.1007/s00403-025-04100-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2024] [Revised: 02/17/2025] [Accepted: 02/25/2025] [Indexed: 03/17/2025]
Abstract
Psoriasis is a chronic autoimmune skin disease that has been associated with many polymorphic genes in genome-wide association studies (GWAS). To perform gene ontology and biological pathway analysis of psoriasis-associated genes, and to investigate how far psoriasis-associated loci overlap with susceptibility loci of some other autoimmune diseases, a total of 319 psoriasis-associated polymorphic protein-coding genes were included in the analysis. The web-based tool Enrichr was used for the analysis. Cytokine-cytokine receptor interaction and JAK-STAT signaling were predicted by KEGG analysis. The top biological process and molecular function were positive regulation of cytokine production and cytokine receptor activity, respectively. The findings revealed that the psoriasis-associated genes were significantly involved in the immune system and its functions, consistent with the autoimmune nature of the disease. Ten genes were shared between psoriasis and 5 other autoimmune diseases; interestingly, these genes were significantly enriched in the JAK-STAT pathway (adjusted p-value = 3.75e-5). Several multifactorial diseases and complex traits showed a statistically significant commonality between their predisposing genes and psoriasis-predisposing genes. The current findings suggest that the JAK-STAT pathway is likely involved in the pathogenesis of psoriasis, as in many other autoimmune diseases. Therefore, inhibitors that block the transcription of pro-inflammatory cytokine genes by blocking JAK-STAT-mediated intracellular signaling may be a good model for the treatment of autoimmune diseases, including psoriasis.
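As a rough illustration of the kind of over-representation statistic behind an enrichment analysis like the one described above, the sketch below computes an upper-tail hypergeometric p-value for a gene-list/pathway overlap and applies a Benjamini-Hochberg adjustment across several pathways. It is a minimal sketch, not Enrichr's implementation: the function names are hypothetical, and the pathway size, overlap, and background size are invented for illustration (only the list size of 319 comes from the abstract).

```python
from math import comb


def hypergeom_upper_tail(overlap, list_size, set_size, background):
    """P(X >= overlap) when list_size genes are drawn without replacement
    from a background that contains set_size pathway genes."""
    total = comb(background, list_size)
    tail = sum(
        comb(set_size, k) * comb(background - set_size, list_size - k)
        for k in range(overlap, min(list_size, set_size) + 1)
    )
    return tail / total


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical inputs: 319 query genes (from the abstract); the pathway
# size (160), overlap (10), and background (20000) are assumptions.
p = hypergeom_upper_tail(overlap=10, list_size=319, set_size=160, background=20000)
print("raw p-value for the hypothetical pathway:", p)
print("BH-adjusted:", benjamini_hochberg([p, 0.04, 0.5]))
```

The choice of background (all protein-coding genes versus only genes present in the annotation library) changes the p-value noticeably, so any real analysis should state it explicitly.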
Affiliation(s)
- Mostafa Saadat
- Department of Biology, School of Science, Shiraz University, Shiraz, 71467-13565, Iran.
2
Bhol NK, Bhanjadeo MM, Singh AK, Dash UC, Ojha RR, Majhi S, Duttaroy AK, Jena AB. The interplay between cytokines, inflammation, and antioxidants: mechanistic insights and therapeutic potentials of various antioxidants and anti-cytokine compounds. Biomed Pharmacother 2024; 178:117177. [PMID: 39053423 DOI: 10.1016/j.biopha.2024.117177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Revised: 07/03/2024] [Accepted: 07/22/2024] [Indexed: 07/27/2024] Open
Abstract
Cytokines regulate immune responses essential for maintaining immune homeostasis, as deregulated cytokine signaling can lead to detrimental outcomes, including inflammatory disorders. Antioxidants emerge as promising therapeutic agents because they mitigate oxidative stress and modulate inflammatory pathways. Antioxidants can potentially ameliorate inflammation-related disorders by counteracting excessive cytokine-mediated inflammatory responses. A comprehensive understanding of cytokine-mediated inflammatory pathways and their interplay with antioxidants is paramount for developing natural therapeutic agents that target inflammation-related disorders, improve clinical outcomes, and enhance patients' quality of life. Among these antioxidants, curcumin, vitamin C, vitamin D, propolis, allicin, and cinnamaldehyde have garnered attention for their anti-inflammatory properties and potential therapeutic benefits. This review highlights the interrelationship between cytokine-mediated disorders in various diseases and therapeutic approaches involving antioxidants.
Affiliation(s)
- Nitish Kumar Bhol
- Post Graduate Department of Biotechnology, Utkal University, Bhubaneswar, Odisha 751004, India
- Anup Kumar Singh
- National Centre for Cell Science, Savitribai Phule Pune University Campus, Ganeshkhind, Pune, India
- Umesh Chandra Dash
- Environmental Biotechnology Laboratory, KIIT School of Biotechnology, KIIT Deemed to be University, Bhubaneswar, Odisha, India
- Rakesh Ranjan Ojha
- Department of Bioinformatics, BJB (A) College, Bhubaneswar, Odisha-751014, India
- Sanatan Majhi
- Post Graduate Department of Biotechnology, Utkal University, Bhubaneswar, Odisha 751004, India
- Asim K Duttaroy
- Department of Nutrition, Institute of Medical Sciences, Faculty of Medicine, University of Oslo, Norway.
- Atala Bihari Jena
- National Centre for Cell Science, Savitribai Phule Pune University Campus, Ganeshkhind, Pune, India.
3
Gholami M, Coleman-Fuller N, Salehirad M, Darbeheshti S, Motaghinejad M. Neuroprotective Effects of Sodium-Glucose Cotransporter-2 (SGLT2) Inhibitors (Gliflozins) on Diabetes-Induced Neurodegeneration and Neurotoxicity: A Graphical Review. Int J Prev Med 2024; 15:28. [PMID: 39239308 PMCID: PMC11376549 DOI: 10.4103/ijpvm.ijpvm_5_23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2023] [Accepted: 02/20/2024] [Indexed: 09/07/2024] Open
Abstract
Diabetes is a chronic endocrine disorder that negatively affects various body systems, including the nervous system. Diabetes can cause or exacerbate various neurological disorders, and diabetes-induced neurodegeneration can involve several mechanisms such as mitochondrial dysfunction, activation of oxidative stress, neuronal inflammation, and cell death. In recent years, the management of diabetes-induced neurodegeneration has relied on several types of drugs, including sodium-glucose cotransporter-2 (SGLT2) inhibitors, also called gliflozins. In addition to exerting powerful effects in reducing blood glucose, gliflozins have strong anti-neuro-inflammatory characteristics that function by inhibiting oxidative stress and cell death in the nervous system in diabetic subjects. This review presents the molecular pathways involved in diabetes-induced neurodegeneration and evaluates the clinical and laboratory studies investigating the neuroprotective effects of gliflozins against diabetes-induced neurodegeneration, with discussion of the contributing roles of diverse molecular pathways, such as mitochondrial dysfunction, oxidative stress, neuro-inflammation, and cell death. Several databases (including Web of Science, Scopus, PubMed, and Google Scholar) and various publishers (such as Springer, Wiley, and Elsevier) were searched for keywords regarding the neuroprotective effects of gliflozins against diabetes-triggered neurodegenerative events. Additionally, anti-neuro-inflammatory, anti-oxidative stress, and anti-cell death keywords were applied to evaluate potential neuronal protection mechanisms of gliflozins in diabetic subjects. The search period covered valid peer-reviewed studies published from January 2000 to July 2023. The current body of literature suggests that gliflozins can exert neuroprotective effects against diabetes-induced neurodegenerative events and neuronal dysfunction, and these effects are mediated via activation of mitochondrial function and prevention of cell death processes, oxidative stress, and inflammation in neurons affected by diabetes. Gliflozins can confer neuroprotective properties in diabetes-triggered neurodegeneration, and these effects are mediated by inhibiting oxidative stress, inflammation, and cell death.
Affiliation(s)
- Mina Gholami
- Chronic Respiratory Disease Research Center (CRDRC), National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Natalie Coleman-Fuller
- Department of Veterinary and Biomedical Sciences, University of Minnesota, St. Paul, MN, USA
- Mahsa Salehirad
- Chronic Respiratory Disease Research Center (CRDRC), National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Sepideh Darbeheshti
- Chronic Respiratory Disease Research Center (CRDRC), National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Majid Motaghinejad
- Chronic Respiratory Disease Research Center (CRDRC), National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
4
da Silva Rosa SC, Barzegar Behrooz A, Guedes S, Vitorino R, Ghavami S. Prioritization of genes for translation: a computational approach. Expert Rev Proteomics 2024; 21:125-147. [PMID: 38563427 DOI: 10.1080/14789450.2024.2337004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 02/21/2024] [Indexed: 04/04/2024]
Abstract
INTRODUCTION: Gene identification for genetic diseases is critical for the development of new diagnostic approaches and personalized treatment options. Prioritization of gene translation is an important consideration in the molecular biology field, allowing researchers to focus on the most promising candidates for further investigation. AREAS COVERED: In this paper, we discussed different approaches to prioritize genes for translation, including the use of computational tools and machine learning algorithms, as well as experimental techniques such as knockdown and overexpression studies. We also explored the potential biases and limitations of these approaches and proposed strategies to improve the accuracy and reliability of gene prioritization methods. Although numerous computational methods have been developed for this purpose, there is a need for computational methods that incorporate tissue-specific information to enable more accurate prioritization of candidate genes. Such methods should provide tissue-specific predictions, insights into underlying disease mechanisms, and more accurate prioritization of genes. EXPERT OPINION: Using advanced computational tools and machine learning algorithms to prioritize genes, we can identify potential targets for therapeutic intervention of complex diseases. This represents an up-and-coming method for drug development and personalized medicine.
Affiliation(s)
- Simone C da Silva Rosa
- Department of Human Anatomy and Cell Science, Max Rady College of Medicine, Rady Faculty of Health Science, University of Manitoba, Winnipeg, Canada
- Amir Barzegar Behrooz
- Department of Human Anatomy and Cell Science, Max Rady College of Medicine, Rady Faculty of Health Science, University of Manitoba, Winnipeg, Canada
- Electrophysiology Research Center, Neuroscience Institute, Tehran University of Medical Sciences, Tehran, Iran
- Sofia Guedes
- LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro, Portugal
- Rui Vitorino
- LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro, Portugal
- Department of Medical Sciences, Institute of Biomedicine-iBiMED, University of Aveiro, Aveiro, Portugal
- UnIC@RISE, Department of Surgery and Physiology, Faculty of Medicine of the University of Porto, Porto, Portugal
- Saeid Ghavami
- Department of Human Anatomy and Cell Science, Max Rady College of Medicine, Rady Faculty of Health Science, University of Manitoba, Winnipeg, Canada
- Faculty of Medicine in Zabrze, Academia of Silesia, Katowice, Poland
- Research Institute of Oncology and Hematology, Cancer Care Manitoba, University of Manitoba, Winnipeg, Canada
5
Zabad S, Gravel S, Li Y. Fast and accurate Bayesian polygenic risk modeling with variational inference. Am J Hum Genet 2023; 110:741-761. [PMID: 37030289 PMCID: PMC10183379 DOI: 10.1016/j.ajhg.2023.03.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Accepted: 03/13/2023] [Indexed: 04/10/2023] Open
Abstract
The advent of large-scale genome-wide association studies (GWASs) has motivated the development of statistical methods for phenotype prediction with single-nucleotide polymorphism (SNP) array data. These polygenic risk score (PRS) methods use a multiple linear regression framework to infer joint effect sizes of all genetic variants on the trait. Among the subset of PRS methods that operate on GWAS summary statistics, sparse Bayesian methods have shown competitive predictive ability. However, most existing Bayesian approaches perform posterior inference with Markov chain Monte Carlo (MCMC) algorithms, which are computationally inefficient and do not scale favorably to higher dimensions. Here, we introduce variational inference of polygenic risk scores (VIPRS), a Bayesian summary statistics-based PRS method that utilizes variational inference techniques to approximate the posterior distribution for the effect sizes. Our experiments with 36 simulation configurations and 12 real phenotypes from the UK Biobank dataset demonstrated that VIPRS is consistently competitive with the state-of-the-art in prediction accuracy while being more than twice as fast as popular MCMC-based approaches. This performance advantage is robust across a variety of genetic architectures, SNP heritabilities, and independent GWAS cohorts. In addition to its competitive accuracy on the "White British" samples, VIPRS showed improved transferability when applied to other ethnic groups, with up to a 1.7-fold increase in R² among individuals of Nigerian ancestry for low-density lipoprotein (LDL) cholesterol. To illustrate its scalability, we applied VIPRS to a dataset of 9.6 million genetic markers, which conferred further improvements in prediction accuracy for highly polygenic traits, such as height.
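Whatever method estimates the effect sizes, the final scoring step of a PRS is a weighted sum of allele dosages, and prediction accuracy is commonly reported as R² between score and phenotype. The sketch below shows only that last step on toy data; it does not implement VIPRS or any variational inference, and the function names, dosages, effect sizes, and phenotypes are all hypothetical.

```python
def polygenic_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (0, 1, or 2 per SNP) for one individual."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


def r_squared(predicted, observed):
    """Squared Pearson correlation between predicted scores and phenotypes."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("R^2 is undefined when either input is constant")
    return cov * cov / (var_p * var_o)


# Toy, made-up data: 4 individuals x 3 SNPs, plus assumed effect sizes
# (in a real pipeline these would be posterior mean effects).
effects = [0.30, -0.10, 0.05]
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2]]
phenotypes = [0.1, 0.3, 0.7, -0.2]

scores = [polygenic_score(g, effects) for g in genotypes]
print("scores:", scores)
print("R^2:", r_squared(scores, phenotypes))
```

Comparing R² across ancestry groups, as the abstract does, amounts to running the second function separately on each group's scores and phenotypes.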
Affiliation(s)
- Shadi Zabad
- School of Computer Science, McGill University, Montreal, QC, Canada
- Simon Gravel
- Department of Human Genetics, McGill University, Montreal, QC, Canada.
- Yue Li
- School of Computer Science, McGill University, Montreal, QC, Canada.
6
Karami F, Jamaati H, Coleman-Fuller N, Zeini MS, Hayes AW, Gholami M, Salehirad M, Darabi M, Motaghinejad M. Is metformin neuroprotective against diabetes mellitus-induced neurodegeneration? An updated graphical review of molecular basis. Pharmacol Rep 2023; 75:511-543. [PMID: 37093496 DOI: 10.1007/s43440-023-00469-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/05/2022] [Revised: 02/21/2023] [Accepted: 02/23/2023] [Indexed: 04/25/2023]
Abstract
Diabetes mellitus (DM) is a metabolic disease that activates several molecular pathways involved in neurodegenerative disorders. Metformin, an anti-hyperglycemic drug used for treating DM, has the potential to exert a significant neuroprotective role against the detrimental effects of DM. This review discusses recent clinical and laboratory studies investigating the neuroprotective properties of metformin against DM-induced neurodegeneration and the roles of various molecular pathways, including mitochondrial dysfunction, oxidative stress, inflammation, apoptosis, and its related cascades. A literature search was conducted from January 2000 to December 2022 using multiple databases including Web of Science, Wiley, Springer, PubMed, Elsevier Science Direct, Google Scholar, the Core Collection, Scopus, and the Cochrane Library to collect and evaluate peer-reviewed literature regarding the neuroprotective role of metformin against DM-induced neurodegenerative events. The literature search supports the conclusion that metformin is neuroprotective against DM-induced neuronal cell degeneration in both peripheral and central nervous systems, and this effect is likely mediated via modulation of oxidative stress, inflammation, and cell death pathways.
Affiliation(s)
- Fatemeh Karami
- Chronic Respiratory Disease Research Center (CRDRC), National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Hamidreza Jamaati
- Chronic Respiratory Disease Research Center (CRDRC), National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Natalie Coleman-Fuller
- Department of Veterinary and Biomedical Sciences, University of Minnesota, Saint Paul, MN, 55108, USA
- Maryam Shokrian Zeini
- Chronic Respiratory Disease Research Center (CRDRC), National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- A Wallace Hayes
- University of South Florida College of Public Health and Institute for Integrative Toxicology, Michigan State University, East Lansing, USA
- Mina Gholami
- Chronic Respiratory Disease Research Center (CRDRC), National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mahsa Salehirad
- Cognitive and Neuroscience Research Center (CNRC), Amir-Almomenin Hospital, Tehran Medical Sciences, Islamic Azad University, Tehran, Iran
- Mohammad Darabi
- Chronic Respiratory Disease Research Center (CRDRC), National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Majid Motaghinejad
- Chronic Respiratory Disease Research Center (CRDRC), National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran.
7
Johnson TO, Akinsanmi AO, Ejembi SA, Adeyemi OE, Oche JR, Johnson GI, Adegboyega AE. Modern drug discovery for inflammatory bowel disease: The role of computational methods. World J Gastroenterol 2023; 29:310-331. [PMID: 36687123 PMCID: PMC9846937 DOI: 10.3748/wjg.v29.i2.310] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 09/22/2022] [Revised: 11/02/2022] [Accepted: 12/21/2022] [Indexed: 01/06/2023] Open
Abstract
Inflammatory bowel diseases (IBDs), comprising ulcerative colitis, Crohn’s disease and microscopic colitis, are characterized by chronic inflammation of the gastrointestinal tract. IBD has spread around the world and is becoming more prevalent at an alarming rate in developing countries whose societies have become more westernized. Cell therapy, intestinal microecology, apheresis therapy, exosome therapy and small molecules are emerging therapeutic options for IBD. Currently, low-molecular-mass substances with good oral bioavailability and the ability to permeate the cell membrane and regulate elements of the inflammatory signaling pathway are thought to be effective therapeutic options for the treatment of IBD. Several small molecule inhibitors are being developed as a promising alternative for IBD therapy. Highly efficient and time-saving techniques, such as computational methods, remain a viable option for the development of these small molecule drugs. The computer-aided (in silico) discovery approach is one drug development technique that has largely proven effective. Computational approaches, when combined with traditional drug development methodology, dramatically boost the likelihood of drug discovery in a sustainable and cost-effective manner. This review focuses on modern drug discovery approaches for the design of novel IBD drugs, with an emphasis on the role of computational methods. Some computational approaches to IBD genomic studies, target identification, and virtual screening for the discovery of new drugs and the repurposing of existing drugs are discussed.
Affiliation(s)
- Jane-Rose Oche
- Department of Biochemistry, University of Jos, Jos 930222, Plateau, Nigeria
- Grace Inioluwa Johnson
- Faculty of Clinical Sciences, College of Health Sciences, University of Jos, Jos 930222, Plateau, Nigeria
8
Nguyen TH, He X, Brown RC, Webb BT, Kendler KS, Vladimirov VI, Riley BP, Bacanu SA. DECO: a framework for jointly analyzing de novo and rare case/control variants, and biological pathways. Brief Bioinform 2021; 22:bbab067. [PMID: 33791774 PMCID: PMC8425460 DOI: 10.1093/bib/bbab067] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/25/2020] [Revised: 01/25/2021] [Accepted: 02/09/2021] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Rare variant-based analyses are beginning to identify risk genes for neuropsychiatric disorders and other diseases. However, the identified genes only account for a fraction of predicted causal genes. Recent studies have shown that rare damaging variants are significantly enriched in specific gene-sets. Methods that jointly model rare variants and gene-sets to identify enriched gene-sets, and then use these enriched gene-sets to prioritize additional risk genes, could improve understanding of the genetic architecture of diseases. RESULTS: We propose DECO (Integrated analysis of de novo mutations, rare case/control variants and omics information via gene-sets), an integrated method for rare-variant and gene-set analysis. The method can (i) test the enrichment of gene-sets directly within the statistical model, and (ii) use enriched gene-sets to rank existing genes and prioritize additional risk genes for tested disorders. In simulations, DECO performs better than a homologous method that uses only variant data. To demonstrate the application of the proposed protocol, we have applied this approach to rare-variant datasets of schizophrenia. Compared with a method that uses only variant information, DECO is able to prioritize additional risk genes. AVAILABILITY: DECO can be used to analyze rare variants and biological pathways or cell types for any disease. The package is available on GitHub at https://github.com/hoangtn/DECO.
Affiliation(s)
- Tan-Hoang Nguyen
- Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA
- Xin He
- The Department of Human Genetics, University of Chicago, IL 60637, USA; Grossman Institute for Neuroscience, Quantitative Biology and Human Behavior, University of Chicago, Chicago, IL 60637, USA
- Ruth C Brown
- Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA
- Bradley T Webb
- Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA
- Kenneth S Kendler
- Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA
- Vladimir I Vladimirov
- Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA; Department of Psychiatry & Behavioral Sciences, College of Medicine, Texas A&M University, College Station, TX, USA; and the Lieber Institute for Brain Development, Baltimore, MD, USA
- Brien P Riley
- Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA
- Silviu-Alin Bacanu
- Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA
9
Demetci P, Cheng W, Darnell G, Zhou X, Ramachandran S, Crawford L. Multi-scale inference of genetic trait architecture using biologically annotated neural networks. PLoS Genet 2021; 17:e1009754. [PMID: 34411094 PMCID: PMC8407593 DOI: 10.1371/journal.pgen.1009754] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/23/2020] [Revised: 08/31/2021] [Accepted: 07/31/2021] [Indexed: 01/01/2023] Open
Abstract
In this article, we present Biologically Annotated Neural Networks (BANNs), a nonlinear probabilistic framework for association mapping in genome-wide association (GWA) studies. BANNs are feedforward models with partially connected architectures that are based on biological annotations. This setup yields a fully interpretable neural network where the input layer encodes SNP-level effects, and the hidden layer models the aggregated effects among SNP-sets. We treat the weights and connections of the network as random variables with prior distributions that reflect how genetic effects manifest at different genomic scales. The BANNs software uses variational inference to provide posterior summaries which allow researchers to simultaneously perform (i) mapping with SNPs and (ii) enrichment analyses with SNP-sets on complex traits. Through simulations, we show that our method improves upon state-of-the-art association mapping and enrichment approaches across a wide range of genetic architectures. We then further illustrate the benefits of BANNs by analyzing real GWA data assayed in approximately 2,000 heterogeneous stock mice from the Wellcome Trust Centre for Human Genetics and approximately 7,000 individuals from the Framingham Heart Study. Lastly, using a random subset of individuals of European ancestry from the UK Biobank, we show that BANNs is able to replicate known associations in high- and low-density lipoprotein cholesterol content.
Affiliation(s)
- Pinar Demetci
- Department of Computer Science, Brown University, Providence, Rhode Island, United States of America
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Wei Cheng
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Department of Ecology and Evolutionary Biology, Brown University, Providence, Rhode Island, United States of America
- Gregory Darnell
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America
- Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
- Sohini Ramachandran
- Department of Computer Science, Brown University, Providence, Rhode Island, United States of America
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Department of Ecology and Evolutionary Biology, Brown University, Providence, Rhode Island, United States of America
- Lorin Crawford
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Microsoft Research New England, Cambridge, Massachusetts, United States of America
- Department of Biostatistics, Brown University, Providence, Rhode Island, United States of America
10
Hukku A, Quick C, Luca F, Pique-Regi R, Wen X. BAGSE: a Bayesian hierarchical model approach for gene set enrichment analysis. Bioinformatics 2020; 36:1689-1695. [PMID: 31702789 DOI: 10.1093/bioinformatics/btz831] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/25/2019] [Revised: 10/14/2019] [Accepted: 11/06/2019] [Indexed: 12/13/2022] Open
Abstract
MOTIVATION: Gene set enrichment analysis has been shown to be effective in identifying relevant biological pathways underlying complex diseases. Existing approaches lack the ability to quantify the enrichment levels accurately, which prevents the enrichment information from being further utilized in both upstream and downstream analyses. A modernized and rigorous approach for gene set enrichment analysis that emphasizes both hypothesis testing and enrichment estimation is much needed. RESULTS: We propose a novel computational method, Bayesian Analysis of Gene Set Enrichment (BAGSE), for gene set enrichment analysis. BAGSE is built on a Bayesian hierarchical model and fully accounts for the uncertainty embedded in the association evidence of individual genes. We adopt an empirical Bayes inference framework to fit the proposed hierarchical model by implementing an efficient EM algorithm. Through simulation studies, we illustrate that BAGSE yields accurate enrichment quantification while achieving power similar to that of the state-of-the-art methods. Further simulation studies show that BAGSE can effectively utilize the enrichment information to improve the power in gene discovery. Finally, we demonstrate the application of BAGSE in analyzing real data from a differential expression experiment and a transcriptome-wide association study. Our results indicate that the proposed statistical framework is effective in aiding the discovery of potentially causal pathways and gene networks. AVAILABILITY AND IMPLEMENTATION: BAGSE is implemented in the C++ programming language and is freely available from https://github.com/xqwen/bagse/. Simulated and real data used in this paper are also available at the GitHub repository for reproducibility purposes. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Abhay Hukku
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Corbin Quick
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Francesca Luca
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI 48201, USA
- Roger Pique-Regi
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI 48201, USA
- Xiaoquan Wen
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
11
Song Y, Zhou X, Zhang M, Zhao W, Liu Y, Kardia SLR, Diez Roux AV, Needham BL, Smith JA, Mukherjee B. Bayesian shrinkage estimation of high dimensional causal mediation effects in omics studies. Biometrics 2020; 76:700-710. [PMID: 31733066 PMCID: PMC7228845 DOI: 10.1111/biom.13189] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 10/29/2018] [Revised: 10/30/2019] [Accepted: 11/04/2019] [Indexed: 11/29/2022]
Abstract
Causal mediation analysis aims to examine the role of a mediator or a group of mediators that lie in the pathway between an exposure and an outcome. Recent biomedical studies often involve a large number of potential mediators based on high-throughput technologies. Most of the current analytic methods focus on settings with one or a moderate number of potential mediators. With the expanding growth of -omics data, joint analysis of molecular-level genomics data with epidemiological data through mediation analysis is becoming more common. However, such joint analysis requires methods that can simultaneously accommodate high-dimensional mediators and that are currently lacking. To address this problem, we develop a Bayesian inference method using continuous shrinkage priors to extend previous causal mediation analysis techniques to a high-dimensional setting. Simulations demonstrate that our method improves the power of global mediation analysis compared to simpler alternatives and has decent performance to identify true nonnull contributions to the mediation effects of the pathway. The Bayesian method also helps us to understand the structure of the composite null cases for inactive mediators in the pathway. We applied our method to Multi-Ethnic Study of Atherosclerosis and identified DNA methylation regions that may actively mediate the effect of socioeconomic status on cardiometabolic outcomes.
Affiliation(s)
- Yanyi Song
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, U.S.A
- Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, U.S.A
- Min Zhang
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, U.S.A
- Wei Zhao
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, U.S.A
- Yongmei Liu
- Department of Epidemiology and Prevention, Wake Forest School of Medicine, Winston-Salem, NC, U.S.A
- Ana V. Diez Roux
- Department of Epidemiology and Biostatistics, Drexel University, Philadelphia, PA, U.S.A
- Belinda L. Needham
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, U.S.A
- Jennifer A. Smith
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, U.S.A
- Bhramar Mukherjee
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, U.S.A
12
Biswas S, Pal S, Majumder PP, Bhattacharjee S. A framework for pathway knowledge driven prioritization in genome-wide association studies. Genet Epidemiol 2020; 44:841-853. [PMID: 32779262 PMCID: PMC7116354 DOI: 10.1002/gepi.22345] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/07/2020] [Revised: 06/18/2020] [Accepted: 07/10/2020] [Indexed: 12/27/2022]
Abstract
Many variants with low frequencies or with low to modest effects likely remain unidentified in genome-wide association studies (GWAS) because of stringent genome-wide thresholds for detection. To improve the power of detection, variant prioritization based on their functional annotations and epigenetic landmarks has been used successfully. Here, we propose a novel method of prioritization of a GWAS by exploiting gene-level knowledge (e.g., annotations to pathways and ontologies) and show that it further improves power. Often, disease-associated variants are found near genes that are co-involved in specific biological pathways relevant to the disease process. Utilization of this knowledge to conduct a prioritized scan increases the power to detect loci that map to genes clustered in a few specific pathways. We have developed a computationally scalable framework based on penalized logistic regression (termed GKnowMTest: Genomic Knowledge-guided Multiple Testing) to enable a prioritized pathway-guided GWAS scan with a very large number of gene-level annotations. We demonstrate that the proposed strategy improves overall power and maintains the Type 1 error globally. Our method works on genome-wide summary level data and a user-specified list of pathways (e.g., those extracted from large pathway databases without reference to biology of a specific disease). It automatically reweights the input p values by incorporating the pathway enrichments as "adaptively learned" from the data using a cross-validation technique to avoid overfitting. We used whole-genome simulations and some publicly available GWAS data sets to illustrate the application of our method. The GKnowMTest framework has been implemented as a user-friendly open-source R package.
Affiliation(s)
- Soumen Pal
- National Institute of Biomedical Genomics, Kalyani, India
13
Cheng W, Ramachandran S, Crawford L. Estimation of non-null SNP effect size distributions enables the detection of enriched genes underlying complex traits. PLoS Genet 2020; 16:e1008855. [PMID: 32542026 PMCID: PMC7316356 DOI: 10.1371/journal.pgen.1008855] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/13/2019] [Revised: 06/25/2020] [Accepted: 05/13/2020] [Indexed: 12/22/2022]
Abstract
Traditional univariate genome-wide association studies generate false positives and negatives due to difficulties distinguishing associated variants from variants with spurious nonzero effects that do not directly influence the trait. Recent efforts have been directed at identifying genes or signaling pathways enriched for mutations in quantitative traits or case-control studies, but these can be computationally costly and hampered by strict model assumptions. Here, we present gene-ε, a new approach for identifying statistical associations between sets of variants and quantitative traits. Our key insight is that enrichment studies on the gene-level are improved when we reformulate the genome-wide SNP-level null hypothesis to identify spurious small-to-intermediate SNP effects and classify them as non-causal. gene-ε efficiently identifies enriched genes under a variety of simulated genetic architectures, achieving greater than a 90% true positive rate at 1% false positive rate for polygenic traits. Lastly, we apply gene-ε to summary statistics derived from six quantitative traits using European-ancestry individuals in the UK Biobank, and identify enriched genes that are in biologically relevant pathways.
Affiliation(s)
- Wei Cheng
- Department of Ecology and Evolutionary Biology, Brown University, Providence, Rhode Island, United States of America
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Sohini Ramachandran
- Department of Ecology and Evolutionary Biology, Brown University, Providence, Rhode Island, United States of America
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Lorin Crawford
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Department of Biostatistics, Brown University, Providence, Rhode Island, United States of America
- Center for Statistical Sciences, Brown University, Providence, Rhode Island, United States of America
14
McGuirl MR, Smith SP, Sandstede B, Ramachandran S. Detecting Shared Genetic Architecture Among Multiple Phenotypes by Hierarchical Clustering of Gene-Level Association Statistics. Genetics 2020; 215:511-529. [PMID: 32245788 PMCID: PMC7268989 DOI: 10.1534/genetics.120.303096] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 01/31/2020] [Accepted: 03/31/2020] [Indexed: 12/31/2022]
Abstract
Emerging large-scale biobanks pairing genotype data with phenotype data present new opportunities to prioritize shared genetic associations across multiple phenotypes for molecular validation. Past research, by our group and others, has shown gene-level tests of association produce biologically interpretable characterization of the genetic architecture of a given phenotype. Here, we present a new method, Ward clustering to identify Internal Node branch length outliers using Gene Scores (WINGS), for identifying shared genetic architecture among multiple phenotypes. The objective of WINGS is to identify groups of phenotypes, or "clusters," sharing a core set of genes enriched for mutations in cases. We validate WINGS using extensive simulation studies and then combine gene-level association tests with WINGS to identify shared genetic architecture among 81 case-control and seven quantitative phenotypes in 349,468 European-ancestry individuals from the UK Biobank. We identify eight prioritized phenotype clusters and recover multiple published gene-level associations within prioritized clusters.
Affiliation(s)
- Melissa R McGuirl
- Division of Applied Mathematics, Brown University, Providence, Rhode Island 02912
- Samuel Pattillo Smith
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island 02912
- Department of Ecology and Evolutionary Biology, Brown University, Providence, Rhode Island 02912
- Björn Sandstede
- Division of Applied Mathematics, Brown University, Providence, Rhode Island 02912
- Data Science Initiative, Brown University, Providence, Rhode Island 02912
- Sohini Ramachandran
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island 02912
- Department of Ecology and Evolutionary Biology, Brown University, Providence, Rhode Island 02912
15
Godoy GJ, Olivera C, Paira DA, Salazar FC, Ana Y, Stempin CC, Motrich RD, Rivero VE. T Regulatory Cells From Non-obese Diabetic Mice Show Low Responsiveness to IL-2 Stimulation and Exhibit Differential Expression of Anergy-Related and Ubiquitination Factors. Front Immunol 2019; 10:2665. [PMID: 31824482 PMCID: PMC6886461 DOI: 10.3389/fimmu.2019.02665] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/21/2019] [Accepted: 10/28/2019] [Indexed: 11/13/2022]
Abstract
Foxp3+ Regulatory T cells (Tregs) are pivotal for the maintenance of tolerance. Alterations in their number and/or function have been proposed to occur in the autoimmune-prone non-obese diabetic (NOD) mouse. Comparing the frequencies and absolute numbers of CD4+Foxp3+CD25+ Tregs among 4 to 6-week old NOD, B6, and BALB/c mice, we observed differences in counts and Foxp3 expression in Tregs from secondary lymphoid organs, but not in the thymus. Upon TCR and IL-2 stimulation, NOD Tregs showed lower responses than Tregs from B6 and BALB/c mice. Indeed, NOD Tregs responded with less proliferation and with smaller increments in the expression of CD25, LAP-1, CD39, PD-1, PD-L1, and LAG-3 when cultured in vitro for 3 days with anti-CD3/CD28 in the absence or presence of IL-2. Tregs from NOD mice proved to be highly dependent on IL-2 to maintain Foxp3 expression. Moreover, NOD Tregs became producers of IL-17 and IFN-gamma more easily than Tregs from the other strains. In addition, NOD Tregs showed lower responsiveness to IL-2, with significantly reduced levels of pSTAT5, even at high IL-2 doses, with respect to B6 and BALB/c Tregs. Interestingly, NOD Tregs exhibit differences in the expression of SOCS3, GRAIL, and OTUB1 when compared with Tregs from B6 and BALB/c mice. Both at steady state and after activation, Tregs from NOD mice showed increased levels of OTUB1 and low levels of GRAIL. In addition, NOD Tregs had differences in the expression of ubiquitin-related molecules that play a role in the maintenance of Foxp3 cellular pools. Indeed, significantly higher STUB1/USP7 ratios were detected in NOD Tregs, both at basal conditions and after stimulation, compared to B6 and BALB/c Tregs. Moreover, the addition of a proteasome inhibitor to cell cultures conferred on NOD Tregs the ability to retain Foxp3 expression. Herein, we provide evidence indicating a differential expression of SOCS3, GRAIL, and STUB1/USP7 in Tregs from NOD mice, factors known to be involved in IL-2R signaling and to affect Foxp3 stability. These findings add to the current knowledge of the immunobiology of Tregs and may be related to the known insufficiency of Tregs from NOD mice to maintain self-tolerance.
Affiliation(s)
- Gloria J Godoy
- Centro de Investigaciones en Bioquímica Clínica e Inmunología, CONICET, Córdoba, Argentina; Departamento de Bioquímica Clínica, Facultad de Ciencias Químicas, Universidad Nacional de Córdoba, Córdoba, Argentina
- Carolina Olivera
- Centro de Investigaciones en Bioquímica Clínica e Inmunología, CONICET, Córdoba, Argentina; Departamento de Bioquímica Clínica, Facultad de Ciencias Químicas, Universidad Nacional de Córdoba, Córdoba, Argentina
- Daniela A Paira
- Centro de Investigaciones en Bioquímica Clínica e Inmunología, CONICET, Córdoba, Argentina; Departamento de Bioquímica Clínica, Facultad de Ciencias Químicas, Universidad Nacional de Córdoba, Córdoba, Argentina
- Florencia C Salazar
- Centro de Investigaciones en Bioquímica Clínica e Inmunología, CONICET, Córdoba, Argentina; Departamento de Bioquímica Clínica, Facultad de Ciencias Químicas, Universidad Nacional de Córdoba, Córdoba, Argentina
- Yamile Ana
- Centro de Investigaciones en Bioquímica Clínica e Inmunología, CONICET, Córdoba, Argentina; Departamento de Bioquímica Clínica, Facultad de Ciencias Químicas, Universidad Nacional de Córdoba, Córdoba, Argentina
- Cinthia C Stempin
- Centro de Investigaciones en Bioquímica Clínica e Inmunología, CONICET, Córdoba, Argentina; Departamento de Bioquímica Clínica, Facultad de Ciencias Químicas, Universidad Nacional de Córdoba, Córdoba, Argentina
- Ruben D Motrich
- Centro de Investigaciones en Bioquímica Clínica e Inmunología, CONICET, Córdoba, Argentina; Departamento de Bioquímica Clínica, Facultad de Ciencias Químicas, Universidad Nacional de Córdoba, Córdoba, Argentina
- Virginia E Rivero
- Centro de Investigaciones en Bioquímica Clínica e Inmunología, CONICET, Córdoba, Argentina; Departamento de Bioquímica Clínica, Facultad de Ciencias Químicas, Universidad Nacional de Córdoba, Córdoba, Argentina
16
Ryan FJ, Drew DP, Douglas C, Leong LEX, Moldovan M, Lynn M, Fink N, Sribnaia A, Penttila I, McPhee AJ, Collins CT, Makrides M, Gibson RA, Rogers GB, Lynn DJ. Changes in the Composition of the Gut Microbiota and the Blood Transcriptome in Preterm Infants at Less than 29 Weeks Gestation Diagnosed with Bronchopulmonary Dysplasia. mSystems 2019; 4:e00484-19. [PMID: 31662429 PMCID: PMC6819732 DOI: 10.1128/msystems.00484-19] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 08/08/2019] [Accepted: 10/09/2019] [Indexed: 12/21/2022]
Abstract
Bronchopulmonary dysplasia (BPD) is a common chronic lung condition in preterm infants that results in abnormal lung development and leads to considerable morbidity and mortality, making BPD one of the most common complications of preterm birth. We employed RNA sequencing and 16S rRNA gene sequencing to profile gene expression in blood and the composition of the fecal microbiota in infants born at <29 weeks gestational age and diagnosed with BPD in comparison to those of preterm infants that were not diagnosed with BPD. 16S rRNA gene sequencing, performed longitudinally on 255 fecal samples collected from 50 infants in the first months of life, identified significant differences in the relative levels of abundance of Klebsiella, Salmonella, Escherichia/Shigella, and Bifidobacterium in the BPD infants in a manner that was birth mode dependent. Transcriptome sequencing (RNA-Seq) analysis revealed that more than 400 genes were upregulated in infants with BPD. Genes upregulated in BPD infants were significantly enriched for functions related to red blood cell development and oxygen transport, while several immune-related pathways were downregulated. We also identified a gene expression signature consistent with an enrichment of immunosuppressive CD71+ early erythroid cells in infants with BPD. Intriguingly, genes that were correlated in their expression with the relative abundances of specific taxa in the microbiota were significantly enriched for roles in the immune system, suggesting that changes in the microbiota might influence immune gene expression systemically.
IMPORTANCE Bronchopulmonary dysplasia (BPD) is a serious inflammatory condition of the lung and is the most common complication associated with preterm birth. A large body of evidence now suggests that the gut microbiota can influence immunity and inflammation systemically; however, the role of the gut microbiota in BPD has not been evaluated to date. Here, we report that there are significant differences in the gut microbiota of infants born at <29 weeks gestation and subsequently diagnosed with BPD, which are particularly pronounced when infants are stratified by birth mode. We also show that erythroid and immune gene expression levels are significantly altered in BPD infants. Interestingly, we identified an association between the composition of the microbiota and immune gene expression in blood in early life. Together, these findings suggest that the composition of the microbiota may influence the risk of developing BPD and, more generally, may shape systemic immune gene expression.
Affiliation(s)
- Feargal J Ryan
- Precision Medicine Theme, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- Damian P Drew
- Precision Medicine Theme, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- College of Medicine and Public Health, Flinders University, Bedford Park, South Australia, Australia
- Chloe Douglas
- Precision Medicine Theme, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- SAHMRI Women and Kids, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- Adelaide Medical School, The University of Adelaide, Adelaide, South Australia, Australia
- Lex E X Leong
- Precision Medicine Theme, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- College of Medicine and Public Health, Flinders University, Bedford Park, South Australia, Australia
- Max Moldovan
- Precision Medicine Theme, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- Miriam Lynn
- Precision Medicine Theme, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- Naomi Fink
- Precision Medicine Theme, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- SAHMRI Women and Kids, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- Adelaide Medical School, The University of Adelaide, Adelaide, South Australia, Australia
- Anastasia Sribnaia
- Precision Medicine Theme, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- Irmeli Penttila
- SAHMRI Women and Kids, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- Adelaide Medical School, The University of Adelaide, Adelaide, South Australia, Australia
- Andrew J McPhee
- SAHMRI Women and Kids, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- Neonatal Medicine, Women's and Children's Hospital, North Adelaide, South Australia, Australia
- Carmel T Collins
- SAHMRI Women and Kids, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- Adelaide Medical School, The University of Adelaide, Adelaide, South Australia, Australia
- Maria Makrides
- SAHMRI Women and Kids, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- Adelaide Medical School, The University of Adelaide, Adelaide, South Australia, Australia
- Robert A Gibson
- SAHMRI Women and Kids, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- School of Agriculture, Food, and Wine, The University of Adelaide, Adelaide, South Australia, Australia
- Geraint B Rogers
- Precision Medicine Theme, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- College of Medicine and Public Health, Flinders University, Bedford Park, South Australia, Australia
- David J Lynn
- Precision Medicine Theme, South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
- College of Medicine and Public Health, Flinders University, Bedford Park, South Australia, Australia
17
Zeng P, Hao X, Zhou X. Pleiotropic mapping and annotation selection in genome-wide association studies with penalized Gaussian mixture models. Bioinformatics 2019; 34:2797-2807. [PMID: 29635306 DOI: 10.1093/bioinformatics/bty204] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 09/26/2017] [Accepted: 04/02/2018] [Indexed: 12/11/2022]
Abstract
Motivation: Genome-wide association studies (GWASs) have identified many genetic loci associated with complex traits. A substantial fraction of these identified loci is associated with multiple traits, a phenomenon known as pleiotropy. Identification of pleiotropic associations can help characterize the genetic relationship among complex traits and can facilitate our understanding of disease etiology. Effective pleiotropic association mapping requires the development of statistical methods that can jointly model multiple traits with genome-wide single nucleotide polymorphisms (SNPs) together.
Results: We develop a joint modeling method, which we refer to as the integrative MApping of Pleiotropic association (iMAP). iMAP models summary statistics from GWASs, uses a multivariate Gaussian distribution to account for phenotypic correlation, simultaneously infers genome-wide SNP association patterns using mixture modeling and has the potential to reveal causal relationships between traits. Importantly, iMAP integrates a large number of SNP functional annotations to substantially improve association mapping power, and, with a sparsity-inducing penalty, is capable of selecting informative annotations from a large, potentially non-informative set. To enable scalable inference of iMAP to association studies with hundreds of thousands of individuals and millions of SNPs, we develop an efficient expectation maximization algorithm based on an approximate penalized regression algorithm. With simulations and comparisons to existing methods, we illustrate the benefits of iMAP in terms of both high association mapping power and accurate estimation of genome-wide SNP association patterns. Finally, we apply iMAP to perform a joint analysis of 48 traits from 31 GWAS consortia together with 40 tissue-specific SNP annotations generated from the Roadmap Project.
Availability and implementation: iMAP is freely available at http://www.xzlab.org/software.html.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ping Zeng
- Department of Epidemiology and Biostatistics, Xuzhou Medical University, Xuzhou, Jiangsu, China; Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA; Center for Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Xingjie Hao
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA; Center for Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA; Center for Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
18
Madireddy L, Patsopoulos NA, Cotsapas C, Bos SD, Beecham A, McCauley J, Kim K, Jia X, Santaniello A, Caillier SJ, Andlauer TFM, Barcellos LF, Berge T, Bernardinelli L, Martinelli-Boneschi F, Booth DR, Briggs F, Celius EG, Comabella M, Comi G, Cree BAC, D’Alfonso S, Dedham K, Duquette P, Dardiotis E, Esposito F, Fontaine B, Gasperi C, Goris A, Dubois B, Gourraud PA, Hadjigeorgiou G, Haines J, Hawkins C, Hemmer B, Hintzen R, Horakova D, Isobe N, Kalra S, Kira JI, Khalil M, Kockum I, Lill CM, Lincoln M, Luessi F, Martin R, Oturai A, Palotie A, Pericak-Vance MA, Henry R, Saarela J, Ivinson A, Olsson T, Taylor BV, Stewart GJ, Harbo HF, Compston A, Hauser SL, Hafler DA, Zipp F, De Jager P, Sawcer S, Oksenberg JR, Baranzini SE. A systems biology approach uncovers cell-specific gene regulatory effects of genetic associations in multiple sclerosis. Nat Commun 2019; 10:2236. [PMID: 31110181 PMCID: PMC6527683 DOI: 10.1038/s41467-019-09773-y] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 08/07/2018] [Accepted: 03/26/2019] [Indexed: 02/02/2023]
Abstract
Genome-wide association studies (GWAS) have identified more than 50,000 unique associations with common human traits. While this represents a substantial step forward, establishing the biology underlying these associations has proven extremely difficult. Even determining which cell types and which particular gene(s) are relevant continues to be a challenge. Here, we conduct a cell-specific pathway analysis of the latest GWAS in multiple sclerosis (MS), which had analyzed a total of 47,351 cases and 68,284 healthy controls and found more than 200 non-MHC genome-wide associations. Our analysis identifies pan immune cell as well as cell-specific susceptibility genes in T cells, B cells and monocytes. Finally, genotype-level data from 2,370 patients and 412 controls is used to compute intra-individual and cell-specific susceptibility pathways that offer a biological interpretation of the individual genetic risk to MS. This approach could be adopted in any other complex trait for which genome-wide data is available.
19
Najafi A, Janghorbani S, Motahari SA, Fatemizadeh E. Statistical Association Mapping of Population-Structured Genetic Data. IEEE/ACM Trans Comput Biol Bioinform 2019; 16:638-649. [PMID: 29990264 DOI: 10.1109/tcbb.2017.2786239] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/08/2023]
Abstract
Association mapping of genetic diseases has attracted extensive research interest in recent years. However, most of the methodologies introduced so far suffer from spurious inference of the associated sites due to population inhomogeneities. In this paper, we introduce a statistical framework to compensate for this shortcoming by equipping the current methodologies with a state-of-the-art clustering algorithm widely used in population genetics applications. The proposed framework jointly infers the disease-associated factors and the hidden population structures. In this regard, a Markov chain Monte Carlo (MCMC) procedure has been employed to assess the posterior probability distribution of the model parameters. We have implemented our proposed framework in a software package whose performance is extensively evaluated on a number of synthetic datasets and compared to some of the well-known existing methods such as STRUCTURE. It has been shown that in extreme scenarios, up to 10-15 percent improvement in inference accuracy is achieved with a moderate increase in computational complexity.
20
Zhu X, Stephens M. Large-scale genome-wide enrichment analyses identify new trait-associated genes and pathways across 31 human phenotypes. Nat Commun 2018; 9:4361. [PMID: 30341297 PMCID: PMC6195536 DOI: 10.1038/s41467-018-06805-x] [Citation(s) in RCA: 54] [Impact Index Per Article: 7.7] [Received: 07/27/2018] [Accepted: 09/29/2018] [Indexed: 12/27/2022]
Abstract
Genome-wide association studies (GWAS) aim to identify genetic factors associated with phenotypes. Standard analyses test variants for associations individually. However, variant-level associations are hard to identify and can be difficult to interpret biologically. Enrichment analyses help address both problems by targeting sets of biologically related variants. Here we introduce a new model-based enrichment method that requires only GWAS summary statistics. Applying this method to interrogate 4,026 gene sets in 31 human phenotypes identifies many previously-unreported enrichments, including enrichments of endochondral ossification pathway for height, NFAT-dependent transcription pathway for rheumatoid arthritis, brain-related genes for coronary artery disease, and liver-related genes for Alzheimer’s disease. A key feature of our method is that inferred enrichments automatically help identify new trait-associated genes. For example, accounting for enrichment in lipid transport genes highlights association between MTTP and low-density lipoprotein levels, whereas conventional analyses of the same data found no significant variants near this gene. In genome-wide association studies, variant-level associations are hard to identify and can be difficult to interpret biologically. Here, the authors develop a new model-based enrichment analysis method, and apply it to identify new associated genes, pathways and tissues across 31 human phenotypes.
Affiliation(s)
- Xiang Zhu
- Department of Statistics, Stanford University, Stanford, 94305, CA, USA; Department of Statistics, The University of Chicago, Chicago, 60637, IL, USA
- Matthew Stephens
- Department of Statistics, The University of Chicago, Chicago, 60637, IL, USA; Department of Human Genetics, The University of Chicago, Chicago, 60637, IL, USA
21
Zhu X, Stephens M. Bayesian large-scale multiple regression with summary statistics from genome-wide association studies. Ann Appl Stat 2017; 11:1561-1592. [PMID: 29399241 PMCID: PMC5796536 DOI: 10.1214/17-aoas1046] [Citation(s) in RCA: 84] [Impact Index Per Article: 10.5] [Indexed: 07/23/2023]
Abstract
Bayesian methods for large-scale multiple regression provide attractive approaches to the analysis of genome-wide association studies (GWAS). For example, they can estimate heritability of complex traits, allowing for both polygenic and sparse models; and by incorporating external genomic data into the priors, they can increase power and yield new biological insights. However, these methods require access to individual genotypes and phenotypes, which are often not easily available. Here we provide a framework for performing these analyses without individual-level data. Specifically, we introduce a "Regression with Summary Statistics" (RSS) likelihood, which relates the multiple regression coefficients to univariate regression results that are often easily available. The RSS likelihood requires estimates of correlations among covariates (SNPs), which also can be obtained from public databases. We perform Bayesian multiple regression analysis by combining the RSS likelihood with previously proposed prior distributions, sampling posteriors by Markov chain Monte Carlo. In a wide range of simulations RSS performs similarly to analyses using the individual data, both for estimating heritability and detecting associations. We apply RSS to a GWAS of human height that contains 253,288 individuals typed at 1.06 million SNPs, for which analyses of individual-level data are practically impossible. Estimates of heritability (52%) are consistent with, but more precise, than previous results using subsets of these data. We also identify many previously unreported loci that show evidence for association with height in our analyses. Software is available at https://github.com/stephenslab/rss.
22
Yang J, Fritsche LG, Zhou X, Abecasis G. A Scalable Bayesian Method for Integrating Functional Information in Genome-wide Association Studies. Am J Hum Genet 2017; 101:404-416. [PMID: 28844487 PMCID: PMC5590971 DOI: 10.1016/j.ajhg.2017.08.002] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 02/14/2017] [Accepted: 08/03/2017] [Indexed: 11/17/2022]
Abstract
Genome-wide association studies (GWASs) have identified many complex loci. However, most loci reside in noncoding regions and have unknown biological functions. Integrative analysis that incorporates known functional information into GWASs can help elucidate the underlying biological mechanisms and prioritize important functional variants. Hence, we develop a flexible Bayesian variable selection model with efficient computational techniques for such integrative analysis. Different from previous approaches, our method models the effect-size distribution and probability of causality for variants with different annotations and jointly models genome-wide variants to account for linkage disequilibrium (LD), thus prioritizing associations based on the quantification of the annotations and allowing for multiple associated variants per locus. Our method dramatically improves both computational speed and posterior sampling convergence by taking advantage of the block-wise LD structures in human genomes. In simulations, our method accurately quantifies the functional enrichment and performs more powerfully for prioritizing the true associations than alternative methods, where the power gain is especially apparent when multiple associated variants in LD reside in the same locus. We applied our method to an in-depth GWAS of age-related macular degeneration with 33,976 individuals and 9,857,286 variants. We find the strongest enrichment for causality among non-synonymous variants (54× more likely to be causal, 1.4× larger effect sizes) and variants in transcription, repressed Polycomb, and enhancer regions, as well as identify five additional candidate loci beyond the 32 known AMD risk loci. In conclusion, our method is shown to efficiently integrate functional information in GWASs, helping identify functional associated-variants and underlying biology.
Affiliation(s)
- Jingjing Yang
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, 1415 Washington Heights, Ann Arbor, MI 48109, USA
- Lars G Fritsche
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, 1415 Washington Heights, Ann Arbor, MI 48109, USA; K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7491 Trondheim, Norway
- Xiang Zhou
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, 1415 Washington Heights, Ann Arbor, MI 48109, USA.
- Gonçalo Abecasis
- Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, 1415 Washington Heights, Ann Arbor, MI 48109, USA.
|
23
|
Dennis J, Medina-Rivera A, Truong V, Antounians L, Zwingerman N, Carrasco G, Strug L, Wells P, Trégouët DA, Morange PE, Wilson MD, Gagnon F. Leveraging cell type specific regulatory regions to detect SNPs associated with tissue factor pathway inhibitor plasma levels. Genet Epidemiol 2017; 41:455-466. [PMID: 28421636 DOI: 10.1002/gepi.22049] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/19/2016] [Revised: 03/07/2017] [Accepted: 03/14/2017] [Indexed: 11/10/2022]
Abstract
Tissue factor pathway inhibitor (TFPI) regulates the formation of intravascular blood clots, which manifest clinically as ischemic heart disease, ischemic stroke, and venous thromboembolism (VTE). TFPI plasma levels are heritable, but the genetics underlying TFPI plasma level variability are poorly understood. Herein we report the first genome-wide association scan (GWAS) of TFPI plasma levels, conducted in 251 individuals from five extended French-Canadian families ascertained on VTE. To improve discovery, we also applied a hypothesis-driven (HD) GWAS approach that prioritized single nucleotide polymorphisms (SNPs) in (1) hemostasis pathway genes, and (2) regulatory regions of vascular endothelial cells (ECs), which are among the highest expressers of TFPI. Our GWAS identified 131 SNPs with suggestive evidence of association (P-value < 5 × 10⁻⁸), but no SNPs reached the genome-wide threshold for statistical significance. Hemostasis pathway genes were not enriched for TFPI plasma level associated SNPs (global hypothesis test P-value = 0.147), but EC regulatory regions contained more TFPI plasma level associated SNPs than expected by chance (global hypothesis test P-value = 0.046). We therefore stratified our genome-wide SNPs, prioritizing those in EC regulatory regions via stratified false discovery rate (sFDR) control, and reranked the SNPs by q-value. The minimum q-value was 0.27, and the top-ranked SNPs did not show association evidence in the MARTHA replication sample of 1,033 unrelated VTE cases. Although this study did not result in new loci for TFPI, our work lays out a strategy to utilize epigenomic data in prioritization schemes for future GWAS.
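The stratify-then-rerank step described in this abstract can be illustrated with a minimal sketch. It assumes Benjamini-Hochberg q-values computed separately within a prioritized and a non-prioritized stratum; the study's actual sFDR procedure may differ. The function names and the p-values are hypothetical and chosen only for illustration.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg q-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def stratified_qvalues(pvalues, in_priority_stratum):
    """Compute q-values within each stratum; return them in input order."""
    qvalues = [None] * len(pvalues)
    for flag in (True, False):
        members = [i for i, f in enumerate(in_priority_stratum) if f == flag]
        stratum_q = bh_qvalues([pvalues[i] for i in members])
        for i, q in zip(members, stratum_q):
            qvalues[i] = q
    return qvalues


# Hypothetical p-values; the first three SNPs sit in the prioritized stratum
# (for example, SNPs inside regulatory regions).
pvals = [0.001, 0.02, 0.04, 0.012, 0.30, 0.50, 0.80]
flags = [True, True, True, False, False, False, False]

q = stratified_qvalues(pvals, flags)
ranking = sorted(range(len(pvals)), key=lambda i: q[i])  # rerank SNPs by q-value
```

In this toy input, the SNP at index 1 (p = 0.02, prioritized) ends up ranked ahead of the SNP at index 3 (p = 0.012, not prioritized), which is the intended effect of stratification: enrichment in the prioritized stratum lowers its q-values relative to a pooled analysis.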
Affiliation(s)
- Jessica Dennis
- Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
- Alejandra Medina-Rivera
- Program in Genetics and Genome Biology, the Hospital for Sick Children, Toronto, Canada; Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Querétaro, México
- Vinh Truong
- Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
- Lina Antounians
- Program in Genetics and Genome Biology, the Hospital for Sick Children, Toronto, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Canada
- Nora Zwingerman
- Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
- Giovana Carrasco
- Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Querétaro, México
- Lisa Strug
- Program in Genetics and Genome Biology, the Hospital for Sick Children, Toronto, Canada; Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
- Phil Wells
- Ottawa Hospital Research Institute, Ottawa, Canada
- David-Alexandre Trégouët
- Sorbonne Universités, UPMC Univ Paris 06, Paris, France; INSERM, UMR_S 1166, Paris, France; ICAN Institute for Cardiometabolism and Nutrition, Paris, France
- Pierre-Emmanuel Morange
- INSERM, UMR_S 1062, Marseille, France; Inra, UMR_INRA 1260, Marseille, France; Aix Marseille Université, Marseille, France
- Michael D Wilson
- Program in Genetics and Genome Biology, the Hospital for Sick Children, Toronto, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Canada; Heart & Stroke Richard Lewar Centre of Excellence in Cardiovascular Research, Toronto, Canada
- France Gagnon
- Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
|
24
|
Li Q, Yu M, Wang S. A Statistical Framework for Pathway and Gene Identification from Integrative Analysis. J Multivar Anal 2017; 156:1-17. [PMID: 28943673 PMCID: PMC5606168 DOI: 10.1016/j.jmva.2016.12.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022]
Abstract
In the era of big data, integrative analyses that pool data from different sources are now extensively conducted in order to improve performance. Among many interesting applications, genomics research is an area where integrative methods have become popular tools to identify prognostic biomarkers for various diseases. In this paper, we propose such a framework for pathway and gene identification. Our method employs a hierarchical decomposition of genes' effects followed by a proper regularization to identify important pathways and genes across multiple studies. Asymptotic theories are provided to show that our method is both pathway and gene selection consistent. More importantly, we explicitly show that pathway selection consistency needs milder statistical conditions than gene selection consistency, as it allows false positives and negatives at the gene selection level. Finite-sample performance of our method is shown to be superior to that of other ad hoc methods in various simulation studies. We further apply our method to analyze five cardiovascular disease studies. Our method is intrinsically a general method for group-wise and element-wise selection in integrative analysis, and can have other applications beyond genomic research.
Affiliation(s)
- Quefeng Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27517, USA. Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, NC 27709, USA
- Menggang Yu
- Department of Biostatistics and Medical Informatics, University of Wisconsin at Madison, Madison, WI 53792, USA
- Sijian Wang
- Department of Biostatistics and Medical Informatics, University of Wisconsin at Madison, Madison, WI 53792, USA
|
25
|
Zhang M, Mu H, Lv H, Duan L, Shang Z, Li J, Jiang Y, Zhang R. Integrative analysis of genome-wide association studies and gene expression analysis identifies pathways associated with rheumatoid arthritis. Oncotarget 2016; 7:8580-9. [PMID: 26885899 PMCID: PMC4890988 DOI: 10.18632/oncotarget.7390] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 10/22/2015] [Accepted: 01/28/2016] [Indexed: 11/25/2022]
Abstract
Rheumatoid arthritis (RA) is a complex, systemic autoimmune disease influenced by both genetic and environmental factors. Pathway analyses based on a single data type, such as microarray data or SNP data, have successfully revealed some biological pathways associated with RA. However, we found that pathway analysis based on a single data type provides only a limited understanding of the pathogenesis of RA. Gene-disease associations can arise at several levels, such as genotype and gene expression. Therefore, an integrative analysis method that combines multiple levels of evidence can identify pathway associations more precisely and comprehensively. In this study, we performed a pathway analysis integrating GWAS and gene expression analysis to detect RA-related pathways. The integrative analysis identified 28 pathways associated with RA. Among these, 18 pathways were also found by both GWAS and gene expression analysis, and 7 are novel RA-related pathways, such as the B cell receptor signaling pathway, the Toll-like receptor signaling pathway, and Fc gamma R-mediated phagocytosis. Compared with pathway analyses using only one type of genomic data, integrative analysis increased the power to identify real associations and provided more stable and accurate results. We believe these results will contribute to future genetic studies of RA pathogenesis and may promote the development of new therapeutic strategies targeting these pathways.
Affiliation(s)
- Mingming Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Hongbo Mu
- College of Science, Northeast Forestry University, Harbin, China
- Hongchao Lv
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Lian Duan
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Zhenwei Shang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Jin Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yongshuai Jiang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Ruijie Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
|
26
|
Abstract
Genetic and cellular studies of type 1 diabetes in patients and in the nonobese diabetic mouse model of type 1 diabetes point to an imbalance between effector T cells and regulatory T cells (Tregs) as a driver of the disease. The imbalance may arise as a consequence of genetically encoded defects in thymic deletion of islet antigen-specific T cells, induction of islet antigen-specific thymic Tregs, unfavorable tissue environment for peripheral Treg induction, and failure of islet antigen-specific Tregs to survive in the inflamed islets secondary to insufficient IL-2 signals. These understandings are the foundation for rationalized design of new therapeutic interventions to restore the balance by selectively targeting effector T cells and boosting Tregs.
Affiliation(s)
- Allyson Spence
- Department of Surgery and UCSF Diabetes Center, University of California, 513 Parnassus HSE-520, Box 0780, San Francisco, CA, 94143, USA
- Qizhi Tang
- Department of Surgery and UCSF Diabetes Center, University of California, 513 Parnassus HSE-520, Box 0780, San Francisco, CA, 94143, USA.
|
27
|
Incorporating Functional Annotations for Fine-Mapping Causal Variants in a Bayesian Framework Using Summary Statistics. Genetics 2016; 204:933-958. [PMID: 27655946 DOI: 10.1534/genetics.116.188953] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 03/05/2016] [Accepted: 09/07/2016] [Indexed: 12/21/2022]
Abstract
Functional annotations have been shown to improve both the discovery power and fine-mapping accuracy in genome-wide association studies. However, the optimal strategy to incorporate the large number of existing annotations is still not clear. In this study, we propose a Bayesian framework to incorporate functional annotations in a systematic manner. We compute the maximum a posteriori solution and use cross validation to find the optimal penalty parameters. By extending our previous fine-mapping method CAVIARBF into this framework, we require only summary statistics as input. We also derived an exact calculation of Bayes factors using summary statistics for quantitative traits, which is necessary when a large proportion of trait variance is explained by the variants of interest, such as in fine mapping expression quantitative trait loci (eQTL). We compared the proposed method with PAINTOR using different strategies to combine annotations. Simulation results show that the proposed method achieves the best accuracy in identifying causal variants among the different strategies and methods compared. We also find that for annotations with moderate effects from a large annotation pool, screening annotations individually and then combining the top annotations can produce overly optimistic results. We applied these methods on two real data sets: a meta-analysis result of lipid traits and a cis-eQTL study of normal prostate tissues. For the eQTL data, incorporating annotations significantly increased the number of potential causal variants with high probabilities.
|
28
|
Pathway-based Genome-wide Association Studies Reveal the Association Between Growth Factor Activity and Inflammatory Bowel Disease. Inflamm Bowel Dis 2016; 22:1540-51. [PMID: 27104816 DOI: 10.1097/mib.0000000000000785] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 12/13/2022]
Abstract
BACKGROUND The inflammatory bowel diseases known as Crohn's disease (CD) and ulcerative colitis (UC) are related autoimmune conditions with a complex etiology composed of genetic and environmental factors. Genetic studies have revealed 200 susceptibility loci for inflammatory bowel diseases, but these only account for a small fraction of the genetic heritability of the disease. We employed pathway-based approaches to identify genes that cooperatively make contributions to the genetic etiology of CD. METHODS We exploited the largest CD dataset (20,000 cases + 28,000 controls) and UC dataset (17,000 cases + 33,500 controls) to date. We conducted a meta-analysis of 5 CD cohorts of European ancestry using 3 pathway-based approaches and further performed replication studies in an independent cohort genotyped on the Immunochip and in another pediatric cohort of European ancestry. Similar meta-analysis was performed for UC cohorts. RESULTS In addition to the multiple immune-related pathways that have been implicated in the genetic etiology of inflammatory bowel diseases before, we found significant associations involving genes in growth factor signaling for CD. This result was replicated in 2 independent cohorts of European ancestry. This association with growth factor activity is not unique to CD. We found a similar significant association with UC cohorts. CONCLUSIONS Our findings suggest that genes involved in pathways of growth factor signaling may make joint contributions to the etiology of CD and UC, providing novel insight into the genetic mechanisms of these diseases.
|
29
|
de Almeida Santana MH, Junior GAO, Cesar ASM, Freua MC, da Costa Gomes R, da Luz E Silva S, Leme PR, Fukumasu H, Carvalho ME, Ventura RV, Coutinho LL, Kadarmideen HN, Ferraz JBS. Copy number variations and genome-wide associations reveal putative genes and metabolic pathways involved with the feed conversion ratio in beef cattle. J Appl Genet 2016; 57:495-504. [PMID: 27001052 DOI: 10.1007/s13353-016-0344-7] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 09/29/2015] [Revised: 01/20/2016] [Accepted: 03/02/2016] [Indexed: 10/22/2022]
Abstract
The use of genome-wide association results combined with other genomic approaches may uncover genes and metabolic pathways related to complex traits. In this study, the phenotypic and genotypic data of 1475 Nellore (Bos indicus) cattle and 941,033 single nucleotide polymorphisms (SNPs) were used for a genome-wide association study (GWAS) and copy number variation (CNV) analysis in order to identify candidate genes and putative pathways involved with the feed conversion ratio (FCR). The GWAS was based on the Bayes B approach, analyzing genomic windows with multiple regression models to estimate the proportion of genetic variance explained by each window. The CNVs were detected with PennCNV software using the log R ratio and B allele frequency data. CNV regions (CNVRs) were identified with CNVRuler, and a linear regression was used to associate CNVRs with the FCR. Functional annotation of associated genomic regions was performed with the Database for Annotation, Visualization and Integrated Discovery (DAVID) and the metabolic pathways were obtained from the Kyoto Encyclopedia of Genes and Genomes (KEGG). We identified five genomic windows distributed over chromosomes 4, 6, 7, 8, and 24 that explain 12% of the total genetic variance for FCR, and detected 12 CNVRs (chromosomes 1, 5, 7, 10, and 12) significantly associated [false discovery rate (FDR) < 0.05] with the FCR. Significant genomic regions (GWAS and CNV) harbor candidate genes involved in pathways related to energetic, lipid, and protein metabolism. The metabolic pathways found in this study are related to processes directly connected to feed efficiency in beef cattle. Even though the two approaches (GWAS and CNV) found different genomic regions and genes, the metabolic processes they covered were related to each other. Therefore, the two approaches complement each other and lead to a better understanding of the FCR.
Affiliation(s)
- Miguel Henrique de Almeida Santana
- Faculty of Health and Medical Sciences, University of Copenhagen, Grønnegårdsvej 7, 1870, Frederiksberg, Denmark; Faculdade de Zootecnia e Engenharia de Alimentos, University of São Paulo, Duque de Caxias Norte, 225, 13635-900, Pirassununga, Brazil
- Mateus Castelani Freua
- Faculdade de Zootecnia e Engenharia de Alimentos, University of São Paulo, Duque de Caxias Norte, 225, 13635-900, Pirassununga, Brazil
- Rodrigo da Costa Gomes
- Empresa Brasileira de Pesquisa Agropecuária, CNPGC/EMBRAPA, BR 262 km 4, 79002-970, Campo Grande, Brazil
- Saulo da Luz E Silva
- Faculdade de Zootecnia e Engenharia de Alimentos, University of São Paulo, Duque de Caxias Norte, 225, 13635-900, Pirassununga, Brazil
- Paulo Roberto Leme
- Faculdade de Zootecnia e Engenharia de Alimentos, University of São Paulo, Duque de Caxias Norte, 225, 13635-900, Pirassununga, Brazil
- Heidge Fukumasu
- Faculdade de Zootecnia e Engenharia de Alimentos, University of São Paulo, Duque de Caxias Norte, 225, 13635-900, Pirassununga, Brazil
- Minos Esperândio Carvalho
- Faculdade de Zootecnia e Engenharia de Alimentos, University of São Paulo, Duque de Caxias Norte, 225, 13635-900, Pirassununga, Brazil
- Ricardo Vieira Ventura
- Faculdade de Zootecnia e Engenharia de Alimentos, University of São Paulo, Duque de Caxias Norte, 225, 13635-900, Pirassununga, Brazil; University of Guelph, 50 Stone Road East, Guelph, Ontario, N1G 2W1, Canada
- Luiz Lehmann Coutinho
- Escola Superior de Agricultura Luiz de Queiroz, University of São Paulo, 13418-900, Piracicaba, Brazil
- Haja N Kadarmideen
- Faculty of Health and Medical Sciences, University of Copenhagen, Grønnegårdsvej 7, 1870, Frederiksberg, Denmark
- José Bento Sterman Ferraz
- Faculdade de Zootecnia e Engenharia de Alimentos, University of São Paulo, Duque de Caxias Norte, 225, 13635-900, Pirassununga, Brazil
|
30
|
Li J, Wei Z, Hakonarson H. Application of computational methods in genetic study of inflammatory bowel disease. World J Gastroenterol 2016; 22:949-960. [PMID: 26811639 PMCID: PMC4716047 DOI: 10.3748/wjg.v22.i3.949] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/02/2015] [Revised: 11/04/2015] [Accepted: 11/24/2015] [Indexed: 02/06/2023]
Abstract
Genetic factors play an important role in the etiology of inflammatory bowel disease (IBD). The launch of the genome-wide association study (GWAS) represents a landmark in the genetic study of human complex disease. Concurrently, computational methods have undergone rapid development during the past few years, which has led to the identification of numerous disease susceptibility loci. IBD is one of the successful examples of GWAS and related analyses. A total of 163 genetic loci and multiple signaling pathways have been identified as associated with IBD. Pleiotropic effects were found for many of these loci, and risk prediction models were built based on a broad spectrum of genetic variants. Important gene-gene and gene-environment interactions and key contributions of the gut microbiome are being discovered. Here we review the different types of analyses that have been applied to IBD genetic study, discuss the computational methods for each type of analysis, and summarize the discoveries made in IBD research with the application of these methods.
|
31
|
Reduced interleukin-2 responsiveness impairs the ability of Treg cells to compete for IL-2 in nonobese diabetic mice. Immunol Cell Biol 2016; 94:509-19. [PMID: 26763864 DOI: 10.1038/icb.2016.7] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 11/12/2015] [Revised: 01/06/2016] [Accepted: 01/10/2016] [Indexed: 12/11/2022]
Abstract
Enhancement of regulatory T cell (Treg cell) frequency and function is the goal of many therapeutic strategies aimed at treating type 1 diabetes (T1D). The interleukin-2 (IL-2) pathway, which has been strongly implicated in T1D susceptibility in both humans and mice, is a master regulator of Treg cell homeostasis and function. We investigated how IL-2 pathway defects impact Treg cells in T1D-susceptible nonobese diabetic (NOD) mice in comparison with protected C57BL/6 and NOD congenic mice. NOD Treg cells were reduced in frequency specifically in the lymph nodes and expressed lower levels of CD25 and CD39/CD73 immunosuppressive molecules. In the spleen and blood, Treg cell frequency was preserved through expansion of CD25(low), effector phenotype Treg cells. Reduced CD25 expression led to decreased IL-2 signaling in NOD Treg cells. In vivo, treatment with IL-2-anti-IL-2 antibody complexes led to effective upregulation of suppressive molecules on NOD Treg cells in the spleen and blood, but had reduced efficacy on lymph node Treg cells. In contrast, NOD CD8(+) and CD4(+) effector T cells were not impaired in their response to IL-2 therapy. We conclude that NOD Treg cells have an impaired responsiveness to IL-2 that reduces their ability to compete for a limited supply of IL-2.
|
32
|
Seldin MF. The genetics of human autoimmune disease: A perspective on progress in the field and future directions. J Autoimmun 2015; 64:1-12. [PMID: 26343334 PMCID: PMC4628839 DOI: 10.1016/j.jaut.2015.08.015] [Citation(s) in RCA: 72] [Impact Index Per Article: 7.2] [Received: 06/23/2015] [Accepted: 08/23/2015] [Indexed: 12/18/2022]
Abstract
Progress in defining the genetics of autoimmune disease has been dramatically enhanced by large scale genetic studies. Genome-wide approaches, examining hundreds or for some diseases thousands of cases and controls, have been implemented using high throughput genotyping and appropriate algorithms to provide a wealth of data over the last decade. These studies have identified hundreds of non-HLA loci and have further defined HLA variations that predispose to different autoimmune diseases. These studies to identify genetic risk loci are also complemented by progress in gene expression studies, including definition of expression quantitative trait loci (eQTL) and of various alterations in chromatin structure, including histone marks, DNase I sensitivity, and repressed chromatin regions, as well as transcription factor binding sites. Integration of this information can partially explain why particular variations can alter proclivity to autoimmune phenotypes. Despite our incomplete knowledge base, with only partial definition of hereditary factors and possible functional connections, this progress has facilitated and will continue to facilitate a better understanding of critical pathways and critical changes in immunoregulation. Advances in defining and understanding functional variants can potentially lead to both novel therapeutics and personalized medicine in which therapeutic approaches are chosen based on particular molecular phenotypes and genomic alterations.
Affiliation(s)
- Michael F Seldin
- Department of Biochemistry and Molecular Medicine, University of California, Davis, Tupper Hall Room 4453, Davis, CA 95616, USA; Division of Rheumatology and Allergy, Department of Medicine, University of California, Davis, Tupper Hall Room 4453, Davis, CA 95616, USA.
|
33
|
Crispim AC, Kelly MJ, Guimarães SEF, e Silva FF, Fortes MRS, Wenceslau RR, Moore S. Multi-Trait GWAS and New Candidate Genes Annotation for Growth Curve Parameters in Brahman Cattle. PLoS One 2015; 10:e0139906. [PMID: 26445451 PMCID: PMC4622042 DOI: 10.1371/journal.pone.0139906] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Received: 03/31/2015] [Accepted: 09/18/2015] [Indexed: 12/16/2022]
Abstract
Understanding the genetic architecture of beef cattle growth cannot be limited to genome-wide association studies (GWAS) of body weight at specific ages, but should be extended to the whole growth trajectory over time using a growth curve approach. In such an approach, the parameters that describe growth curves are treated as phenotypes under a GWAS model. Data from 1,255 Brahman cattle that were weighed at birth and at 6, 12, 15, 18, and 24 months of age were analyzed. Parameter estimates from nonlinear models, such as mature weight (A) and maturity rate (K), were used as substitutes for the original body weights in the GWAS analysis. We chose the best nonlinear model to describe the weight-age data, and the estimated parameters were used as phenotypes in a multi-trait GWAS. Our aims were to identify and characterize associated SNP markers, to indicate SNP-derived candidate genes, and to annotate their function as related to growth processes in beef cattle. The Brody model presented the best goodness of fit, and the heritability values for the parameter estimates for mature weight (A) and maturity rate (K) were 0.23 and 0.32, respectively, indicating that these traits can be a feasible alternative when the objective is to change the shape of growth curves within genetic improvement programs. The genetic correlation between A and K was -0.84, indicating that animals with lower mature body weights reached that weight at younger ages. One hundred and sixty-seven (167) and two hundred and sixty-two (262) significant SNPs were associated with A and K, respectively. The annotated genes closest to the most significant SNPs for A had direct biological functions related to muscle development (RAB28), myogenic induction (BTG1), fetal growth (IL2), and body weights (APEX2); K genes were functionally associated with body weight, body height, average daily gain (TMEM18), and skeletal muscle development (SMN1). 
Candidate genes emerging from this GWAS may inform the search for causative mutations that could underpin genomic breeding for improved growth rates.
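The growth-curve parameters used as phenotypes above can be made concrete with a short sketch of the Brody curve in its commonly written form, W(t) = A(1 - B e^(-Kt)), where A is the asymptotic (mature) weight, K the maturity rate, and B an integration constant related to birth weight. The parameter values below are hypothetical and are not taken from the study.

```python
import math


def brody_weight(age_months, A, B, K):
    """Brody growth curve: W(t) = A * (1 - B * exp(-K * t))."""
    return A * (1.0 - B * math.exp(-K * age_months))


# Hypothetical parameter values, chosen only for illustration.
A, B, K = 500.0, 0.93, 0.06

# Weighing ages mentioned in the abstract, in months (birth = 0).
ages = [0, 6, 12, 15, 18, 24]
weights = [brody_weight(t, A, B, K) for t in ages]

for t, w in zip(ages, weights):
    print(f"{t:2d} months: {w:6.1f} kg")
```

Under this form a larger K means the animal approaches A faster, which is why a negative genetic correlation between A and K is read as lighter mature animals maturing earlier.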
Affiliation(s)
- Aline Camporez Crispim
- Department of Animal Science, Universidade Federal de Viçosa, Viçosa, Minas Gerais, Brazil
- Matthew John Kelly
- Queensland Alliance for Agriculture & Food Innovation University of Queensland, Brisbane, Queensland, Australia
- Raphael Rocha Wenceslau
- Animal Science Institute, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Stephen Moore
- Queensland Alliance for Agriculture & Food Innovation University of Queensland, Brisbane, Queensland, Australia
|
34
|
Ningappa M, So J, Glessner J, Ashokkumar C, Ranganathan S, Min J, Higgs BW, Sun Q, Haberman K, Schmitt L, Vilarinho S, Mistry PK, Vockley G, Dhawan A, Gittes GK, Hakonarson H, Jaffe R, Subramaniam S, Shin D, Sindhi R. The Role of ARF6 in Biliary Atresia. PLoS One 2015; 10:e0138381. [PMID: 26379158 PMCID: PMC4574480 DOI: 10.1371/journal.pone.0138381] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Received: 12/15/2014] [Accepted: 01/22/2015] [Indexed: 02/05/2023]
Abstract
Background & Aims Altered extrahepatic bile ducts, gut, and cardiovascular anomalies constitute the variable phenotype of biliary atresia (BA). Methods To identify potential susceptibility loci, Caucasian children, normal (controls) and with BA (cases) at two US centers were compared at >550,000 SNP loci. Systems biology analysis was carried out on the data. In order to validate a key gene identified in the analysis, biliary morphogenesis was evaluated in 2-5-day post-fertilization zebrafish embryos after morpholino-antisense oligonucleotide knockdown of the candidate gene ADP ribosylation factor-6 (ARF6, Mo-arf6). Results Among 39 and 24 cases at centers 1 and 2, respectively, and 1907 controls, which clustered together on principal component analysis, the SNPs rs3126184 and rs10140366 in a 3’ flanking enhancer region for ARF6 demonstrated higher minor allele frequencies (MAF) in each cohort, and 63 combined cases, compared with controls (0.286 vs. 0.131, P = 5.94 × 10⁻⁷, OR 2.66; 0.286 vs. 0.13, P = 5.57 × 10⁻⁷, OR 2.66). Significance was enhanced in 77 total cases, which included 14 additional BA genotyped at rs3126184 only (p = 1.58 × 10⁻², OR = 2.66). Pathway analysis of the 1000 top-ranked SNPs in CHP cases revealed enrichment of genes for EGF regulators (p < 1 × 10⁻⁷), ERK/MAPK and CREB canonical pathways (p < 1 × 10⁻³⁴), and functional networks for cellular development and proliferation (p < 1 × 10⁻⁴⁵), further supporting the role of EGFR-ARF6 signaling in BA. In zebrafish embryos, Mo-arf6 injection resulted in a sparse intrahepatic biliary network, several biliary epithelial cell defects, and poor bile excretion to the gall bladder compared with uninjected embryos. Biliary defects were reproduced with the EGFR-blocker AG1478 alone or with Mo-arf6 at lower doses of each agent and rescued with arf6 mRNA. Conclusions The BA-associated SNPs identify a chromosome 14q21.3 susceptibility locus encompassing the ARF6 gene. 
arf6 knockdown in zebrafish implicates early biliary dysgenesis as a basis for BA, and also suggests a role for EGFR signaling in BA pathogenesis.
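The odds ratio quoted in this abstract can be approximately reproduced from the two minor allele frequencies alone. The sketch below assumes a plain allelic odds ratio with no covariate adjustment; the function name is illustrative, and the study's actual association test is not described in the abstract.

```python
def allelic_odds_ratio(maf_cases: float, maf_controls: float) -> float:
    """Odds of the minor allele in cases divided by the odds in controls.

    Assumption: an unadjusted allelic odds ratio; the study may have used
    a different model.
    """
    if not (0.0 < maf_cases < 1.0 and 0.0 < maf_controls < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    odds_cases = maf_cases / (1.0 - maf_cases)
    odds_controls = maf_controls / (1.0 - maf_controls)
    return odds_cases / odds_controls


# rs3126184 minor allele frequencies as quoted in the abstract (cases, controls)
or_rs3126184 = allelic_odds_ratio(0.286, 0.131)
print(f"{or_rs3126184:.2f}")  # prints 2.66
```

With the frequencies 0.286 and 0.131 this gives about 2.66, in line with the reported value.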
Affiliation(s)
- Mylarappa Ningappa
  - Hillman Center for Pediatric Transplantation of the Children’s Hospital of Pittsburgh of University of Pittsburgh Medical Center (UPMC), Pittsburgh, PA, 15224, United States of America
- Juhoon So
  - Department of Developmental Biology and McGowan Institute of Regenerative Medicine, University of Pittsburgh, Pittsburgh, PA, 15261, United States of America
- Joseph Glessner
  - Center for Applied Genomics of the Children’s Hospital of Philadelphia, Philadelphia, PA, 19104, United States of America
- Chethan Ashokkumar
  - Hillman Center for Pediatric Transplantation of the Children’s Hospital of Pittsburgh of University of Pittsburgh Medical Center (UPMC), Pittsburgh, PA, 15224, United States of America
- Sarangarajan Ranganathan
  - Department of Pathology, Division of Pediatric Pathology, Children’s Hospital of Pittsburgh of UPMC, Pittsburgh, PA, 15224, United States of America
- Jun Min
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92013, United States of America
- Brandon W. Higgs
  - Hillman Center for Pediatric Transplantation of the Children’s Hospital of Pittsburgh of University of Pittsburgh Medical Center (UPMC), Pittsburgh, PA, 15224, United States of America
- Qing Sun
  - Hillman Center for Pediatric Transplantation of the Children’s Hospital of Pittsburgh of University of Pittsburgh Medical Center (UPMC), Pittsburgh, PA, 15224, United States of America
- Kimberly Haberman
  - Hillman Center for Pediatric Transplantation of the Children’s Hospital of Pittsburgh of University of Pittsburgh Medical Center (UPMC), Pittsburgh, PA, 15224, United States of America
- Lori Schmitt
  - Histology Core Laboratory, Children’s Hospital of Pittsburgh of UPMC, Pittsburgh, PA, 15224, United States of America
- Silvia Vilarinho
  - Department of Internal Medicine, Section of Digestive Diseases, Yale University School of Medicine, New Haven, CT, 06510, United States of America
- Pramod K. Mistry
  - Department of Internal Medicine, Section of Digestive Diseases, Yale University School of Medicine, New Haven, CT, 06510, United States of America
- Gerard Vockley
  - Department of Pediatrics and Human Genetics, Children’s Hospital of Pittsburgh of UPMC, Pittsburgh, PA, 15224, United States of America
- Anil Dhawan
  - Paediatric Liver, GI, and Nutrition, King’s College Hospital, London, WC2R 2LS, England
- George K. Gittes
  - Pediatric General and Thoracic Surgery, Children’s Hospital of Pittsburgh of UPMC, Pittsburgh, PA, 15224, United States of America
- Hakon Hakonarson
  - Center for Applied Genomics of the Children’s Hospital of Philadelphia, Philadelphia, PA, 19104, United States of America
- Ronald Jaffe
  - Histology Core Laboratory, Children’s Hospital of Pittsburgh of UPMC, Pittsburgh, PA, 15224, United States of America
- Shankar Subramaniam
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92013, United States of America
- Donghun Shin
  - Department of Developmental Biology and McGowan Institute of Regenerative Medicine, University of Pittsburgh, Pittsburgh, PA, 15261, United States of America
- Rakesh Sindhi
  - Hillman Center for Pediatric Transplantation of the Children’s Hospital of Pittsburgh of University of Pittsburgh Medical Center (UPMC), Pittsburgh, PA, 15224, United States of America
35
Wang W, Mandel J, Bouaziz J, Commenges D, Nabirotchkine S, Chumakov I, Cohen D, Guedj M. A Multi-Marker Genetic Association Test Based on the Rasch Model Applied to Alzheimer's Disease. PLoS One 2015; 10:e0138223. [PMID: 26379234 PMCID: PMC4574966 DOI: 10.1371/journal.pone.0138223] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 03/30/2015] [Accepted: 08/27/2015] [Indexed: 12/28/2022]
Abstract
Results from Genome-Wide Association Studies (GWAS) have shown that the genetic basis of complex traits often includes many genetic variants with small to moderate effects, whose identification remains a challenging problem. In this context, multi-marker analysis at the gene and pathway level can complement traditional point-wise approaches that treat the genetic markers individually. In this paper we propose a novel statistical approach for multi-marker analysis based on the Rasch model. The method summarizes the categorical genotypes of SNPs by a generalized logistic function into a genetic score that can be used for association analysis. Through different sets of simulations, the false-positive rate and power of the proposed approach are compared to a set of existing methods, and the approach shows good performance. The application of the Rasch model to the Alzheimer's Disease (AD) ADNI GWAS dataset also allows a coherent interpretation of the results. Our analysis supports the idea that APOE is a major susceptibility gene for AD. Among the top genes selected by the proposed method, several could be functionally linked to AD. In particular, a pathway analysis of these genes also highlights the metabolism of cholesterol, which is known to play a key role in AD pathogenesis. Interestingly, many of these top genes can be integrated into a hypothetical signalling network.
Affiliation(s)
- Wenjia Wang
  - Pharnext, Issy-les-Moulineaux, Ile de France, France
  - Inserm U897, University of Bordeaux, Bordeaux, Aquitaine, France
- Jonas Mandel
  - Pharnext, Issy-les-Moulineaux, Ile de France, France
- Jan Bouaziz
  - Pharnext, Issy-les-Moulineaux, Ile de France, France
- Daniel Commenges
  - Inserm U897, University of Bordeaux, Bordeaux, Aquitaine, France
- Ilya Chumakov
  - Pharnext, Issy-les-Moulineaux, Ile de France, France
- Daniel Cohen
  - Pharnext, Issy-les-Moulineaux, Ile de France, France
- Mickaël Guedj
  - Pharnext, Issy-les-Moulineaux, Ile de France, France
36
Heinonen MT, Moulder R, Lahesmaa R. New Insights and Biomarkers for Type 1 Diabetes: Review for Scandinavian Journal of Immunology. Scand J Immunol 2015; 82:244-53. [DOI: 10.1111/sji.12338] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 06/21/2015] [Accepted: 06/25/2015] [Indexed: 12/16/2022]
Affiliation(s)
- M. T. Heinonen
  - Turku Centre for Biotechnology; University of Turku; Åbo Akademi University; Turku Finland
- R. Moulder
  - Turku Centre for Biotechnology; University of Turku; Åbo Akademi University; Turku Finland
- R. Lahesmaa
  - Turku Centre for Biotechnology; University of Turku; Åbo Akademi University; Turku Finland
37
Wallace C, Cutler AJ, Pontikos N, Pekalski ML, Burren OS, Cooper JD, García AR, Ferreira RC, Guo H, Walker NM, Smyth DJ, Rich SS, Onengut-Gumuscu S, Sawcer SJ, Ban M, Richardson S, Todd JA, Wicker LS. Dissection of a Complex Disease Susceptibility Region Using a Bayesian Stochastic Search Approach to Fine Mapping. PLoS Genet 2015; 11:e1005272. [PMID: 26106896 PMCID: PMC4481316 DOI: 10.1371/journal.pgen.1005272] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Received: 10/02/2014] [Accepted: 05/12/2015] [Indexed: 12/15/2022]
Abstract
Identification of candidate causal variants in regions associated with risk of common diseases is complicated by linkage disequilibrium (LD) and multiple association signals. Nonetheless, accurate maps of these variants are needed, both to fully exploit detailed cell specific chromatin annotation data to highlight disease causal mechanisms and cells, and for design of the functional studies that will ultimately be required to confirm causal mechanisms. We adapted a Bayesian evolutionary stochastic search algorithm to the fine mapping problem, and demonstrated its improved performance over conventional stepwise and regularised regression through simulation studies. We then applied it to fine map the established multiple sclerosis (MS) and type 1 diabetes (T1D) associations in the IL-2RA (CD25) gene region. For T1D, both stepwise and stochastic search approaches identified four T1D association signals, with the major effect tagged by the single nucleotide polymorphism, rs12722496. In contrast, for MS, the stochastic search found two distinct competing models: a single candidate causal variant, tagged by rs2104286 and reported previously using stepwise analysis; and a more complex model with two association signals, one of which was tagged by the major T1D associated rs12722496 and the other by rs56382813. There is low to moderate LD between rs2104286 and both rs12722496 and rs56382813 (r² ≃ 0.3) and our two SNP model could not be recovered through a forward stepwise search after conditioning on rs2104286. Both signals in the two variant model for MS affect CD25 expression on distinct subpopulations of CD4+ T cells, which are key cells in the autoimmune process. The results support a shared causal variant for T1D and MS. Our study illustrates the benefit of using a purposely designed model search strategy for fine mapping and the advantage of combining disease and protein expression data.
Genetic association studies have identified many DNA sequence variants that associate with disease risk. By exploiting the known correlation that exists between neighbouring variants in the genome, inference can be extended beyond those individual variants tested to identify sets within which a causal variant is likely to reside. However, this correlation, particularly in the presence of multiple disease causing variants in relative proximity, makes disentangling the specific causal variants difficult. Statistical approaches to this fine mapping problem have traditionally taken a stepwise search approach, beginning with the most associated variant in a region, then iteratively attempting to find additional associated variants. We adapted a stochastic search approach that avoids this stepwise process and is explicitly designed for dealing with highly correlated predictors to the fine mapping problem. We showed in simulated data that it outperforms its stepwise counterpart and other variable selection strategies such as the lasso. We applied our approach to understand the association of two immune-mediated diseases to a region on chromosome 10p15. We identified a model for multiple sclerosis containing two variants, neither of which was found through a stepwise search, and functionally linked both of these to the neighbouring candidate gene, IL2RA, in independent data. Our approach can be used to aid fine mapping of other disease-associated regions, which is critical for design of functional follow-up studies required to understand the mechanisms through which genetic variants influence disease.
Affiliation(s)
- Chris Wallace
  - JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, NIHR Biomedical Research Centre, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom; MRC Biostatistics Unit, Cambridge Institute of Public Health, Cambridge, United Kingdom
- Antony J Cutler
  - JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, NIHR Biomedical Research Centre, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
- Nikolas Pontikos
  - JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, NIHR Biomedical Research Centre, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
- Marcin L Pekalski
  - JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, NIHR Biomedical Research Centre, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
- Oliver S Burren
  - JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, NIHR Biomedical Research Centre, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
- Jason D Cooper
  - JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, NIHR Biomedical Research Centre, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
- Arcadio Rubio García
  - JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, NIHR Biomedical Research Centre, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
- Ricardo C Ferreira
  - JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, NIHR Biomedical Research Centre, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
- Hui Guo
  - JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, NIHR Biomedical Research Centre, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom; Centre for Biostatistics, Institute of Population Health, The University of Manchester, Manchester, United Kingdom
- Neil M Walker
  - JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, NIHR Biomedical Research Centre, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
- Deborah J Smyth
  - JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, NIHR Biomedical Research Centre, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
- Stephen S Rich
  - Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia, United States of America; Department of Medicine, Division of Endocrinology, University of Virginia, Charlottesville, Virginia, United States of America
- Suna Onengut-Gumuscu
  - Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia, United States of America; Department of Public Health Sciences, Division of Biostatistics and Epidemiology, University of Virginia, Charlottesville, Virginia, United States of America
- Stephen J Sawcer
  - University of Cambridge, Department of Clinical Neurosciences, Cambridge, United Kingdom
- Maria Ban
  - University of Cambridge, Department of Clinical Neurosciences, Cambridge, United Kingdom
- Sylvia Richardson
  - MRC Biostatistics Unit, Cambridge Institute of Public Health, Cambridge, United Kingdom
- John A Todd
  - JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, NIHR Biomedical Research Centre, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
- Linda S Wicker
  - JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, NIHR Biomedical Research Centre, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
38
Leveraging Multi-ethnic Evidence for Mapping Complex Traits in Minority Populations: An Empirical Bayes Approach. Am J Hum Genet 2015; 96:740-52. [PMID: 25892113 DOI: 10.1016/j.ajhg.2015.03.008] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 11/21/2014] [Accepted: 03/10/2015] [Indexed: 01/21/2023]
Abstract
Elucidating the genetic basis of complex traits and diseases in non-European populations is particularly challenging because US minority populations have been under-represented in genetic association studies. We developed an empirical Bayes approach named XPEB (cross-population empirical Bayes), designed to improve the power for mapping complex-trait-associated loci in a minority population by exploiting information from genome-wide association studies (GWASs) from another ethnic population. Taking as input summary statistics from two GWASs, namely a target GWAS from an ethnic minority population of primary interest and an auxiliary base GWAS (such as a larger GWAS in Europeans), our XPEB approach reprioritizes SNPs in the target population to compute local false-discovery rates. We demonstrated, through simulations, that whenever the base GWAS harbors relevant information, XPEB gains efficiency. Moreover, XPEB has the ability to discard irrelevant auxiliary information, providing a safeguard against inflated false-discovery rates due to genetic heterogeneity between populations. Applied to a blood-lipids study in African Americans, XPEB more than quadrupled the discoveries from the conventional approach, which used a target GWAS alone, bringing the number of significant loci from 14 to 65. Thus, XPEB offers a flexible framework for mapping complex traits in minority populations.
39
Dong C, Wei P, Jian X, Gibbs R, Boerwinkle E, Wang K, Liu X. Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies. Hum Mol Genet 2014; 24:2125-37. [PMID: 25552646 DOI: 10.1093/hmg/ddu733] [Citation(s) in RCA: 807] [Impact Index Per Article: 73.4] [Indexed: 12/19/2022]
Abstract
Accurate deleteriousness prediction for nonsynonymous variants is crucial for distinguishing pathogenic mutations from background polymorphisms in whole exome sequencing (WES) studies. Although many deleteriousness prediction methods have been developed, their prediction results are sometimes inconsistent with each other and their relative merits are still unclear in practical applications. To address these issues, we comprehensively evaluated the predictive performance of 18 current deleteriousness-scoring methods, including 11 function prediction scores (PolyPhen-2, SIFT, MutationTaster, Mutation Assessor, FATHMM, LRT, PANTHER, PhD-SNP, SNAP, SNPs&GO and MutPred), 3 conservation scores (GERP++, SiPhy and PhyloP) and 4 ensemble scores (CADD, PON-P, KGGSeq and CONDEL). We found that FATHMM and KGGSeq had the highest discriminative power among independent scores and ensemble scores, respectively. Moreover, to ensure unbiased performance evaluation of these prediction scores, we manually collected three distinct testing datasets, on which no current prediction scores were tuned. In addition, we developed two new ensemble scores that integrate nine independent scores and allele frequency. Our scores achieved the highest discriminative power compared with all the deleteriousness prediction scores tested and showed low false-positive prediction rate for benign yet rare nonsynonymous variants, which demonstrated the value of combining information from multiple orthologous approaches. Finally, to facilitate variant prioritization in WES studies, we have pre-computed our ensemble scores for 87 347 044 possible variants in the whole-exome and made them publicly available through the ANNOVAR software and the dbNSFP database.
Affiliation(s)
- Chengliang Dong
  - Zilkha Neurogenetic Institute, Biostatistics Division, Department of Preventive Medicine and
- Peng Wei
  - Human Genetics Center, Division of Biostatistics, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA and
- Xueqiu Jian
  - Division of Epidemiology, Human Genetics and Environmental Sciences and
- Richard Gibbs
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
- Eric Boerwinkle
  - Human Genetics Center, Division of Epidemiology, Human Genetics and Environmental Sciences and Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
- Kai Wang
  - Zilkha Neurogenetic Institute, Biostatistics Division, Department of Preventive Medicine and, Department of Psychiatry, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA,
- Xiaoming Liu
  - Human Genetics Center, Division of Epidemiology, Human Genetics and Environmental Sciences and
40
Gusev A, Lee S, Trynka G, Finucane H, Vilhjálmsson B, Xu H, Zang C, Ripke S, Bulik-Sullivan B, Stahl E, Kähler AK, Hultman CM, Purcell SM, McCarroll SA, Daly M, Pasaniuc B, Sullivan PF, Neale BM, Wray NR, Raychaudhuri S, Price AL, Ripke S, Neale B, Corvin A, Walters J, Farh KH, Holmans P, Lee P, Bulik-Sullivan B, Collier D, Huang H, Pers T, Agartz I, Agerbo E, Albus M, Alexander M, Amin F, Bacanu S, Begemann M, Belliveau R, Bene J, Bergen S, Bevilacqua E, Bigdeli T, Black D, Børglum A, Bruggeman R, Buccola N, Buckner R, Byerley W, Cahn W, Cai G, Campion D, Cantor R, Carr V, Carrera N, Catts S, Chambert K, Chan R, Chen R, Chen E, Cheng W, Cheung E, Chong S, Cloninger C, Cohen D, Cohen N, Cormican P, Craddock N, Crowley J, Curtis D, Davidson M, Davis K, Degenhardt F, Del Favero J, DeLisi L, Demontis D, Dikeos D, Dinan T, Djurovic S, Donohoe G, Drapeau E, Duan J, Dudbridge F, Durmishi N, Eichhammer P, Eriksson J, Escott-Price V, Essioux L, Fanous A, Farrell M, Frank J, Franke L, Freedman R, Freimer N, Friedl M, Friedman J, Fromer M, Genovese G, Georgieva L, Gershon E, Giegling I, Giusti-Rodríguez P, Godard S, Goldstein J, Golimbet V, Gopal S, Gratten J, Grove J, de Haan L, Hammer C, Hamshere M, Hansen M, Hansen T, Haroutunian V, Hartmann A, Henskens F, Herms S, Hirschhorn J, Hoffmann P, Hofman A, Hollegaard M, Hougaard D, Ikeda M, Joa I, Julià A, Kahn R, Kalaydjieva L, Karachanak-Yankova S, Karjalainen J, Kavanagh D, Keller M, Kelly B, Kennedy J, Khrunin A, Kim Y, Klovins J, Knowles J, Konte B, Kucinskas V, Kucinskiene Z, Kuzelova-Ptackova H, Kähler A, Laurent C, Keong J, Lee S, Legge S, Lerer B, Li M, Li T, Liang KY, Lieberman J, Limborska S, Loughland C, Lubinski J, Lönnqvist J, Macek M, Magnusson P, Maher B, Maier W, Mallet J, Marsal S, Mattheisen M, Mattingsdal M, McCarley R, McDonald C, McIntosh A, Meier S, Meijer C, Melegh B, Melle I, Mesholam-Gately R, Metspalu A, Michie P, Milani L, Milanova V, Mokrab Y, Morris D, Mors O, Mortensen P, Murphy K, Murray R, Myin-Germeys I, Müller-Myhsok B, Nelis M, Nenadic I, Nertney D, Nestadt G, Nicodemus K, Nikitina-Zake L, Nisenbaum L, Nordin A, O’Callaghan E, O’Dushlaine C, O’Neill F, Oh SY, Olincy A, Olsen L, Van Os J, Pantelis C, Papadimitriou G, Papiol S, Parkhomenko E, Pato M, Paunio T, Pejovic-Milovancevic M, Perkins D, Pietiläinen O, Pimm J, Pocklington A, Powell J, Price A, Pulver A, Purcell S, Quested D, Rasmussen H, Reichenberg A, Reimers M, Richards A, Roffman J, Roussos P, Ruderfer D, Salomaa V, Sanders A, Schall U, Schubert C, Schulze T, Schwab S, Scolnick E, Scott R, Seidman L, Shi J, Sigurdsson E, Silagadze T, Silverman J, Sim K, Slominsky P, Smoller J, So HC, Spencer C, Stahl E, Stefansson H, Steinberg S, Stogmann E, Straub R, Strengman E, Strohmaier J, Stroup T, Subramaniam M, Suvisaari J, Svrakic D, Szatkiewicz J, Söderman E, Thirumalai S, Toncheva D, Tooney P, Tosato S, Veijola J, Waddington J, Walsh D, Wang D, Wang Q, Webb B, Weiser M, Wildenauer D, Williams N, Williams S, Witt S, Wolen A, Wong E, Wormley B, Wu J, Xi H, Zai C, Zheng X, Zimprich F, Wray N, Stefansson K, Visscher P, Adolfsson R, Andreassen O, Blackwood D, Bramon E, Buxbaum J, Børglum A, Cichon S, Darvasi A, Domenici E, Ehrenreich H, Esko T, Gejman P, Gill M, Gurling H, Hultman C, Iwata N, Jablensky A, Jönsson E, Kendler K, Kirov G, Knight J, Lencz T, Levinson D, Li Q, Liu J, Malhotra A, McCarroll S, McQuillin A, Moran J, Mortensen P, Mowry B, Nöthen M, Ophoff R, Owen M, Palotie A, Pato C, Petryshen T, Posthuma D, Rietschel M, Riley B, Rujescu D, Sham P, Sklar P, St. Clair D, Weinberger D, Wendland J, Werge T, Daly M, Sullivan P, O’Donovan M, Ripke S, O’Dushlaine C, Chambert K, Moran JL, Kähler AK, Akterin S, Bergen S, Magnusson PK, Neale BM, Ruderfer D, Scolnick E, Purcell S, McCarroll S, Sklar P, Hultman CM, Sullivan PF. Partitioning heritability of regulatory and cell-type-specific variants across 11 common diseases. Am J Hum Genet 2014; 95:535-52. [PMID: 25439723 PMCID: PMC4225595 DOI: 10.1016/j.ajhg.2014.10.004] [Citation(s) in RCA: 439] [Impact Index Per Article: 39.9] [Received: 07/28/2014] [Accepted: 10/02/2014] [Indexed: 10/25/2022]
Abstract
Regulatory and coding variants are known to be enriched with associations identified by genome-wide association studies (GWASs) of complex disease, but their contributions to trait heritability are currently unknown. We applied variance-component methods to imputed genotype data for 11 common diseases to partition the heritability explained by genotyped SNPs (h_g²) across functional categories (while accounting for shared variance due to linkage disequilibrium). Extensive simulations showed that in contrast to current estimates from GWAS summary statistics, the variance-component approach partitions heritability accurately under a wide range of complex-disease architectures. Across the 11 diseases, DNaseI hypersensitivity sites (DHSs) from 217 cell types spanned 16% of imputed SNPs (and 24% of genotyped SNPs) but explained an average of 79% (SE = 8%) of h_g² from imputed SNPs (5.1× enrichment; p = 3.7 × 10⁻¹⁷) and 38% (SE = 4%) of h_g² from genotyped SNPs (1.6× enrichment, p = 1.0 × 10⁻⁴). Further enrichment was observed at enhancer DHSs and cell-type-specific DHSs. In contrast, coding variants, which span 1% of the genome, explained <10% of h_g² despite having the highest enrichment. We replicated these findings but found no significant contribution from rare coding variants in independent schizophrenia cohorts genotyped on GWAS and exome chips. Our results highlight the value of analyzing components of heritability to unravel the functional architecture of common disease.
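The enrichment figures in this abstract are ratios of the share of heritability explained to the share of SNPs in the annotation. The sketch below recomputes them from the rounded percentages quoted above; the function name is illustrative. The imputed-SNP ratio comes out near 4.9 and not exactly the reported 5.1, presumably because the abstract rounds its inputs (an assumption, since the unrounded values are not given here).

```python
def heritability_enrichment(share_of_heritability: float, share_of_snps: float) -> float:
    """Fold enrichment: fraction of heritability explained per fraction of SNPs."""
    if share_of_snps <= 0.0:
        raise ValueError("share_of_snps must be positive")
    return share_of_heritability / share_of_snps


# Rounded values quoted in the abstract for DHS annotations
imputed = heritability_enrichment(0.79, 0.16)    # 79% of heritability, 16% of imputed SNPs
genotyped = heritability_enrichment(0.38, 0.24)  # 38% of heritability, 24% of genotyped SNPs
print(f"imputed: {imputed:.1f}x, genotyped: {genotyped:.1f}x")
```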
41
Evangelou M, Smyth DJ, Fortune MD, Burren OS, Walker NM, Guo H, Onengut-Gumuscu S, Chen WM, Concannon P, Rich SS, Todd JA, Wallace C. A method for gene-based pathway analysis using genomewide association study summary statistics reveals nine new type 1 diabetes associations. Genet Epidemiol 2014; 38:661-70. [PMID: 25371288 PMCID: PMC4258092 DOI: 10.1002/gepi.21853] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 01/09/2014] [Revised: 06/02/2014] [Accepted: 07/29/2014] [Indexed: 12/11/2022]
Abstract
Pathway analysis can complement point-wise single nucleotide polymorphism (SNP) analysis in exploring genomewide association study (GWAS) data to identify specific disease-associated genes that can be candidate causal genes. We propose a straightforward methodology that can be used for conducting a gene-based pathway analysis using summary GWAS statistics in combination with widely available reference genotype data. We used this method to perform a gene-based pathway analysis of a type 1 diabetes (T1D) meta-analysis GWAS (of 7,514 cases and 9,045 controls). An important feature of the conducted analysis is the removal of the major histocompatibility complex gene region, the major genetic risk factor for T1D. Thirty-one of the 1,583 (2%) tested pathways were identified to be enriched for association with T1D at a 5% false discovery rate. We analyzed these 31 pathways and their genes to identify SNPs in or near these pathway genes that showed potentially novel association with T1D and attempted to replicate the association of 22 SNPs in additional samples. Replication P-values were skewed () with 12 of the 22 SNPs showing . Support, including replication evidence, was obtained for nine T1D associated variants in genes ITGB7 (rs11170466, ), NRP1 (rs722988, ), BAD (rs694739, ), CTSB (rs1296023, ), FYN (rs11964650, ), UBE2G1 (rs9906760, ), MAP3K14 (rs17759555, ), ITGB1 (rs1557150, ), and IL7R (rs1445898, ). The proposed methodology can be applied to other GWAS datasets for which only summary level data are available.
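Selecting pathways "at a 5% false discovery rate" can be illustrated with the Benjamini-Hochberg step-up procedure. This is a sketch under an assumption: the abstract does not say which FDR procedure was used, so Benjamini-Hochberg is a stand-in here, and the p-values below are invented toy numbers, not results from the study.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], fdr: float = 0.05) -> List[bool]:
    """Return, for each p-value, whether it is rejected at the given FDR level.

    Assumption: Benjamini-Hochberg is used for illustration only; the
    abstract does not name the procedure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical pathway p-values (not from the study)
toy_p = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
print(benjamini_hochberg(toy_p))
```

Because the procedure is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value further down the ranking meets its threshold.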
Affiliation(s)
- Marina Evangelou
- JDRF/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, NIHR Cambridge Biomedical Research Centre, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, CB2 0XY, UK
42
Kichaev G, Yang WY, Lindstrom S, Hormozdiari F, Eskin E, Price AL, Kraft P, Pasaniuc B. Integrating functional data to prioritize causal variants in statistical fine-mapping studies. PLoS Genet 2014; 10:e1004722. [PMID: 25357204 PMCID: PMC4214605 DOI: 10.1371/journal.pgen.1004722] [Citation(s) in RCA: 358] [Impact Index Per Article: 32.5] [Received: 05/24/2014] [Accepted: 09/01/2014] [Indexed: 11/18/2022]
Abstract
Standard statistical approaches for prioritization of variants for functional testing in fine-mapping studies either use marginal association statistics or estimate posterior probabilities for variants to be causal under simplifying assumptions. Here, we present a probabilistic framework that integrates association strength with functional genomic annotation data to improve accuracy in selecting plausible causal variants for functional validation. A key feature of our approach is that it empirically estimates the contribution of each functional annotation to the trait of interest directly from summary association statistics while allowing for multiple causal variants at any risk locus. We devise efficient algorithms that estimate the parameters of our model across all risk loci to further increase performance. Using simulations starting from the 1000 Genomes data, we find that our framework consistently outperforms the current state-of-the-art fine-mapping methods, reducing the number of variants that need to be selected to capture 90% of the causal variants from an average of 13.3 to 10.4 SNPs per locus (as compared to the next-best performing strategy). Furthermore, we introduce a cost-to-benefit optimization framework for determining the number of variants to be followed up in functional assays and assess its performance using real and simulation data. We validate our findings using a large scale meta-analysis of four blood lipids traits and find that the relative probability for causality is increased for variants in exons and transcription start sites and decreased in repressed genomic regions at the risk loci of these traits. Using these highly predictive, trait-specific functional annotations, we estimate causality probabilities across all traits and variants, reducing the size of the 90% confidence set from an average of 17.5 to 13.5 variants per locus in this data.
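The "90% confidence set" size mentioned in this abstract can be illustrated with a simplified construction: rank variants by posterior probability of causality and keep adding them until the cumulative probability reaches the target. The sketch assumes a single causal variant per locus and posteriors that can be normalized to sum to one; the framework described in the abstract allows multiple causal variants, so this is not the authors' algorithm. Variant names and probabilities are hypothetical.

```python
from typing import Dict, List


def credible_set(posteriors: Dict[str, float], coverage: float = 0.90) -> List[str]:
    """Smallest set of variants, taken in descending posterior order,
    whose normalized cumulative probability reaches the coverage level.

    Assumption: one causal variant per locus (a simplification).
    """
    total = sum(posteriors.values())
    if total <= 0.0:
        raise ValueError("posterior probabilities must sum to a positive value")
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen: List[str] = []
    cumulative = 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        cumulative += prob / total
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical posteriors for five variants at one locus
toy = {"snp_a": 0.55, "snp_b": 0.25, "snp_c": 0.12, "snp_d": 0.05, "snp_e": 0.03}
print(credible_set(toy))
```

Sharper posteriors, for example from informative functional annotations, concentrate probability on fewer variants and so shrink the set, which is the effect the abstract reports.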
Affiliation(s)
- Gleb Kichaev
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, California, United States of America
- Wen-Yun Yang
- Department of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America
- Sara Lindstrom
- Program in Genetic Epidemiology and Statistical Genetics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Farhad Hormozdiari
- Department of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America
- Eleazar Eskin
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, California, United States of America
- Department of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
- Alkes L. Price
- Program in Genetic Epidemiology and Statistical Genetics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Peter Kraft
- Program in Genetic Epidemiology and Statistical Genetics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Bogdan Pasaniuc
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, California, United States of America
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
|
43
|
Systems-based analyses of brain regions functionally impacted in Parkinson's disease reveals underlying causal mechanisms. PLoS One 2014; 9:e102909. [PMID: 25170892 PMCID: PMC4149353 DOI: 10.1371/journal.pone.0102909] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Received: 11/07/2013] [Accepted: 06/25/2014] [Indexed: 12/20/2022] Open
Abstract
Detailed analysis of disease-affected tissue provides insight into molecular mechanisms contributing to pathogenesis. Substantia nigra, striatum, and cortex are functionally connected with increasing degrees of alpha-synuclein pathology in Parkinson's disease. We undertook functional and causal pathway analysis of gene expression and proteomic alterations in these three regions, and the data revealed pathways that correlated with disease progression. In addition, microarray and RNAseq experiments revealed previously unidentified causal changes related to oligodendrocyte function and synaptic vesicle release, and these and other changes were reflected across all brain regions. Importantly, subsets of these changes were replicated in Parkinson's disease blood, suggesting that peripheral tissue may provide important avenues for understanding and measuring disease status and progression. Proteomic assessment revealed alterations in mitochondria and vesicular transport proteins that preceded gene expression changes, indicating defects in translation and/or protein turnover. Our combined approach of proteomics, RNAseq and microarray analyses provides a comprehensive view of the molecular changes that accompany functional loss and alpha-synuclein pathology in Parkinson's disease, and may be instrumental in understanding, diagnosing and following Parkinson's disease progression.
|
44
|
Mooney MA, Nigg JT, McWeeney SK, Wilmot B. Functional and genomic context in pathway analysis of GWAS data. Trends Genet 2014; 30:390-400. [PMID: 25154796 DOI: 10.1016/j.tig.2014.07.004] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.8] [Received: 06/12/2014] [Revised: 07/18/2014] [Accepted: 07/18/2014] [Indexed: 02/07/2023]
Abstract
Gene set analysis (GSA) is a promising tool for uncovering the polygenic effects associated with complex diseases. However, the available techniques reflect a wide variety of hypotheses about how genetic effects interact to contribute to disease susceptibility. The lack of consensus about the best way to perform GSA has led to confusion in the field and has made it difficult to compare results across methods. A clear understanding of the various choices made during GSA - such as how gene sets are defined, how single-nucleotide polymorphisms (SNPs) are assigned to genes, and how individual SNP-level effects are aggregated to produce gene- or pathway-level effects - will improve the interpretability and comparability of results across methods and studies. In this review we provide an overview of the various data sources used to construct gene sets and the statistical methods used to test for gene set association, as well as provide guidelines for ensuring the comparability of results.
Affiliation(s)
- Michael A Mooney
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA; OHSU Knight Cancer Institute, Portland, OR, USA
- Joel T Nigg
- Division of Psychology, Department of Psychiatry, Oregon Health & Science University, Portland, OR, USA; Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, OR, USA
- Shannon K McWeeney
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA; Oregon Clinical and Translational Research Institute, Portland, OR, USA; OHSU Knight Cancer Institute, Portland, OR, USA.
- Beth Wilmot
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA; Oregon Clinical and Translational Research Institute, Portland, OR, USA; OHSU Knight Cancer Institute, Portland, OR, USA
|
45
|
Zablocki RW, Schork AJ, Levine RA, Andreassen OA, Dale AM, Thompson WK. Covariate-modulated local false discovery rate for genome-wide association studies. Bioinformatics 2014; 30:2098-104. [PMID: 24711653 PMCID: PMC4103587 DOI: 10.1093/bioinformatics/btu145] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 05/08/2013] [Revised: 03/03/2014] [Accepted: 03/05/2014] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Genome-wide association studies (GWAS) have largely failed to identify most of the genetic basis of highly heritable diseases and complex traits. Recent work has suggested this could be because many genetic variants, each with individually small effects, compose their genetic architecture, limiting the power of GWAS, given currently obtainable sample sizes. In this scenario, Bonferroni-derived thresholds are severely underpowered to detect the vast majority of associations. Local false discovery rate (fdr) methods provide more power to detect non-null associations, but implicit assumptions about the exchangeability of single nucleotide polymorphisms (SNPs) limit their ability to discover non-null loci. METHODS We propose a novel covariate-modulated local false discovery rate (cmfdr) that incorporates prior information about gene element-based functional annotations of SNPs, so that SNPs from categories enriched for non-null associations have a lower fdr for a given value of a test statistic than SNPs in unenriched categories. This readjustment of fdr based on functional annotations is achieved empirically by fitting a covariate-modulated parametric two-group mixture model. The proposed cmfdr methodology is applied to a large Crohn's disease GWAS. RESULTS Use of cmfdr dramatically improves power, e.g. increasing the number of loci declared significant at the 0.05 fdr level by a factor of 5.4. We also demonstrate that SNPs declared significant using cmfdr replicate in much higher numbers than those declared significant using the usual fdr, while maintaining similar replication rates for a given fdr cutoff in de novo samples, using the eight Crohn's disease substudies as independent training and test datasets. AVAILABILITY AND IMPLEMENTATION https://sites.google.com/site/covmodfdr/ CONTACT wes.stat@gmail.com SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
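The statement that SNPs in enriched categories receive a lower fdr for the same test statistic follows directly from the two-group mixture form, fdr(z) = pi0 f0(z) / (pi0 f0(z) + (1 - pi0) f1(z)), once the null proportion pi0 is allowed to depend on the annotation. The sketch below is a simplified illustration under stated assumptions: the null density is standard normal, the non-null density is a wider zero-mean normal, and the category labels and pi0 values are hypothetical. It is not the fitted model from the paper.

```python
import math


def normal_pdf(z, sd):
    """Density of a zero-mean normal with standard deviation `sd`."""
    return math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def local_fdr(z, pi0, alt_sd=3.0):
    """Two-group local fdr: posterior probability that a SNP with
    z-score `z` is null, given null proportion `pi0`.

    Assumptions (illustrative): null ~ N(0, 1), non-null ~ N(0, alt_sd^2).
    """
    null_part = pi0 * normal_pdf(z, 1.0)
    alt_part = (1.0 - pi0) * normal_pdf(z, alt_sd)
    return null_part / (null_part + alt_part)


# Hypothetical annotation-specific null proportions: the enriched
# category is assumed to contain more non-null SNPs (smaller pi0).
pi0_by_category = {"enriched": 0.90, "unenriched": 0.99}

for category, pi0 in pi0_by_category.items():
    print(category, local_fdr(3.0, pi0))
```

With the same z-score, the category with the smaller null proportion yields the smaller fdr, which is the mechanism by which annotation-aware thresholds can declare more loci significant at a fixed fdr level.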
Affiliation(s)
- Rong W Zablocki
- Computational Science Research Center, San Diego State University, San Diego, CA 92182, USA, Cognitive Sciences Graduate Program, University of California at San Diego, La Jolla, CA 92093, USA, Department of Mathematics and Statistics, San Diego State University, San Diego, CA 92182, USA, Institute of Clinical Medicine, University of Oslo, Oslo, 0424, Norway, Multimodal Imaging Laboratory and Department of Psychiatry, University of California at San Diego, La Jolla, CA 92093, USA
- Andrew J Schork
- Computational Science Research Center, San Diego State University, San Diego, CA 92182, USA, Cognitive Sciences Graduate Program, University of California at San Diego, La Jolla, CA 92093, USA, Department of Mathematics and Statistics, San Diego State University, San Diego, CA 92182, USA, Institute of Clinical Medicine, University of Oslo, Oslo, 0424, Norway, Multimodal Imaging Laboratory and Department of Psychiatry, University of California at San Diego, La Jolla, CA 92093, USA
- Richard A Levine
- Computational Science Research Center, San Diego State University, San Diego, CA 92182, USA, Cognitive Sciences Graduate Program, University of California at San Diego, La Jolla, CA 92093, USA, Department of Mathematics and Statistics, San Diego State University, San Diego, CA 92182, USA, Institute of Clinical Medicine, University of Oslo, Oslo, 0424, Norway, Multimodal Imaging Laboratory and Department of Psychiatry, University of California at San Diego, La Jolla, CA 92093, USA
- Ole A Andreassen
- Computational Science Research Center, San Diego State University, San Diego, CA 92182, USA, Cognitive Sciences Graduate Program, University of California at San Diego, La Jolla, CA 92093, USA, Department of Mathematics and Statistics, San Diego State University, San Diego, CA 92182, USA, Institute of Clinical Medicine, University of Oslo, Oslo, 0424, Norway, Multimodal Imaging Laboratory and Department of Psychiatry, University of California at San Diego, La Jolla, CA 92093, USA
- Anders M Dale
- Computational Science Research Center, San Diego State University, San Diego, CA 92182, USA, Cognitive Sciences Graduate Program, University of California at San Diego, La Jolla, CA 92093, USA, Department of Mathematics and Statistics, San Diego State University, San Diego, CA 92182, USA, Institute of Clinical Medicine, University of Oslo, Oslo, 0424, Norway, Multimodal Imaging Laboratory and Department of Psychiatry, University of California at San Diego, La Jolla, CA 92093, USA
- Wesley K Thompson
- Computational Science Research Center, San Diego State University, San Diego, CA 92182, USA, Cognitive Sciences Graduate Program, University of California at San Diego, La Jolla, CA 92093, USA, Department of Mathematics and Statistics, San Diego State University, San Diego, CA 92182, USA, Institute of Clinical Medicine, University of Oslo, Oslo, 0424, Norway, Multimodal Imaging Laboratory and Department of Psychiatry, University of California at San Diego, La Jolla, CA 92093, USA
|
46
|
Iversen ES, Lipton G, Clyde MA, Monteiro ANA. Functional annotation signatures of disease susceptibility loci improve SNP association analysis. BMC Genomics 2014; 15:398. [PMID: 24886216 PMCID: PMC4041996 DOI: 10.1186/1471-2164-15-398] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 12/20/2013] [Accepted: 05/13/2014] [Indexed: 01/16/2023] Open
Abstract
BACKGROUND Genetic association studies are conducted to discover genetic loci that contribute to an inherited trait, identify the variants behind these associations and ascertain their functional role in determining the phenotype. To date, functional annotations of the genetic variants have rarely played more than an indirect role in assessing evidence for association. Here, we demonstrate how these data can be systematically integrated into an association study's analysis plan. RESULTS We developed a Bayesian statistical model for the prior probability of phenotype-genotype association that incorporates data from past association studies and publicly available functional annotation data regarding the susceptibility variants under study. The model takes the form of a binary regression of association status on a set of annotation variables whose coefficients were estimated through an analysis of associated SNPs in the GWAS Catalog (GC). The functional predictors examined included measures that have been demonstrated to correlate with the association status of SNPs in the GC and some whose utility in this regard is speculative: summaries of the UCSC Human Genome Browser ENCODE super-track data, dbSNP function class, sequence conservation summaries, proximity to genomic variants in the Database of Genomic Variants and known regulatory elements in the Open Regulatory Annotation database, PolyPhen-2 probabilities and RegulomeDB categories. Because we expected that only a fraction of the annotations would contribute to predicting association, we employed a penalized likelihood method to reduce the impact of non-informative predictors and evaluated the model's ability to predict GC SNPs not used to construct the model. We show that the functional data alone are predictive of a SNP's presence in the GC. 
Further, using data from a genome-wide study of ovarian cancer, we demonstrate that their use as prior data when testing for association is practical at the genome-wide scale and improves power to detect associations. CONCLUSIONS We show how diverse functional annotations can be efficiently combined to create 'functional signatures' that predict the a priori odds of a variant's association to a trait and how these signatures can be integrated into a standard genome-wide-scale association analysis, resulting in improved power to detect truly associated variants.
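The way a "functional signature" can set the a priori odds of association and then be combined with study data can be sketched as follows. This is a minimal illustration that assumes a logistic link for the binary regression and a Bayes factor as the summary of the association evidence; the annotation names, coefficients, and intercept are hypothetical placeholders, not estimates from the paper.

```python
import math


def prior_probability(annotations, coefficients, intercept):
    """Prior probability of association from a logistic regression on
    annotation values (logistic link is an assumption of this sketch)."""
    linear = intercept + sum(
        coefficients[name] * value for name, value in annotations.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


def posterior_probability(prior, bayes_factor):
    """Combine prior odds with a Bayes factor summarizing the study's
    evidence for association: posterior odds = prior odds * BF."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical coefficients for two illustrative annotations.
coefficients = {"conserved": 0.8, "regulatory": 1.2}
intercept = -6.0

annotated = prior_probability({"conserved": 1, "regulatory": 1}, coefficients, intercept)
unannotated = prior_probability({"conserved": 0, "regulatory": 0}, coefficients, intercept)
print(posterior_probability(annotated, 1000.0), posterior_probability(unannotated, 1000.0))
```

For the same Bayes factor, the variant with the more favorable annotation profile ends up with the higher posterior probability, which is how informative priors can improve power to detect associations.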
Affiliation(s)
- Edwin S Iversen
- Department of Statistical Science, Duke University, Box 90251, 27708-0251 Durham, NC, USA.
|
47
|
Joint analysis of functional genomic data and genome-wide association studies of 18 human traits. Am J Hum Genet 2014; 94:559-73. [PMID: 24702953 DOI: 10.1016/j.ajhg.2014.03.004] [Citation(s) in RCA: 389] [Impact Index Per Article: 35.4] [Received: 11/26/2013] [Accepted: 03/11/2014] [Indexed: 01/23/2023] Open
Abstract
Annotations of gene structures and regulatory elements can inform genome-wide association studies (GWASs). However, choosing the relevant annotations for interpreting an association study of a given trait remains challenging. I describe a statistical model that uses association statistics computed across the genome to identify classes of genomic elements that are enriched with or depleted of loci influencing a trait. The model naturally incorporates multiple types of annotations. I applied the model to GWASs of 18 human traits, including red blood cell traits, platelet traits, glucose levels, lipid levels, height, body mass index, and Crohn disease. For each trait, I used the model to evaluate the relevance of 450 different genomic annotations, including protein-coding genes, enhancers, and DNase-I hypersensitive sites in over 100 tissues and cell lines. The fraction of phenotype-associated SNPs influencing protein sequence ranged from around 2% (for platelet volume) up to around 20% (for low-density lipoprotein cholesterol), repressed chromatin was significantly depleted for SNPs associated with several traits, and cell-type-specific DNase-I hypersensitive sites were enriched with SNPs associated with several traits (for example, the spleen in platelet volume). Finally, reweighting each GWAS by using information from functional genomics increased the number of loci with high-confidence associations by around 5%.
|