1
Rosean S, Sosa EA, O'Shea D, Raj SM, Seoighe C, Greally JM. Regulatory landscape enrichment analysis (RLEA): a computational toolkit for non-coding variant enrichment and cell type prioritization. BMC Bioinformatics 2024; 25:179. [PMID: 38714913 PMCID: PMC11075237 DOI: 10.1186/s12859-024-05794-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Accepted: 04/22/2024] [Indexed: 05/12/2024]
Abstract
BACKGROUND As genomic studies continue to implicate non-coding sequences in disease, testing the roles of these variants requires insights into the cell type(s) in which they are likely to be mediating their effects. Prior methods for associating non-coding variants with cell types have involved approaches using linkage disequilibrium or ontological associations, incurring significant processing requirements. GaiaAssociation is freely available, open-source software that enables thousands of genomic loci implicated in a phenotype to be tested for enrichment at regulatory loci of multiple cell types in minutes, permitting insights into the cell type(s) mediating the studied phenotype. RESULTS In this work, we present Regulatory Landscape Enrichment Analysis (RLEA) by GaiaAssociation and demonstrate its capability to test the enrichment of 12,133 variants across the cis-regulatory regions of 44 cell types. This analysis was completed in 134.0 ± 2.3 s, highlighting the efficient processing provided by GaiaAssociation. The intuitive interface requires only four inputs, offers a collection of customizable functions, and visualizes variant enrichment in cell-type regulatory regions through a heatmap matrix. GaiaAssociation is available on PyPI for download as a command line tool or Python package, and the source code can also be installed from GitHub at https://github.com/GreallyLab/gaiaAssociation . CONCLUSIONS GaiaAssociation is a novel package that provides an intuitive and efficient resource to understand the enrichment of non-coding variants across the cis-regulatory regions of different cells, empowering studies seeking to identify disease-mediating cell types.
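To make the idea of testing variants against cell-type regulatory regions concrete, here is a minimal sketch of the counting step only. It is not GaiaAssociation's code or API: the function names, the input layout, and the use of half-open intervals are all assumptions made for illustration, and the statistical enrichment test itself is not described in the abstract and is not reproduced here.

```python
from bisect import bisect_right


def count_overlaps(variant_positions, regions):
    """Count variants falling inside any region on one chromosome.

    `regions` is assumed to be a sorted list of non-overlapping,
    half-open (start, end) intervals.
    """
    starts = [start for start, _ in regions]
    hits = 0
    for pos in variant_positions:
        # Index of the last region whose start is <= pos, or -1 if none.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < regions[i][1]:
            hits += 1
    return hits


def overlap_counts(variants_by_chrom, regions_by_cell_type):
    """Return {cell_type: number of variants inside that cell type's regions}."""
    counts = {}
    for cell_type, by_chrom in regions_by_cell_type.items():
        counts[cell_type] = sum(
            count_overlaps(positions, by_chrom.get(chrom, []))
            for chrom, positions in variants_by_chrom.items()
        )
    return counts


# Hypothetical toy input: three variants, two cell types.
variants = {"chr1": [5, 15, 25]}
regions = {
    "T cell": {"chr1": [(0, 10), (20, 30)]},
    "neuron": {"chr1": [(12, 14)]},
}
counts = overlap_counts(variants, regions)
```

A per-cell-type count of this kind is the raw material that an enrichment statistic and a heatmap would then be built on.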
Affiliation(s)
- Samuel Rosean
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- Eric A Sosa
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- Dónal O'Shea
  - School of Mathematics, Statistics & Applied Mathematics, National University of Ireland Galway, Galway, H91 TK33, Ireland
- Srilakshmi M Raj
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- Cathal Seoighe
  - School of Mathematics, Statistics & Applied Mathematics, National University of Ireland Galway, Galway, H91 TK33, Ireland
- John M Greally
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, 10461, USA

2
Waksmunski AR, Grunin M, Kinzy TG, Igo RP, Haines JL, Cooke Bailey JN. Statistical driver genes as a means to uncover missing heritability for age-related macular degeneration. BMC Med Genomics 2020; 13:95. [PMID: 32631374 PMCID: PMC7336430 DOI: 10.1186/s12920-020-00747-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2020] [Accepted: 06/22/2020] [Indexed: 11/26/2022]
Abstract
Background Age-related macular degeneration (AMD) is a progressive retinal disease contributing to blindness worldwide. Multiple estimates for AMD heritability (h²) exist; however, a substantial proportion of h² is not attributable to known genomic loci. The International AMD Genomics Consortium (IAMDGC) gathered the largest dataset of advanced AMD (ADV) cases and controls available and identified 34 loci containing 52 independent risk variants defining known AMD h². To better define AMD heterogeneity, we used Pathway Analysis by Randomization Incorporating Structure (PARIS) on the IAMDGC data and identified 8 statistical driver genes (SDGs), including 2 novel SDGs not discovered by the IAMDGC. We chose to further investigate these pathway-based risk genes and determine their contribution to ADV h², as well as the differential ADV subtype h². Methods We performed genomic-relatedness-based restricted maximum-likelihood (GREML) analyses on ADV, geographic atrophy (GA), and choroidal neovascularization (CNV) subtypes to investigate the h² of genotyped variants on the full DNA array chip, 34 risk loci (n = 2758 common variants), 52 variants from the IAMDGC 2016 GWAS, and the 8 SDGs, specifically the novel 2 SDGs, PPARA and PLCG2. Results Via GREML, full chip h² was 44.05% for ADV, 46.37% for GA, and 62.03% for CNV. The lead 52 variants' h² (ADV: 14.52%, GA: 8.02%, CNV: 13.62%) and 34 loci h² (ADV: 13.73%, GA: 8.81%, CNV: 12.89%) indicate that known variants contribute ~14% to ADV h². SDG variants account for a small percentage of ADV, GA, and CNV heritability, but estimates based on the combination of SDGs and the 34 known loci are similar to those calculated for known loci alone. We identified modest epistatic interactions among variants in the 2 SDGs and the 52 IAMDGC variants, including modest interactions between variants in PPARA and PLCG2.
Conclusions Pathway analyses, which leverage biological relationships among genes in a pathway, may be useful in identifying additional loci that contribute to the heritability of complex disorders in a non-additive manner. Heritability analyses of these loci, especially amongst disease subtypes, may provide clues to the importance of specific genes to the genetic architecture of AMD.
Affiliation(s)
- Andrea R Waksmunski
  - Department of Genetics and Genome Sciences, Case Western Reserve University, Cleveland, OH, 44106, USA
  - Cleveland Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, 44106, USA
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, 44106, USA
- Michelle Grunin
  - Cleveland Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, 44106, USA
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, 44106, USA
- Tyler G Kinzy
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, 44106, USA
- Robert P Igo
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, 44106, USA
- Jonathan L Haines
  - Department of Genetics and Genome Sciences, Case Western Reserve University, Cleveland, OH, 44106, USA
  - Cleveland Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, 44106, USA
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, 44106, USA
- Jessica N Cooke Bailey
  - Cleveland Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, 44106, USA
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, 44106, USA

3
Cole BS, Hall MA, Urbanowicz RJ, Gilbert‐Diamond D, Moore JH. Analysis of Gene‐Gene Interactions. Curr Protoc Hum Genet 2018; 95:1.14.1-1.14.10. [DOI: 10.1002/cphg.45] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/12/2022]
Affiliation(s)
- Brian S. Cole
  - Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
- Molly A. Hall
  - Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
  - The Center for Systems Genomics, The Pennsylvania State University, University Park, Pennsylvania
- Ryan J. Urbanowicz
  - Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
- Diane Gilbert‐Diamond
  - Institute for Quantitative Biomedical Sciences at Dartmouth, Hanover, New Hampshire
  - Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire
- Jason H. Moore
  - Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania

4
Abstract
Most analyses of genome-wide association data consider each variant independently without considering or adjusting for the genetic background present in the rest of the genome. New approaches to genome analysis use representations of genomic sharing to better account for confounding factors like population stratification or to directly approximate heritability through the estimated sharing of individuals in a dataset. These approaches use mixed linear models, which relate genotypic sharing to phenotypic sharing, and rely on the efficient computation of genetic sharing among individuals in a dataset. This unit describes the principles and practical application of mixed models for the analysis of genome-wide association study data. © 2016 by John Wiley & Sons, Inc.
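The "genetic sharing among individuals" that these mixed models rely on is often summarized as a genetic relationship matrix (GRM). The following is a minimal sketch, assuming one common estimator (genotypes standardized by sample allele frequency, then averaged over SNPs); the abstract does not say which estimator the unit uses, and the function name and toy genotypes are hypothetical.

```python
from math import sqrt


def genetic_relationship_matrix(genotypes):
    """Estimate pairwise genetic sharing from allele counts.

    `genotypes` is a list of individuals, each a list of 0/1/2
    minor-allele counts, one per SNP. Entry [i][k] of the result is the
    mean over SNPs of the product of standardized genotypes.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency of each SNP, estimated from this sample.
    freqs = [sum(ind[j] for ind in genotypes) / (2 * n) for j in range(m)]
    # Monomorphic SNPs carry no information and would divide by zero.
    keep = [j for j in range(m) if 0.0 < freqs[j] < 1.0]
    if not keep:
        raise ValueError("no polymorphic SNPs in the input")
    z = [
        [(ind[j] - 2 * freqs[j]) / sqrt(2 * freqs[j] * (1 - freqs[j])) for j in keep]
        for ind in genotypes
    ]
    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / len(keep) for k in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: individuals 0 and 1 are identical, 2 is their opposite.
grm = genetic_relationship_matrix([[0, 1, 2], [0, 1, 2], [2, 1, 0]])
```

In a mixed linear model a matrix of this kind is what defines the covariance of the random genetic effect; for realistic sample sizes it would be computed with an optimized linear algebra library, not nested Python loops.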
Affiliation(s)
- Jacob B Hall
  - Institute for Computational Biology, Case Western Reserve University, Cleveland, Ohio
- William S Bush
  - Institute for Computational Biology, Case Western Reserve University, Cleveland, Ohio

5
Butkiewicz M, Cooke Bailey JN, Frase A, Dudek S, Yaspan BL, Ritchie MD, Pendergrass SA, Haines JL. Pathway analysis by randomization incorporating structure-PARIS: an update. Bioinformatics 2016; 32:2361-3. [PMID: 27153576 DOI: 10.1093/bioinformatics/btw130] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 12/18/2015] [Accepted: 03/03/2016] [Indexed: 01/11/2023]
Abstract
MOTIVATION We present an update to the pathway enrichment analysis tool 'Pathway Analysis by Randomization Incorporating Structure (PARIS)' that determines aggregated association signals generated from genome-wide association study results. Pathway-based analyses highlight biological pathways associated with phenotypes. PARIS uses a unique permutation strategy to evaluate the genomic structure of interrogated pathways, through permutation testing of genomic features, thus eliminating many of the over-testing concerns arising with other pathway analysis approaches. RESULTS We have updated PARIS to incorporate expanded pathway definitions through the incorporation of new expert knowledge from multiple database sources, through customized user provided pathways, and other improvements in user flexibility and functionality. AVAILABILITY AND IMPLEMENTATION PARIS is freely available to all users at https://ritchielab.psu.edu/software/paris-download CONTACT jnc43@case.edu SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mariusz Butkiewicz
  - Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, USA
  - Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, USA
- Jessica N Cooke Bailey
  - Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, USA
  - Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, USA
- Alex Frase
  - Biomedical and Translational Informatics Program, Geisinger Health System, Danville, PA, USA
- Scott Dudek
  - Biomedical and Translational Informatics Program, Geisinger Health System, Danville, PA, USA
- Brian L Yaspan
  - Department of Human Genetics, Genentech, Inc, South San Francisco, CA, USA
- Marylyn D Ritchie
  - Biomedical and Translational Informatics Program, Geisinger Health System, Danville, PA, USA
- Sarah A Pendergrass
  - Biomedical and Translational Informatics Program, Geisinger Health System, Danville, PA, USA
- Jonathan L Haines
  - Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, USA
  - Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, USA

6
Seldin MF. The genetics of human autoimmune disease: A perspective on progress in the field and future directions. J Autoimmun 2015; 64:1-12. [PMID: 26343334 PMCID: PMC4628839 DOI: 10.1016/j.jaut.2015.08.015] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 06/23/2015] [Accepted: 08/23/2015] [Indexed: 12/18/2022]
Abstract
Progress in defining the genetics of autoimmune disease has been dramatically enhanced by large-scale genetic studies. Genome-wide approaches, examining hundreds or, for some diseases, thousands of cases and controls, have been implemented using high-throughput genotyping and appropriate algorithms to provide a wealth of data over the last decade. These studies have identified hundreds of non-HLA loci as well as further defining HLA variations that predispose to different autoimmune diseases. These studies to identify genetic risk loci are also complemented by progress in gene expression studies, including definition of expression quantitative trait loci (eQTL) and of various alterations in chromatin structure, including histone marks, DNase I sensitivity and repressed chromatin regions, as well as transcription factor binding sites. Integration of this information can partially explain why particular variations can alter proclivity to autoimmune phenotypes. Despite our incomplete knowledge base, with only partial definition of hereditary factors and possible functional connections, this progress has facilitated and will continue to facilitate a better understanding of critical pathways and critical changes in immunoregulation. Advances in defining and understanding functional variants can potentially lead to both novel therapeutics and personalized medicine in which therapeutic approaches are chosen based on particular molecular phenotypes and genomic alterations.
Affiliation(s)
- Michael F Seldin
  - Department of Biochemistry and Molecular Medicine, University of California, Davis, Tupper Hall Room 4453, Davis, CA 95616, USA
  - Division of Rheumatology and Allergy, Department of Medicine, University of California, Davis, Tupper Hall Room 4453, Davis, CA 95616, USA

7
Jiao H, Wang K, Yang F, Grant SFA, Hakonarson H, Price RA, Li WD. Pathway-Based Genome-Wide Association Studies for Plasma Triglycerides in Obese Females and Normal-Weight Controls. PLoS One 2015; 10:e0134923. [PMID: 26308950 PMCID: PMC4550433 DOI: 10.1371/journal.pone.0134923] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 12/11/2014] [Accepted: 07/15/2015] [Indexed: 12/27/2022]
Abstract
Pathway-based analysis can provide information complementary to single-marker genome-wide association studies (GWASs), which typically ignore epistasis and lack sufficient power to detect rare variants. In this study, using genotypes from a GWAS, pathway-based association studies for triglyceride were carried out with a modified Gene Set Enrichment Algorithm (GSEA) method (GenGen) in 1028 unrelated European-American extremely obese females (BMI ≥ 35 kg/m²) and normal-weight controls (BMI < 25 kg/m²); a second pathway association analysis (ICSNPathway) was used to verify the GenGen result in the same data. The GO0009110 pathway (vitamin anabolism) was among the strongest associations with triglyceride (empirical P < 0.001); the result remained significant after FDR correction (P = 0.022). MMAB, an obesity-related locus, is included in this pathway. The ABCG1 and BCL6 genes were found in several triglyceride-related pathways (empirical P < 0.05), which were also replicated by ICSNPathway (empirical P < 0.05, FDR < 0.05). We also performed single-marker GWAS using PLINK for TG levels (log-transformed). Significant associations were found between ASTN2 gene SNPs and plasma triglyceride levels (rs7035794, P = 2.24×10⁻¹⁰). Our study suggests that the vitamin anabolism pathway, BCL6 gene pathways and the ASTN2 gene may contribute to the genetic variation of plasma triglyceride concentrations.
Affiliation(s)
- Hongxiao Jiao
  - Research Center of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
- Kai Wang
  - Zilkha Neurogenetic Institute and Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA, 90089, United States of America
- Fuhua Yang
  - Research Center of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
- Struan F. A. Grant
  - Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, 19104, United States of America
  - Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, United States of America
- Hakon Hakonarson
  - Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, 19104, United States of America
  - Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, United States of America
- R. Arlen Price
  - Center for Neurobiology and Behavior, Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, United States of America
- Wei-Dong Li
  - Research Center of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China
  - Center for Neurobiology and Behavior, Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, United States of America

8
Saez I, Set E, Hsu M. From genes to behavior: placing cognitive models in the context of biological pathways. Front Neurosci 2014; 8:336. [PMID: 25414628 PMCID: PMC4220121 DOI: 10.3389/fnins.2014.00336] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 06/30/2014] [Accepted: 10/05/2014] [Indexed: 01/16/2023]
Abstract
Connecting neural mechanisms of behavior to their underlying molecular and genetic substrates has important scientific and clinical implications. However, despite rapid growth in our knowledge of the functions and computational properties of neural circuitry underlying behavior in a number of important domains, there has been much less progress in extending this understanding to their molecular and genetic substrates, even in an age marked by exploding availability of genomic data. Here we describe recent advances in analytical strategies that aim to overcome two important challenges associated with studying the complex relationship between genes and behavior: (i) reducing distal behavioral phenotypes to a set of molecular, physiological, and neural processes that render them closer to the actions of genetic forces, and (ii) striking a balance between the competing demands of discovery and interpretability when dealing with genomic data containing up to millions of markers. Our proposed approach involves linking, on one hand, models of neural computations and circuits hypothesized to underlie behavior, and on the other hand, the set of the genes carrying out biochemical processes related to the functioning of these neural systems. In particular, we focus on the specific example of value-based decision-making, and discuss how such a combination allows researchers to leverage existing biological knowledge at both neural and genetic levels to advance our understanding of the neurogenetic mechanisms underlying behavior.
Affiliation(s)
- Ignacio Saez
  - Helen Wills Neuroscience Program, Haas School of Business, University of California, Berkeley, Berkeley, CA, USA
- Eric Set
  - Helen Wills Neuroscience Program, Haas School of Business, University of California, Berkeley, Berkeley, CA, USA
  - Department of Economics, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Ming Hsu
  - Helen Wills Neuroscience Program, Haas School of Business, University of California, Berkeley, Berkeley, CA, USA

9
Slowikowski K, Hu X, Raychaudhuri S. SNPsea: an algorithm to identify cell types, tissues and pathways affected by risk loci. Bioinformatics 2014; 30:2496-7. [PMID: 24813542 PMCID: PMC4147889 DOI: 10.1093/bioinformatics/btu326] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Indexed: 01/19/2023]
Abstract
We created a fast, robust and general C++ implementation of a single-nucleotide polymorphism (SNP) set enrichment algorithm to identify cell types, tissues and pathways affected by risk loci. It tests trait-associated genomic loci for enrichment of specificity to conditions (cell types, tissues and pathways). We use a non-parametric statistical approach to compute empirical P-values by comparison with null SNP sets. As a proof of concept, we present novel applications of our method to four sets of genome-wide significant SNPs associated with red blood cell count, multiple sclerosis, celiac disease and HDL cholesterol. AVAILABILITY AND IMPLEMENTATION http://broadinstitute.org/mpg/snpsea. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
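The phrase "empirical P-values by comparison with null SNP sets" can be illustrated with a short sketch. This is not SNPsea's implementation: the scoring function, the uniform random sampling of null sets, and all names below are assumptions chosen for illustration, and the published tool may construct its null sets differently.

```python
import random


def empirical_p_value(observed, null_scores):
    """Share of null scores at least as large as the observed score.

    One is added to numerator and denominator so that the estimate is
    never exactly zero with a finite number of null sets.
    """
    exceed = sum(1 for score in null_scores if score >= observed)
    return (exceed + 1) / (len(null_scores) + 1)


def mean_specificity(snp_set, specificity):
    """Hypothetical score: mean condition specificity over a SNP set."""
    return sum(specificity[snp] for snp in snp_set) / len(snp_set)


def enrichment_test(trait_snps, all_snps, specificity, n_null=999, seed=0):
    """Compare the trait SNP set with randomly drawn sets of equal size."""
    rng = random.Random(seed)
    observed = mean_specificity(trait_snps, specificity)
    null_scores = [
        mean_specificity(rng.sample(all_snps, len(trait_snps)), specificity)
        for _ in range(n_null)
    ]
    return observed, empirical_p_value(observed, null_scores)


# Hypothetical toy data: only the five trait SNPs are specific to the condition.
all_snps = [f"rs{i}" for i in range(100)]
specificity = {snp: (1.0 if i < 5 else 0.0) for i, snp in enumerate(all_snps)}
observed, p = enrichment_test(all_snps[:5], all_snps, specificity)
```

Because the P-value is read directly off the null distribution, no parametric assumption about the score is needed, at the cost of a resolution limited to 1/(n_null + 1).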
Affiliation(s)
- Kamil Slowikowski
  - Bioinformatics and Integrative Genomics, Harvard University, Cambridge, MA 02138, USA, Harvard-MIT Division of Health Sciences and Technology, Harvard Medical School, Boston MA 02215, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA and Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02142, USA
- Xinli Hu
  - Bioinformatics and Integrative Genomics, Harvard University, Cambridge, MA 02138, USA, Harvard-MIT Division of Health Sciences and Technology, Harvard Medical School, Boston MA 02215, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA and Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02142, USA
- Soumya Raychaudhuri
  - Bioinformatics and Integrative Genomics, Harvard University, Cambridge, MA 02138, USA, Harvard-MIT Division of Health Sciences and Technology, Harvard Medical School, Boston MA 02215, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA and Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02142, USA

10
Carbonetto P, Stephens M. Integrated enrichment analysis of variants and pathways in genome-wide association studies indicates central role for IL-2 signaling genes in type 1 diabetes, and cytokine signaling genes in Crohn's disease. PLoS Genet 2013; 9:e1003770. [PMID: 24098138 PMCID: PMC3789883 DOI: 10.1371/journal.pgen.1003770] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.1] [Received: 08/21/2012] [Accepted: 07/22/2013] [Indexed: 12/17/2022]
Abstract
Pathway analyses of genome-wide association studies aggregate information over sets of related genes, such as genes in common pathways, to identify gene sets that are enriched for variants associated with disease. We develop a model-based approach to pathway analysis, and apply this approach to data from the Wellcome Trust Case Control Consortium (WTCCC) studies. Our method offers several benefits over existing approaches. First, our method not only interrogates pathways for enrichment of disease associations, but also estimates the level of enrichment, which yields a coherent way to promote variants in enriched pathways, enhancing discovery of genes underlying disease. Second, our approach allows for multiple enriched pathways, a feature that leads to novel findings in two diseases where the major histocompatibility complex (MHC) is a major determinant of disease susceptibility. Third, by modeling disease as the combined effect of multiple markers, our method automatically accounts for linkage disequilibrium among variants. Interrogation of pathways from eight pathway databases yields strong support for enriched pathways, indicating links between Crohn's disease (CD) and cytokine-driven networks that modulate immune responses; between rheumatoid arthritis (RA) and “Measles” pathway genes involved in immune responses triggered by measles infection; and between type 1 diabetes (T1D) and IL2-mediated signaling genes. Prioritizing variants in these enriched pathways yields many additional putative disease associations compared to analyses without enrichment. For CD and RA, 7 of 8 additional non-MHC associations are corroborated by other studies, providing validation for our approach. For T1D, prioritization of IL-2 signaling genes yields strong evidence for 7 additional non-MHC candidate disease loci, as well as suggestive evidence for several more. 
Of the 7 strongest associations, 4 are validated by other studies, and 3 (near IL-2 signaling genes RAF1, MAPK14, and FYN) constitute novel putative T1D loci for further study.

Genome-wide association studies have helped locate gene variants that affect our susceptibility to diseases. The analysis of these studies is typically straightforward: test whether each genetic variant is correlated with predisposition to disease. This approach often works well for identifying commonly occurring variants with moderate effects on disease risk. However, the effects of many variants are so small that they fail to register statistically significant correlations. This is a concern because many diseases are modulated by many genetic factors with small effects on disease risk. An alternative is to examine groups of variants, such as variants sharing a common pathway, and assess whether these groups are “enriched” for correlations with disease. This can be a more effective approach to identifying genetic factors relevant to disease. However, it does not tell us which genes are associated with disease. To address this limitation, we describe an approach that integrates enrichment analysis with tests for disease-variant correlations within a single framework. We illustrate this approach in genome-wide studies of seven complex diseases. We show that our approach supports enriched pathways in several diseases, and uncovers disease-susceptibility genes in these pathways not identified in conventional analyses of the same data.
Affiliation(s)
- Peter Carbonetto
  - Dept. of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Matthew Stephens
  - Dept. of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
  - Dept. of Statistics, University of Chicago, Chicago, Illinois, United States of America

11
Chen R, Ren S, Sun Y. Genome-wide association studies on prostate cancer: the end or the beginning? Protein Cell 2013; 4:677-86. [PMID: 23982739 DOI: 10.1007/s13238-013-3055-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 07/10/2013] [Accepted: 07/31/2013] [Indexed: 10/26/2022]
Abstract
Prostate cancer (PCa) is the second most frequently diagnosed malignancy in men. Genome-wide association studies (GWAS) have been highly successful in discovering susceptibility loci for prostate cancer. To date, more than twenty GWAS have identified more than fifty common variants associated with susceptibility to PCa. Yet even as the number of loci grows, the scientific community is calling for more. In this review, we summarize current findings, discuss the common problems troubling current studies and shed light upon possible breakthroughs in the future. GWAS is the beginning of something wonderful. Although we are quite near the end of the beginning, post-GWAS studies are just taking off and extensive future studies are needed. It is believed that in the future GWAS information will help build a comprehensive system integrating PCa prevention, diagnosis, molecular classification and personalized therapy.
Affiliation(s)
- Rui Chen
  - Department of Urology, Shanghai Changhai Hospital, Second Military Medical University, Shanghai, 200433, China
- Shancheng Ren
  - Department of Urology, Shanghai Changhai Hospital, Second Military Medical University, Shanghai, 200433, China
- Yinghao Sun
  - Department of Urology, Shanghai Changhai Hospital, Second Military Medical University, Shanghai, 200433, China

12
Brookes K. The VNTR in complex disorders: The forgotten polymorphisms? A functional way forward? Genomics 2013; 101:273-81. [DOI: 10.1016/j.ygeno.2013.03.003] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 02/03/2013] [Revised: 03/08/2013] [Accepted: 03/11/2013] [Indexed: 12/16/2022]
13
Abstract
OBJECTIVE The aims of this study were to identify the candidate causal single nucleotide polymorphisms (SNPs) and candidate causal mechanisms of asthma and to generate SNP to gene to pathway hypotheses. METHODS SNPs that met a threshold of p ≤ 0.001 in a genome-wide association study (GWAS) dataset of asthma, which included 292,443 SNPs in 473 asthma cases and 1892 controls, were used in the present study. Identify candidate causal SNPs and pathway (ICSNPathway) analysis was applied to this dataset. RESULTS ICSNPathway analysis identified four candidate causal SNPs, four genes, and 21 candidate causal pathways, which in total provided four hypothetical biologic mechanisms: (1) rs7192 (nonsynonymous coding) to HLA-DRA to 21 pathways, such as, the role of eosinophils in the chemokine network of allergy, Th1/Th2 differentiation, and asthma (nominal p ≤ 0.001, FDR p ≤ 0.01); (2) rs20541 (nonsynonymous coding) to IL13 to asthma and cytokines and inflammatory response (nominal p<0.001, FDR p ≤ 0.008); (3) rs1058808 (frameshift coding) to ERBB2 to transmembrane receptor activity (nominal p=0.001, FDR p=0.01); (4) rs17350764 (nonsynonymous coding (deleterious)) to OR52J3 to transmembrane receptor activity (nominal p=0.001, FDR p=0.01). CONCLUSION By applying ICSNPathway analysis to asthma GWAS data, we found four candidate causal SNPs, four genes involving HLA-DRA and IL-13, and four hypotheses, which may contribute to asthma susceptibility.
14
Sullivan PF, Daly MJ, O'Donovan M. Genetic architectures of psychiatric disorders: the emerging picture and its implications. Nat Rev Genet 2012; 13:537-51. [PMID: 22777127 PMCID: PMC4110909 DOI: 10.1038/nrg3240] [Citation(s) in RCA: 817] [Impact Index Per Article: 68.1] [Indexed: 12/11/2022]
Abstract
Psychiatric disorders are among the most intractable enigmas in medicine. In the past 5 years, there has been unprecedented progress on the genetics of many of these conditions. In this Review, we discuss the genetics of nine cardinal psychiatric disorders (namely, Alzheimer's disease, attention-deficit hyperactivity disorder, alcohol dependence, anorexia nervosa, autism spectrum disorder, bipolar disorder, major depressive disorder, nicotine dependence and schizophrenia). Empirical approaches have yielded new hypotheses about aetiology and now provide data on the often debated genetic architectures of these conditions, which have implications for future research strategies. Further study using a balanced portfolio of methods to assess multiple forms of genetic variation is likely to yield many additional new findings.
Affiliation(s)
- Patrick F Sullivan
  - Departments of Genetics and Psychiatry, CB# 7264, 5097 Genomic Medicine, University of North Carolina at Chapel Hill, North Carolina 27599-27264, USA