1
Alsheikh AJ, Wollenhaupt S, King EA, Reeb J, Ghosh S, Stolzenburg LR, Tamim S, Lazar J, Davis JW, Jacob HJ. The landscape of GWAS validation; systematic review identifying 309 validated non-coding variants across 130 human diseases. BMC Med Genomics 2022; 15:74. [PMID: 35365203 PMCID: PMC8973751 DOI: 10.1186/s12920-022-01216-w] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 10/13/2021] [Accepted: 03/17/2022] [Indexed: 02/08/2023]
Abstract
Background: The remarkable growth of genome-wide association studies (GWAS) has created a critical need to experimentally validate the disease-associated variants, 90% of which involve non-coding variants. Methods: To determine how the field is addressing this urgent need, we performed a comprehensive literature review identifying 36,676 articles. These were reduced to 1454 articles through a set of filters using natural language processing and ontology-based text-mining. This was followed by manual curation and cross-referencing against the GWAS catalog, yielding a final set of 286 articles. Results: We identified 309 experimentally validated non-coding GWAS variants, regulating 252 genes across 130 human disease traits. These variants covered a variety of regulatory mechanisms. Interestingly, 70% (215/309) acted through cis-regulatory elements, with the remaining through promoters (22%, 70/309) or non-coding RNAs (8%, 24/309). Several validation approaches were utilized in these studies, including gene expression (n = 272), transcription factor binding (n = 175), reporter assays (n = 171), in vivo models (n = 104), genome editing (n = 96) and chromatin interaction (n = 33). Conclusions: This review of the literature is the first to systematically evaluate the status and the landscape of experimentation being used to validate non-coding GWAS-identified variants. Our results clearly underscore the multifaceted approach needed for experimental validation, have practical implications on variant prioritization and considerations of target gene nomination. While the field has a long way to go to validate the thousands of GWAS associations, we show that progress is being made and provide exemplars of validation studies covering a wide variety of mechanisms, target genes, and disease areas. Supplementary Information: The online version contains supplementary material available at 10.1186/s12920-022-01216-w.
Affiliation(s)
- Ammar J Alsheikh
  - Genomics Research Center, AbbVie Inc, North Chicago, Illinois, 60064, USA.
- Sabrina Wollenhaupt
  - Information Research, AbbVie Deutschland GmbH & Co. KG, 67061, Knollstrasse, Ludwigshafen, Germany
- Emily A King
  - Genomics Research Center, AbbVie Inc, North Chicago, Illinois, 60064, USA
- Jonas Reeb
  - Information Research, AbbVie Deutschland GmbH & Co. KG, 67061, Knollstrasse, Ludwigshafen, Germany
- Sujana Ghosh
  - Genomics Research Center, AbbVie Inc, North Chicago, Illinois, 60064, USA
- Saleh Tamim
  - Genomics Research Center, AbbVie Inc, North Chicago, Illinois, 60064, USA
- Jozef Lazar
  - Genomics Research Center, AbbVie Inc, North Chicago, Illinois, 60064, USA
- J Wade Davis
  - Genomics Research Center, AbbVie Inc, North Chicago, Illinois, 60064, USA
- Howard J Jacob
  - Genomics Research Center, AbbVie Inc, North Chicago, Illinois, 60064, USA
2
Grzegorzewska AE, Niepolski L, Świderska MK, Mostowska A, Stolarek I, Warchoł W, Figlerowicz M, Jagodziński PP. ENHO, RXRA, and LXRA polymorphisms and dyslipidaemia, related comorbidities and survival in haemodialysis patients. BMC Med Genet 2018; 19:194. [PMID: 30413149 PMCID: PMC6234788 DOI: 10.1186/s12881-018-0708-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/19/2018] [Accepted: 10/23/2018] [Indexed: 12/26/2022]
Abstract
BACKGROUND The energy homeostasis-associated gene (ENHO), retinoid X receptor alpha gene (RXRA), and liver X receptor alpha gene (LXRA) are involved in adipogenic/lipogenic regulation. We investigated whether single-nucleotide polymorphisms in these genes (ENHO rs2281997, rs72735260; RXRA rs749759, rs10776909, rs10881578; LXRA rs2279238, rs7120118, rs11039155) are associated with dyslipidaemia, related comorbidities and survival of haemodialysis (HD) patients also tested for T-helper (Th) cell interleukin genes (IL). METHODS The study was carried out in 873 HD patients. Dyslipidaemia was diagnosed by the recommendations of the Kidney Disease Outcomes Quality Initiative (K/DOQI) guidelines (2003); atherogenic dyslipidaemia was referred to if the TG/HDL cholesterol ratio was equal to or higher than 3.8. Genotyping of ENHO SNPs, LXRA SNPs, and IL12A rs568408 was carried out using HRM analysis. RXRA SNPs, IL12B rs3212227, and IL18 rs360719 were genotyped using PCR-RFLP analysis. The circulating adropin concentration was determined in 126 patients by enzyme-linked immunosorbent assay. Survival probability was analysed using the Kaplan-Meier method in 440 patients followed through 7.5 years. RESULTS Dyslipidaemia by K/DOQI was diagnosed in 459 patients (91% revealed hyper-LDL-cholesterolaemia), atherogenic dyslipidaemia was diagnosed in 454 patients, and 231 patients were free of dyslipidaemia by both criteria. The variant allele (T) of ENHO rs2281997 was associated with the hyper-LDL cholesterolaemic pattern of dyslipidaemia by K/DOQI. The frequency of atherogenic dyslipidaemia was lower in T-allele bearers than in CC-genotype patients. The rs2281997 T allele was associated with lower cardiovascular mortality in HD patients showing atherogenic dyslipidaemia. ENHO, RXRA, and LXRA showed epistatic interactions in dyslipidaemia. Circulating adropin was lower in atherogenic dyslipidaemia than in non-atherogenic conditions. RXRA rs10776909 was associated with myocardial infarction. Bearers of LXRA rs2279238, rs7120118 or rs11039155 minor alleles showed higher mortality. ENHO SNP positions fell within the same DNase 1 hypersensitivity site expressed in the Th1 cell line. Epistatic interactions occurred between rs2281997 and Th1 IL SNPs (rs360719, rs568408). CONCLUSIONS Atherogenic dyslipidaemia occurs in HD patients in whom ENHO encodes less adropin. ENHO, RXRA, and LXRA SNPs, separately or jointly, are associated with dyslipidaemia, myocardial infarction, and survival in HD patients. Differences in the availability of transcription binding sites may contribute to these associations.
Affiliation(s)
- Alicja E Grzegorzewska
  - Department of Nephrology, Transplantology and Internal Diseases, Poznan University of Medical Sciences (PUMS), Poznań, Poland.
- Monika K Świderska
  - Department of Nephrology, Transplantology and Internal Diseases, Poznan University of Medical Sciences (PUMS), Poznań, Poland
- Ireneusz Stolarek
  - Polish Academy of Sciences, Institute of Bioorganic Chemistry, Poznań, Poland
- Marek Figlerowicz
  - Polish Academy of Sciences, Institute of Bioorganic Chemistry, Poznań, Poland
3
Smith AJP, Deloukas P, Munroe PB. Emerging applications of genome-editing technology to examine functionality of GWAS-associated variants for complex traits. Physiol Genomics 2018; 50:510-522. [PMID: 29652634 DOI: 10.1152/physiolgenomics.00028.2018] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 01/26/2023]
Abstract
Over the last decade, genome-wide association studies (GWAS) have propelled the discovery of thousands of loci associated with complex diseases. The focus is now turning toward the function of these association signals, determining the causal variant(s) among those in strong linkage disequilibrium, and identifying their underlying mechanisms, such as long-range gene regulation. Genome-editing techniques utilizing zinc-finger nucleases (ZFN), transcription activator-like effector nucleases (TALENs), and clustered regularly-interspaced short palindromic repeats with Cas9 nuclease (CRISPR-Cas9) are becoming the tools of choice to establish functionality for these variants, due to the ability to assess effects of single variants in vivo. This review will discuss examples of how these technologies have begun to aid functional analysis of GWAS loci for complex traits such as cardiovascular disease, Type 2 diabetes, cancer, obesity, and autoimmune disease. We focus on analysis of variants occurring within noncoding genomic regions, as these comprise the majority of GWAS variants, providing the greatest challenges to determining functionality, and compare editing strategies that provide different levels of evidence for variant functionality. The review describes molecular insights into some of these potentially causal variants and how these may relate to the pathology of the trait and look toward future directions for these technologies in post-GWAS analysis, such as base-editing.
Affiliation(s)
- Andrew J P Smith
  - Clinical Pharmacology, William Harvey Research Institute, Barts and The London, Queen Mary University of London, United Kingdom; NIHR Barts Biomedical Research Centre, Queen Mary University of London, London, United Kingdom
- Panos Deloukas
  - Clinical Pharmacology, William Harvey Research Institute, Barts and The London, Queen Mary University of London, United Kingdom; NIHR Barts Biomedical Research Centre, Queen Mary University of London, London, United Kingdom
- Patricia B Munroe
  - Clinical Pharmacology, William Harvey Research Institute, Barts and The London, Queen Mary University of London, United Kingdom; NIHR Barts Biomedical Research Centre, Queen Mary University of London, London, United Kingdom
4
Do C, Shearer A, Suzuki M, Terry MB, Gelernter J, Greally JM, Tycko B. Genetic-epigenetic interactions in cis: a major focus in the post-GWAS era. Genome Biol 2017. [PMID: 28629478 PMCID: PMC5477265 DOI: 10.1186/s13059-017-1250-y] [Citation(s) in RCA: 95] [Impact Index Per Article: 11.9] [Indexed: 12/16/2022]
Abstract
Studies on genetic-epigenetic interactions, including the mapping of methylation quantitative trait loci (mQTLs) and haplotype-dependent allele-specific DNA methylation (hap-ASM), have become a major focus in the post-genome-wide-association-study (GWAS) era. Such maps can nominate regulatory sequence variants that underlie GWAS signals for common diseases, ranging from neuropsychiatric disorders to cancers. Conversely, mQTLs need to be filtered out when searching for non-genetic effects in epigenome-wide association studies (EWAS). Sequence variants in CCCTC-binding factor (CTCF) and transcription factor binding sites have been mechanistically linked to mQTLs and hap-ASM. Identifying these sites can point to disease-associated transcriptional pathways, with implications for targeted treatment and prevention.
Affiliation(s)
- Catherine Do
  - Institute for Cancer Genetics and Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, 10032, USA
- Alyssa Shearer
  - Institute for Cancer Genetics and Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, 10032, USA
- Masako Suzuki
  - Center for Epigenomics, Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- Mary Beth Terry
  - Department of Epidemiology, Columbia University Mailman School of Public Health, and Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, 10032, USA
- Joel Gelernter
  - Departments of Psychiatry, Genetics, and Neurobiology, Yale University School of Medicine, New Haven, CT, 06520, USA
- John M Greally
  - Center for Epigenomics, Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- Benjamin Tycko
  - Institute for Cancer Genetics, Herbert Irving Comprehensive Cancer Center, Taub Institute for Research on Alzheimer's disease and the Aging Brain, New York, NY, 10032, USA; Department of Pathology and Cell Biology, Columbia University, New York, NY, 10032, USA.
5
Liver X Receptor Genes Variants Modulate ALS Phenotype. Mol Neurobiol 2017; 55:1959-1965. [PMID: 28244008 DOI: 10.1007/s12035-017-0453-2] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 09/30/2016] [Accepted: 02/09/2017] [Indexed: 12/14/2022]
Abstract
Amyotrophic lateral sclerosis (ALS) is one of the most severe motor neuron (MN) disorders in adults. Phenotype of ALS patients is highly variable and may be influenced by modulators of energy metabolism. Recent works have implicated the liver X receptors α and β (LXRs), either in the propagation process of ALS or in the maintenance of MN survival. LXRs are nuclear receptors activated by oxysterols, modulating cholesterol levels, a suspected modulator of ALS severity. In a cohort of 438 ALS patients and 330 healthy controls, the influence of LXR genes on ALS risk and phenotype was studied using single nucleotide polymorphisms (SNPs). The two LXRα SNPs rs2279238 and rs7120118 were shown to be associated with age at onset in ALS patients. Consistently, homozygotes were twice more correlated than were heterozygotes to delayed onset. The onset was thus delayed by 3.9 years for rs2279238 C/T carriers and 7.8 years for T/T carriers. Similar results were obtained for rs7120118 (+2.1 years and +6.7 years for T/C and C/C genotypes, respectively). The LXRβ SNP rs2695121 was also shown to be associated with a 30% increase of ALS duration (p = 0.0055, FDR = 0.044). The tested genotypes were not associated with ALS risk. These findings add further evidence to the suspected implication of LXR genes in the disease process of ALS and might open new perspectives in ALS therapeutics.
6
Mouzat K, Raoul C, Polge A, Kantar J, Camu W, Lumbroso S. Liver X receptors: from cholesterol regulation to neuroprotection-a new barrier against neurodegeneration in amyotrophic lateral sclerosis? Cell Mol Life Sci 2016; 73:3801-8. [PMID: 27510420 PMCID: PMC11108529 DOI: 10.1007/s00018-016-2330-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 08/02/2016] [Accepted: 08/04/2016] [Indexed: 12/13/2022]
Abstract
Cholesterol plays a central role in numerous nervous system functions. Cholesterol is the major constituent of myelin sheaths, is essential for synapse and dendrite formation, axon guidance as well as neurotransmission. Among regulators of cholesterol homeostasis, liver X receptors (LXRs), two members of the nuclear receptor superfamily, play a determinant role. LXRs act as cholesterol sensors and respond to high intracellular cholesterol concentration by decreasing plasmatic and intracellular cholesterol content. Beyond their cholesterol-lowering role, LXRs have been proposed as regulators of immunity and anti-inflammatory factors. Dysregulation of cholesterol metabolism combined to neuroinflammatory context have been described in neurodegenerative diseases, including amyotrophic lateral sclerosis (ALS). ALS is characterized by the progressive loss of motoneurons in the brain and spinal cord, leading to severe paralytic condition and death of patients in a median time of 3 years. Motoneuron degeneration is accompanied by chronic neuroinflammatory response, involving microglial and astrocytic activation, infiltration of blood-derived immune cells and release of pro-inflammatory factors. We propose to discuss here the role of LXRs as a molecular link between the central nervous system cholesterol metabolism, neuroinflammation, motoneuron survival and their potential as promising therapeutic candidates for ALS therapy.
Affiliation(s)
- Kevin Mouzat
  - Department of Biochemistry and Molecular Biology, Nîmes University Hospital, Nîmes, France.
  - University of Montpellier, Montpellier, France.
  - INSERM UMR1051, The Neuroscience Institute of Montpellier (INM), Saint Eloi Hospital, Montpellier, France.
- Cédric Raoul
  - INSERM UMR1051, The Neuroscience Institute of Montpellier (INM), Saint Eloi Hospital, Montpellier, France
- Anne Polge
  - Department of Biochemistry and Molecular Biology, Nîmes University Hospital, Nîmes, France
- Jovana Kantar
  - Department of Biochemistry and Molecular Biology, Nîmes University Hospital, Nîmes, France
  - INSERM UMR1051, The Neuroscience Institute of Montpellier (INM), Saint Eloi Hospital, Montpellier, France
- William Camu
  - University of Montpellier, Montpellier, France
  - INSERM UMR1051, The Neuroscience Institute of Montpellier (INM), Saint Eloi Hospital, Montpellier, France
  - Neurology Department, ALS Center, Gui de Chauliac Hospital, Montpellier, France
- Serge Lumbroso
  - Department of Biochemistry and Molecular Biology, Nîmes University Hospital, Nîmes, France
  - University of Montpellier, Montpellier, France
  - INSERM UMR1051, The Neuroscience Institute of Montpellier (INM), Saint Eloi Hospital, Montpellier, France
7
Oldoni F, Palmen J, Giambartolomei C, Howard P, Drenos F, Plagnol V, Humphries SE, Talmud PJ, Smith AJP. Post-GWAS methodologies for localisation of functional non-coding variants: ANGPTL3. Atherosclerosis 2016; 246:193-201. [PMID: 26800306 PMCID: PMC4773290 DOI: 10.1016/j.atherosclerosis.2015.12.009] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 10/21/2015] [Revised: 11/29/2015] [Accepted: 12/04/2015] [Indexed: 11/24/2022]
Abstract
Genome-wide association studies have confirmed the involvement of non-coding angiopoietin-like 3 (ANGPTL3) gene variants with coronary artery disease, levels of low-density lipoprotein cholesterol (LDL-C), triglycerides and ANGPTL3 mRNA transcript. Extensive linkage disequilibrium at the locus, however, has hindered efforts to identify the potential functional variants. Using regulatory annotations from ENCODE, combined with functional in vivo assays such as allele-specific formaldehyde-assisted isolation of regulatory elements, statistical approaches including eQTL/lipid colocalisation, and traditional in vitro methodologies including electrophoretic mobility shift assay and luciferase reporter assays, variants affecting the ANGPTL3 regulome were examined. From 253 variants associated with ANGPTL3 mRNA expression, and/or lipid traits, 46 were located within liver regulatory elements and potentially functional. One variant, rs10889352, demonstrated allele-specific effects on DNA-protein interactions, reporter gene expression and chromatin accessibility, in line with effects on LDL-C levels and expression of ANGPTL3 mRNA. The ANGPTL3 gene lies within DOCK7, although the variant is within non-coding regions outside of ANGPTL3, within DOCK7, suggesting complex long-range regulatory effects on gene expression. This study illustrates the power of combining multiple genome-wide datasets with laboratory data to localise functional non-coding variation and provides a model for analysis of regulatory variants from GWAS.
Affiliation(s)
- Federico Oldoni
  - Department of Cardiovascular Genetics, Institute of Cardiovascular Sciences, University College London, London, UK; Department of Molecular Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Jutta Palmen
  - Department of Cardiovascular Genetics, Institute of Cardiovascular Sciences, University College London, London, UK
- Philip Howard
  - Department of Cardiovascular Genetics, Institute of Cardiovascular Sciences, University College London, London, UK
- Fotios Drenos
  - Department of Cardiovascular Genetics, Institute of Cardiovascular Sciences, University College London, London, UK
- Vincent Plagnol
  - UCL Genetics Institute, University College London, London, UK
- Steve E Humphries
  - Department of Cardiovascular Genetics, Institute of Cardiovascular Sciences, University College London, London, UK
- Philippa J Talmud
  - Department of Cardiovascular Genetics, Institute of Cardiovascular Sciences, University College London, London, UK
- Andrew J P Smith
  - Department of Cardiovascular Genetics, Institute of Cardiovascular Sciences, University College London, London, UK.
8
Kim K, Lee K, Bang H, Kim JY, Choi JK. Intersection of genetics and epigenetics in monozygotic twin genomes. Methods 2015; 102:50-6. [PMID: 26548893 DOI: 10.1016/j.ymeth.2015.10.020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/21/2015] [Accepted: 10/18/2015] [Indexed: 02/01/2023]
Abstract
As a final function of various epigenetic mechanisms, chromatin regulation is a transcription control process that especially demonstrates active interaction with genetic elements. Thus, chromatin structure has become a principal focus in recent genomics researches that strive to characterize regulatory functions of DNA variants related to diseases or other traits. Although researchers have been focusing on DNA methylation when studying monozygotic (MZ) twins, a great model in epigenetics research, interactions between genetics and epigenetics in chromatin level are expected to be an imperative research trend in the future. In this review, we discuss how the genome, epigenome, and transcriptome of MZ twins can be studied in an integrative manner from this perspective.
Affiliation(s)
- Kwoneel Kim
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Kibaick Lee
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Hyoeun Bang
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Jeong Yeon Kim
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Jung Kyoon Choi
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea.
9
Abstract
Type 2 diabetes (T2D) is a metabolic disorder characterized by high blood glucose levels and elevated risk of cardiovascular events. The progression of T2D can be delayed, or prevented, so early prediction is of high importance. More than 70 genetic loci are associated with T2D risk, raising the possibility of early identification of future cases. Results show that the benefits in discrimination by including genes in current risk models are uncertain. Improvements have been shown in reclassification but are too modest for clinical use. Given the current guidelines for T2D risk assessment and the increasing availability of genotyped individuals, we could soon be able to use genetics, not to quantify risk, but to inform clinicians on those requiring earlier observation.
Affiliation(s)
- Fotios Drenos
  - MRC Integrative Epidemiology Unit, School of Social & Community Medicine, University of Bristol, Bristol, UK
  - Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, UK
10
Smith AJP, Humphries SE, Talmud PJ. Identifying functional noncoding variants from genome-wide association studies for cardiovascular disease and related traits. Curr Opin Lipidol 2015; 26:120-6. [PMID: 25692342 DOI: 10.1097/mol.0000000000000158] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 11/26/2022]
Abstract
PURPOSE OF REVIEW Genome-wide association studies have identified many novel loci for cardiovascular disease and related traits. Attention is now shifting towards the analysis of these loci for causal variants, with a view to identify the novel mechanisms leading to disease. RECENT FINDINGS This review focuses on the approaches to identify causal, noncoding variants for coronary artery disease, lipid traits and other cardiovascular risk factors. Fine-mapping studies are discussed, along with the novel statistical approaches to produce 'credible sets'. The use of combining genome-wide association study datasets with experimental methods such as expression quantitative trait loci and allele-specific chromatin accessibility are explored, with recent examples discussed. Mapping long-range chromatin interactions and evolving genome-editing technologies such as clustered regularly interspaced short palindromic repeats combined with clustered regularly interspaced short palindromic repeats-associated (Cas9) nuclease promise to aid considerably the search for causal variants. SUMMARY Identification of causal variants for cardiovascular disease and related traits is still in the early stages, but with technologies evolving and increasingly relevant tissue samples undergoing analysis, there are favourable prospects that many new mechanisms for disease will be uncovered by the end of this decade.
Affiliation(s)
- Andrew J P Smith
  - British Heart Foundation Laboratories, Institute of Cardiovascular Sciences, Centre for Cardiovascular Genetics, University College London, London, UK
11
Demonstration of the presence of the "deleted" MIR122 gene in HepG2 cells. PLoS One 2015; 10:e0122471. [PMID: 25811611 PMCID: PMC4374784 DOI: 10.1371/journal.pone.0122471] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 08/14/2014] [Accepted: 02/13/2015] [Indexed: 12/17/2022]
Abstract
MicroRNA 122 (miR-122) is highly expressed in the liver where it influences diverse biological processes and pathways, including hepatitis C virus replication and metabolism of iron and cholesterol. It is processed from a long non-coding primary transcript (~7.5 kb) and the gene has two evolutionarily-conserved regions containing the pri-mir-122 promoter and pre-mir-122 hairpin region. Several groups reported that the widely-used hepatocytic cell line HepG2 had deficient expression of miR-122, previously ascribed to deletion of the pre-mir-122 stem-loop region. We aimed to characterise this deletion by direct sequencing of 6078 bp containing the pri-mir-122 promoter and pre-mir-122 stem-loop region in HepG2 and Huh-7, a control hepatocytic cell line reported to express miR-122, supported by sequence analysis of cloned genomic DNA. In contrast to previous findings, the entire sequence was present in both cell lines. Ten SNPs were heterozygous in HepG2 indicating that DNA was present in two copies. Three validation isolates of HepG2 were sequenced, showing identical genotype to the original in two, whereas the third was different. Investigation of promoter chromatin status by FAIRE showed that Huh-7 cells had 6.2 ± 0.19- and 2.7 ± 0.01- fold more accessible chromatin at the proximal (HNF4α-binding) and distal DR1 transcription factor sites, compared to HepG2 cells (p=0.03 and 0.001, respectively). This was substantiated by ENCODE genome annotations, which showed a DNAse I hypersensitive site in the pri-mir-122 promoter in Huh-7 that was absent in HepG2 cells. While the origin of the reported deletion is unclear, cell lines should be obtained from a reputable source and used at low passage number to avoid discrepant results. Deficiency of miR-122 expression in HepG2 cells may be related to a relative deficiency of accessible promoter chromatin in HepG2 versus Huh-7 cells.
12
del Rosario RCH, Poschmann J, Rouam SL, Png E, Khor CC, Hibberd ML, Prabhakar S. Sensitive detection of chromatin-altering polymorphisms reveals autoimmune disease mechanisms. Nat Methods 2015; 12:458-64. [PMID: 25799442 DOI: 10.1038/nmeth.3326] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 11/29/2014] [Accepted: 02/06/2015] [Indexed: 12/30/2022]
Abstract
Most disease associations detected by genome-wide association studies (GWAS) lie outside coding genes, but very few have been mapped to causal regulatory variants. Here, we present a method for detecting regulatory quantitative trait loci (QTLs) that does not require genotyping or whole-genome sequencing. The method combines deep, long-read chromatin immunoprecipitation-sequencing (ChIP-seq) with a statistical test that simultaneously scores peak height correlation and allelic imbalance: the genotype-independent signal correlation and imbalance (G-SCI) test. We performed histone acetylation ChIP-seq on 57 human lymphoblastoid cell lines and used the resulting reads to call 500,066 single-nucleotide polymorphisms de novo within regulatory elements. The G-SCI test annotated 8,764 of these as histone acetylation QTLs (haQTLs)—an order of magnitude larger than the set of candidates detected by expression QTL analysis. Lymphoblastoid haQTLs were highly predictive of autoimmune disease mechanisms. Thus, our method facilitates large-scale regulatory variant detection in any moderately sized cohort for which functional profiling data can be generated, thereby simplifying identification of causal variants within GWAS loci.
Affiliation(s)
- Jeremie Poschmann
  - Computational and Systems Biology Group, Genome Institute of Singapore, Singapore
- Sigrid Laure Rouam
  - Computational and Systems Biology Group, Genome Institute of Singapore, Singapore
- Eileen Png
  - Infectious Diseases Group, Genome Institute of Singapore, Singapore
- Chiea Chuen Khor
  - [1] Human Genetics Group, Genome Institute of Singapore, Singapore. [2] Singapore Eye Research Institute, Singapore. [3] Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Martin Lloyd Hibberd
  - [1] Infectious Diseases Group, Genome Institute of Singapore, Singapore. [2] Department of Pathogen Molecular Biology, London School of Hygiene & Tropical Medicine, London, UK
- Shyam Prabhakar
  - Computational and Systems Biology Group, Genome Institute of Singapore, Singapore
13
Light N, Adoue V, Ge B, Chen SH, Kwan T, Pastinen T. Interrogation of allelic chromatin states in human cells by high-density ChIP-genotyping. Epigenetics 2014; 9:1238-51. [PMID: 25055051 DOI: 10.4161/epi.29920] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/26/2022]
Abstract
Allele-specific (AS) assessment of chromatin has the potential to elucidate specific cis-regulatory mechanisms, which are predicted to underlie the majority of the known genetic associations to complex disease. However, development of chromatin landscapes at allelic resolution has been challenging since sites of variable signal strength require substantial read depths not commonly applied in sequencing based approaches. In this study, we addressed this by performing parallel analyses of input DNA and chromatin immunoprecipitates (ChIP) on high-density Illumina genotyping arrays. Allele-specificity for the histone modifications H3K4me1, H3K4me3, H3K27ac, H3K27me3, and H3K36me3 was assessed using ChIP samples generated from 14 lymphoblast and 6 fibroblast cell lines. AS-ChIP SNPs were combined into domains and validated using high-confidence ChIP-seq sites. We observed characteristic patterns of allelic-imbalance for each histone-modification around allele-specifically expressed transcripts. Notably, we found H3K4me1 to be significantly anti-correlated with allelic expression (AE) at transcription start sites, indicating H3K4me1 allelic imbalance as a marker of AE. We also found that allelic chromatin domains exhibit population and cell-type specificity as well as heritability within trios. Finally, we observed that a subset of allelic chromatin domains is regulated by DNase I-sensitive quantitative trait loci and that these domains are significantly enriched for genome-wide association studies hits, with autoimmune disease associated SNPs specifically enriched in lymphoblasts. This study provides the first genome-wide maps of allelic-imbalance for five histone marks. Our results provide new insights into the role of chromatin in cis-regulation and highlight the need for high-depth sequencing in ChIP-seq studies along with the need to improve allele-specificity of ChIP-enrichment.
Affiliation(s)
- Nicholas Light
- Department of Human Genetics; McGill University; Montréal, QC Canada; McGill University and Genome Québec Innovation Centre; McGill University; Montréal, QC Canada
| | - Véronique Adoue
- Institut National de la Santé et de la Recherche Médicale (Inserm); U1043; Toulouse, France
| | - Bing Ge
- McGill University and Genome Québec Innovation Centre; McGill University; Montréal, QC Canada
| | - Shu-Huang Chen
- McGill University and Genome Québec Innovation Centre; McGill University; Montréal, QC Canada
| | - Tony Kwan
- Department of Human Genetics; McGill University; Montréal, QC Canada; McGill University and Genome Québec Innovation Centre; McGill University; Montréal, QC Canada
| | - Tomi Pastinen
- Department of Human Genetics; McGill University; Montréal, QC Canada; McGill University and Genome Québec Innovation Centre; McGill University; Montréal, QC Canada
|
14
|
Khetarpal SA, Rader DJ. Genetics of lipid traits: Genome-wide approaches yield new biology and clues to causality in coronary artery disease. Biochim Biophys Acta Mol Basis Dis 2014; 1842:2010-2020. [PMID: 24931102 DOI: 10.1016/j.bbadis.2014.06.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 01/31/2014] [Revised: 05/29/2014] [Accepted: 06/03/2014] [Indexed: 10/25/2022]
Abstract
A wealth of novel lipid loci have been identified through a variety of approaches focused on common and low-frequency variation and collaborative meta-analyses in multiethnic populations. Despite progress in identification of loci, the task of determining causal variants remains challenging. This work will undoubtedly be enhanced by improved understanding of regulatory DNA at a genome-wide level as well as new methodologies for interrogating the relationships between noncoding SNPs and regulatory regions. Equally challenging is the identification of causal genes at novel loci. Some progress has been made for a handful of genes, and comprehensive testing of candidate genes using multiple model systems is underway. Additional insights will be gleaned from focusing on low-frequency and rare coding variation at candidate loci in large populations. This article is part of a Special Issue entitled: From Genome to Function.
Affiliation(s)
| | - Daniel J Rader
- Perelman School of Medicine, University of Pennsylvania, USA.
|
15
|
Kim K, Ban HJ, Seo J, Lee K, Yavartanoo M, Kim SC, Park K, Cho SB, Choi JK. Genetic factors underlying discordance in chromatin accessibility between monozygotic twins. Genome Biol 2014; 15:R72. [PMID: 24887574 PMCID: PMC4072931 DOI: 10.1186/gb-2014-15-5-r72] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 10/23/2013] [Accepted: 05/29/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Open chromatin is implicated in regulatory processes; thus, variations in chromatin structure may contribute to variations in gene expression and other phenotypes. In this work, we perform targeted deep sequencing for open chromatin, and array-based genotyping across the genomes of 72 monozygotic twins to identify genetic factors regulating co-twin discordance in chromatin accessibility. RESULTS We show that somatic mutations cause chromatin discordance mainly via the disruption of transcription factor binding sites. Structural changes in DNA due to C:G to A:T transversions are under purifying selection due to a strong impact on chromatin accessibility. We show that CpGs whose methylation is specifically regulated during cellular differentiation appear to be protected from high mutation rates of 5'-methylcytosines, suggesting that the spectrum of CpG variations may be shaped fully at the developmental level but not through natural selection. Based on the association mapping of within-pair chromatin differences, we search for cases in which twin siblings with a particular genotype had chromatin discordance at the relevant locus. We identify 1,325 chromatin sites that are differentially accessible, depending on the genotype of a nearby locus, suggesting that epigenetic differences can control regulatory variations via interactions with genetic factors. Poised promoters present high levels of chromatin discordance in association with either somatic mutations or genetic-epigenetic interactions. CONCLUSION Our observations illustrate how somatic mutations and genetic polymorphisms may contribute to regulatory, and ultimately phenotypic, discordance.
|
16
|
Simon JM, Giresi PG, Davis IJ, Lieb JD. Addendum: Using formaldehyde-assisted isolation of regulatory elements (FAIRE) to isolate active regulatory DNA. Nat Protoc 2014. [DOI: 10.1038/nprot.2014.062] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/09/2022]
|
17
|
Abstract
Understanding the functional mechanisms underlying genetic signals associated with complex traits and common diseases, such as cancer, diabetes and Alzheimer's disease, is a formidable challenge. Many genetic signals discovered through genome-wide association studies map to non-protein coding sequences, where their molecular consequences are difficult to evaluate. This article summarizes concepts for the systematic interpretation of non-coding genetic signals using genome annotation data sets in different cellular systems. We outline strategies for the global analysis of multiple association intervals and the in-depth molecular investigation of individual intervals. We highlight experimental techniques to validate candidate (potential causal) regulatory variants, with a focus on novel genome-editing techniques including CRISPR/Cas9. These approaches are also applicable to low-frequency and rare variants, which have become increasingly important in genomic studies of complex traits and diseases. There is a pressing need to translate genetic signals into biological mechanisms, leading to prognostic, diagnostic and therapeutic advances.
Affiliation(s)
- Dirk S Paul
- UCL Cancer Institute, University College London, London, United Kingdom
| | - Nicole Soranzo
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom
- Department of Haematology, University of Cambridge, Cambridge, United Kingdom
| | - Stephan Beck
- UCL Cancer Institute, University College London, London, United Kingdom
|
18
|
Rhie SK, Coetzee SG, Noushmehr H, Yan C, Kim JM, Haiman CA, Coetzee GA. Comprehensive functional annotation of seventy-one breast cancer risk Loci. PLoS One 2013; 8:e63925. [PMID: 23717510 PMCID: PMC3661550 DOI: 10.1371/journal.pone.0063925] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 12/18/2012] [Accepted: 04/08/2013] [Indexed: 02/06/2023] Open
Abstract
Breast Cancer (BCa) genome-wide association studies revealed allelic frequency differences between cases and controls at index single nucleotide polymorphisms (SNPs). To date, 71 loci have thus been identified and replicated. More than 320,000 SNPs at these loci define BCa risk due to linkage disequilibrium (LD). We propose that BCa risk resides in a subgroup of SNPs that functionally affects breast biology. Such a shortlist will aid in framing hypotheses to prioritize a manageable number of likely disease-causing SNPs. We extracted all the SNPs residing in 1 Mb windows around each breast cancer risk index SNP from the 1000 Genomes Project to find correlated SNPs. We used FunciSNP, an R/Bioconductor package developed in-house, to identify potentially functional SNPs at 71 risk loci by coinciding them with chromatin biofeatures. We identified 1,005 SNPs in LD with the index SNPs (r² ≥ 0.5) in three categories: 21 in exons of 18 genes, 76 in transcription start site (TSS) regions of 25 genes, and 921 in enhancers. Thirteen SNPs were found in more than one category. We found two correlated and predicted non-benign coding variants (rs8100241 in exon 2 and rs8108174 in exon 3) of the gene ANKLE1. Most putative functional LD SNPs, however, were found in either epigenetically defined enhancers or in gene TSS regions. Fifty-five percent of these non-coding SNPs are likely functional, since they affect response element (RE) sequences of transcription factors. Functionality of these SNPs was assessed by expression quantitative trait loci (eQTL) analysis and allele-specific enhancer assays. Unbiased analyses of SNPs at BCa risk loci revealed new and overlooked mechanisms that may affect risk of the disease, thereby providing a valuable resource for follow-up studies.
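The shortlisting steps summarized in this abstract (window around the index SNP, LD threshold, overlap with biofeature categories) can be illustrated with a minimal sketch. This is not FunciSNP or its API: the `Snp` record, the `shortlist` and `category_counts` helpers, the SNP names, and all toy values are hypothetical, and treating the 1 Mb window as 500 kb on either side of the index SNP is an assumption. The final line only checks the arithmetic reported in the abstract: the three category counts sum to 1,018, which exceeds the 1,005 distinct SNPs by 13, consistent with thirteen SNPs each falling in two categories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    """Hypothetical record for one candidate SNP near an index SNP."""
    name: str
    pos: int               # base-pair position on the index SNP's chromosome
    r2: float              # LD (r^2) with the index SNP
    categories: frozenset  # overlapping biofeatures, e.g. {"exon", "tss", "enhancer"}


def shortlist(snps, index_pos, window=1_000_000, min_r2=0.5):
    """Keep SNPs inside the window (assumed centred on the index SNP),
    with r^2 >= min_r2, that overlap at least one biofeature category."""
    half = window // 2
    return [
        s for s in snps
        if abs(s.pos - index_pos) <= half and s.r2 >= min_r2 and s.categories
    ]


def category_counts(snps):
    """Count SNPs per category; a SNP in two categories is counted in both."""
    counts = {}
    for s in snps:
        for c in s.categories:
            counts[c] = counts.get(c, 0) + 1
    return counts


# Toy data (entirely made up) around a hypothetical index SNP at position 500,000.
candidates = [
    Snp("snpA", 100_000, 0.9, frozenset({"enhancer"})),
    Snp("snpB", 450_000, 0.6, frozenset({"tss", "enhancer"})),
    Snp("snpC", 480_000, 0.3, frozenset({"exon"})),        # LD too low
    Snp("snpD", 1_200_000, 0.8, frozenset({"enhancer"})),  # outside the window
    Snp("snpE", 520_000, 0.7, frozenset()),                # no biofeature overlap
]

kept = shortlist(candidates, index_pos=500_000)
counts = category_counts(kept)

# Arithmetic check on the figures reported in the abstract.
extra_memberships = (21 + 76 + 921) - 1005
```

In the toy data only `snpA` and `snpB` survive all three filters, and `snpB` contributes to two category counts, which mirrors how the per-category totals in the abstract can exceed the number of distinct SNPs.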
Affiliation(s)
- Suhn Kyong Rhie
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Norris Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
| | - Simon G. Coetzee
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Norris Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
| | - Houtan Noushmehr
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Norris Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
| | - Chunli Yan
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Norris Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
| | - Jae Mun Kim
- Zilkha Neurogenetic Institute, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
| | - Christopher A. Haiman
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Norris Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
| | - Gerhard A. Coetzee
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Norris Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
|
19
|
Abstract
PURPOSE OF REVIEW This review summarizes recently published large-scale efforts elucidating the genetic architecture of lipid levels. A supplemental file with all genetic loci is provided for research purposes, and we performed bioinformatic analyses of the genetic variants to give an overview of the pathways involved. RECENT FINDINGS In total, 52 genes for HDL cholesterol, 42 genes for LDL cholesterol, 59 genes for total cholesterol, and 39 genes for triglycerides have been identified. Genetic overlap is present between the different traits and similar pathways are involved. Most of the SNPs that were detected in the European studies could be replicated in other ethnicities, and these SNPs show the same direction of effect, suggesting that the underlying genetic architecture of blood lipids is similar between ethnicities. SUMMARY Genetic studies have identified many loci associated with plasma lipids and have provided insight into the underlying mechanisms of lipid homeostasis. Future research is needed to determine whether these loci may be novel targets for lipid-lowering therapy and for reducing cardiovascular disease risk. In addition, the proportion of genetic variance explained by these lipid loci is still limited, and new large-scale genetic studies are ongoing to identify additional common and rare variants associated with lipid levels.
Affiliation(s)
- Folkert W. Asselbergs
- Division of Heart and Lungs, Department of Cardiology
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht
- Durrer Center for Cardiogenetic Research, Amsterdam, The Netherlands
| | - Ruth C. Lovering
- Centre for Cardiovascular Genetics, BHF Laboratories, Institute of Cardiovascular Sciences, University College London, London, UK
| | - Fotios Drenos
- Centre for Cardiovascular Genetics, BHF Laboratories, Institute of Cardiovascular Sciences, University College London, London, UK
|
20
|
Lee K, Kim SC, Jung I, Kim K, Seo J, Lee HS, Bogu GK, Kim D, Lee S, Lee B, Choi JK. Genetic landscape of open chromatin in yeast. PLoS Genet 2013; 9:e1003229. [PMID: 23408895 PMCID: PMC3567132 DOI: 10.1371/journal.pgen.1003229] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 09/19/2012] [Accepted: 11/26/2012] [Indexed: 11/19/2022] Open
Abstract
Chromatin regulation underlies a variety of DNA metabolism processes, including transcription, recombination, repair, and replication. To perform a quantitative genetic analysis of chromatin accessibility, we obtained open chromatin profiles across 96 genetically different yeast strains by FAIRE (formaldehyde-assisted isolation of regulatory elements) assay followed by sequencing. While 5–10% of open chromatin regions (OCRs) were significantly affected by variations in their underlying DNA sequences, subtelomeric areas as well as gene-rich and gene-poor regions displayed high levels of sequence-independent variation. We performed quantitative trait loci (QTL) mapping using the FAIRE signal for each OCR as a quantitative trait. While individual OCRs were associated with a handful of specific genetic markers, gene expression levels were associated with many regulatory loci. We found multi-target trans-loci responsible for a very large number of OCRs, which seemed to reflect the widespread influence of certain chromatin regulators. Such regulatory hotspots were enriched for known regulatory functions, such as recombinational DNA repair, telomere replication, and general transcription control. The OCRs associated with these multi-target trans-loci coincided with recombination hotspots, telomeres, and gene-rich regions according to the function of the associated regulators. Our findings provide a global quantitative picture of the genetic architecture of chromatin regulation.
Affiliation(s)
- Kibaick Lee
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Korea
| | - Sang Cheol Kim
- Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
| | - Inkyung Jung
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Korea
| | - Kwoneel Kim
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Korea
| | - Jungmin Seo
- Research Institute of Bioinformatics, Omicsis, Daejeon, Korea
| | - Heun-Sik Lee
- Center for Genome Science, National Institute of Health, Cheongwon, Korea
| | - Dongsup Kim
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Korea
| | - Sanghyuk Lee
- Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
| | - Byungwook Lee
- Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
- * E-mail: (BL); (JKC)
| | - Jung Kyoon Choi
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Korea
- Genome Institute of Singapore, Singapore, Singapore
- * E-mail: (BL); (JKC)
|
21
|
Functional variants from chromatin changes. Nat Methods 2012. [DOI: 10.1038/nmeth.2201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
|