1. Fazel-Najafabadi M, Looger LL, Reddy-Rallabandi H, Nath SK. A multilayered post-GWAS analysis pipeline defines functional variants and target genes for systemic lupus erythematosus (SLE). MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.04.07.23288295. [PMID: 37066327 PMCID: PMC10104240 DOI: 10.1101/2023.04.07.23288295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2023]
Abstract
Objectives: Systemic lupus erythematosus (SLE), an autoimmune disease with incompletely understood etiology, has a strong genetic component. Although genome-wide association studies (GWAS) have revealed multiple SLE susceptibility loci and associated single nucleotide polymorphisms (SNPs), the precise causal variants, target genes, cell types, tissues, and mechanisms of action remain largely unknown. Methods: Here, we report a comprehensive post-GWAS analysis using extensive bioinformatics, molecular modeling, and integrative functional genomic and epigenomic analyses to optimize fine-mapping. We compile and cross-reference immune cell-specific expression quantitative trait loci (cis- and trans-eQTLs) with promoter-capture Hi-C, allele-specific chromatin accessibility, and massively parallel reporter assay data to define predisposing variants and target genes. We experimentally validate a predicted locus using CRISPR/Cas9 genome editing, qPCR, and Western blot. Results: Anchoring on 452 index SNPs, we selected 9,931 high-linkage disequilibrium (r² > 0.8) SNPs and defined 182 independent non-HLA SLE loci. 3,746 SNPs from 143 loci were identified as regulating 564 unique genes. Target genes are enriched in lupus-related tissues and associated with other autoimmune diseases. Of these, 329 SNPs (106 loci) showed significant allele-specific chromatin accessibility and/or enhancer activity, indicating regulatory potential. Using CRISPR/Cas9, we validated rs57668933 as a functional variant regulating multiple targets, including SLE risk gene ELF1, in B-cells. Conclusion: We demonstrate and validate post-GWAS strategies for utilizing multi-dimensional data to prioritize likely causal variants with cognate gene targets underlying SLE pathogenesis. Our results provide a catalog of significantly SLE-associated SNPs and loci, target genes, and likely biochemical mechanisms, to guide experimental characterization.
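The linkage-disequilibrium threshold (r² > 0.8) used in the abstract above to expand index SNPs into proxy SNPs can be illustrated with a minimal sketch. It assumes phased haplotypes coded 0/1 for two biallelic SNPs; the function names `ld_r2` and `high_ld_proxies` and the toy data are hypothetical and are not taken from the authors' pipeline, which the abstract does not describe at this level.

```python
from typing import Dict, List, Sequence


def ld_r2(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """Return r^2 between two biallelic SNPs from phased haplotypes coded 0/1."""
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("haplotype vectors must be non-empty and equally long")
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return d * d / denom


def high_ld_proxies(index_hap: Sequence[int],
                    candidates: Dict[str, Sequence[int]],
                    threshold: float = 0.8) -> List[str]:
    """Return names of candidate SNPs with r^2 above the threshold (assumed 0.8)."""
    return [name for name, hap in candidates.items()
            if ld_r2(index_hap, hap) > threshold]


if __name__ == "__main__":
    # Toy haplotypes (hypothetical data, eight chromosomes).
    index = [1, 1, 0, 0, 1, 0, 1, 0]
    candidates = {
        "snp_same": [1, 1, 0, 0, 1, 0, 1, 0],
        "snp_partial": [1, 1, 0, 0, 1, 0, 0, 0],
    }
    print(high_ld_proxies(index, candidates))
```

In practice r² is usually taken from a reference panel with a dedicated tool rather than computed by hand; the sketch only shows what the threshold means.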

2. Cao Y, Liu S, Ren G, Tang Q, Zhao K. cLoops2: a full-stack comprehensive analytical tool for chromatin interactions. Nucleic Acids Res 2021; 50:57-71. [PMID: 34928392 DOI: 10.1093/nar/gkab1233] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 07/20/2021] [Revised: 11/18/2021] [Accepted: 12/02/2021] [Indexed: 12/21/2022]
Abstract
Investigating chromatin interactions between regulatory regions such as enhancer and promoter elements is vital for understanding the regulation of gene expression. Compared to Hi-C and its variants, the emerging 3D mapping technologies focusing on enriched signals, such as TrAC-looping, reduce the sequencing cost and provide higher interaction resolution for cis-regulatory elements. A robust pipeline is needed for the comprehensive interpretation of these data, especially for loop-centric analysis. Therefore, we have developed a new versatile tool named cLoops2 for the full-stack analysis of these 3D chromatin interaction data. cLoops2 consists of core modules for peak-calling, loop-calling, differentially enriched loops calling and loops annotation. It also contains multiple modules for interaction resolution estimation, data similarity estimation, features quantification, feature aggregation analysis, and visualization. cLoops2 with documentation and example data are open source and freely available at GitHub: https://github.com/KejiZhaoLab/cLoops2.
Affiliation(s)
- Yaqiang Cao
  - Laboratory of Epigenome Biology, Systems Biology Center, National Heart, Lung and Blood Institute, NIH, Bethesda, MD 20892, USA
- Shuai Liu
  - Laboratory of Epigenome Biology, Systems Biology Center, National Heart, Lung and Blood Institute, NIH, Bethesda, MD 20892, USA
- Gang Ren
  - Laboratory of Epigenome Biology, Systems Biology Center, National Heart, Lung and Blood Institute, NIH, Bethesda, MD 20892, USA
- Qingsong Tang
  - Laboratory of Epigenome Biology, Systems Biology Center, National Heart, Lung and Blood Institute, NIH, Bethesda, MD 20892, USA
- Keji Zhao
  - Laboratory of Epigenome Biology, Systems Biology Center, National Heart, Lung and Blood Institute, NIH, Bethesda, MD 20892, USA

3. Li Y, Yan H, Guo J, Han Y, Zhang C, Liu X, Du J, Tian XL. Down-regulated RGS5 by genetic variants impairs endothelial cell function and contributes to coronary artery disease. Cardiovasc Res 2021; 117:240-255. [PMID: 31605122 DOI: 10.1093/cvr/cvz268] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/06/2019] [Revised: 08/22/2019] [Accepted: 10/04/2019] [Indexed: 12/20/2022]
Abstract
AIMS: Genetic contribution to coronary artery disease (CAD) remains largely unillustrated. Although transcriptomic profiles have identified dozens of genes that are differentially expressed in normal and atherosclerotic vessels, whether those genes are genetically associated with CAD remains to be determined. Here, we combined genetic association studies, transcriptome profiles and in vitro and in vivo functional experiments to identify novel susceptibility genes for CAD. METHODS AND RESULTS: Through an integrative analysis of transcriptome profiles with genome-wide association studies for CAD, we obtained 18 candidate genes and selected one representative single nucleotide polymorphism (SNP) for each gene for multi-centred validations. We identified an intragenic SNP, rs1056515 in RGS5 gene (odds ratio = 1.17, 95% confidence interval = 1.10-1.24, P = 3.72 × 10⁻⁸) associated with CAD at genome-wide significance. Rare genetic variants in linkage disequilibrium with rs1056515 were identified in CAD patients leading to a decreased expression of RGS5. The decreased expression was also observed in atherosclerotic vessels and endothelial cells treated by various cardiovascular risk factors. Through siRNA knockdown and adenoviral overexpression, we further showed that RGS5 regulated endothelial inflammation, vascular remodelling, as well as canonical NF-κB signalling activation. Moreover, CXCL12, a specific downstream target of the non-canonical NF-κB pathway, was strongly affected by RGS5. However, the p100 processing, a well-documented marker for non-canonical NF-κB pathway activation, was not altered, suggesting an existence of a novel mechanism by which RGS5 regulates CXCL12. CONCLUSIONS: We identified RGS5 as a novel susceptibility gene for CAD and showed that the decreased expression of RGS5 impaired endothelial cell function and functionally contributed to atherosclerosis through a variety of molecular mechanisms. How RGS5 regulates the expression of CXCL12 needs further studies.
Affiliation(s)
- Yang Li
  - Vascular Biology Laboratory, Beijing Anzhen Hospital, Capital Medical University, Beijing Institute of Heart, Lung & Blood Vessel Disease, Beijing, China
- Han Yan
  - Department of Human Population Genetics, Institute of Molecular Medicine, Peking University, No. 5 Yiheyuan Road, Beijing, China
- Jian Guo
  - Department of Human Population Genetics, Institute of Molecular Medicine, Peking University, No. 5 Yiheyuan Road, Beijing, China
- Yingchun Han
  - Vascular Biology Laboratory, Beijing Anzhen Hospital, Capital Medical University, Beijing Institute of Heart, Lung & Blood Vessel Disease, Beijing, China
- Cuifang Zhang
  - Department of Human Population Genetics, Institute of Molecular Medicine, Peking University, No. 5 Yiheyuan Road, Beijing, China
- Xiuying Liu
  - Center for Molecular Systems Biology, Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
- Jie Du
  - Vascular Biology Laboratory, Beijing Anzhen Hospital, Capital Medical University, Beijing Institute of Heart, Lung & Blood Vessel Disease, Beijing, China
- Xiao-Li Tian
  - Department of Human Population Genetics, Institute of Molecular Medicine, Peking University, No. 5 Yiheyuan Road, Beijing, China
  - Department of Human Population Genetics, A217 Life Science Building, Human Aging Research Institute and School of Life Science, Jiangxi Key Laboratory of Human Aging, Nanchang University, 999 Xuefu Road, Honggutan New District, Nanchang City, Jiangxi Province 330031, China

4. The Utility of Resolving Asthma Molecular Signatures Using Tissue-Specific Transcriptome Data. G3-GENES GENOMES GENETICS 2020; 10:4049-4062. [PMID: 32900903 PMCID: PMC7642926 DOI: 10.1534/g3.120.401718] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/25/2022]
Abstract
An integrative analysis focused on multi-tissue transcriptomics has not been done for asthma. Tissue-specific DEGs remain undetected in many multi-tissue analyses, which influences identification of disease-relevant pathways and potential drug candidates. Transcriptome data from 609 cases and 196 controls, generated using airway epithelium, bronchial, nasal, airway macrophages, distal lung fibroblasts, proximal lung fibroblasts, CD4+ lymphocytes, CD8+ lymphocytes from whole blood and induced sputum samples, were retrieved from Gene Expression Omnibus (GEO). Differentially regulated asthma-relevant genes identified from each sample type were used to identify (a) tissue-specific and tissue-shared asthma pathways, (b) their connection to GWAS-identified disease genes to identify candidate tissue for functional studies, (c) to select surrogate sample for invasive tissues, and finally (d) to identify potential drug candidates via connectivity map analysis. We found that inter-tissue similarity in gene expression was more pronounced at pathway/functional level than at gene level with highest similarity between bronchial epithelial cells and lung fibroblasts, and lowest between airway epithelium and whole blood samples. Although public-domain gene expression data are limited by inadequately annotated per-sample demographic and clinical information which limited the analysis, our tissue-resolved analysis clearly demonstrated relative importance of unique and shared asthma pathways. At the pathway level, IL-1b signaling and ERK signaling were significant in many tissue types, while Insulin-like growth factor and TGF-beta signaling were relevant in only airway epithelial tissue. IL-12 (in macrophages) and Immunoglobulin signaling (in lymphocytes) and chemokines (in nasal epithelium) were the highest expressed pathways. Overall, the IL-1 signaling genes (inflammatory) were relevant in the airway compartment, while pro-Th2 genes including IL-13 and STAT6 were more relevant in fibroblasts, lymphocytes, macrophages and bronchial biopsies. These genes were also associated with asthma in the GWAS catalog. Support Vector Machine showed that DEGs based on macrophages and epithelial cells have the highest and lowest discriminatory accuracy, respectively. Drug (entinostat, BMS-345541) and genetic perturbagens (KLF6, BCL10, INFB1 and BAMBI) negatively connected to disease at multi-tissue level could potentially be repurposed for treating asthma. Collectively, our study indicates that the DEGs, perturbagens and disease are connected differentially depending on tissue/cell types. While most of the existing literature describes asthma transcriptome data from individual sample types, the present work demonstrates the utility of multi-tissue transcriptome data. Future studies should focus on collecting transcriptomic data from multiple tissues, age and race groups, genetic background, disease subtypes and on the availability of better-annotated data in the public domain.

5. Discover novel disease-associated genes based on regulatory networks of long-range chromatin interactions. Methods 2020; 189:22-33. [PMID: 33096239 DOI: 10.1016/j.ymeth.2020.10.010] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/25/2020] [Revised: 08/29/2020] [Accepted: 10/18/2020] [Indexed: 02/01/2023]
Abstract
Identifying genes and non-coding genetic variants that are genetically associated with complex diseases and the underlying mechanisms is one of the most important questions in functional genomics. Due to the limited statistical power and the lack of mechanistic modeling, traditional genome-wide association studies (GWAS) are limited in their ability to fully address this question. Based on multi-omics data integration, cell-type specific regulatory networks can be built to improve GWAS analysis. In this study, we developed a new computational infrastructure, APRIL, to incorporate 3D chromatin interactions into regulatory network construction, which can extend the networks to include long-range cis-regulatory links between non-coding GWAS SNPs and target genes. Combinatorial transcription factors that co-regulate groups of genes are also inferred to further expand the networks with trans-regulation. A suite of machine learning predictions and statistical tests are incorporated in APRIL to predict novel disease-associated genes based on the expanded regulatory networks. Important features of non-coding regulatory elements and genetic variants are prioritized in network-based predictions, providing systems-level insights on the mechanisms of transcriptional dysregulation associated with complex diseases.

6. Kibinge NK, Relton CL, Gaunt TR, Richardson TG. Characterizing the Causal Pathway for Genetic Variants Associated with Neurological Phenotypes Using Human Brain-Derived Proteome Data. Am J Hum Genet 2020; 106:885-892. [PMID: 32413284 PMCID: PMC7273531 DOI: 10.1016/j.ajhg.2020.04.007] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 01/31/2020] [Accepted: 04/06/2020] [Indexed: 01/09/2023]
Abstract
Leveraging high-dimensional molecular datasets can help us develop mechanistic insight into associations between genetic variants and complex traits. In this study, we integrated human proteome data derived from brain tissue to evaluate whether targeted proteins putatively mediate the effects of genetic variants on seven neurological phenotypes (Alzheimer disease, amyotrophic lateral sclerosis, depression, insomnia, intelligence, neuroticism, and schizophrenia). Applying the principles of Mendelian randomization (MR) systematically across the genome highlighted 43 effects between genetically predicted proteins derived from the dorsolateral prefrontal cortex and these outcomes. Furthermore, genetic colocalization provided evidence that the same causal variant at 12 of these loci was responsible for variation in both protein and neurological phenotype. This included genes such as DCC, which encodes the netrin-1 receptor and has an important role in the development of the nervous system (p = 4.29 × 10⁻¹¹ with neuroticism), as well as SARM1, which has been previously implicated in axonal degeneration (p = 1.76 × 10⁻⁸ with amyotrophic lateral sclerosis). We additionally conducted a phenome-wide MR study for each of these 12 genes to assess potential pleiotropic effects on 700 complex traits and diseases. Our findings suggest that genes such as SNX32, which was initially associated with increased risk of Alzheimer disease, may potentially influence other complex traits in the opposite direction. In contrast, genes such as CTSH (which was also associated with Alzheimer disease) and SARM1 may make worthwhile therapeutic targets because they did not have genetically predicted effects on any of the other phenotypes after correcting for multiple testing.
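As a rough illustration of the Mendelian randomization principle mentioned in the abstract above, the sketch below computes a single-instrument Wald ratio: the variant's effect on the outcome divided by its effect on the exposure (here, protein abundance). This estimator is an assumption chosen for simplicity; the abstract does not state which MR estimator the authors used, and the function name `wald_ratio` and the input values are hypothetical.

```python
import math
from typing import Tuple


def wald_ratio(beta_exposure: float,
               beta_outcome: float,
               se_outcome: float) -> Tuple[float, float, float]:
    """Single-variant Wald ratio estimate with a first-order standard error.

    beta_exposure: variant effect on the exposure (e.g. protein level)
    beta_outcome:  variant effect on the outcome (e.g. disease log-odds)
    se_outcome:    standard error of beta_outcome
    Returns (estimate, standard_error, two_sided_p).
    """
    if beta_exposure == 0:
        raise ValueError("the instrument must have a non-zero effect on the exposure")
    estimate = beta_outcome / beta_exposure
    # First-order approximation: ignores uncertainty in beta_exposure.
    se = se_outcome / abs(beta_exposure)
    z = estimate / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p_value


if __name__ == "__main__":
    # Hypothetical summary statistics, not taken from the study.
    est, se, p = wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02)
    print(f"estimate={est:.3f} se={se:.3f} p={p:.3g}")
```

A significant Wald ratio alone does not establish that protein and phenotype share a causal variant, which is why the abstract pairs MR with genetic colocalization.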
Affiliation(s)
- Nelson K Kibinge
  - Medical Research Council (MRC) Integrative Epidemiology Unit (IEU), Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, United Kingdom
- Caroline L Relton
  - Medical Research Council (MRC) Integrative Epidemiology Unit (IEU), Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, United Kingdom
- Tom R Gaunt
  - Medical Research Council (MRC) Integrative Epidemiology Unit (IEU), Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, United Kingdom
- Tom G Richardson
  - Medical Research Council (MRC) Integrative Epidemiology Unit (IEU), Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, United Kingdom