1
The genetic regulation of the gastric transcriptome is associated with metabolic and obesity-related traits and diseases. Physiol Genomics 2024; 56:384-396. [PMID: 38406838 DOI: 10.1152/physiolgenomics.00120.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 01/26/2024] [Accepted: 02/21/2024] [Indexed: 02/27/2024]
Abstract
Tissue-specific gene expression and gene regulation lead to a better understanding of tissue-specific physiology and pathophysiology. We analyzed the transcriptome and genetic regulatory profiles of two distinct gastric sites, corpus and antrum, to identify tissue-specific gene expression and its regulation. Gastric corpus and antrum mucosa biopsies were collected during routine gastroscopies from up to 431 healthy individuals. We obtained genotype and transcriptome data and performed transcriptome profiling and expression quantitative trait locus (eQTL) studies. We further used data from genome-wide association studies (GWAS) of various diseases and traits to partition their heritability and to perform transcriptome-wide association studies (TWAS). The transcriptome data from corpus and antral mucosa highlights the heterogeneity of gene expression in the stomach. We identified enriched pathways revealing distinct and common physiological processes in gastric corpus and antrum. Furthermore, we found an enrichment of the single nucleotide polymorphism (SNP)-based heritability of metabolic, obesity-related, and cardiovascular traits and diseases by considering corpus- and antrum-specifically expressed genes. Particularly, we could prioritize gastric-specific candidate genes for multiple metabolic traits, like NQO1 which is involved in glucose metabolism, MUC1 which contributes to purine and protein metabolism or RAB27B being a regulator of weight and body composition. Our findings show that gastric corpus and antrum vary in their transcriptome and genetic regulatory profiles indicating physiological differences which are mostly related to digestion and epithelial protection. Moreover, our findings demonstrate that the genetic regulation of the gastric transcriptome is linked to biological mechanisms associated with metabolic, obesity-related, and cardiovascular traits and diseases. 
NEW & NOTEWORTHY We analyzed the transcriptomes and genetic regulatory profiles of gastric corpus and for the first time also of antrum mucosa in 431 healthy individuals. Through tissue-specific gene expression and eQTL analyses, we uncovered unique and common physiological processes across both primary gastric sites. Notably, our findings reveal that stomach-specific eQTLs are enriched in loci associated with metabolic traits and diseases, highlighting the pivotal role of gene expression regulation in gastric physiology and potential pathophysiology.
2
Subset-based method for cross-tissue transcriptome-wide association studies improves power and interpretability. HGG ADVANCES 2024; 5:100283. [PMID: 38491773 PMCID: PMC10999697 DOI: 10.1016/j.xhgg.2024.100283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 03/09/2024] [Accepted: 03/09/2024] [Indexed: 03/18/2024]
Abstract
Integrating results from genome-wide association studies (GWASs) and studies of molecular phenotypes such as gene expression can improve our understanding of the biological functions of trait-associated variants and can help prioritize candidate genes for downstream analysis. Using reference expression quantitative trait locus (eQTL) studies, several methods have been proposed to identify gene-trait associations, primarily based on gene expression imputation. To increase the statistical power by leveraging substantial eQTL sharing across tissues, meta-analysis methods aggregating such gene-based test results across multiple tissues or contexts have been developed as well. However, most existing meta-analysis methods have limited power to identify associations when the gene has weaker associations in only a few tissues and cannot identify the subset of tissues in which the gene is "activated." For this, we developed a cross-tissue subset-based transcriptome-wide association study (CSTWAS) meta-analysis method that improves power under such scenarios and can extract the set of potentially associated tissues. To improve applicability, CSTWAS uses only GWAS summary statistics and pre-computed correlation matrices to identify a subset of tissues that have the maximal evidence of gene-trait association. Through numerical simulations, we found that CSTWAS can maintain a well-calibrated type-I error rate, improves power especially when there is a small number of associated tissues for a gene-trait association, and identifies an accurate associated tissue set. By analyzing GWAS summary statistics of three complex traits and diseases, we demonstrate that CSTWAS could identify biologically meaningful signals while providing an interpretation of disease etiology by extracting a set of potentially associated tissues.
3
TBK1, a prioritized drug repurposing target for amyotrophic lateral sclerosis: evidence from druggable genome Mendelian randomization and pharmacological verification in vitro. BMC Med 2024; 22:96. [PMID: 38443977 PMCID: PMC10916235 DOI: 10.1186/s12916-024-03314-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Accepted: 02/23/2024] [Indexed: 03/07/2024]
Abstract
BACKGROUND There is a lack of effective therapeutic strategies for amyotrophic lateral sclerosis (ALS); therefore, drug repurposing might provide a rapid approach to meet the urgent need for treatment. METHODS To identify therapeutic targets associated with ALS, we conducted Mendelian randomization (MR) analysis and colocalization analysis using cis-eQTL of druggable gene and ALS GWAS data collections to determine annotated druggable gene targets that exhibited significant associations with ALS. By subsequent repurposing drug discovery coupled with inclusion criteria selection, we identified several drug candidates corresponding to their druggable gene targets that have been genetically validated. The pharmacological assays were then conducted to further assess the efficacy of genetics-supported repurposed drugs for potential ALS therapy in various cellular models. RESULTS Through MR analysis, we identified potential ALS druggable genes in the blood, including TBK1 [OR 1.30, 95%CI (1.19, 1.42)], TNFSF12 [OR 1.36, 95%CI (1.19, 1.56)], GPX3 [OR 1.28, 95%CI (1.15, 1.43)], TNFSF13 [OR 0.45, 95%CI (0.32, 0.64)], and CD68 [OR 0.38, 95%CI (0.24, 0.58)]. Additionally, we identified potential ALS druggable genes in the brain, including RESP18 [OR 1.11, 95%CI (1.07, 1.16)], GPX3 [OR 0.57, 95%CI (0.48, 0.68)], GDF9 [OR 0.77, 95%CI (0.67, 0.88)], and PTPRN [OR 0.17, 95%CI (0.08, 0.34)]. Among them, TBK1, TNFSF12, RESP18, and GPX3 were confirmed in further colocalization analysis. We identified five drugs with repurposing opportunities targeting TBK1, TNFSF12, and GPX3, namely fostamatinib (R788), amlexanox (AMX), BIIB-023, RG-7212, and glutathione as potential repurposing drugs. R788 and AMX were prioritized due to their genetic supports, safety profiles, and cost-effectiveness evaluation. 
Further pharmacological analysis revealed that R788 and AMX mitigated neuroinflammation in ALS cell models characterized by overly active cGAS/STING signaling that was induced by MSA-2 or ALS-related toxic proteins (TDP-43 and SOD1), through the inhibition of TBK1 phosphorylation. CONCLUSIONS Our MR analyses provided genetic evidence supporting TBK1, TNFSF12, RESP18, and GPX3 as druggable genes for ALS treatment. Among the drug candidates targeting the above genes with repurposing opportunities, FDA-approved drug-R788 and AMX served as effective TBK1 inhibitors. The subsequent pharmacological studies validated the potential of R788 and AMX for treating specific ALS subtypes through the inhibition of TBK1 phosphorylation.
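The odds ratios and 95% confidence intervals reported above are the usual output of a single-instrument Mendelian randomization estimate. The abstract does not say which estimator was used, so the following is only a minimal sketch of the Wald ratio, a common choice when one cis-eQTL serves as the instrument. The function name, its parameters, and the example numbers are hypothetical and are not taken from the study.

```python
import math


def wald_ratio_or(beta_exposure, beta_outcome, se_outcome, z=1.96):
    """Wald ratio MR estimate for one instrument, returned as (OR, lower, upper).

    beta_exposure: SNP effect on gene expression (eQTL effect size).
    beta_outcome:  SNP effect on disease, on the log-odds scale (GWAS effect size).
    se_outcome:    standard error of beta_outcome.
    The standard error uses a first-order approximation that ignores
    uncertainty in beta_exposure (an assumption of this sketch).
    """
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    log_or = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# Hypothetical summary statistics, for illustration only.
odds_ratio, lower, upper = wald_ratio_or(0.5, 0.1, 0.02)
```

An odds ratio above 1 with an interval excluding 1 would suggest that genetically predicted higher expression raises risk; values below 1 suggest the opposite direction.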
4
Expression quantitative trait loci analysis in rheumatoid arthritis identifies tissue specific variants associated with severity and outcome. Ann Rheum Dis 2024; 83:288-299. [PMID: 37979960 PMCID: PMC10894812 DOI: 10.1136/ard-2023-224540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Accepted: 10/20/2023] [Indexed: 11/20/2023]
Abstract
OBJECTIVE Genome-wide association studies have successfully identified more than 100 loci associated with susceptibility to rheumatoid arthritis (RA). However, our understanding of the functional effects of genetic variants in causing RA and their effects on disease severity and response to treatment remains limited. METHODS In this study, we conducted expression quantitative trait locus (eQTL) analysis to dissect the link between genetic variants and gene expression comparing the disease tissue against blood using RNA-Sequencing of synovial biopsies (n=85) and blood samples (n=51) from treatment-naïve patients with RA from the Pathobiology of Early Arthritis Cohort. RESULTS This identified 898 eQTL genes in synovium and genes loci in blood, with 232 genes in common to both synovium and blood, although notably many eQTL were tissue specific. Examining the HLA region, we uncovered a specific eQTL at HLA-DPB2 with the critical triad of single-nucleotide polymorphisms (SNPs) rs3128921 driving synovial HLA-DPB2 expression, and both rs3128921 and HLA-DPB2 gene expression correlating with clinical severity and increasing probability of the lympho-myeloid pathotype. CONCLUSIONS This analysis highlights the need to explore functional consequences of genetic associations in disease tissue. HLA-DPB2 SNP rs3128921 could potentially be used to stratify patients to more aggressive treatment immediately at diagnosis.
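At its core, an eQTL test asks whether expression of a gene changes with the number of copies of an allele. The abstract does not describe the model or covariates used, so the sketch below assumes the simplest additive form: an ordinary least-squares slope of expression on allele dosage (0, 1, or 2), without covariates. The function name and the data are hypothetical.

```python
from statistics import mean


def eqtl_slope(dosages, expression):
    """Ordinary least-squares slope of expression on allele dosage (0, 1, 2).

    A non-zero slope means expression shifts per copy of the counted allele.
    Real analyses add covariates and a significance test; both are omitted here.
    """
    if len(dosages) != len(expression):
        raise ValueError("dosages and expression must have the same length")
    mean_x, mean_y = mean(dosages), mean(expression)
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("genotype is monomorphic in this sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    return sxy / sxx


# Hypothetical data: six samples, expression rising by one unit per allele copy.
slope = eqtl_slope([0, 1, 2, 0, 1, 2], [1.0, 2.0, 3.0, 1.0, 2.0, 3.0])
```

Running the same regression separately in synovium and in blood, and comparing the slopes, is one simple way to picture what "tissue specific" means for an eQTL.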
5
Common genetic variation impacts stress response in the brain. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.12.27.573459. [PMID: 38234801 PMCID: PMC10793429 DOI: 10.1101/2023.12.27.573459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]
Abstract
To explain why individuals exposed to identical stressors experience divergent clinical outcomes, we determine how molecular encoding of stress modifies genetic risk for brain disorders. Analysis of post-mortem brain (n=304) revealed 8557 stress-interactive expression quantitative trait loci (eQTLs) that dysregulate expression of 915 eGenes in response to stress, and lie in stress-related transcription factor binding sites. Response to stress is robust across experimental paradigms: up to 50% of stress-interactive eGenes validate in glucocorticoid treated hiPSC-derived neurons (n=39 donors). Stress-interactive eGenes show brain region- and cell type-specificity, and, in post-mortem brain, implicate glial and endothelial mechanisms. Stress dysregulates long-term expression of disorder risk genes in a genotype-dependent manner; stress-interactive transcriptomic imputation uncovered 139 novel genes conferring brain disorder risk only in the context of traumatic stress. Molecular stress-encoding explains individualized responses to traumatic stress; incorporating trauma into genomic studies of brain disorders is likely to improve diagnosis, prognosis, and drug discovery.
6
Stem Cell Models for Context-Specific Modeling in Psychiatric Disorders. Biol Psychiatry 2023; 93:642-650. [PMID: 36658083 DOI: 10.1016/j.biopsych.2022.09.033] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 05/02/2022] [Revised: 09/27/2022] [Accepted: 09/27/2022] [Indexed: 01/21/2023]
Abstract
Genome-wide association studies reveal the complex polygenic architecture underlying psychiatric disorder risk, but there is an unmet need to validate causal variants, resolve their target genes(s), and explore their functional impacts on disorder-related mechanisms. Disorder-associated loci regulate transcription of target genes in a cell type- and context-specific manner, which can be measured through expression quantitative trait loci. In this review, we discuss methods and insights from context-specific modeling of genetically and environmentally regulated expression. Human induced pluripotent stem cell-derived cell type and organoid models have uncovered context-specific psychiatric disorder associations by investigating tissue-, cell type-, sex-, age-, and stressor-specific genetic regulation of expression. Techniques such as massively parallel reporter assays and pooled CRISPR (clustered regularly interspaced short palindromic repeats) screens make it possible to functionally fine-map genome-wide association study loci and validate their target genes at scale. Integration of disorder-associated contexts with these patient-specific human induced pluripotent stem cell models makes it possible to uncover gene by environment interactions that mediate disorder risk, which will ultimately improve our ability to diagnose and treat psychiatric disorders.
7
Brain expression quantitative trait locus and network analyses reveal downstream effects and putative drivers for brain-related diseases. Nat Genet 2023; 55:377-388. [PMID: 36823318 PMCID: PMC10011140 DOI: 10.1038/s41588-023-01300-6] [Citation(s) in RCA: 35] [Impact Index Per Article: 35.0] [Received: 03/12/2021] [Accepted: 01/17/2023] [Indexed: 02/25/2023]
Abstract
Identification of therapeutic targets from genome-wide association studies (GWAS) requires insights into downstream functional consequences. We harmonized 8,613 RNA-sequencing samples from 14 brain datasets to create the MetaBrain resource and performed cis- and trans-expression quantitative trait locus (eQTL) meta-analyses in multiple brain region- and ancestry-specific datasets (n ≤ 2,759). Many of the 16,169 cortex cis-eQTLs were tissue-dependent when compared with blood cis-eQTLs. We inferred brain cell types for 3,549 cis-eQTLs by interaction analysis. We prioritized 186 cis-eQTLs for 31 brain-related traits using Mendelian randomization and co-localization including 40 cis-eQTLs with an inferred cell type, such as a neuron-specific cis-eQTL (CYP24A1) for multiple sclerosis. We further describe 737 trans-eQTLs for 526 unique variants and 108 unique genes. We used brain-specific gene-co-regulation networks to link GWAS loci and prioritize additional genes for five central nervous system diseases. This study represents a valuable resource for post-GWAS research on central nervous system diseases.
8
Abstract
BACKGROUND Coronary artery disease (CAD) is the leading cause of death worldwide. Recent meta-analyses of genome-wide association studies have identified over 175 loci associated with CAD. The majority of these loci are in noncoding regions and are predicted to regulate gene expression. Given that vascular smooth muscle cells (SMCs) play critical roles in the development and progression of CAD, we aimed to identify the subset of the CAD loci associated with the regulation of transcription in distinct SMC phenotypes. METHODS We measured gene expression in SMCs isolated from the ascending aortas of 151 heart transplant donors of various genetic ancestries in quiescent or proliferative conditions and calculated the association of their expression and splicing with ~6.3 million imputed single-nucleotide polymorphism markers across the genome. RESULTS We identified 4910 expression and 4412 splicing quantitative trait loci (sQTLs) representing regions of the genome associated with transcript abundance and splicing. A total of 3660 expression quantitative trait loci (eQTLs) had not been observed in the publicly available Genotype-Tissue Expression dataset. Further, 29 and 880 eQTLs were SMC-specific and sex-biased, respectively. We made these results available for public query on a user-friendly website. To identify the effector transcript(s) regulated by CAD loci, we used 4 distinct colocalization approaches. We identified 84 eQTL and 164 sQTL that colocalized with CAD loci, highlighting the importance of genetic regulation of mRNA splicing as a molecular mechanism for CAD genetic risk. Notably, 20% and 35% of the eQTLs were unique to quiescent or proliferative SMCs, respectively. One CAD locus colocalized with a sex-specific eQTL (TERF2IP), and another locus colocalized with SMC-specific eQTL (ALKBH8). The most significantly associated CAD locus, 9p21, was an sQTL for the long noncoding RNA CDKN2B-AS1, also known as ANRIL, in proliferative SMCs. 
CONCLUSIONS Collectively, our results provide evidence for the molecular mechanisms of genetic susceptibility to CAD in distinct SMC phenotypes.
9
Mendelian randomization and genetic colocalization infer the effects of the multi-tissue proteome on 211 complex disease-related phenotypes. Genome Med 2022; 14:140. [PMID: 36510323 PMCID: PMC9746220 DOI: 10.1186/s13073-022-01140-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/10/2021] [Accepted: 11/10/2022] [Indexed: 12/14/2022]
Abstract
BACKGROUND Human proteins are widely used as drug targets. Integration of large-scale protein-level genome-wide association studies (GWAS) and disease-related GWAS has thus connected genetic variation to disease mechanisms via protein. Previous proteome-by-phenome-wide Mendelian randomization (MR) studies have been mainly focused on plasma proteomes. Previous MR studies using the brain proteome only reported protein effects on a set of pre-selected tissue-specific diseases. No studies, however, have used high-throughput proteomics from multiple tissues to perform MR on hundreds of phenotypes. METHODS Here, we performed MR and colocalization analysis using multi-tissue (cerebrospinal fluid (CSF), plasma, and brain from pre- and post-meta-analysis of several disease-focus cohorts including Alzheimer disease (AD)) protein quantitative trait loci (pQTLs) as instrumental variables to infer protein effects on 211 phenotypes, covering seven broad categories: biological traits, blood traits, cancer types, neurological diseases, other diseases, personality traits, and other risk factors. We first implemented these analyses with cis pQTLs, as cis pQTLs are known for being less prone to horizontal pleiotropy. Next, we included both cis and trans conditionally independent pQTLs that passed the genome-wide significance threshold keeping only variants associated with fewer than five proteins to minimize pleiotropic effects. We compared the tissue-specific protein effects on phenotypes across different categories. Finally, we integrated the MR-prioritized proteins with the druggable genome to identify new potential targets. RESULTS In the MR and colocalization analysis including study-wide significant cis pQTLs as instrumental variables, we identified 33 CSF, 13 plasma, and five brain proteins to be putative causal for 37, 18, and eight phenotypes, respectively. 
After expanding the instrumental variables by including genome-wide significant cis and trans pQTLs, we identified a total of 58 CSF, 32 plasma, and nine brain proteins associated with 58, 44, and 16 phenotypes, respectively. For those protein-phenotype associations that were found in more than one tissue, the directions of the associations for 13 (87%) pairs were consistent across tissues. As we were unable to use methods correcting for horizontal pleiotropy given most of the proteins were only associated with one valid instrumental variable after clumping, we found that the observations of protein-phenotype associations were consistent with a causal role or horizontal pleiotropy. Between 66.7 and 86.3% of the disease-causing proteins overlapped with the druggable genome. Finally, between one and three proteins, depending on the tissue, were connected with at least one drug compound for one phenotype from both DrugBank and ChEMBL databases. CONCLUSIONS Integrating multi-tissue pQTLs with MR and the druggable genome may open doors to pinpoint novel interventions for complex traits with no effective treatments, such as ovarian and lung cancers.
10
CLIMB: High-dimensional association detection in large scale genomic data. Nat Commun 2022; 13:6874. [PMID: 36371401 PMCID: PMC9653391 DOI: 10.1038/s41467-022-34360-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/05/2020] [Accepted: 10/21/2022] [Indexed: 11/14/2022]
Abstract
Joint analyses of genomic datasets obtained in multiple different conditions are essential for understanding the biological mechanism that drives tissue-specificity and cell differentiation, but they still remain computationally challenging. To address this we introduce CLIMB (Composite LIkelihood eMpirical Bayes), a statistical methodology that learns patterns of condition-specificity present in genomic data. CLIMB provides a generic framework facilitating a host of analyses, such as clustering genomic features sharing similar condition-specific patterns and identifying which of these features are involved in cell fate commitment. We apply CLIMB to three sets of hematopoietic data, which examine CTCF ChIP-seq measured in 17 different cell populations, RNA-seq measured across constituent cell populations in three committed lineages, and DNase-seq in 38 cell populations. Our results show that CLIMB improves upon existing alternatives in statistical precision, while capturing interpretable and biologically relevant clusters in the data.
11
CellRegMap: a statistical framework for mapping context-specific regulatory variants using scRNA-seq. Mol Syst Biol 2022; 18:e10663. [PMID: 35972065 PMCID: PMC9380406 DOI: 10.15252/msb.202110663] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/31/2021] [Revised: 06/28/2022] [Accepted: 07/01/2022] [Indexed: 11/11/2022]
Abstract
Single‐cell RNA sequencing (scRNA‐seq) enables characterizing the cellular heterogeneity in human tissues. Recent technological advances have enabled the first population‐scale scRNA‐seq studies in hundreds of individuals, making it possible to assay genetic effects with single‐cell resolution. However, existing strategies to analyze these data remain based on principles established for the genetic analysis of bulk RNA‐seq. In particular, current methods depend on a priori definitions of discrete cell types, and hence cannot assess allelic effects across subtle cell types and cell states. To address this, we propose the Cell Regulatory Map (CellRegMap), a statistical framework to test for and quantify genetic effects on gene expression in individual cells. CellRegMap provides a principled approach to identify and characterize genotype–context interactions of known eQTL variants using scRNA‐seq data. This model‐based approach resolves allelic effects across cellular contexts of different granularity, including genetic effects specific to cell subtypes and continuous cell transitions. We validate CellRegMap using simulated data and apply it to previously identified eQTL from two recent studies of differentiating iPSCs, where we uncover hundreds of eQTL displaying heterogeneity of genetic effects across cellular contexts. Finally, we identify fine‐grained genetic regulation in neuronal subtypes for eQTL that are colocalized with human disease variants.
12
Identification of a Novel Functional Non-synonymous Single Nucleotide Polymorphism in Frizzled Class Receptor 6 Gene for Involvement in Depressive Symptoms. Front Mol Neurosci 2022; 15:882396. [PMID: 35875672 PMCID: PMC9302575 DOI: 10.3389/fnmol.2022.882396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2022] [Accepted: 06/16/2022] [Indexed: 12/05/2022]
Abstract
Although numerous susceptibility loci for depression have been identified in recent years, their biological function and molecular mechanism remain largely unknown. By using an exome-wide association study for depressive symptoms assessed by the Center for Epidemiological Studies Depression (CES-D) score, we discovered a novel missense single nucleotide polymorphism (SNP), rs61753730 (Q152E), located in the fourth exon of the frizzled class receptor 6 gene (FZD6), which is a potential causal variant and is significantly associated with the CES-D score. Computer-based in silico analysis revealed that the protein configuration and stability, as well as the secondary structure of FZD6, differed greatly between the wild-type (WT) and Q152E mutant. We further found that rs61753730 significantly affected the luciferase activity and expression of FZD6 in an allele-specific way. Finally, we generated Fzd6-knockin (Fzd6-KI) mice with the rs61753730 mutation using the CRISPR/Cas9 genome editing system and found that these mice presented greater immobility in the forced swimming test, less preference for sucrose in the sucrose preference test, as well as decreased center entries, center time, and distance traveled in the open field test compared with WT mice after exposure to chronic social defeat stress. These results indicate the involvement of rs61753730 in depression. Taken together, our findings demonstrate that SNP rs61753730 is a novel functional variant and plays an important role in depressive symptoms.
13
HSPB1 Gene Variants and Schizophrenia: A Case-Control Study in a Polish Population. DISEASE MARKERS 2022; 2022:4933011. [PMID: 35340410 PMCID: PMC8941579 DOI: 10.1155/2022/4933011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2021] [Revised: 02/01/2022] [Accepted: 03/01/2022] [Indexed: 11/20/2022]
Abstract
Schizophrenia (SCZ) is a severe psychiatric disorder that has a significant genetic component. HSPB1 (HSP27) is known for its neuroprotective functions under stress conditions and appears to play an important role during the development of the central nervous system, which is in agreement with the neurodevelopmental hypothesis of SCZ. The aim of the present case-control study was to investigate whether HSPB1 variants contribute to the risk and clinical features (age of onset, symptoms, and suicidal behavior) of SCZ in a Polish population. To the best of our knowledge, this is the first study that investigated the association between the HSPB1 polymorphisms and SCZ. Three SNPs of HSPB1 (rs2868370, rs2868371, and rs7459185) were genotyped in a total of 1082 (403 patients and 679 controls) unrelated subjects using TaqMan assays. The results showed that the genotypes, alleles, and haplotypes of the three SNPs were not significantly different between the schizophrenic patients and healthy controls either in the overall analysis or in the gender-stratified analysis (all p > 0.05). However, we did find a significant effect of the rs2868371 genotype on the age of onset, negative symptoms, and disorganized symptoms in the five-factor model of PANSS (all p < 0.01). Post hoc comparisons showed that carriers of the rs2868371 G/G genotype had significantly higher negative and disorganized factor scores than those with the C/G and C/C genotypes, respectively. Further investigations with other larger independent samples are required to confirm our findings and to better explore the effect of the HSPB1 polymorphisms on the risk and symptomatology of SCZ.
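A case-control comparison of allele frequencies, as described above, is often summarized with a chi-square test on a 2x2 table of allele counts. The abstract does not name the test that was used, so this is only an illustrative sketch of that common approach. The function name and the counts are hypothetical and do not come from the study.

```python
import math


def allele_chi_square(case_counts, control_counts):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    case_counts, control_counts: (count of allele A, count of allele B).
    Returns (statistic, p_value). Assumes every expected cell count is non-zero.
    """
    table = [list(case_counts), list(control_counts)]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical allele counts, for illustration only.
statistic, p_value = allele_chi_square((10, 20), (20, 10))
```

With identical allele frequencies in cases and controls the statistic is 0 and the p-value is 1, which corresponds to the "not significantly different" outcome reported for the overall comparison.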
14
Guidelines for bioinformatics of single-cell sequencing data analysis in Alzheimer's disease: review, recommendation, implementation and application. Mol Neurodegener 2022; 17:17. [PMID: 35236372 PMCID: PMC8889402 DOI: 10.1186/s13024-022-00517-z] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 07/04/2021] [Accepted: 01/18/2022] [Indexed: 12/13/2022]
Abstract
Alzheimer's disease (AD) is the most common form of dementia, characterized by progressive cognitive impairment and neurodegeneration. Extensive clinical and genomic studies have revealed biomarkers, risk factors, pathways, and targets of AD in the past decade. However, the exact molecular basis of AD development and progression remains elusive. The emerging single-cell sequencing technology can potentially provide cell-level insights into the disease. Here we systematically review the state-of-the-art bioinformatics approaches to analyze single-cell sequencing data and their applications to AD in 14 major directions, including 1) quality control and normalization, 2) dimension reduction and feature extraction, 3) cell clustering analysis, 4) cell type inference and annotation, 5) differential expression, 6) trajectory inference, 7) copy number variation analysis, 8) integration of single-cell multi-omics, 9) epigenomic analysis, 10) gene network inference, 11) prioritization of cell subpopulations, 12) integrative analysis of human and mouse sc-RNA-seq data, 13) spatial transcriptomics, and 14) comparison of single cell AD mouse model studies and single cell human AD studies. We also address challenges in using human postmortem and mouse tissues and outline future developments in single cell sequencing data analysis. Importantly, we have implemented our recommended workflow for each major analytic direction and applied them to a large single nucleus RNA-sequencing (snRNA-seq) dataset in AD. Key analytic results are reported while the scripts and the data are shared with the research community through GitHub. In summary, this comprehensive review provides insights into various approaches to analyze single cell sequencing data and offers specific guidelines for study design and a variety of analytic directions. 
The review and the accompanied software tools will serve as a valuable resource for studying cellular and molecular mechanisms of AD, other diseases, or biological systems at the single cell level.
15
Functional characterisation of the amyotrophic lateral sclerosis risk locus GPX3/TNIP1. Genome Med 2022; 14:7. [PMID: 35042540 PMCID: PMC8767698 DOI: 10.1186/s13073-021-01006-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/06/2021] [Accepted: 11/30/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Amyotrophic lateral sclerosis (ALS) is a complex, late-onset, neurodegenerative disease with a genetic contribution to disease liability. Genome-wide association studies (GWAS) have identified ten risk loci to date, including the TNIP1/GPX3 locus on chromosome five. Given that association analysis data alone cannot determine the most plausible risk gene for this locus, we undertook a comprehensive suite of in silico, in vivo and in vitro studies to address this. METHODS The Functional Mapping and Annotation (FUMA) pipeline and five tools (conditional and joint analysis (GCTA-COJO), Stratified Linkage Disequilibrium Score Regression (S-LDSC), Polygenic Priority Scoring (PoPS), Summary-based Mendelian Randomisation (SMR-HEIDI) and transcriptome-wide association study (TWAS) analyses) were used to perform bioinformatic integration of GWAS data (Ncases = 20,806, Ncontrols = 59,804) with 'omics reference datasets including the blood (eQTLgen consortium N = 31,684) and brain (N = 2581). This was followed up by specific expression studies in ALS case-control cohorts (microarray Ntotal = 942, protein Ntotal = 300) and gene knockdown (KD) studies of human neuronal iPSC cells and zebrafish-morpholinos (MO). RESULTS SMR analyses implicated both TNIP1 and GPX3 (p < 1.15 × 10⁻⁶), but there was no simple SNP/expression relationship. Integrating multiple datasets using PoPS supported GPX3 but not TNIP1. In vivo expression analyses from blood in ALS cases identified that lower GPX3 expression correlated with a more progressed disease (ALS functional rating score, p = 5.5 × 10⁻³, adjusted R² = 0.042, Beffect = 27.4 ± 13.3 ng/ml/ALSFRS unit) with microarray and protein data suggesting lower expression with the risk allele (recessive model p = 0.06, p = 0.02 respectively). Validation in vivo indicated gpx3 KD caused significant motor deficits in zebrafish-MO (mean difference vs. control ± 95% CI: swim distance = 112 ± 28 mm, time = 1.29 ± 0.59 s, speed = 32.0 ± 2.53 mm/s, respectively, p for all < 0.0001), which were rescued with gpx3 expression, with no phenotype identified with tnip1 KD or gpx3 overexpression. CONCLUSIONS These results support GPX3 as a lead ALS risk gene in this locus, with more data needed to confirm/reject a role for TNIP1. This has implications for understanding disease mechanisms (GPX3 acts in the same pathway as SOD1, a well-established ALS-associated gene) and identifying new therapeutic approaches. Few previous examples of in-depth investigations of risk loci in ALS exist and a similar approach could be applied to investigate future expected GWAS findings.
16
|
Stem Cell-Derived β Cells: A Versatile Research Platform to Interrogate the Genetic Basis of β Cell Dysfunction. Int J Mol Sci 2022; 23:501. [PMID: 35008927 PMCID: PMC8745644 DOI: 10.3390/ijms23010501] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/02/2021] [Revised: 12/27/2021] [Accepted: 12/29/2021] [Indexed: 02/07/2023] Open
Abstract
Pancreatic β cell dysfunction is a central component of diabetes progression. During the last decades, the genetic basis of several monogenic forms of diabetes has been recognized. Genome-wide association studies (GWAS) have also facilitated the identification of common genetic variants associated with an increased risk of diabetes. These studies highlight the importance of impaired β cell function in all forms of diabetes. However, how most of these risk variants confer disease risk remains unanswered. Understanding the specific contribution of genetic variants and the precise role of their molecular effectors is the next step toward developing treatments that target β cell dysfunction in the era of personalized medicine. Protocols that allow derivation of β cells from pluripotent stem cells represent a powerful research tool that allows modeling of human development and versatile experimental designs that can be used to shed some light on diabetes pathophysiology. This article reviews different models to study the genetic basis of β cell dysfunction, focusing on the recent advances made possible by stem cell applications in the field of diabetes research.
17
|
PhosSNPs-Regulated Gene Network and Pathway Significant for Rheumatoid Arthritis. Hum Hered 2021; 86:10-20. [PMID: 34569543 DOI: 10.1159/000518608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2021] [Accepted: 07/09/2021] [Indexed: 11/19/2022] Open
Abstract
OBJECTIVES Peripheral blood mononuclear cells (PBMCs) are critical for immunity and participate in multiple human diseases, including rheumatoid arthritis (RA). PhosSNPs are nonsynonymous SNPs that influence protein phosphorylation and thus probably modulate cell signaling and gene expression. We aimed to identify phosSNP-regulated gene networks/pathways potentially significant for RA. METHODS We collected genome-wide phosSNP genotyping data and transcriptome-wide mRNA expression data from PBMCs of a Chinese sample. We discovered differentially expressed genes (DEGs) associated with RA, verified them with public datasets, and replicated RA-associated SNPs in our study sample. We performed a targeted expression quantitative trait locus (eQTL) study on significant phosSNPs and DEGs. RESULTS We identified 29 nominally significant eQTL phosSNPs and 83 target genes, and constructed comprehensive regulatory/interaction networks, highlighting the vital effects of two eQTL phosSNPs (rs371513 and rs4824675, FDR <0.05) and four critical node genes (HSPA4, NDUFA2, MRPL15, and ATP5O). In addition, two node/key genes, NDUFA2 and ATP5O, both regulated by rs371513, were significantly enriched in the mitochondrial oxidative phosphorylation pathway. Furthermore, four pairs of eQTL effects were replicated independently in whole blood and/or transformed fibroblasts. CONCLUSIONS The findings delineated a potential role of protein phosphorylation and genetic variations in RA and supported the significant roles of phosSNPs in regulating RA-associated gene expression in PBMCs. The results pointed out the relevance and significance of the oxidative phosphorylation pathway to RA.
18
|
High nocturnal sleep fragmentation is associated with low T lymphocyte P2Y11 protein levels in narcolepsy type 1. Sleep 2021; 44:zsab062. [PMID: 33710305 PMCID: PMC8361345 DOI: 10.1093/sleep/zsab062] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/10/2020] [Revised: 01/31/2021] [Indexed: 11/12/2022] Open
Abstract
STUDY OBJECTIVES Narcolepsy type 1 (NT1) is associated with hypocretin neuron loss. However, there are still unexplained phenotypic NT1 features. We investigated the associations between clinical and sleep phenotypic characteristics, the NT1-associated P2RY11 polymorphism rs2305795, and P2Y11 protein levels in T lymphocytes in patients with NT1, their first-degree relatives and unrelated controls. METHODS The P2RY11 SNP was genotyped in 100 patients (90/100 H1N1-(Pandemrix)-vaccinated), 119 related and 123 non-related controls. CD4 and CD8 T lymphocyte P2Y11 protein levels were quantified using flow cytometry in 167 patients and relatives. Symptoms and sleep recording parameters were also collected. RESULTS We found an association between NT1 and the rs2305795 A allele (OR = 2, 95% CI (1.3, 3.0), p = 0.001). T lymphocyte P2Y11 protein levels were significantly lower in patients and relatives homozygous for the rs2305795 risk A allele (CD4: p = 0.012; CD8: p = 0.007). The nocturnal sleep fragmentation index was significantly negatively correlated with patients' P2Y11 protein levels (CD4: p = 0.004; CD8: p = 0.006). Mean MSLT sleep latency, REM-sleep latency, and core clinical symptoms were not associated with P2Y11 protein levels. CONCLUSIONS We confirmed that the P2RY11 polymorphism rs2305795 is associated with NT1 also in a mainly H1N1-(Pandemrix)-vaccinated cohort. We demonstrated that homozygosity for the A risk allele is associated with lower P2Y11 protein levels. A high level of nocturnal sleep fragmentation was associated with low P2Y11 levels in patients. This suggests that P2Y11 has a previously unknown function in sleep-wake stabilization that affects the severity of NT1.
19
|
Long Non-Coding RNAs Involved in Progression of Non-Alcoholic Fatty Liver Disease to Steatohepatitis. Cells 2021; 10:cells10081883. [PMID: 34440652 PMCID: PMC8394311 DOI: 10.3390/cells10081883] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/12/2021] [Revised: 07/22/2021] [Accepted: 07/23/2021] [Indexed: 12/15/2022] Open
Abstract
Non-alcoholic fatty liver disease (NAFLD) is the most prevalent chronic liver disease and is characterized by different stages varying from benign fat accumulation to non-alcoholic steatohepatitis (NASH) that may progress to cirrhosis and liver cancer. In recent years, a regulatory role of long non-coding RNAs (lncRNAs) in NAFLD has emerged. Therefore, we aimed to characterize the still poorly understood lncRNA contribution to disease progression. Transcriptome analysis in 60 human liver samples with various degrees of NAFLD/NASH was combined with a functional genomics experiment in an in vitro model where we exposed HepG2 cells to free fatty acids (FFA) to induce steatosis, then stimulated them with tumor necrosis factor alpha (TNFα) to mimic inflammation. Bioinformatics analyses provided a functional prediction of novel lncRNAs. We further functionally characterized the involvement of one novel lncRNA in the nuclear-factor-kappa B (NF-κB) signaling pathway by its silencing in Hepatoma G2 (HepG2) cells. We identified 730 protein-coding genes and 18 lncRNAs that responded to FFA/TNFα and associated with human NASH phenotypes with consistent effect direction, with most being linked to inflammation. One novel intergenic lncRNA, designated lncTNF, was 20-fold up-regulated upon TNFα stimulation in HepG2 cells and positively correlated with lobular inflammation in human liver samples. Silencing lncTNF in HepG2 cells reduced NF-κB activity and suppressed expression of the NF-κB target genes A20 and NFKBIA. The lncTNF we identified in the NF-κB signaling pathway may represent a novel target for controlling liver inflammation.
20
|
SNPs at 3'UTR of APOL1 and miR-6741-3p target sites associated with kidney diseases more susceptible to SARS-COV-2 infection: in silco and in vitro studies. Mamm Genome 2021; 32:389-400. [PMID: 34089082 PMCID: PMC8177038 DOI: 10.1007/s00335-021-09880-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/17/2020] [Accepted: 05/24/2021] [Indexed: 01/04/2023]
Abstract
Acute kidney injury (AKI) is a common manifestation of COVID-19, and several cases have been reported in the setting of the high-risk APOL1 genotype (common genetic variants). This raises the likelihood that African American people with the high-risk APOL1 genotype are at increased risk for kidney disease in the COVID-19 environment. Single-nucleotide polymorphisms (SNPs) found in microRNAs (miRNAs) and in their target genes can change miRNA activity and thereby lead to different diseases. Evidence has shown that SNPs increase or decrease the effectiveness of the interaction between miRNAs and disease-related target genes. The aim of this study is not only to identify miRSNPs in the APOL1 gene and SNPs in miRNA genes targeting its 3′UTR but also to evaluate the effect of these gene variations in kidney patients and their association with SARS-CoV-2 infection. In the 3′UTR of the APOL1 gene, we detected 96 miRNA binding sites and 35 different SNPs within those binding sites using 10 different online software tools (in silico). We also studied gene expression in patient and control samples using qRT-PCR (in vitro). In the in silico study, the binding site of miR-6741-3p on the APOL1 3′UTR has two SNPs (rs1288875001, G > C; rs1452517383, A > C), and its genomic sequence is the same nucleotide as rs1288875001. Similarly, two other SNPs (rs1142591, T > A; rs376326225, G > A) were identified in the binding sites of miR-6741-3p at the first position. Here, the miRSNP (rs1288875001) in the APOL1 3′UTR and the SNP (rs376326225) in the miR-6741-3p genomic sequence are cross-matched in the same binding region. In the in vitro study, relative expression levels were calculated by the 2^(−ΔΔCt) method and compared with the Mann–Whitney U test. APOL1 expression differed in chronic kidney disease patients with COVID-19 and was lower in these patients than in healthy controls (p < 0.05). In addition, evaluation of the miR-6741-3p binding sites with online software indicated that miR-6741-3p targets many APOL1-related genes (TLR7, SLC6A19, IL-6, IL-10, IL-18, chemokine (C–C motif) ligand 5, SWT1, NFYB, BRF1, HES2, MED12L, MAFG, GTF2H5, TRAF3, angiotensin II receptor-associated protein, PRSS23). miR-6741-3p has not previously been associated with kidney diseases or SARS-CoV-2 infection. These results indicate that APOL1 can have a significant effect in kidney-associated diseases through different pathways. Thus, through in silico and in vitro exploration, this study demonstrates an association between miR-6741-3p and kidney diseases susceptible to SARS-CoV-2 infection, i.e., collapsing glomerulopathy, chronic kidney disease (CKD), acute kidney injury (AKI), and tubulointerstitial lesions; further study is recommended for better insight.
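The 2^(−ΔΔCt) calculation named in this abstract can be illustrated with a minimal Python sketch. The function name, argument names, and Ct values are hypothetical and are not taken from the study. The sketch assumes one target gene, one reference (housekeeping) gene, and roughly 100% amplification efficiency.

```python
def relative_expression(ct_target_case, ct_ref_case, ct_target_control, ct_ref_control):
    """Fold change of a target gene in a case sample relative to a control sample."""
    # Normalize the target gene to the reference gene within each sample
    delta_ct_case = ct_target_case - ct_ref_case
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized case value with the normalized control value
    delta_delta_ct = delta_ct_case - delta_ct_control
    # Each PCR cycle is assumed to double the product, hence base 2
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles later in the case sample
fold_change = relative_expression(26.0, 18.0, 24.0, 18.0)
print(fold_change)  # 0.25
```

A fold change below 1 corresponds to lower expression in the case sample than in the control, which is the direction the abstract reports for APOL1.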
21
|
Longitudinal data reveal strong genetic and weak non-genetic components of ethnicity-dependent blood DNA methylation levels. Epigenetics 2021; 16:662-676. [PMID: 32997571 PMCID: PMC8143220 DOI: 10.1080/15592294.2020.1817290] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/02/2020] [Revised: 07/06/2020] [Accepted: 07/24/2020] [Indexed: 11/18/2022] Open
Abstract
Epigenetic architecture is influenced by genetic and environmental factors, but little is known about their relative contributions or longitudinal dynamics. Here, we studied DNA methylation (DNAm) at over 750,000 CpG sites in mononuclear blood cells collected at birth and age 7 from 196 children of primarily self-reported Black and Hispanic ethnicities to study race-associated DNAm patterns. We developed a novel Bayesian method for high-dimensional longitudinal data and showed that race-associated DNAm patterns at birth and age 7 are nearly identical. Additionally, we estimated that up to 51% of all self-reported race-associated CpGs had race-dependent DNAm levels that were mediated through local genotype and, quite surprisingly, found that genetic factors explained an overwhelming majority of the variation in DNAm levels at other, previously identified, environmentally-associated CpGs. These results indicate that race-associated blood DNAm patterns in particular, and blood DNAm levels in general, are primarily driven by genetic factors, and are not as sensitive to environmental exposures as previously suggested, at least during the first 7 years of life.
22
|
Advances in Genomic Discovery and Implications for Personalized Prevention and Medicine: Estonia as Example. J Pers Med 2021; 11:jpm11050358. [PMID: 33946982 PMCID: PMC8145318 DOI: 10.3390/jpm11050358] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/17/2021] [Revised: 04/19/2021] [Accepted: 04/25/2021] [Indexed: 02/07/2023] Open
Abstract
The current paradigm of personalized medicine envisages the use of genomic data to provide predictive information on the health course of an individual with the aim of prevention and individualized care. However, substantial efforts are required to realize the concept: enhanced genetic discoveries, translation into intervention strategies, and a systematic implementation in healthcare. Here we review how further genetic discoveries are improving personalized prediction and advance functional insights into the link between genetics and disease. In the second part we give our perspective on the way these advances in genomic research will transform the future of personalized prevention and medicine using Estonia as a primer.
23
|
Single cell eQTL analysis identifies cell type-specific genetic control of gene expression in fibroblasts and reprogrammed induced pluripotent stem cells. Genome Biol 2021; 22:76. [PMID: 33673841 PMCID: PMC7934233 DOI: 10.1186/s13059-021-02293-3] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 06/08/2020] [Accepted: 02/10/2021] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND The discovery that somatic cells can be reprogrammed to induced pluripotent stem cells (iPSCs) has provided a foundation for in vitro human disease modelling, drug development and population genetics studies. Gene expression plays a critical role in complex disease risk and therapeutic response. However, while the genetic background of reprogrammed cell lines has been shown to strongly influence gene expression, the effect has not been evaluated at the level of individual cells, which would provide significant resolution. By integrating single cell RNA-sequencing (scRNA-seq) and population genetics, we apply a framework in which to evaluate cell type-specific effects of genetic variation on gene expression. RESULTS Here, we perform scRNA-seq on 64,018 fibroblasts from 79 donors and map expression quantitative trait loci (eQTLs) at the level of individual cell types. We demonstrate that the majority of eQTLs detected in fibroblasts are specific to an individual cell subtype. To address whether the allelic effects on gene expression are maintained following cell reprogramming, we generate scRNA-seq data in 19,967 iPSCs from 31 reprogrammed donor lines. We again identify highly cell type-specific eQTLs in iPSCs and show that the eQTLs in fibroblasts almost entirely disappear during reprogramming. CONCLUSIONS This work provides an atlas of how genetic variation influences gene expression across cell subtypes and provides evidence for patterns of genetic architecture that lead to cell type-specific eQTL effects.
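As a rough illustration of what mapping an eQTL within one cell type involves, the sketch below regresses a per-donor expression summary on genotype dosage. It is a generic single-variant linear model, not the framework used in the study. The function name, the toy data, and the assumption that expression has already been averaged per donor within a cell type are all hypothetical.

```python
import math
import numpy as np


def cis_eqtl_fit(dosage, expression):
    """Regress expression on allele dosage (0, 1, 2); return slope and its standard error."""
    x = np.asarray(dosage, dtype=float)
    y = np.asarray(expression, dtype=float)
    x_c = x - x.mean()
    y_c = y - y.mean()
    sxx = float(x_c @ x_c)
    slope = float(x_c @ y_c) / sxx          # ordinary least-squares slope
    resid = y_c - slope * x_c               # residuals around the fitted line
    dof = len(x) - 2                        # intercept and slope are estimated
    se = math.sqrt(float(resid @ resid) / dof / sxx)
    return slope, se


# Hypothetical data: one gene, one cell type, six donors
dosage = [0, 1, 2, 0, 1, 2]
expression = [1.0, 1.6, 2.0, 1.1, 1.4, 2.1]
slope, se = cis_eqtl_fit(dosage, expression)
print(f"effect per allele = {slope:.3f}, SE = {se:.3f}")
```

Repeating such a fit separately for each cell type and comparing the slopes is one simple way to see whether an effect is shared or cell type-specific. A real analysis would also adjust for covariates and correct for multiple testing.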
24
|
Abstract
Most disease-associated variants, although located in putatively regulatory regions, do not have detectable effects on gene expression. One explanation could be that we have not examined gene expression in the cell types or conditions that are most relevant for disease. Even large-scale efforts to study gene expression across tissues are limited to human samples obtained opportunistically or postmortem, mostly from adults. In this review we evaluate recent findings and suggest an alternative strategy, drawing on the dynamic and highly context-specific nature of gene regulation. We discuss new technologies that can extend the standard regulatory mapping framework to more diverse, disease-relevant cell types and states.
25
|
Tractor uses local ancestry to enable the inclusion of admixed individuals in GWAS and to boost power. Nat Genet 2021; 53:195-204. [PMID: 33462486 PMCID: PMC7867648 DOI: 10.1038/s41588-020-00766-y] [Citation(s) in RCA: 96] [Impact Index Per Article: 32.0] [Received: 05/15/2020] [Accepted: 12/15/2020] [Indexed: 12/26/2022]
Abstract
Admixed populations are routinely excluded from genomic studies due to concerns over population structure. Here, we present a statistical framework and software package, Tractor, to facilitate the inclusion of admixed individuals in association studies by leveraging local ancestry. We test Tractor with simulated and empirical two-way admixed African-European cohorts. Tractor generates accurate ancestry-specific effect-size estimates and P values, can boost genome-wide association study (GWAS) power and improves the resolution of association signals. Using a local ancestry-aware regression model, we replicate known hits for blood lipids, discover novel hits missed by standard GWAS and localize signals closer to putative causal variants.
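The idea of a local ancestry-aware regression can be sketched for a two-way admixed cohort and a quantitative trait. The sketch is a simplified ordinary least-squares illustration and is not the Tractor software or its exact model. The function name, the variable coding (ancestry "A" and "B"), and the toy data are assumptions made for the example.

```python
import numpy as np


def local_ancestry_regression(n_anc_a, alt_on_a, alt_on_b, phenotype):
    """Least-squares fit of a phenotype on local ancestry and ancestry-specific allele counts."""
    y = np.asarray(phenotype, dtype=float)
    design = np.column_stack([
        np.ones(len(y)),                    # intercept
        np.asarray(n_anc_a, dtype=float),   # haplotypes (0-2) of ancestry A at the locus
        np.asarray(alt_on_a, dtype=float),  # alternate alleles carried on ancestry-A haplotypes
        np.asarray(alt_on_b, dtype=float),  # alternate alleles carried on ancestry-B haplotypes
    ])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    names = ["intercept", "local_ancestry", "effect_a", "effect_b"]
    return dict(zip(names, coef))


# Hypothetical individuals; the phenotype is built with known effects so the fit can be checked
n_anc_a = [0, 0, 0, 1, 1, 1, 2, 2]
alt_on_a = [0, 0, 0, 0, 1, 1, 0, 2]
alt_on_b = [0, 1, 2, 0, 0, 1, 0, 0]
phenotype = [1.0 + 0.2 * h + 0.5 * a + 0.1 * b
             for h, a, b in zip(n_anc_a, alt_on_a, alt_on_b)]

fit = local_ancestry_regression(n_anc_a, alt_on_a, alt_on_b, phenotype)
print(fit)
```

Splitting the allele count by the ancestry of the haplotype it sits on is what lets the model return a separate effect size per ancestry instead of a single pooled estimate.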
26
|
Polymorphisms in mitochondrial ribosomal protein S5 (MRPS5) are associated with leprosy risk in Chinese. PLoS Negl Trop Dis 2020; 14:e0008883. [PMID: 33362202 PMCID: PMC7757804 DOI: 10.1371/journal.pntd.0008883] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/05/2020] [Accepted: 10/13/2020] [Indexed: 01/15/2023] Open
Abstract
Leprosy is an infectious disease caused by Mycobacterium leprae (M. leprae), with about 210,000 new cases per year worldwide. Although numerous risk loci have been uncovered by genome-wide association studies, the effects of common genetic variants are relatively modest. To identify possible new genetic loci involved in susceptibility to leprosy, whole exome sequencing was performed for 28 subjects, including 14 patients and 12 unaffected members from 8 leprosy-affected families as well as another case and an unrelated control, and follow-up SNP genotyping of the candidate variants was then carried out in case-control sample sets. A rare missense variant in mitochondrial ribosomal protein S5 (MRPS5), rs200730619 (c.95108402T>C [p.Tyr137Cys]), was identified and validated in 369 cases and 270 controls of Chinese descent (Padjusted = 0.006, odds ratio [OR] = 2.74) as a contributing factor to leprosy risk. Moreover, the mRNA level of MRPS5 was downregulated in M. leprae sonicate-stimulated peripheral blood mononuclear cells. Our results indicated that MRPS5 may be involved in leprosy pathogenesis. Further studies are needed to determine whether defective MRPS5 could impair the energy metabolism of host immune cells, which could in turn cause a defect in clearing M. leprae and increase susceptibility to infection.
27
|
Celiac disease susceptibility: The genome and beyond. INTERNATIONAL REVIEW OF CELL AND MOLECULAR BIOLOGY 2020; 358:1-45. [PMID: 33707051 DOI: 10.1016/bs.ircmb.2020.10.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 02/07/2023]
Abstract
Celiac Disease (CeD) is an immune-mediated complex disease that is triggered by the ingestion of gluten and develops in genetically susceptible individuals. It has been known for a long time that the Human Leucocyte Antigen (HLA) molecules DQ2 and DQ8 are necessary, although not sufficient, for the disease development, and therefore other susceptibility genes and (epi)genetic events must participate in CeD pathogenesis. The advances in Genomics during the last 15 years have made CeD one of the immune-related disorders with the best-characterized genetic component. In the present work, we will first review the main Genome-Wide Association Studies (GWAS) carried out in the disorder, and emphasize post-GWAS discoveries, including diverse integrative strategies, SNP prioritization approaches, and insights into the Microbiome through the host Genomics. Second, we will explore CeD-related Epigenetics and Epigenomics, mostly focusing on the emerging knowledge of the celiac methylome, and the vast but yet under-explored non-coding RNA (ncRNA) landscape. We conclude that much has been done in the field although there are still completely unvisited areas in the post-Genomics of CeD. Chromatin conformation and accessibility, and Epitranscriptomics are promising domains that need to be unveiled to complete the big picture of the celiac Genome.
28
|
Sex differences in human adipose tissue gene expression and genetic regulation involve adipogenesis. Genome Res 2020; 30:1379-1392. [PMID: 32967914 PMCID: PMC7605264 DOI: 10.1101/gr.264614.120] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 04/10/2020] [Accepted: 08/27/2020] [Indexed: 02/06/2023]
Abstract
Sex differences in adipose tissue distribution and function are associated with sex differences in cardiometabolic disease. While many studies have revealed sex differences in adipocyte cell signaling and physiology, there is a relative dearth of information regarding sex differences in transcript abundance and regulation. We investigated sex differences in subcutaneous adipose tissue transcriptional regulation using omic-scale data from ∼3000 geographically and ethnically diverse human samples. We identified 162 genes with robust sex differences in expression. Differentially expressed genes were implicated in oxidative phosphorylation and adipogenesis. We further determined that sex differences in gene expression levels could be related to sex differences in the genetics of gene expression regulation. Our analyses revealed sex-specific genetic associations, and this finding was replicated in a study of 98 inbred mouse strains. The genes under genetic regulation in human and mouse were enriched for oxidative phosphorylation and adipogenesis. Enrichment analysis showed that the associated genetic loci resided within binding motifs for adipogenic transcription factors (e.g., PPARG and EGR1). We demonstrated that sex differences in gene expression could be influenced by sex differences in genetic regulation for six genes (e.g., FADS1 and MAP1B). These genes exhibited dynamic expression patterns during adipogenesis and robust expression in mature human adipocytes. Our results support a role for adipogenesis-related genes in subcutaneous adipose tissue sex differences in the genetic and environmental regulation of gene expression.
29
|
SNP rs17079281 decreases lung cancer risk through creating an YY1-binding site to suppress DCBLD1 expression. Oncogene 2020; 39:4092-4102. [PMID: 32231272 PMCID: PMC7220863 DOI: 10.1038/s41388-020-1278-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 09/27/2019] [Revised: 03/13/2020] [Accepted: 03/17/2020] [Indexed: 12/24/2022]
Abstract
Genome-wide association studies (GWAS) have identified numerous genetic variants that are associated with lung cancer risk, but the biological mechanisms underlying these associations remain largely unknown. Here we investigated the functional relevance of a genetic region in 6q22.2 that was associated with lung cancer risk in our previous GWAS. We performed linkage disequilibrium (LD) analysis and bioinformatic prediction to screen functional SNPs linked to a tagSNP in the 6q22.2 locus, followed by two case-control studies and a meta-analysis with 4403 cases and 5336 controls to determine whether these functional SNPs were associated with lung cancer risk. A novel SNP, rs17079281, in the DCBLD1 promoter was found to be associated with lung cancer risk in Chinese populations. Compared with those with the C allele, patients with the T allele had lower risk of adenocarcinoma (adjusted OR = 0.86; 95% CI: 0.80–0.92), but not squamous cell carcinoma (adjusted OR = 0.99; 95% CI: 0.91–1.10), and patients with the C/T or T/T genotype had lower levels of DCBLD1 expression than those with the C/C genotype in lung adenocarcinoma tissues. We performed functional assays to characterize its biological relevance. The results showed that the T allele of rs17079281 had higher binding affinity to transcription factor YY1 than the C allele, which suppressed DCBLD1 expression. DCBLD1 behaved like an oncogene, promoting tumor growth by influencing cell cycle progression. These findings suggest that the functional variant rs17079281C>T decreased lung adenocarcinoma risk by creating a YY1-binding site to suppress DCBLD1 expression, which may serve as a biomarker for assessing lung cancer susceptibility.
30
|
Tissue-specific sex differences in human gene expression. Hum Mol Genet 2020; 28:2976-2986. [PMID: 31044242 DOI: 10.1093/hmg/ddz090] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 12/11/2018] [Revised: 04/12/2019] [Accepted: 04/24/2019] [Indexed: 02/07/2023] Open
Abstract
Despite extensive sex differences in human complex traits and disease, the male and female genomes differ only in the sex chromosomes. This implies that most sex-differentiated traits are the result of differences in the expression of genes that are common to both sexes. While sex differences in gene expression have been observed in a range of different tissues, the biological mechanisms for tissue-specific sex differences (TSSDs) in gene expression are not well understood. A total of 30 640 autosomal and 1021 X-linked transcripts were tested for heterogeneity in sex difference effect sizes in n = 617 individuals across 40 tissue types in Genotype-Tissue Expression (GTEx). This identified 65 autosomal and 66 X-linked TSSD transcripts (corresponding to unique genes) at a stringent significance threshold. Results for X-linked TSSD transcripts showed mainly concordant direction of sex differences across tissues and replicate previous findings. Autosomal TSSD transcripts had mainly discordant direction of sex differences across tissues. The top cis-expression quantitative trait loci (eQTLs) across tissues for autosomal TSSD transcripts are located a similar distance away from the nearest androgen and estrogen binding motifs and the nearest enhancer, as compared to cis-eQTLs for transcripts with stable sex differences in gene expression across tissue types. Enhancer regions that overlap top cis-eQTLs for TSSD transcripts, however, were found to be more dispersed across tissues. These observations suggest that androgen and estrogen regulatory elements in a cis region may play a common role in sex differences in gene expression, but TSSD in gene expression may additionally be due to causal variants located in tissue-specific enhancer regions.
31
|
Abstract
In recent years, functional genomics approaches combining genetic information with bulk RNA-sequencing data have identified the downstream expression effects of disease-associated genetic risk factors through so-called expression quantitative trait locus (eQTL) analysis. Single-cell RNA-sequencing creates enormous opportunities for mapping eQTLs across different cell types and in dynamic processes, many of which are obscured when using bulk methods. Rapid increase in throughput and reduction in cost per cell now allow this technology to be applied to large-scale population genetics studies. To fully leverage these emerging data resources, we have founded the single-cell eQTLGen consortium (sc-eQTLGen), aimed at pinpointing the cellular contexts in which disease-causing genetic variants affect gene expression. Here, we outline the goals, approach and potential utility of the sc-eQTLGen consortium. We also provide a set of study design considerations for future single-cell eQTL studies.
|
32
|
Model-based clustering of multi-tissue gene expression data. Bioinformatics 2020; 36:1807-1813. [PMID: 31688915 PMCID: PMC7162352 DOI: 10.1093/bioinformatics/btz805] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/17/2018] [Revised: 09/05/2019] [Accepted: 10/31/2019] [Indexed: 02/06/2023] Open
Abstract
MOTIVATION Recently, it has become feasible to generate large-scale, multi-tissue gene expression data, where expression profiles are obtained from multiple tissues or organs sampled from dozens to hundreds of individuals. When traditional clustering methods are applied to this type of data, important information is lost, because they either require all tissues to be analyzed independently, ignoring dependencies and similarities between tissues, or to merge tissues in a single, monolithic dataset, ignoring individual characteristics of tissues. RESULTS We developed a Bayesian model-based multi-tissue clustering algorithm, revamp, which can incorporate prior information on physiological tissue similarity, and which results in a set of clusters, each consisting of a core set of genes conserved across tissues as well as differential sets of genes specific to one or more subsets of tissues. Using data from seven vascular and metabolic tissues from over 100 individuals in the STockholm Atherosclerosis Gene Expression (STAGE) study, we demonstrate that multi-tissue clusters inferred by revamp are more enriched for tissue-dependent protein-protein interactions compared to alternative approaches. We further demonstrate that revamp results in easily interpretable multi-tissue gene expression associations to key coronary artery disease processes and clinical phenotypes in the STAGE individuals. AVAILABILITY AND IMPLEMENTATION Revamp is implemented in the Lemon-Tree software, available at https://github.com/eb00/lemon-tree. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
|
33
|
Integrative QTL analysis of gene expression and chromatin accessibility identifies multi-tissue patterns of genetic regulation. PLoS Genet 2020; 16:e1008537. [PMID: 31961859 PMCID: PMC7010298 DOI: 10.1371/journal.pgen.1008537] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 05/28/2019] [Revised: 02/10/2020] [Accepted: 11/23/2019] [Indexed: 01/08/2023] Open
Abstract
Gene transcription profiles across tissues are largely defined by the activity of regulatory elements, most of which correspond to regions of accessible chromatin. Regulatory element activity is in turn modulated by genetic variation, resulting in variable transcription rates across individuals. The interplay of these factors, however, is poorly understood. Here we characterize expression and chromatin state dynamics across three tissues (liver, lung, and kidney) in 47 strains of the Collaborative Cross (CC) mouse population, examining the regulation of these dynamics by expression quantitative trait loci (eQTL) and chromatin QTL (cQTL). QTL whose allelic effects were consistent across tissues were detected for 1,101 genes and 133 chromatin regions. Also detected were eQTL and cQTL whose allelic effects differed across tissues, including local-eQTL for Pik3c2g detected in all three tissues but with distinct allelic effects. Leveraging overlapping measurements of gene expression and chromatin accessibility on the same mice from multiple tissues, we used mediation analysis to identify chromatin and gene expression intermediates of eQTL effects. Based on QTL and mediation analyses over multiple tissues, we propose a causal model for the distal genetic regulation of Akr1e1, a gene involved in glycogen metabolism, through the zinc finger transcription factor Zfp985 and chromatin intermediates. This analysis demonstrates the complexity of transcriptional and chromatin dynamics and their regulation over multiple tissues, as well as the value of the CC and related genetic resource populations for identifying specific regulatory mechanisms within cells and tissues.
|
34
|
Biological characterization of expression quantitative trait loci (eQTLs) showing tissue-specific opposite directional effects. Eur J Hum Genet 2019; 27:1745-1756. [PMID: 31296926 PMCID: PMC6871526 DOI: 10.1038/s41431-019-0468-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 11/07/2018] [Revised: 05/29/2019] [Accepted: 06/04/2019] [Indexed: 12/22/2022] Open
Abstract
Interpreting the susceptible loci documented by genome-wide association studies (GWASs) is of utmost importance in the post-GWAS era. Since most complex traits are contributed by multiple tissues, analyzing tissue-specific effects of expression quantitative trait loci (eQTLs) is a promising approach. Here we describe “opposite eQTL effects”, i.e., gene expression effects of eQTLs that are in the opposite direction between different tissues, as the biologically meaningful annotations of genes and genetic variants for understanding the GWAS loci. The genes and single-nucleotide polymorphisms (SNPs) associated with the opposite eQTL effects (opp-multi-eQTL-Genes and opp-multi-eQTL-SNPs) were extracted from the largest eQTL database provided by the Genotype-Tissue Expression (GTEx) project (release version 7). The opposite eQTL effects were detected even between closely related tissues such as cerebellum and brain cortex, and a significant proportion of the genes having eQTLs were annotated as the opp-multi-eQTL-Genes (2,323 out of 31,212; 7.4%). The opp-multi-eQTL-SNPs showed locational enrichment at the transcription start site and also possible involvement of epigenetic regulation. The biological importance of the opposite eQTL effects was also assessed using the SNPs reported in GWASs (GWAS-SNPs), which demonstrated that a high proportion of the opp-multi-eQTL-SNPs are in linkage disequilibrium with the GWAS-SNPs (2,498 out of 9,290; 26.9%). Based on the results, the opposite eQTL effects can be a common phenomenon in the tissue-specific gene regulation with a possible contribution to the development of complex traits.
|
35
|
Power, false discovery rate and Winner's Curse in eQTL studies. Nucleic Acids Res 2019; 46:e133. [PMID: 30189032 PMCID: PMC6294523 DOI: 10.1093/nar/gky780] [Citation(s) in RCA: 62] [Impact Index Per Article: 12.4] [Received: 05/31/2018] [Accepted: 08/17/2018] [Indexed: 12/16/2022] Open
Abstract
Investigation of the genetic architecture of gene expression traits has aided interpretation of disease and trait-associated genetic variants; however, key aspects of expression quantitative trait loci (eQTL) study design and analysis remain understudied. We used extensive, empirically driven simulations to explore eQTL study design and the performance of various analysis strategies. Across multiple testing correction methods, false discoveries of genes with eQTLs (eGenes) were substantially inflated when false discovery rate (FDR) control was applied to all tests and only appropriately controlled using hierarchical procedures. All multiple testing correction procedures had low power and inflated FDR for eGenes whose causal SNPs had small allele frequencies using small sample sizes (e.g. frequency <10% in 100 samples), indicating that even moderately low frequency eQTL SNPs (eSNPs) in these studies are enriched for false discoveries. In scenarios with ≥80% power, the top eSNP was the true simulated eSNP 90% of the time, but substantially less frequently for very common eSNPs (minor allele frequencies >25%). Overestimation of eQTL effect sizes, so-called ‘Winner’s Curse’, was common in low and moderate power settings. To address this, we developed a bootstrap method (BootstrapQTL) that led to more accurate effect size estimation. These insights provide a foundation for future eQTL studies, especially those with sampling constraints and subtly different conditions.
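The Winner's Curse described in this abstract can be illustrated with a small simulation. The sketch below is not BootstrapQTL and does not reproduce the authors' simulations; it only assumes a single true effect size, normally distributed estimation error, and a two-sided z-test, with all parameter values chosen for illustration.

```python
import random
import statistics

def simulate_winners_curse(true_beta=0.1, se=0.1, n_studies=20000,
                           z_crit=1.96, seed=0):
    """Simulate repeated estimates of one effect and compare the mean of
    all estimates with the mean of only the 'significant' ones.

    All parameter values are illustrative assumptions, not values taken
    from the cited study.
    """
    rng = random.Random(seed)
    # Each simulated study yields an unbiased but noisy estimate.
    estimates = [rng.gauss(true_beta, se) for _ in range(n_studies)]
    # Keep only estimates that pass a two-sided z-test.
    significant = [b for b in estimates if abs(b / se) > z_crit]
    return {
        "power": len(significant) / n_studies,
        "mean_all": statistics.fmean(estimates),
        "mean_significant": statistics.fmean(significant),
    }

result = simulate_winners_curse()
print(result)
```

In this low-power setting the mean over all estimates stays close to the true effect, while the mean over the significant subset is inflated, which is the selection effect the abstract refers to.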
|
36
|
Genome-Wide Variants Shared Between Smoking Quantity and Schizophrenia on 15q25 Are Associated With CHRNA5 Expression in the Brain. Schizophr Bull 2019; 45:813-823. [PMID: 30202994 PMCID: PMC6581148 DOI: 10.1093/schbul/sby093] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 12/15/2022]
Abstract
Cigarette smokers with schizophrenia consume more cigarettes than smokers in the general population. Schizophrenia and smoking quantity may have shared genetic liability. Genome-wide association studies (GWASs) of schizophrenia and smoking quantity have highlighted a biological pleiotropy in which a robust 15q25 locus affects both traits. To identify the genetic variants shared between these traits on 15q25, we used summary statistics from large-scale GWAS meta-analyses of schizophrenia in the Psychiatric Genomics Consortium 2 and smoking quantity assessed by cigarettes smoked per day in the Tobacco and Genetics Consortium. To evaluate the regulatory potential of the shared genetic variants, expression quantitative trait loci analysis in 10 postmortem brain regions was performed using the BRAINEAC dataset in 134 neuropathologically normal individuals. Twenty-two genetic variants on 15q25 were associated with both smoking quantity and schizophrenia at the genome-wide significance level (P < 5.00 × 10⁻⁸). Major alleles of all variants were associated with higher smoking quantity and risk of schizophrenia. These genetic variants were associated with PSMA4, CHRNA3, and CHRNB4 expression in specific brain regions (lowest P = 4.81 × 10⁻⁴) and with CHRNA5 expression in multiple brain regions (lowest P = 8.70 × 10⁻⁶). Risk-associated major alleles of these variants were commonly associated with higher expression in several brain regions, excluding the medulla, at the transcript level. In addition, the risk-associated major allele at rs637137 was associated with higher CHRNA5 expression at the specific exon level in multiple brain regions (lowest P = 2.37 × 10⁻⁵). Our findings suggest that genome-wide variants shared between smoking quantity and schizophrenia contribute to a common pathophysiology underlying these traits involving altered CHRNA5 expression in the brain.
|
37
|
An empirical Bayes approach for multiple tissue eQTL analysis. Biostatistics 2019; 19:391-406. [PMID: 29029013 DOI: 10.1093/biostatistics/kxx048] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 12/28/2016] [Accepted: 08/23/2017] [Indexed: 12/18/2022] Open
Abstract
Expression quantitative trait locus (eQTL) analyses identify genetic markers associated with the expression of a gene. Most up-to-date eQTL studies consider the connection between genetic variation and expression in a single tissue. Multi-tissue analyses have the potential to improve findings in a single tissue, and elucidate the genotypic basis of differences between tissues. In this article, we develop a hierarchical Bayesian model (MT-eQTL) for multi-tissue eQTL analysis. MT-eQTL explicitly captures patterns of variation in the presence or absence of eQTL, as well as the heterogeneity of effect sizes across tissues. We devise an efficient Expectation-Maximization (EM) algorithm for model fitting. Inferences concerning eQTL detection and the configuration of eQTL across tissues are derived from the adaptive thresholding of local false discovery rates, and maximum a posteriori estimation, respectively. We also provide theoretical justification of the adaptive procedure. We investigate the MT-eQTL model through an extensive analysis of a 9-tissue data set from the GTEx initiative.
|
38
|
Kernel size-related genes revealed by an integrated eQTL analysis during early maize kernel development. Plant J 2019; 98:19-32. [PMID: 30548709 PMCID: PMC6850110 DOI: 10.1111/tpj.14193] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 07/06/2018] [Revised: 10/05/2018] [Accepted: 11/16/2018] [Indexed: 05/21/2023]
Abstract
In maize, kernel traits strongly impact overall grain yields, and it is known that sophisticated spatiotemporal programs of gene expression coordinate kernel development, so advancing our knowledge of kernel development can help efforts to improve grain yields. Here, using phenotype, genotype and transcriptomics data of maize kernels at 5 and 15 days after pollination (DAP) for a large association mapping panel, we employed multiple quantitative genetics approaches, namely genome-wide association studies (GWAS) as well as expression quantitative trait loci (eQTL) and quantitative trait transcript (QTT) analyses, to gain insights into the molecular genetic basis of kernel development in maize. This resulted in the identification of 137 putative kernel length-related genes at 5 DAP, of which 43 are located in previously reported QTL regions. Strikingly, we identified an eQTL that overlaps the locus encoding a maize homolog of the recently described m6A methylation reader protein ECT2 from Arabidopsis; this putative epi eQTL is associated with 53 genes and may represent a master epi-transcriptomic regulator of kernel development. Notably, among the genes associated with this epi eQTL, 10 are for the main storage proteins in the maize endosperm (zeins) and two are known regulators of zein expression or endosperm development (Opaque2 and ZmICE1). Collectively, beyond cataloging and characterizing genomic attributes of a large number of eQTL associated with kernel development in maize, our study highlights how an eQTL approach can bolster the impact of both GWAS and QTT studies and can drive insights about the basic biology of plants.
|
39
|
Transcriptome profiling of four candidate milk genes in milk and tissue samples of temperate and tropical cattle. J Genet 2019. [DOI: 10.1007/s12041-019-1060-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
|
40
|
Establishing gene Amelogenin as sex-specific marker in yak by genomic approach. J Genet 2019; 98:7. [PMID: 30945688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
The yak is an economically important bovine species, considered the lifeline of the Himalaya. Nevertheless, this large bovine has long been neglected in terms of scientific intervention for its conservation and research documentation. Amelogenin is an essential protein for tooth enamel, and eutherian mammals carry two copies of its gene, one on the X chromosome and one on the Y chromosome. In bovines, the deletion of a nucleotide fragment in exon 6 of the Y chromosome copy makes Amelogenin an excellent sex-specific marker. Thus, an attempt was made to use the gene as an advanced molecular marker for sexing yak to improve breeding strategies and reproduction. The present study confirmed that polymerase chain reaction amplification of the Amelogenin gene with a unique primer is useful for sex identification in yak. The test was further refined with qPCR validation by quantifying the DNA copy number of the Amelogenin gene in males and females. We observed a high level of sequence polymorphism in AMELX and AMELY in yak, which we consider a novel finding. These tests can be extended to several other specialized fields, including forensics, meat production and processing, and quality control.
|
41
|
Identifying Multi-Omics Causers and Causal Pathways for Complex Traits. Front Genet 2019; 10:110. [PMID: 30847004 PMCID: PMC6393387 DOI: 10.3389/fgene.2019.00110] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 04/17/2018] [Accepted: 01/30/2019] [Indexed: 12/23/2022] Open
Abstract
The central dogma of molecular biology delineates a unidirectional causal flow, i.e., DNA → RNA → protein → trait. Genome-wide association studies, next-generation sequencing association studies, and their meta-analyses have successfully identified ~12,000 susceptibility genetic variants that are associated with a broad array of human physiological traits. However, such conventional association studies ignore the mediate causers (i.e., RNA, protein) and the unidirectional causal pathway. Such studies may not be ideally powerful; and the genetic variants identified may not necessarily be genuine causal variants. In this article, we model the central dogma by a mediate causal model and analytically prove that the more remote an omics level is from a physiological trait, the smaller the magnitude of their correlation is. Under both random and extreme sampling schemes, we numerically demonstrate that the proteome-trait correlation test is more powerful than the transcriptome-trait correlation test, which in turn is more powerful than the genotype-trait association test. In conclusion, integrating RNA and protein expressions with DNA data and causal inference are necessary to gain a full understanding of how genetic causal variants contribute to phenotype variations.
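The attenuation claim in this abstract can be seen in a simplified linear chain. The derivation below is an illustrative sketch only, assuming standardized variables (unit variance) for genotype G, RNA R, protein P and trait T, with noise terms independent of all upstream variables; the symbols a, b, c are hypothetical path coefficients and the cited article's own model may differ.

```latex
% Illustrative assumption: a linear chain G -> R -> P -> T with
% unit-variance variables and noise independent of upstream variables.
\begin{align}
R &= aG + \varepsilon_R, \qquad P = bR + \varepsilon_P, \qquad T = cP + \varepsilon_T, \\
\operatorname{corr}(P,T) &= c, \qquad
\operatorname{corr}(R,T) = bc, \qquad
\operatorname{corr}(G,T) = abc, \\
|abc| &\le |bc| \le |c| \qquad \text{since } |a| \le 1 \text{ and } |b| \le 1 .
\end{align}
```

Under these assumptions each additional step between an omics level and the trait multiplies the correlation by a factor of magnitude at most one, so the genotype-trait correlation cannot exceed the transcriptome-trait correlation, which in turn cannot exceed the proteome-trait correlation.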
|
42
|
Identification of expression quantitative trait loci associated with schizophrenia and affective disorders in normal brain tissue. PLoS Genet 2018; 14:e1007607. [PMID: 30142156 PMCID: PMC6126875 DOI: 10.1371/journal.pgen.1007607] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 08/08/2017] [Revised: 09/06/2018] [Accepted: 08/02/2018] [Indexed: 01/12/2023] Open
Abstract
Schizophrenia and the affective disorders, here comprising bipolar disorder and major depressive disorder, are psychiatric illnesses that lead to significant morbidity and mortality worldwide. Whilst understanding of their pathobiology remains limited, large case-control studies have recently identified single nucleotide polymorphisms (SNPs) associated with these disorders. However, discerning the functional effects of these SNPs has been difficult as the associated causal genes are unknown. Here we evaluated whether schizophrenia- and affective disorder-associated SNPs are correlated with gene expression within human brain tissue. Specifically, to identify expression quantitative trait loci (eQTLs), we leveraged disorder-associated SNPs identified from 11 genome-wide association studies with gene expression levels in post-mortem, neurologically-normal tissue from two independent human brain tissue expression datasets (UK Brain Expression Consortium (UKBEC) and Genotype-Tissue Expression (GTEx)). Utilizing stringent multi-region meta-analyses, we identified 2,224 cis-eQTLs associated with expression of 40 genes, including 11 non-coding RNAs. One cis-eQTL, rs16969968, results in a functionally disruptive missense mutation in CHRNA5, a schizophrenia-implicated gene. Importantly, comparing across tissues, we find that blood eQTLs capture < 10% of brain cis-eQTLs. In contrast, > 30% of brain-associated eQTLs are significant in tibial nerve. This study identifies putatively causal genes whose expression in region-specific tissue may contribute to the risk of schizophrenia and affective disorders.
|
43
|
Elucidating the Underlying Functional Mechanisms of Breast Cancer Susceptibility Through Post-GWAS Analyses. Front Genet 2018; 9:280. [PMID: 30116257 PMCID: PMC6082943 DOI: 10.3389/fgene.2018.00280] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 05/16/2018] [Accepted: 07/09/2018] [Indexed: 12/12/2022] Open
Abstract
Genome-wide association studies (GWAS) have identified more than 170 single nucleotide polymorphisms (SNPs) associated with the susceptibility to breast cancer. Together, these SNPs explain 18% of the familial relative risk, which is estimated to be nearly half of the total familial breast cancer risk that is collectively explained by low-risk susceptibility alleles. An important aspect of this success has been the access to large sample sizes through collaborative efforts within the Breast Cancer Association Consortium (BCAC), but also collaborations between cancer association consortia. Despite these achievements, however, understanding of each variant's underlying mechanism and how these SNPs predispose women to breast cancer remains limited and represents a major challenge in the field, particularly since the vast majority of the GWAS-identified SNPs are located in non-coding regions of the genome and are merely tags for the causal variants. In recent years, fine-scale mapping studies followed by functional evaluation of putative causal variants have begun to elucidate the biological function of several GWAS-identified variants. In this review, we discuss the findings and lessons learned from these post-GWAS analyses of 22 risk loci. Identifying the true causal variants underlying breast cancer susceptibility and their function not only provides better estimates of the explained familial relative risk thereby improving polygenetic risk scores (PRSs), it also increases our understanding of the biological mechanisms responsible for causing susceptibility to breast cancer. This will facilitate the identification of further breast cancer risk alleles and the development of preventive medicine for those women at increased risk for developing the disease.
|
44
|
Evidence for association of STAT4 and IL12RB2 variants with Myasthenia gravis susceptibility: What is the effect on gene expression in thymus? J Neuroimmunol 2018; 319:93-99. [PMID: 29576322 DOI: 10.1016/j.jneuroim.2018.03.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 12/06/2017] [Revised: 03/13/2018] [Accepted: 03/14/2018] [Indexed: 12/14/2022]
Abstract
Myasthenia gravis (MG) is an autoimmune disease mediated by the presence of autoantibodies that bind mainly to the acetylcholine receptor (AChR) in the neuromuscular junction. In our case-control association study, we analyzed common variants located in genes of the IL12/STAT4 and IL10/STAT3 signaling pathways. A total of 175 sporadic MG patients of Greek descent, positively detected with anti-AChR autoantibodies and 84 ethnically-matched, healthy volunteers were enrolled in the study. Thymus samples were obtained from 16 non-MG individuals for relative gene expression analysis. The strongest signals of association were observed in the cases of rs6679356 between the late-onset MG patients and controls and rs7574865 between early-onset MG and controls. Our investigation of the correlation between the MG-associated variants and the expression levels of each gene in thymus did not result in significant differences.
|
45
|
The Post-GWAS Era: From Association to Function. Am J Hum Genet 2018; 102:717-730. [PMID: 29727686 DOI: 10.1016/j.ajhg.2018.04.002] [Citation(s) in RCA: 449] [Impact Index Per Article: 74.8] [Received: 12/06/2017] [Accepted: 04/04/2018] [Indexed: 12/13/2022] Open
Abstract
During the past 12 years, genome-wide association studies (GWASs) have uncovered thousands of genetic variants that influence risk for complex human traits and diseases. Yet functional studies aimed at delineating the causal genetic variants and biological mechanisms underlying the observed statistical associations with disease risk have lagged. In this review, we highlight key advances in the field of functional genomics that may facilitate the derivation of biological meaning post-GWAS. We highlight the evidence suggesting that causal variants underlying disease risk often function through regulatory effects on the expression of target genes and that these expression effects might be modest and cell-type specific. We moreover discuss specific studies as proof-of-principle examples for current statistical, bioinformatic, and empirical bench-based approaches to downstream elucidation of GWAS-identified disease risk loci.
|
46
|
Single-cell RNA sequencing identifies celltype-specific cis-eQTLs and co-expression QTLs. Nat Genet 2018; 50:493-497. [PMID: 29610479 PMCID: PMC5905669 DOI: 10.1038/s41588-018-0089-9] [Citation(s) in RCA: 191] [Impact Index Per Article: 31.8] [Received: 08/17/2017] [Accepted: 02/23/2018] [Indexed: 11/17/2022]
|
47
|
Integrative genomics identifies new genes associated with severe COPD and emphysema. Respir Res 2018; 19:46. [PMID: 29566699 PMCID: PMC5863845 DOI: 10.1186/s12931-018-0744-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 01/09/2018] [Accepted: 03/06/2018] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Genome-wide association studies have identified several genetic risk loci for severe chronic obstructive pulmonary disease (COPD) and emphysema. However, these studies do not fully explain disease heritability and in most cases, fail to implicate specific genes. Integrative methods that combine gene expression data with GWAS can provide more power in discovering disease-associated genes and give mechanistic insight into regulated genes. METHODS We applied a recently described method that imputes gene expression using reference transcriptome data to genome-wide association studies for two phenotypes (severe COPD and quantitative emphysema) and blood and lung tissue gene expression datasets. We further tested the potential causality of individual genes using multi-variant colocalization. RESULTS We identified seven genes significantly associated with severe COPD, and five genes significantly associated with quantitative emphysema in whole blood or lung. We validated results in independent transcriptome databases and confirmed colocalization signals for PSMA4, EGLN2, WNT3, DCBLD1, and LILRA3. Three of these genes were not located within previously reported GWAS loci for either phenotype. We also identified genetically driven pathways, including those related to immune regulation. CONCLUSIONS An integrative analysis of GWAS and gene expression identified novel associations with severe COPD and quantitative emphysema, and also suggested disease-associated genes in known COPD susceptibility loci. TRIAL REGISTRATION NCT00608764 , Registry: ClinicalTrials.gov, Date of Enrollment of First Participant: November 2007, Date Registered: January 28, 2008 (retrospectively registered); NCT00292552 , Registry: ClinicalTrials.gov, Date of Enrollment of First Participant: December 2005, Date Registered: February 14, 2006 (retrospectively registered).
|
48
|
Abstract
Background Expression quantitative trait loci (eQTL) analysis identifies genetic markers associated with the expression of a gene. Most existing eQTL analyses and methods investigate association in a single, readily available tissue, such as blood. Joint analysis of eQTL in multiple tissues has the potential to improve, and expand the scope of, single-tissue analyses. Large-scale collaborative efforts such as the Genotype-Tissue Expression (GTEx) program are currently generating high quality data in a large number of tissues. However, computational constraints limit genome-wide multi-tissue eQTL analysis. Results We develop an integrative method under a hierarchical Bayesian framework for eQTL analysis in a large number of tissues. The model fitting procedure is highly scalable, and the computing time is a polynomial function of the number of tissues. Multi-tissue eQTLs are identified through a local false discovery rate approach, which rigorously controls the false discovery rate. Using simulation and GTEx real data studies, we show that the proposed method has superior performance to existing methods in terms of computing time and the power of eQTL discovery. Conclusions We provide a scalable method for eQTL analysis in a large number of tissues. The method enables the identification of eQTL with different configurations and facilitates the characterization of tissue specificity. Electronic supplementary material The online version of this article (10.1186/s12859-018-2088-3) contains supplementary material, which is available to authorized users.
|
49
|
Association analysis of the SNP (rs345476947) in the FUT2 gene with the production and reproductive traits in pigs. Genes Genomics 2018; 40:199-206. [PMID: 29892924 DOI: 10.1007/s13258-017-0623-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 08/09/2017] [Accepted: 10/15/2017] [Indexed: 12/21/2022]
Abstract
The FUT2 gene was considered as an important candidate for pathogenic infections, while the potential associations between this gene and the production and reproductive traits of pigs have not been explored. In this study, we detected the genetic variants of porcine FUT2 gene and analyzed the associations of the polymorphisms with FUT2 mRNA expression and production and reproductive traits (age at 100 kg, backfat thickness at 100 kg, eye muscle thickness, the number of newborn piglets, the number of weaned piglets, and birth weight) in 100 Large White sows. One single nucleotide polymorphism (SNP) (rs345476947, C→T) in the intron of FUT2 and three genotypes (TT, CT and CC) were determined. Association analysis revealed significant associations between this SNP with the number of newborn piglets and weaned piglets. Furthermore, individuals with the TT genotype had significantly higher numbers of newborn piglets and weaned piglets than those with the CC genotype (P < 0.05). Quantitative PCR analysis showed that FUT2 expression in individuals with CC genotype was significantly higher than those with TT and CT genotypes in the liver and lymph gland (P < 0.05) and higher than that of CT in the spleen, kidney, and duodenum (P < 0.05). These findings indicated that the TT genotype may be a favorable genotype for the reproductive traits of pigs. Our study revealed the genetic variants of the FUT2 gene and identified a promising candidate SNP (rs345476947) associated with the reproductive traits, which has the potential to be applied in selective breeding of pigs.
|
50
|
CD4+ and B Lymphocyte Expression Quantitative Traits at Rheumatoid Arthritis Risk Loci in Patients With Untreated Early Arthritis: Implications for Causal Gene Identification. Arthritis Rheumatol 2018; 70:361-370. [PMID: 29193869 PMCID: PMC5888199 DOI: 10.1002/art.40393] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 07/02/2017] [Accepted: 11/22/2017] [Indexed: 12/04/2022]
Abstract
Objective: Rheumatoid arthritis (RA) is a genetically complex disease of immune dysregulation. This study sought to gain further insight into the genetic risk mechanisms of RA by conducting an expression quantitative trait locus (eQTL) analysis of confirmed genetic risk loci in CD4+ T cells and B cells from carefully phenotyped patients with early arthritis who were naive to therapeutic immunomodulation. Methods: RNA and DNA were isolated from purified B and/or CD4+ T cells obtained from the peripheral blood of 344 patients with early arthritis. Genotyping and global gene expression measurements were carried out using Illumina BeadChip microarrays. Variants in linkage disequilibrium (LD) with non-HLA RA single-nucleotide polymorphisms (defined as r² ≥ 0.8) were analyzed, seeking evidence of cis- or trans-eQTLs according to whether the associated probes were or were not within 4 Mb of these LD blocks. Results: Genes subject to cis-eQTL effects that were common to both CD4+ and B lymphocytes at RA risk loci were FADS1, FADS2, BLK, FCRL3, ORMDL3, PPIL3, and GSDMB. In contrast, those acting on METTL21B, JAZF1, IKZF3, and PADI4 were unique to CD4+ lymphocytes, with the latter candidate risk gene being identified for the first time in this cell subset. B lymphocyte–specific eQTLs for SYNGR1 and CD83 were also found. At the 8p23 BLK–FAM167A locus, adjacent genes were subject to eQTLs whose activity differed markedly between cell types; in particular, the FAM167A effect displayed striking B lymphocyte specificity. No trans-eQTLs approached experiment-wide significance, and linear modeling did not identify a significant influence of biologic covariates on cis-eQTL effect sizes. Conclusion: These findings further refine the understanding of candidate causal genes in RA pathogenesis, thus providing an important platform from which downstream functional studies, directed toward particular cell types, may be prioritized.
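The cis/trans rule stated in the Methods (probes within 4 Mb of an LD block versus not) can be sketched as a small function. This is a hedged illustration rather than the authors' pipeline: the function name, the coordinate convention, the inclusive boundary, and the treatment of probes on a different chromosome as trans are all assumptions.

```python
CIS_WINDOW_BP = 4_000_000  # 4 Mb window, as stated in the abstract

def classify_eqtl(probe_chrom, probe_pos, block_chrom, block_start,
                  block_end, window=CIS_WINDOW_BP):
    """Label a probe/LD-block pair as 'cis' or 'trans'.

    Hypothetical helper: positions are base-pair coordinates, and a probe
    on another chromosome is assumed to be trans.
    """
    if probe_chrom != block_chrom:
        return "trans"
    # Distance from the probe to the nearest edge of the LD block
    # (zero if the probe lies inside the block).
    if probe_pos < block_start:
        distance = block_start - probe_pos
    elif probe_pos > block_end:
        distance = probe_pos - block_end
    else:
        distance = 0
    return "cis" if distance <= window else "trans"

# Made-up coordinates, for illustration only.
print(classify_eqtl("chr8", 11_500_000, "chr8", 11_300_000, 11_400_000))
print(classify_eqtl("chr8", 20_000_000, "chr8", 11_300_000, 11_400_000))
```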
|