1
Bhattacharyya S, Ay F. Identifying genetic variants associated with chromatin looping and genome function. Nat Commun 2024; 15:8174. [PMID: 39289357 PMCID: PMC11408621 DOI: 10.1038/s41467-024-52296-4] [Received: 09/08/2023] [Accepted: 08/30/2024] [Indexed: 09/19/2024] Open
Abstract
Here we present a comprehensive HiChIP dataset on naïve CD4 T cells (nCD4) from 30 donors and identify QTLs that associate with genotype-dependent and/or allele-specific variation of HiChIP contacts defining loops between active regulatory regions (iQTLs). We observe a substantial overlap between iQTLs and previously defined eQTLs and histone QTLs, and an enrichment for fine-mapped QTLs and GWAS variants. Furthermore, we describe a distinct subset of nCD4 iQTLs, for which the significant variation of chromatin contacts in nCD4 are translated into significant eQTL trends in CD4 T cell memory subsets. Finally, we define connectivity-QTLs as iQTLs that are significantly associated with concordant genotype-dependent changes in chromatin contacts over a broad genomic region (e.g., GWAS SNP in the RNASET2 locus). Our results demonstrate the importance of chromatin contacts as a complementary modality for QTL mapping and their power in identifying previously uncharacterized QTLs linked to cell-specific gene expression and connectivity.
Affiliation(s)
- Ferhat Ay
- La Jolla Institute for Immunology, La Jolla, CA, USA.
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA.
2
Wenz BM, He Y, Chen NC, Pickrell JK, Li JH, Dudek MF, Li T, Keener R, Voight BF, Brown CD, Battle A. Genotype inference from aggregated chromatin accessibility data reveals genetic regulatory mechanisms. bioRxiv 2024:2024.09.04.610850. [PMID: 39282458 PMCID: PMC11398312 DOI: 10.1101/2024.09.04.610850] [Indexed: 09/21/2024]
Abstract
Background: Understanding the genetic causes for variability in chromatin accessibility can shed light on the molecular mechanisms through which genetic variants may affect complex traits. Thousands of ATAC-seq samples have been collected that hold information about chromatin accessibility across diverse cell types and contexts, but most of these are not paired with genetic information and come from diverse distinct projects and laboratories. Results: We report here joint genotyping, chromatin accessibility peak calling, and discovery of quantitative trait loci which influence chromatin accessibility (caQTLs), demonstrating the capability of performing caQTL analysis on a large scale in a diverse sample set without pre-existing genotype information. Using 10,293 profiling samples representing 1,454 unique donor individuals across 653 studies from public databases, we catalog 23,381 caQTLs in total. After joint discovery analysis, we cluster samples based on accessible chromatin profiles to identify context-specific caQTLs. We find that caQTLs are strongly enriched for annotations of gene regulatory elements across diverse cell types and tissues and are often strongly linked with genetic variation associated with changes in expression (eQTLs), indicating that caQTLs can mediate genetic effects on gene expression. We demonstrate sharing of causal variants for chromatin accessibility and diverse complex human traits, enabling a more complete picture of the genetic mechanisms underlying complex human phenotypes. Conclusions: Our work provides a proof of principle for caQTL calling from previously ungenotyped samples, and represents one of the largest, most diverse caQTL resources currently available, informing mechanisms of genetic regulation of gene expression and contribution to disease.
Affiliation(s)
- Brandon M Wenz
- Genetics and Epigenetics Program, Cell and Molecular Biology Graduate Group, Biomedical Graduate Studies, University of Pennsylvania - Perelman School of Medicine, Philadelphia PA 19104
- Yuan He
- Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, 21218
- Nae-Chyun Chen
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, 21218
- Max F Dudek
- Graduate Group in Genomics and Computational Biology, University of Pennsylvania, Philadelphia, PA 19104
- Taibo Li
- Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, 21218
- Rebecca Keener
- Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, 21218
- Benjamin F Voight
- Department of Genetics, University of Pennsylvania - Perelman School of Medicine, Philadelphia, PA, 19104
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania - Perelman School of Medicine, Philadelphia PA, 19104
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania - Perelman School of Medicine, Philadelphia, PA, 19104
- Christopher D Brown
- Department of Genetics, University of Pennsylvania - Perelman School of Medicine, Philadelphia, PA, 19104
- Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, 21218
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, 21218
- Department of Genetic Medicine, Johns Hopkins University; Baltimore, MD, 21218
- Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, 21218
- Data Science and AI Institute, Johns Hopkins University, Baltimore, MD, 21218
3
Qi G, Battle A. Computational methods for allele-specific expression in single cells. Trends Genet 2024:S0168-9525(24)00169-0. [PMID: 39127549 DOI: 10.1016/j.tig.2024.07.003] [Received: 03/31/2024] [Revised: 07/16/2024] [Accepted: 07/17/2024] [Indexed: 08/12/2024]
Abstract
Allele-specific expression (ASE) is a powerful signal that can be used to investigate multiple molecular mechanisms, such as cis-regulatory effects and imprinting. Single-cell RNA-sequencing (scRNA-seq) enables ASE characterization at the resolution of individual cells. In this review, we highlight the computational methods for processing and analyzing single-cell ASE data. We first describe a bioinformatics pipeline to obtain ASE counts from raw reads synthesized from previous literature. We then discuss statistical methods for detecting allelic imbalance and its variability across conditions using scRNA-seq data. In addition, we describe other methods that use single-cell ASE to address specific biological questions. Finally, we discuss future directions and emphasize the need for an integrated, optimized bioinformatics pipeline, and further development of statistical methods for different technologies.
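Illustrative note (not taken from the review): one simple way to test for allelic imbalance at a single heterozygous site is an exact two-sided binomial test on the reference and alternative read counts. The sketch below assumes a null reference-allele fraction of 0.5; the function name and the example counts are hypothetical, and real single-cell ASE data usually call for overdispersion-aware models rather than a plain binomial.

```python
from math import comb


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one site.

    Hypothetical helper for illustration only. Suitable for modest read
    depths; very deep sites would need a log-space implementation.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, so no evidence of imbalance
    # Probability of each possible reference-read count under the null.
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum outcomes that are at least as unlikely as the observed one.
    # The small tolerance guards against floating-point ties.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts: balanced site versus fully imbalanced site.
    print(binomial_two_sided_p(5, 5))
    print(binomial_two_sided_p(10, 0))
```

A small p-value suggests the two alleles are not equally expressed, although mapping bias can produce the same signal and would have to be handled upstream.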
Affiliation(s)
- Guanghao Qi
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA.
- Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA; Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD 21205, USA.
4
Adduri A, Kim S. Ornaments for efficient allele-specific expression estimation with bias correction. Am J Hum Genet 2024; 111:1770-1781. [PMID: 39047729 PMCID: PMC11339617 DOI: 10.1016/j.ajhg.2024.06.014] [Received: 09/27/2023] [Revised: 06/22/2024] [Accepted: 06/24/2024] [Indexed: 07/27/2024] Open
Abstract
Allele-specific expression plays a crucial role in unraveling various biological mechanisms, including genomic imprinting and gene expression controlled by cis-regulatory variants. However, existing methods for quantification from RNA-sequencing (RNA-seq) reads do not adequately and efficiently remove various allele-specific read mapping biases, such as reference bias arising from reads containing the alternative allele that do not map to the reference transcriptome or ambiguous mapping bias caused by reads containing the reference allele that map differently from reads containing the alternative allele. We present Ornaments, a computational tool for rapid and accurate estimation of allele-specific transcript expression at unphased heterozygous loci from RNA-seq reads while correcting for allele-specific read mapping biases. Ornaments removes reference bias by mapping reads to a personalized transcriptome and ambiguous mapping bias by probabilistically assigning reads to multiple transcripts and variant loci they map to. Ornaments is a lightweight extension of kallisto, a popular tool for fast RNA-seq quantification, that improves the efficiency and accuracy of WASP, a popular tool for bias correction in allele-specific read mapping. In experiments with simulated and human lymphoblastoid cell-line RNA-seq reads with the genomes of the 1000 Genomes Project, we demonstrate that Ornaments improves the accuracy of WASP and kallisto, is nearly as efficient as kallisto, and is an order of magnitude faster than WASP per sample, with the additional cost of constructing a personalized index for multiple samples. Additionally, we show that Ornaments finds imprinted transcripts with higher sensitivity than WASP, which detects imprinted signals only at gene level.
Affiliation(s)
- Abhinav Adduri
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Seyoung Kim
- Department of Epidemiology, University of Pittsburgh, Pittsburgh, PA 15261, USA.
5
Mummey HM, Elison W, Korgaonkar K, Elgamal RM, Kudtarkar P, Griffin E, Benaglio P, Miller M, Jha A, Fox JEM, McCarthy MI, Preissl S, Gloyn AL, MacDonald PE, Gaulton KJ. Single cell multiome profiling of pancreatic islets reveals physiological changes in cell type-specific regulation associated with diabetes risk. bioRxiv 2024:2024.08.03.606460. [PMID: 39149326 PMCID: PMC11326183 DOI: 10.1101/2024.08.03.606460] [Indexed: 08/17/2024]
Abstract
Physiological variability in pancreatic cell type gene regulation and the impact on diabetes risk is poorly understood. In this study we mapped gene regulation in pancreatic cell types using single cell multiomic (joint RNA-seq and ATAC-seq) profiling in 28 non-diabetic donors in combination with single cell data from 35 non-diabetic donors in the Human Pancreas Analysis Program. We identified widespread associations with age, sex, BMI, and HbA1c, where gene regulatory responses were highly cell type- and phenotype-specific. In beta cells, donor age associated with hypoxia, apoptosis, unfolded protein response, and external signal-dependent transcriptional regulators, while HbA1c associated with inflammatory responses and gender with chromatin organization. We identified 10.8K loci where genetic variants were QTLs for cis regulatory element (cRE) accessibility, including 20% with lineage- or cell type-specific effects which disrupted distinct transcription factor motifs. Type 2 diabetes and glycemic trait associated variants were enriched in both phenotype- and QTL-associated beta cell cREs, whereas type 1 diabetes showed limited enrichment. Variants at 226 diabetes and glycemic trait loci were QTLs in beta and other cell types, including 40 that were statistically colocalized, and annotating target genes of colocalized QTLs revealed genes with putatively novel roles in disease. Our findings reveal diverse responses of pancreatic cell types to phenotype and genotype in physiology, and identify pathways, networks, and genes through which physiology impacts diabetes risk.
Affiliation(s)
- Hannah M Mummey
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla CA
- Weston Elison
- Biomedical Sciences Program, University of California San Diego, La Jolla CA, USA
- Katha Korgaonkar
- Department of Pediatrics, University of California San Diego, La Jolla CA, USA
- Ruth M Elgamal
- Biomedical Sciences Program, University of California San Diego, La Jolla CA, USA
- Parul Kudtarkar
- Department of Pediatrics, University of California San Diego, La Jolla CA, USA
- Emily Griffin
- Department of Pediatrics, University of California San Diego, La Jolla CA, USA
- Paola Benaglio
- Department of Pediatrics, University of California San Diego, La Jolla CA, USA
- Michael Miller
- Center for Epigenomics, University of California San Diego, La Jolla CA, USA
- Alokkumar Jha
- Department of Pediatrics, Stanford School of Medicine, Stanford University, Stanford CA, USA
- Jocelyn E Manning Fox
- Department of Pharmacology, University of Alberta, Edmonton, Alberta, Canada
- Alberta Diabetes Institute, University of Alberta, Edmonton, Alberta, Canada
- Mark I McCarthy
- Wellcome Trust Center for Human Genetics, University of Oxford, Oxford, UK
- Sebastian Preissl
- Center for Epigenomics, University of California San Diego, La Jolla CA, USA
- Department of Genetics, Stanford School of Medicine, Stanford University, Stanford CA, USA
- Anna L Gloyn
- Department of Pediatrics, Stanford School of Medicine, Stanford University, Stanford CA, USA
- Department of Genetics, Stanford School of Medicine, Stanford University, Stanford CA, USA
- Stanford Diabetes Research Center, Stanford School of Medicine, Stanford, CA, USA
- Patrick E MacDonald
- Alberta Diabetes Institute, University of Alberta, Edmonton, Alberta, Canada
- Kyle J Gaulton
- Department of Pediatrics, University of California San Diego, La Jolla CA, USA
6
Biddie SC, Weykopf G, Hird EF, Friman ET, Bickmore WA. DNA-binding factor footprints and enhancer RNAs identify functional non-coding genetic variants. Genome Biol 2024; 25:208. [PMID: 39107801 PMCID: PMC11304670 DOI: 10.1186/s13059-024-03352-1] [Received: 03/06/2024] [Accepted: 07/25/2024] [Indexed: 08/10/2024] Open
Abstract
Background: Genome-wide association studies (GWAS) have revealed a multitude of candidate genetic variants affecting the risk of developing complex traits and diseases. However, the highlighted regions are typically in the non-coding genome, and uncovering the functional causative single nucleotide variants (SNVs) is challenging. Prioritization of variants is commonly based on genomic annotation with markers of active regulatory elements, but current approaches still poorly predict functional variants. To address this, we systematically analyze six markers of active regulatory elements for their ability to identify functional variants. Results: We benchmark against molecular quantitative trait loci (molQTL) from assays of regulatory element activity that identify allelic effects on DNA-binding factor occupancy, reporter assay expression, and chromatin accessibility. We identify the combination of DNase footprints and divergent enhancer RNA (eRNA) as markers for functional variants. This signature provides high precision, but with a trade-off of low recall, thus substantially reducing candidate variant sets to prioritize variants for functional validation. We present this as a framework called FINDER (Functional SNV IdeNtification using DNase footprints and eRNA). Conclusions: We demonstrate the utility to prioritize variants using leukocyte count trait and analyze variants in linkage disequilibrium with a lead variant to predict a functional variant in asthma. Our findings have implications for prioritizing variants from GWAS, in development of predictive scoring algorithms, and for functionally informed fine mapping approaches.
Affiliation(s)
- Simon C Biddie
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK.
- NHS Lothian, Edinburgh, UK.
- Giovanna Weykopf
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Elias T Friman
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Wendy A Bickmore
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK.
7
Pushkarev O, van Mierlo G, Kribelbauer JF, Saelens W, Gardeux V, Deplancke B. Non-coding variants impact cis-regulatory coordination in a cell type-specific manner. Genome Biol 2024; 25:190. [PMID: 39026229 PMCID: PMC11256678 DOI: 10.1186/s13059-024-03333-4] [Received: 10/09/2023] [Accepted: 07/09/2024] [Indexed: 07/20/2024] Open
Abstract
Background: Interactions among cis-regulatory elements (CREs) play a crucial role in gene regulation. Various approaches have been developed to map these interactions genome-wide, including those relying on interindividual epigenomic variation to identify groups of covariable regulatory elements, referred to as chromatin modules (CMs). While CM mapping allows to investigate the relationship between chromatin modularity and gene expression, the computational principles used for CM identification vary in their application and outcomes. Results: We comprehensively evaluate and streamline existing CM mapping tools and present guidelines for optimal utilization of epigenome data from a diverse population of individuals to assess regulatory coordination across the human genome. We showcase the effectiveness of our recommended practices by analyzing distinct cell types and demonstrate cell type specificity of CRE interactions in CMs and their relevance for gene expression. Integration of genotype information revealed that many non-coding disease-associated variants affect the activity of CMs in a cell type-specific manner by affecting the binding of cell type-specific transcription factors. We provide example cases that illustrate in detail how CMs can be used to deconstruct GWAS loci, assess variable expression of cell surface receptors in immune cells, and reveal how genetic variation can impact the expression of prognostic markers in chronic lymphocytic leukemia. Conclusions: Our study presents an optimal strategy for CM mapping and reveals how CMs capture the coordination of CREs and its impact on gene expression. Non-coding genetic variants can disrupt this coordination, and we highlight how this may lead to disease predisposition in a cell type-specific manner.
Affiliation(s)
- Olga Pushkarev
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Guido van Mierlo
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
- Judith Franziska Kribelbauer
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Wouter Saelens
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Vincent Gardeux
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Bart Deplancke
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
8
Fazel-Najafabadi M, Looger LL, Rallabandi HR, Nath SK. A Multilayered Post-Genome-Wide Association Study Analysis Pipeline Defines Functional Variants and Target Genes for Systemic Lupus Erythematosus. Arthritis Rheumatol 2024; 76:1071-1084. [PMID: 38369936 PMCID: PMC11213670 DOI: 10.1002/art.42829] [Received: 09/14/2023] [Revised: 01/31/2024] [Accepted: 02/14/2024] [Indexed: 02/20/2024]
Abstract
Objective: Systemic lupus erythematosus (SLE), an autoimmune disease with incompletely understood etiology, has a strong genetic component. Although genome-wide association studies (GWASs) have revealed multiple SLE susceptibility loci and associated single-nucleotide polymorphisms (SNPs), the precise causal variants, target genes, cell types, tissues, and mechanisms of action remain largely unknown. Methods: Here, we report a comprehensive post-GWAS analysis using extensive bioinformatics, molecular modeling, and integrative functional genomic and epigenomic analyses to optimize fine-mapping. We compile and cross-reference immune cell-specific expression quantitative trait loci (cis- and trans-expression quantitative trait loci) with promoter capture high-throughput capture chromatin conformation (PCHi-C), allele-specific chromatin accessibility, and massively parallel reporter assay data to define predisposing variants and target genes. We experimentally validate a predicted locus using CRISPR/Cas9 genome editing, quantitative polymerase chain reaction, and Western blot. Results: Anchoring on 452 index SNPs, we selected 9,931 high linkage disequilibrium (r² > 0.8) SNPs and defined 182 independent non-human leukocyte antigen (HLA) SLE loci. The 3,746 SNPs from 143 loci were identified as regulating 564 unique genes. Target genes are enriched in lupus-related tissues and associated with other autoimmune diseases. Of these, 329 SNPs (106 loci) showed significant allele-specific chromatin accessibility and/or enhancer activity, indicating regulatory potential. Using CRISPR/Cas9, we validated reference SNP identifier 57668933 (rs57668933) as a functional variant regulating multiple targets, including SLE-risk gene ELF1 in B cells. Conclusion: We demonstrate and validate post-GWAS strategies for using multidimensional data to prioritize likely causal variants with cognate gene targets underlying SLE pathogenesis. Our results provide a catalog of significantly SLE-associated SNPs and loci, target genes, and likely biochemical mechanisms to guide experimental characterization.
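Illustrative note (not the authors' pipeline): the abstract's selection of SNPs in high linkage disequilibrium with an index SNP at a threshold of 0.8 can be sketched as below. As a simplifying assumption, r² is approximated here by the squared Pearson correlation of unphased genotype dosages (0, 1, 2), not computed from phased haplotypes; the function names, SNP labels, and dosage vectors are all hypothetical.

```python
from typing import Dict, List, Sequence


def genotype_r2(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("dosage vectors must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return (sxy * sxy) / (sxx * syy)


def ld_proxies(
    index_dosages: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    threshold: float = 0.8,
) -> List[str]:
    """Names of candidate SNPs whose r2 with the index SNP exceeds threshold."""
    return [
        name
        for name, dosages in candidates.items()
        if genotype_r2(index_dosages, dosages) > threshold
    ]


if __name__ == "__main__":
    # Hypothetical dosages for six individuals.
    index_snp = [0, 1, 2, 1, 0, 2]
    candidates = {
        "snp_a": [0, 1, 2, 1, 0, 2],  # identical to the index SNP
        "snp_b": [2, 1, 0, 1, 2, 0],  # same signal with alleles flipped
        "snp_c": [0, 0, 1, 2, 1, 0],  # uncorrelated with the index SNP
    }
    print(ld_proxies(index_snp, candidates))
```

Because r² is unsigned, a SNP coded against the opposite allele (snp_b here) is still retained as a proxy.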
Affiliation(s)
- Mehdi Fazel-Najafabadi
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Loren L. Looger
- Department of Neurosciences, University of California, San Diego, La Jolla, CA 92121, USA
- Howard Hughes Medical Institute, University of California, San Diego, La Jolla, CA 92121, USA
- Harikrishna Reddy Rallabandi
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Swapan K. Nath
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
9
Loeb GB, Kathail P, Shuai R, Chung R, Grona RJ, Peddada S, Sevim V, Federman S, Mader K, Chu A, Davitte J, Du J, Gupta AR, Ye CJ, Shafer S, Przybyla L, Rapiteanu R, Ioannidis N, Reiter JF. Variants in tubule epithelial regulatory elements mediate most heritable differences in human kidney function. bioRxiv 2024:2024.06.18.599625. [PMID: 38948875 PMCID: PMC11212968 DOI: 10.1101/2024.06.18.599625] [Indexed: 07/02/2024]
Abstract
Kidney disease is highly heritable; however, the causal genetic variants, the cell types in which these variants function, and the molecular mechanisms underlying kidney disease remain largely unknown. To identify genetic loci affecting kidney function, we performed a GWAS using multiple kidney function biomarkers and identified 462 loci. To begin to investigate how these loci affect kidney function, we generated single-cell chromatin accessibility (scATAC-seq) maps of the human kidney and identified candidate cis-regulatory elements (cCREs) for kidney podocytes, tubule epithelial cells, and kidney endothelial, stromal, and immune cells. Kidney tubule epithelial cCREs explained 58% of kidney function SNP-heritability and kidney podocyte cCREs explained an additional 6.5% of SNP-heritability. In contrast, little kidney function heritability was explained by kidney endothelial, stromal, or immune cell-specific cCREs. Through functionally informed fine-mapping, we identified putative causal kidney function variants and their corresponding cCREs. Using kidney scATAC-seq data, we created a deep learning model (which we named ChromKid) to predict kidney cell type-specific chromatin accessibility from sequence. ChromKid and allele specific kidney scATAC-seq revealed that many fine-mapped kidney function variants locally change chromatin accessibility in tubule epithelial cells. Enhancer assays confirmed that fine-mapped kidney function variants alter tubule epithelial regulatory element function. To map the genes which these regulatory elements control, we used CRISPR interference (CRISPRi) to target these regulatory elements in tubule epithelial cells and assessed changes in gene expression. CRISPRi of enhancers harboring kidney function variants regulated NDRG1 and RBPMS expression. Thus, inherited differences in tubule epithelial NDRG1 and RBPMS expression may predispose to kidney disease in humans. 
We conclude that genetic variants affecting tubule epithelial regulatory element function account for most SNP-heritability of human kidney function. This work provides an experimental approach to identify the variants, regulatory elements, and genes involved in polygenic disease.
Affiliation(s)
- Gabriel B. Loeb
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
- Cardiovascular Research Institute, University of California, San Francisco, San Francisco, CA, US
- Pooja Kathail
- Department of Electrical Engineering and Computer Science, Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA
- Richard Shuai
- Department of Electrical Engineering and Computer Science, Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA
- Ryan Chung
- Department of Electrical Engineering and Computer Science, Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA
- Reinier J. Grona
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
- Sailaja Peddada
- Laboratory for Genomics Research, University of California, San Francisco, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
- Volkan Sevim
- Laboratory for Genomics Research, University of California, San Francisco, San Francisco, CA, USA
- Genomic Sciences, GlaxoSmithKline, San Francisco, CA, USA
- Scot Federman
- Laboratory for Genomics Research, University of California, San Francisco, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
- Karl Mader
- Laboratory for Genomics Research, University of California, San Francisco, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
- Audrey Chu
- Genomic Sciences, GlaxoSmithKline, San Francisco, CA, USA
- Juan Du
- Department of Surgery, University of California, San Francisco, San Francisco, CA, USA
- Alexander R. Gupta
- Department of Surgery, University of California, San Francisco, San Francisco, CA, USA
- Chun Jimmie Ye
- Division of Rheumatology, Department of Medicine; Bakar Computational Health Sciences Institute; Parker Institute for Cancer Immunotherapy; Institute for Human Genetics; Department of Epidemiology & Biostatistics; Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA and Gladstone-UCSF Institute of Genomic Immunology, San Francisco, CA, USA
- Shawn Shafer
- Laboratory for Genomics Research, University of California, San Francisco, San Francisco, CA, USA
- Genomic Sciences, GlaxoSmithKline, San Francisco, CA, USA
- Laralynne Przybyla
- Laboratory for Genomics Research, University of California, San Francisco, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
- Radu Rapiteanu
- Genomic Sciences, GlaxoSmithKline, San Francisco, CA, USA
- Nilah Ioannidis
- Department of Electrical Engineering and Computer Science, Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Jeremy F. Reiter
- Cardiovascular Research Institute, University of California, San Francisco, San Francisco, CA, US
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
10
Kumasaka N. Genetic association mapping leveraging Gaussian processes. J Hum Genet 2024:10.1038/s10038-024-01259-0. [PMID: 38834722 DOI: 10.1038/s10038-024-01259-0] [Received: 01/03/2024] [Revised: 05/16/2024] [Accepted: 05/20/2024] [Indexed: 06/06/2024]
Abstract
Gaussian processes (GPs) are a powerful and useful approach for modelling nonlinear phenomena in various scientific fields, including genomics and genetics. This review focuses on the application of GPs in genetic association mapping. The aim is to identify genetic variants that alter gene regulation along continuous cellular states at the molecular level, as well as disease susceptibility over time and space at the population level. The challenges and opportunities in this field are also addressed.
Affiliation(s)
- Natsuhiko Kumasaka
- Division of Digital Genomics, Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan.
11
Boye C, Nirmalan S, Ranjbaran A, Luca F. Genotype × environment interactions in gene regulation and complex traits. Nat Genet 2024; 56:1057-1068. [PMID: 38858456 DOI: 10.1038/s41588-024-01776-w] [Received: 06/13/2023] [Accepted: 04/25/2024] [Indexed: 06/12/2024]
Abstract
Genotype × environment interactions (GxE) have long been recognized as a key mechanism underlying human phenotypic variation. Technological developments over the past 15 years have dramatically expanded our appreciation of the role of GxE in both gene regulation and complex traits. The richness and complexity of these datasets also required parallel efforts to develop robust and sensitive statistical and computational approaches. Although our understanding of the genetic architecture of molecular and complex traits has been maturing, a large proportion of complex trait heritability remains unexplained. Furthermore, there are increasing efforts to characterize the effect of environmental exposure on human health. We therefore review GxE in human gene regulation and complex traits, advocating for a comprehensive approach that jointly considers genetic and environmental factors in human health and disease.
Affiliation(s)
- Carly Boye
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, US
| | - Shreya Nirmalan
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, US
| | - Ali Ranjbaran
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, US
| | - Francesca Luca
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, US.
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, MI, US.
- Department of Biology, University of Rome "Tor Vergata", Rome, Italy.
| |
|
12
|
Chin IM, Gardell ZA, Corces MR. Decoding polygenic diseases: advances in noncoding variant prioritization and validation. Trends Cell Biol 2024; 34:465-483. [PMID: 38719704 DOI: 10.1016/j.tcb.2024.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 03/12/2024] [Accepted: 03/21/2024] [Indexed: 06/09/2024]
Abstract
Genome-wide association studies (GWASs) provide a key foundation for elucidating the genetic underpinnings of common polygenic diseases. However, these studies have limitations in their ability to assign causality to particular genetic variants, especially those residing in the noncoding genome. Over the past decade, technological and methodological advances in both analytical and empirical prioritization of noncoding variants have enabled the identification of causative variants by leveraging orthogonal functional evidence at increasing scale. In this review, we present an overview of these approaches and describe how this workflow provides the groundwork necessary to move beyond associations toward genetically informed studies on the molecular and cellular mechanisms of polygenic disease.
Affiliation(s)
- Iris M Chin
- Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
| | - Zachary A Gardell
- Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
| | - M Ryan Corces
- Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA.
| |
|
13
|
Hu X, Wang J, Yang K, Fan H, Wu J, Ren J, Han G, Li J, Xue Z, Liu X, Lv X. The GWAS SNP rs80207740 modulates erythrocyte traits via allele-specific binding of IKZF1 and targeting XPO7 gene. FASEB J 2024; 38:e23666. [PMID: 38780091 DOI: 10.1096/fj.202302017r] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 03/31/2024] [Accepted: 04/30/2024] [Indexed: 05/25/2024]
Abstract
Genome-wide association studies have identified many single nucleotide polymorphisms (SNPs) associated with erythrocyte traits. However, the functional variants and their working mechanisms remain largely unknown. Here, we reported that the SNP of rs80207740, which was associated with red blood cell (RBC) volume and hemoglobin content across populations, conferred enhancer activity to XPO7 gene via allele-differentially binding to Ikaros family zinc finger 1 (IKZF1). We showed that the region around rs80207740 was an erythroid-specific enhancer using reporter assays, and that the G-allele further enhanced activity. 3D genome evidence showed that the enhancer interacted with the XPO7 promoter, and eQTL analysis suggested that the G-allele upregulated expression of XPO7. We further showed that the rs80207740-G allele facilitated the binding of transcription factor IKZF1 in EMSA and ChIP analyses. Knockdown of IKZF1 and GATA1 resulted in decreased expression of Xpo7 in both human and mouse erythroid cells. Finally, we constructed Xpo7 knockout mouse by CRISPR/Cas9 and observed anemic phenotype with reduced volume and hemoglobin content of RBC, consistent to the effect of rs80207740 on erythrocyte traits. Overall, our study demonstrated that rs80207740 modulated erythroid indices by regulating IKZF1 binding and Xpo7 expression.
Affiliation(s)
- Xinjun Hu
- State Key Laboratory of Complex, Severe, and Rare Diseases, Haihe Laboratory of Cell Ecosystem, Department of Pathophysiology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, P. R. China
| | - Jiaxin Wang
- State Key Laboratory of Complex, Severe, and Rare Diseases, Haihe Laboratory of Cell Ecosystem, Department of Pathophysiology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, P. R. China
| | - Ke Yang
- State Key Laboratory of Complex, Severe, and Rare Diseases, Haihe Laboratory of Cell Ecosystem, Department of Pathophysiology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, P. R. China
| | - Hong Fan
- State Key Laboratory of Complex, Severe, and Rare Diseases, Haihe Laboratory of Cell Ecosystem, Department of Pathophysiology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, P. R. China
| | - Jie Wu
- Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, P. R. China
| | - Jiuqiang Ren
- State Key Laboratory of Complex, Severe, and Rare Diseases, Haihe Laboratory of Cell Ecosystem, Department of Pathophysiology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, P. R. China
| | - Gaijing Han
- State Key Laboratory of Complex, Severe, and Rare Diseases, Haihe Laboratory of Cell Ecosystem, Department of Pathophysiology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, P. R. China
| | - Jing Li
- State Key Laboratory of Complex, Severe, and Rare Diseases, Haihe Laboratory of Cell Ecosystem, Department of Pathophysiology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, P. R. China
| | - Zheng Xue
- State Key Laboratory of Complex, Severe, and Rare Diseases, Haihe Laboratory of Cell Ecosystem, Department of Pathophysiology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, P. R. China
| | - Xuehui Liu
- State Key Laboratory of Complex, Severe, and Rare Diseases, Haihe Laboratory of Cell Ecosystem, Department of Pathophysiology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, P. R. China
| | - Xiang Lv
- State Key Laboratory of Complex, Severe, and Rare Diseases, Haihe Laboratory of Cell Ecosystem, Department of Pathophysiology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, P. R. China
| |
|
14
|
Zeng B, Bendl J, Deng C, Lee D, Misir R, Reach SM, Kleopoulos SP, Auluck P, Marenco S, Lewis DA, Haroutunian V, Ahituv N, Fullard JF, Hoffman GE, Roussos P. Genetic regulation of cell type-specific chromatin accessibility shapes brain disease etiology. Science 2024; 384:eadh4265. [PMID: 38781378 DOI: 10.1126/science.adh4265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Accepted: 12/20/2023] [Indexed: 05/25/2024]
Abstract
Nucleotide variants in cell type-specific gene regulatory elements in the human brain are risk factors for human disease. We measured chromatin accessibility in 1932 aliquots of sorted neurons and non-neurons from 616 human postmortem brains and identified 34,539 open chromatin regions with chromatin accessibility quantitative trait loci (caQTLs). Only 10.4% of caQTLs are shared between neurons and non-neurons, which supports cell type-specific genetic regulation of the brain regulome. Incorporating allele-specific chromatin accessibility improves statistical fine-mapping and refines molecular mechanisms that underlie disease risk. Using massively parallel reporter assays in induced excitatory neurons, we screened 19,893 brain QTLs and identified the functional impact of 476 regulatory variants. Combined, this comprehensive resource captures variation in the human brain regulome and provides insights into disease etiology.
Affiliation(s)
- Biao Zeng
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Jaroslav Bendl
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Chengyu Deng
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Donghoon Lee
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Ruth Misir
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Sarah M Reach
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Steven P Kleopoulos
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Pavan Auluck
- Human Brain Collection Core, National Institute of Mental Health-Intramural Research Program, Bethesda, MD 20892, USA
| | - Stefano Marenco
- Human Brain Collection Core, National Institute of Mental Health-Intramural Research Program, Bethesda, MD 20892, USA
| | - David A Lewis
- Translational Neuroscience Program, Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
| | - Vahram Haroutunian
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Mental Illness Research, Education and Clinical Centers, James J. Peters VA Medical Center, Bronx, NY 10468, USA
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA
| | - John F Fullard
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Gabriel E Hoffman
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Panos Roussos
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Mental Illness Research, Education and Clinical Centers, James J. Peters VA Medical Center, Bronx, NY 10468, USA
| |
|
15
|
Kuffler L, Skelly DA, Czechanski A, Fortin HJ, Munger SC, Baker CL, Reinholdt LG, Carter GW. Imputation of 3D genome structure by genetic-epigenetic interaction modeling in mice. eLife 2024; 12:RP88222. [PMID: 38669177 PMCID: PMC11052574 DOI: 10.7554/elife.88222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/28/2024] Open
Abstract
Gene expression is known to be affected by interactions between local genetic variation and DNA accessibility, with the latter organized into three-dimensional chromatin structures. Analyses of these interactions have previously been limited, obscuring their regulatory context, and the extent to which they occur throughout the genome. Here, we undertake a genome-scale analysis of these interactions in a genetically diverse population to systematically identify global genetic-epigenetic interaction, and reveal constraints imposed by chromatin structure. We establish the extent and structure of genotype-by-epigenotype interaction using embryonic stem cells derived from Diversity Outbred mice. This mouse population segregates millions of variants from eight inbred founders, enabling precision genetic mapping with extensive genotypic and phenotypic diversity. With 176 samples profiled for genotype, gene expression, and open chromatin, we used regression modeling to infer genetic-epigenetic interactions on a genome-wide scale. Our results demonstrate that statistical interactions between genetic variants and chromatin accessibility are common throughout the genome. We found that these interactions occur within the local area of the affected gene, and that this locality corresponds to topologically associated domains (TADs). The likelihood of interaction was most strongly defined by the three-dimensional (3D) domain structure rather than linear DNA sequence. We show that stable 3D genome structure is an effective tool to guide searches for regulatory elements and, conversely, that regulatory elements in genetically diverse populations provide a means to infer 3D genome structure. We confirmed this finding with CTCF ChIP-seq that revealed strain-specific binding in the inbred founder mice. 
In stem cells, open chromatin participating in the most significant regression models demonstrated an enrichment for developmental genes and the TAD-forming CTCF-binding complex, providing an opportunity for statistical inference of shifting TAD boundaries operating during early development. These findings provide evidence that genetic and epigenetic factors operate within the context of 3D chromatin structure.
|
16
|
Arthur TD, Nguyen JP, D'Antonio-Chronowska A, Jaureguy J, Silva N, Henson B, Panopoulos AD, Belmonte JCI, D'Antonio M, McVicker G, Frazer KA. Multi-omic QTL mapping in early developmental tissues reveals phenotypic and temporal complexity of regulatory variants underlying GWAS loci. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.10.588874. [PMID: 38645112 PMCID: PMC11030419 DOI: 10.1101/2024.04.10.588874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
Most GWAS loci are presumed to affect gene regulation, however, only ∼43% colocalize with expression quantitative trait loci (eQTLs). To address this colocalization gap, we identify eQTLs, chromatin accessibility QTLs (caQTLs), and histone acetylation QTLs (haQTLs) using molecular samples from three early developmental (EDev) tissues. Through colocalization, we annotate 586 GWAS loci for 17 traits by QTL complexity, QTL phenotype, and QTL temporal specificity. We show that GWAS loci are highly enriched for colocalization with complex QTL modules that affect multiple elements (genes and/or peaks). We also demonstrate that caQTLs and haQTLs capture regulatory variations not associated with eQTLs and explain ∼49% of the functionally annotated GWAS loci. Additionally, we show that EDev-unique QTLs are strongly depleted for colocalizing with GWAS loci. By conducting one of the largest multi-omic QTL studies to date, we demonstrate that many GWAS loci exhibit phenotypic complexity and therefore, are missed by traditional eQTL analyses.
|
17
|
Sakaue S, Weinand K, Isaac S, Dey KK, Jagadeesh K, Kanai M, Watts GFM, Zhu Z, Brenner MB, McDavid A, Donlin LT, Wei K, Price AL, Raychaudhuri S. Tissue-specific enhancer-gene maps from multimodal single-cell data identify causal disease alleles. Nat Genet 2024; 56:615-626. [PMID: 38594305 DOI: 10.1038/s41588-024-01682-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Accepted: 02/07/2024] [Indexed: 04/11/2024]
Abstract
Translating genome-wide association study (GWAS) loci into causal variants and genes requires accurate cell-type-specific enhancer-gene maps from disease-relevant tissues. Building enhancer-gene maps is essential but challenging with current experimental methods in primary human tissues. Here we developed a nonparametric statistical method, SCENT (single-cell enhancer target gene mapping), that models association between enhancer chromatin accessibility and gene expression in single-cell or nucleus multimodal RNA sequencing and ATAC sequencing data. We applied SCENT to 9 multimodal datasets including >120,000 single cells or nuclei and created 23 cell-type-specific enhancer-gene maps. These maps were highly enriched for causal variants in expression quantitative loci and GWAS for 1,143 diseases and traits. We identified likely causal genes for both common and rare diseases and linked somatic mutation hotspots to target genes. We demonstrate that application of SCENT to multimodal data from disease-relevant human tissue enables the scalable construction of accurate cell-type-specific enhancer-gene maps, essential for defining noncoding variant function.
Affiliation(s)
- Saori Sakaue
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Kathryn Weinand
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Shakson Isaac
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Kushal K Dey
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Karthik Jagadeesh
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Masahiro Kanai
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Boston, MA, USA
| | - Gerald F M Watts
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Zhu Zhu
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Michael B Brenner
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Andrew McDavid
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY, USA
| | - Laura T Donlin
- Hospital for Special Surgery, New York, NY, USA
- Weill Cornell Medicine, New York, NY, USA
| | - Kevin Wei
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Alkes L Price
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Soumya Raychaudhuri
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
| |
|
18
|
Lappalainen T, Li YI, Ramachandran S, Gusev A. Genetic and molecular architecture of complex traits. Cell 2024; 187:1059-1075. [PMID: 38428388 DOI: 10.1016/j.cell.2024.01.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/20/2023] [Accepted: 01/16/2024] [Indexed: 03/03/2024]
Abstract
Human genetics has emerged as one of the most dynamic areas of biology, with a broadening societal impact. In this review, we discuss recent achievements, ongoing efforts, and future challenges in the field. Advances in technology, statistical methods, and the growing scale of research efforts have all provided many insights into the processes that have given rise to the current patterns of genetic variation. Vast maps of genetic associations with human traits and diseases have allowed characterization of their genetic architecture. Finally, studies of molecular and cellular effects of genetic variants have provided insights into biological processes underlying disease. Many outstanding questions remain, but the field is well poised for groundbreaking discoveries as it increases the use of genetic data to understand both the history of our species and its applications to improve human health.
Affiliation(s)
- Tuuli Lappalainen
- New York Genome Center, New York, NY, USA; Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden.
| | - Yang I Li
- Section of Genetic Medicine, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA
| | - Sohini Ramachandran
- Ecology, Evolution and Organismal Biology, Center for Computational Molecular Biology, and the Data Science Institute, Brown University, Providence, RI 029129, USA
| | - Alexander Gusev
- Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
| |
|
19
|
LaPierre N, Pimentel H. Accounting for isoform expression increases power to identify genetic regulation of gene expression. PLoS Comput Biol 2024; 20:e1011857. [PMID: 38346082 PMCID: PMC10890775 DOI: 10.1371/journal.pcbi.1011857] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 02/23/2024] [Accepted: 01/23/2024] [Indexed: 02/25/2024] Open
Abstract
A core problem in genetics is molecular quantitative trait locus (QTL) mapping, in which genetic variants associated with changes in the molecular phenotypes are identified. One of the most-studied molecular QTL mapping problems is expression QTL (eQTL) mapping, in which the molecular phenotype is gene expression. It is common in eQTL mapping to compute gene expression by aggregating the expression levels of individual isoforms from the same gene and then performing linear regression between SNPs and this aggregated gene expression level. However, SNPs may regulate isoforms from the same gene in different directions due to alternative splicing, or only regulate the expression level of one isoform, causing this approach to lose power. Here, we examine a broader question: which genes have at least one isoform whose expression level is regulated by genetic variants? In this study, we propose and evaluate several approaches to answering this question, demonstrating that "isoform-aware" methods-those that account for the expression levels of individual isoforms-have substantially greater power to answer this question than standard "gene-level" eQTL mapping methods. We identify settings in which different approaches yield an inflated number of false discoveries or lose power. In particular, we show that calling an eGene if there is a significant association between a SNP and any isoform fails to control False Discovery Rate, even when applying standard False Discovery Rate correction. We show that similar trends are observed in real data from the GEUVADIS and GTEx studies, suggesting the possibility that similar effects are present in these consortia.
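The power loss described in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' method: the data are synthetic, the helper `ols_slope`, the two hypothetical isoforms `iso_a` and `iso_b`, the effect sizes, and the noise level are all invented for illustration. One SNP raises one isoform and lowers the other by the same amount, so the aggregated gene-level signal largely cancels.

```python
import random


def ols_slope(x, y):
    """Slope b of the least-squares line y ~ a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_samples = 300

# Hypothetical SNP dosages (0, 1 or 2 copies of the alternate allele), balanced.
genotype = [i % 3 for i in range(n_samples)]

# Assumed effects: the SNP raises isoform A and lowers isoform B equally.
iso_a = [10.0 + 1.0 * g + rng.gauss(0, 0.5) for g in genotype]
iso_b = [10.0 - 1.0 * g + rng.gauss(0, 0.5) for g in genotype]

# "Gene-level" expression as the sum of the isoform levels.
gene = [a + b for a, b in zip(iso_a, iso_b)]

slope_a = ols_slope(genotype, iso_a)
slope_b = ols_slope(genotype, iso_b)
slope_gene = ols_slope(genotype, gene)

print(f"isoform A slope: {slope_a:+.3f}")
print(f"isoform B slope: {slope_b:+.3f}")
print(f"gene-level slope: {slope_gene:+.3f}")
```

Because least-squares slopes are additive in the response, the gene-level slope equals the sum of the two isoform slopes, which is why opposite-direction effects can be invisible to a gene-level regression. The sketch does not address the false discovery rate issue the abstract also raises.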
Affiliation(s)
- Nathan LaPierre
- Department of Computer Science, University of California, Los Angeles, California, United States of America
- Department of Human Genetics, University of Chicago, Illinois, United States of America
| | - Harold Pimentel
- Department of Human Genetics, University of California, Los Angeles, California, United States of America
- Howard Hughes Medical Institute, Chevy Chase, Maryland, United States of America
- Department of Computational Medicine, University of California, Los Angeles, California, United States of America
| |
|
20
|
Ehsan N, Kotis BM, Castel SE, Song EJ, Mancuso N, Mohammadi P. Haplotype-aware modeling of cis-regulatory effects highlights the gaps remaining in eQTL data. Nat Commun 2024; 15:522. [PMID: 38225224 PMCID: PMC10789818 DOI: 10.1038/s41467-024-44710-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2022] [Accepted: 12/30/2023] [Indexed: 01/17/2024] Open
Abstract
Expression Quantitative Trait Loci (eQTLs) are critical to understanding the mechanisms underlying disease-associated genomic loci. Nearly all protein-coding genes in the human genome have been associated with one or more eQTLs. Here we introduce a multi-variant generalization of allelic Fold Change (aFC), aFC-n, to enable quantification of the cis-regulatory effects in multi-eQTL genes under the assumption that all eQTLs are known and conditionally independent. Applying aFC-n to 458,465 eQTLs in the Genotype-Tissue Expression (GTEx) project data, we demonstrate significant improvements in accuracy over the original model in estimating the eQTL effect sizes and in predicting genetically regulated gene expression over the current tools. We characterize some of the empirical properties of the eQTL data and use this framework to assess the current state of eQTL data in terms of characterizing cis-regulatory landscape in individual genomes. Notably, we show that 77.4% of the genes with an allelic imbalance in a sample show 0.5 log2 fold or more of residual imbalance after accounting for the eQTL data underlining the remaining gap in characterizing regulatory landscape in individual genomes. We further contrast this gap across tissue types, and ancestry backgrounds to identify its correlates and guide future studies.
Affiliation(s)
- Nava Ehsan
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Bence M Kotis
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Stephane E Castel
- Department of Systems Biology, Columbia University, New York, NY, USA
- New York Genome Center, New York, NY, USA
| | - Eric J Song
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Nicholas Mancuso
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, CA, USA
| | - Pejman Mohammadi
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA.
- Center for Immunity and Immunotherapies, Seattle Children's Research Institute, Seattle, WA, USA.
- Department of Pediatrics, University of Washington School of Medicine, Seattle, WA, USA.
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
| |
|
21
|
Raabe FJ, Hausruckinger A, Gagliardi M, Ahmad R, Almeida V, Galinski S, Hoffmann A, Weigert L, Rummel CK, Murek V, Trastulla L, Jimenez-Barron L, Atella A, Maidl S, Menegaz D, Hauger B, Wagner EM, Gabellini N, Kauschat B, Riccardo S, Cesana M, Papiol S, Sportelli V, Rex-Haffner M, Stolte SJ, Wehr MC, Salcedo TO, Papazova I, Detera-Wadleigh S, McMahon FJ, Schmitt A, Falkai P, Hasan A, Cacchiarelli D, Dannlowski U, Nenadić I, Kircher T, Scheuss V, Eder M, Binder EB, Spengler D, Rossner MJ, Ziller MJ. Polygenic risk for schizophrenia converges on alternative polyadenylation as molecular mechanism underlying synaptic impairment. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.09.574815. [PMID: 38260577 PMCID: PMC10802452 DOI: 10.1101/2024.01.09.574815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Schizophrenia (SCZ) is a genetically heterogeneous psychiatric disorder of highly polygenic nature. Correlative evidence from genetic studies indicates that the aggregated effects of distinct genetic risk factor combinations found in each patient converge onto common molecular mechanisms. To prove this on a functional level, we employed a reductionistic cellular model system for polygenic risk by differentiating induced pluripotent stem cells (iPSCs) from 104 individuals with high polygenic risk load and controls into cortical glutamatergic neurons (iNs). Multi-omics profiling identified widespread differences in alternative polyadenylation (APA) in the 3' untranslated region of many synaptic transcripts between iNs from SCZ patients and healthy donors. On the cellular level, 3'APA was associated with a reduction in synaptic density of iNs. Importantly, differential APA was largely conserved between postmortem human prefrontal cortex from SCZ patients and healthy donors, and strongly enriched for transcripts related to synapse biology. 3'APA was highly correlated with SCZ polygenic risk and affected genes were significantly enriched for SCZ associated common genetic variation. Integrative functional genomic analysis identified the RNA binding protein and SCZ GWAS risk gene PTBP2 as a critical trans-acting factor mediating 3'APA of synaptic genes in SCZ subjects. Functional characterization of PTBP2 in iNs confirmed its key role in 3'APA of synaptic transcripts and regulation of synapse density. Jointly, our findings show that the aggregated effects of polygenic risk converge on 3'APA as one common molecular mechanism that underlies synaptic impairments in SCZ.
Affiliation(s)
- Florian J. Raabe
- Lab for Genomics of Complex Diseases, Max Planck Institute of Psychiatry, 80804 Munich, Germany
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU Munich, 80336 Munich, Germany
- International Max Planck Research School for Translational Psychiatry (IMPRS-TP), 80804 Munich, Germany
| | - Anna Hausruckinger
- Lab for Genomics of Complex Diseases, Max Planck Institute of Psychiatry, 80804 Munich, Germany
- Department of Psychiatry, University of Münster, 48149 Münster, Germany
| | - Miriam Gagliardi
- Department of Psychiatry, University of Münster, 48149 Münster, Germany
- Center for Soft Nanoscience, University of Münster, 48149 Münster, Germany
| | - Ruhel Ahmad
- Lab for Genomics of Complex Diseases, Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Valeria Almeida
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU Munich, 80336 Munich, Germany
- Institute of Biology, University of Campinas (UNICAMP), Campinas, São Paulo, Brazil
| | - Sabrina Galinski
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU Munich, 80336 Munich, Germany
- Systasy Bioscience GmbH, 81669 Munich, Germany
| | - Anke Hoffmann
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Liesa Weigert
- Lab for Genomics of Complex Diseases, Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Christine K. Rummel
- Lab for Genomics of Complex Diseases, Max Planck Institute of Psychiatry, 80804 Munich, Germany
- International Max Planck Research School for Translational Psychiatry (IMPRS-TP), 80804 Munich, Germany
| | - Vanessa Murek
- Lab for Genomics of Complex Diseases, Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Lucia Trastulla
- Lab for Genomics of Complex Diseases, Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Laura Jimenez-Barron
- Lab for Genomics of Complex Diseases, Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Alessia Atella
- Department of Psychiatry, University of Münster, 48149 Münster, Germany
- Center for Soft Nanoscience, University of Münster, 48149 Münster, Germany
| | - Susanne Maidl
- Lab for Genomics of Complex Diseases, Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Danusa Menegaz
- Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Barbara Hauger
- Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | | | - Nadia Gabellini
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU Munich, 80336 Munich, Germany
| | - Beate Kauschat
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU Munich, 80336 Munich, Germany
| | - Sara Riccardo
- Telethon Institute of Genetics and Medicine (TIGEM), Armenise/Harvard Laboratory of Integrative Genomics, Pozzuoli, Italy
- NEGEDIA (Next Generation Diagnostic), Pozzuoli, Italy
| | - Marcella Cesana
- Telethon Institute of Genetics and Medicine (TIGEM), Armenise/Harvard Laboratory of Integrative Genomics, Pozzuoli, Italy
- Department of Advanced Biomedical Sciences, University of Naples “Federico II”, Naples, Italy
| | - Sergi Papiol
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU Munich, 80336 Munich, Germany
- Max Planck Institute of Psychiatry, 80804 Munich, Germany
- Institute of Psychiatric Phenomics and Genomics (IPPG), University Hospital, LMU Munich, 80336 Munich, Germany
| | - Vincenza Sportelli
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Monika Rex-Haffner
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Sebastian J. Stolte
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU Munich, 80336 Munich, Germany
| | - Michael C. Wehr
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU Munich, 80336 Munich, Germany
- Systasy Bioscience GmbH, 81669 Munich, Germany
| | - Tatiana Oviedo Salcedo
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU Munich, 80336 Munich, Germany
| | - Irina Papazova
- Department of Psychiatry, Psychotherapy, and Psychosomatics, Medical Faculty, University of Augsburg, 86156 Augsburg, Germany
| | - Sevilla Detera-Wadleigh
- Human Genetics Branch, National Institute of Mental Health Intramural Research Program (NIMH-IRP), Bethesda, MD, 20892, USA
| | - Francis J McMahon
- Human Genetics Branch, National Institute of Mental Health Intramural Research Program (NIMH-IRP), Bethesda, MD, 20892, USA
| | - Andrea Schmitt
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU Munich, 80336 Munich, Germany
- Laboratory of Neuroscience (LIM27), Institute of Psychiatry, University of São Paulo, São Paulo-SP 05403-903, Brazil
| | - Peter Falkai
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU Munich, 80336 Munich, Germany
- Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Alkomiet Hasan
- Department of Psychiatry, Psychotherapy, and Psychosomatics, Medical Faculty, University of Augsburg, 86156 Augsburg, Germany
| | - Davide Cacchiarelli
- Telethon Institute of Genetics and Medicine (TIGEM), Armenise/Harvard Laboratory of Integrative Genomics, Pozzuoli, Italy
- School for Advanced Studies, Genomics and Experimental Medicine Program, University of Naples “Federico II”, Naples, Italy
- Department of Translational Medicine, University of Naples “Federico II”, Naples, Italy
| | - Udo Dannlowski
- Institute for Translational Psychiatry, University of Münster, 48149 Münster, Germany
| | - Igor Nenadić
- Department of Psychiatry and Psychotherapy, Philipps-University and University Hospital Marburg, UKGM, 35039 Marburg, Germany
| | - Tilo Kircher
- Department of Psychiatry and Psychotherapy, Philipps-University and University Hospital Marburg, UKGM, 35039 Marburg, Germany
| | - Volker Scheuss
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU Munich, 80336 Munich, Germany
- MSH Medical School Hamburg, Hamburg, Germany
| | - Matthias Eder
- Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Elisabeth B. Binder
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Dietmar Spengler
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, 80804 Munich, Germany
| | - Moritz J. Rossner
- Department of Psychiatry and Psychotherapy, LMU University Hospital, LMU Munich, 80336 Munich, Germany
| | - Michael J. Ziller
- Lab for Genomics of Complex Diseases, Max Planck Institute of Psychiatry, 80804 Munich, Germany
- Department of Psychiatry, University of Münster, 48149 Münster, Germany
- Center for Soft Nanoscience, University of Münster, 48149 Münster, Germany
| |
|
22
|
Hodonsky CJ, Turner AW, Khan MD, Barrientos NB, Methorst R, Ma L, Lopez NG, Mosquera JV, Auguste G, Farber E, Ma WF, Wong D, Onengut-Gumuscu S, Kavousi M, Peyser PA, van der Laan SW, Leeper NJ, Kovacic JC, Björkegren JLM, Miller CL. Multi-ancestry genetic analysis of gene regulation in coronary arteries prioritizes disease risk loci. CELL GENOMICS 2024; 4:100465. [PMID: 38190101 PMCID: PMC10794848 DOI: 10.1016/j.xgen.2023.100465] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/17/2023] [Revised: 09/07/2023] [Accepted: 11/19/2023] [Indexed: 01/09/2024]
Abstract
Genome-wide association studies (GWASs) have identified hundreds of risk loci for coronary artery disease (CAD). However, non-European populations are underrepresented in GWASs, and the causal gene-regulatory mechanisms of these risk loci during atherosclerosis remain unclear. We incorporated local ancestry and haplotypes to identify quantitative trait loci for expression (eQTLs) and splicing (sQTLs) in coronary arteries from 138 ancestrally diverse Americans. Of 2,132 eQTL-associated genes (eGenes), 47% were previously unreported in coronary artery; 19% exhibited cell-type-specific expression. Colocalization revealed subgroups of eGenes unique to CAD and blood pressure GWAS. Fine-mapping highlighted additional eGenes, including TBX20 and IL5. We also identified sQTLs for 1,690 genes, among which TOR1AIP1 and ULK3 sQTLs demonstrated the importance of evaluating splicing to accurately identify disease-relevant isoform expression. Our work provides a patient-derived coronary artery eQTL resource and exemplifies the need for diverse study populations and multifaceted approaches to characterize gene regulation in disease processes.
Affiliation(s)
- Chani J Hodonsky
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
| | - Adam W Turner
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
| | - Mohammad Daud Khan
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
| | - Nelson B Barrientos
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA; Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Ruben Methorst
- Central Diagnostics Laboratory, Division Laboratories, Pharmacy, and Biomedical Genetics, University Medical Center Utrecht, Utrecht University, 3584 CX Utrecht, the Netherlands
| | - Lijiang Ma
- Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Nicolas G Lopez
- Division of Vascular Surgery, Department of Surgery, Stanford University, Stanford, CA 94305, USA
| | - Jose Verdezoto Mosquera
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA; Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA 22908, USA
| | - Gaëlle Auguste
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
| | - Emily Farber
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
| | - Wei Feng Ma
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA; Medical Scientist Training Program, Department of Pathology, University of Virginia, Charlottesville, VA 22908, USA
| | - Doris Wong
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA; Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA 22908, USA
| | - Suna Onengut-Gumuscu
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
| | - Maryam Kavousi
- Department of Epidemiology, Erasmus University Medical Center, 3000 CA Rotterdam, the Netherlands
| | - Patricia A Peyser
- Department of Epidemiology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Sander W van der Laan
- Central Diagnostics Laboratory, Division Laboratories, Pharmacy, and Biomedical Genetics, University Medical Center Utrecht, Utrecht University, 3584 CX Utrecht, the Netherlands
| | - Nicholas J Leeper
- Division of Vascular Surgery, Department of Surgery, Stanford University, Stanford, CA 94305, USA
| | - Jason C Kovacic
- Cardiovascular Research Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Victor Chang Cardiac Research Institute, Darlinghurst, NSW 2010, Australia; St. Vincent's Clinical School, University of New South Wales, Sydney, NSW 2052, Australia
| | - Johan L M Björkegren
- Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Medicine, Huddinge, Karolinska Institutet, 141 52 Huddinge, Sweden
| | - Clint L Miller
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA; Division of Vascular Surgery, Department of Surgery, Stanford University, Stanford, CA 94305, USA; Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA.
| |
|
23
|
Barnett KR, Mobley RJ, Diedrich JD, Bergeron BP, Bhattarai KR, Monovich AC, Narina S, Yang W, Crews KR, Manring CS, Jabbour E, Paietta E, Litzow MR, Kornblau SM, Stock W, Inaba H, Jeha S, Pui CH, Mullighan CG, Relling MV, Pruett-Miller SM, Ryan RJH, Yang JJ, Evans WE, Savic D. Epigenomic mapping reveals distinct B cell acute lymphoblastic leukemia chromatin architectures and regulators. CELL GENOMICS 2023; 3:100442. [PMID: 38116118 PMCID: PMC10726428 DOI: 10.1016/j.xgen.2023.100442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Revised: 08/30/2023] [Accepted: 10/20/2023] [Indexed: 12/21/2023]
Abstract
B cell lineage acute lymphoblastic leukemia (B-ALL) is composed of diverse molecular subtypes, and while transcriptional and DNA methylation profiling has been extensively examined, the chromatin landscape is not well characterized for many subtypes. We therefore mapped chromatin accessibility using ATAC-seq in primary B-ALL cells from 156 patients spanning ten molecular subtypes and present this dataset as a resource. Differential chromatin accessibility and transcription factor (TF) footprint profiling were employed and identified B-ALL cell of origin, TF-target gene interactions enriched in B-ALL, and key TFs associated with accessible chromatin sites preferentially active in B-ALL. We further identified over 20% of accessible chromatin sites exhibiting strong subtype enrichment and candidate TFs that maintain subtype-specific chromatin architectures. Over 9,000 genetic variants were uncovered, contributing to variability in chromatin accessibility among patient samples. Our data suggest that distinct chromatin architectures are driven by diverse TFs and inherited genetic variants that promote unique gene-regulatory networks.
Affiliation(s)
- Kelly R Barnett
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Robert J Mobley
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Jonathan D Diedrich
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Brennan P Bergeron
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Graduate School of Biomedical Sciences, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Kashi Raj Bhattarai
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Alexander C Monovich
- Department of Pathology, University of Michigan-Ann Arbor, Rogel Cancer Center, Ann Arbor, MI 48109, USA
| | - Shilpa Narina
- Center for Advanced Genome Engineering, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Wenjian Yang
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Kristine R Crews
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Christopher S Manring
- Alliance Hematologic Malignancy Biorepository, Clara D. Bloomfield Center for Leukemia Outcomes Research, Columbus, OH 43210, USA
| | - Elias Jabbour
- Department of Leukemia, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA
| | - Elisabeth Paietta
- Department of Oncology, Montefiore Medical Center, Bronx, NY 10467, USA
| | - Mark R Litzow
- Division of Hematology, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
| | - Steven M Kornblau
- Department of Leukemia, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA
| | - Wendy Stock
- University of Chicago Comprehensive Cancer Center, Chicago, IL 60637, USA
| | - Hiroto Inaba
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Sima Jeha
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Ching-Hon Pui
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Charles G Mullighan
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Mary V Relling
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Shondra M Pruett-Miller
- Center for Advanced Genome Engineering, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Cell and Molecular Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Russell J H Ryan
- Department of Pathology, University of Michigan-Ann Arbor, Rogel Cancer Center, Ann Arbor, MI 48109, USA
| | - Jun J Yang
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Graduate School of Biomedical Sciences, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Integrated Biomedical Sciences Program, University of Tennessee Health Science Center, Memphis, TN 38105, USA
| | - William E Evans
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Daniel Savic
- Hematological Malignancies Program, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Graduate School of Biomedical Sciences, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Integrated Biomedical Sciences Program, University of Tennessee Health Science Center, Memphis, TN 38105, USA.
| |
|
24
|
Zhang J, Zhao H. eQTL studies: from bulk tissues to single cells. J Genet Genomics 2023; 50:925-933. [PMID: 37207929 PMCID: PMC10656365 DOI: 10.1016/j.jgg.2023.05.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/21/2023] [Revised: 05/02/2023] [Accepted: 05/04/2023] [Indexed: 05/21/2023]
Abstract
An expression quantitative trait locus (eQTL) is a chromosomal region where genetic variants are associated with the expression levels of specific genes, which can be either nearby or distant. The identification of eQTLs for different tissues, cell types, and contexts has led to a better understanding of the dynamic regulation of gene expression and of the implications of functional genes and variants for complex traits and diseases. Although most eQTL studies have been performed on data collected from bulk tissues, recent studies have demonstrated the importance of cell-type-specific and context-dependent gene regulation in biological processes and disease mechanisms. In this review, we discuss statistical methods that have been developed to enable the detection of cell-type-specific and context-dependent eQTLs from bulk tissues, purified cell types, and single cells. We also discuss the limitations of the current methods and future research opportunities.
Affiliation(s)
- Jingfei Zhang
- Information Systems and Operations Management, Emory University, Atlanta, GA 30322, USA
| | - Hongyu Zhao
- Department of Biostatistics, Yale School of Public Health, New Haven, CT 208034, USA.
| |
|
25
|
Wang J, Cheng X, Liang Q, Owen LA, Lu J, Zheng Y, Wang M, Chen S, DeAngelis MM, Li Y, Chen R. Single-cell multiomics of the human retina reveals hierarchical transcription factor collaboration in mediating cell type-specific effects of genetic variants on gene regulation. Genome Biol 2023; 24:269. [PMID: 38012720 PMCID: PMC10680294 DOI: 10.1186/s13059-023-03111-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Accepted: 11/15/2023] [Indexed: 11/29/2023] Open
Abstract
BACKGROUND: Systematic characterization of how genetic variation modulates gene regulation in a cell type-specific context is essential for understanding complex traits. To address this question, we profile gene expression and chromatin accessibility in cells from healthy retinae of 20 human donors through single-cell multiomics and genomic sequencing. RESULTS: We map eQTL, caQTL, allele-specific expression, and allele-specific chromatin accessibility in major retinal cell types. By integrating these results, we identify and characterize regulatory elements and genetic variants that affect gene regulation in individual cell types. The majority of identified sc-eQTLs and sc-caQTLs display cell type-specific effects, while the cis-elements containing genetic variants with cell type-specific effects are often accessible in multiple cell types. Furthermore, the transcription factors whose binding sites are perturbed by genetic variants tend to have higher expression levels in the cell types where the variants exert their effects, compared to the cell types where the variants have no impact. We further validate our findings with high-throughput reporter assays. Lastly, we identify the enriched cell types, candidate causal variants and genes, and cell type-specific regulatory mechanisms underlying GWAS loci. CONCLUSIONS: Overall, genetic effects on gene regulation are highly context dependent. Our results suggest that cell type-dependent genetic effects are driven by precise modulation of both trans-factor expression and chromatin accessibility of cis-elements. Our findings indicate that hierarchical collaboration among transcription factors plays a crucial role in mediating cell type-specific effects of genetic variants on gene regulation.
Affiliation(s)
- Jun Wang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Xuesen Cheng
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Qingnan Liang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, USA
| | - Leah A Owen
- Department of Ophthalmology and Visual Sciences, John A. Moran Eye Center, University of Utah, Salt Lake City, UT, USA
| | - Jiaxiong Lu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, USA
| | - Yiqiao Zheng
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, MO, USA
| | - Meng Wang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Shiming Chen
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, MO, USA
- Department of Developmental Biology, Washington University in St Louis, Saint Louis, MO, USA
| | - Margaret M DeAngelis
- Department of Ophthalmology, University at Buffalo the State University of New York, Buffalo, NY, USA
| | - Yumei Li
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Rui Chen
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
| |
|
26
|
Rumker L, Sakaue S, Reshef Y, Kang JB, Yazar S, Alquicira-Hernandez J, Valencia C, Lagattuta KA, Mah-Som A, Nathan A, Powell JE, Loh PR, Raychaudhuri S. Identifying genetic variants that influence the abundance of cell states in single-cell data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.11.13.566919. [PMID: 38014313 PMCID: PMC10680752 DOI: 10.1101/2023.11.13.566919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Introductory Paragraph: To understand genetic mechanisms driving disease, it is essential but difficult to map how risk alleles affect the composition of cells present in the body. Single-cell profiling quantifies granular information about tissues, but variant-associated cell states may reflect diverse combinations of the profiled cell features that are challenging to predefine. We introduce GeNA (Genotype-Neighborhood Associations), a statistical tool to identify cell state abundance quantitative trait loci (csaQTLs) in high-dimensional single-cell datasets. Instead of testing associations to predefined cell states, GeNA flexibly identifies the cell states whose abundance is most associated with genetic variants. In a genome-wide survey of scRNA-seq peripheral blood profiling from 969 individuals,¹ GeNA identifies five independent loci associated with shifts in the relative abundance of immune cell states. For example, rs3003-T (p = 1.96 × 10⁻¹¹) associates with increased abundance of NK cells expressing TNF-α response programs. This csaQTL colocalizes with increased risk for psoriasis, an autoimmune disease that responds to anti-TNF treatments. Flexibly characterizing csaQTLs for granular cell states may help illuminate how genetic background alters cellular composition to confer disease risk.
Affiliation(s)
- Laurie Rumker
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Saori Sakaue
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Yakir Reshef
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Joyce B. Kang
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Seyhan Yazar
- Translational Genomics, Garvan Institute of Medical Research, Sydney, Australia
- UNSW Cellular Genomics Futures Institute, University of New South Wales, Sydney, Australia
| | - Jose Alquicira-Hernandez
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Cristian Valencia
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Kaitlyn A Lagattuta
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Annelise Mah-Som
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
| | - Aparna Nathan
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Joseph E. Powell
- Translational Genomics, Garvan Institute of Medical Research, Sydney, Australia
- UNSW Cellular Genomics Futures Institute, University of New South Wales, Sydney, Australia
| | - Po-Ru Loh
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Soumya Raychaudhuri
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| |
|
27
|
Zhu Z, Chen X, Zhang S, Yu R, Qi C, Cheng L, Zhang X. Leveraging molecular quantitative trait loci to comprehend complex diseases/traits from the omics perspective. Hum Genet 2023; 142:1543-1560. [PMID: 37755483 DOI: 10.1007/s00439-023-02602-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/15/2023] [Accepted: 09/14/2023] [Indexed: 09/28/2023]
Abstract
Comprehending the molecular basis of quantitative genetic variation is a principal goal for complex diseases or traits. Molecular quantitative trait loci (molQTLs) have made it possible to investigate the effects of genetic variants hiding behind large-scale omics data. A deeper understanding of molQTL is urgently required in light of the multi-dimensionalization of omics data to more fully elucidate the pertinent biological mechanisms. Herein, we reviewed molQTLs with the corresponding resource from the omics perspective and further discussed the integrative strategy of GWAS-molQTL to infer their causal effects. Subsequently, we described the opportunities and challenges encountered by molQTL. The case studies showed that molQTL is essential for complex diseases and traits, whether single- or multi-omics QTLs. Overall, we highlighted the functional significance of genetic variants to employ the discovery of molQTL in complex diseases and traits.
Affiliation(s)
- Zijun Zhu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Xinyu Chen
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Sainan Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Rui Yu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Changlu Qi
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China.
- NHC Key Laboratory of Molecular Probe and Targeted Diagnosis and Therapy, Harbin Medical University, Harbin, 150028, Heilongjiang, China.
| | - Xue Zhang
- NHC Key Laboratory of Molecular Probe and Targeted Diagnosis and Therapy, Harbin Medical University, Harbin, 150028, Heilongjiang, China
- McKusick-Zhang Center for Genetic Medicine, State Key Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100005, China
| |
|
28
|
Yoon JH, Kim S. Learning gene networks under SNP perturbation using SNP and allele-specific expression data. bioRxiv 2023:2023.10.23.563661. [PMID: 37961468 PMCID: PMC10634764 DOI: 10.1101/2023.10.23.563661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Allele-specific expression quantification from RNA-seq reads provides opportunities to study the control of gene regulatory networks by cis-acting and trans-acting genetic variants. Many existing methods performed a single-gene and single-SNP association analysis to identify expression quantitative trait loci (eQTLs), and placed the eQTLs against known gene networks for functional interpretation. Instead, we view eQTL data as a capture of the effects of perturbation of gene regulatory system by a large number of genetic variants and reconstruct a gene network perturbed by eQTLs. We introduce a statistical framework called CiTruss for simultaneously learning a gene network and cis-acting and trans-acting eQTLs that perturb this network, given population allele-specific expression and SNP data. CiTruss uses a multi-level conditional Gaussian graphical model to model trans-acting eQTLs perturbing the expression of both alleles in gene network at the top level and cis-acting eQTLs perturbing the expression of each allele at the bottom level. We derive a transformation of this model that allows efficient learning for large-scale human data. Our analysis of the GTEx and LG×SM advanced intercross line mouse data for multiple tissue types with CiTruss provides new insights into genetics of gene regulation. CiTruss revealed that gene networks consist of local subnetworks over proximally located genes and global subnetworks over genes scattered across genome, and that several aspects of gene regulation by eQTLs such as the impact of genetic diversity, pleiotropy, tissue-specific gene regulation, and local and long-range linkage disequilibrium among eQTLs can be explained through these local and global subnetworks.
Affiliation(s)
- Jun Ho Yoon
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America
| | | |
|
29
|
Qi G, Strober BJ, Popp JM, Keener R, Ji H, Battle A. Single-cell allele-specific expression analysis reveals dynamic and cell-type-specific regulatory effects. Nat Commun 2023; 14:6317. [PMID: 37813843 PMCID: PMC10562474 DOI: 10.1038/s41467-023-42016-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 09/27/2023] [Indexed: 10/11/2023] Open
Abstract
Differential allele-specific expression (ASE) is a powerful tool to study context-specific cis-regulation of gene expression. Such effects can reflect the interaction between genetic or epigenetic factors and a measured context or condition. Single-cell RNA sequencing (scRNA-seq) allows the measurement of ASE at individual-cell resolution, but there is a lack of statistical methods to analyze such data. We present Differential Allelic Expression using Single-Cell data (DAESC), a powerful method for differential ASE analysis using scRNA-seq from multiple individuals, with statistical behavior confirmed through simulation. DAESC accounts for non-independence between cells from the same individual and incorporates implicit haplotype phasing. Application to data from 105 induced pluripotent stem cell (iPSC) lines identifies 657 genes dynamically regulated during endoderm differentiation, with enrichment for changes in chromatin state. Application to a type-2 diabetes dataset identifies several differentially regulated genes between patients and controls in pancreatic endocrine cells. DAESC is a powerful method for single-cell ASE analysis and can uncover novel insights on gene regulation.
Affiliation(s)
- Guanghao Qi
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
- Department of Biostatistics, University of Washington, Seattle, WA, 98195, USA
| | - Benjamin J Strober
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
| | - Joshua M Popp
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
| | - Rebecca Keener
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
| | - Hongkai Ji
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
| | - Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA.
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, 21218, USA.
- Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, 21205, USA.
| |
|
30
|
Fazel-Najafabadi M, Looger LL, Reddy-Rallabandi H, Nath SK. A multilayered post-GWAS analysis pipeline defines functional variants and target genes for systemic lupus erythematosus (SLE). medRxiv 2023:2023.04.07.23288295. [PMID: 37066327 PMCID: PMC10104240 DOI: 10.1101/2023.04.07.23288295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2023]
Abstract
Objectives: Systemic lupus erythematosus (SLE), an autoimmune disease with incompletely understood etiology, has a strong genetic component. Although genome-wide association studies (GWAS) have revealed multiple SLE susceptibility loci and associated single nucleotide polymorphisms (SNPs), the precise causal variants, target genes, cell types, tissues, and mechanisms of action remain largely unknown. Methods: Here, we report a comprehensive post-GWAS analysis using extensive bioinformatics, molecular modeling, and integrative functional genomic and epigenomic analyses to optimize fine-mapping. We compile and cross-reference immune cell-specific expression quantitative trait loci (cis- and trans-eQTLs) with promoter-capture Hi-C, allele-specific chromatin accessibility, and massively parallel reporter assay data to define predisposing variants and target genes. We experimentally validate a predicted locus using CRISPR/Cas9 genome editing, qPCR, and Western blot. Results: Anchoring on 452 index SNPs, we selected 9,931 high-linkage disequilibrium (r² > 0.8) SNPs and defined 182 independent non-HLA SLE loci. 3,746 SNPs from 143 loci were identified as regulating 564 unique genes. Target genes are enriched in lupus-related tissues and associated with other autoimmune diseases. Of these, 329 SNPs (106 loci) showed significant allele-specific chromatin accessibility and/or enhancer activity, indicating regulatory potential. Using CRISPR/Cas9, we validated rs57668933 as a functional variant regulating multiple targets, including SLE risk gene ELF1, in B-cells. Conclusion: We demonstrate and validate post-GWAS strategies for utilizing multi-dimensional data to prioritize likely causal variants with cognate gene targets underlying SLE pathogenesis. Our results provide a catalog of significantly SLE-associated SNPs and loci, target genes, and likely biochemical mechanisms, to guide experimental characterization.
|
31
|
Gaulton KJ, Preissl S, Ren B. Interpreting non-coding disease-associated human variants using single-cell epigenomics. Nat Rev Genet 2023; 24:516-534. [PMID: 37161089 PMCID: PMC10629587 DOI: 10.1038/s41576-023-00598-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Accepted: 03/27/2023] [Indexed: 05/11/2023]
Abstract
Genome-wide association studies (GWAS) have linked hundreds of thousands of sequence variants in the human genome to common traits and diseases. However, translating this knowledge into a mechanistic understanding of disease-relevant biology remains challenging, largely because such variants are predominantly in non-protein-coding sequences that still lack functional annotation at cell-type resolution. Recent advances in single-cell epigenomics assays have enabled the generation of cell type-, subtype- and state-resolved maps of the epigenome in heterogeneous human tissues. These maps have facilitated cell type-specific annotation of candidate cis-regulatory elements and their gene targets in the human genome, enhancing our ability to interpret the genetic basis of common traits and diseases.
Affiliation(s)
- Kyle J Gaulton
- Department of Paediatrics, Paediatric Diabetes Research Center, University of California San Diego School of Medicine, La Jolla, CA, USA.
| | - Sebastian Preissl
- Center for Epigenomics, University of California San Diego School of Medicine, La Jolla, CA, USA.
- Institute of Experimental and Clinical Pharmacology and Toxicology, Faculty of Medicine, University of Freiburg, Freiburg, Germany.
| | - Bing Ren
- Center for Epigenomics, University of California San Diego School of Medicine, La Jolla, CA, USA.
- Department of Cellular and Molecular Medicine, University of California San Diego School of Medicine, La Jolla, CA, USA.
- Ludwig Institute for Cancer Research, La Jolla, CA, USA.
| |
|
32
|
Singh PP, Benayoun BA. Considerations for reproducible omics in aging research. Nat Aging 2023; 3:921-930. [PMID: 37386258 PMCID: PMC10527412 DOI: 10.1038/s43587-023-00448-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Accepted: 06/01/2023] [Indexed: 07/01/2023]
Abstract
Technical advancements over the past two decades have enabled the measurement of the panoply of molecules of cells and tissues including transcriptomes, epigenomes, metabolomes and proteomes at unprecedented resolution. Unbiased profiling of these molecular landscapes in the context of aging can reveal important details about mechanisms underlying age-related functional decline and age-related diseases. However, the high-throughput nature of these experiments creates unique analytical and design demands for robustness and reproducibility. In addition, 'omic' experiments are generally onerous, making it crucial to effectively design them to eliminate as many spurious sources of variation as possible as well as account for any biological or technical parameter that may influence such measures. In this Perspective, we provide general guidelines on best practices in the design and analysis of omic experiments in aging research from experimental design to data analysis and considerations for long-term reproducibility and validation of such studies.
Affiliation(s)
- Param Priya Singh
- Department of Anatomy, University of California, San Francisco, San Francisco, CA, USA.
- Bakar Aging Research Institute, University of California, San Francisco, San Francisco, CA, USA.
| | - Bérénice A Benayoun
- Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA.
- Molecular and Computational Biology Department, USC Dornsife College of Letters, Arts and Sciences, Los Angeles, CA, USA.
- Biochemistry and Molecular Medicine Department, USC Keck School of Medicine, Los Angeles, CA, USA.
- Epigenetics and Gene Regulation, USC Norris Comprehensive Cancer Center, Los Angeles, CA, USA.
- USC Stem Cell Initiative, Los Angeles, CA, USA.
| |
|
33
|
Barnett KR, Mobley RJ, Diedrich JD, Bergeron BP, Bhattarai KR, Yang W, Crews KR, Manring CS, Jabbour E, Paietta E, Litzow MR, Kornblau SM, Stock W, Inaba H, Jeha S, Pui CH, Mullighan CG, Relling MV, Yang JJ, Evans WE, Savic D. Epigenomic mapping in B-cell acute lymphoblastic leukemia identifies transcriptional regulators and noncoding variants promoting distinct chromatin architectures. bioRxiv 2023:2023.02.14.528493. [PMID: 36824825 PMCID: PMC9949063 DOI: 10.1101/2023.02.14.528493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2023]
Abstract
B-cell lineage acute lymphoblastic leukemia (B-ALL) is comprised of diverse molecular subtypes and while transcriptional and DNA methylation profiling of B-ALL subtypes has been extensively examined, the accompanying chromatin landscape is not well characterized for many subtypes. We therefore mapped chromatin accessibility using ATAC-seq for 10 B-ALL molecular subtypes in primary ALL cells from 154 patients. Comparisons with B-cell progenitors identified candidate B-ALL cell-of-origin and AP-1-associated cis-regulatory rewiring in B-ALL. Cis-regulatory rewiring promoted B-ALL-specific gene regulatory networks impacting oncogenic signaling pathways that perturb normal B-cell development. We also identified that over 20% of B-ALL accessible chromatin sites exhibit strong subtype enrichment, with transcription factor (TF) footprint profiling identifying candidate TFs that maintain subtype-specific chromatin architectures. Over 9000 inherited genetic variants were further uncovered that contribute to variability in chromatin accessibility among individual patient samples. Overall, our data suggest that distinct chromatin architectures are driven by diverse TFs and inherited genetic variants which promote unique gene regulatory networks that contribute to transcriptional differences among B-ALL subtypes.
Affiliation(s)
- Kelly R. Barnett
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Robert J. Mobley
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Jonathan D. Diedrich
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Brennan P. Bergeron
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Graduate School of Biomedical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Kashi Raj Bhattarai
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Wenjian Yang
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Kristine R. Crews
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Christopher S. Manring
- Alliance Hematologic Malignancy Biorepository; Clara D. Bloomfield Center for Leukemia Outcomes Research, Columbus, OH 43210, USA
| | - Elias Jabbour
- Department of Leukemia, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA
| | - Elisabeth Paietta
- Department of Oncology, Montefiore Medical Center, Bronx, NY 10467, USA
| | - Mark R. Litzow
- Division of Hematology, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
| | - Steven M. Kornblau
- Department of Leukemia, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA
| | - Wendy Stock
- University of Chicago Comprehensive Cancer Center, Chicago, IL 60637, USA
| | - Hiroto Inaba
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Sima Jeha
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Ching-Hon Pui
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Charles G. Mullighan
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Pathology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Mary V. Relling
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Jun J. Yang
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Graduate School of Biomedical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Integrated Biomedical Sciences Program, University of Tennessee Health Science Center, Memphis, TN 38105, USA
| | - William E. Evans
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Daniel Savic
- Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Graduate School of Biomedical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Integrated Biomedical Sciences Program, University of Tennessee Health Science Center, Memphis, TN 38105, USA
| |
|
34
|
Jeong R, Bulyk ML. Blood cell traits' GWAS loci colocalization with variation in PU.1 genomic occupancy prioritizes causal noncoding regulatory variants. Cell Genom 2023; 3:100327. [PMID: 37492098 PMCID: PMC10363807 DOI: 10.1016/j.xgen.2023.100327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 02/10/2023] [Accepted: 04/25/2023] [Indexed: 07/27/2023]
Abstract
Genome-wide association studies (GWASs) have uncovered numerous trait-associated loci across the human genome, most of which are located in noncoding regions, making interpretation difficult. Moreover, causal variants are hard to statistically fine-map at many loci because of widespread linkage disequilibrium. To address this challenge, we present a strategy utilizing transcription factor (TF) binding quantitative trait loci (bQTLs) for colocalization analysis to identify trait associations likely mediated by TF occupancy variation and to pinpoint likely causal variants using motif scores. We applied this approach to PU.1 bQTLs in lymphoblastoid cell lines and blood cell trait GWAS data. Colocalization analysis revealed 69 blood cell trait GWAS loci putatively driven by PU.1 occupancy variation. We nominate PU.1 motif-altering variants as the likely shared causal variants at 51 loci. Such integration of TF bQTL data with other GWAS data may reveal transcriptional regulatory mechanisms and causal noncoding variants underlying additional complex traits.
Affiliation(s)
- Raehoon Jeong
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Bioinformatics and Integrative Genomics Graduate Program, Harvard University, Cambridge, MA 02138, USA
| | - Martha L. Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Bioinformatics and Integrative Genomics Graduate Program, Harvard University, Cambridge, MA 02138, USA
- Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
| |
|
35
|
Wu EY, Singh NP, Choi K, Zakeri M, Vincent M, Churchill GA, Ackert-Bicknell CL, Patro R, Love MI. SEESAW: detecting isoform-level allelic imbalance accounting for inferential uncertainty. Genome Biol 2023; 24:165. [PMID: 37438847 PMCID: PMC10337143 DOI: 10.1186/s13059-023-03003-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/01/2022] [Accepted: 06/29/2023] [Indexed: 07/14/2023] Open
Abstract
Detecting allelic imbalance at the isoform level requires accounting for inferential uncertainty, caused by multi-mapping of RNA-seq reads. Our proposed method, SEESAW, uses Salmon and Swish to offer analysis at various levels of resolution, including gene, isoform, and aggregating isoforms to groups by transcription start site. The aggregation strategies strengthen the signal for transcripts with high uncertainty. The SEESAW suite of methods is shown to have higher power than other allelic imbalance methods when there is isoform-level allelic imbalance. We also introduce a new test for detecting imbalance that varies across a covariate, such as time.
Affiliation(s)
- Euphy Y Wu
- Department of Biostatistics, University of North Carolina-Chapel Hill, Chapel Hill, NC, USA
| | - Noor P Singh
- Department of Computer Science, University of Maryland, College Park, MD, USA
| | | | - Mohsen Zakeri
- Department of Computer Science, University of Maryland, College Park, MD, USA
| | | | | | - Cheryl L Ackert-Bicknell
- Department of Orthopedics, School of Medicine, University of Colorado, Anschutz Campus, Aurora, CO, USA
| | - Rob Patro
- Department of Computer Science, University of Maryland, College Park, MD, USA
| | - Michael I Love
- Department of Biostatistics, University of North Carolina-Chapel Hill, Chapel Hill, NC, USA.
- Department of Genetics, University of North Carolina-Chapel Hill, Chapel Hill, NC, USA.
| |
|
36
|
Cooper S, Schwartzentruber J, Coomber EL, Wu Q, Bassett A. Screening for functional regulatory variants in open chromatin using GenIE-ATAC. Nucleic Acids Res 2023; 51:e64. [PMID: 37125635 PMCID: PMC10287956 DOI: 10.1093/nar/gkad332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Revised: 03/29/2023] [Accepted: 04/20/2023] [Indexed: 05/02/2023] Open
Abstract
Understanding the effects of genetic variation in gene regulatory elements is crucial to interpreting genome function. This is particularly pertinent for the hundreds of thousands of disease-associated variants identified by GWAS, which frequently sit within gene regulatory elements but whose functional effects are often unknown. Current methods are limited in their scalability and ability to assay regulatory variants in their endogenous context, independently of other tightly linked variants. Here, we present a new medium-throughput screening system: genome engineering based interrogation of enhancers assay for transposase accessible chromatin (GenIE-ATAC), that measures the effect of individual variants on chromatin accessibility in their endogenous genomic and chromatin context. We employ this assay to screen for the effects of regulatory variants in human induced pluripotent stem cells, validating a subset of causal variants, and extend our software package (rgenie) to analyse these new data. We demonstrate that this methodology can be used to understand the impact of defined deletions and point mutations within transcription factor binding sites. We thus establish GenIE-ATAC as a method to screen for the effect of gene regulatory element variation, allowing identification and prioritisation of causal variants from GWAS for functional follow-up and understanding the mechanisms of regulatory element function.
Affiliation(s)
- Sarah Cooper
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- OpenTargets, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Jeremy Schwartzentruber
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- OpenTargets, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Eve L Coomber
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Qianxin Wu
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Andrew Bassett
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- OpenTargets, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| |
|
37
|
Benaglio P, Newsome J, Han JY, Chiou J, Aylward A, Corban S, Miller M, Okino ML, Kaur J, Preissl S, Gorkin DU, Gaulton KJ. Mapping genetic effects on cell type-specific chromatin accessibility and annotating complex immune trait variants using single nucleus ATAC-seq in peripheral blood. PLoS Genet 2023; 19:e1010759. [PMID: 37289818 DOI: 10.1371/journal.pgen.1010759] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/17/2022] [Accepted: 04/25/2023] [Indexed: 06/10/2023] Open
Abstract
Gene regulation is highly cell type-specific and understanding the function of non-coding genetic variants associated with complex traits requires molecular phenotyping at cell type resolution. In this study we performed single nucleus ATAC-seq (snATAC-seq) and genotyping in peripheral blood mononuclear cells from 13 individuals. Clustering chromatin accessibility profiles of 96,002 total nuclei identified 17 immune cell types and sub-types. We mapped chromatin accessibility QTLs (caQTLs) in each immune cell type and sub-type using individuals of European ancestry which identified 6,901 caQTLs at FDR < .10 and 4,220 caQTLs at FDR < .05, including those obscured from assays of bulk tissue such as with divergent effects on different cell types. For 3,941 caQTLs we further annotated putative target genes of variant activity using single cell co-accessibility, and caQTL variants were significantly correlated with the accessibility level of linked gene promoters. We fine-mapped loci associated with 16 complex immune traits and identified immune cell caQTLs at 622 candidate causal variants, including those with cell type-specific effects. At the 6q15 locus associated with type 1 diabetes, in line with previous reports, variant rs72928038 was a naïve CD4+ T cell caQTL linked to BACH2 and we validated the allelic effects of this variant on regulatory activity in Jurkat T cells. These results highlight the utility of snATAC-seq for mapping genetic effects on accessible chromatin in specific cell types.
Affiliation(s)
- Paola Benaglio
- Department of Pediatrics, University of California San Diego, San Diego, California, United States of America
| | - Jacklyn Newsome
- Bioinformatics and Systems Biology Program, University of California San Diego, San Diego, California, United States of America
| | - Jee Yun Han
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California San Diego, San Diego, California, United States of America
| | - Joshua Chiou
- Biomedical Sciences Graduate Program, University of California San Diego, San Diego, California, United States of America
| | - Anthony Aylward
- Bioinformatics and Systems Biology Program, University of California San Diego, San Diego, California, United States of America
| | - Sierra Corban
- Department of Pediatrics, University of California San Diego, San Diego, California, United States of America
| | - Michael Miller
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California San Diego, San Diego, California, United States of America
| | - Mei-Lin Okino
- Department of Pediatrics, University of California San Diego, San Diego, California, United States of America
| | - Jaspreet Kaur
- Department of Pediatrics, University of California San Diego, San Diego, California, United States of America
| | - Sebastian Preissl
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California San Diego, San Diego, California, United States of America
| | - David U Gorkin
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California San Diego, San Diego, California, United States of America
| | - Kyle J Gaulton
- Department of Pediatrics, University of California San Diego, San Diego, California, United States of America
| |
|
38
|
Kumasaka N, Rostom R, Huang N, Polanski K, Meyer KB, Patel S, Boyd R, Gomez C, Barnett SN, Panousis NI, Schwartzentruber J, Ghoussaini M, Lyons PA, Calero-Nieto FJ, Göttgens B, Barnes JL, Worlock KB, Yoshida M, Nikolić MZ, Stephenson E, Reynolds G, Haniffa M, Marioni JC, Stegle O, Hagai T, Teichmann SA. Mapping interindividual dynamics of innate immune response at single-cell resolution. Nat Genet 2023; 55:1066-1075. [PMID: 37308670 PMCID: PMC10260404 DOI: 10.1038/s41588-023-01421-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2021] [Accepted: 04/27/2023] [Indexed: 06/14/2023]
Abstract
Common genetic variants across individuals modulate the cellular response to pathogens and are implicated in diverse immune pathologies, yet how they dynamically alter the response upon infection is not well understood. Here, we triggered antiviral responses in human fibroblasts from 68 healthy donors, and profiled tens of thousands of cells using single-cell RNA-sequencing. We developed GASPACHO (GAuSsian Processes for Association mapping leveraging Cell HeterOgeneity), a statistical approach designed to identify nonlinear dynamic genetic effects across transcriptional trajectories of cells. This approach identified 1,275 expression quantitative trait loci (local false discovery rate 10%) that manifested during the responses, many of which were colocalized with susceptibility loci identified by genome-wide association studies of infectious and autoimmune diseases, including the OAS1 splicing quantitative trait locus in a COVID-19 susceptibility locus. In summary, our analytical approach provides a unique framework for delineation of the genetic variants that shape a wide spectrum of transcriptional responses at single-cell resolution.
Affiliation(s)
- Natsuhiko Kumasaka
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Medical Support Center of Japan Environment and Children's Study (JECS), National Center for Child Health and Development, Tokyo, Japan
| | - Raghd Rostom
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Ni Huang
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
| | | | - Kerstin B Meyer
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
| | - Sharad Patel
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
| | - Rachel Boyd
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
| | - Celine Gomez
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
| | - Sam N Barnett
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
| | | | - Jeremy Schwartzentruber
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Open Targets, Wellcome Genome Campus, Hinxton, UK
| | - Maya Ghoussaini
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Open Targets, Wellcome Genome Campus, Hinxton, UK
| | - Paul A Lyons
- Cambridge Institute of Therapeutic Immunology and Infectious Disease, Jeffrey Cheah Biomedical Centre, Cambridge, UK
- Department of Medicine, University of Cambridge, Cambridge, UK
| | | | - Berthold Göttgens
- Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
| | - Josephine L Barnes
- UCL Respiratory, Division of Medicine, University College London, London, UK
| | - Kaylee B Worlock
- UCL Respiratory, Division of Medicine, University College London, London, UK
| | - Masahiro Yoshida
- UCL Respiratory, Division of Medicine, University College London, London, UK
| | - Marko Z Nikolić
- UCL Respiratory, Division of Medicine, University College London, London, UK
- University College London Hospitals NHS Foundation Trust, London, UK
| | - Emily Stephenson
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Biosciences Institute, Newcastle University, Newcastle upon Tyne, UK
| | - Gary Reynolds
- Biosciences Institute, Newcastle University, Newcastle upon Tyne, UK
| | - Muzlifah Haniffa
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- Biosciences Institute, Newcastle University, Newcastle upon Tyne, UK
- NIHR Newcastle Biomedical Research Centre, Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
- Department of Dermatology, Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
| | - John C Marioni
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
| | - Oliver Stegle
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center, Heidelberg, Germany
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| | - Tzachi Hagai
- Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel.
| | - Sarah A Teichmann
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK.
- Theory of Condensed Matter Group, Cavendish Laboratory/Department of Physics, University of Cambridge, Cambridge, UK.
| |
|
39
|
Huynh K, Smith BR, Macdonald SJ, Long AD. Genetic variation in chromatin state across multiple tissues in Drosophila melanogaster. PLoS Genet 2023; 19:e1010439. [PMID: 37146087 PMCID: PMC10191298 DOI: 10.1371/journal.pgen.1010439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 05/17/2023] [Accepted: 04/20/2023] [Indexed: 05/07/2023] Open
Abstract
We use ATAC-seq to examine chromatin accessibility for four different tissues in Drosophila melanogaster: adult female brain, ovaries, and both wing and eye-antennal imaginal discs from males. Each tissue is assayed in eight different inbred strain genetic backgrounds, seven associated with a reference quality genome assembly. We develop a method for the quantile normalization of ATAC-seq fragments and test for differences in coverage among genotypes, tissues, and their interaction at 44099 peaks throughout the euchromatic genome. For the strains with reference quality genome assemblies, we correct ATAC-seq profiles for read mis-mapping due to nearby polymorphic structural variants (SVs). Comparing coverage among genotypes without accounting for SVs results in a highly elevated rate (55%) of identifying false positive differences in chromatin state between genotypes. After SV correction, we identify 1050, 30383, and 4508 regions whose peak heights are polymorphic among genotypes, among tissues, or exhibit genotype-by-tissue interactions, respectively. Finally, we identify 3988 candidate causative variants that explain at least 80% of the variance in chromatin state at nearby ATAC-seq peaks.
Affiliation(s)
- Khoi Huynh
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California, United States of America
| | - Brittny R. Smith
- Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, United States of America
| | - Stuart J. Macdonald
- Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, United States of America
- Center for Computational Biology, University of Kansas, Lawrence, Kansas, United States of America
| | - Anthony D. Long
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California, United States of America
| |
|
40
|
Kerimov N, Tambets R, Hayhurst JD, Rahu I, Kolberg P, Raudvere U, Kuzmin I, Chowdhary A, Vija A, Teras HJ, Kanai M, Ulirsch J, Ryten M, Hardy J, Guelfi S, Trabzuni D, Kim-Hellmuth S, Rayner W, Finucane H, Peterson H, Mosaku A, Parkinson H, Alasoo K. Systematic visualisation of molecular QTLs reveals variant mechanisms at GWAS loci. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.04.06.535816. [PMID: 37066341 PMCID: PMC10104061 DOI: 10.1101/2023.04.06.535816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2023]
Abstract
Splicing quantitative trait loci (QTLs) have been implicated as a common mechanism underlying complex trait associations. However, utilising splicing QTLs in target discovery and prioritisation has been challenging due to extensive data normalisation which often renders the direction of the genetic effect as well as its magnitude difficult to interpret. This is further complicated by the fact that strong expression QTLs often manifest as weak splicing QTLs and vice versa, making it difficult to uniquely identify the underlying molecular mechanism at each locus. We find that these ambiguities can be mitigated by visualising the association between the genotype and average RNA sequencing read coverage in the region. Here, we generate these QTL coverage plots for 1.7 million molecular QTL associations in the eQTL Catalogue identified with five quantification methods. We illustrate the utility of these QTL coverage plots by performing colocalisation between vitamin D levels in the UK Biobank and all molecular QTLs in the eQTL Catalogue. We find that while visually confirmed splicing QTLs explain just 6/53 of the colocalising signals, they are significantly less pleiotropic than eQTLs and identify a prioritised causal gene in 4/6 cases. All our association summary statistics and QTL coverage plots are freely available at https://www.ebi.ac.uk/eqtl/.
Affiliation(s)
- Nurlan Kerimov
- Institute of Computer Science, University of Tartu, Tartu, 51009, Estonia
- Open Targets, South Building, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ralf Tambets
- Institute of Computer Science, University of Tartu, Tartu, 51009, Estonia
| | - James D Hayhurst
- Open Targets, South Building, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ida Rahu
- Institute of Computer Science, University of Tartu, Tartu, 51009, Estonia
| | - Peep Kolberg
- Institute of Computer Science, University of Tartu, Tartu, 51009, Estonia
| | - Uku Raudvere
- Institute of Computer Science, University of Tartu, Tartu, 51009, Estonia
| | - Ivan Kuzmin
- Institute of Computer Science, University of Tartu, Tartu, 51009, Estonia
| | - Anshika Chowdhary
- Institute of Translational Genomics, Helmholtz Munich, Neuherberg, Germany
| | - Andreas Vija
- Institute of Computer Science, University of Tartu, Tartu, 51009, Estonia
| | - Hans J Teras
- Institute of Computer Science, University of Tartu, Tartu, 51009, Estonia
| | - Masahiro Kanai
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jacob Ulirsch
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Mina Ryten
- Department of Genetics and Genomic Medicine, Great Ormond Street Institute of Child Health, University College London, London
| | - John Hardy
- Department of Genetics and Genomic Medicine, Great Ormond Street Institute of Child Health, University College London, London
| | - Sebastian Guelfi
- Department of Genetics and Genomic Medicine, Great Ormond Street Institute of Child Health, University College London, London
| | - Daniah Trabzuni
- Department of Genetics and Genomic Medicine, Great Ormond Street Institute of Child Health, University College London, London
| | - Sarah Kim-Hellmuth
- Institute of Translational Genomics, Helmholtz Munich, Neuherberg, Germany
- Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital LMU Munich, Munich, Germany
| | - Will Rayner
- Institute of Translational Genomics, Helmholtz Munich, Neuherberg, Germany
| | - Hilary Finucane
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Hedi Peterson
- Institute of Computer Science, University of Tartu, Tartu, 51009, Estonia
| | - Abayomi Mosaku
- Open Targets, South Building, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Helen Parkinson
- Open Targets, South Building, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Kaur Alasoo
- Institute of Computer Science, University of Tartu, Tartu, 51009, Estonia
- Open Targets, South Building, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
41
|
Deshpande D, Chhugani K, Chang Y, Karlsberg A, Loeffler C, Zhang J, Muszyńska A, Munteanu V, Yang H, Rotman J, Tao L, Balliu B, Tseng E, Eskin E, Zhao F, Mohammadi P, Łabaj PP, Mangul S. RNA-seq data science: From raw data to effective interpretation. Front Genet 2023; 14:997383. [PMID: 36999049 PMCID: PMC10043755 DOI: 10.3389/fgene.2023.997383] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 07/18/2022] [Accepted: 02/24/2023] [Indexed: 03/14/2023] Open
Abstract
RNA sequencing (RNA-seq) has become an exemplary technology in modern biology and clinical science. Its immense popularity is due in large part to the continuous efforts of the bioinformatics community to develop accurate and scalable computational tools to analyze the enormous amounts of transcriptomic data that it produces. RNA-seq analysis enables genes and their corresponding transcripts to be probed for a variety of purposes, such as detecting novel exons or whole transcripts, assessing expression of genes and alternative transcripts, and studying alternative splicing structure. It can be a challenge, however, to obtain meaningful biological signals from raw RNA-seq data because of the enormous scale of the data as well as the inherent limitations of different sequencing technologies, such as amplification bias or biases of library preparation. The need to overcome these technical challenges has pushed the rapid development of novel computational tools, which have evolved and diversified in accordance with technological advancements, leading to the current myriad of RNA-seq tools. These tools, combined with the diverse computational skill sets of biomedical researchers, help to unlock the full potential of RNA-seq. The purpose of this review is to explain basic concepts in the computational analysis of RNA-seq data and define discipline-specific jargon.
Affiliation(s)
- Dhrithi Deshpande
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Karishma Chhugani
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Yutong Chang
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Aaron Karlsberg
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Caitlin Loeffler
- Department of Computer Science, University of California, Los Angeles, CA, United States
| | - Jinyang Zhang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
| | - Agata Muszyńska
- Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
- Institute of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
| | - Viorel Munteanu
- Department of Computers, Informatics and Microelectronics, Technical University of Moldova, Chisinau, Moldova
| | - Harry Yang
- Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, CA, United States
| | - Jeremy Rotman
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Laura Tao
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
| | - Brunilda Balliu
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
| | | | - Eleazar Eskin
- Department of Computer Science, University of California, Los Angeles, CA, United States
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, United States
| | - Fangqing Zhao
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
- Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
| | - Pejman Mohammadi
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States
| | - Paweł P. Łabaj
- Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
- Department of Biotechnology, Boku University Vienna, Vienna, Austria
| | - Serghei Mangul
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Department of Quantitative and Computational Biology, USC Dornsife College of Letters, Arts and Sciences, Los Angeles, CA, United States
- *Correspondence: Serghei Mangul,
| |
|
42
|
Zeng B, Bendl J, Deng C, Lee D, Misir R, Reach SM, Kleopoulos SP, Auluck P, Marenco S, Lewis DA, Haroutunian V, Ahituv N, Fullard JF, Hoffman GE, Roussos P. Genetic regulation of cell-type specific chromatin accessibility shapes the etiology of brain diseases. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.02.530826. [PMID: 37090548 PMCID: PMC10120699 DOI: 10.1101/2023.03.02.530826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
Nucleotide variants in cell type-specific gene regulatory elements in the human brain are major risk factors of human disease. We measured chromatin accessibility in sorted neurons and glia from 1,932 samples of human postmortem brain and identified 34,539 open chromatin regions with chromatin accessibility quantitative trait loci (caQTL). Only 10.4% of caQTL are shared between neurons and glia, supporting the cell type specificity of genetic regulation of the brain regulome. Incorporating allele specific chromatin accessibility improves statistical fine-mapping and refines molecular mechanisms underlying disease risk. Using massively parallel reporter assays in induced excitatory neurons, we screened 19,893 brain QTLs, identifying the functional impact of 476 regulatory variants. Combined, this comprehensive resource captures variation in the human brain regulome and provides novel insights into brain disease etiology.
Affiliation(s)
- Biao Zeng
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Jaroslav Bendl
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Chengyu Deng
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, 94158, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, 94158, USA
| | - Donghoon Lee
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Ruth Misir
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Sarah M. Reach
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Steven P. Kleopoulos
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Pavan Auluck
- Human Brain Collection Core, National Institute of Mental Health-Intramural Research Program, Bethesda, MD, USA
| | - Stefano Marenco
- Human Brain Collection Core, National Institute of Mental Health-Intramural Research Program, Bethesda, MD, USA
| | - David A. Lewis
- Translational Neuroscience Program, Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
| | - Vahram Haroutunian
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mental Illness Research, Education and Clinical Centers, James J. Peters VA Medical Center, Bronx, NY, USA
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, 94158, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, 94158, USA
| | - John F. Fullard
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Gabriel E. Hoffman
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Panos Roussos
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mental Illness Research, Education and Clinical Centers, James J. Peters VA Medical Center, Bronx, NY, USA
| |
|
43
|
Zhang J, Zhao H. eQTL Studies: from Bulk Tissues to Single Cells. ARXIV 2023:arXiv:2302.11662v1. [PMID: 36866231 PMCID: PMC9980190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
An expression quantitative trait locus (eQTL) is a chromosomal region where genetic variants are associated with the expression levels of certain genes that can be both nearby or distant. The identifications of eQTLs for different tissues, cell types, and contexts have led to better understanding of the dynamic regulations of gene expressions and implications of functional genes and variants for complex traits and diseases. Although most eQTL studies to date have been performed on data collected from bulk tissues, recent studies have demonstrated the importance of cell-type-specific and context-dependent gene regulations in biological processes and disease mechanisms. In this review, we discuss statistical methods that have been developed to enable the detections of cell-type-specific and context-dependent eQTLs from bulk tissues, purified cell types, and single cells. We also discuss the limitations of the current methods and future research opportunities.
Affiliation(s)
- Jingfei Zhang
- Information Systems and Operations Management, Emory University
| | - Hongyu Zhao
- Department of Biostatistics, Yale University
| |
|
44
|
Hodonsky CJ, Turner AW, Khan MD, Barrientos NB, Methorst R, Ma L, Lopez NG, Mosquera JV, Auguste G, Farber E, Ma WF, Wong D, Onengut-Gumuscu S, Kavousi M, Peyser PA, van der Laan SW, Leeper NJ, Kovacic JC, Björkegren JLM, Miller CL. Integrative multi-ancestry genetic analysis of gene regulation in coronary arteries prioritizes disease risk loci. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.02.09.23285622. [PMID: 36824883 PMCID: PMC9949190 DOI: 10.1101/2023.02.09.23285622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023]
Abstract
Genome-wide association studies (GWAS) have identified hundreds of genetic risk loci for coronary artery disease (CAD). However, non-European populations are underrepresented in GWAS and the causal gene-regulatory mechanisms of these risk loci during atherosclerosis remain unclear. We incorporated local ancestry and haplotype information to identify quantitative trait loci (QTL) for gene expression and splicing in coronary arteries obtained from 138 ancestrally diverse Americans. Of 2,132 eQTL-associated genes (eGenes), 47% were previously unreported in coronary arteries and 19% exhibited cell-type-specific expression. Colocalization analysis with GWAS identified subgroups of eGenes unique to CAD and blood pressure. Fine-mapping highlighted additional eGenes of interest, including TBX20 and IL5. Splicing (s)QTLs for 1,690 genes were also identified, among which TOR1AIP1 and ULK3 sQTLs demonstrated the importance of evaluating splicing events to accurately identify disease-relevant gene expression. Our work provides the first human coronary artery eQTL resource from a patient sample and exemplifies the necessity of diverse study populations and multi-omic approaches to characterize gene regulation in critical disease processes.
|
45
|
Nassar AH, Abou Alaiwi S, Baca SC, Adib E, Corona RI, Seo JH, Fonseca MAS, Spisak S, El Zarif T, Tisza V, Braun DA, Du H, He M, Flaifel A, Alchoueiry M, Denize T, Matar SG, Acosta A, Shukla S, Hou Y, Steinharter J, Bouchard G, Berchuck JE, O'Connor E, Bell C, Nuzzo PV, Mary Lee GS, Signoretti S, Hirsch MS, Pomerantz M, Henske E, Gusev A, Lawrenson K, Choueiri TK, Kwiatkowski DJ, Freedman ML. Epigenomic charting and functional annotation of risk loci in renal cell carcinoma. Nat Commun 2023; 14:346. [PMID: 36681680 PMCID: PMC9867739 DOI: 10.1038/s41467-023-35833-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 10/22/2021] [Accepted: 01/04/2023] [Indexed: 01/22/2023]
Abstract
While the mutational and transcriptional landscapes of renal cell carcinoma (RCC) are well-known, the epigenome is poorly understood. We characterize the epigenome of clear cell (ccRCC), papillary (pRCC), and chromophobe RCC (chRCC) by using ChIP-seq, ATAC-Seq, RNA-seq, and SNP arrays. We integrate 153 individual data sets from 42 patients and nominate 50 histology-specific master transcription factors (MTF) to define RCC histologic subtypes, including EPAS1 and ETS-1 in ccRCC, HNF1B in pRCC, and FOXI1 in chRCC. We confirm histology-specific MTFs via immunohistochemistry including a ccRCC-specific TF, BHLHE41. FOXI1 overexpression with knock-down of EPAS1 in the 786-O ccRCC cell line induces transcriptional upregulation of chRCC-specific genes, TFCP2L1, ATP6V0D2, KIT, and INSRR, implicating FOXI1 as a MTF for chRCC. Integrating RCC GWAS risk SNPs with H3K27ac ChIP-seq and ATAC-seq data reveals that risk-variants are significantly enriched in allelically-imbalanced peaks. This epigenomic atlas in primary human samples provides a resource for future investigation.
Affiliation(s)
- Amin H Nassar
- Department of Hematology/Oncology, Yale New Haven Hospital, New Haven, CT, 06510, USA
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Sarah Abou Alaiwi
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Sylvan C Baca
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Elio Adib
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Rosario I Corona
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Center for Bioinformatics and Functional Genomics, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Ji-Heui Seo
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Marcos A S Fonseca
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Sandor Spisak
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- The Eli and Edythe L. Broad Institute, Cambridge, MA, 02142, USA
| | - Talal El Zarif
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Viktoria Tisza
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- The Eli and Edythe L. Broad Institute, Cambridge, MA, 02142, USA
| | - David A Braun
- Department of Hematology/Oncology, Yale New Haven Hospital, New Haven, CT, 06510, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- The Eli and Edythe L. Broad Institute, Cambridge, MA, 02142, USA
| | - Heng Du
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA
| | - Monica He
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Abdallah Flaifel
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, 02115, USA
| | - Michel Alchoueiry
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA
| | - Thomas Denize
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, 02115, USA
| | - Sayed G Matar
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, 02115, USA
| | - Andres Acosta
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, 02115, USA
| | - Sachet Shukla
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Translational Immunogenomics Lab, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Yue Hou
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Translational Immunogenomics Lab, Dana-Farber Cancer Institute, Boston, MA, USA
| | - John Steinharter
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Gabrielle Bouchard
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Jacob E Berchuck
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Edward O'Connor
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Connor Bell
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Pier Vitale Nuzzo
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Gwo-Shu Mary Lee
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Sabina Signoretti
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, 02115, USA
| | - Michelle S Hirsch
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, 02115, USA
| | - Mark Pomerantz
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Elizabeth Henske
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Alexander Gusev
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- McGraw/Patterson Center for Population Sciences, Dana-Farber Cancer Institute, Boston, MA, 02115, USA
| | - Kate Lawrenson
- Women's Cancer Research Program at the Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Center for Bioinformatics and Functional Genomics, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Toni K Choueiri
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA.
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA.
| | - David J Kwiatkowski
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA.
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA.
| | - Matthew L Freedman
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA.
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, 02215, USA.
- The Eli and Edythe L. Broad Institute, Cambridge, MA, 02142, USA.
| |
|
46
|
Kaczor-Urbanowicz KE, Wong DTW. RNA Sequencing Analysis of Saliva exRNA. Methods Mol Biol 2023; 2588:3-11. [PMID: 36418678 DOI: 10.1007/978-1-0716-2780-8_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
Next-generation sequencing (NGS) methodologies are rapidly developing. However, RNA Sequencing of saliva is challenging due to low abundance and integrity of extracellular RNA, as well as large amounts of bacterial RNAs that may be encountered in saliva. In addition, the literature about human salivary extracellular RNA is very scarce. Therefore, in our chapter, we present the most appropriate protocols for saliva collection, pre- and post-processing, including bioinformatic analysis of salivary RNA Sequencing data. However, the choice of the proper method for RNA extraction, cDNA library preparation, and computational pipeline can make a significant impact on the final quality of data and their interpretation.
Affiliation(s)
- Karolina Elżbieta Kaczor-Urbanowicz
- Center for Oral and Head/Neck Oncology Research, UCLA School of Dentistry, University of California at Los Angeles, Los Angeles, CA, USA; UCLA Institute for Quantitative and Computational Biosciences, University of California at Los Angeles, Los Angeles, CA, USA; UCLA Section of Orthodontics, University of California at Los Angeles, Los Angeles, CA, USA; Section of Biosystems and Function, UCLA School of Dentistry, University of California at Los Angeles, Los Angeles, CA, USA
| | - David T W Wong
- Center for Oral and Head/Neck Oncology Research, UCLA School of Dentistry, University of California at Los Angeles, Los Angeles, CA, USA; Section of Biosystems and Function, UCLA School of Dentistry, University of California at Los Angeles, Los Angeles, CA, USA; UCLA's Jonsson Comprehensive Cancer Center, Los Angeles, CA, USA.
| |
|
47
|
Schott BH, Wang L, Zhu X, Harding AT, Ko ER, Bourgeois JS, Washington EJ, Burke TW, Anderson J, Bergstrom E, Gardener Z, Paterson S, Brennan RG, Chiu C, McClain MT, Woods CW, Gregory SG, Heaton NS, Ko DC. Single-cell genome-wide association reveals that a nonsynonymous variant in ERAP1 confers increased susceptibility to influenza virus. CELL GENOMICS 2022; 2:100207. [PMID: 36465279 PMCID: PMC9718543 DOI: 10.1016/j.xgen.2022.100207] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2022] [Revised: 07/26/2022] [Accepted: 10/07/2022] [Indexed: 06/17/2023]
Abstract
During pandemics, individuals exhibit differences in risk and clinical outcomes. Here, we developed single-cell high-throughput human in vitro susceptibility testing (scHi-HOST), a method for rapidly identifying genetic variants that confer resistance and susceptibility. We applied this method to influenza A virus (IAV), the cause of four pandemics since the start of the 20th century. scHi-HOST leverages single-cell RNA sequencing (scRNA-seq) to simultaneously assign genetic identity to cells in mixed infections of cell lines of European, African, and Asian origin, reveal associated genetic variants for viral burden, and identify expression quantitative trait loci. Integration of scHi-HOST with human challenge and experimental validation demonstrated that a missense variant in endoplasmic reticulum aminopeptidase 1 (ERAP1; rs27895) increased IAV burden in cells and human volunteers. rs27895 exhibits population differentiation, likely contributing to greater permissivity of cells from African populations to IAV. scHi-HOST is a broadly applicable method and resource for decoding infectious-disease genetics.
Affiliation(s)
- Benjamin H Schott
- Department of Molecular Genetics and Microbiology, School of Medicine, Duke University, 0048B CARL Building Box 3053, 213 Research Drive, Durham, NC 27710, USA
- Duke University Program in Genetics and Genomics, Duke University, Durham, NC 27710, USA
- These authors contributed equally
| | - Liuyang Wang
- Department of Molecular Genetics and Microbiology, School of Medicine, Duke University, 0048B CARL Building Box 3053, 213 Research Drive, Durham, NC 27710, USA
- These authors contributed equally
| | - Xinyu Zhu
- Department of Molecular Genetics and Microbiology, School of Medicine, Duke University, 0048B CARL Building Box 3053, 213 Research Drive, Durham, NC 27710, USA
| | - Alfred T Harding
- Department of Molecular Genetics and Microbiology, School of Medicine, Duke University, 0048B CARL Building Box 3053, 213 Research Drive, Durham, NC 27710, USA
| | - Emily R Ko
- Center for Applied Genomics and Precision Medicine, Department of Medicine, Duke University, Durham, NC 27710, USA
- Hospital Medicine, Division of General Internal Medicine, Department of Medicine, Duke Regional Hospital, Durham, NC 27705, USA
| | - Jeffrey S Bourgeois
- Department of Molecular Genetics and Microbiology, School of Medicine, Duke University, 0048B CARL Building Box 3053, 213 Research Drive, Durham, NC 27710, USA
- Duke University Program in Genetics and Genomics, Duke University, Durham, NC 27710, USA
| | - Erica J Washington
- Department of Biochemistry, School of Medicine, Duke University, Durham, NC 27710, USA
| | - Thomas W Burke
- Center for Applied Genomics and Precision Medicine, Department of Medicine, Duke University, Durham, NC 27710, USA
| | - Jack Anderson
- Center for Applied Genomics and Precision Medicine, Department of Medicine, Duke University, Durham, NC 27710, USA
| | - Emma Bergstrom
- Section of Infectious Diseases and Immunity, Imperial College London, London, W12 0NN, UK
| | - Zoe Gardener
- Section of Infectious Diseases and Immunity, Imperial College London, London, W12 0NN, UK
| | - Suzanna Paterson
- Section of Infectious Diseases and Immunity, Imperial College London, London, W12 0NN, UK
| | - Richard G Brennan
- Department of Biochemistry, School of Medicine, Duke University, Durham, NC 27710, USA
| | - Christopher Chiu
- Section of Infectious Diseases and Immunity, Imperial College London, London, W12 0NN, UK
| | - Micah T McClain
- Center for Applied Genomics and Precision Medicine, Department of Medicine, Duke University, Durham, NC 27710, USA
- Durham Veterans Affairs Health Care System, Durham, NC 27705, USA
- Division of Infectious Diseases, Department of Medicine, School of Medicine, Duke University, Durham, NC 27710, USA
| | - Christopher W Woods
- Center for Applied Genomics and Precision Medicine, Department of Medicine, Duke University, Durham, NC 27710, USA
- Durham Veterans Affairs Health Care System, Durham, NC 27705, USA
- Division of Infectious Diseases, Department of Medicine, School of Medicine, Duke University, Durham, NC 27710, USA
| | - Simon G Gregory
- Duke Molecular Physiology Institute, Duke University Medical Center, Durham, NC 27710, USA
| | - Nicholas S Heaton
- Department of Molecular Genetics and Microbiology, School of Medicine, Duke University, 0048B CARL Building Box 3053, 213 Research Drive, Durham, NC 27710, USA
| | - Dennis C Ko
- Department of Molecular Genetics and Microbiology, School of Medicine, Duke University, 0048B CARL Building Box 3053, 213 Research Drive, Durham, NC 27710, USA
- Duke University Program in Genetics and Genomics, Duke University, Durham, NC 27710, USA
- Division of Infectious Diseases, Department of Medicine, School of Medicine, Duke University, Durham, NC 27710, USA
- Lead contact
| |
|
48
|
Huang D, Feng X, Yang H, Wang J, Zhang W, Fan X, Dong X, Chen K, Yu Y, Ma X, Yi X, Li M. QTLbase2: an enhanced catalog of human quantitative trait loci on extensive molecular phenotypes. Nucleic Acids Res 2022; 51:D1122-D1128. [PMID: 36330927 PMCID: PMC9825467 DOI: 10.1093/nar/gkac1020] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/09/2022] [Revised: 10/17/2022] [Accepted: 10/21/2022] [Indexed: 11/06/2022]
Abstract
Deciphering the fine-scale molecular mechanisms that shape the genetic effects at disease-associated loci from genome-wide association studies (GWAS) remains challenging. The key avenue is to identify the essential molecular phenotypes that mediate the causal variant and disease under particular biological conditions. Therefore, integrating GWAS signals with context-specific quantitative trait loci (QTLs) (such as different tissue/cell types, disease states, and perturbations) from extensive molecular phenotypes would present important strategies for full understanding of disease genetics. Via persistent curation and systematic data processing of large-scale human molecular trait QTLs (xQTLs), we updated our previous QTLbase database (now QTLbase2, http://mulinlab.org/qtlbase) to comprehensively analyze and visualize context-specific QTLs across 22 molecular phenotypes and over 95 tissue/cell types. Overall, the resource features the following major updates and novel functions: (i) 960 more genome-wide QTL summary statistics from 146 independent studies; (ii) new data for 10 previously uncompiled QTL types; (iii) variant query scope expanded to fit 195 QTL datasets based on whole-genome sequencing; (iv) supports filtering and comparison of QTLs for different biological conditions, such as stimulation types and disease states; (v) a new linkage disequilibrium viewer to facilitate variant prioritization across tissue/cell types and QTL types.
Affiliation(s)
| | | | - Hongxi Yang
- Department of Pharmacology, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Jianhua Wang
- Department of Pharmacology, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Wenwen Zhang
- Department of Pharmacology, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Xutong Fan
- Department of Pharmacology, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Xiaobao Dong
- Department of Bioinformatics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Kexin Chen
- Department of Bioinformatics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Ying Yu
- Department of Pharmacology, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Xin Ma
- Correspondence may also be addressed to Xin Ma.
| | - Xianfu Yi
- Correspondence may also be addressed to Xianfu Yi.
| | - Mulin Jun Li
- To whom correspondence should be addressed. Tel: +86 22 83336668; Fax: +86 22 83336668.
| |
|
49
|
Wang D, Wu X, Jiang G, Yang J, Yu Z, Yang Y, Yang W, Niu X, Tang K, Gong J. Systematic analysis of the effects of genetic variants on chromatin accessibility to decipher functional variants in non-coding regions. Front Oncol 2022; 12:1035855. [PMID: 36330496 PMCID: PMC9623183 DOI: 10.3389/fonc.2022.1035855] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2022] [Accepted: 10/03/2022] [Indexed: 11/18/2022]
Abstract
Genome-wide association study (GWAS) has identified thousands of single nucleotide polymorphisms (SNPs) associated with complex diseases and traits. However, deciphering the functions of these SNPs still faces challenges. Recent studies have shown that SNPs could alter chromatin accessibility and result in differences in tumor susceptibility between individuals. Therefore, systematically analyzing the effects of SNPs on chromatin accessibility could help decipher the functions of SNPs, especially those in non-coding regions. Using data from The Cancer Genome Atlas (TCGA), chromatin accessibility quantitative trait locus (caQTL) analysis was conducted to estimate the associations between genetic variants and chromatin accessibility. We analyzed caQTLs in 23 human cancer types and identified 9,478 caQTLs in breast carcinoma (BRCA). In BRCA, these caQTLs tend to alter the binding affinity of transcription factors, and open chromatin regions regulated by these caQTLs are enriched in regulatory elements. By integrating with eQTL data, we identified 141 caQTLs showing a strong signal for colocalization with eQTLs. We also identified 173 caQTLs in genome-wide association studies (GWAS) loci and inferred several possible target genes of these caQTLs. By performing survival analysis, we found that ~10% caQTLs potentially influence the prognosis of patients. To facilitate access to relevant data, we developed a user-friendly data portal, BCaQTL (http://gong_lab.hzau.edu.cn/caqtl_database), for data searching and downloading. Our work may facilitate fine-map regulatory mechanisms underlying risk loci of cancer and discover the biomarkers or therapeutic targets for cancer prognosis. The BCaQTL database will be an important resource for genetic and epigenetic studies.
Affiliation(s)
- Dongyang Wang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Xiaohong Wu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Guanghui Jiang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Jianye Yang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Zhanhui Yu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Yanbo Yang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Wenqian Yang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Xiaohui Niu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Ke Tang
- Department of Biochemistry and Molecular Biology, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- *Correspondence: Jing Gong; Ke Tang
| | - Jing Gong
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
- College of Biomedicine and Health, Huazhong Agricultural University, Wuhan, China
- *Correspondence: Jing Gong; Ke Tang
| |
|
50
|
Baca SC, Singler C, Zacharia S, Seo JH, Morova T, Hach F, Ding Y, Schwarz T, Huang CCF, Anderson J, Fay AP, Kalita C, Groha S, Pomerantz MM, Wang V, Linder S, Sweeney CJ, Zwart W, Lack NA, Pasaniuc B, Takeda DY, Gusev A, Freedman ML. Genetic determinants of chromatin reveal prostate cancer risk mediated by context-dependent gene regulation. Nat Genet 2022; 54:1364-1375. [PMID: 36071171 PMCID: PMC9784646 DOI: 10.1038/s41588-022-01168-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 04/09/2022] [Accepted: 07/19/2022] [Indexed: 12/25/2022]
Abstract
Many genetic variants affect disease risk by altering context-dependent gene regulation. Such variants are difficult to study mechanistically using current methods that link genetic variation to steady-state gene expression levels, such as expression quantitative trait loci (eQTLs). To address this challenge, we developed the cistrome-wide association study (CWAS), a framework for identifying genotypic and allele-specific effects on chromatin that are also associated with disease. In prostate cancer, CWAS identified regulatory elements and androgen receptor-binding sites that explained the association at 52 of 98 known prostate cancer risk loci and discovered 17 additional risk loci. CWAS implicated key developmental transcription factors in prostate cancer risk that are overlooked by eQTL-based approaches due to context-dependent gene regulation. We experimentally validated associations and demonstrated the extensibility of CWAS to additional epigenomic datasets and phenotypes, including response to prostate cancer treatment. CWAS is a powerful and biologically interpretable paradigm for studying variants that influence traits by affecting transcriptional regulation.
Affiliation(s)
- Sylvan C. Baca
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA; The Eli and Edythe L. Broad Institute, Cambridge, MA, USA
| | - Cassandra Singler
- Laboratory of Genitourinary Cancer Pathogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
| | - Soumya Zacharia
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Ji-Heui Seo
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Tunc Morova
- Vancouver Prostate Centre University of British Columbia, Vancouver, BC, Canada
| | - Faraz Hach
- Vancouver Prostate Centre University of British Columbia, Vancouver, BC, Canada
| | - Yi Ding
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA
| | - Tommer Schwarz
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA
| | | | - Jacob Anderson
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
| | - André P. Fay
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Cynthia Kalita
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Division of Genetics, Brigham & Women’s Hospital, Boston, MA, USA
| | - Stefan Groha
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; The Eli and Edythe L. Broad Institute, Cambridge, MA, USA
| | - Mark M. Pomerantz
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Victoria Wang
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
| | - Simon Linder
- Division of Oncogenomics, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, The Netherlands; Laboratory of Chemical Biology and Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
| | | | - Wilbert Zwart
- Division of Oncogenomics, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, The Netherlands; Laboratory of Chemical Biology and Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
| | - Nathan A. Lack
- Vancouver Prostate Centre University of British Columbia, Vancouver, BC, Canada; School of Medicine, Koç University, Istanbul, Turkey
| | - Bogdan Pasaniuc
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA; Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA; Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA; Department of Pathology and Laboratory Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
| | - David Y. Takeda
- Laboratory of Genitourinary Cancer Pathogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
| | - Alexander Gusev
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; The Eli and Edythe L. Broad Institute, Cambridge, MA, USA; Division of Genetics, Brigham & Women’s Hospital, Boston, MA, USA; These authors jointly supervised this work. Correspondence should be directed to M.L.F. or A.G.
| | - Matthew L. Freedman
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA; The Eli and Edythe L. Broad Institute, Cambridge, MA, USA; These authors jointly supervised this work. Correspondence should be directed to M.L.F. or A.G.
| |
|