1
|
Fast and flexible joint fine-mapping of multiple traits via the Sum of Single Effects model. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.04.14.536893. [PMID: 37425935 PMCID: PMC10327118 DOI: 10.1101/2023.04.14.536893] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 07/11/2023]
Abstract
We introduce mvSuSiE, a multi-trait fine-mapping method for identifying putative causal variants from genetic association data (individual-level or summary data). mvSuSiE learns patterns of shared genetic effects from data, and exploits these patterns to improve power to identify causal SNPs. Comparisons on simulated data show that mvSuSiE is competitive in speed, power and precision with existing multi-trait methods, and uniformly improves on single-trait fine-mapping (SuSiE) in each trait separately. We applied mvSuSiE to jointly fine-map 16 blood cell traits using data from the UK Biobank. By jointly analyzing the traits and modeling heterogeneous effect sharing patterns, we discovered a much larger number of causal SNPs (>3,000) compared with single-trait fine-mapping, and with narrower credible sets. mvSuSiE also more comprehensively characterized the ways in which the genetic variants affect one or more blood cell traits; 68% of causal SNPs showed significant effects in more than one blood cell type.
|
2
|
Mapping the functional impact of non-coding regulatory elements in primary T cells through single-cell CRISPR screens. Genome Biol 2024; 25:42. [PMID: 38308274 PMCID: PMC10835965 DOI: 10.1186/s13059-024-03176-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Accepted: 01/18/2024] [Indexed: 02/04/2024] Open
Abstract
BACKGROUND Drug targets with genetic evidence are expected to increase clinical success by at least twofold. Yet, translating disease-associated genetic variants into functional knowledge remains a fundamental challenge of drug discovery. A key issue is that the vast majority of complex disease associations cannot be cleanly mapped to a gene. Immune disease-associated variants are enriched within regulatory elements found in T-cell-specific open chromatin regions. RESULTS To identify genes and molecular programs modulated by these regulatory elements, we develop a CRISPRi-based single-cell functional screening approach in primary human T cells. Our pipeline enables the interrogation of transcriptomic changes induced by the perturbation of regulatory elements at scale. We first optimize an efficient CRISPRi protocol in primary CD4+ T cells via CROPseq vectors. Subsequently, we perform a screen targeting 45 non-coding regulatory elements and 35 transcription start sites and profile approximately 250,000 T-cell single-cell transcriptomes. We develop a bespoke analytical pipeline for element-to-gene (E2G) mapping and demonstrate that our method can identify both previously annotated and novel E2G links. Lastly, we integrate genetic association data for immune-related traits and demonstrate how our platform can aid in the identification of effector genes for GWAS loci. CONCLUSIONS We describe "primary T cell crisprQTL" - a scalable, single-cell functional genomics approach for mapping regulatory elements to genes in primary human T cells. We show how this framework can facilitate the interrogation of immune disease GWAS hits and propose that the combination of experimental and QTL-based techniques is likely to address the variant-to-function problem.
|
3
|
Heterogeneity of platelets and their responses. Res Pract Thromb Haemost 2024; 8:102356. [PMID: 38666061 PMCID: PMC11043642 DOI: 10.1016/j.rpth.2024.102356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Revised: 01/22/2024] [Accepted: 02/06/2024] [Indexed: 04/28/2024] Open
Abstract
There has been increasing recognition of heterogeneity in blood platelets and their responses, particularly in recent years, where next-generation technologies and advanced bioinformatic tools that interrogate "big data" have enabled large-scale studies of RNA and protein expression across a growing list of disease states. However, pioneering platelet biologists and clinicians were already hypothesizing upon and investigating heterogeneity in platelet (and megakaryocyte) activity and platelet metabolism and aggregation over half a century ago. Building on their foundational hypotheses, in particular Professor Marian A. Packham's pioneering work and a State of the Art lecture in her memoriam at the 2023 International Society on Thrombosis and Haemostasis Congress by Anandi Krishnan, this review outlines the key features that contribute to the heterogeneity of platelets between and within individuals. Starting with important epidemiologic factors, we move stepwise through successively smaller scales down to heterogeneity revealed by single-cell technologies in health and disease. We hope that this overview will urge future scientific and clinical studies to recognize and account for heterogeneity of platelets and aim to apply methods that capture that heterogeneity. Finally, we summarize other exciting new data presented on this topic at the 2023 International Society on Thrombosis and Haemostasis Congress.
|
4
|
GWAS for systemic sclerosis identifies six novel susceptibility loci including one in the Fcγ receptor region. Nat Commun 2024; 15:319. [PMID: 38296975 PMCID: PMC10830486 DOI: 10.1038/s41467-023-44541-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 12/18/2023] [Indexed: 02/02/2024] Open
Abstract
Here we report the largest Asian genome-wide association study (GWAS) for systemic sclerosis performed to date, based on data from Japanese subjects and comprising 1428 cases and 112,599 controls. The lead SNP is in the FCGR/FCRL region, which shows a penetrating association in the Asian population, while a SNP in complete linkage disequilibrium, rs10917688, is found in a cis-regulatory element for IRF8. IRF8 is also a significant locus in European GWAS for systemic sclerosis, but rs10917688 only shows an association in the presence of the risk allele of IRF8 in the Japanese population. Further analysis shows that rs10917688 is marked with H3K4me1 in primary B cells. A meta-analysis with a European GWAS detects 30 additional significant loci. Polygenic risk scores constructed with the effect sizes of the meta-analysis suggest the potential portability of genetic associations beyond populations. Prioritizing the top 5% of SNPs of IRF8 binding sites in B cells improves the fitting of the polygenic risk scores, underscoring the roles of B cells and IRF8 in the development of systemic sclerosis. The results also suggest that systemic sclerosis shares a common genetic architecture across populations.
|
5
|
Leveraging single-cell ATAC-seq and RNA-seq to identify disease-critical fetal and adult brain cell types. Nat Commun 2024; 15:563. [PMID: 38233398 PMCID: PMC10794712 DOI: 10.1038/s41467-024-44742-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Accepted: 01/02/2024] [Indexed: 01/19/2024] Open
Abstract
Prioritizing disease-critical cell types by integrating genome-wide association studies (GWAS) with functional data is a fundamental goal. Single-cell chromatin accessibility (scATAC-seq) and gene expression (scRNA-seq) have characterized cell types at high resolution, and studies integrating GWAS with scRNA-seq have shown promise, but studies integrating GWAS with scATAC-seq have been limited. Here, we identify disease-critical fetal and adult brain cell types by integrating GWAS summary statistics from 28 brain-related diseases/traits (average N = 298 K) with 3.2 million scATAC-seq and scRNA-seq profiles from 83 cell types. We identified disease-critical fetal (respectively adult) brain cell types for 22 (respectively 23) of 28 traits using scATAC-seq, and for 8 (respectively 17) of 28 traits using scRNA-seq. Significant scATAC-seq enrichments included fetal photoreceptor cells for major depressive disorder, fetal ganglion cells for BMI, fetal astrocytes for ADHD, and adult VGLUT2 excitatory neurons for schizophrenia. Our findings improve our understanding of brain-related diseases/traits and inform future analyses.
|
6
|
Abstract
Fine-mapping aims to identify causal genetic variants for phenotypes. Bayesian fine-mapping algorithms (for example, SuSiE, FINEMAP, ABF and COJO-ABF) are widely used, but assessing posterior probability calibration remains challenging in real data, where model misspecification probably exists, and true causal variants are unknown. We introduce replication failure rate (RFR), a metric to assess fine-mapping consistency by downsampling. SuSiE, FINEMAP and COJO-ABF show high RFR, indicating potential overconfidence in their output. Simulations reveal that nonsparse genetic architecture can lead to miscalibration, while imputation noise, nonuniform distribution of causal variants and quality control filters have minimal impact. Here we present SuSiE-inf and FINEMAP-inf, fine-mapping methods modeling infinitesimal effects alongside fewer larger causal effects. Our methods show improved calibration, RFR and functional enrichment, competitive recall and computational efficiency. Notably, using our methods' posterior effect sizes substantially increases polygenic risk score accuracy over SuSiE and FINEMAP. Our work improves causal variant identification for complex traits, a fundamental goal of human genetics.
|
7
|
Rapid and Quantitative Functional Interrogation of Human Enhancer Variant Activity in Live Mice. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.12.10.570890. [PMID: 38105996 PMCID: PMC10723448 DOI: 10.1101/2023.12.10.570890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Functional analysis of non-coding variants associated with human congenital disorders remains challenging due to the lack of efficient in vivo models. Here we introduce dual-enSERT, a robust Cas9-based two-color fluorescent reporter system which enables rapid, quantitative comparison of enhancer allele activities in live mice of any genetic background. We use this new technology to examine and measure the gain- and loss-of-function effects of enhancer variants linked to limb polydactyly, autism, and craniofacial malformation. By combining dual-enSERT with single-cell transcriptomics, we characterize variant enhancer alleles at cellular resolution, thereby implicating candidate molecular pathways in pathogenic enhancer misregulation. We further show that independent, polydactyly-linked enhancer variants lead to ectopic expression in the same cell populations, indicating shared genetic mechanisms underlying non-coding variant pathogenesis. Finally, we streamline dual-enSERT for analysis in F0 animals by placing both reporters on the same transgene separated by a synthetic insulator. Dual-enSERT allows researchers to go from identifying candidate enhancer variants to analysis of comparative enhancer activity in live embryos in under two weeks.
|
8
|
FOXO1 is a master regulator of CAR T memory programming. RESEARCH SQUARE 2023:rs.3.rs-2802998. [PMID: 37986944 PMCID: PMC10659532 DOI: 10.21203/rs.3.rs-2802998/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
Poor CAR T persistence limits CAR T cell therapies for B cell malignancies and solid tumors [1,2]. The expression of memory-associated genes such as TCF7 (protein name TCF1) is linked to response and long-term persistence in patients [3-7], thereby implicating memory programs in therapeutic efficacy. Here, we demonstrate that the pioneer transcription factor, FOXO1, is responsible for promoting memory programs and restraining exhaustion in human CAR T cells. Pharmacologic inhibition or gene editing of endogenous FOXO1 in human CAR T cells diminished the expression of memory-associated genes, promoted an exhaustion-like phenotype, and impaired antitumor activity in vitro and in vivo. FOXO1 overexpression induced a gene expression program consistent with T cell memory and increased chromatin accessibility at FOXO1 binding motifs. FOXO1-overexpressing cells retained function, memory potential, and metabolic fitness during settings of chronic stimulation and exhibited enhanced persistence and antitumor activity in vivo. In contrast, TCF1 overexpression failed to enforce canonical memory programs or enhance CAR T cell potency. Importantly, endogenous FOXO1 activity correlated with CAR T and TIL responses in patients, underscoring its clinical relevance in cancer immunotherapy. Our results demonstrate that memory reprogramming through FOXO1 can enhance the persistence and potency of human CAR T cells and highlight the utility of pioneer factors, which bind condensed chromatin and induce local epigenetic remodeling, for optimizing therapeutic T cell states.
|
9
|
XMAP: Cross-population fine-mapping by leveraging genetic diversity and accounting for confounding bias. Nat Commun 2023; 14:6870. [PMID: 37898663 PMCID: PMC10613261 DOI: 10.1038/s41467-023-42614-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/04/2023] [Accepted: 10/17/2023] [Indexed: 10/30/2023] Open
Abstract
Fine-mapping prioritizes risk variants identified by genome-wide association studies (GWASs), serving as a critical step to uncover biological mechanisms underlying complex traits. However, several major challenges still remain for existing fine-mapping methods. First, the strong linkage disequilibrium among variants can limit the statistical power and resolution of fine-mapping. Second, it is computationally expensive to simultaneously search for multiple causal variants. Third, the confounding bias hidden in GWAS summary statistics can produce spurious signals. To address these challenges, we develop a statistical method for cross-population fine-mapping (XMAP) by leveraging genetic diversity and accounting for confounding bias. By using cross-population GWAS summary statistics from global biobanks and genomic consortia, we show that XMAP can achieve greater statistical power, better control of false positive rate, and substantially higher computational efficiency for identifying multiple causal signals, compared to existing methods. Importantly, we show that the output of XMAP can be integrated with single-cell datasets, which greatly improves the interpretation of putative causal variants in their cellular context at single-cell resolution.
|
10
|
Single-cell chromatin accessibility and transcriptomic characterization of Behcet's disease. Commun Biol 2023; 6:1048. [PMID: 37848613 PMCID: PMC10582193 DOI: 10.1038/s42003-023-05420-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 10/04/2023] [Indexed: 10/19/2023] Open
Abstract
Behcet's disease is a chronic vasculitis characterized by complex multi-organ immune aberrations. However, a comprehensive understanding of the gene-regulatory profile of peripheral autoimmunity and the diverse immune responses across distinct cell types in Behcet's disease (BD) is still lacking. Here, we present a multi-omic single-cell study of 424,817 cells in BD patients and non-BD individuals. This study maps chromatin accessibility and gene expression in the same biological samples, unraveling vast cellular heterogeneity. We identify widespread cell-type-specific, disease-associated active and pro-inflammatory immunity in both transcript and epigenomic aspects. Notably, integrative multi-omic analysis reveals putative TF regulators that might contribute to chromatin accessibility and gene expression in BD. Moreover, we predicted gene-regulatory networks within nominated TF activators, including AP-1, NF-κB, and ETS transcription factor families, which may regulate cellular interaction and govern inflammation. Our study illustrates the epigenetic and transcriptional landscape in BD peripheral blood and expands understanding of potential epigenomic immunopathology in this disease.
|
11
|
A genome-wide association study of blood cell morphology identifies cellular proteins implicated in disease aetiology. Nat Commun 2023; 14:5023. [PMID: 37596262 PMCID: PMC10439125 DOI: 10.1038/s41467-023-40679-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/19/2021] [Accepted: 08/07/2023] [Indexed: 08/20/2023] Open
Abstract
Blood cells contain functionally important intracellular structures, such as granules, critical to immunity and thrombosis. Quantitative variation in these structures has not been subjected previously to large-scale genetic analysis. We perform genome-wide association studies of 63 flow-cytometry derived cellular phenotypes (including cell-type specific measures of granularity, nucleic acid content and reactivity) in 41,515 participants in the INTERVAL study. We identify 2172 distinct variant-trait associations, including associations near genes coding for proteins in organelles implicated in inflammatory and thrombotic diseases. By integrating with epigenetic data we show that many intracellular structures are likely to be determined in immature precursor cells. By integrating with proteomic data we identify the transcription factor FOG2 as an early regulator of platelet formation and α-granularity. Finally, we show that colocalisation of our associations with disease risk signals can suggest aetiological cell-types: variants in IL2RA and ITGA4 respectively mirror the known effects of daclizumab in multiple sclerosis and vedolizumab in inflammatory bowel disease.
|
12
|
Leveraging polygenic enrichments of gene features to predict genes underlying complex traits and diseases. Nat Genet 2023; 55:1267-1276. [PMID: 37443254 PMCID: PMC10836580 DOI: 10.1038/s41588-023-01443-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 08/14/2020] [Accepted: 06/09/2023] [Indexed: 07/15/2023]
Abstract
Genome-wide association studies (GWASs) are a valuable tool for understanding the biology of complex human traits and diseases, but associated variants rarely point directly to causal genes. In the present study, we introduce a new method, polygenic priority score (PoPS), that learns trait-relevant gene features, such as cell-type-specific expression, to prioritize genes at GWAS loci. Using a large evaluation set of genes with fine-mapped coding variants, we show that PoPS and the closest gene individually outperform other gene prioritization methods, but observe the best overall performance by combining PoPS with orthogonal methods. Using this combined approach, we prioritize 10,642 unique gene-trait pairs across 113 complex traits and diseases with high precision, finding not only well-established gene-trait relationships but nominating new genes at unresolved loci, such as LGR4 for estimated glomerular filtration rate and CCR7 for deep vein thrombosis. Overall, we demonstrate that PoPS provides a powerful addition to the gene prioritization toolbox.
|
13
|
Interpreting non-coding disease-associated human variants using single-cell epigenomics. Nat Rev Genet 2023; 24:516-534. [PMID: 37161089 PMCID: PMC10629587 DOI: 10.1038/s41576-023-00598-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Accepted: 03/27/2023] [Indexed: 05/11/2023]
Abstract
Genome-wide association studies (GWAS) have linked hundreds of thousands of sequence variants in the human genome to common traits and diseases. However, translating this knowledge into a mechanistic understanding of disease-relevant biology remains challenging, largely because such variants are predominantly in non-protein-coding sequences that still lack functional annotation at cell-type resolution. Recent advances in single-cell epigenomics assays have enabled the generation of cell type-, subtype- and state-resolved maps of the epigenome in heterogeneous human tissues. These maps have facilitated cell type-specific annotation of candidate cis-regulatory elements and their gene targets in the human genome, enhancing our ability to interpret the genetic basis of common traits and diseases.
|
14
|
Blood cell traits' GWAS loci colocalization with variation in PU.1 genomic occupancy prioritizes causal noncoding regulatory variants. CELL GENOMICS 2023; 3:100327. [PMID: 37492098 PMCID: PMC10363807 DOI: 10.1016/j.xgen.2023.100327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 02/10/2023] [Accepted: 04/25/2023] [Indexed: 07/27/2023]
Abstract
Genome-wide association studies (GWASs) have uncovered numerous trait-associated loci across the human genome, most of which are located in noncoding regions, making interpretation difficult. Moreover, causal variants are hard to statistically fine-map at many loci because of widespread linkage disequilibrium. To address this challenge, we present a strategy utilizing transcription factor (TF) binding quantitative trait loci (bQTLs) for colocalization analysis to identify trait associations likely mediated by TF occupancy variation and to pinpoint likely causal variants using motif scores. We applied this approach to PU.1 bQTLs in lymphoblastoid cell lines and blood cell trait GWAS data. Colocalization analysis revealed 69 blood cell trait GWAS loci putatively driven by PU.1 occupancy variation. We nominate PU.1 motif-altering variants as the likely shared causal variants at 51 loci. Such integration of TF bQTL data with other GWAS data may reveal transcriptional regulatory mechanisms and causal noncoding variants underlying additional complex traits.
|
15
|
Fine-mapping across diverse ancestries drives the discovery of putative causal variants underlying human complex traits and diseases. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.01.07.23284293. [PMID: 36711496 PMCID: PMC9882563 DOI: 10.1101/2023.01.07.23284293] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 06/18/2023]
Abstract
Genome-wide association studies (GWAS) of human complex traits or diseases often implicate genetic loci that span hundreds or thousands of genetic variants, many of which have similar statistical significance. While statistical fine-mapping in individuals of European ancestries has made important discoveries, cross-population fine-mapping has the potential to improve power and resolution by capitalizing on the genomic diversity across ancestries. Here we present SuSiEx, an accurate and computationally efficient method for cross-population fine-mapping, which builds on the single-population fine-mapping framework, Sum of Single Effects (SuSiE). SuSiEx integrates data from an arbitrary number of ancestries, explicitly models population-specific allele frequencies and LD patterns, accounts for multiple causal variants in a genomic region, and can be applied to GWAS summary statistics. We comprehensively evaluated SuSiEx using simulations, a range of quantitative traits measured in both UK Biobank and Taiwan Biobank, and schizophrenia GWAS across East Asian and European ancestries. In all evaluations, SuSiEx fine-mapped more association signals, produced smaller credible sets and higher posterior inclusion probability (PIP) for putative causal variants, and captured population-specific causal variants.
|
16
|
An atlas of healthy and injured cell states and niches in the human kidney. Nature 2023; 619:585-594. [PMID: 37468583 PMCID: PMC10356613 DOI: 10.1038/s41586-023-05769-3] [Citation(s) in RCA: 63] [Impact Index Per Article: 63.0] [Received: 07/31/2021] [Accepted: 01/30/2023] [Indexed: 07/21/2023]
Abstract
Understanding kidney disease relies on defining the complexity of cell types and states, their associated molecular profiles and interactions within tissue neighbourhoods [1]. Here we applied multiple single-cell and single-nucleus assays (>400,000 nuclei or cells) and spatial imaging technologies to a broad spectrum of healthy reference kidneys (45 donors) and diseased kidneys (48 patients). This has provided a high-resolution cellular atlas of 51 main cell types, which include rare and previously undescribed cell populations. The multi-omic approach provides detailed transcriptomic profiles, regulatory factors and spatial localizations spanning the entire kidney. We also define 28 cellular states across nephron segments and interstitium that were altered in kidney injury, encompassing cycling, adaptive (successful or maladaptive repair), transitioning and degenerative states. Molecular signatures permitted the localization of these states within injury neighbourhoods using spatial transcriptomics, while large-scale 3D imaging analysis (around 1.2 million neighbourhoods) provided corresponding linkages to active immune responses. These analyses defined biological pathways that are relevant to injury time-course and niches, including signatures underlying epithelial repair that predicted maladaptive states associated with a decline in kidney function. This integrated multimodal spatial cell atlas of healthy and diseased human kidneys represents a comprehensive benchmark of cellular states, neighbourhoods, outcome-associated signatures and publicly available interactive visualizations.
|
17
|
3D genome organization and epigenetic regulation in autoimmune diseases. Front Immunol 2023; 14:1196123. [PMID: 37346038 PMCID: PMC10279977 DOI: 10.3389/fimmu.2023.1196123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 05/17/2023] [Indexed: 06/23/2023] Open
Abstract
Three-dimensional (3D) genomics is an emerging field of research that investigates the relationship between gene regulatory function and the spatial structure of chromatin. Chromatin folding can be studied using chromosome conformation capture (3C) technology and 3C-based derivative sequencing technologies, including chromosome conformation capture-on-chip (4C), chromosome conformation capture carbon copy (5C), and high-throughput chromosome conformation capture (Hi-C), which allow scientists to capture 3D conformations from a single site to the entire genome. A comprehensive analysis of the relationships between various regulatory components and gene function also requires the integration of multi-omics data such as genomics, transcriptomics, and epigenomics. 3D genome folding is involved in immune cell differentiation, activation, and dysfunction and participates in a wide range of diseases, including autoimmune diseases. We describe hierarchical 3D chromatin organization in this review and conclude with characteristics of C-techniques and multi-omics applications of the 3D genome. In addition, we describe the relationship between 3D genome structure and the differentiation and maturation of immune cells and address how changes in chromosome folding contribute to autoimmune diseases.
|
18
|
Discovery of target genes and pathways at GWAS loci by pooled single-cell CRISPR screens. Science 2023; 380:eadh7699. [PMID: 37141313 PMCID: PMC10518238 DOI: 10.1126/science.adh7699] [Citation(s) in RCA: 36] [Impact Index Per Article: 36.0] [Received: 03/13/2023] [Accepted: 04/20/2023] [Indexed: 05/06/2023]
Abstract
Most variants associated with complex traits and diseases identified by genome-wide association studies (GWAS) map to noncoding regions of the genome with unknown effects. Using ancestrally diverse, biobank-scale GWAS data, massively parallel CRISPR screens, and single-cell transcriptomic and proteomic sequencing, we discovered 124 cis-target genes of 91 noncoding blood trait GWAS loci. Using precise variant insertion through base editing, we connected specific variants with gene expression changes. We also identified trans-effect networks of noncoding loci when cis target genes encoded transcription factors or microRNAs. Networks were themselves enriched for GWAS variants and demonstrated polygenic contributions to complex traits. This platform enables massively parallel characterization of the target genes and mechanisms of human noncoding variants in both cis and trans.
|
19
|
Identification of a genomic DNA sequence that quantitatively modulates KLF1 transcription factor expression in differentiating human hematopoietic cells. Sci Rep 2023; 13:7589. [PMID: 37165057 PMCID: PMC10172341 DOI: 10.1038/s41598-023-34805-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2022] [Accepted: 05/08/2023] [Indexed: 05/12/2023] Open
Abstract
The onset of erythropoiesis is under strict developmental control, with direct and indirect inputs influencing its derivation from the hematopoietic stem cell. A major regulator of this transition is KLF1/EKLF, a zinc finger transcription factor that plays a global role in all aspects of erythropoiesis. Here, we have identified a short, conserved enhancer element in KLF1 intron 1 that is important for establishing optimal levels of KLF1 in mouse and human cells. Chromatin accessibility of this site exhibits cell-type specificity and is under developmental control during the differentiation of human CD34+ cells towards the erythroid lineage. This site binds GATA1, SMAD1, TAL1, and ETV6. In vivo editing of this region in cell lines and primary cells reduces KLF1 expression quantitatively. However, we find that, similar to observations seen in pedigrees of families with KLF1 mutations, downstream effects are variable, suggesting that the global architecture of the site is buffered towards keeping the KLF1 genetic region in an active state. We propose that modification of intron 1 in both alleles is not equivalent to complete loss of function of one allele.
|
20
|
AdaLiftOver: high-resolution identification of orthologous regulatory elements with Adaptive liftOver. Bioinformatics 2023; 39:btad149. [PMID: 37004197 PMCID: PMC10085516 DOI: 10.1093/bioinformatics/btad149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Revised: 03/02/2023] [Accepted: 03/20/2023] [Indexed: 04/03/2023] Open
Abstract
MOTIVATION Elucidating functionally similar orthologous regulatory regions for human and model organism genomes is critical for exploiting model organism research and advancing our understanding of results from genome-wide association studies (GWAS). Sequence conservation is the de facto approach for finding orthologous non-coding regions between human and model organism genomes. However, existing methods for mapping non-coding genomic regions across species are hampered by multi-mapping, low precision, and low mapping rates. RESULTS We develop Adaptive liftOver (AdaLiftOver), a large-scale computational tool for identifying functionally similar orthologous non-coding regions across species. AdaLiftOver builds on the UCSC liftOver framework to extend the query regions and prioritizes the resulting candidate target regions based on the conservation of the epigenomic and the sequence grammar features. Evaluations of AdaLiftOver with multiple case studies, spanning both genomic intervals from epigenome datasets across a wide range of model organisms and GWAS SNPs, show AdaLiftOver to be a versatile method for deriving hard-to-obtain human epigenome datasets as well as reliably identifying orthologous loci for GWAS SNPs. AVAILABILITY AND IMPLEMENTATION The R package and the data for AdaLiftOver are available from https://github.com/keleslab/AdaLiftOver.
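The two steps named in this abstract, extending a query region and prioritizing candidate target regions, can be illustrated with a minimal Python sketch. It is not the AdaLiftOver R package or its API: the `Region` and `Candidate` types, the flank size, the equal weighting, and the similarity values are all hypothetical, and the cross-species mapping step itself is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    chrom: str
    start: int
    end: int


@dataclass(frozen=True)
class Candidate:
    region: Region
    epigenome_similarity: float  # hypothetical score in [0, 1]
    grammar_similarity: float    # hypothetical score in [0, 1]


def extend(region: Region, flank: int) -> Region:
    """Widen a query region by `flank` bases on each side, clamped at 0."""
    return Region(region.chrom, max(0, region.start - flank), region.end + flank)


def rank_candidates(candidates, weight=0.5):
    """Sort candidates by a weighted sum of the two similarity scores.

    The linear combination and the default weight are assumptions made
    for illustration only.
    """
    def combined(c: Candidate) -> float:
        return weight * c.epigenome_similarity + (1 - weight) * c.grammar_similarity

    return sorted(candidates, key=combined, reverse=True)


query = extend(Region("chr1", 100, 200), flank=150)
ranked = rank_candidates([
    Candidate(Region("chr4", 5_000, 5_300), 0.9, 0.2),
    Candidate(Region("chr4", 9_000, 9_300), 0.6, 0.7),
])
```

With equal weights, the second candidate ranks first because its combined score (0.65) exceeds that of the first (0.55).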
|
21
|
Colocalization of blood cell traits GWAS associations and variation in PU.1 genomic occupancy prioritizes causal noncoding regulatory variants. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.29.534582. [PMID: 37034747 PMCID: PMC10081269 DOI: 10.1101/2023.03.29.534582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Genome-wide association studies (GWAS) have uncovered numerous trait-associated loci across the human genome, most of which are located in noncoding regions, making interpretations difficult. Moreover, causal variants are hard to statistically fine-map at many loci because of widespread linkage disequilibrium. To address this challenge, we present a strategy utilizing transcription factor (TF) binding quantitative trait loci (bQTLs) for colocalization analysis to identify trait associations likely mediated by TF occupancy variation and to pinpoint likely causal variants using motif scores. We applied this approach to PU.1 bQTLs in lymphoblastoid cell lines and blood cell traits GWAS data. Colocalization analysis revealed 69 blood cell trait GWAS loci putatively driven by PU.1 occupancy variation. We nominate PU.1 motif-altering variants as the likely shared causal variants at 51 loci. Such integration of TF bQTL data with other GWAS data may reveal transcriptional regulatory mechanisms and causal noncoding variants underlying additional complex traits.
|
22
|
An autoimmune pleiotropic SNP modulates IRF5 alternative promoter usage through ZBTB3-mediated chromatin looping. Nat Commun 2023; 14:1208. [PMID: 36869052 PMCID: PMC9984425 DOI: 10.1038/s41467-023-36897-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Accepted: 02/22/2023] [Indexed: 03/05/2023] Open
Abstract
Genetic sharing is extensively observed for autoimmune diseases, but the causal variants and their underlying molecular mechanisms remain largely unknown. Through systematic investigation of autoimmune disease pleiotropic loci, we found most of these shared genetic effects are transmitted from regulatory code. We used an evidence-based strategy to functionally prioritize causal pleiotropic variants and identify their target genes. A top-ranked pleiotropic variant, rs4728142, yielded many lines of evidence as being causal. Mechanistically, the rs4728142-containing region interacts with the IRF5 alternative promoter in an allele-specific manner and orchestrates its upstream enhancer to regulate IRF5 alternative promoter usage through chromatin looping. A putative structural regulator, ZBTB3, mediates the allele-specific loop to promote IRF5-short transcript expression at the rs4728142 risk allele, resulting in IRF5 overactivation and M1 macrophage polarization. Together, our findings establish a causal mechanism between the regulatory variant and fine-scale molecular phenotype underlying the dysfunction of pleiotropic genes in human autoimmunity.
|
23
|
Hacking hematopoiesis - emerging tools for examining variant effects. Dis Model Mech 2023; 16:288409. [PMID: 36826849 PMCID: PMC9983777 DOI: 10.1242/dmm.049857] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2023] Open
Abstract
Hematopoiesis is a continuous process of blood and immune cell production. It is orchestrated by thousands of gene products that respond to extracellular signals by guiding cell fate decisions to meet the needs of the organism. Although much of our knowledge of this process comes from work in model systems, we have learned a great deal from studies on human genetic variation. Considerable insight has emerged from studies on presumed monogenic blood disorders, which continue to provide key insights into the mechanisms critical for hematopoiesis. Furthermore, the emergence of large-scale biobanks and cohorts has uncovered thousands of genomic loci associated with blood cell traits and diseases. Some of these blood cell trait-associated loci act as modifiers of what were once thought to be monogenic blood diseases. However, most of these loci await functional validation. Here, we discuss the validation bottleneck and emerging methods to more effectively connect variant to function. In particular, we highlight recent innovations in genome editing, which have paved the path forward for high-throughput functional assessment of loci. Finally, we discuss existing barriers to progress, including challenges in manipulating the genomes of primary hematopoietic cells.
|
24
|
A genetic disorder reveals a hematopoietic stem cell regulatory network co-opted in leukemia. Nat Immunol 2023; 24:69-83. [PMID: 36522544 PMCID: PMC9810535 DOI: 10.1038/s41590-022-01370-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 02/22/2022] [Accepted: 10/25/2022] [Indexed: 12/23/2022]
Abstract
The molecular regulation of human hematopoietic stem cell (HSC) maintenance is therapeutically important, but limitations in experimental systems and interspecies variation have constrained our knowledge of this process. Here, we have studied a rare genetic disorder due to MECOM haploinsufficiency, characterized by an early-onset absence of HSCs in vivo. By generating a faithful model of this disorder in primary human HSCs and coupling functional studies with integrative single-cell genomic analyses, we uncover a key transcriptional network involving hundreds of genes that is required for HSC maintenance. Through our analyses, we nominate cooperating transcriptional regulators and identify how MECOM prevents the CTCF-dependent genome reorganization that occurs as HSCs differentiate. We show that this transcriptional network is co-opted in high-risk leukemias, thereby enabling these cancers to acquire stem cell properties. Collectively, we illuminate a regulatory network necessary for HSC self-renewal through the study of a rare experiment of nature.
|
25
|
Abstract
Meta-analysis is pervasively used to combine multiple genome-wide association studies (GWASs). Fine-mapping of meta-analysis studies is typically performed as in a single-cohort study. Here, we first demonstrate that heterogeneity (e.g., of sample size, phenotyping, imputation) hurts calibration of meta-analysis fine-mapping. We propose a summary statistics-based quality-control (QC) method, suspicious loci analysis of meta-analysis summary statistics (SLALOM), that identifies suspicious loci for meta-analysis fine-mapping by detecting outliers in association statistics. We validate SLALOM in simulations and the GWAS Catalog. Applying SLALOM to 14 meta-analyses from the Global Biobank Meta-analysis Initiative (GBMI), we find that 67% of loci show suspicious patterns that call into question fine-mapping accuracy. These predicted suspicious loci are significantly depleted for having nonsynonymous variants as lead variant (2.7×; Fisher's exact p = 7.3 × 10⁻⁴). We find limited evidence of fine-mapping improvement in the GBMI meta-analyses compared with individual biobanks. We urge extreme caution when interpreting fine-mapping results from meta-analysis of heterogeneous cohorts.
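The depletion result above is reported with a Fisher's exact test. As a minimal sketch of how such a p-value is computed for a 2×2 table, the Python below sums hypergeometric probabilities over all tables with the same margins that are no more likely than the observed one. The function name and the example counts are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is as likely as, or less likely than, the observed
    # one; the small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows are suspicious / non-suspicious loci, columns are
# lead variant nonsynonymous / not nonsynonymous.
p_value = fisher_exact_two_sided(5, 95, 15, 85)
```

For production analyses, an established implementation such as `scipy.stats.fisher_exact` is a better choice than a hand-rolled version.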
|
26
|
Current challenges in understanding the role of enhancers in disease. Nat Struct Mol Biol 2022; 29:1148-1158. [PMID: 36482255 DOI: 10.1038/s41594-022-00896-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 09/30/2022] [Accepted: 11/04/2022] [Indexed: 12/13/2022]
Abstract
Enhancers play a central role in the spatiotemporal control of gene expression and tend to work in a cell-type-specific manner. In addition, they are suggested to be major contributors to phenotypic variation, evolution and disease. There is growing evidence that enhancer dysfunction due to genetic, structural or epigenetic mechanisms contributes to a broad range of human diseases referred to as enhanceropathies. Such mechanisms often underlie the susceptibility to common diseases, but can also play a direct causal role in cancer or Mendelian diseases. Despite the recent gain of insights into enhancer biology and function, we still have a limited ability to predict how enhancer dysfunction impacts gene expression. Here we discuss the major challenges that need to be overcome when studying the role of enhancers in disease etiology and highlight opportunities and directions for future studies, aiming to disentangle the molecular basis of enhanceropathies.
|
27
|
Multi-ancestry genome-wide association analyses identify novel genetic mechanisms in rheumatoid arthritis. Nat Genet 2022; 54:1640-1651. [PMID: 36333501 PMCID: PMC10165422 DOI: 10.1038/s41588-022-01213-w] [Citation(s) in RCA: 53] [Impact Index Per Article: 26.5] [Received: 12/01/2021] [Accepted: 09/26/2022] [Indexed: 11/06/2022]
Abstract
Rheumatoid arthritis (RA) is a highly heritable complex disease with unknown etiology. Multi-ancestry genetic research of RA promises to improve power to detect genetic signals, fine-mapping resolution and performances of polygenic risk scores (PRS). Here, we present a large-scale genome-wide association study (GWAS) of RA, which includes 276,020 samples from five ancestral groups. We conducted a multi-ancestry meta-analysis and identified 124 loci (P < 5 × 10⁻⁸), of which 34 are novel. Candidate genes at the novel loci suggest essential roles of the immune system (for example, TNIP2 and TNFRSF11A) and joint tissues (for example, WISP1) in RA etiology. Multi-ancestry fine-mapping identified putatively causal variants with biological insights (for example, LEF1). Moreover, PRS based on multi-ancestry GWAS outperformed PRS based on single-ancestry GWAS and had comparable performance between populations of European and East Asian ancestries. Our study provides several insights into the etiology of RA and improves the genetic predictability of RA.
|
28
|
Influences of rare copy-number variation on human complex traits. Cell 2022; 185:4233-4248.e27. [PMID: 36306736 PMCID: PMC9800003 DOI: 10.1016/j.cell.2022.09.028] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/19/2021] [Revised: 07/22/2022] [Accepted: 09/19/2022] [Indexed: 11/06/2022]
Abstract
The human genome contains hundreds of thousands of regions harboring copy-number variants (CNV). However, the phenotypic effects of most such polymorphisms are unknown because only larger CNVs have been ascertainable from SNP-array data generated by large biobanks. We developed a computational approach leveraging haplotype sharing in biobank cohorts to more sensitively detect CNVs. Applied to UK Biobank, this approach accounted for approximately half of all rare gene inactivation events produced by genomic structural variation. This CNV call set enabled a detailed analysis of associations between CNVs and 56 quantitative traits, identifying 269 independent associations (p < 5 × 10⁻⁸) likely to be causally driven by CNVs. Putative target genes were identifiable for nearly half of the loci, enabling insights into dosage sensitivity of these genes and uncovering several gene-trait relationships. These results demonstrate the ability of haplotype-informed analysis to provide insights into the genetic basis of human complex traits.
|
29
|
Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics. Nat Genet 2022; 54:1479-1492. [PMID: 36175791 PMCID: PMC9910198 DOI: 10.1038/s41588-022-01187-9] [Citation(s) in RCA: 46] [Impact Index Per Article: 23.0] [Received: 05/17/2021] [Accepted: 08/18/2022] [Indexed: 12/13/2022]
Abstract
Genome-wide association studies provide a powerful means of identifying loci and genes contributing to disease, but in many cases, the related cell types/states through which genes confer disease risk remain unknown. Deciphering such relationships is important for identifying pathogenic processes and developing therapeutics. In the present study, we introduce sc-linker, a framework for integrating single-cell RNA-sequencing, epigenomic SNP-to-gene maps and genome-wide association study summary statistics to infer the underlying cell types and processes by which genetic variants influence disease. The inferred disease enrichments recapitulated known biology and highlighted notable cell-disease relationships, including γ-aminobutyric acid-ergic neurons in major depressive disorder, a disease-dependent M-cell program in ulcerative colitis and a disease-specific complement cascade process in multiple sclerosis. In autoimmune disease, both healthy and disease-dependent immune cell-type programs were associated, whereas only disease-dependent epithelial cell programs were prominent, suggesting a role in disease response rather than initiation. Our framework provides a powerful approach for identifying the cell types and cellular processes by which genetic variants influence disease.
|
30
|
Abstract
Single-cell sequencing is a powerful approach that can detect genetic alterations and their phenotypic consequences in the context of human development, with cellular resolution. Humans start out as single-cell zygotes and undergo fission and differentiation to develop into multicellular organisms. Before fertilisation and during development, the cellular genome acquires hundreds of mutations that propagate down the cell lineage. Whether germline or somatic in nature, some of these mutations may have significant genotypic impact and lead to diseased cellular phenotypes, either systemically or confined to a tissue. Single-cell sequencing enables the detection and monitoring of the genotype and the consequent molecular phenotypes at a cellular resolution. It offers powerful tools to compare the cellular lineage between 'normal' and 'diseased' conditions and to establish genotype-phenotype relationships. By preserving cellular heterogeneity, single-cell sequencing, unlike bulk-sequencing, allows the detection of even small, diseased subpopulations of cells within an otherwise normal tissue. Indeed, the characterisation of biopsies with cellular resolution can provide a mechanistic view of the disease. While single-cell approaches are currently used mainly in basic research, it can be expected that applications of these technologies in the clinic may aid the detection, diagnosis and eventually the treatment of rare genetic diseases as well as cancer. This review article provides an overview of the single-cell sequencing technologies in the context of human genetics, with an aim to empower clinicians to understand and interpret the single-cell sequencing data and analyses. We discuss the state-of-the-art experimental and analytical workflows and highlight current challenges/limitations. 
Notably, we focus on two prospective applications of the technology in human genetics, namely the annotation of the non-coding genome using single-cell functional genomics and the use of single-cell sequencing data for in silico variant prioritisation.
|
31
|
Genome-wide analyses of 200,453 individuals yield new insights into the causes and consequences of clonal hematopoiesis. Nat Genet 2022; 54:1155-1166. [PMID: 35835912 PMCID: PMC9355874 DOI: 10.1038/s41588-022-01121-z] [Citation(s) in RCA: 93] [Impact Index Per Article: 46.5] [Received: 10/19/2021] [Accepted: 06/06/2022] [Indexed: 12/14/2022]
Abstract
Clonal hematopoiesis (CH), the clonal expansion of a blood stem cell and its progeny driven by somatic driver mutations, affects over a third of people, yet remains poorly understood. Here we analyze genetic data from 200,453 UK Biobank participants to map the landscape of inherited predisposition to CH, increasing the number of germline associations with CH in European-ancestry populations from 4 to 14. Genes at new loci implicate DNA damage repair (PARP1, ATM, CHEK2), hematopoietic stem cell migration/homing (CD164) and myeloid oncogenesis (SETBP1). Several associations were CH-subtype-specific including variants at TCL1A and CD164 that had opposite associations with DNMT3A- versus TET2-mutant CH, the two most common CH subtypes, proposing key roles for these two loci in CH development. Mendelian randomization analyses showed that smoking and longer leukocyte telomere length are causal risk factors for CH and that genetic predisposition to CH increases risks of myeloproliferative neoplasia, nonhematological malignancies, atrial fibrillation and blood epigenetic ageing.
|
32
|
Predicting causal genes from psychiatric genome-wide association studies using high-level etiological knowledge. Mol Psychiatry 2022; 27:3095-3106. [PMID: 35411039 DOI: 10.1038/s41380-022-01542-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/01/2021] [Revised: 03/08/2022] [Accepted: 03/21/2022] [Indexed: 12/24/2022]
Abstract
Genome-wide association studies have discovered hundreds of genomic loci associated with psychiatric traits, but the causal genes underlying these associations are often unclear, a research gap that has hindered clinical translation. Here, we present a Psychiatric Omnilocus Prioritization Score (PsyOPS) derived from just three binary features encapsulating high-level assumptions about psychiatric disease etiology - namely, that causal psychiatric disease genes are likely to be mutationally constrained, be specifically expressed in the brain, and overlap with known neurodevelopmental disease genes. To our knowledge, PsyOPS is the first method specifically tailored to prioritizing causal genes at psychiatric GWAS loci. We show that, despite its extreme simplicity, PsyOPS achieves state-of-the-art performance at this task, comparable to a prior domain-agnostic approach relying on tens of thousands of features. Genes prioritized by PsyOPS are substantially more likely than other genes at the same loci to have convergent evidence of direct regulation by the GWAS variant according to both DNA looping assays and expression or splicing quantitative trait locus (QTL) maps. We provide examples of genes hundreds of kilobases away from the lead variant, like GABBR1 for schizophrenia, that are prioritized by all three of PsyOPS, DNA looping and QTLs. Our results underscore the power of incorporating high-level knowledge of trait etiology into causal gene prediction at GWAS loci, and comprise a resource for researchers interested in experimentally characterizing psychiatric gene candidates.
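To make the idea of a score built from three binary features concrete, here is a minimal Python sketch. It assumes a logistic form with hand-picked weights and an intercept, none of which come from the abstract; the feature keys, gene names, and function names are hypothetical, and this is not the published PsyOPS implementation.

```python
from math import exp

# Hypothetical coefficients: the abstract does not give fitted values.
WEIGHTS = {
    "mutationally_constrained": 1.0,
    "brain_specific_expression": 1.0,
    "neurodevelopmental_gene": 1.0,
}
INTERCEPT = -2.0


def gene_score(features: dict) -> float:
    """Map three binary features to a score in (0, 1) via a logistic function."""
    z = INTERCEPT + sum(w * int(bool(features[name])) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + exp(-z))


def prioritize(locus_genes: dict) -> list:
    """Return the gene names at one locus, highest score first."""
    return sorted(locus_genes, key=lambda g: gene_score(locus_genes[g]), reverse=True)


# Placeholder genes at a single hypothetical locus.
locus = {
    "GENE_A": {"mutationally_constrained": 0, "brain_specific_expression": 0,
               "neurodevelopmental_gene": 0},
    "GENE_B": {"mutationally_constrained": 1, "brain_specific_expression": 1,
               "neurodevelopmental_gene": 1},
    "GENE_C": {"mutationally_constrained": 1, "brain_specific_expression": 0,
               "neurodevelopmental_gene": 0},
}
ranking = prioritize(locus)
```

Because every feature is binary, a model of this shape can take at most eight distinct score values, which is part of what makes such an approach easy to inspect.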
|
33
|
Epigenomic and transcriptomic analyses define core cell types, genes and targetable mechanisms for kidney disease. Nat Genet 2022; 54:950-962. [PMID: 35710981 DOI: 10.1038/s41588-022-01097-w] [Citation(s) in RCA: 57] [Impact Index Per Article: 28.5] [Received: 08/13/2021] [Accepted: 05/09/2022] [Indexed: 12/29/2022]
Abstract
More than 800 million people suffer from kidney disease, yet the mechanism of kidney dysfunction is poorly understood. In the present study, we define the genetic association with kidney function in 1.5 million individuals and identify 878 (126 new) loci. We map the genotype effect on the methylome in 443 kidneys, transcriptome in 686 samples and single-cell open chromatin in 57,229 kidney cells. Heritability analysis reveals that methylation variation explains a larger fraction of heritability than gene expression. We present a multi-stage prioritization strategy and prioritize target genes for 87% of kidney function loci. We highlight key roles of proximal tubules and metabolism in kidney function regulation. Furthermore, the causal role of SLC47A1 in kidney disease is defined in mice with genetic loss of Slc47a1 and in human individuals carrying loss-of-function variants. Our findings emphasize the key role of bulk and single-cell epigenomic information in translating genome-wide association studies into identifying causal genes, cellular origins and mechanisms of complex traits.
|
34
|
Prioritizing risk genes as novel stratification biomarkers for acute monocytic leukemia by integrative analysis. Discov Oncol 2022; 13:55. [PMID: 35771283 PMCID: PMC9247126 DOI: 10.1007/s12672-022-00516-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/14/2022] [Accepted: 06/08/2022] [Indexed: 12/13/2022] Open
Abstract
Acute myeloid leukemia (AML) is a highly heterogeneous blood cancer that is stratified into M0-M7 subtypes in the French-American-British (FAB) diagnosis system. Improved diagnosis that leverages key molecular inputs will assist precision medicine. Through in-depth analysis of the transcriptomic data and mutations of AML, we report that a modern clustering algorithm, t-distributed Stochastic Neighbor Embedding (t-SNE), successfully demarcates M2, M3 and M5 territories, while M4 is biased toward M5 and M0 and M1 are biased toward M2, consistent with the traditional FAB classification. Combined with mutation profiles, the results show that top recurrent AML mutations were allocated without bias into the M2 and M5 territories, indicating that the t-SNE-instructed transcriptomic stratification profoundly outperforms mutation profiling in the FAB system. Further functional data mining prioritizes several myeloid-specific genes as potential regulators of AML progression and treatment by Venetoclax, a BCL2 inhibitor. Among them, two encode membrane proteins, LILRB4 and LRRC25, which could be utilized as cell surface biomarkers for monocytic AML or as candidates for innovative immunotherapy in future. In summary, our deep functional data-mining analysis supports several unappreciated immune signaling-encoding genes as novel diagnostic biomarkers and potential therapeutic targets.
|
35
|
Phylogeographic dynamics of the arthropod vector, the blacklegged tick (Ixodes scapularis). Parasit Vectors 2022; 15:238. [PMID: 35765050 PMCID: PMC9241328 DOI: 10.1186/s13071-022-05304-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Accepted: 04/15/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The emergence of vector-borne pathogens in novel geographic areas is regulated by the migration of their arthropod vectors. Blacklegged ticks (Ixodes scapularis) and the pathogens they vector, including the causative agents of Lyme disease, babesiosis and anaplasmosis, continue to grow in their population sizes and to expand in geographic range. Migration of this vector over the previous decades has been implicated as the cause of the re-emergence of the most prevalent infectious diseases in North America. METHODS We systematically collected ticks from across New York State (hereafter referred to as New York) from 2004 to 2017 as part of routine tick-borne pathogen surveillance in the state. This time frame corresponds with an increase in range and incidence of tick-borne diseases within New York. We randomly sampled ticks from this collection to explore the evolutionary history and population dynamics of I. scapularis. We sequenced the mitochondrial genomes of each tick to characterize their current and historical spatial genetic structure and population growth using phylogeographic methods. RESULTS We sequenced whole mitochondrial genomes from 277 ticks collected across New York between 2004 and 2017. We found evidence of population genetic structure at a broad geographic scale due to differences in the relative abundance, but not the composition, of haplotypes among sampled ticks. Ticks were often most closely related to ticks from the same and nearby collection sites. The data indicate that both short- and long-range migration events shape the population dynamics of blacklegged ticks in New York. CONCLUSIONS We detailed the population dynamics of the blacklegged tick (Ixodes scapularis) in New York during a time frame in which tick-borne diseases were increasing in range and incidence. Migration of ticks occurred at both coarse and fine scales in the recent past despite evidence of limits to gene flow. 
Past and current tick population dynamics have implications for further range expansion as habitat suitability for ticks changes due to global climate change. Analyses of mitochondrial genome sequencing data will expound upon previously identified drivers of tick presence and abundance as well as identify additional drivers. These data provide a foundation on which to generate testable hypotheses on the drivers of tick population dynamics occurring at finer scales.
|
36
|
Dissection of multiple sclerosis genetics identifies B and CD4+ T cells as driver cell subsets. Genome Biol 2022; 23:127. [PMID: 35672799 PMCID: PMC9175345 DOI: 10.1186/s13059-022-02694-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2021] [Accepted: 05/16/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Multiple sclerosis (MS) is an autoimmune condition of the central nervous system with a well-characterized genetic background. Prior analyses of MS genetics have identified broad enrichments across peripheral immune cells, yet the driver immune subsets are unclear. RESULTS We utilize chromatin accessibility data across hematopoietic cells to identify cell type-specific enrichments of MS genetic signals. We find that CD4 T and B cells are independently enriched for MS genetics and further refine the driver subsets to Th17 and memory B cells, respectively. We replicate our findings in data from untreated and treated MS patients and find that immunomodulatory treatments suppress chromatin accessibility at driver cell types. Integration of statistical fine-mapping and chromatin interactions nominates numerous putative causal genes, illustrating complex interplay between shared and cell-specific genes. CONCLUSIONS Overall, our study finds that open chromatin regions in CD4 T cells and B cells independently drive MS genetic signals. Our study highlights how careful integration of genetics and epigenetics can provide fine-scale insights into causal cell types and nominate new genes and pathways for disease.
|
37
|
Whole-Genome Amplification - Surveying Yield, Reproducibility, and Heterozygous Balance, Reported by STR-Targeting MIPs. Int J Mol Sci 2022; 23:ijms23116161. [PMID: 35682839 PMCID: PMC9181316 DOI: 10.3390/ijms23116161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Revised: 05/27/2022] [Accepted: 05/27/2022] [Indexed: 02/01/2023] Open
Abstract
Whole-genome amplification is a crucial first step in nearly all single-cell genomic analyses, with the following steps focused on its products. Bias and variance caused by the whole-genome amplification process add numerous challenges to the world of single-cell genomics. Short tandem repeats are sensitive genomic markers used widely in population genetics, forensics, and retrospective lineage tracing. A previous evaluation of common whole-genome amplification targeting ~1000 non-autosomal short tandem repeat loci is extended here to ~12,000 loci across the entire genome via duplex molecular inversion probes. Other than its improved scale and reduced noise, this system detects an abundance of heterogeneous short tandem repeat loci, allowing the allelic balance to be reported. We show here that while the best overall yield is obtained using RepliG-SC, uniformity between alleles and reproducibility across cells are maximized by Ampli1, rendering it the best candidate for the comparative heterozygous analysis of single-cell genomes.
|
38
|
Network Control Models With Personalized Genomics Data for Understanding Tumor Heterogeneity in Cancer. Front Oncol 2022; 12:891676. [PMID: 35712516 PMCID: PMC9195174 DOI: 10.3389/fonc.2022.891676] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/08/2022] [Accepted: 04/12/2022] [Indexed: 11/25/2022] Open
Abstract
The rapid development of high-throughput sequencing and biotechnology has brought new opportunities and challenges in developing efficient computational methods for exploring personalized genomics data of cancer patients. Because these personalized genomics data are high-dimensional with small sample sizes, it is difficult to extract useful information from them using traditional statistical methods. In the past few years, network control methods have been proposed to analyze networked systems with high dimension and small sample size. Researchers have made progress in the design and optimization of network control principles. However, few studies have comprehensively surveyed network control methods for analyzing the biomolecular network data of individual patients. To address this gap, we comprehensively survey complex network control methods applied to personalized omics data for understanding tumor heterogeneity in precision medicine for individual patients with cancer.
|
39
|
Molecular and cellular mechanisms that regulate human erythropoiesis. Blood 2022; 139:2450-2459. [PMID: 34936695 PMCID: PMC9029096 DOI: 10.1182/blood.2021011044] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 09/23/2021] [Accepted: 12/15/2021] [Indexed: 12/03/2022] Open
Abstract
To enable effective oxygen transport, ∼200 billion red blood cells (RBCs) need to be produced every day in the bone marrow through the fine-tuned process of erythropoiesis. Erythropoiesis is regulated at multiple levels to ensure that defective RBC maturation or overproduction can be avoided. Here, we provide an overview of different layers of this control, ranging from cytokine signaling mechanisms that enable extrinsic regulation of RBC production to intrinsic transcriptional pathways necessary for effective erythropoiesis. Recent studies have also elucidated the importance of posttranscriptional regulation and highlighted additional gatekeeping mechanisms necessary for effective erythropoiesis. We additionally discuss the insights gained by studying human genetic variation affecting erythropoiesis and highlight the discovery of BCL11A as a regulator of hemoglobin switching through genetic studies. Finally, we provide an outlook of how our ability to measure multiple facets of this process at single-cell resolution, while accounting for the impact of human variation, will continue to refine our knowledge of erythropoiesis and how this process is perturbed in disease. As we learn more about this intricate and important process, additional opportunities to modulate erythropoiesis for therapeutic purposes will undoubtedly emerge.
|
40
|
Genetic associations at regulatory phenotypes improve fine-mapping of causal variants for 12 immune-mediated diseases. Nat Genet 2022; 54:251-262. [PMID: 35288711 DOI: 10.1038/s41588-022-01025-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 12/20/2019] [Accepted: 01/31/2022] [Indexed: 12/11/2022]
Abstract
The resolution of causal genetic variants informs understanding of disease biology. We used regulatory quantitative trait loci (QTLs) from the BLUEPRINT, GTEx and eQTLGen projects to fine-map putative causal variants for 12 immune-mediated diseases. We identify 340 unique loci that colocalize with high posterior probability (≥98%) with regulatory QTLs and apply Bayesian frameworks to fine-map associations at each locus. We show that fine-mapping credible sets derived from regulatory QTLs are smaller compared to disease summary statistics. Further, they are enriched for more functionally interpretable candidate causal variants and for putatively causal insertion/deletion (INDEL) polymorphisms. Finally, we use massively parallel reporter assays to evaluate candidate causal variants at the ITGA4 locus associated with inflammatory bowel disease. Overall, our findings suggest that fine-mapping applied to disease-colocalizing regulatory QTLs can enhance the discovery of putative causal disease variants and enhance insights into the underlying causal genes and molecular mechanisms.
|
41
|
A single-cell regulatory map of postnatal lung alveologenesis in humans and mice. CELL GENOMICS 2022; 2:100108. [PMID: 35434692 PMCID: PMC9012447 DOI: 10.1016/j.xgen.2022.100108] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/28/2020] [Revised: 05/05/2021] [Accepted: 02/02/2022] [Indexed: 04/14/2023]
Abstract
Ex-utero regulation of the lungs' responses to breathing air and continued alveolar development shape adult respiratory health. Applying single-cell transposome hypersensitive site sequencing (scTHS-seq) to over 80,000 cells, we assembled the first regulatory atlas of postnatal human and mouse lung alveolar development. We defined regulatory modules and elucidated new mechanistic insights directing alveolar septation, including alveolar type 1 and myofibroblast cell signaling and differentiation, and a unique human matrix fibroblast population. Incorporating GWAS, we mapped lung function causal variants to myofibroblasts and identified a pathogenic regulatory unit linked to lineage marker FGF18, demonstrating the utility of chromatin accessibility data to uncover disease mechanism targets. Our regulatory map and analysis model provide valuable new resources to investigate age-dependent and species-specific control of critical developmental processes. Furthermore, these resources complement existing atlas efforts to advance our understanding of lung health and disease across the human lifespan.
|
42
|
Simultaneous dimensionality reduction and integration for single-cell ATAC-seq data using deep learning. NAT MACH INTELL 2022. [DOI: 10.1038/s42256-022-00443-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
Advances in single-cell technologies enable the routine interrogation of chromatin accessibility for tens of thousands of single cells, elucidating gene regulatory processes at an unprecedented resolution. Meanwhile, size, sparsity and high dimensionality of the resulting data continue to pose challenges for its computational analysis, and specifically the integration of data from different sources. We have developed a dedicated computational approach: a variational auto-encoder using a noise model specifically designed for single-cell ATAC-seq (assay for transposase-accessible chromatin with high-throughput sequencing) data, which facilitates simultaneous dimensionality reduction and batch correction via an adversarial learning strategy. We showcase its benefits for detailed cell-type characterization on individual real and simulated datasets as well as for integrating multiple complex datasets.
|
43
|
Body mass index and adipose distribution have opposing genetic impacts on human blood traits. eLife 2022; 11:75317. [PMID: 35166671 PMCID: PMC8884725 DOI: 10.7554/elife.75317] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/05/2021] [Accepted: 02/14/2022] [Indexed: 12/02/2022] Open
Abstract
Body mass index (BMI), hyperlipidemia, and truncal adipose distribution concordantly elevate cardiovascular disease risks, but have unknown genetic effects on blood trait variation. Using Mendelian randomization, we define unexpectedly opposing roles for increased BMI and truncal adipose distribution on blood traits. Elevated genetically determined BMI and lipid levels decreased hemoglobin and hematocrit levels, consistent with clinical observations associating obesity and anemia. We found that lipid-related effects were confined to erythroid traits. In contrast, BMI affected multiple blood lineages, indicating broad effects on hematopoiesis. Increased truncal adipose distribution opposed BMI effects, increasing hemoglobin and blood cell counts across lineages. Conditional analyses indicated genes, pathways, and cell types responsible for these effects, including Leptin Receptor and other blood cell-extrinsic factors in adipocytes and endothelium that regulate hematopoietic stem and progenitor cell biology. Our findings identify novel roles for obesity on hematopoiesis, including a previously underappreciated role for genetically determined adipose distribution in determining blood cell formation and function.
|
44
|
Discovery of genomic loci of the human cerebral cortex using genetically informed brain atlases. Science 2022; 375:522-528. [PMID: 35113692 DOI: 10.1126/science.abe8457] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Indexed: 12/15/2022]
Abstract
To determine the impact of genetic variants on the brain, we used genetically informed brain atlases in genome-wide association studies of regional cortical surface area and thickness in 39,898 adults and 9136 children. We uncovered 440 genome-wide significant loci in the discovery cohort and 800 from a post hoc combined meta-analysis. Loci in adulthood were largely captured in childhood, showing signatures of negative selection, and were linked to early neurodevelopment and pathways associated with neuropsychiatric risk. Opposing gradations of decreased surface area and increased thickness were associated with common inversion polymorphisms. Inferior frontal regions, encompassing Broca's area, which is important for speech, were enriched for human-specific genomic elements. Thus, a mixed genetic landscape of conserved and human-specific features is concordant with brain hierarchy and morphogenetic gradients.
|
45
|
Yu F, Cato LD, Weng C, Liggett LA, Jeon S, Xu K, Chiang CW, Wiemels JL, Weissman JS, de Smith AJ, Sankaran VG. Variant to function mapping at single-cell resolution through network propagation. [PMID: 35118467 PMCID: PMC8811900 DOI: 10.1101/2022.01.23.477426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2022]
Abstract
With burgeoning human disease genetic associations and single-cell genomic atlases covering a range of tissues, there are unprecedented opportunities to systematically gain insights into the mechanisms of disease-causal variation. However, sparsity and noise, particularly in the context of single-cell epigenomic data, hamper the identification of disease- or trait-relevant cell types, states, and trajectories. To overcome these challenges, we have developed the SCAVENGE method, which maps causal variants to their relevant cellular context at single-cell resolution by employing the strategy of network propagation. We demonstrate how SCAVENGE can help identify key biological mechanisms underlying human genetic variation including enrichment of blood traits at distinct stages of human hematopoiesis, defining monocyte subsets that increase the risk for severe coronavirus disease 2019 (COVID-19), and identifying intermediate lymphocyte developmental states that are critical for predisposition to acute leukemia. Our approach not only provides a framework for enabling variant-to-function insights at single-cell resolution, but also suggests a more general strategy for maximizing the inferences that can be made using single-cell genomic data.
|
46
|
eSCAN: scan regulatory regions for aggregate association testing using whole-genome sequencing data. Brief Bioinform 2022; 23:bbab497. [PMID: 34882196 PMCID: PMC8898002 DOI: 10.1093/bib/bbab497] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/15/2021] [Revised: 10/25/2021] [Accepted: 10/30/2021] [Indexed: 02/07/2023] Open
Abstract
Multiple statistical methods for aggregate association testing have been developed for whole-genome sequencing (WGS) data. Many of these methods aggregate variants in a given genomic window and ignore existing knowledge when defining test regions, so that many identified regions are not clearly linked to genes, which limits biological understanding. Functional information from new technologies (such as Hi-C and its derivatives), which can help link enhancers to their effector genes, can be leveraged to predefine variant sets for aggregate testing in WGS data. Here, we propose the eSCAN (scan the enhancers) method for genome-wide assessment of enhancer regions in sequencing studies, combining the advantages of dynamic window selection in SCANG (SCAN the Genome), a previously developed method, with the advantages of incorporating putative regulatory regions from annotation. eSCAN, by searching in putative enhancers, increases statistical power and aids mechanistic interpretation, as demonstrated by extensive simulation studies. We also apply eSCAN to blood cell traits using NHLBI Trans-Omics for Precision Medicine WGS data. Results from real data analysis show that eSCAN is able to capture more significant signals, and that these signals are shorter (indicating higher-resolution fine-mapping capability) and drive the association of larger regions detected by other methods.
|
47
|
Genome-wide association study on 13,167 individuals identifies regulators of blood CD34+ cell levels. Blood 2022; 139:1659-1669. [PMID: 35007327 DOI: 10.1182/blood.2021013220] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/06/2021] [Accepted: 12/11/2021] [Indexed: 11/20/2022] Open
Abstract
Stem cell transplantation is a cornerstone in the treatment of blood malignancies. The most common method to harvest stem cells for transplantation is by leukapheresis, requiring mobilization of CD34+ hematopoietic stem and progenitor cells (HSPC) from the bone marrow into the blood. Identifying the genetic factors that control blood CD34+ cell levels could expose new drug targets for HSPC mobilization. Here, we report the first large-scale genome-wide association study on blood CD34+ cell levels. Across 13,167 individuals, we identify 9 significant and 2 suggestive associations, accounted for by 8 loci (PPM1H, CXCR4, ENO1-RERE, ITGA9, ARHGAP45, CEBPA, TERT and MYC). Notably, 4 of the identified associations map to CXCR4, demonstrating that bona fide regulators of blood CD34+ cell levels can be identified through genetic variation. Further, the most significant association maps to PPM1H, encoding a serine/threonine phosphatase never previously implicated in HSPC biology. PPM1H is expressed in HSPCs, and the allele that confers higher blood CD34+ cell levels downregulates PPM1H. Through functional fine-mapping, we find that this downregulation is caused by the variant rs772557-A, which abrogates a MYB transcription factor binding site in PPM1H intron 1 that is active in specific HSPC subpopulations, including hematopoietic stem cells, and interacts with the promoter by chromatin looping. Furthermore, PPM1H knockdown increases the proportion of CD34+ and CD34+90+ cells in cord blood assays. Our results provide the first large-scale analysis of the genetic architecture of blood CD34+ cell levels, and warrant further investigation of PPM1H as a potential inhibition target for stem cell mobilization.
|
48
|
Functional dissection of inherited non-coding variation influencing multiple myeloma risk. Nat Commun 2022; 13:151. [PMID: 35013207 PMCID: PMC8748989 DOI: 10.1038/s41467-021-27666-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/05/2020] [Accepted: 12/02/2021] [Indexed: 12/16/2022] Open
Abstract
Thousands of non-coding variants have been associated with increased risk of human diseases, yet the causal variants and their mechanisms-of-action remain obscure. In an integrative study combining massively parallel reporter assays (MPRA), expression analyses (eQTL, meQTL, PCHiC) and chromatin accessibility analyses in primary cells (caQTL), we investigate 1,039 variants associated with multiple myeloma (MM). We demonstrate that MM susceptibility is mediated by gene-regulatory changes in plasma cells and B-cells, and identify putative causal variants at six risk loci (SMARCD3, WAC, ELL2, CDCA7L, CEP120, and PREX1). Notably, three of these variants co-localize with significant plasma cell caQTLs, signaling the presence of causal activity at these precise genomic positions in an endogenous chromosomal context in vivo. Our results provide a systematic functional dissection of risk loci for a hematologic malignancy.
|
49
|
Single-nucleus chromatin accessibility profiling highlights distinct astrocyte signatures in progressive supranuclear palsy and corticobasal degeneration. Acta Neuropathol 2022; 144:615-635. [PMID: 35976433 PMCID: PMC9468099 DOI: 10.1007/s00401-022-02483-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/15/2022] [Revised: 08/03/2022] [Accepted: 08/08/2022] [Indexed: 01/31/2023]
Abstract
Tauopathies such as progressive supranuclear palsy (PSP) and corticobasal degeneration (CBD) exhibit characteristic neuronal and glial inclusions of hyperphosphorylated Tau (pTau). Although the astrocytic pTau phenotype upon neuropathological examination is the most informative feature for distinguishing the two diseases, regulatory mechanisms controlling their transitions into disease-specific states are poorly understood to date. Here, we provide accessible chromatin data of more than 45,000 single nuclei isolated from the frontal cortex of PSP, CBD, and control individuals. We found a strong association of disease-relevant molecular changes with astrocytes and demonstrate that tauopathy-relevant genetic risk variants are tightly linked to astrocytic chromatin accessibility profiles in the brains of PSP and CBD patients. Unlike the established pathogenesis in the secondary tauopathy Alzheimer disease, microglial alterations were relatively sparse. Transcription factor (TF) motif enrichments in pseudotime as well as modeling of the astrocytic TF interplay suggested a common pTau signature for CBD and PSP that is reminiscent of an inflammatory immediate-early response. Nonetheless, machine learning models also predicted discriminatory features, and we observed marked differences in molecular entities related to protein homeostasis between the two diseases. Predicted TF involvement was supported by immunofluorescence analyses in postmortem brain tissue for their highly correlated target genes. Collectively, our data expand the current knowledge on risk gene involvement (e.g., MAPT, MAPK8, and NFE2L2) and molecular pathways leading to the phenotypic changes associated with CBD and PSP.
|
50
|
Super interactive promoters provide insight into cell type-specific regulatory networks in blood lineage cell types. PLoS Genet 2022; 18:e1009984. [PMID: 35100265 PMCID: PMC8830683 DOI: 10.1371/journal.pgen.1009984] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/13/2021] [Revised: 02/10/2022] [Accepted: 12/07/2021] [Indexed: 12/13/2022] Open
Abstract
Existing studies of chromatin conformation have primarily focused on potential enhancers interacting with gene promoters. By contrast, the interactivity of promoters per se, while equally critical to understanding transcriptional control, has been largely unexplored, particularly in a cell type-specific manner for blood lineage cell types. In this study, we leverage promoter capture Hi-C data across a compendium of blood lineage cell types to identify and characterize cell type-specific super-interactive promoters (SIPs). Notably, promoter-interacting regions (PIRs) of SIPs are more likely to overlap with cell type-specific ATAC-seq peaks and GWAS variants for relevant blood cell traits than PIRs of non-SIPs. Moreover, PIRs of cell-type-specific SIPs show enriched heritability of relevant blood cell trait(s), and are more enriched with GWAS variants associated with blood cell traits compared to PIRs of non-SIPs. Further, SIP genes tend to be expressed at a higher level in the corresponding cell type. Importantly, SIP subnetworks incorporating cell-type-specific SIPs and ATAC-seq peaks help interpret GWAS variants. Examples include GWAS variants associated with platelet count near the megakaryocyte SIP gene EPHB3 and variants associated with lymphocyte count near the native CD4 T-Cell SIP gene ETS1. Interestingly, around 25.7% to 39.6% of blood cell trait GWAS variants residing in SIP PIRs disrupt transcription factor binding motifs. Importantly, our analysis shows the potential of using promoter-centric analyses of chromatin spatial organization data to identify biologically important genes and their regulatory regions.
|