1
Xu X, Chen Q, Huang Q, Cox TC, Zhu H, Hu J, Han X, Meng Z, Wang B, Liao Z, Xu W, Xiao B, Lang R, Liu J, Huang J, Tang X, Wang J, Li Q, Liu T, Zhang Q, Antonarakis SE, Zhang J, Fan X, Liu H, Zhang YB. Auricular malformations are driven by copy number variations in a hierarchical enhancer cluster and a dominant enhancer recapitulates human pathogenesis. Nat Commun 2025; 16:4598. [PMID: 40382324 DOI: 10.1038/s41467-025-59735-w] [Received: 04/23/2024] [Accepted: 05/02/2025] [Indexed: 05/20/2025] Open
Abstract
Enhancers, through the combinatorial action of transcription factors (TFs), dictate both the spatial specificity and the levels of gene expression, and their aberrations can result in diseases. While an HMX1 downstream enhancer is associated with ear malformations, the mechanisms underlying bilateral constricted ear (BCE) remain unclear. Here, we identify a copy number variation (CNV) containing three enhancers, collectively termed the positional identity hierarchical enhancer cluster (PI-HEC), that drives BCE by coordinately regulating HMX1 expression. Each enhancer exhibits distinct activity-location-structure features, and in the dominant enhancer, high mobility group (HMG)-box motifs combined with Coordinator and homeodomain TF motifs modulate its activity and specificity, respectively. Mouse models demonstrate that neural crest-derived fibroblasts with aberrant Hmx1 expression in the basal pinna, along with ectopic distal pinna expression, disrupt outer ear development, affecting cartilage, muscle, and epidermis. Our findings elucidate mammalian ear morphogenesis and underscore the complexity of synergistic regulation among enhancers and between enhancers and transcription factors.
Affiliation(s)
- Xiaopeng Xu
  - Guangzhou National Laboratory, Guangzhou, 510320, Guangdong, China
  - School of Bioengineering Medicine, Beihang University, Beijing, 100191, China
  - Bioland Laboratory, Guangzhou, 510320, Guangdong, China
- Qi Chen
  - Department of Ear Reconstruction, Plastic Surgery Hospital, Chinese Academy of Medical Sciences, Beijing, 100144, China
- Qingpei Huang
  - Guangzhou National Laboratory, Guangzhou, 510320, Guangdong, China
- Timothy C Cox
  - Departments of Oral & Craniofacial Sciences, School of Dentistry, and Pediatrics, School of Medicine, University of Missouri-Kansas City, Kansas City, USA
- Hao Zhu
  - School of Bioengineering Medicine, Beihang University, Beijing, 100191, China
- Jintian Hu
  - Department of Ear Reconstruction, Plastic Surgery Hospital, Chinese Academy of Medical Sciences, Beijing, 100144, China
- Xi Han
  - Guangzhou National Laboratory, Guangzhou, 510320, Guangdong, China
- Ziqiu Meng
  - School of Bioengineering Medicine, Beihang University, Beijing, 100191, China
- Bingqing Wang
  - Department of Ear Reconstruction, Plastic Surgery Hospital, Chinese Academy of Medical Sciences, Beijing, 100144, China
- Zhiying Liao
  - Guangzhou National Laboratory, Guangzhou, 510320, Guangdong, China
- Wenxin Xu
  - Guangzhou National Laboratory, Guangzhou, 510320, Guangdong, China
  - Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei, 230000, China
- Baichuan Xiao
  - School of Bioengineering Medicine, Beihang University, Beijing, 100191, China
- Ruirui Lang
  - School of Bioengineering Medicine, Beihang University, Beijing, 100191, China
- Jiqiang Liu
  - School of Bioengineering Medicine, Beihang University, Beijing, 100191, China
- Jian Huang
  - School of Bioengineering Medicine, Beihang University, Beijing, 100191, China
- Xiaokai Tang
  - School of Bioengineering Medicine, Beihang University, Beijing, 100191, China
- Jinmo Wang
  - School of Bioengineering Medicine, Beihang University, Beijing, 100191, China
- Qiang Li
  - Department of Plastic Surgery, Affiliated Hospital of Xuzhou Medical University, Xuzhou, 221000, China
- Ting Liu
  - Department of Ophthalmology, Daping Hospital, Army Medical University, Chongqing, China
- Qingguo Zhang
  - Department of Ear Reconstruction, Plastic Surgery Hospital, Chinese Academy of Medical Sciences, Beijing, 100144, China
- Stylianos E Antonarakis
  - Department of Genetic Medicine and Development, University of Geneva Medical Faculty, Geneva, 1211, Switzerland
  - Medigenome, Swiss Institute of Genomic Medicine, 1207, Geneva, Switzerland
  - iGE3 Institute of Genetics and Genomes in Geneva, Geneva, Switzerland
- Jiao Zhang
  - Shandong collaborative innovation research institute of traditional Chinese medicine industry, Jinan, 250000, Shandong, China
- Xiaoying Fan
  - Guangzhou National Laboratory, Guangzhou, 510320, Guangdong, China
  - Bioland Laboratory, Guangzhou, 510320, Guangdong, China
  - The Fifth Affiliated Hospital of Guangzhou Medical University, Guangzhou, 510320, Guangdong, China
  - GMU-GIBH Joint School of Life Sciences, Guangzhou, 510320, Guangdong, China
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, 510320, Guangdong, China
- Huisheng Liu
  - Guangzhou National Laboratory, Guangzhou, 510320, Guangdong, China
  - Bioland Laboratory, Guangzhou, 510320, Guangdong, China
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, 510320, Guangdong, China
- Yong-Biao Zhang
  - School of Bioengineering Medicine, Beihang University, Beijing, 100191, China
  - Key Laboratory of Big Data-Based Precision Medicine, Beihang University, Ministry of Industry and Information Technology, Beijing, 100191, China
2
Hu J, Weber JN, Fuess LE, Steinel NC, Bolnick DI, Wang M. A spectral framework to map QTLs affecting joint differential networks of gene co-expression. PLoS Comput Biol 2025; 21:e1012953. [PMID: 40245036 PMCID: PMC12040279 DOI: 10.1371/journal.pcbi.1012953] [Received: 04/18/2024] [Revised: 04/29/2025] [Accepted: 03/11/2025] [Indexed: 04/19/2025] Open
Abstract
Studying the mechanisms underlying the genotype-phenotype association is crucial in genetics. Gene expression studies have deepened our understanding of the genotype → expression → phenotype mechanisms. However, traditional expression quantitative trait loci (eQTL) methods often overlook the critical role of gene co-expression networks in translating genotype into phenotype. This gap highlights the need for more powerful statistical methods to analyze the genotype → network → phenotype mechanism. Here, we develop a network-based method, called spectral network quantitative trait loci analysis (snQTL), to map quantitative trait loci affecting gene co-expression networks. Our approach tests the association between genotypes and joint differential networks of gene co-expression via tensor-based spectral statistics, thereby overcoming the ubiquitous multiple testing challenges in existing methods. We demonstrate the effectiveness of snQTL in the analysis of three-spined stickleback (Gasterosteus aculeatus) data. Compared to conventional methods, snQTL uncovers chromosomal regions affecting gene co-expression networks, including one strong candidate gene that would have been missed by traditional eQTL analyses. Our framework points to the limitations of current approaches and offers a powerful network-based tool for functional loci discoveries.
Affiliation(s)
- Jiaxin Hu
  - Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Jesse N. Weber
  - Department of Integrative Biology, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Lauren E. Fuess
  - Department of Biology, Texas State University, San Marcos, Texas, United States of America
- Natalie C. Steinel
  - Department of Biological Sciences, University of Massachusetts Lowell, Lowell, Massachusetts, United States of America
- Daniel I. Bolnick
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, United States of America
- Miaoyan Wang
  - Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
3
Zhang ZE, Kim A, Suboc N, Mancuso N, Gazal S. Efficient count-based models improve power and robustness for large-scale single-cell eQTL mapping. medRxiv 2025:2025.01.18.25320755. [PMID: 40093202 PMCID: PMC11908335 DOI: 10.1101/2025.01.18.25320755] [Indexed: 03/19/2025]
Abstract
Population-scale single-cell transcriptomic technologies (scRNA-seq) enable characterizing variant effects on gene regulation at the cellular level (e.g., single-cell eQTLs; sc-eQTLs). However, existing sc-eQTL mapping approaches are either not designed for analyzing sparse counts in scRNA-seq data or can become intractable in extremely large datasets. Here, we propose jaxQTL, a flexible and efficient sc-eQTL mapping framework using highly efficient count-based models given pseudobulk data. Using extensive simulations, we demonstrated that jaxQTL with a negative binomial model outperformed other models in identifying sc-eQTLs, while maintaining a calibrated type I error. We applied jaxQTL across 14 cell types of OneK1K scRNA-seq data (N = 982), and identified 11-16% more eGenes compared with existing approaches, primarily driven by jaxQTL's ability to identify lowly expressed eGenes. We observed that fine-mapped sc-eQTLs were further from the transcription start site (TSS) than fine-mapped eQTLs identified in all cells (bulk-eQTLs; P = 1 × 10⁻⁴) and more enriched in cell-type-specific enhancers (P = 3 × 10⁻¹⁰), suggesting that sc-eQTLs improve our ability to identify distal eQTLs that are missed in bulk tissues. Overall, the genetic effects of fine-mapped sc-eQTLs were largely shared across cell types, with cell-type-specificity increasing with distance to TSS. Lastly, we observed that sc-eQTLs explain more SNP-heritability (h²) than bulk-eQTLs (9.90 ± 0.88% vs. 6.10 ± 0.76% when meta-analyzed across 16 blood and immune-related traits), improving but not closing the missing link between GWAS and eQTLs. As an example, we highlight that sc-eQTLs in T cells (unlike bulk-eQTLs) can successfully nominate IL6ST as a candidate gene for rheumatoid arthritis. Overall, jaxQTL provides an efficient and powerful approach using count-based models to identify missing disease-associated eQTLs.
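The abstract's contrast between count-based models and sparse data can be illustrated with a small sketch. This is not jaxQTL's implementation: the function names, the mean/dispersion parameterization, the toy counts, and the dispersion value are all illustrative assumptions. It compares Poisson and negative binomial log-likelihoods on sparse, overdispersed pseudobulk-like counts for a single gene.

```python
import math


def poisson_logpmf(y: int, mu: float) -> float:
    """Log-probability of count y under a Poisson with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)


def nb_logpmf(y: int, mu: float, alpha: float) -> float:
    """Log-probability of count y under a negative binomial with
    mean mu and variance mu + alpha * mu**2 (alpha > 0 is the dispersion)."""
    r = 1.0 / alpha  # "size" parameter of the negative binomial
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


# Hypothetical counts for one gene across ten individuals: mostly zeros
# with a few large values, i.e. sparse and overdispersed.
counts = [0, 0, 0, 1, 0, 7, 0, 2, 0, 12]
mu_hat = sum(counts) / len(counts)  # shared mean estimate for both models

ll_pois = sum(poisson_logpmf(y, mu_hat) for y in counts)
ll_nb = sum(nb_logpmf(y, mu_hat, alpha=3.0) for y in counts)  # alpha is assumed

print(f"Poisson log-likelihood:           {ll_pois:.2f}")
print(f"Negative binomial log-likelihood: {ll_nb:.2f}")
```

On these toy counts the negative binomial scores a higher log-likelihood than the Poisson at the same mean, because its extra dispersion parameter absorbs the excess variance. A real eQTL model would also include genotype and covariates in the mean.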
Affiliation(s)
- Zixuan Eleanor Zhang
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
- Artem Kim
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
- Noah Suboc
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
- Nicholas Mancuso
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
  - Department of Quantitative and Computational Biology, University of Southern California
  - Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California
- Steven Gazal
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
  - Department of Quantitative and Computational Biology, University of Southern California
  - Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California
4
Spiliopoulou A, Iakovliev A, Plant D, Sutcliffe M, Sharma S, Cubuk C, Lewis M, Pitzalis C, Barton A, McKeigue PM. Genome-Wide Aggregated Trans Effects Analysis Identifies Genes Encoding Immune Checkpoints as Core Genes for Rheumatoid Arthritis. Arthritis Rheumatol 2025. [PMID: 39887658 DOI: 10.1002/art.43125] [Received: 09/17/2024] [Revised: 12/06/2024] [Accepted: 01/07/2025] [Indexed: 02/01/2025]
Abstract
OBJECTIVE: The sparse effector "omnigenic" hypothesis postulates that the polygenic effects of common single nucleotide polymorphisms (SNPs) on a typical complex trait are mediated by trans effects that coalesce on expression of a relatively sparse set of core genes. The objective of this study was to identify core genes for rheumatoid arthritis by testing for association of rheumatoid arthritis with genome-wide aggregated trans effects (GATE) scores for expression of each gene as transcript in whole blood or as circulating protein levels. METHODS: GATE scores were calculated for 5,400 cases and 453,705 non-cases of primary rheumatoid arthritis in UK Biobank participants of European ancestry. RESULTS: Testing for association with GATE scores identified 16 putative core genes for rheumatoid arthritis outside the HLA region, of which six (TP53BP1, PDCD1, TNFRSF14, LAIR1, LILRA4, and IDO1) were supported by Mendelian randomization analysis based on the marginal likelihood of the causal effect parameter. Five of these 16 genes were validated by a reported association of rheumatoid arthritis with SNPs within 200 kb of the transcription site, eight by association of the measured protein level with rheumatoid arthritis in UK Biobank, 10 by experimental perturbation in mouse models of inflammatory arthritis, and two (CTLA4 and PDCD1) by evidence that drugs targeting the gene cause or ameliorate inflammatory arthritis in humans. Fourteen of these 16 genes are in pathways affecting immunity or inflammation, and six (CD5, CTLA4, TIGIT, LAIR1, TNFRSF14, and PDCD1) encode receptors that have been characterized as immune checkpoints exploited by cancer cells to escape the immune response. CONCLUSION: These results highlight the key role of immune checkpoints in rheumatoid arthritis and identify possible therapeutic targets.
Affiliation(s)
- Darren Plant
  - University of Manchester and the National Institute for Health and Care Research Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Oxford Road, Manchester, United Kingdom
- Seema Sharma
  - University of Manchester and the National Institute for Health and Care Research Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Oxford Road, Manchester, United Kingdom
- Cankut Cubuk
  - Queen Mary University of London and Barts Health NHS Trust and NIHR Barts Biomedical Research Centre, London, United Kingdom
- Myles Lewis
  - Queen Mary University of London and Barts Health NHS Trust and NIHR Barts Biomedical Research Centre, London, United Kingdom
- Costantino Pitzalis
  - Queen Mary University of London and Barts Health NHS Trust and NIHR Barts Biomedical Research Centre, London, United Kingdom
- Anne Barton
  - University of Manchester and the National Institute for Health and Care Research Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Oxford Road, Manchester, United Kingdom
5
Ray-Jones H, Sung CK, Chan LT, Haglund A, Artemov P, Della Rosa M, Ruje L, Burden F, Kreuzhuber R, Litovskikh A, Weyenbergh E, Brusselaers Z, Tan VXH, Frontini M, Wallace C, Malysheva V, Bottolo L, Vigorito E, Spivakov M. Genetic coupling of enhancer activity and connectivity in gene expression control. Nat Commun 2025; 16:970. [PMID: 39870618 PMCID: PMC11772589 DOI: 10.1038/s41467-025-55900-3] [Received: 08/21/2024] [Accepted: 01/03/2025] [Indexed: 01/29/2025] Open
Abstract
Gene enhancers often form long-range contacts with promoters, but it remains unclear if the activity of enhancers and their chromosomal contacts are mediated by the same DNA sequences and recruited factors. Here, we study the effects of expression quantitative trait loci (eQTLs) on enhancer activity and promoter contacts in primary monocytes isolated from 34 male individuals. Using eQTL-Capture Hi-C and a Bayesian approach considering both intra- and inter-individual variation, we initially detect 19 eQTLs associated with enhancer-eGene promoter contacts, most of which also associate with enhancer accessibility and activity. Capitalising on these shared effects, we devise a multi-modality Bayesian strategy, identifying 629 "trimodal QTLs" jointly associated with enhancer accessibility, eGene promoter contact, and gene expression. Causal mediation analysis and CRISPR interference reveal causal relationships between these three modalities. Many detected QTLs overlap disease susceptibility loci and influence the predicted binding of myeloid transcription factors, including SPI1, GABPB and STAT3. Additionally, a variant associated with PCK2 promoter contact directly disrupts a CTCF binding motif and impacts promoter insulation from downstream enhancers. Jointly, our findings suggest an inherent genetic coupling of enhancer activity and connectivity in gene expression control relevant to human disease and highlight the regulatory role of genetically determined chromatin boundaries.
Affiliation(s)
- Helen Ray-Jones
  - MRC Laboratory of Medical Sciences, London, UK
  - Institute of Clinical Sciences, Imperial College Faculty of Medicine, London, UK
  - Computational Neurobiology, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
  - Computational Neurobiology, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
  - Department of Internal Medicine, Erasmus MC, Rotterdam, The Netherlands
- Chak Kei Sung
  - MRC Laboratory of Medical Sciences, London, UK
  - Institute of Clinical Sciences, Imperial College Faculty of Medicine, London, UK
  - LKS Faculty of Medicine, the University of Hong Kong, Hong Kong, Hong Kong
- Lai Ting Chan
  - Computational Neurobiology, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
  - Computational Neurobiology, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Alexander Haglund
  - Department of Brain Sciences, Faculty of Medicine, Imperial College London, London, UK
- Pavel Artemov
  - MRC Laboratory of Medical Sciences, London, UK
  - Institute of Clinical Sciences, Imperial College Faculty of Medicine, London, UK
- Monica Della Rosa
  - MRC Laboratory of Medical Sciences, London, UK
  - Institute of Clinical Sciences, Imperial College Faculty of Medicine, London, UK
  - Cyted, Cambridge, UK
- Luminita Ruje
  - MRC Laboratory of Medical Sciences, London, UK
  - Institute of Clinical Sciences, Imperial College Faculty of Medicine, London, UK
- Frances Burden
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - National Health Service (NHS) Blood and Transplant, Cambridge Biomedical Campus, Cambridge, UK
  - University of Kent, Canterbury, UK
- Roman Kreuzhuber
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - National Health Service (NHS) Blood and Transplant, Cambridge Biomedical Campus, Cambridge, UK
  - EMBL-EBI, Wellcome Genome Campus, Cambridge, UK
  - Swiss Federal Administration, Bern, Switzerland
- Anna Litovskikh
  - MRC Laboratory of Medical Sciences, London, UK
  - Institute of Clinical Sciences, Imperial College Faculty of Medicine, London, UK
  - Institute of Computational Biology, Helmholtz Zentrum München and Ludwig Maximilians University Munich, Faculty of Medicine, Munich, Germany
- Eline Weyenbergh
  - Computational Neurobiology, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
  - Computational Neurobiology, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
  - University Hospital Antwerp (UZA), Antwerp, Belgium
- Zoï Brusselaers
  - Computational Neurobiology, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
  - Computational Neurobiology, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
  - University of Antwerp, Antwerp, Belgium
- Vanessa Xue Hui Tan
  - MRC Laboratory of Medical Sciences, London, UK
  - Institute of Clinical Sciences, Imperial College Faculty of Medicine, London, UK
  - Hummingbird Bioscience, Singapore, Singapore
- Mattia Frontini
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - National Health Service (NHS) Blood and Transplant, Cambridge Biomedical Campus, Cambridge, UK
  - Department of Clinical and Biomedical Sciences, Faculty of Health and Life Sciences, University of Exeter Medical School, Exeter, UK
- Chris Wallace
  - Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), Jeffrey Cheah Biomedical Centre, University of Cambridge, Cambridge, UK
  - MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- Valeriya Malysheva
  - MRC Laboratory of Medical Sciences, London, UK
  - Institute of Clinical Sciences, Imperial College Faculty of Medicine, London, UK
  - Computational Neurobiology, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
  - Computational Neurobiology, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Leonardo Bottolo
  - MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, UK
  - Department of Medical Genetics, School of Clinical Medicine, University of Cambridge, Cambridge, UK
  - The Alan Turing Institute, London, UK
- Elena Vigorito
  - MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- Mikhail Spivakov
  - MRC Laboratory of Medical Sciences, London, UK
  - Institute of Clinical Sciences, Imperial College Faculty of Medicine, London, UK
6
Chang YH, Head ST, Harrison T, Yu Y, Huff CD, Pasaniuc B, Lindström S, Bhattacharya A. Isoform-level analyses of 6 cancers uncover extensive genetic risk mechanisms undetected at the gene-level. medRxiv 2024:2024.10.29.24316388. [PMID: 39574839 PMCID: PMC11581093 DOI: 10.1101/2024.10.29.24316388] [Indexed: 12/01/2024]
Abstract
Integrating genome-wide association study (GWAS) and transcriptomic datasets can help identify potential mediators for germline genetic risk of cancer. However, traditional methods have been largely unsuccessful because of an overreliance on total gene expression. These approaches overlook alternative splicing, which can produce multiple isoforms from the same gene, each with potentially different effects on cancer risk. Here, we integrate genetic and multi-tissue isoform-level gene expression data from the Genotype Tissue-Expression Project (GTEx, N = 108-574) with publicly available European-ancestry GWAS summary statistics (all N > 20,000 cases) to identify both isoform- and gene-level risk associations with six cancers (breast, endometrial, colorectal, lung, ovarian, prostate) and six related cancer subtype classifications (N = 12 total). Compared to traditional methods leveraging total gene expression, directly modeling isoform expression through transcriptome-wide association studies (isoTWAS) substantially increases discovery of transcriptomic mechanisms underlying genetic associations. Using the same RNA-seq datasets, isoTWAS identified 164% more significant unique gene associations compared to TWAS (6,163 and 2,336, respectively), with isoTWAS-prioritized genes enriched 4-fold for evolutionarily-constrained genes (P = 6.1 × 10⁻¹³). isoTWAS tags transcriptomic associations at 52% more independent GWAS loci compared to TWAS across the six cancers. Additionally, isoform expression mediates an estimated 63% greater proportion of cancer risk SNP heritability compared to gene expression when evaluating cis-genetic influence on isoform expression. We highlight several notable isoTWAS associations that demonstrate GWAS colocalization at the isoform level but not at the gene level, including CLPTM1L (lung cancer), LAMC1 (colorectal), and BABAM1 (breast). These results underscore the critical importance of modeling isoform-level expression to maximize discovery of genetic risk mechanisms for cancers.
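As a quick consistency check on the gene-association counts quoted in this abstract (a worked calculation added here, not part of the original abstract), the relative increase follows directly from the two totals:

```latex
\[
\frac{6163 - 2336}{2336} \;=\; \frac{3827}{2336} \;\approx\; 1.64
\]
```

That is, isoTWAS reports roughly 164% more unique gene associations than TWAS, which matches the stated percentage.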
Affiliation(s)
- Yung-Han Chang
  - Quantitative Sciences Program, The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX, USA
- S. Taylor Head
  - Department of Epidemiology, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Tabitha Harrison
  - Department of Epidemiology, School of Public Health, University of Washington, Seattle, WA, USA
- Yao Yu
  - Department of Epidemiology, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Chad D. Huff
  - Department of Epidemiology, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Bogdan Pasaniuc
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Sara Lindström
  - Department of Epidemiology, School of Public Health, University of Washington, Seattle, WA, USA
  - Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Arjun Bhattacharya
  - Department of Epidemiology, University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - Institute for Data Science in Oncology, University of Texas MD Anderson Cancer Center, Houston, TX, USA
7
Bohrer CH, Fursova NA, Larson DR. Enhancers: A Focus on Synthetic Biology and Correlated Gene Expression. ACS Synth Biol 2024; 13:3093-3108. [PMID: 39276360 DOI: 10.1021/acssynbio.4c00244] [Indexed: 09/17/2024]
Abstract
Enhancers are central for the regulation of metazoan transcription but have proven difficult to study, primarily due to a myriad of interdependent variables shaping their activity. Consequently, synthetic biology has emerged as the main approach for dissecting mechanisms of enhancer function. We start by reviewing simple but highly parallel reporter assays, which have been successful in quantifying the complexity of the activator/coactivator mechanisms at enhancers. We then describe studies that examine how enhancers function in the genomic context and in combination with other enhancers, revealing that they activate genes through a variety of different mechanisms, working together as a system. Here, we primarily focus on synthetic reporter genes that can quantify the dynamics of enhancer biology through time. We end by considering the consequences of having many genes and enhancers within a 'local environment', which we believe leads to correlated gene expression and likely reports on the general principles of enhancer biology.
Affiliation(s)
- Christopher H Bohrer
  - Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Nadezda A Fursova
  - Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Daniel R Larson
  - Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
8
Shao M, Tian M, Chen K, Jiang H, Zhang S, Li Z, Shen Y, Chen F, Shen B, Cao C, Gu N. Leveraging Random Effects in Cistrome-Wide Association Studies for Decoding the Genetic Determinants of Prostate Cancer. Adv Sci (Weinh) 2024; 11:e2400815. [PMID: 39099406 PMCID: PMC11423091 DOI: 10.1002/advs.202400815] [Received: 01/23/2024] [Revised: 07/09/2024] [Indexed: 08/06/2024]
Abstract
Cistrome-wide association studies (CWAS) are pivotal for identifying genetic determinants of diseases by correlating genetically regulated cistrome states with phenotypes. Traditional CWAS typically develops a model based on cistrome and genotype data to associate predicted cistrome states with phenotypes. The random effect cistrome-wide association study (RECWAS), which reevaluates the necessity of cistrome state prediction in CWAS, is introduced. RECWAS utilizes either a linear model or marginal effect for initial feature selection, followed by kernel-based feature aggregation for association testing. Through simulations and analysis of prostate cancer data, a thorough evaluation of CWAS and RECWAS is conducted. The results suggest that RECWAS offers improved power compared to traditional CWAS, identifying additional genomic regions associated with prostate cancer. CWAS identified 102 significant regions, while RECWAS found 50 additional significant regions compared to CWAS, many of which are validated. Validation encompassed a range of biological evidence, including risk signals from the GWAS catalog, susceptibility genes from the DisGeNET database, and enhancer-domain scores. RECWAS consistently demonstrated improved performance over traditional CWAS in identifying genomic regions associated with prostate cancer. These findings demonstrate the benefits of incorporating kernel methods into CWAS and provide new insights for genetic discovery in complex diseases.
Affiliation(s)
- Mengting Shao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, P. R. China
- Min Tian
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, P. R. China
- Kaiyang Chen
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, P. R. China
- Hangjin Jiang
  - Center for Data Science, Zhejiang University, Hangzhou, 310058, P. R. China
- Shuting Zhang
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, P. R. China
- Zhenghui Li
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, P. R. China
- Yan Shen
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, P. R. China
- Feng Chen
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, P. R. China
- Baixin Shen
  - Department of Urology, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, 210011, P. R. China
- Chen Cao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, P. R. China
  - Department of Urology, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, 210011, P. R. China
- Ning Gu
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, P. R. China
  - Nanjing Key Laboratory for Cardiovascular Information and Health Engineering Medicine, Institute of Clinical Medicine, Nanjing Drum Tower Hospital, Medical School, Nanjing University, Nanjing, 210093, P. R. China
| |
|
9
|
DeGroat W, Inoue F, Ashuach T, Yosef N, Ahituv N, Kreimer A. Comprehensive network modeling approaches unravel dynamic enhancer-promoter interactions across neural differentiation. Genome Biol 2024; 25:221. [PMID: 39143563 PMCID: PMC11323586 DOI: 10.1186/s13059-024-03365-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 08/01/2024] [Indexed: 08/16/2024] Open
Abstract
BACKGROUND Increasing evidence suggests that a substantial proportion of disease-associated mutations occur in enhancers, regions of non-coding DNA essential to gene regulation. Understanding the structures and mechanisms of the regulatory programs this variation affects can shed light on the apparatuses of human diseases. RESULTS We collect epigenetic and gene expression datasets from seven early time points during neural differentiation. Focusing on this model system, we construct networks of enhancer-promoter interactions, each at an individual stage of neural induction. These networks serve as the base for a rich series of analyses, through which we demonstrate their temporal dynamics and enrichment for various disease-associated variants. We apply the Girvan-Newman clustering algorithm to these networks to reveal biologically relevant substructures of regulation. Additionally, we demonstrate methods to validate predicted enhancer-promoter interactions using transcription factor overexpression and massively parallel reporter assays. CONCLUSIONS Our findings suggest a generalizable framework for exploring gene regulatory programs and their dynamics across developmental processes; this includes a comprehensive approach to studying the effects of disease-associated variation on transcriptional networks. The techniques applied to our networks have been published alongside our findings as a computational tool, E-P-INAnalyzer. Our procedure can be utilized across different cellular contexts and disorders.
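As an illustration of the clustering step named in this abstract, here is a minimal sketch of one Girvan-Newman split on a small undirected graph, written with the Python standard library only. It is not the authors' E-P-INAnalyzer code: the function names and the toy enhancer (E) and promoter (P) node labels are hypothetical, and the graph is assumed to be unweighted with no self-loops.

```python
from collections import defaultdict, deque


def connected_components(adj):
    """Return the connected components of an adjacency map as a list of node sets."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        comps.append(comp)
    return comps


def edge_betweenness(adj):
    """Edge betweenness via Brandes-style BFS accumulation (unweighted graph).

    Each undirected edge is counted from both endpoints, which doubles every
    score but leaves the ranking unchanged.
    """
    bet = defaultdict(float)
    for s in adj:
        order, dist = [], {s: 0}
        sigma = dict.fromkeys(adj, 0.0)  # number of shortest paths from s
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):  # accumulate dependencies from the farthest nodes back
            for u in preds[w]:
                c = sigma[u] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((u, w))] += c
                delta[u] += c
    return bet


def girvan_newman_split(edges):
    """Remove highest-betweenness edges until the component count increases."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    n_start = len(connected_components(adj))
    while len(connected_components(adj)) == n_start:
        bet = edge_betweenness(adj)
        if not bet:  # no edges left to remove
            break
        u, v = max(bet, key=bet.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return connected_components(adj)


# Toy network (assumed): two enhancer-promoter triangles joined by a single bridge edge.
EDGES = [("E1", "E2"), ("E1", "P1"), ("E2", "P1"),
         ("E3", "E4"), ("E3", "P2"), ("E4", "P2"),
         ("P1", "E3")]
COMMUNITIES = girvan_newman_split(EDGES)
```

Calling the function again on each resulting community would produce the full hierarchy of splits. Betweenness is recomputed after every removal, which is what separates this procedure from a single ranking of edges.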
Affiliation(s)
- William DeGroat
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, 679 Hoes Lane West, Piscataway, NJ, 08854, USA
| | - Fumitaka Inoue
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
| | - Tal Ashuach
- Department of Electrical Engineering and Computer Sciences and Center for Computational Biology, University of California, Berkeley, 387 Soda Hall, Berkeley, CA, 94720, USA
| | - Nir Yosef
- Department of Systems Immunology, Weizmann Institute of Science, 234 Herzl Street, Rehovot, 7610001, Israel
- Chan-Zuckerberg Biohub, 499 Illinois St, San Francisco, CA, 94158, USA
- Department of Systems Immunology, Ragon Institute of MGH, MIT, and Harvard Institute of Science, 400 Technology Square, Cambridge, MA, 02139, USA
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, 513 Parnassus Ave, San Francisco, CA, 94143, USA
- Institute for Human Genetics, University of California, 513 Parnassus Ave, San Francisco, CA, 94143, USA
| | - Anat Kreimer
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, 679 Hoes Lane West, Piscataway, NJ, 08854, USA.
- Department of Biochemistry and Molecular Biology, Rutgers, The State University of New Jersey, 604 Allison Road, Piscataway, NJ, 08854, USA.
| |
|
10
|
Vanderstichele T, Burnham KL, de Klein N, Tardaguila M, Howell B, Walter K, Kundu K, Koeppel J, Lee W, Tokolyi A, Persyn E, Nath AP, Marten J, Petrovski S, Roberts DJ, Di Angelantonio E, Danesh J, Berton A, Platt A, Butterworth AS, Soranzo N, Parts L, Inouye M, Paul DS, Davenport EE. Misexpression of inactive genes in whole blood is associated with nearby rare structural variants. Am J Hum Genet 2024; 111:1524-1543. [PMID: 39053458 PMCID: PMC11339615 DOI: 10.1016/j.ajhg.2024.06.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 06/27/2024] [Accepted: 06/27/2024] [Indexed: 07/27/2024] Open
Abstract
Gene misexpression is the aberrant transcription of a gene in a context where it is usually inactive. Despite its known pathological consequences in specific rare diseases, we have a limited understanding of its wider prevalence and mechanisms in humans. To address this, we analyzed gene misexpression in 4,568 whole-blood bulk RNA sequencing samples from INTERVAL study blood donors. We found that while individual misexpression events occur rarely, in aggregate they were found in almost all samples and a third of inactive protein-coding genes. Using 2,821 paired whole-genome and RNA sequencing samples, we identified that misexpression events are enriched in cis for rare structural variants. We established putative mechanisms through which a subset of SVs lead to gene misexpression, including transcriptional readthrough, transcript fusions, and gene inversion. Overall, we develop misexpression as a type of transcriptomic outlier analysis and extend our understanding of the variety of mechanisms by which genetic variants can influence gene expression.
Affiliation(s)
| | - Katie L Burnham
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Niek de Klein
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | | | - Brittany Howell
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Klaudia Walter
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Kousik Kundu
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK; Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Puddicombe Way, Cambridge, UK
| | - Jonas Koeppel
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Wanseon Lee
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Alex Tokolyi
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Elodie Persyn
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK; Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
| | - Artika P Nath
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
| | - Jonathan Marten
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
| | - Slavé Petrovski
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK; Department of Medicine, University of Melbourne, Austin Health, Melbourne, VIC, Australia
| | - David J Roberts
- Radcliffe Department of Medicine, John Radcliffe Hospital, Oxford, UK; Clinical Services, NHS Blood and Transplant, Oxford Centre, John Radcliffe Hospital, Oxford, UK
| | - Emanuele Di Angelantonio
- Human Technopole, Fondazione Human Technopole, Milan, Italy; British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK; British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK; National Institute for Health and Care Research Blood and Transplant Research Unit in Donor Health and Behaviour, University of Cambridge, Cambridge, UK; Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
| | - John Danesh
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK; British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK; British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK; National Institute for Health and Care Research Blood and Transplant Research Unit in Donor Health and Behaviour, University of Cambridge, Cambridge, UK; Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
| | - Alix Berton
- Translational Science and Experimental Medicine, Research and Early Development, Respiratory and Immunology, BioPharmaceuticals R&D, AstraZeneca, Molndal, Sweden
| | - Adam Platt
- Translational Science and Experimental Medicine, Research and Early Development, Respiratory and Immunology, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
| | - Adam S Butterworth
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK; British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK; National Institute for Health and Care Research Blood and Transplant Research Unit in Donor Health and Behaviour, University of Cambridge, Cambridge, UK; Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
| | - Nicole Soranzo
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK; Human Technopole, Fondazione Human Technopole, Milan, Italy; Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Puddicombe Way, Cambridge, UK; British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK; National Institute for Health and Care Research Blood and Transplant Research Unit in Donor Health and Behaviour, University of Cambridge, Cambridge, UK
| | - Leopold Parts
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Michael Inouye
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK; Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia; British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK; Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
| | - Dirk S Paul
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK; Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
| | | |
|
11
|
Xu L, Liu Y. Identification, Design, and Application of Noncoding Cis-Regulatory Elements. Biomolecules 2024; 14:945. [PMID: 39199333 PMCID: PMC11352686 DOI: 10.3390/biom14080945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2024] [Revised: 07/25/2024] [Accepted: 07/30/2024] [Indexed: 09/01/2024] Open
Abstract
Cis-regulatory elements (CREs) play a pivotal role in orchestrating interactions with trans-regulatory factors such as transcription factors, RNA-binding proteins, and noncoding RNAs. These interactions are fundamental to the molecular architecture underpinning complex and diverse biological functions in living organisms, facilitating a myriad of sophisticated and dynamic processes. The rapid advancement in the identification and characterization of these regulatory elements has been marked by initiatives such as the Encyclopedia of DNA Elements (ENCODE) project, which represents a significant milestone in the field. Concurrently, the development of CRE detection technologies, exemplified by massively parallel reporter assays, has progressed at an impressive pace, providing powerful tools for CRE discovery. The exponential growth of multimodal functional genomic data has necessitated the application of advanced analytical methods. Deep learning algorithms, particularly large language models, have emerged as invaluable tools for deconstructing the intricate nucleotide sequences governing CRE function. These advancements facilitate precise predictions of CRE activity and enable the de novo design of CREs. A deeper understanding of CRE operational dynamics is crucial for harnessing their versatile regulatory properties. Such insights are instrumental in refining gene therapy techniques, enhancing the efficacy of selective breeding programs, pushing the boundaries of genetic innovation, and opening new possibilities in microbial synthetic biology.
Affiliation(s)
- Lingna Xu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China;
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
| | - Yuwen Liu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China;
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Kunpeng Institute of Modern Agriculture at Foshan, Chinese Academy of Agricultural Sciences, Foshan 528226, China
| |
|
12
|
Zeng T, Spence JP, Mostafavi H, Pritchard JK. Bayesian estimation of gene constraint from an evolutionary model with gene features. Nat Genet 2024; 56:1632-1643. [PMID: 38977852 DOI: 10.1038/s41588-024-01820-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 05/29/2024] [Indexed: 07/10/2024]
Abstract
Measures of selective constraint on genes have been used for many applications, including clinical interpretation of rare coding variants, disease gene discovery and studies of genome evolution. However, widely used metrics are severely underpowered at detecting constraints for the shortest ~25% of genes, potentially causing important pathogenic mutations to be overlooked. Here we developed a framework combining a population genetics model with machine learning on gene features to enable accurate inference of an interpretable constraint metric, s_het. Our estimates outperform existing metrics for prioritizing genes important for cell essentiality, human disease and other phenotypes, especially for short genes. Our estimates of selective constraint should have wide utility for characterizing genes relevant to human disease. Finally, our inference framework, GeneBayes, provides a flexible platform that can improve the estimation of many gene-level properties, such as rare variant burden or gene expression differences.
Affiliation(s)
- Tony Zeng
- Department of Genetics, Stanford University, Stanford, CA, USA.
| | | | - Hakhamanesh Mostafavi
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Population Health, New York University, New York, NY, USA
| | - Jonathan K Pritchard
- Department of Genetics, Stanford University, Stanford, CA, USA.
- Department of Biology, Stanford University, Stanford, CA, USA.
| |
|
13
|
Chen M, Dahl A. A robust model for cell type-specific interindividual variation in single-cell RNA sequencing data. Nat Commun 2024; 15:5229. [PMID: 38898015 PMCID: PMC11186839 DOI: 10.1038/s41467-024-49242-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2023] [Accepted: 05/28/2024] [Indexed: 06/21/2024] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) has been widely used to characterize cell types based on their average gene expression profiles. However, most studies do not consider cell type-specific variation across donors. Modelling this cell type-specific inter-individual variation could help elucidate cell type-specific biology and inform genes and cell types underlying complex traits. We therefore develop a new model to detect and quantify cell type-specific variation across individuals called CTMM (Cell Type-specific linear Mixed Model). We use extensive simulations to show that CTMM is powerful and unbiased in realistic settings. We also derive calibrated tests for cell type-specific interindividual variation, which is challenging given the modest sample sizes in scRNA-seq. We apply CTMM to scRNA-seq data from human induced pluripotent stem cells to characterize the transcriptomic variation across donors as cells differentiate into endoderm. We find that almost 100% of transcriptome-wide variability between donors is differentiation stage-specific. CTMM also identifies individual genes with statistically significant stage-specific variability across samples, including 85 genes that do not have significant stage-specific mean expression. Finally, we extend CTMM to partition interindividual covariance between stages, which recapitulates the overall differentiation trajectory. Overall, CTMM is a powerful tool to illuminate cell type-specific biology in scRNA-seq.
Affiliation(s)
- Minhui Chen
- Section of Genetic Medicine, University of Chicago, Chicago, IL, 60637, USA.
| | - Andy Dahl
- Section of Genetic Medicine, University of Chicago, Chicago, IL, 60637, USA.
| |
|
14
|
Cai YM, Lu ZQ, Li B, Huang JY, Zhang M, Chen C, Fan LY, Ma QY, He CY, Chen SN, Jiang Y, Li YM, Ning CB, Zhang FW, Wang WZ, Liu YZ, Zhang H, Jin M, Wang XY, Han JX, Xiong Z, Cai M, Huang CQ, Yang XJ, Zhu X, Zhu Y, Miao XP, Zhang SK, Wei YC, Tian JB. Genome-wide enhancer RNA profiling adds molecular links between genetic variation and human cancers. Mil Med Res 2024; 11:36. [PMID: 38863031 PMCID: PMC11165858 DOI: 10.1186/s40779-024-00539-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 05/17/2024] [Indexed: 06/13/2024] Open
Abstract
BACKGROUND Dysregulation of enhancer transcription occurs in multiple cancers. Enhancer RNAs (eRNAs) are transcribed products from enhancers that play critical roles in transcriptional control. Characterizing the genetic basis of eRNA expression may elucidate the molecular mechanisms underlying cancers. METHODS Initially, a comprehensive analysis of eRNA quantitative trait loci (eRNAQTLs) was performed in The Cancer Genome Atlas (TCGA), and functional features were characterized using multi-omics data. To establish the first eRNAQTL profiles for colorectal cancer (CRC) in China, epigenomic data were used to define active enhancers, which were subsequently integrated with transcription and genotyping data from 154 paired CRC samples. Finally, large-scale case-control studies (34,585 cases and 69,544 controls) were conducted along with multipronged experiments to investigate the potential mechanisms by which candidate eRNAQTLs affect CRC risk. RESULTS A total of 300,112 eRNAQTLs were identified across 30 different cancer types, which exert their influence on eRNA transcription by modulating chromatin status, binding affinity to transcription factors and RNA-binding proteins. These eRNAQTLs were found to be significantly enriched in cancer risk loci, explaining a substantial proportion of cancer heritability. Additionally, tumor-specific eRNAQTLs exhibited high responsiveness to the development of cancer. Moreover, the target genes of these eRNAs were associated with dysregulated signaling pathways and immune cell infiltration in cancer, highlighting their potential as therapeutic targets. Furthermore, multiple ethnic population studies have confirmed that an eRNAQTL rs3094296-T variant decreases the risk of CRC in populations from China (OR = 0.91, 95% CI 0.88-0.95, P = 2.92 × 10^-7) and Europe (OR = 0.92, 95% CI 0.88-0.95, P = 4.61 × 10^-6). Mechanistically, rs3094296 had an allele-specific effect on the transcription of the eRNA ENSR00000155786, which functioned as a transcriptional activator promoting the expression of its target gene SENP7. These two genes synergistically suppressed tumor cell proliferation. Our curated list of variants, genes, and drugs has been made available in CancereRNAQTL ( http://canernaqtl.whu.edu.cn/#/ ) to serve as an informative resource for advancing this field. CONCLUSION Our findings underscore the significance of eRNAQTLs in transcriptional regulation and disease heritability, pinpointing the potential of eRNA-based therapeutic strategies in cancers.
Affiliation(s)
- Yi-Min Cai
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China
- Department of Cancer Epidemiology, Henan Engineering Research Center of Cancer Prevention and Control, Henan International Joint Laboratory of Cancer Prevention, the Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou, 450008, China
| | - Ze-Qun Lu
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China
- Department of Cancer Epidemiology, Henan Engineering Research Center of Cancer Prevention and Control, Henan International Joint Laboratory of Cancer Prevention, the Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou, 450008, China
| | - Bin Li
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China
- Department of Cancer Epidemiology, Henan Engineering Research Center of Cancer Prevention and Control, Henan International Joint Laboratory of Cancer Prevention, the Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou, 450008, China
| | - Jin-Yu Huang
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Ming Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Can Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Lin-Yun Fan
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Qian-Ying Ma
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Chun-Yi He
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Shuo-Ni Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Yuan Jiang
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Yan-Min Li
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Cai-Bo Ning
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Fu-Wei Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Wen-Zhuo Wang
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Yi-Zhuo Liu
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Heng Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Meng Jin
- Department of Oncology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
| | - Xiao-Yang Wang
- Department of Cancer Epidemiology, Henan Engineering Research Center of Cancer Prevention and Control, Henan International Joint Laboratory of Cancer Prevention, the Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou, 450008, China
| | - Jin-Xin Han
- Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
| | - Zhen Xiong
- Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
| | - Ming Cai
- Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
| | - Chao-Qun Huang
- Department of Gastrointestinal Surgery, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China
| | - Xiao-Jun Yang
- Department of Gastrointestinal Surgery, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China
| | - Xu Zhu
- Department of Gastrointestinal Surgery, Renmin Hospital of Wuhan University, Wuhan, 430060, China
| | - Ying Zhu
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China
| | - Xiao-Ping Miao
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China.
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China.
- Department of Epidemiology and Biostatistics, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.
- Jiangsu Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, 211166, China.
| | - Shao-Kai Zhang
- Department of Cancer Epidemiology, Henan Engineering Research Center of Cancer Prevention and Control, Henan International Joint Laboratory of Cancer Prevention, the Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou, 450008, China.
| | - Yong-Chang Wei
- Department of Gastrointestinal Oncology, Hubei Cancer Clinical Study Center, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China.
| | - Jian-Bo Tian
- Department of Epidemiology and Biostatistics, School of Public Health, Research Center of Public Health, Renmin Hospital of Wuhan University, TaiKang Center for Life and Medical Sciences, Wuhan University, Wuhan, 430071, China.
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China.
- Department of Cancer Epidemiology, Henan Engineering Research Center of Cancer Prevention and Control, Henan International Joint Laboratory of Cancer Prevention, the Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou, 450008, China.
| |
|
15
|
DeGroat W, Inoue F, Ashuach T, Yosef N, Ahituv N, Kreimer A. Comprehensive network modeling approaches unravel dynamic enhancer-promoter interactions across neural differentiation. bioRxiv 2024:2024.05.22.595375. [PMID: 38826254 PMCID: PMC11142193 DOI: 10.1101/2024.05.22.595375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
Background: Increasing evidence suggests that a substantial proportion of disease-associated mutations occur in enhancers, regions of non-coding DNA essential to gene regulation. Understanding the structures and mechanisms of regulatory programs this variation affects can shed light on the apparatuses of human diseases. Results: We collected epigenetic and gene expression datasets from seven early time points during neural differentiation. Focusing on this model system, we constructed networks of enhancer-promoter interactions, each at an individual stage of neural induction. These networks served as the base for a rich series of analyses, through which we demonstrated their temporal dynamics and enrichment for various disease-associated variants. We applied the Girvan-Newman clustering algorithm to these networks to reveal biologically relevant substructures of regulation. Additionally, we demonstrated methods to validate predicted enhancer-promoter interactions using transcription factor overexpression and massively parallel reporter assays. Conclusions: Our findings suggest a generalizable framework for exploring gene regulatory programs and their dynamics across developmental processes. This includes a comprehensive approach to studying the effects of disease-associated variation on transcriptional networks. The techniques applied to our networks have been published alongside our findings as a computational tool, E-P-INAnalyzer. Our procedure can be utilized across different cellular contexts and disorders.
Affiliation(s)
- William DeGroat
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, 679 Hoes Lane West, Piscataway, NJ 08854, USA
| | - Fumitaka Inoue
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
| | - Tal Ashuach
- Department of Electrical Engineering and Computer Sciences and Center for Computational Biology, University of California, Berkeley, 387 Soda Hall, Berkeley, CA 94720, USA
| | - Nir Yosef
- Department of Systems Immunology, Weizmann Institute of Science, 234 Herzl Street, Rehovot 7610001, Israel
- Chan-Zuckerberg Biohub, 499 Illinois St, San Francisco, CA 94158, USA
- Department of Systems Immunology, Ragon Institute of MGH, MIT, and Harvard Institute of Science, 400 Technology Square, Cambridge, MA 02139, USA
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, 513 Parnassus Ave, CA 94143, USA
- Institute for Human Genetics, University of California, San Francisco, 513 Parnassus Ave, CA 94143, USA
| | - Anat Kreimer
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, 679 Hoes Lane West, Piscataway, NJ 08854, USA
- Department of Biochemistry and Molecular Biology, Rutgers, The State University of New Jersey, 604 Allison Road, Piscataway, NJ 08854, USA
|
16
|
Siraj L, Castro RI, Dewey H, Kales S, Nguyen TTL, Kanai M, Berenzy D, Mouri K, Wang QS, McCaw ZR, Gosai SJ, Aguet F, Cui R, Vockley CM, Lareau CA, Okada Y, Gusev A, Jones TR, Lander ES, Sabeti PC, Finucane HK, Reilly SK, Ulirsch JC, Tewhey R. Functional dissection of complex and molecular trait variants at single nucleotide resolution. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.05.592437. [PMID: 38766054 PMCID: PMC11100724 DOI: 10.1101/2024.05.05.592437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
Identifying the causal variants and mechanisms that drive complex traits and diseases remains a core problem in human genetics. The majority of these variants have individually weak effects and lie in non-coding gene-regulatory elements where we lack a complete understanding of how single nucleotide alterations modulate transcriptional processes to affect human phenotypes. To address this, we measured the activity of 221,412 trait-associated variants that had been statistically fine-mapped using a Massively Parallel Reporter Assay (MPRA) in 5 diverse cell-types. We show that MPRA is able to discriminate between likely causal variants and controls, identifying 12,025 regulatory variants with high precision. Although the effects of these variants largely agree with orthogonal measures of function, only 69% can plausibly be explained by the disruption of a known transcription factor (TF) binding motif. We dissect the mechanisms of 136 variants using saturation mutagenesis and assign impacted TFs for 91% of variants without a clear canonical mechanism. Finally, we provide evidence that epistasis is prevalent for variants in close proximity and identify multiple functional variants on the same haplotype at a small, but important, subset of trait-associated loci. Overall, our study provides a systematic functional characterization of likely causal common variants underlying complex and molecular human traits, enabling new insights into the regulatory grammar underlying disease risk.
Affiliation(s)
- Layla Siraj
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Program in Biophysics, Harvard Graduate School of Arts and Sciences, Boston, MA, USA
- Harvard-Massachusetts Institute of Technology MD/PhD Program, Harvard Medical School, Boston, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
| | | | | | | | | | - Masahiro Kanai
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Boston, MA, USA
| | | | | | - Qingbo S. Wang
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
- Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, Japan
| | | | - Sager J. Gosai
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - François Aguet
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
| | - Ran Cui
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Caleb A. Lareau
- Program in Computational and Systems Biology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Yukinori Okada
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
- Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, Japan
- Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan
| | - Alexander Gusev
- Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
| | - Thouis R. Jones
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Eric S. Lander
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Biology, MIT, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Pardis C. Sabeti
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Hilary K. Finucane
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Steven K. Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Wu Tsai Institute, Yale University, New Haven, CT, USA
| | - Jacob C. Ulirsch
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
| | - Ryan Tewhey
- The Jackson Laboratory, Bar Harbor, ME, USA
- Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
- Graduate School of Biomedical Sciences, Tufts University School of Medicine, Boston, MA, USA
|
17
|
Lu Z, Wang X, Carr M, Kim A, Gazal S, Mohammadi P, Wu L, Gusev A, Pirruccello J, Kachuri L, Mancuso N. Improved multi-ancestry fine-mapping identifies cis-regulatory variants underlying molecular traits and disease risk. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.04.15.24305836. [PMID: 38699369 PMCID: PMC11065034 DOI: 10.1101/2024.04.15.24305836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
Multi-ancestry statistical fine-mapping of cis-molecular quantitative trait loci (cis-molQTL) aims to improve the precision of distinguishing causal cis-molQTLs from tagging variants. However, existing approaches fail to reflect shared genetic architectures. To solve this limitation, we present the Sum of Shared Single Effects (SuShiE) model, which leverages LD heterogeneity to improve fine-mapping precision, infer cross-ancestry effect size correlations, and estimate ancestry-specific expression prediction weights. We apply SuShiE to mRNA expression measured in PBMCs (n=956) and LCLs (n=814) together with plasma protein levels (n=854) from individuals of diverse ancestries in the TOPMed MESA and GENOA studies. We find SuShiE fine-maps cis-molQTLs for 16% more genes compared with baselines while prioritizing fewer variants with greater functional enrichment. SuShiE infers highly consistent cis-molQTL architectures across ancestries on average; however, we also find evidence of heterogeneity at genes with predicted loss-of-function intolerance, suggesting that environmental interactions may partially explain differences in cis-molQTL effect sizes across ancestries. Lastly, we leverage estimated cis-molQTL effect-sizes to perform individual-level TWAS and PWAS on six white blood cell-related traits in AOU Biobank individuals (n=86k), and identify 44 more genes compared with baselines, further highlighting its benefits in identifying genes relevant for complex disease risk. Overall, SuShiE provides new insights into the cis-genetic architecture of molecular traits.
Affiliation(s)
- Zeyun Lu
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Xinran Wang
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Matthew Carr
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Artem Kim
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Steven Gazal
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA
| | - Pejman Mohammadi
- Center for Immunity and Immunotherapies, Seattle Children’s Research Institute, Seattle, WA, USA
- Department of Pediatrics, University of Washington School of Medicine, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Lang Wu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaiʻi Cancer Center, University of Hawaiʻi at Mānoa, Honolulu, HI, USA
| | - Alexander Gusev
- Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
| | - James Pirruccello
- Division of Cardiology, University of California San Francisco, San Francisco, CA, USA
| | - Linda Kachuri
- Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA, USA
| | - Nicholas Mancuso
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA
|
18
|
Zeng T, Spence JP, Mostafavi H, Pritchard JK. Bayesian estimation of gene constraint from an evolutionary model with gene features. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.05.19.541520. [PMID: 37292653 PMCID: PMC10245655 DOI: 10.1101/2023.05.19.541520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Measures of selective constraint on genes have been used for many applications including clinical interpretation of rare coding variants, disease gene discovery, and studies of genome evolution. However, widely-used metrics are severely underpowered at detecting constraint for the shortest ∼25% of genes, potentially causing important pathogenic mutations to be overlooked. We developed a framework combining a population genetics model with machine learning on gene features to enable accurate inference of an interpretable constraint metric, s_het. Our estimates outperform existing metrics for prioritizing genes important for cell essentiality, human disease, and other phenotypes, especially for short genes. Our new estimates of selective constraint should have wide utility for characterizing genes relevant to human disease. Finally, our inference framework, GeneBayes, provides a flexible platform that can improve estimation of many gene-level properties, such as rare variant burden or gene expression differences.
Affiliation(s)
- Tony Zeng
- Department of Genetics, Stanford University, Stanford CA
| | | | | | - Jonathan K. Pritchard
- Department of Genetics, Stanford University, Stanford CA
- Department of Biology, Stanford University, Stanford CA
|
19
|
Sakaue S, Weinand K, Isaac S, Dey KK, Jagadeesh K, Kanai M, Watts GFM, Zhu Z, Brenner MB, McDavid A, Donlin LT, Wei K, Price AL, Raychaudhuri S. Tissue-specific enhancer-gene maps from multimodal single-cell data identify causal disease alleles. Nat Genet 2024; 56:615-626. [PMID: 38594305 PMCID: PMC11456345 DOI: 10.1038/s41588-024-01682-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Accepted: 02/07/2024] [Indexed: 04/11/2024]
Abstract
Translating genome-wide association study (GWAS) loci into causal variants and genes requires accurate cell-type-specific enhancer-gene maps from disease-relevant tissues. Building enhancer-gene maps is essential but challenging with current experimental methods in primary human tissues. Here we developed a nonparametric statistical method, SCENT (single-cell enhancer target gene mapping), that models association between enhancer chromatin accessibility and gene expression in single-cell or nucleus multimodal RNA sequencing and ATAC sequencing data. We applied SCENT to 9 multimodal datasets including >120,000 single cells or nuclei and created 23 cell-type-specific enhancer-gene maps. These maps were highly enriched for causal variants in expression quantitative trait loci and GWAS for 1,143 diseases and traits. We identified likely causal genes for both common and rare diseases and linked somatic mutation hotspots to target genes. We demonstrate that application of SCENT to multimodal data from disease-relevant human tissue enables the scalable construction of accurate cell-type-specific enhancer-gene maps, essential for defining noncoding variant function.
Affiliation(s)
- Saori Sakaue
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Kathryn Weinand
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Shakson Isaac
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Kushal K Dey
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Karthik Jagadeesh
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Masahiro Kanai
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Boston, MA, USA
| | - Gerald F M Watts
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Zhu Zhu
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Michael B Brenner
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Andrew McDavid
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY, USA
| | - Laura T Donlin
- Hospital for Special Surgery, New York, NY, USA
- Weill Cornell Medicine, New York, NY, USA
| | - Kevin Wei
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Alkes L Price
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Soumya Raychaudhuri
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
|
20
|
Houzelstein D, Eozenou C, Lagos CF, Elzaiat M, Bignon-Topalovic J, Gonzalez I, Laville V, Schlick L, Wankanit S, Madon P, Kirtane J, Athalye A, Buonocore F, Bigou S, Conway GS, Bohl D, Achermann JC, Bashamboo A, McElreavey K. A conserved NR5A1-responsive enhancer regulates SRY in testis-determination. Nat Commun 2024; 15:2796. [PMID: 38555298 PMCID: PMC10981742 DOI: 10.1038/s41467-024-47162-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2022] [Accepted: 03/21/2024] [Indexed: 04/02/2024] Open
Abstract
The Y-linked SRY gene initiates mammalian testis-determination. However, how the expression of SRY is regulated remains elusive. Here, we demonstrate that a conserved steroidogenic factor-1 (SF-1)/NR5A1 binding enhancer is required for appropriate SRY expression to initiate testis-determination in humans. Comparative sequence analysis of SRY 5' regions in mammals identified an evolutionarily conserved SF-1/NR5A1-binding motif within a 250 bp region of open chromatin located 5 kilobases upstream of the SRY transcription start site. Genomic analysis of 46,XY individuals with disrupted testis-determination, including a large multigenerational family, identified unique single-base substitutions of highly conserved residues within the SF-1/NR5A1-binding element. In silico modelling and in vitro assays demonstrate the enhancer properties of the NR5A1 motif. Deletion of this hemizygous element by genome-editing, in a novel in vitro cellular model recapitulating human Sertoli cell formation, resulted in a significant reduction in expression of SRY. Therefore, human NR5A1 acts as a regulatory switch between testis and ovary development by upregulating SRY expression, a role that may predate the eutherian radiation. We show that disruption of an enhancer can phenocopy variants in the coding regions of SRY that cause human testis dysgenesis. Since disease-causing variants in enhancers are currently rare, the regulation of gene expression in testis-determination offers a paradigm to define enhancer activity in a key developmental process.
Affiliation(s)
- Denis Houzelstein
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France.
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France.
| | - Caroline Eozenou
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
- Institut Cochin, Université Paris Cité, INSERM, CNRS, Paris, France
| | - Carlos F Lagos
- Chemical Biology & Drug Discovery Lab, Escuela de Química y Farmacia, Facultad de Medicina y Ciencia, Universidad San Sebastián, Campus Los Leones, Lota 2465 Providencia, 7510157, Santiago, Chile
- Centro Ciencia & Vida, Fundación Ciencia & Vida, Av. del Valle Norte 725, Huechuraba, 8580702, Santiago, Chile
| | - Maëva Elzaiat
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
| | - Joelle Bignon-Topalovic
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
| | - Inma Gonzalez
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
- Institut Pasteur, Université Paris Cité, Epigenomics, Proliferation, and the Identity of Cells Unit, F-75015, Paris, France
| | - Vincent Laville
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
- Institut Pasteur, Université Paris Cité, Stem Cells and Development Unit, F-75015, Paris, France
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, F-75015, Paris, France
| | - Laurène Schlick
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
| | - Somboon Wankanit
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
- Department of Pediatrics, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
| | - Prochi Madon
- Department of Assisted Reproduction and Genetics, Jaslok Hospital and Research Centre, Mumbai, India
| | - Jyotsna Kirtane
- Department of Pediatric Surgery, Jaslok Hospital and Research Centre, Mumbai, India
| | - Arundhati Athalye
- Department of Assisted Reproduction and Genetics, Jaslok Hospital and Research Centre, Mumbai, India
| | - Federica Buonocore
- Genetics and Genomic Medicine Research & Teaching Department, UCL GOS Institute of Child Health, University College London, London, United Kingdom
| | - Stéphanie Bigou
- ICV-iPS core facility, Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, Inserm, CNRS, APHP, Hôpital de la Pitié Salpêtrière, Paris, France
| | - Gerard S Conway
- Institute for Women's Health, University College London, London, United Kingdom
| | - Delphine Bohl
- ICV-iPS core facility, Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, Inserm, CNRS, APHP, Hôpital de la Pitié Salpêtrière, Paris, France
- Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, Inserm, CNRS, APHP, Hôpital de la Pitié Salpêtrière, Paris, France
| | - John C Achermann
- Genetics and Genomic Medicine Research & Teaching Department, UCL GOS Institute of Child Health, University College London, London, United Kingdom
| | - Anu Bashamboo
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
| | - Ken McElreavey
- Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France.
- Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France.
|
21
|
Lin W, Wall JD, Li G, Newman D, Yang Y, Abney M, VandeBerg JL, Olivier M, Gilad Y, Cox LA. Genetic regulatory effects in response to a high-cholesterol, high-fat diet in baboons. CELL GENOMICS 2024; 4:100509. [PMID: 38430910 PMCID: PMC10943580 DOI: 10.1016/j.xgen.2024.100509] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/01/2023] [Revised: 11/20/2023] [Accepted: 02/05/2024] [Indexed: 03/05/2024]
Abstract
Steady-state expression quantitative trait loci (eQTLs) explain only a fraction of disease-associated loci identified through genome-wide association studies (GWASs), while eQTLs involved in gene-by-environment (GxE) interactions have rarely been characterized in humans due to experimental challenges. Using a baboon model, we found hundreds of eQTLs that emerge in adipose, liver, and muscle after prolonged exposure to high dietary fat and cholesterol. Diet-responsive eQTLs exhibit genomic localization and genic features that are distinct from steady-state eQTLs. Furthermore, the human orthologs associated with diet-responsive eQTLs are enriched for GWAS genes associated with human metabolic traits, suggesting that context-responsive eQTLs with more complex regulatory effects are likely to explain GWAS hits that do not seem to overlap with standard eQTLs. Our results highlight the complexity of genetic regulatory effects and the potential of eQTLs with disease-relevant GxE interactions in enhancing the understanding of GWAS signals for human complex disease using non-human primate models.
Affiliation(s)
- Wenhe Lin
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA.
| | - Jeffrey D Wall
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Ge Li
- Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA
| | - Deborah Newman
- Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX 78229, USA
| | - Yunqi Yang
- Committee on Genetics, Genomics and System Biology, The University of Chicago, Chicago, IL 60637, USA
| | - Mark Abney
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA
| | - John L VandeBerg
- Department of Human Genetics, South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, TX 78520, USA
| | - Michael Olivier
- Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA
| | - Yoav Gilad
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA; Department of Medicine, Section of Genetic Medicine, The University of Chicago, Chicago, IL 60637, USA.
| | - Laura A Cox
- Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC 27157, USA; Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX 78229, USA.
|
22
|
Lappalainen T, Li YI, Ramachandran S, Gusev A. Genetic and molecular architecture of complex traits. Cell 2024; 187:1059-1075. [PMID: 38428388 PMCID: PMC10977002 DOI: 10.1016/j.cell.2024.01.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/20/2023] [Accepted: 01/16/2024] [Indexed: 03/03/2024]
Abstract
Human genetics has emerged as one of the most dynamic areas of biology, with a broadening societal impact. In this review, we discuss recent achievements, ongoing efforts, and future challenges in the field. Advances in technology, statistical methods, and the growing scale of research efforts have all provided many insights into the processes that have given rise to the current patterns of genetic variation. Vast maps of genetic associations with human traits and diseases have allowed characterization of their genetic architecture. Finally, studies of molecular and cellular effects of genetic variants have provided insights into biological processes underlying disease. Many outstanding questions remain, but the field is well poised for groundbreaking discoveries as it increases the use of genetic data to understand both the history of our species and its applications to improve human health.
Affiliation(s)
- Tuuli Lappalainen
- New York Genome Center, New York, NY, USA; Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden.
| | - Yang I Li
- Section of Genetic Medicine, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA
| | - Sohini Ramachandran
- Ecology, Evolution and Organismal Biology, Center for Computational Molecular Biology, and the Data Science Institute, Brown University, Providence, RI 02912, USA
| | - Alexander Gusev
- Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
| |
|
23
|
Lin W, Wall JD, Li G, Newman D, Yang Y, Abney M, VandeBerg JL, Olivier M, Gilad Y, Cox LA. Genetic regulatory effects in response to a high cholesterol, high fat diet in baboons. bioRxiv 2023:2023.08.01.551489. [PMID: 37577666 PMCID: PMC10418186 DOI: 10.1101/2023.08.01.551489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
Steady-state expression quantitative trait loci (eQTLs) explain only a fraction of disease-associated loci identified through genome-wide association studies (GWAS), while eQTLs involved in gene-by-environment (GxE) interactions have rarely been characterized in humans due to experimental challenges. Using a baboon model, we found hundreds of eQTLs that emerge in adipose, liver, and muscle after prolonged exposure to high dietary fat and cholesterol. Diet-responsive eQTLs exhibit genomic localization and genic features that are distinct from steady-state eQTLs. Furthermore, the human orthologs associated with diet-responsive eQTLs are enriched for GWAS genes associated with human metabolic traits, suggesting that context-responsive eQTLs with more complex regulatory effects are likely to explain GWAS hits that do not seem to overlap with standard eQTLs. Our results highlight the complexity of genetic regulatory effects and the potential of eQTLs with disease-relevant GxE interactions in enhancing the understanding of GWAS signals for human complex disease using nonhuman primate models.
Affiliation(s)
- Wenhe Lin
- Department of Human Genetics, The University of Chicago, Chicago, USA
| | - Jeffrey D. Wall
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Present address: Galatea Bio, Hialeah, FL, USA
| | - Ge Li
- Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Deborah Newman
- Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX, USA
| | - Yunqi Yang
- Committee on Genetics, Genomics and System Biology, The University of Chicago, Chicago, USA
| | - Mark Abney
- Department of Human Genetics, The University of Chicago, Chicago, USA
| | - John L. VandeBerg
- Department of Human Genetics, South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, TX, USA
| | - Michael Olivier
- Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Yoav Gilad
- Department of Human Genetics, The University of Chicago, Chicago, USA
- Department of Medicine, Section of Genetic Medicine, The University of Chicago, Chicago, IL, USA
- Lead contact
| | - Laura A. Cox
- Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX, USA
| |
|
24
|
Gschwind AR, Mualim KS, Karbalayghareh A, Sheth MU, Dey KK, Jagoda E, Nurtdinov RN, Xi W, Tan AS, Jones H, Ma XR, Yao D, Nasser J, Avsec Ž, James BT, Shamim MS, Durand NC, Rao SSP, Mahajan R, Doughty BR, Andreeva K, Ulirsch JC, Fan K, Perez EM, Nguyen TC, Kelley DR, Finucane HK, Moore JE, Weng Z, Kellis M, Bassik MC, Price AL, Beer MA, Guigó R, Stamatoyannopoulos JA, Lieberman Aiden E, Greenleaf WJ, Leslie CS, Steinmetz LM, Kundaje A, Engreitz JM. An encyclopedia of enhancer-gene regulatory interactions in the human genome. bioRxiv 2023:2023.11.09.563812. [PMID: 38014075 PMCID: PMC10680627 DOI: 10.1101/2023.11.09.563812] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Indexed: 11/29/2023]
Abstract
Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the impact of human genetic variation on disease (refs. 1-6). Here we create and evaluate a resource of >13 million enhancer-gene regulatory interactions across 352 cell types and tissues, by integrating predictive models, measurements of chromatin state and 3D contacts, and large-scale genetic perturbations generated by the ENCODE Consortium (ref. 7). We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,411 element-gene pairs measured in CRISPR perturbation experiments, >30,000 fine-mapped eQTLs, and 569 fine-mapped GWAS variants linked to a likely causal gene. Using this framework, we develop a new predictive model, ENCODE-rE2G, that achieves state-of-the-art performance across multiple prediction tasks, demonstrating a strategy involving iterative perturbations and supervised machine learning to build increasingly accurate predictive models of enhancer regulation. Using the ENCODE-rE2G model, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, which reveals global properties of enhancer networks, identifies differences in the functions of genes that have more or less complex regulatory landscapes, and improves analyses to link noncoding variants to target genes and cell types for common, complex diseases. By interpreting the model, we find evidence that, beyond enhancer activity and 3D enhancer-promoter contacts, additional features guide enhancer-promoter communication including promoter class and enhancer-enhancer synergy. Altogether, these genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models, and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.
Affiliation(s)
- Andreas R. Gschwind
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | - Kristy S. Mualim
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Plant Biology, Carnegie Institute of Science, Stanford, CA, USA
| | - Alireza Karbalayghareh
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Maya U. Sheth
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
- Department of Bioengineering, Stanford University, Stanford, CA, USA
| | - Kushal K. Dey
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Evelyn Jagoda
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Ramil N. Nurtdinov
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
| | - Wang Xi
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Anthony S. Tan
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | - Hank Jones
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | - X. Rosa Ma
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | - David Yao
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Joseph Nasser
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Present Address: Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | | | - Benjamin T. James
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Muhammad S. Shamim
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Bioengineering, Rice University, Houston, TX, USA
- Medical Scientist Training Program, Baylor College of Medicine, Houston, Texas, USA
| | - Neva C. Durand
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Computer Science, Rice University, Houston, TX, USA
| | - Suhas S. P. Rao
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Department of Medicine, University of California San Francisco, San Francisco, CA, USA
- Department of Structural Biology, Stanford University, Stanford, CA, USA
| | - Ragini Mahajan
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Biosciences, Rice University, Houston, TX, USA
| | - Benjamin R. Doughty
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Kalina Andreeva
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Jacob C. Ulirsch
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Present Address: Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA
| | - Kaili Fan
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Present Address: Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, USA
| | | | - Tri C. Nguyen
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | | | - Hilary K. Finucane
- Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jill E. Moore
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Manolis Kellis
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Michael C. Bassik
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Alkes L. Price
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Michael A. Beer
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Roderic Guigó
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
| | - John A. Stamatoyannopoulos
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Clinical Research Division, Fred Hutch Cancer Center, Seattle, WA, USA
| | - Erez Lieberman Aiden
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Computer Science, Rice University, Houston, TX, USA
| | - William J. Greenleaf
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Applied Physics, Stanford University, Stanford, CA, USA
| | | | - Lars M. Steinmetz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Genome Technology Center, Palo Alto, CA, USA
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
| | - Anshul Kundaje
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Jesse M. Engreitz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA
| |
|
25
|
You J, Liu Z, Qi Z, Ma Y, Sun M, Su L, Niu H, Peng Y, Luo X, Zhu M, Huang Y, Chang X, Hu X, Zhang Y, Pi R, Liu Y, Meng Q, Li J, Zhang Q, Zhu L, Lin Z, Min L, Yuan D, Grover CE, Fang DD, Lindsey K, Wendel JF, Tu L, Zhang X, Wang M. Regulatory controls of duplicated gene expression during fiber development in allotetraploid cotton. Nat Genet 2023; 55:1987-1997. [PMID: 37845354 PMCID: PMC10632151 DOI: 10.1038/s41588-023-01530-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/22/2022] [Accepted: 09/14/2023] [Indexed: 10/18/2023]
Abstract
Polyploidy complicates transcriptional regulation and increases phenotypic diversity in organisms. The dynamics of genetic regulation of gene expression between coresident subgenomes in polyploids remains to be understood. Here we document the genetic regulation of fiber development in allotetraploid cotton Gossypium hirsutum by sequencing 376 genomes and 2,215 time-series transcriptomes. We characterize 1,258 genes comprising 36 genetic modules that control staged fiber development and uncover genetic components governing their partitioned expression relative to subgenomic duplicated genes (homoeologs). Only about 30% of fiber quality-related homoeologs show phenotypically favorable allele aggregation in cultivars, highlighting the potential for subgenome additivity in fiber improvement. We envision a genome-enabled breeding strategy, with particular attention to 48 favorable alleles related to fiber phenotypes that have been subjected to purifying selection during domestication. Our work delineates the dynamics of gene regulation during fiber development and highlights the potential of subgenomic coordination underpinning phenotypes in polyploid plants.
Affiliation(s)
- Jiaqi You
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Zhenping Liu
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Zhengyang Qi
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Yizan Ma
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Mengling Sun
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Ling Su
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Hao Niu
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Yabing Peng
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Xuanxuan Luo
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Mengmeng Zhu
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Yuefan Huang
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Xing Chang
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Xiubao Hu
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Yuqi Zhang
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Ruizhen Pi
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Yuqi Liu
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Qingying Meng
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Jianying Li
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Qinghua Zhang
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Longfu Zhu
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Zhongxu Lin
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Ling Min
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Daojun Yuan
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China
| | - Corrinne E Grover
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, USA
| | - David D Fang
- Cotton Fiber Bioscience Research Unit, USDA-ARS, Southern Regional Research Center, New Orleans, LA, USA
| | - Keith Lindsey
- Department of Biosciences, Durham University, Durham, UK
| | - Jonathan F Wendel
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, USA
| | - Lili Tu
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China.
| | - Xianlong Zhang
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China.
| | - Maojun Wang
- National Key Laboratory of Crop Genetic Improvement, Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan, China.
| |
|
26
|
Mostafavi H, Spence JP, Naqvi S, Pritchard JK. Systematic differences in discovery of genetic effects on gene expression and complex traits. Nat Genet 2023; 55:1866-1875. [PMID: 37857933 DOI: 10.1038/s41588-023-01529-1] [Citation(s) in RCA: 125] [Impact Index Per Article: 62.5] [Received: 05/15/2022] [Accepted: 09/14/2023] [Indexed: 10/21/2023]
Abstract
Most signals in genome-wide association studies (GWAS) of complex traits implicate noncoding genetic variants with putative gene regulatory effects. However, currently identified regulatory variants, notably expression quantitative trait loci (eQTLs), explain only a small fraction of GWAS signals. Here, we show that GWAS and cis-eQTL hits are systematically different: eQTLs cluster strongly near transcription start sites, whereas GWAS hits do not. Genes near GWAS hits are enriched in key functional annotations, are under strong selective constraint and have complex regulatory landscapes across different tissue/cell types, whereas genes near eQTLs are depleted of most functional annotations, show relaxed constraint, and have simpler regulatory landscapes. We describe a model to understand these observations, including how natural selection on complex traits hinders discovery of functionally relevant eQTLs. Our results imply that GWAS and eQTL studies are systematically biased toward different types of variant, and support the use of complementary functional approaches alongside the next generation of eQTL studies.
Affiliation(s)
| | | | - Sahin Naqvi
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA, USA
| | - Jonathan K Pritchard
- Department of Genetics, Stanford University, Stanford, CA, USA.
- Department of Biology, Stanford University, Stanford, CA, USA.
| |
|
27
|
Brown BC, Morris JA, Lappalainen T, Knowles DA. Large-scale causal discovery using interventional data sheds light on the regulatory network architecture of blood traits. bioRxiv 2023:2023.10.13.562293. [PMID: 37905013 PMCID: PMC10614812 DOI: 10.1101/2023.10.13.562293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Inference of directed biological networks is an important but notoriously challenging problem. We introduce inverse sparse regression (inspre), an approach to learning causal networks that leverages large-scale intervention-response data. Applied to 788 genes from the genome-wide perturb-seq dataset, inspre helps elucidate the network architecture of blood traits.
Affiliation(s)
- Brielin C. Brown
- New York Genome Center, New York, NY, USA
- Data Science Institute, Columbia University, New York, NY, USA
| | | | - Tuuli Lappalainen
- New York Genome Center, New York, NY, USA
- Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Department of Systems Biology, Columbia University, New York, NY
| | - David A. Knowles
- New York Genome Center, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY
- Department of Computer Science, Columbia University, New York, NY
| |
|
28
|
McAfee JC, Lee S, Lee J, Bell JL, Krupa O, Davis J, Insigne K, Bond ML, Zhao N, Boyle AP, Phanstiel DH, Love MI, Stein JL, Ruzicka WB, Davila-Velderrain J, Kosuri S, Won H. Systematic investigation of allelic regulatory activity of schizophrenia-associated common variants. Cell Genomics 2023; 3:100404. [PMID: 37868037 PMCID: PMC10589626 DOI: 10.1016/j.xgen.2023.100404] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 07/22/2022] [Revised: 02/23/2023] [Accepted: 08/21/2023] [Indexed: 10/24/2023]
Abstract
Genome-wide association studies (GWASs) have successfully identified 145 genomic regions that contribute to schizophrenia risk, but linkage disequilibrium makes it challenging to discern causal variants. We performed a massively parallel reporter assay (MPRA) on 5,173 fine-mapped schizophrenia GWAS variants in primary human neural progenitors and identified 439 variants with allelic regulatory effects (MPRA-positive variants). Transcription factor binding had modest predictive power, while fine-map posterior probability, enhancer overlap, and evolutionary conservation failed to predict MPRA-positive variants. Furthermore, 64% of MPRA-positive variants did not exhibit an expression quantitative trait loci signature, suggesting that MPRA could identify as-yet-unexplored variants with regulatory potential. To predict the combinatorial effect of MPRA-positive variants on gene regulation, we propose an accessibility-by-contact model that combines MPRA-measured allelic activity with neuronal chromatin architecture.
Affiliation(s)
- Jessica C. McAfee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Sool Lee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jiseok Lee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jessica L. Bell
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Oleh Krupa
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jessica Davis
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Kimberly Insigne
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Marielle L. Bond
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Nanxiang Zhao
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Alan P. Boyle
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Douglas H. Phanstiel
- Curriculum in Genetics and Molecular Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Thurston Arthritis Research Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Michael I. Love
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jason L. Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - W. Brad Ruzicka
- Laboratory for Epigenomics in Human Psychopathology, McLean Hospital, Belmont, MA 02141, USA
- Harvard Medical School, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | | | - Sriram Kosuri
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Hyejung Won
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| |
|
29
|
Liu Z, Huang YF. Deep multiple-instance learning accurately predicts gene haploinsufficiency and deletion pathogenicity. bioRxiv 2023:2023.08.29.555384. [PMID: 37693607 PMCID: PMC10491176 DOI: 10.1101/2023.08.29.555384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2023]
Abstract
Copy number losses (deletions) are a major contributor to the etiology of severe genetic disorders. Although haploinsufficient genes play a critical role in deletion pathogenicity, current methods for deletion pathogenicity prediction fail to integrate multiple lines of evidence for haploinsufficiency at the gene level, limiting their power to pinpoint deleterious deletions associated with genetic disorders. Here we introduce DosaCNV, a deep multiple-instance learning framework that, for the first time, models deletion pathogenicity jointly with gene haploinsufficiency. By integrating over 30 gene-level features potentially predictive of haploinsufficiency, DosaCNV shows unmatched performance in prioritizing pathogenic deletions associated with a broad spectrum of genetic disorders. Furthermore, DosaCNV outperforms existing methods in predicting gene haploinsufficiency even though it is not trained on known haploinsufficient genes. Finally, DosaCNV leverages a state-of-the-art technique to quantify the contributions of individual gene-level features to haploinsufficiency, allowing for human-understandable explanations of model predictions. Altogether, DosaCNV is a powerful computational tool for both fundamental and translational research.
Affiliation(s)
- Zhihan Liu
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
- Molecular, Cellular, and Integrative Biosciences Program, Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
| | - Yi-Fei Huang
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
| |
|
30
|
Zeng T, Spence JP, Mostafavi H, Pritchard JK. Bayesian estimation of gene constraint from an evolutionary model with gene features. Research Square 2023:rs.3.rs-3012879. [PMID: 37398424 PMCID: PMC10312940 DOI: 10.21203/rs.3.rs-3012879/v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 07/04/2023]
Abstract
Measures of selective constraint on genes have been used for many applications including clinical interpretation of rare coding variants, disease gene discovery, and studies of genome evolution. However, widely-used metrics are severely underpowered at detecting constraint for the shortest ~25% of genes, potentially causing important pathogenic mutations to be overlooked. We developed a framework combining a population genetics model with machine learning on gene features to enable accurate inference of an interpretable constraint metric, s_het. Our estimates outperform existing metrics for prioritizing genes important for cell essentiality, human disease, and other phenotypes, especially for short genes. Our new estimates of selective constraint should have wide utility for characterizing genes relevant to human disease. Finally, our inference framework, GeneBayes, provides a flexible platform that can improve estimation of many gene-level properties, such as rare variant burden or gene expression differences.
Affiliation(s)
- Tony Zeng
- Department of Genetics, Stanford University, Stanford CA
| | | | | | - Jonathan K. Pritchard
- Department of Genetics, Stanford University, Stanford CA
- Department of Biology, Stanford University, Stanford CA
| |
|
31
|
Ziyani C, Delaneau O, Ribeiro DM. Multimodal single cell analysis infers widespread enhancer co-activity in a lymphoblastoid cell line. Commun Biol 2023; 6:563. [PMID: 37237005 PMCID: PMC10219981 DOI: 10.1038/s42003-023-04954-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 05/18/2023] [Indexed: 05/28/2023] Open
Abstract
Non-coding regulatory elements such as enhancers are key in controlling the cell-type specificity and spatio-temporal expression of genes. To drive stable and precise gene transcription robust to genetic variation and environmental stress, genes are often targeted by multiple enhancers with redundant action. However, it is unknown whether enhancers targeting the same gene display simultaneous activity or whether some enhancer combinations are more often co-active than others. Here, we take advantage of recent developments in single cell technology that permit assessing chromatin status (scATAC-seq) and gene expression (scRNA-seq) in the same single cells to correlate gene expression to the activity of multiple enhancers. Measuring activity patterns across 24,844 human lymphoblastoid single cells, we find that the majority of enhancers associated with the same gene display significant correlation in their chromatin profiles. For 6944 expressed genes associated with enhancers, we predict 89,885 significant enhancer-enhancer associations between nearby enhancers. We find that associated enhancers share similar transcription factor binding profiles and that gene essentiality is linked with higher enhancer co-activity. We provide a set of predicted enhancer-enhancer associations based on correlation derived from a single cell line, which can be further investigated for functional relevance.
Affiliation(s)
- Chaymae Ziyani
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Olivier Delaneau
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Diogo M Ribeiro
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland.
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
| |
|
32
|
Romero IG. Seeing humans through an evolutionary lens. Science 2023; 380:360-361. [PMID: 37104588 DOI: 10.1126/science.adh0745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/29/2023]
Abstract
A collection of mammalian genomes provides insights into human biology and evolution.
Affiliation(s)
- Irene Gallego Romero
- Melbourne Integrative Genomics, University of Melbourne, Parkville, VIC, Australia
- School of BioSciences, University of Melbourne, Parkville, VIC, Australia
| |
|
33
|
Chan WF, Coughlan HD, Ruhle M, Iannarella N, Alvarado C, Groom JR, Keenan CR, Kueh AJ, Wheatley AK, Smyth GK, Allan RS, Johanson TM. Survey of activation-induced genome architecture reveals a novel enhancer of Myc. Immunol Cell Biol 2023; 101:345-357. [PMID: 36710659 PMCID: PMC10952581 DOI: 10.1111/imcb.12626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Revised: 01/25/2023] [Accepted: 01/27/2023] [Indexed: 01/31/2023]
Abstract
The transcription factor Myc is critically important in driving cell proliferation, a function that is frequently dysregulated in cancer. To avoid this dysregulation, Myc is tightly controlled by numerous layers of regulation. One such layer is the use of distal regulatory enhancers to drive Myc expression. Here, using chromosome conformation capture to examine B cells of the immune system in the first hours after their activation, we reveal a previously unidentified enhancer of Myc. The interactivity of this enhancer coincides with a dramatic, but discrete, spike in Myc expression 3 h post-activation. However, genetic deletion of this region has little impact on Myc expression, Myc protein level, or in vitro and in vivo cell proliferation. Examination of the enhancer-deleted regulatory landscape suggests that enhancer redundancy likely sustains Myc expression. This work highlights not only the importance of temporally examining enhancers, but also the complexity and dynamics of the regulation of critical genes such as Myc.
Affiliation(s)
- Wing Fuk Chan
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Hannah D Coughlan
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Michelle Ruhle
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Nadia Iannarella
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Carolina Alvarado
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Joanna R Groom
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Christine R Keenan
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Andrew J Kueh
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Adam K Wheatley
- Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, VIC, Australia
| | - Gordon K Smyth
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- School of Mathematics and Statistics, The University of Melbourne, Parkville, VIC, Australia
| | - Rhys S Allan
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| | - Timothy M Johanson
- The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
| |
|
34
|
de Klein N, Tsai EA, Vochteloo M, Baird D, Huang Y, Chen CY, van Dam S, Oelen R, Deelen P, Bakker OB, El Garwany O, Ouyang Z, Marshall EE, Zavodszky MI, van Rheenen W, Bakker MK, Veldink J, Gaunt TR, Runz H, Franke L, Westra HJ. Brain expression quantitative trait locus and network analyses reveal downstream effects and putative drivers for brain-related diseases. Nat Genet 2023; 55:377-388. [PMID: 36823318 PMCID: PMC10011140 DOI: 10.1038/s41588-023-01300-6] [Citation(s) in RCA: 67] [Impact Index Per Article: 33.5] [Received: 03/12/2021] [Accepted: 01/17/2023] [Indexed: 02/25/2023]
Abstract
Identification of therapeutic targets from genome-wide association studies (GWAS) requires insights into downstream functional consequences. We harmonized 8,613 RNA-sequencing samples from 14 brain datasets to create the MetaBrain resource and performed cis- and trans-expression quantitative trait locus (eQTL) meta-analyses in multiple brain region- and ancestry-specific datasets (n ≤ 2,759). Many of the 16,169 cortex cis-eQTLs were tissue-dependent when compared with blood cis-eQTLs. We inferred brain cell types for 3,549 cis-eQTLs by interaction analysis. We prioritized 186 cis-eQTLs for 31 brain-related traits using Mendelian randomization and co-localization including 40 cis-eQTLs with an inferred cell type, such as a neuron-specific cis-eQTL (CYP24A1) for multiple sclerosis. We further describe 737 trans-eQTLs for 526 unique variants and 108 unique genes. We used brain-specific gene-co-regulation networks to link GWAS loci and prioritize additional genes for five central nervous system diseases. This study represents a valuable resource for post-GWAS research on central nervous system diseases.
Affiliation(s)
- Niek de Klein
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Wellcome Sanger Institute, Hinxton, UK
| | - Ellen A Tsai
- Translational Biology, Research and Development, Biogen Inc., Cambridge, MA, USA
| | - Martijn Vochteloo
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Institute for Life Science and Technology, Hanze University of Applied Sciences, Groningen, The Netherlands
- Oncode Institute, Groningen, The Netherlands
| | - Denis Baird
- Translational Biology, Research and Development, Biogen Inc., Cambridge, MA, USA
| | - Yunfeng Huang
- Translational Biology, Research and Development, Biogen Inc., Cambridge, MA, USA
| | - Chia-Yen Chen
- Translational Biology, Research and Development, Biogen Inc., Cambridge, MA, USA
| | - Sipko van Dam
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Ancora Health, Groningen, The Netherlands
| | - Roy Oelen
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Oncode Institute, Groningen, The Netherlands
| | - Patrick Deelen
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Oncode Institute, Groningen, The Netherlands
| | - Olivier B Bakker
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Wellcome Sanger Institute, Hinxton, UK
| | - Omar El Garwany
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Wellcome Sanger Institute, Hinxton, UK
| | | | - Eric E Marshall
- Translational Biology, Research and Development, Biogen Inc., Cambridge, MA, USA
| | - Maria I Zavodszky
- Translational Biology, Research and Development, Biogen Inc., Cambridge, MA, USA
| | - Wouter van Rheenen
- Department of Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Mark K Bakker
- Department of Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Jan Veldink
- Department of Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Tom R Gaunt
- MRC Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, UK
| | - Heiko Runz
- Translational Biology, Research and Development, Biogen Inc., Cambridge, MA, USA.
| | - Lude Franke
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands.
- Oncode Institute, Groningen, The Netherlands.
| | - Harm-Jan Westra
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands.
- Oncode Institute, Groningen, The Netherlands.
| |
|
35
|
Morova T, Ding Y, Huang CCF, Sar F, Schwarz T, Giambartolomei C, Baca S, Grishin D, Hach F, Gusev A, Freedman M, Pasaniuc B, Lack N. Optimized high-throughput screening of non-coding variants identified from genome-wide association studies. Nucleic Acids Res 2022; 51:e18. [PMID: 36546757 PMCID: PMC9943666 DOI: 10.1093/nar/gkac1198] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/26/2022] [Revised: 11/19/2022] [Accepted: 12/06/2022] [Indexed: 12/24/2022] Open
Abstract
The vast majority of disease-associated single nucleotide polymorphisms (SNPs) identified from genome-wide association studies (GWAS) are localized in non-coding regions. A significant fraction of these variants impact transcription factor binding to enhancer elements and alter gene expression. To functionally interrogate the activity of such variants we developed snpSTARRseq, a high-throughput experimental method that can interrogate the functional impact of hundreds to thousands of non-coding variants on enhancer activity. snpSTARRseq dramatically improves signal-to-noise by utilizing a novel sequencing and bioinformatic approach that increases both insert size and the number of variants tested per locus. Using this strategy, we interrogated known prostate cancer (PCa) risk-associated loci and demonstrated that 35% of them harbor SNPs that significantly altered enhancer activity. Combining these results with chromosomal looping data we could identify interacting genes and provide a mechanism of action for 20 PCa GWAS risk regions. When benchmarked to orthogonal methods, snpSTARRseq showed a strong correlation with in vivo experimental allelic-imbalance studies whereas there was no correlation with predictive in silico approaches. Overall, snpSTARRseq provides an integrated experimental and computational framework to functionally test non-coding genetic variants.
Affiliation(s)
- Tunc Morova
- Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada
| | - Yi Ding
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | | | - Funda Sar
- Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada
| | - Tommer Schwarz
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Claudia Giambartolomei
- Central RNA Lab, Istituto Italiano di Tecnologia, Genova 16163, Italy
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Sylvan C Baca
- Department of Medical Oncology, The Center for Functional Cancer Epigenetics, Dana Farber Cancer Institute, Boston, MA 02215, USA
| | - Dennis Grishin
- Department of Medical Oncology, The Center for Functional Cancer Epigenetics, Dana Farber Cancer Institute, Boston, MA 02215, USA
| | - Faraz Hach
- Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada
- Department of Urologic Science, University of British Columbia, Vancouver, BC V5Z 1M9, Canada
| | - Alexander Gusev
- Department of Medical Oncology, The Center for Functional Cancer Epigenetics, Dana Farber Cancer Institute, Boston, MA 02215, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| | - Matthew L Freedman
- Department of Medical Oncology, The Center for Functional Cancer Epigenetics, Dana Farber Cancer Institute, Boston, MA 02215, USA
- The Center for Cancer Genome Discovery, Dana Farber Cancer Institute, Boston, MA 02215, USA
| | - Bogdan Pasaniuc
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Nathan A Lack
- To whom correspondence should be addressed. Tel: +1 604 875 4411;
| |
|
36
|
Leyhr J, Waldmann L, Filipek-Górniok B, Zhang H, Allalou A, Haitina T. A novel cis-regulatory element drives early expression of Nkx3.2 in the gnathostome primary jaw joint. eLife 2022; 11:e75749. [PMID: 36377467 PMCID: PMC9665848 DOI: 10.7554/elife.75749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2021] [Accepted: 09/30/2022] [Indexed: 11/16/2022] Open
Abstract
The acquisition of movable jaws was a major event during vertebrate evolution. The role of NK3 homeobox 2 (Nkx3.2) transcription factor in patterning the primary jaw joint of gnathostomes (jawed vertebrates) is well known; however, knowledge about its regulatory mechanism is lacking. In this study, we report a proximal enhancer element of Nkx3.2 that is deeply conserved in most gnathostomes but undetectable in the jawless hagfish and lamprey. This enhancer is active in the developing jaw joint region of the zebrafish Danio rerio, and was thus designated as jaw joint regulatory sequence 1 (JRS1). We further show that JRS1 enhancer sequences from a range of gnathostome species, including a chondrichthyan and mammals, have the same activity in the jaw joint as the native zebrafish enhancer, indicating a high degree of functional conservation despite the divergence of cartilaginous and bony fish lineages or the transition of the primary jaw joint into the middle ear of mammals. Finally, we show that deletion of JRS1 from the zebrafish genome using CRISPR/Cas9 results in a significant reduction of early gene expression of nkx3.2 and leads to a transient jaw joint deformation and partial fusion. Emergence of this Nkx3.2 enhancer in early gnathostomes may have contributed to the origin and shaping of the articulating surfaces of vertebrate jaws.
Affiliation(s)
- Jake Leyhr
- Subdepartment of Evolution and Development, Department of Organismal Biology, Uppsala University, Uppsala, Sweden
| | - Laura Waldmann
- Subdepartment of Evolution and Development, Department of Organismal Biology, Uppsala University, Uppsala, Sweden
| | - Beata Filipek-Górniok
- Science for Life Laboratory Genome Engineering Zebrafish Facility, Department of Organismal Biology, Uppsala University, Uppsala, Sweden
| | - Hanqing Zhang
- Division of Visual Information and Interaction, Department of Information Technology, Uppsala University, Uppsala, Sweden
- Science for Life Laboratory BioImage Informatics Facility, Uppsala, Sweden
| | - Amin Allalou
- Division of Visual Information and Interaction, Department of Information Technology, Uppsala University, Uppsala, Sweden
- Science for Life Laboratory BioImage Informatics Facility, Uppsala, Sweden
| | - Tatjana Haitina
- Subdepartment of Evolution and Development, Department of Organismal Biology, Uppsala University, Uppsala, Sweden
| |
|
37
|
Dong P, Hoffman GE, Apontes P, Bendl J, Rahman S, Fernando MB, Zeng B, Vicari JM, Zhang W, Girdhar K, Townsley KG, Misir R, Brennand KJ, Haroutunian V, Voloudakis G, Fullard JF, Roussos P. Population-level variation in enhancer expression identifies disease mechanisms in the human brain. Nat Genet 2022; 54:1493-1503. [PMID: 36163279 PMCID: PMC9547946 DOI: 10.1038/s41588-022-01170-4] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 06/11/2021] [Accepted: 07/25/2022] [Indexed: 02/06/2023]
Abstract
Identification of risk variants for neuropsychiatric diseases within enhancers underscores the importance of understanding population-level variation in enhancer function in the human brain. Besides regulating tissue-specific and cell-type-specific transcription of target genes, enhancers themselves can be transcribed. By jointly analyzing large-scale cell-type-specific transcriptome and regulome data, we cataloged 30,795 neuronal and 23,265 non-neuronal candidate transcribed enhancers. Examination of the transcriptome in 1,382 brain samples identified robust expression of transcribed enhancers. We explored gene-enhancer coordination and found that enhancer-linked genes are strongly implicated in neuropsychiatric disease. We identified expression quantitative trait loci (eQTLs) for both genes and enhancers and found that enhancer eQTLs mediate a substantial fraction of neuropsychiatric trait heritability. Inclusion of enhancer eQTLs in transcriptome-wide association studies enhanced functional interpretation of disease loci. Overall, our study characterizes the gene-enhancer regulome and genetic mechanisms in the human cortex in both healthy and diseased states.
Affiliation(s)
- Pengfei Dong
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Gabriel E Hoffman
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Pasha Apontes
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mental Illness Research Education and Clinical Center (MIRECC), James J. Peters VA Medical Center, New York, NY, USA
| | - Jaroslav Bendl
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Samir Rahman
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Michael B Fernando
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Biao Zeng
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - James M Vicari
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Wen Zhang
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Kiran Girdhar
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Kayla G Townsley
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Ruth Misir
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Kristen J Brennand
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Yale University, New Haven, CT, USA
| | - Vahram Haroutunian
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mental Illness Research Education and Clinical Center (MIRECC), James J. Peters VA Medical Center, New York, NY, USA
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Georgios Voloudakis
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - John F Fullard
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Panos Roussos
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Mental Illness Research Education and Clinical Center (MIRECC), James J. Peters VA Medical Center, New York, NY, USA.
- Center for Dementia Research, Nathan Kline Institute for Psychiatric Research, Orangeburg, NY, USA.
| |
|
38
|
Sharma SP, Peterson T. Complex chromosomal rearrangements induced by transposons in maize. Genetics 2022; 223:6702042. [PMID: 36111993 PMCID: PMC9910405 DOI: 10.1093/genetics/iyac124] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/31/2022] [Accepted: 08/08/2022] [Indexed: 11/13/2022] Open
Abstract
Eukaryotic genomes are large and complex, and gene expression can be affected by multiple regulatory elements and their positions within the dynamic chromatin architecture. Transposable elements are known to play important roles in genome evolution, yet questions remain as to how transposable elements alter genome structure and affect gene expression. Previous studies have shown that genome rearrangements can be induced by Reversed Ends Transposition involving termini of Activator and related transposable elements in maize and other plants. Here, we show that complex alleles can be formed by the rapid and progressive accumulation of Activator-induced duplications and rearrangements. The p1 gene enhancer in maize can induce ectopic expression of the nearby p2 gene in pericarp tissue when placed near it via different structural rearrangements. By screening for p2 expression, we identified and studied 5 cases in which multiple sequential transposition events occurred and increased the p1 enhancer copy number. We see active p2 expression due to multiple copies of the p1 enhancer present near p2 in all 5 cases. The p1 enhancer effects are confirmed by the observation that loss of p2 expression is correlated with transposition-induced excision of the p1 enhancers. We also performed a targeted Chromosome Conformation Capture experiment to test the physical interaction between the p1 enhancer and p2 promoter region. Together, our results show that transposon-induced rearrangements can accumulate rapidly and progressively increase genetic variation important for genomic evolution.
Affiliation(s)
- Sharu Paul Sharma
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50011, USA
- Thomas Peterson
- Corresponding author: Department of Genetics, Development and Cell Biology, Iowa State University, 2258 Molecular Biology, Iowa State University, Ames, IA 50011, USA.
39
Baca SC, Singler C, Zacharia S, Seo JH, Morova T, Hach F, Ding Y, Schwarz T, Huang CCF, Anderson J, Fay AP, Kalita C, Groha S, Pomerantz MM, Wang V, Linder S, Sweeney CJ, Zwart W, Lack NA, Pasaniuc B, Takeda DY, Gusev A, Freedman ML. Genetic determinants of chromatin reveal prostate cancer risk mediated by context-dependent gene regulation. Nat Genet 2022; 54:1364-1375. [PMID: 36071171 PMCID: PMC9784646 DOI: 10.1038/s41588-022-01168-y] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 04/09/2022] [Accepted: 07/19/2022] [Indexed: 12/25/2022]
Abstract
Many genetic variants affect disease risk by altering context-dependent gene regulation. Such variants are difficult to study mechanistically using current methods that link genetic variation to steady-state gene expression levels, such as expression quantitative trait loci (eQTLs). To address this challenge, we developed the cistrome-wide association study (CWAS), a framework for identifying genotypic and allele-specific effects on chromatin that are also associated with disease. In prostate cancer, CWAS identified regulatory elements and androgen receptor-binding sites that explained the association at 52 of 98 known prostate cancer risk loci and discovered 17 additional risk loci. CWAS implicated key developmental transcription factors in prostate cancer risk that are overlooked by eQTL-based approaches due to context-dependent gene regulation. We experimentally validated associations and demonstrated the extensibility of CWAS to additional epigenomic datasets and phenotypes, including response to prostate cancer treatment. CWAS is a powerful and biologically interpretable paradigm for studying variants that influence traits by affecting transcriptional regulation.
Affiliation(s)
- Sylvan C. Baca
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA; The Eli and Edythe L. Broad Institute, Cambridge, MA, USA
- Cassandra Singler
- Laboratory of Genitourinary Cancer Pathogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
- Soumya Zacharia
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
- Ji-Heui Seo
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
- Tunc Morova
- Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada
- Faraz Hach
- Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada
- Yi Ding
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA
- Tommer Schwarz
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA
- Jacob Anderson
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
- André P. Fay
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Cynthia Kalita
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Division of Genetics, Brigham & Women’s Hospital, Boston, MA, USA
- Stefan Groha
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; The Eli and Edythe L. Broad Institute, Cambridge, MA, USA
- Mark M. Pomerantz
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
- Victoria Wang
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- Simon Linder
- Division of Oncogenomics, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, The Netherlands; Laboratory of Chemical Biology and Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Wilbert Zwart
- Division of Oncogenomics, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, The Netherlands; Laboratory of Chemical Biology and Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Nathan A. Lack
- Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada; School of Medicine, Koç University, Istanbul, Turkey
- Bogdan Pasaniuc
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA; Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA; Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA; Department of Pathology and Laboratory Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
- David Y. Takeda
- Laboratory of Genitourinary Cancer Pathogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
- Alexander Gusev
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; The Eli and Edythe L. Broad Institute, Cambridge, MA, USA; Division of Genetics, Brigham & Women’s Hospital, Boston, MA, USA. These authors jointly supervised this work. Correspondence should be directed to M.L.F. or A.G.
- Matthew L. Freedman
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA; The Eli and Edythe L. Broad Institute, Cambridge, MA, USA. These authors jointly supervised this work. Correspondence should be directed to M.L.F. or A.G.
40
Lin X, Liu Y, Liu S, Zhu X, Wu L, Zhu Y, Zhao D, Xu X, Chemparathy A, Wang H, Cao Y, Nakamura M, Noordermeer JN, La Russa M, Wong WH, Zhao K, Qi LS. Nested epistasis enhancer networks for robust genome regulation. Science 2022; 377:1077-1085. [PMID: 35951677 DOI: 10.1126/science.abk3512] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Indexed: 12/11/2022]
Abstract
Mammalian genomes possess multiple enhancers spanning an ultralong distance (>megabases) to modulate important genes, yet it is unclear how these enhancers coordinate to achieve this task. Here, we combine multiplexed CRISPRi screening with machine learning to define quantitative enhancer-enhancer interactions. We find that the ultralong distance enhancer network possesses a nested multi-layer architecture that confers functional robustness of gene expression. Experimental characterization reveals that enhancer epistasis is maintained by three-dimensional chromosomal interactions and BRD4 condensation. Machine learning prediction of synergistic enhancers provides an effective strategy to identify non-coding variant pairs associated with pathogenic genes in diseases beyond Genome-Wide Association Studies (GWAS) analysis. Our work unveils nested epistasis enhancer networks, which can better explain enhancer functions within cells and in diseases.
Affiliation(s)
- Xueqiu Lin
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Yanxia Liu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Shuai Liu
- Laboratory of Epigenome Biology, Systems Biology Center, National Heart, Lung and Blood Institute NIH, Bethesda, MD 20892, USA
- Xiang Zhu
- Department of Statistics, Stanford University, Stanford, CA 94305, USA; Department of Statistics, The Pennsylvania State University, University Park, PA 16802, USA; Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
- Lingling Wu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Yanyu Zhu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Dehua Zhao
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Xiaoshu Xu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Haifeng Wang
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Yaqiang Cao
- Laboratory of Epigenome Biology, Systems Biology Center, National Heart, Lung and Blood Institute NIH, Bethesda, MD 20892, USA
- Muneaki Nakamura
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Marie La Russa
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Wing Hung Wong
- Department of Statistics, Stanford University, Stanford, CA 94305, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Keji Zhao
- Laboratory of Epigenome Biology, Systems Biology Center, National Heart, Lung and Blood Institute NIH, Bethesda, MD 20892, USA
- Lei S Qi
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA; ChEM-H, Stanford University, Stanford, CA 94305, USA; Chan Zuckerberg BioHub, San Francisco, CA 94158, USA
41
Song W, Yuan K, Liu Z, Cai W, Chen J, Yu S, Zhao M, Lin GN. Locus-level antagonistic selection shaped the polygenic architecture of human complex diseases. Hum Genet 2022; 141:1935-1947. [PMID: 35943608 DOI: 10.1007/s00439-022-02471-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/18/2022] [Accepted: 07/11/2022] [Indexed: 12/01/2022]
Abstract
BACKGROUND: We aimed to evaluate the potential role of antagonistic selection in polygenic diseases: if one variant increases the risk of one disease and decreases the risk of another disease, the signals of genetic risk elimination by natural selection will be distorted, which leads to a higher frequency of risk alleles. METHODS: We applied local genetic correlations and transcriptome-wide association studies to identify genomic loci and genes adversely associated with at least two diseases. Then, we used different population genetic metrics to measure the signals of natural selection for these loci and genes. RESULTS: First, we identified 2120 cases of antagonistic pleiotropy (negative local genetic correlation) among 87 diseases in 716 genomic loci (antagonistic loci). Next, by comparing with non-antagonistic loci, we observed that antagonistic loci explained an excess proportion of disease heritability (median 6%), showed enhanced signals of balancing selection, and reduced signals of directional polygenic adaptation. Then, at the gene expression level, we identified 31,991 cases of antagonistic pleiotropy among 98 diseases at 4368 genes. However, evidence of altered signals of selection pressure and heritability distribution at the gene expression level is limited. CONCLUSION: We conclude that antagonistic pleiotropy is widespread among human polygenic diseases, and it has distorted the evolutionary signal and genetic architecture of diseases at the locus level.
Affiliation(s)
- Weichen Song
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
- Kai Yuan
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Zhe Liu
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Wenxiang Cai
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Jue Chen
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Shunying Yu
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
- Min Zhao
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
- Guan Ning Lin
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai Key Laboratory of Psychotic Disorders, Shanghai, China
42
Collins RL, Glessner JT, Porcu E, Lepamets M, Brandon R, Lauricella C, Han L, Morley T, Niestroj LM, Ulirsch J, Everett S, Howrigan DP, Boone PM, Fu J, Karczewski KJ, Kellaris G, Lowther C, Lucente D, Mohajeri K, Nõukas M, Nuttle X, Samocha KE, Trinh M, Ullah F, Võsa U, Hurles ME, Aradhya S, Davis EE, Finucane H, Gusella JF, Janze A, Katsanis N, Matyakhina L, Neale BM, Sanders D, Warren S, Hodge JC, Lal D, Ruderfer DM, Meck J, Mägi R, Esko T, Reymond A, Kutalik Z, Hakonarson H, Sunyaev S, Brand H, Talkowski ME. A cross-disorder dosage sensitivity map of the human genome. Cell 2022; 185:3041-3055.e25. [PMID: 35917817 PMCID: PMC9742861 DOI: 10.1016/j.cell.2022.06.036] [Citation(s) in RCA: 178] [Impact Index Per Article: 59.3] [Received: 12/22/2020] [Revised: 03/17/2022] [Accepted: 06/20/2022] [Indexed: 02/06/2023]
Abstract
Rare copy-number variants (rCNVs) include deletions and duplications that occur infrequently in the global human population and can confer substantial risk for disease. In this study, we aimed to quantify the properties of haploinsufficiency (i.e., deletion intolerance) and triplosensitivity (i.e., duplication intolerance) throughout the human genome. We harmonized and meta-analyzed rCNVs from nearly one million individuals to construct a genome-wide catalog of dosage sensitivity across 54 disorders, which defined 163 dosage sensitive segments associated with at least one disorder. These segments were typically gene dense and often harbored dominant dosage sensitive driver genes, which we were able to prioritize using statistical fine-mapping. Finally, we designed an ensemble machine-learning model to predict probabilities of dosage sensitivity (pHaplo & pTriplo) for all autosomal genes, which identified 2,987 haploinsufficient and 1,559 triplosensitive genes, including 648 that were uniquely triplosensitive. This dosage sensitivity resource will provide broad utility for human disease research and clinical genetics.
Affiliation(s)
- Ryan L Collins
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Division of Medical Sciences and Department of Medicine, Harvard Medical School, Boston, MA 02115, USA.
- Joseph T Glessner
- Department of Pediatrics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pediatrics, Division of Human Genetics, Perelman School of Medicine, Philadelphia, PA 19104, USA
- Eleonora Porcu
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Maarja Lepamets
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 51010 Tartu, Estonia; Institute of Molecular and Cell Biology, University of Tartu, 51010 Tartu, Estonia
- Lide Han
- Division of Genetic Medicine, Department of Medicine, and Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Theodore Morley
- Division of Genetic Medicine, Department of Medicine, and Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Jacob Ulirsch
- Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Division of Medical Sciences and Department of Medicine, Harvard Medical School, Boston, MA 02115, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Selin Everett
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
- Daniel P Howrigan
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Philip M Boone
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA 02115, USA
- Jack Fu
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Konrad J Karczewski
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Georgios Kellaris
- Advanced Center for Translational and Genetic Medicine, Stanley Manne Children's Research Institute, Lurie Children's Hospital, Chicago, IL 60611, USA; Departments of Pediatrics and Cell and Developmental Biology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
- Chelsea Lowther
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Diane Lucente
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Kiana Mohajeri
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Division of Medical Sciences and Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
- Margit Nõukas
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 51010 Tartu, Estonia; Institute of Molecular and Cell Biology, University of Tartu, 51010 Tartu, Estonia
- Xander Nuttle
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Kaitlin E Samocha
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Division of Medical Sciences and Department of Medicine, Harvard Medical School, Boston, MA 02115, USA; Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10, UK
- Mi Trinh
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10, UK
- Farid Ullah
- Advanced Center for Translational and Genetic Medicine, Stanley Manne Children's Research Institute, Lurie Children's Hospital, Chicago, IL 60611, USA; Departments of Pediatrics and Cell and Developmental Biology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
- Urmo Võsa
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 51010 Tartu, Estonia
- Matthew E Hurles
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10, UK
- Erica E Davis
- Advanced Center for Translational and Genetic Medicine, Stanley Manne Children's Research Institute, Lurie Children's Hospital, Chicago, IL 60611, USA; Departments of Pediatrics and Cell and Developmental Biology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
- Hilary Finucane
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- James F Gusella
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
- Nicholas Katsanis
- Advanced Center for Translational and Genetic Medicine, Stanley Manne Children's Research Institute, Lurie Children's Hospital, Chicago, IL 60611, USA; Departments of Pediatrics and Cell and Developmental Biology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
- Benjamin M Neale
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Jennelle C Hodge
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Dennis Lal
- Cologne Center for Genomics, University of Cologne, 51149 Cologne, Germany; Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA; Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Douglas M Ruderfer
- Division of Genetic Medicine, Department of Medicine, and Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Center for Precision Medicine, Department of Biomedical Informatics, and Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Reedik Mägi
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 51010 Tartu, Estonia
- Tõnu Esko
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 51010 Tartu, Estonia
- Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Zoltán Kutalik
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Center for Primary Care and Public Health, University of Lausanne, 1015 Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
- Hakon Hakonarson
- Department of Pediatrics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pediatrics, Division of Human Genetics, Perelman School of Medicine, Philadelphia, PA 19104, USA
- Shamil Sunyaev
- Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Division of Medical Sciences and Department of Medicine, Harvard Medical School, Boston, MA 02115, USA; Division of Genetics, Brigham and Women's Hospital, Boston, MA 02115, USA
- Harrison Brand
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Pediatric Surgical Research Laboratories, Massachusetts General Hospital, Boston, MA 02114, USA.
- Michael E Talkowski
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA.
43
Dey KK, Gazal S, van de Geijn B, Kim SS, Nasser J, Engreitz JM, Price AL. SNP-to-gene linking strategies reveal contributions of enhancer-related and candidate master-regulator genes to autoimmune disease. Cell Genomics 2022; 2:100145. [PMID: 35873673 PMCID: PMC9306342 DOI: 10.1016/j.xgen.2022.100145] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 07/23/2020] [Revised: 04/03/2021] [Accepted: 05/27/2022] [Indexed: 12/11/2022]
Abstract
We assess contributions to autoimmune disease of genes whose regulation is driven by enhancer regions (enhancer-related) and genes that regulate other genes in trans (candidate master-regulator). We link these genes to SNPs using several SNP-to-gene (S2G) strategies and apply heritability analyses to draw three conclusions about 11 autoimmune/blood-related diseases/traits. First, several characterizations of enhancer-related genes using functional genomics data are informative for autoimmune disease heritability after conditioning on a broad set of regulatory annotations. Second, candidate master-regulator genes defined using trans-eQTL in blood are also conditionally informative for autoimmune disease heritability. Third, integrating enhancer-related and master-regulator gene sets with protein-protein interaction (PPI) network information magnified their disease signal. The resulting PPI-enhancer gene score produced >2-fold stronger heritability signal and >2-fold stronger enrichment for drug targets, compared with the recently proposed enhancer domain score. In each case, functionally informed S2G strategies produced 4.1- to 13-fold stronger disease signals than conventional window-based strategies.
Affiliation(s)
- Kushal K. Dey
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Steven Gazal
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Bryce van de Geijn
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Genentech, South San Francisco, CA 94080, USA
- Samuel Sungil Kim
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Joseph Nasser
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Jesse M. Engreitz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- BASE Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford University School of Medicine, Stanford, CA 94304, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Alkes L. Price
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
44
Gazal S, Weissbrod O, Hormozdiari F, Dey KK, Nasser J, Jagadeesh KA, Weiner DJ, Shi H, Fulco CP, O'Connor LJ, Pasaniuc B, Engreitz JM, Price AL. Combining SNP-to-gene linking strategies to identify disease genes and assess disease omnigenicity. Nat Genet 2022; 54:827-836. [PMID: 35668300 PMCID: PMC9894581 DOI: 10.1038/s41588-022-01087-y] [Citation(s) in RCA: 93] [Impact Index Per Article: 31.0] [Received: 07/29/2021] [Accepted: 04/27/2022] [Indexed: 02/04/2023]
Abstract
Disease-associated single-nucleotide polymorphisms (SNPs) generally do not implicate target genes, as most disease SNPs are regulatory. Many SNP-to-gene (S2G) linking strategies have been developed to link regulatory SNPs to the genes that they regulate in cis. Here, we developed a heritability-based framework for evaluating and combining different S2G strategies to optimize their informativeness for common disease risk. Our optimal combined S2G strategy (cS2G) included seven constituent S2G strategies and achieved a precision of 0.75 and a recall of 0.33, more than doubling the recall of any individual strategy. We applied cS2G to fine-mapping results for 49 UK Biobank diseases/traits to predict 5,095 causal SNP-gene-disease triplets (with S2G-derived functional interpretation) with high confidence. We further applied cS2G to provide an empirical assessment of disease omnigenicity; we determined that the top 1% of genes explained roughly half of the SNP heritability linked to all genes and that gene-level architectures vary with variant allele frequency.
Affiliation(s)
- Steven Gazal
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Omer Weissbrod
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Farhad Hormozdiari
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Kushal K Dey
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Joseph Nasser
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Karthik A Jagadeesh
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Huwenbo Shi
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Charles P Fulco
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Bristol Myers Squibb, Cambridge, MA, USA
| | | | - Bogdan Pasaniuc
- Departments of Computational Medicine, Human Genetics, Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Jesse M Engreitz
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- BASE Initiative, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford University School of Medicine, Stanford, CA, USA
| | - Alkes L Price
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
| |
Collapse
|
45
Siewert-Rocks KM, Kim SS, Yao DW, Shi H, Price AL. Leveraging gene co-regulation to identify gene sets enriched for disease heritability. Am J Hum Genet 2022; 109:393-404. [PMID: 35108496 PMCID: PMC8948163 DOI: 10.1016/j.ajhg.2022.01.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/23/2021] [Accepted: 01/04/2022] [Indexed: 12/15/2022] Open
Abstract
Identifying gene sets that are associated to disease can provide valuable biological knowledge, but a fundamental challenge of gene set analyses of GWAS data is linking disease-associated SNPs to genes. Transcriptome-wide association studies (TWASs) detect associations between the genetically predicted expression of a gene and disease risk, thus implicating candidate disease genes. However, causal disease genes at TWAS-associated loci generally remain unknown due to gene co-regulation, which leads to correlations across genes in predicted expression. We developed a method, gene co-regulation score (GCSC) regression, to identify gene sets that are enriched for disease heritability explained by predicted expression. GCSC regresses TWAS chi-square statistics on gene co-regulation scores reflecting correlations in predicted gene expression; a gene set is enriched for heritability if genes with high co-regulation to the set have higher TWAS chi-square statistics than genes with low co-regulation to the set, beyond what is expected based on co-regulation to all genes. We verified via simulations that GCSC is well calibrated and well powered. We applied GCSC to gene expression data from GTEx (48 tissues) and GWAS summary statistics for 43 independent diseases and complex traits analyzing a broad set of biological pathways and specifically expressed gene sets. We identified many enriched sets, recapitulating known biology. For Alzheimer disease, we detected evidence of an immune basis, and specifically a role for antigen presentation, in analyses of both biological pathways and specifically expressed gene sets. Our results highlight the advantages of leveraging gene co-regulation within the TWAS framework to identify enriched gene sets.
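The regression idea summarized in this abstract can be illustrated with a toy simulation. The sketch below is not the authors' GCSC implementation: the function names, the simulated scores, the effect sizes, and the use of plain ordinary least squares are all assumptions made for illustration. It regresses simulated chi-square statistics on a co-regulation score to a gene set while adjusting for co-regulation to all genes, so that a positive set-specific slope plays the role of "enrichment beyond what is expected based on co-regulation to all genes".

```python
import numpy as np


def fit_coregulation_slope(chi2, score_set, score_all):
    """Ordinary least squares of chi-square statistics on two scores.

    Returns (intercept, slope_set, slope_all). This is a simplified
    stand-in for the regression described in the abstract, not GCSC itself.
    """
    design = np.column_stack([np.ones_like(score_set), score_set, score_all])
    coef, *_ = np.linalg.lstsq(design, chi2, rcond=None)
    return tuple(coef)


def simulate(n_genes=5000, tau_set=2.0, tau_all=0.5, seed=0):
    """Simulate hypothetical per-gene scores and chi-square statistics."""
    rng = np.random.default_rng(seed)
    # Hypothetical co-regulation of each gene to the gene set of interest.
    score_set = rng.gamma(shape=2.0, scale=0.5, size=n_genes)
    # Co-regulation to all genes includes the set plus everything else.
    score_all = score_set + rng.gamma(shape=2.0, scale=1.0, size=n_genes)
    # Expected chi-square grows linearly with both scores (an assumption).
    expected = 1.0 + tau_set * score_set + tau_all * score_all
    chi2 = expected * rng.chisquare(df=1, size=n_genes)
    return chi2, score_set, score_all


chi2, score_set, score_all = simulate()
intercept, slope_set, slope_all = fit_coregulation_slope(chi2, score_set, score_all)
print(f"set-specific slope: {slope_set:.2f}, all-genes slope: {slope_all:.2f}")
```

In this toy setting, a set-specific slope clearly above zero corresponds to an enriched gene set. A real analysis would also need standard errors and appropriate regression weights, which are omitted here.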
Affiliation(s)
- Katherine M Siewert-Rocks
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Samuel S Kim
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Douglas W Yao
  - Program in Systems, Synthetic, and Quantitative Biology, Harvard University, Cambridge, MA 02138, USA
- Huwenbo Shi
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Alkes L Price
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
46
Improving genetic diagnosis of Mendelian disease with RNA sequencing: a narrative review. Journal of Bio-X Research 2022. [DOI: 10.1097/jbr.0000000000000100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022] Open
47
Wang X, Glubb DM, O'Mara TA. 10 Years of GWAS discovery in endometrial cancer: Aetiology, function and translation. EBioMedicine 2022; 77:103895. [PMID: 35219087 PMCID: PMC8881374 DOI: 10.1016/j.ebiom.2022.103895] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 07/14/2021] [Revised: 02/07/2022] [Accepted: 02/08/2022] [Indexed: 12/24/2022] Open
Abstract
Endometrial cancer is a common gynaecological cancer with increasing incidence and mortality. In the last decade, endometrial cancer genome-wide association studies (GWAS) have provided a resource to explore aetiology and for functional interpretation of heritable risk variation, informing endometrial cancer biology. Indeed, GWAS data have been used to assess relationships with other traits through correlation and Mendelian randomisation analyses, establishing genetic relationships and potential risk factors. Cross-trait GWAS analyses have increased statistical power and identified novel endometrial cancer risk variation related to other traits. Functional analysis of risk loci has helped prioritise candidate susceptibility genes, revealing molecular mechanisms and networks. Lastly, risk scores generated using endometrial cancer GWAS data may allow for clinical translation through identification of patients at high risk of disease. In the next decade, this knowledge base should enable substantial progress in our understanding of endometrial cancer and, potentially, new approaches for its screening and treatment.
48
Hoskins JW, Chung CC, O’Brien A, Zhong J, Connelly K, Collins I, Shi J, Amundadottir LT. Inferred expression regulator activities suggest genes mediating cardiometabolic genetic signals. PLoS Comput Biol 2021; 17:e1009563. [PMID: 34793442 PMCID: PMC8639061 DOI: 10.1371/journal.pcbi.1009563] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/29/2021] [Revised: 12/02/2021] [Accepted: 10/15/2021] [Indexed: 12/12/2022] Open
Abstract
Expression QTL (eQTL) analyses have suggested many genes mediating genome-wide association study (GWAS) signals but most GWAS signals still lack compelling explanatory genes. We have leveraged an adipose-specific gene regulatory network to infer expression regulator activities and phenotypic master regulators (MRs), which were used to detect activity QTLs (aQTLs) at cardiometabolic trait GWAS loci. Regulator activities were inferred with the VIPER algorithm that integrates enrichment of expected expression changes among a regulator's target genes with confidence in their regulator-target network interactions and target overlap between different regulators (i.e., pleiotropy). Phenotypic MRs were identified as those regulators whose activities were most important in predicting their respective phenotypes using random forest modeling. While eQTLs were typically more significant than aQTLs in cis, the opposite was true among candidate MRs in trans. Several GWAS loci colocalized with MR trans-eQTLs/aQTLs in the absence of colocalized cis-QTLs. Intriguingly, at the 1p36.1 BMI GWAS locus the EPHB2 cis-aQTL was stronger than its cis-eQTL and colocalized with the GWAS signal and 35 BMI MR trans-aQTLs, suggesting the GWAS signal may be mediated by effects on EPHB2 activity and its downstream effects on a network of BMI MRs. These MR and aQTL analyses represent systems genetic methods that may be broadly applied to supplement standard eQTL analyses for suggesting molecular effects mediating GWAS signals.
Affiliation(s)
- Jason W. Hoskins
  - Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
  - * E-mail: (JWH); (LTA)
- Charles C. Chung
  - Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
  - Cancer Genome Research Laboratory, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Aidan O’Brien
  - Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Jun Zhong
  - Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Katelyn Connelly
  - Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Irene Collins
  - Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Jianxin Shi
  - Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Laufey T. Amundadottir
  - Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
  - * E-mail: (JWH); (LTA)
49
The non-coding genome in genetic brain disorders: new targets for therapy? Essays Biochem 2021; 65:671-683. [PMID: 34414418 PMCID: PMC8564736 DOI: 10.1042/ebc20200121] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/24/2021] [Revised: 07/12/2021] [Accepted: 07/26/2021] [Indexed: 11/30/2022]
Abstract
The non-coding genome, consisting of more than 98% of all genetic information in humans and once judged as ‘Junk DNA’, is increasingly moving into the spotlight in the field of human genetics. Non-coding regulatory elements (NCREs) are crucial to ensure correct spatio-temporal gene expression. Technological advancements have allowed to identify NCREs on a large scale, and mechanistic studies have helped to understand the biological mechanisms underlying their function. It is increasingly becoming clear that genetic alterations of NCREs can cause genetic disorders, including brain diseases. In this review, we concisely discuss mechanisms of gene regulation and how to investigate them, and give examples of non-coding alterations of NCREs that give rise to human brain disorders. The cross-talk between basic and clinical studies enhances the understanding of normal and pathological function of NCREs, allowing better interpretation of already existing and novel data. Improved functional annotation of NCREs will not only benefit diagnostics for patients, but might also lead to novel areas of investigations for targeted therapies, applicable to a wide panel of genetic disorders. The intrinsic complexity and precision of the gene regulation process can be turned to the advantage of highly specific treatments. We further discuss this exciting new field of ‘enhancer therapy’ based on recent examples.
50
Yousefi S, Deng R, Lanko K, Salsench EM, Nikoncuk A, van der Linde HC, Perenthaler E, van Ham TJ, Mulugeta E, Barakat TS. Comprehensive multi-omics integration identifies differentially active enhancers during human brain development with clinical relevance. Genome Med 2021; 13:162. [PMID: 34663447 PMCID: PMC8524963 DOI: 10.1186/s13073-021-00980-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/28/2021] [Accepted: 09/29/2021] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND: Non-coding regulatory elements (NCREs), such as enhancers, play a crucial role in gene regulation, and genetic aberrations in NCREs can lead to human disease, including brain disorders. The human brain is a complex organ that is susceptible to numerous disorders; many of these are caused by genetic changes, but a multitude remain currently unexplained. Understanding NCREs acting during brain development has the potential to shed light on previously unrecognized genetic causes of human brain disease. Despite immense community-wide efforts to understand the role of the non-coding genome and NCREs, annotating functional NCREs remains challenging.
METHODS: Here we performed an integrative computational analysis of virtually all currently available epigenome data sets related to human fetal brain.
RESULTS: Our in-depth analysis unravels 39,709 differentially active enhancers (DAEs) that show dynamic epigenomic rearrangement during early stages of human brain development, indicating likely biological function. Many of these DAEs are linked to clinically relevant genes, and functional validation of selected DAEs in cell models and zebrafish confirms their role in gene regulation. Compared to enhancers without dynamic epigenomic rearrangement, DAEs are subjected to higher sequence constraints in humans, have distinct sequence characteristics and are bound by a distinct transcription factor landscape. DAEs are enriched for GWAS loci for brain-related traits and for genetic variation found in individuals with neurodevelopmental disorders, including autism.
CONCLUSION: This compendium of high-confidence enhancers will assist in deciphering the mechanism behind developmental genetics of human brain and will be relevant to uncover missing heritability in human genetic brain disorders.
Affiliation(s)
- Soheil Yousefi
  - Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Ruizhi Deng
  - Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Kristina Lanko
  - Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Eva Medico Salsench
  - Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Anita Nikoncuk
  - Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Herma C. van der Linde
  - Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Elena Perenthaler
  - Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Tjakko J. van Ham
  - Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Eskeatnaf Mulugeta
  - Department of Cell Biology, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Tahsin Stefan Barakat
  - Department of Clinical Genetics, Erasmus MC University Medical Center, Rotterdam, The Netherlands