1. de Langen P, Ballester B. MUFFIN: a suite of tools for the analysis of functional sequencing data. NAR Genom Bioinform 2024; 6:lqae051. [PMID: 38745992 PMCID: PMC11091926 DOI: 10.1093/nargab/lqae051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 04/10/2024] [Accepted: 04/27/2024] [Indexed: 05/16/2024] Open
Abstract
The large diversity of functional genomic assays allows for the characterization of non-coding and coding events at the tissue level or at a single-cell resolution. However, this diversity also leads to protocol differences, widely varying sequencing depths, substantial disparities in sample sizes, and number of features. In this work, we have built a Python package, MUFFIN, which offers a wide variety of tools suitable for a broad range of genomic assays and brings many tools that were missing from the Python ecosystem. First, MUFFIN has specialized tools for the exploration of the non-coding regions of genomes, such as a function to identify consensus peaks in peak-called assays, as well as linking genomic regions to genes and performing Gene Set Enrichment Analyses. MUFFIN also possesses a robust and flexible count table processing pipeline, comprising normalization, count transformation, dimensionality reduction, Differential Expression, and clustering. Our tools were tested on three widely different scRNA-seq, ChIP-seq and ATAC-seq datasets. MUFFIN integrates with the popular Scanpy ecosystem and is available on Conda and at https://github.com/pdelangen/Muffin.
2. Pratt HE, Andrews G, Shedd N, Phalke N, Li T, Pampari A, Jensen M, Wen C, PsychENCODE Consortium, Gandal MJ, Geschwind DH, Gerstein M, Moore J, Kundaje A, Colubri A, Weng Z. Using a comprehensive atlas and predictive models to reveal the complexity and evolution of brain-active regulatory elements. Science Advances 2024; 10:eadj4452. [PMID: 38781344 PMCID: PMC11114231 DOI: 10.1126/sciadv.adj4452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 04/25/2024] [Indexed: 05/25/2024]
Abstract
Most genetic variants associated with psychiatric disorders are located in noncoding regions of the genome. To investigate their functional implications, we integrate epigenetic data from the PsychENCODE Consortium and other published sources to construct a comprehensive atlas of candidate brain cis-regulatory elements. Using deep learning, we model these elements' sequence syntax and predict how binding sites for lineage-specific transcription factors contribute to cell type-specific gene regulation in various types of glia and neurons. The elements' evolutionary history suggests that new regulatory information in the brain emerges primarily via smaller sequence mutations within conserved mammalian elements rather than entirely new human- or primate-specific sequences. However, primate-specific candidate elements, particularly those active during fetal brain development and in excitatory neurons and astrocytes, are implicated in the heritability of brain-related human traits. Additionally, we introduce PsychSCREEN, a web-based platform offering interactive visualization of PsychENCODE-generated genetic and epigenetic data from diverse brain cell types in individuals with psychiatric disorders and healthy controls.
Affiliation(s)
- Henry E. Pratt
- Department of Genomics and Computational Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Gregory Andrews
- Department of Genomics and Computational Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Nicole Shedd
- Department of Genomics and Computational Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Nishigandha Phalke
- Department of Genomics and Computational Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Tongxin Li
- Department of Genomics and Computational Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Khoury College of Computer Science, Northeastern University, Boston, MA 02115, USA
- Anusri Pampari
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Matthew Jensen
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Cindy Wen
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Michael J. Gandal
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Lifespan Brain Institute, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Daniel H. Geschwind
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Institute of Precision Health, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Mark Gerstein
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- Department of Computer Science, Yale University, New Haven, CT 06520, USA
- Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
- Jill Moore
- Department of Genomics and Computational Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Anshul Kundaje
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Andrés Colubri
- Department of Genomics and Computational Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Zhiping Weng
- Department of Genomics and Computational Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
3. Kim A, Zhang Z, Legros C, Lu Z, de Smith A, Moore JE, Mancuso N, Gazal S. Inferring causal cell types of human diseases and risk variants from candidate regulatory elements. medRxiv 2024:2024.05.17.24307556. [PMID: 38798383 PMCID: PMC11118635 DOI: 10.1101/2024.05.17.24307556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
The heritability of human diseases is extremely enriched in candidate regulatory elements (cRE) from disease-relevant cell types. Critical next steps are to infer which and how many cell types are truly causal for a disease (after accounting for co-regulation across cell types), and to understand how individual variants impact disease risk through single or multiple causal cell types. Here, we propose CT-FM and CT-FM-SNP, two methods that leverage cell-type-specific cREs to fine-map causal cell types for a trait and for its candidate causal variants, respectively. We applied CT-FM to 63 GWAS summary statistics (average N = 417K) using nearly one thousand cRE annotations, primarily coming from ENCODE4. CT-FM inferred 81 causal cell types with corresponding SNP-annotations explaining a high fraction of trait SNP-heritability (∼2/3 of the SNP-heritability explained by existing cREs), identified 16 traits with multiple causal cell types, highlighted cell-disease relationships consistent with known biology, and uncovered previously unexplored cellular mechanisms in psychiatric and immune-related diseases. Finally, we applied CT-FM-SNP to 39 UK Biobank traits and predicted high confidence causal cell types for 2,798 candidate causal non-coding SNPs. Our results suggest that most SNPs impact a phenotype through a single cell type, and that pleiotropic SNPs target different cell types depending on the phenotype context. Altogether, CT-FM and CT-FM-SNP shed light on how genetic variants act collectively and individually at the cellular level to impact disease risk.
4. Xiang G, He X, Giardine BM, Isaac KJ, Taylor DJ, McCoy RC, Jansen C, Keller CA, Wixom AQ, Cockburn A, Miller A, Qi Q, He Y, Li Y, Lichtenberg J, Heuston EF, Anderson SM, Luan J, Vermunt MW, Yue F, Sauria MEG, Schatz MC, Taylor J, Gottgens B, Hughes JR, Higgs DR, Weiss MJ, Cheng Y, Blobel GA, Bodine DM, Zhang Y, Li Q, Mahony S, Hardison RC. Interspecies regulatory landscapes and elements revealed by novel joint systematic integration of human and mouse blood cell epigenomes. bioRxiv 2024:2023.04.02.535219. [PMID: 37066352 PMCID: PMC10103973 DOI: 10.1101/2023.04.02.535219] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/19/2023]
Abstract
Knowledge of locations and activities of cis-regulatory elements (CREs) is needed to decipher basic mechanisms of gene regulation and to understand the impact of genetic variants on complex traits. Previous studies identified candidate CREs (cCREs) using epigenetic features in one species, making comparisons difficult between species. In contrast, we conducted an interspecies study defining epigenetic states and identifying cCREs in blood cell types to generate regulatory maps that are comparable between species, using integrative modeling of eight epigenetic features jointly in human and mouse in our Validated Systematic Integration (VISION) Project. The resulting catalogs of cCREs are useful resources for further studies of gene regulation in blood cells, indicated by high overlap with known functional elements and strong enrichment for human genetic variants associated with blood cell phenotypes. The contribution of each epigenetic state in cCREs to gene regulation, inferred from a multivariate regression, was used to estimate epigenetic state Regulatory Potential (esRP) scores for each cCRE in each cell type, which were used to categorize dynamic changes in cCREs. Groups of cCREs displaying similar patterns of regulatory activity in human and mouse cell types, obtained by joint clustering on esRP scores, harbored distinctive transcription factor binding motifs that were similar between species. An interspecies comparison of cCREs revealed both conserved and species-specific patterns of epigenetic evolution. Finally, we showed that comparisons of the epigenetic landscape between species can reveal elements with similar roles in regulation, even in the absence of genomic sequence alignment.
5. Moeckel C, Mouratidis I, Chantzi N, Uzun Y, Georgakopoulos-Soares I. Advances in computational and experimental approaches for deciphering transcriptional regulatory networks: Understanding the roles of cis-regulatory elements is essential, and recent research utilizing MPRAs, STARR-seq, CRISPR-Cas9, and machine learning has yielded valuable insights. Bioessays 2024:e2300210. [PMID: 38715516 DOI: 10.1002/bies.202300210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 04/22/2024] [Accepted: 04/23/2024] [Indexed: 05/16/2024]
Abstract
Understanding the influence of cis-regulatory elements on gene regulation poses numerous challenges given complexities stemming from variations in transcription factor (TF) binding, chromatin accessibility, structural constraints, and cell-type differences. This review discusses the role of gene regulatory networks in enhancing understanding of transcriptional regulation and covers construction methods ranging from expression-based approaches to supervised machine learning. Additionally, key experimental methods, including MPRAs and CRISPR-Cas9-based screening, which have significantly contributed to understanding TF binding preferences and cis-regulatory element functions, are explored. Lastly, the potential of machine learning and artificial intelligence to unravel cis-regulatory logic is analyzed. These computational advances have far-reaching implications for precision medicine, therapeutic target discovery, and the study of genetic variations in health and disease.
Affiliation(s)
- Camille Moeckel
- Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Ioannis Mouratidis
- Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, USA
- Nikol Chantzi
- Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Yasin Uzun
- Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, USA
- Department of Pediatrics, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Ilias Georgakopoulos-Soares
- Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, USA
6. Momin MM, Zhou X, Hyppönen E, Benyamin B, Lee SH. Cross-ancestry genetic architecture and prediction for cholesterol traits. Hum Genet 2024; 143:635-648. [PMID: 38536467 DOI: 10.1007/s00439-024-02660-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Accepted: 02/13/2024] [Indexed: 05/18/2024]
Abstract
While cholesterol is essential, a high level of cholesterol is associated with the risk of cardiovascular diseases. Genome-wide association studies (GWASs) have proven successful in identifying genetic variants that are linked to cholesterol levels, predominantly in white European populations. However, the extent to which genetic effects on cholesterol vary across different ancestries remains largely unexplored. Here, we estimate cross-ancestry genetic correlation to address questions on how genetic effects are shared across ancestries. We find significant genetic heterogeneity between ancestries for cholesterol traits. Furthermore, we demonstrate that single nucleotide polymorphisms (SNPs) with concordant effects across ancestries for cholesterol are more frequently found in regulatory regions compared to other genomic regions. Indeed, the positive genetic covariance between ancestries is mostly driven by the effects of the concordant SNPs, whereas the genetic heterogeneity is attributed to the discordant SNPs. We also show that the predictive ability of the concordant SNPs is significantly higher than the discordant SNPs in the cross-ancestry polygenic prediction. The list of concordant SNPs for cholesterol is available in GWAS Catalog. These findings have relevance for the understanding of shared genetic architecture across ancestries, contributing to the development of clinical strategies for polygenic prediction of cholesterol in cross-ancestral settings.
Affiliation(s)
- Md Moksedul Momin
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, 5000, Australia.
- UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, 5000, Australia.
- Department of Genetics and Animal Breeding, Faculty of Veterinary Medicine, Chattogram Veterinary and Animal Sciences University (CVASU), Khulshi, Chattogram, 4225, Bangladesh.
- South Australian Health and Medical Research Institute (SAHMRI), University of South Australia, Adelaide, SA, 5000, Australia.
- Xuan Zhou
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, 5000, Australia
- UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, 5000, Australia
- South Australian Health and Medical Research Institute (SAHMRI), University of South Australia, Adelaide, SA, 5000, Australia
- Elina Hyppönen
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, 5000, Australia
- South Australian Health and Medical Research Institute (SAHMRI), University of South Australia, Adelaide, SA, 5000, Australia
- UniSA Clinical and Health Sciences, University of South Australia, Adelaide, SA, Australia
- Beben Benyamin
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, 5000, Australia
- UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, 5000, Australia
- South Australian Health and Medical Research Institute (SAHMRI), University of South Australia, Adelaide, SA, 5000, Australia
- S Hong Lee
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, 5000, Australia.
- UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, 5000, Australia.
- South Australian Health and Medical Research Institute (SAHMRI), University of South Australia, Adelaide, SA, 5000, Australia.
7. Foroozandeh Shahraki M, Farahbod M, Libbrecht MW. Robust chromatin state annotation. Genome Res 2024; 34:469-483. [PMID: 38514204 PMCID: PMC11067878 DOI: 10.1101/gr.278343.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 03/19/2024] [Indexed: 03/23/2024]
Abstract
With the goal of mapping genomic activity, international projects have recently measured epigenetic activity in hundreds of cell and tissue types. Chromatin state annotations produced by segmentation and genome annotation (SAGA) methods have emerged as the predominant way to summarize these epigenomic data sets in order to annotate the genome. These chromatin state annotations are essential for many genomic tasks, including identifying active regulatory elements and interpreting disease-associated genetic variation. However, despite the widespread applications of SAGA methods, no principled approach exists to evaluate the statistical significance of chromatin state assignments. Here, we propose the first method for assigning calibrated confidence scores to chromatin state annotations. Toward this goal, we performed a comprehensive evaluation of the reproducibility of the two most widely used existing SAGA methods, ChromHMM and Segway. We found that their predictions are frequently irreproducible. For example, when applying the same SAGA method on two sets of experimental replicates, 27%-69% of predicted enhancers fail to replicate. This suggests that a substantial fraction of predicted elements in existing chromatin state annotations cannot be relied upon. To remedy this problem, we introduce SAGAconf, a method for assigning a measure of confidence (r-value) to chromatin state annotations. SAGAconf works with any SAGA method and assigns an r-value to each genomic bin of a chromatin state annotation that represents the probability that the label of this bin will be reproduced in a replicated experiment. Thus, SAGAconf allows a researcher to select only the reliable predictions from a chromatin annotation for use in downstream analyses.
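The replication check described in this abstract can be illustrated with a minimal sketch. It is not SAGAconf and does not compute a calibrated r-value; it only measures, for one label, the fraction of bins annotated in one replicate that receive the same label in the other. The function name `replication_rate`, the label strings, and the toy annotations are hypothetical.

```python
import math


def replication_rate(rep1, rep2, label):
    """Fraction of bins carrying `label` in rep1 that also carry it in rep2.

    rep1 and rep2 are equal-length sequences of per-bin chromatin state
    labels from two replicate annotations (a naive overlap measure, not a
    calibrated confidence score).
    """
    if len(rep1) != len(rep2):
        raise ValueError("replicate annotations must cover the same bins")
    # Labels assigned by replicate 2 at the bins replicate 1 called `label`.
    matched = [b for a, b in zip(rep1, rep2) if a == label]
    if not matched:
        return math.nan  # label never assigned in rep1
    return sum(b == label for b in matched) / len(matched)


# Toy annotations over six genomic bins (hypothetical data).
rep1 = ["Enh", "Enh", "Quies", "Prom", "Enh", "Quies"]
rep2 = ["Enh", "Quies", "Quies", "Prom", "Enh", "Enh"]

rate = replication_rate(rep1, rep2, "Enh")
print(f"Enhancer bins replicated: {rate:.2f}")
```

Note that the measure is asymmetric: swapping the two replicates can give a different rate, because the denominator is the set of bins called in the first replicate.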
Affiliation(s)
- Marjan Farahbod
- School of Computing Science, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada
- Maxwell W Libbrecht
- School of Computing Science, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada
8. Kosicki M, Cintrón DL, Page NF, Georgakopoulos-Soares I, Akiyama JA, Plajzer-Frick I, Novak CS, Kato M, Hunter RD, von Maydell K, Barton S, Godfrey P, Beckman E, Sanders SJ, Pennacchio LA, Ahituv N. Massively parallel reporter assays and mouse transgenic assays provide complementary information about neuronal enhancer activity. bioRxiv 2024:2024.04.22.590634. [PMID: 38712228 PMCID: PMC11071441 DOI: 10.1101/2024.04.22.590634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]
Abstract
Genetic studies find hundreds of thousands of noncoding variants associated with psychiatric disorders. Massively parallel reporter assays (MPRAs) and in vivo transgenic mouse assays can be used to assay the impact of these variants. However, the relevance of MPRAs to in vivo function is unknown and transgenic assays suffer from low throughput. Here, we studied the utility of combining the two assays to study the impact of non-coding variants. We carried out an MPRA on over 50,000 sequences derived from enhancers validated in transgenic mouse assays and from multiple fetal neuronal ATAC-seq datasets. We also tested over 20,000 variants, including synthetic mutations in highly active neuronal enhancers and 177 common variants associated with psychiatric disorders. Variants with a high impact on MPRA activity were further tested in mice. We found a strong and specific correlation between MPRA and mouse neuronal enhancer activity including changes in neuronal enhancer activity in mouse embryos for variants with strong MPRA effects. Mouse assays also revealed pleiotropic variant effects that could not be observed in MPRA. Our work provides a large catalog of functional neuronal enhancers and variant effects and highlights the effectiveness of combining MPRAs and mouse transgenic assays.
Affiliation(s)
- Michael Kosicki
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Dianne Laboy Cintrón
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
- Nicholas F. Page
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
- Department of Psychiatry and Behavioral Sciences, Kavli Institute for Fundamental Neuroscience, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Ilias Georgakopoulos-Soares
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Jennifer A. Akiyama
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Ingrid Plajzer-Frick
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Catherine S. Novak
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Momoe Kato
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Riana D. Hunter
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Kianna von Maydell
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Sarah Barton
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Patrick Godfrey
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Erik Beckman
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Stephan J. Sanders
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
- Department of Psychiatry and Behavioral Sciences, Kavli Institute for Fundamental Neuroscience, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Institute of Developmental and Regenerative Medicine, Department of Paediatrics, University of Oxford, Oxford, OX3 7TY, UK
- Len A. Pennacchio
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
9. Choi J, Lee SM, Norwitz ER, Kim JH, Jung YM, Park CW, Jun JK, Lee D, Jin Y, Kim S, Cha B, Park JS, Kim JI. Placental expression quantitative trait loci in an East Asian population. HGG Advances 2024; 5:100276. [PMID: 38310352 PMCID: PMC10883826 DOI: 10.1016/j.xhgg.2024.100276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 01/29/2024] [Accepted: 01/29/2024] [Indexed: 02/05/2024] Open
Abstract
Expression quantitative trait loci (eQTL) analysis measures the contribution of genetic variation in gene expression on complex traits. Although this methodology has been used to examine gene regulation in numerous human tissues, eQTL research in solid tissues is relatively lacking. We conducted eQTL analysis on placentas collected from an East Asian population in an effort to identify gene regulatory mechanisms in this tissue. Placentas (n = 102) were collected at the time of cesarean delivery. mRNA was extracted, sequenced with NGS, and compared with matched maternal and fetal DNA arrays performed using maternal and neonatal cord blood. Linear regression modeling was performed using tensorQTL. Fine-mapping along with epigenomic annotation was used to select putative functional variants. We identified 2,703 coding genes that contained at least one eQTL with statistical significance (false discovery rate <0.05). After fine-mapping, we found 108 previously unreported eQTL variants with posterior inclusion probability >0.1. Of these, 19% were located in genomic regions with evidence from public placental epigenome suggesting that they may be functionally relevant. For example, variant rs28379289 located in the placenta-specific regulatory region changes the binding affinity of transcription factor leading to higher expression of LGALS3, which is known to affect placental function. This study expands the knowledge base of regulatory elements within the human placenta and identifies 108 previously unreported placenta eQTL signals, which are listed in our publicly available GMI eQTL database. Further studies are needed to identify and characterize genetic regulatory mechanisms that affect placental function in normal pregnancy and placenta-related diseases.
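The linear regression step mentioned in this abstract can be sketched for a single variant-gene pair. The study used tensorQTL; the plain-Python ordinary least squares below is a stand-in for illustration only, with no covariates, significance testing, or multiple-testing correction. The function name `ols_slope` and the toy dosage and expression values are hypothetical.

```python
def ols_slope(genotypes, expression):
    """Ordinary least squares fit of expression on genotype dosage.

    Returns (slope, intercept). The slope is the estimated change in
    expression per additional copy of the alternate allele.
    """
    n = len(genotypes)
    if n != len(expression) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    if var_g == 0:
        raise ValueError("variant is monomorphic in this sample")
    cov = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))
    slope = cov / var_g
    intercept = mean_e - slope * mean_g
    return slope, intercept


# Toy data: alternate-allele dosage (0, 1, 2) and normalized expression
# for six samples (hypothetical values).
dosage = [0, 1, 2, 0, 1, 2]
expr = [1.0, 1.5, 2.0, 1.0, 1.5, 2.0]

slope, intercept = ols_slope(dosage, expr)
print(f"effect size per allele: {slope:.2f}, intercept: {intercept:.2f}")
```

In a real eQTL scan this fit is repeated for every variant near every gene, which is why GPU-based tools such as tensorQTL are used instead of a per-pair loop.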
Affiliation(s)
- Jaeyong Choi
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Korea; Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Korea
- Seung Mi Lee
- Department of Obstetrics and Gynecology, Seoul National University College of Medicine, Seoul, Korea
- Ji Hoi Kim
- Department of Obstetrics and Gynecology, Seoul National University College of Medicine, Seoul, Korea
- Young Mi Jung
- Department of Obstetrics and Gynecology, Seoul National University College of Medicine, Seoul, Korea
- Chan-Wook Park
- Department of Obstetrics and Gynecology, Seoul National University College of Medicine, Seoul, Korea
- Jong Kwan Jun
- Department of Obstetrics and Gynecology, Seoul National University College of Medicine, Seoul, Korea
- Dakyung Lee
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Korea; Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Korea
- Yongjoon Jin
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Korea; Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Korea
- Sookyung Kim
- Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Korea
- Bukyoung Cha
- Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Korea
- Joong Shin Park
- Department of Obstetrics and Gynecology, Seoul National University College of Medicine, Seoul, Korea.
- Jong-Il Kim
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Korea; Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Korea.
10. Houzelstein D, Eozenou C, Lagos CF, Elzaiat M, Bignon-Topalovic J, Gonzalez I, Laville V, Schlick L, Wankanit S, Madon P, Kirtane J, Athalye A, Buonocore F, Bigou S, Conway GS, Bohl D, Achermann JC, Bashamboo A, McElreavey K. A conserved NR5A1-responsive enhancer regulates SRY in testis-determination. Nat Commun 2024; 15:2796. [PMID: 38555298 PMCID: PMC10981742 DOI: 10.1038/s41467-024-47162-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2022] [Accepted: 03/21/2024] [Indexed: 04/02/2024] Open
Abstract
The Y-linked SRY gene initiates mammalian testis-determination. However, how the expression of SRY is regulated remains elusive. Here, we demonstrate that a conserved steroidogenic factor-1 (SF-1)/NR5A1 binding enhancer is required for appropriate SRY expression to initiate testis-determination in humans. Comparative sequence analysis of SRY 5' regions in mammals identified an evolutionary conserved SF-1/NR5A1-binding motif within a 250 bp region of open chromatin located 5 kilobases upstream of the SRY transcription start site. Genomic analysis of 46,XY individuals with disrupted testis-determination, including a large multigenerational family, identified unique single-base substitutions of highly conserved residues within the SF-1/NR5A1-binding element. In silico modelling and in vitro assays demonstrate the enhancer properties of the NR5A1 motif. Deletion of this hemizygous element by genome-editing, in a novel in vitro cellular model recapitulating human Sertoli cell formation, resulted in a significant reduction in expression of SRY. Therefore, human NR5A1 acts as a regulatory switch between testis and ovary development by upregulating SRY expression, a role that may predate the eutherian radiation. We show that disruption of an enhancer can phenocopy variants in the coding regions of SRY that cause human testis dysgenesis. Since disease causing variants in enhancers are currently rare, the regulation of gene expression in testis-determination offers a paradigm to define enhancer activity in a key developmental process.
Affiliation(s)
- Denis Houzelstein
  - Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
  - Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
- Caroline Eozenou
  - Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
  - Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
  - Institut Cochin, Université Paris Cité, INSERM, CNRS, Paris, France
- Carlos F Lagos
  - Chemical Biology & Drug Discovery Lab, Escuela de Química y Farmacia, Facultad de Medicina y Ciencia, Universidad San Sebastián, Campus Los Leones, Lota 2465 Providencia, 7510157, Santiago, Chile
  - Centro Ciencia & Vida, Fundación Ciencia & Vida, Av. del Valle Norte 725, Huechuraba, 8580702, Santiago, Chile
- Maëva Elzaiat
  - Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
  - Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
- Joelle Bignon-Topalovic
  - Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
  - Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
- Inma Gonzalez
  - Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
  - Institut Pasteur, Université Paris Cité, Epigenomics, Proliferation, and the Identity of Cells Unit, F-75015, Paris, France
- Vincent Laville
  - Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
  - Institut Pasteur, Université Paris Cité, Stem Cells and Development Unit, F-75015, Paris, France
  - Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, F-75015, Paris, France
- Laurène Schlick
  - Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
  - Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
- Somboon Wankanit
  - Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
  - Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
  - Department of Pediatrics, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
- Prochi Madon
  - Department of Assisted Reproduction and Genetics, Jaslok Hospital and Research Centre, Mumbai, India
- Jyotsna Kirtane
  - Department of Pediatric Surgery, Jaslok Hospital and Research Centre, Mumbai, India
- Arundhati Athalye
  - Department of Assisted Reproduction and Genetics, Jaslok Hospital and Research Centre, Mumbai, India
- Federica Buonocore
  - Genetics and Genomic Medicine Research & Teaching Department, UCL GOS Institute of Child Health, University College London, London, United Kingdom
- Stéphanie Bigou
  - ICV-iPS core facility, Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, Inserm, CNRS, APHP, Hôpital de la Pitié Salpêtrière, Paris, France
- Gerard S Conway
  - Institute for Women's Health, University College London, London, United Kingdom
- Delphine Bohl
  - ICV-iPS core facility, Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, Inserm, CNRS, APHP, Hôpital de la Pitié Salpêtrière, Paris, France
  - Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, Inserm, CNRS, APHP, Hôpital de la Pitié Salpêtrière, Paris, France
- John C Achermann
  - Genetics and Genomic Medicine Research & Teaching Department, UCL GOS Institute of Child Health, University College London, London, United Kingdom
- Anu Bashamboo
  - Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
  - Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
- Ken McElreavey
  - Institut Pasteur, Université Paris Cité, Human Developmental Genetics Unit, F-75015, Paris, France
  - Centre National de la Recherche Scientifique, CNRS, UMR 3738, Paris, France
|
11
|
Che H, Jiang P, Choy LYL, Cheng SH, Peng W, Chan RWY, Liu J, Zhou Q, Lam WKJ, Yu SCY, Lau SL, Leung TY, Wong J, Wong VWS, Wong GLH, Chan SL, Chan KCA, Lo YMD. Genomic origin, fragmentomics, and transcriptional properties of long cell-free DNA molecules in human plasma. Genome Res 2024; 34:189-200. [PMID: 38408788 PMCID: PMC10984381 DOI: 10.1101/gr.278556.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 02/14/2024] [Indexed: 02/28/2024]
Abstract
Recent studies have revealed an unexplored population of long cell-free DNA (cfDNA) molecules in human plasma using long-read sequencing technologies. However, the biological properties of long cfDNA molecules (>500 bp) remain largely unknown. To this end, we have investigated the origins of long cfDNA molecules from different genomic elements. Analysis of plasma cfDNA using long-read sequencing reveals an uneven distribution of long molecules across the genome. Long cfDNA molecules show overrepresentation in euchromatic regions of the genome, in sharp contrast to short DNA molecules. We observe a stronger relationship between the abundance of long molecules and mRNA gene expression levels, compared with short molecules (Pearson's r = 0.71 vs. -0.14). Moreover, long and short molecules show distinct fragmentation patterns surrounding CpG sites. Leveraging the cleavage preferences surrounding CpG sites, the combined cleavage ratios of long and short molecules can differentiate patients with hepatocellular carcinoma (HCC) from non-HCC subjects (AUC = 0.87). We also investigated knockout mice in which selected nuclease genes had been inactivated in comparison with wild-type mice. The proportion of long molecules originating from transcription start sites is lower in Dffb-deficient mice but higher in Dnase1l3-deficient mice compared with that of wild-type mice. This work thus provides new insights into the biological properties and potential clinical applications of long cfDNA molecules.
Affiliation(s)
- Huiwen Che
  - Centre for Novostics, Hong Kong Science Park, Pak Shek Kok, Hong Kong SAR, China
  - Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - Department of Chemical Pathology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Peiyong Jiang
  - Centre for Novostics, Hong Kong Science Park, Pak Shek Kok, Hong Kong SAR, China
  - Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - Department of Chemical Pathology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - State Key Laboratory of Translational Oncology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- L Y Lois Choy
  - Centre for Novostics, Hong Kong Science Park, Pak Shek Kok, Hong Kong SAR, China
  - Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - Department of Chemical Pathology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - State Key Laboratory of Translational Oncology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Suk Hang Cheng
  - Centre for Novostics, Hong Kong Science Park, Pak Shek Kok, Hong Kong SAR, China
  - Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - Department of Chemical Pathology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Wenlei Peng
  - Centre for Novostics, Hong Kong Science Park, Pak Shek Kok, Hong Kong SAR, China
  - Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - Department of Chemical Pathology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Rebecca W Y Chan
  - Centre for Novostics, Hong Kong Science Park, Pak Shek Kok, Hong Kong SAR, China
  - Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - Department of Chemical Pathology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Jing Liu
  - Centre for Novostics, Hong Kong Science Park, Pak Shek Kok, Hong Kong SAR, China
  - Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - Department of Chemical Pathology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Qing Zhou
  - Centre for Novostics, Hong Kong Science Park, Pak Shek Kok, Hong Kong SAR, China
  - Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - Department of Chemical Pathology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- W K Jacky Lam
  - Centre for Novostics, Hong Kong Science Park, Pak Shek Kok, Hong Kong SAR, China
  - Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - Department of Chemical Pathology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - State Key Laboratory of Translational Oncology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Stephanie C Y Yu
  - Centre for Novostics, Hong Kong Science Park, Pak Shek Kok, Hong Kong SAR, China
  - Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - Department of Chemical Pathology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- So Ling Lau
  - Department of Obstetrics and Gynecology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Tak Y Leung
  - Department of Obstetrics and Gynecology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- John Wong
  - Department of Surgery, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Vincent Wai-Sun Wong
  - Department of Medicine and Therapeutics, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Grace L H Wong
  - Department of Medicine and Therapeutics, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Stephen L Chan
  - State Key Laboratory of Translational Oncology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - Department of Clinical Oncology, Sir Y.K. Pao Centre for Cancer, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- K C Allen Chan
  - Centre for Novostics, Hong Kong Science Park, Pak Shek Kok, Hong Kong SAR, China
  - Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - Department of Chemical Pathology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - State Key Laboratory of Translational Oncology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Y M Dennis Lo
  - Centre for Novostics, Hong Kong Science Park, Pak Shek Kok, Hong Kong SAR, China
  - Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - Department of Chemical Pathology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
  - State Key Laboratory of Translational Oncology, Prince of Wales Hospital, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
|
12
|
Eletr LF, Ibnouf SH, Salih TA, Ibrahim HI, Mustafa MI, Alhashmi NA, Alfaki M. Comprehensive Analysis Reveals Deoxyribonuclease 1 as a Potential Prognostic and Diagnostic Biomarker in Human Cancers. Cureus 2024; 16:e56171. [PMID: 38618458 PMCID: PMC11015913 DOI: 10.7759/cureus.56171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/14/2024] [Indexed: 04/16/2024] Open
Abstract
BACKGROUND: Deoxyribonuclease 1 (DNASE1) is an important gene associated with several cancers, including liver, bladder, and gastric cancer. It has been linked to autoimmune illnesses, including systemic lupus erythematosus, which may lead to cancer formation. However, the role of DNASE1 in cancer has not been studied.
MATERIALS AND METHODS: We performed a pan-cancer analysis using bioinformatics tools, including Tumor Immune Estimation Resource (TIMER), Gene Expression Profiling Interactive Analysis (GEPIA), and University of Alabama at Birmingham Cancer Data Analysis Portal (UALCAN) databases, Kaplan-Meier plotter, and cBioPortal, to investigate the expression of DNASE1 across various cancers as well as its association with immune infiltration and genetic alterations. Public datasets were used to validate DNASE1 expression in kidney renal clear cell carcinoma (KIRC) and kidney papillary renal cell carcinoma (KIRP) samples.
RESULTS: DNASE1 was found to be highly expressed in many cancers, such as bladder urothelial carcinoma (BLCA), breast invasive carcinoma (BRCA), and head and neck squamous cell carcinoma (HNSC), and was expressed at low levels in other cancers, including KIRC, KIRP, and thyroid carcinoma (THCA). Additionally, TIMER results showed an association of DNASE1 with immune cell infiltration in KIRC and KIRP. Survival analysis indicated that high DNASE1 expression was associated with poor prognosis in KIRC. We also discovered that altered DNASE1 expression was related to poor prognosis in The Cancer Genome Atlas (TCGA) tumors.
CONCLUSION: DNASE1 could potentially be used as a prognostic and diagnostic biomarker for KIRC and as a diagnostic biomarker for KIRP.
Affiliation(s)
- Loai F Eletr
  - Computing and Bioinformatics, Faculty of Science, Port Said University, Port Said, EGY
- Hadba I Ibrahim
  - Zoology, Faculty of Science, University of Khartoum, Khartoum, SDN
- Mustafa I Mustafa
  - Internal Medicine, Sudan Medical Specialization Board, Khartoum, SDN
  - Clinical Immunology, Sudan Medical Specialization Board, Khartoum, SDN
  - Neurology, King Abdulaziz Medical City Jeddah, Jeddah, SAU
|
13
|
Capauto D, Wang Y, Wu F, Norton S, Mariani J, Inoue F, Crawford GE, Ahituv N, Abyzov A, Vaccarino FM. Characterization of enhancer activity in early human neurodevelopment using Massively Parallel Reporter Assay (MPRA) and forebrain organoids. Sci Rep 2024; 14:3936. [PMID: 38365907 PMCID: PMC10873509 DOI: 10.1038/s41598-024-54302-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Accepted: 02/11/2024] [Indexed: 02/18/2024] Open
Abstract
Regulation of gene expression through enhancers is one of the major processes shaping the structure and function of the human brain during development. High-throughput assays have predicted thousands of enhancers involved in neurodevelopment, and confirming their activity through orthogonal functional assays is crucial. Here, we utilized Massively Parallel Reporter Assays (MPRAs) in stem cells and forebrain organoids to evaluate the activity of ~7000 gene-linked enhancers previously identified in human fetal tissues and brain organoids. We used a Gaussian mixture model to evaluate the contribution of background noise in the measured activity signal to confirm the activity of ~35% of the tested enhancers, with most showing temporal-specific activity, suggesting their evolving role in neurodevelopment. The temporal specificity was further supported by the correlation of activity with gene expression. Our findings provide a valuable gene regulatory resource to the scientific community.
Affiliation(s)
- Davide Capauto
  - Child Study Center, Yale University, New Haven, CT, 06520, USA
- Yifan Wang
  - Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, MN, 55905, USA
- Feinan Wu
  - Child Study Center, Yale University, New Haven, CT, 06520, USA
- Scott Norton
  - Child Study Center, Yale University, New Haven, CT, 06520, USA
- Jessica Mariani
  - Child Study Center, Yale University, New Haven, CT, 06520, USA
- Fumitaka Inoue
  - Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
- Nadav Ahituv
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
  - Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Alexej Abyzov
  - Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, MN, 55905, USA
- Flora M Vaccarino
  - Child Study Center, Yale University, New Haven, CT, 06520, USA
  - Department of Neuroscience, Yale University, New Haven, CT, 06520, USA
  - Yale Stem Cell Center, Yale University, New Haven, CT, 06520, USA
|
14
|
Alda-Catalinas C, Ibarra-Soria X, Flouri C, Gordillo JE, Cousminer D, Hutchinson A, Sun B, Pembroke W, Ullrich S, Krejci A, Cortes A, Acevedo A, Malla S, Fishwick C, Drewes G, Rapiteanu R. Mapping the functional impact of non-coding regulatory elements in primary T cells through single-cell CRISPR screens. Genome Biol 2024; 25:42. [PMID: 38308274 PMCID: PMC10835965 DOI: 10.1186/s13059-024-03176-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Accepted: 01/18/2024] [Indexed: 02/04/2024] Open
Abstract
BACKGROUND: Drug targets with genetic evidence are expected to increase clinical success by at least twofold. Yet, translating disease-associated genetic variants into functional knowledge remains a fundamental challenge of drug discovery. A key issue is that the vast majority of complex disease associations cannot be cleanly mapped to a gene. Immune disease-associated variants are enriched within regulatory elements found in T-cell-specific open chromatin regions.
RESULTS: To identify genes and molecular programs modulated by these regulatory elements, we develop a CRISPRi-based single-cell functional screening approach in primary human T cells. Our pipeline enables the interrogation of transcriptomic changes induced by the perturbation of regulatory elements at scale. We first optimize an efficient CRISPRi protocol in primary CD4+ T cells via CROPseq vectors. Subsequently, we perform a screen targeting 45 non-coding regulatory elements and 35 transcription start sites and profile approximately 250,000 T-cell single-cell transcriptomes. We develop a bespoke analytical pipeline for element-to-gene (E2G) mapping and demonstrate that our method can identify both previously annotated and novel E2G links. Lastly, we integrate genetic association data for immune-related traits and demonstrate how our platform can aid in the identification of effector genes for GWAS loci.
CONCLUSIONS: We describe "primary T cell crisprQTL" - a scalable, single-cell functional genomics approach for mapping regulatory elements to genes in primary human T cells. We show how this framework can facilitate the interrogation of immune disease GWAS hits and propose that the combination of experimental and QTL-based techniques is likely to address the variant-to-function problem.
Affiliation(s)
- Bin Sun
  - Genomic Sciences, GSK, Stevenage, UK
- Gerard Drewes
  - Genomic Sciences, GSK, Stevenage, UK
  - Genomic Sciences, GSK, Collegeville, PA, USA
|
15
|
DaSilva LF, Senan S, Patel ZM, Janardhan Reddy A, Gabbita S, Nussbaum Z, Valdez Córdova CM, Wenteler A, Weber N, Tunjic TM, Ahmad Khan T, Li Z, Smith C, Bejan M, Karmel Louis L, Cornejo P, Connell W, Wong ES, Meuleman W, Pinello L. DNA-Diffusion: Leveraging Generative Models for Controlling Chromatin Accessibility and Gene Expression via Synthetic Regulatory Elements. bioRxiv 2024:2024.02.01.578352. [PMID: 38352499 PMCID: PMC10862870 DOI: 10.1101/2024.02.01.578352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2024]
Abstract
The challenge of systematically modifying and optimizing regulatory elements for precise gene expression control is central to modern genomics and synthetic biology. Advancements in generative AI have paved the way for designing synthetic sequences with the aim of safely and accurately modulating gene expression. We leverage diffusion models to design context-specific DNA regulatory sequences, which hold significant potential toward enabling novel therapeutic applications requiring precise modulation of gene expression. Our framework uses a cell type-specific diffusion model to generate synthetic 200 bp regulatory elements based on chromatin accessibility across different cell types. We evaluate the generated sequences based on key metrics to ensure they retain properties of endogenous sequences: transcription factor binding site composition, potential for cell type-specific chromatin accessibility, and capacity for sequences generated by DNA diffusion to activate gene expression in different cell contexts using state-of-the-art prediction models. Our results demonstrate the ability to robustly generate DNA sequences with cell type-specific regulatory potential. DNA-Diffusion paves the way for revolutionizing a regulatory modulation approach to mammalian synthetic biology and precision gene therapy.
Affiliation(s)
- Lucas Ferreira DaSilva
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
  - Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Simon Senan
  - Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Zain Munir Patel
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
  - Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Aniketh Janardhan Reddy
  - Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA
- Sameer Gabbita
  - Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Johns Hopkins University, Baltimore, MD, USA
- Zelun Li
  - Victor Chang Cardiac Institute, Darlinghurst, New South Wales, Australia
  - School of Biotechnology and Biomolecular Sciences, Faculty of Science, UNSW Sydney, Sydney, Australia
- Cameron Smith
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
  - Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Lithin Karmel Louis
  - Victor Chang Cardiac Institute, Darlinghurst, New South Wales, Australia
  - School of Biotechnology and Biomolecular Sciences, Faculty of Science, UNSW Sydney, Sydney, Australia
- Paola Cornejo
  - Victor Chang Cardiac Institute, Darlinghurst, New South Wales, Australia
  - School of Biotechnology and Biomolecular Sciences, Faculty of Science, UNSW Sydney, Sydney, Australia
- Emily S. Wong
  - Victor Chang Cardiac Institute, Darlinghurst, New South Wales, Australia
  - School of Biotechnology and Biomolecular Sciences, Faculty of Science, UNSW Sydney, Sydney, Australia
- Wouter Meuleman
  - Altius Institute for Biomedical Sciences, Seattle, WA, USA
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
- Luca Pinello
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
  - Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
|
16
|
Taskiran II, Spanier KI, Dickmänken H, Kempynck N, Pančíková A, Ekşi EC, Hulselmans G, Ismail JN, Theunis K, Vandepoel R, Christiaens V, Mauduit D, Aerts S. Cell-type-directed design of synthetic enhancers. Nature 2024; 626:212-220. [PMID: 38086419 PMCID: PMC10830415 DOI: 10.1038/s41586-023-06936-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/06/2022] [Accepted: 12/05/2023] [Indexed: 01/19/2024]
Abstract
Transcriptional enhancers act as docking stations for combinations of transcription factors and thereby regulate spatiotemporal activation of their target genes [1]. It has been a long-standing goal in the field to decode the regulatory logic of an enhancer and to understand the details of how spatiotemporal gene expression is encoded in an enhancer sequence. Here we show that deep learning models [2-6] can be used to efficiently design synthetic, cell-type-specific enhancers, starting from random sequences, and that this optimization process allows detailed tracing of enhancer features at single-nucleotide resolution. We evaluate the function of fully synthetic enhancers to specifically target Kenyon cells or glial cells in the fruit fly brain using transgenic animals. We further exploit enhancer design to create 'dual-code' enhancers that target two cell types and minimal enhancers smaller than 50 base pairs that are fully functional. By examining the state space searches towards local optima, we characterize enhancer codes through the strength, combination and arrangement of transcription factor activator and transcription factor repressor motifs. Finally, we apply the same strategies to successfully design human enhancers, which adhere to enhancer rules similar to those of Drosophila enhancers. Enhancer design guided by deep learning leads to better understanding of how enhancers work and shows that their code can be exploited to manipulate cell states.
Affiliation(s)
- Ibrahim I Taskiran
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Katina I Spanier
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Hannah Dickmänken
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Niklas Kempynck
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Alexandra Pančíková
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
  - VIB-KULeuven Center for Cancer Biology, Leuven, Belgium
- Eren Can Ekşi
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Gert Hulselmans
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Joy N Ismail
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
  - UK Dementia Research Institute at Imperial College London, London, UK
- Koen Theunis
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Roel Vandepoel
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Valerie Christiaens
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- David Mauduit
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Stein Aerts
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
|
17
|
Willemin A, Szabó D, Pombo A. Epigenetic regulatory layers in the 3D nucleus. Mol Cell 2024; 84:415-428. [PMID: 38242127 PMCID: PMC10872226 DOI: 10.1016/j.molcel.2023.12.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 11/21/2023] [Accepted: 12/15/2023] [Indexed: 01/21/2024]
Abstract
Nearly 7 decades have elapsed since Francis Crick introduced the central dogma of molecular biology, as part of his ideas on protein synthesis, setting the fundamental rules of sequence information transfer from DNA to RNAs and proteins. We have since learned that gene expression is finely tuned in time and space, due to the activities of RNAs and proteins on regulatory DNA elements, and through cell-type-specific three-dimensional conformations of the genome. Here, we review major advances in genome biology, discuss a set of ideas on gene regulation, and highlight how various biomolecular assemblies lead to the formation of structural and regulatory features within the nucleus, with roles in transcriptional control. We conclude by suggesting further developments that will help capture the complex, dynamic, and often spatially restricted events that govern gene expression in mammalian cells.
Affiliation(s)
- Andréa Willemin
  - Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin Institute for Medical Systems Biology (BIMSB), Epigenetic Regulation and Chromatin Architecture Group, Berlin, Germany; Humboldt-Universität zu Berlin, Institute for Biology, Berlin, Germany
- Dominik Szabó
  - Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin Institute for Medical Systems Biology (BIMSB), Epigenetic Regulation and Chromatin Architecture Group, Berlin, Germany; Humboldt-Universität zu Berlin, Institute for Biology, Berlin, Germany
- Ana Pombo
  - Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin Institute for Medical Systems Biology (BIMSB), Epigenetic Regulation and Chromatin Architecture Group, Berlin, Germany; Humboldt-Universität zu Berlin, Institute for Biology, Berlin, Germany
| |
|
18
|
Salvadores M, Supek F. Cell cycle gene alterations associate with a redistribution of mutation risk across chromosomal domains in human cancers. NATURE CANCER 2024; 5:330-346. [PMID: 38200245 DOI: 10.1038/s43018-023-00707-8] [Received: 12/23/2022] [Accepted: 12/11/2023] [Indexed: 01/12/2024]
Abstract
Mutations in human cells exhibit increased burden in heterochromatic, late DNA replication time (RT) chromosomal domains, with variation in mutation rates between tissues mirroring variation in heterochromatin and RT. We observed that regional mutation risk further varies between individual tumors in a manner independent of cell type, identifying three signatures of domain-scale mutagenesis in >4,000 tumor genomes. The major signature reflects remodeling of heterochromatin and of the RT program domains seen across tumors, tissues and cultured cells, and is robustly linked with higher expression of cell proliferation genes. Regional mutagenesis is associated with loss of activity of the tumor-suppressor genes RB1 and TP53, consistent with their roles in cell cycle control, with distinct mutational patterns generated by the two genes. Loss of regional heterogeneity in mutagenesis is associated with deficiencies in various DNA repair pathways. These mutation risk redistribution processes modify the mutation supply towards important genes, diverting the course of somatic evolution.
Affiliation(s)
- Marina Salvadores
- Genome Data Science, Institute for Research in Biomedicine (IRB Barcelona), Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Fran Supek
- Genome Data Science, Institute for Research in Biomedicine (IRB Barcelona), Barcelona Institute of Science and Technology, Barcelona, Spain.
- Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain.
| |
|
19
|
Lim F, Solvason JJ, Ryan GE, Le SH, Jindal GA, Steffen P, Jandu SK, Farley EK. Affinity-optimizing enhancer variants disrupt development. Nature 2024; 626:151-159. [PMID: 38233525 PMCID: PMC10830414 DOI: 10.1038/s41586-023-06922-8] [Received: 01/11/2023] [Accepted: 11/30/2023] [Indexed: 01/19/2024]
Abstract
Enhancers control the location and timing of gene expression and contain the majority of variants associated with disease [1-3]. The ZRS is arguably the most well-studied vertebrate enhancer and mediates the expression of Shh in the developing limb [4]. Thirty-one human single-nucleotide variants (SNVs) within the ZRS are associated with polydactyly [4-6]. However, how this enhancer encodes tissue-specific activity, and the mechanisms by which SNVs alter the number of digits, are poorly understood. Here we show that the ETS sites within the ZRS are low affinity, and identify a functional ETS site, ETS-A, with extremely low affinity. Two human SNVs and a synthetic variant optimize the binding affinity of ETS-A subtly from 15% to around 25% relative to the strongest ETS binding sequence, and cause polydactyly with the same penetrance and severity. A greater increase in affinity results in phenotypes that are more penetrant and more severe. Affinity-optimizing SNVs in other ETS sites in the ZRS, as well as in ETS, interferon regulatory factor (IRF), HOX and activator protein 1 (AP-1) sites within a wide variety of enhancers, cause gain-of-function gene expression. The prevalence of binding sites with suboptimal affinity in enhancers creates a vulnerability in genomes whereby SNVs that optimize affinity, even slightly, can be pathogenic. Searching for affinity-optimizing SNVs in genomes could provide a mechanistic approach to identify causal variants that underlie enhanceropathies.
Affiliation(s)
- Fabian Lim
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
- Biological Sciences Graduate Program, University of California San Diego, La Jolla, CA, USA
| | - Joe J Solvason
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
- Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA, USA
| | - Genevieve E Ryan
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Sophia H Le
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Granton A Jindal
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Paige Steffen
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Simran K Jandu
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Emma K Farley
- Department of Medicine, University of California San Diego, La Jolla, CA, USA.
- Department of Molecular Biology, Biological Sciences, University of California San Diego, La Jolla, CA, USA.
| |
|
20
|
Xiang G, Guo Y, Bumcrot D, Sigova A. JMnorm: a novel joint multi-feature normalization method for integrative and comparative epigenomics. Nucleic Acids Res 2024; 52:e11. [PMID: 38055833 PMCID: PMC10810286 DOI: 10.1093/nar/gkad1146] [Received: 07/14/2023] [Revised: 10/25/2023] [Accepted: 11/14/2023] [Indexed: 12/08/2023] Open
Abstract
Combinatorial patterns of epigenetic features reflect transcriptional states and functions of genomic regions. While many epigenetic features have correlated relationships, most existing data normalization approaches analyze each feature independently. Such strategies may distort relationships between functionally correlated epigenetic features and hinder biological interpretation. We present a novel approach named JMnorm that simultaneously normalizes multiple epigenetic features across cell types, species, and experimental conditions by leveraging information from partially correlated epigenetic features. We demonstrate that JMnorm-normalized data can better preserve cross-epigenetic-feature correlations across different cell types and enhance consistency between biological replicates than data normalized by other methods. Additionally, we show that JMnorm-normalized data can consistently improve the performance of various downstream analyses, which include candidate cis-regulatory element clustering, cross-cell-type gene expression prediction, detection of transcription factor binding and changes upon perturbations. These findings suggest that JMnorm effectively minimizes technical noise while preserving true biologically significant relationships between epigenetic datasets. We anticipate that JMnorm will enhance integrative and comparative epigenomics.
Affiliation(s)
- Guanjue Xiang
- CAMP4 Therapeutics Corp., One Kendall Square, Building 1400 West, Cambridge, MA 02139, USA
| | - Yuchun Guo
- CAMP4 Therapeutics Corp., One Kendall Square, Building 1400 West, Cambridge, MA 02139, USA
| | - David Bumcrot
- CAMP4 Therapeutics Corp., One Kendall Square, Building 1400 West, Cambridge, MA 02139, USA
| | - Alla Sigova
- CAMP4 Therapeutics Corp., One Kendall Square, Building 1400 West, Cambridge, MA 02139, USA
| |
|
21
|
Mehmood F, Arshad S, Shoaib M. ADH-Enhancer: an attention-based deep hybrid framework for enhancer identification and strength prediction. Brief Bioinform 2024; 25:bbae030. [PMID: 38385876 PMCID: PMC10885011 DOI: 10.1093/bib/bbae030] [Received: 09/20/2023] [Revised: 12/30/2023] [Accepted: 01/11/2024] [Indexed: 02/23/2024] Open
Abstract
Enhancers play an important role in the regulation of gene expression. The abundance or absence of enhancers in a DNA sequence, and irregularities in enhancer strength, affect gene expression, leading to the initiation and propagation of diverse types of genetic diseases such as hemophilia, bladder cancer, diabetes and congenital disorders. Enhancer identification and strength prediction through experimental approaches are expensive, time-consuming and error-prone. To accelerate research on enhancer identification and strength prediction, around 19 computational frameworks have been proposed. These frameworks use machine and deep learning methods that take raw DNA sequences and predict enhancer presence and strength. However, these frameworks still lack in performance and are not useful for real-time analysis. This paper presents a novel deep learning framework that uses language modeling strategies to transform DNA sequences into a statistical feature space. It applies transfer learning by training a language model in an unsupervised fashion to predict a group of nucleotides, also known as k-mers, based on the context of the existing k-mers in a sequence. At the classification stage, it presents a novel classifier that reaps the benefits of two different architectures: a convolutional neural network and an attention mechanism. The proposed framework is evaluated on the enhancer identification benchmark dataset, where it outperforms the existing best-performing framework by 5% and 9% in terms of accuracy and MCC, respectively. Similarly, when evaluated on the enhancer strength prediction benchmark dataset, it outperforms the existing best-performing framework by 4% and 7% in terms of accuracy and MCC, respectively.
Affiliation(s)
- Faiza Mehmood
- Department of Computer Science, University of Engineering and Technology Lahore, (Faisalabad Campus) Pakistan
| | - Shazia Arshad
- Department of Computer Science, University of Engineering and Technology Lahore, 54890, Pakistan
| | - Muhammad Shoaib
- Department of Computer Science, University of Engineering and Technology Lahore, 54890, Pakistan
| |
|
22
|
Autio MI, Motakis E, Perrin A, Bin Amin T, Tiang Z, Do DV, Wang J, Tan J, Ding SSL, Tan WX, Lee CJM, Teo AKK, Foo RSY. Computationally defined and in vitro validated putative genomic safe harbour loci for transgene expression in human cells. eLife 2024; 13:e79592. [PMID: 38164941 PMCID: PMC10836832 DOI: 10.7554/elife.79592] [Received: 04/21/2022] [Accepted: 12/28/2023] [Indexed: 01/03/2024] Open
Abstract
Selection of the target site is an inherent question for any project aiming for directed transgene integration. Genomic safe harbour (GSH) loci have been proposed as safe sites in the human genome for transgene integration. Although several sites have been characterised for transgene integration in the literature, most of these do not meet criteria set out for a GSH and the limited set that do have not been characterised extensively. Here, we conducted a computational analysis using publicly available data to identify 25 unique putative GSH loci that reside in active chromosomal compartments. We validated stable transgene expression and minimal disruption of the native transcriptome in three GSH sites in vitro using human embryonic stem cells (hESCs) and their differentiated progeny. Furthermore, for easy targeted transgene expression, we have engineered constitutive landing pad expression constructs into the three validated GSH in hESCs.
Affiliation(s)
- Matias I Autio
- Laboratory of Molecular Epigenomics and Chromatin Organization, Genome Institute of Singapore, Singapore, Singapore
- Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, Singapore, Singapore
- Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, Singapore, Singapore
| | - Efthymios Motakis
- Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, Singapore, Singapore
| | - Arnaud Perrin
- Laboratory of Molecular Epigenomics and Chromatin Organization, Genome Institute of Singapore, Singapore, Singapore
- Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, Singapore, Singapore
| | - Talal Bin Amin
- Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, Singapore, Singapore
| | - Zenia Tiang
- Laboratory of Molecular Epigenomics and Chromatin Organization, Genome Institute of Singapore, Singapore, Singapore
- Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, Singapore, Singapore
| | - Dang Vinh Do
- Laboratory of Molecular Epigenomics and Chromatin Organization, Genome Institute of Singapore, Singapore, Singapore
- Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, Singapore, Singapore
| | - Jiaxu Wang
- Laboratory of RNA Genomics and Structure, Genome Institute of Singapore, Singapore, Singapore
| | - Joanna Tan
- Center for Genome Diagnostics, Genome Institute of Singapore, Singapore, Singapore
| | - Shirley Suet Lee Ding
- Stem Cells and Diabetes Laboratory, Institute of Molecular and Cell Biology, Singapore, Singapore
| | - Wei Xuan Tan
- Stem Cells and Diabetes Laboratory, Institute of Molecular and Cell Biology, Singapore, Singapore
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
| | - Chang Jie Mick Lee
- Laboratory of Molecular Epigenomics and Chromatin Organization, Genome Institute of Singapore, Singapore, Singapore
- Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, Singapore, Singapore
| | - Adrian Kee Keong Teo
- Stem Cells and Diabetes Laboratory, Institute of Molecular and Cell Biology, Singapore, Singapore
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Precision Medicine Translational Research Programme, Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
| | - Roger S Y Foo
- Laboratory of Molecular Epigenomics and Chromatin Organization, Genome Institute of Singapore, Singapore, Singapore
- Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, Singapore, Singapore
| |
|
23
|
Kuderna LFK, Ulirsch JC, Rashid S, Ameen M, Sundaram L, Hickey G, Cox AJ, Gao H, Kumar A, Aguet F, Christmas MJ, Clawson H, Haeussler M, Janiak MC, Kuhlwilm M, Orkin JD, Bataillon T, Manu S, Valenzuela A, Bergman J, Rouselle M, Silva FE, Agueda L, Blanc J, Gut M, de Vries D, Goodhead I, Harris RA, Raveendran M, Jensen A, Chuma IS, Horvath JE, Hvilsom C, Juan D, Frandsen P, Schraiber JG, de Melo FR, Bertuol F, Byrne H, Sampaio I, Farias I, Valsecchi J, Messias M, da Silva MNF, Trivedi M, Rossi R, Hrbek T, Andriaholinirina N, Rabarivola CJ, Zaramody A, Jolly CJ, Phillips-Conroy J, Wilkerson G, Abee C, Simmons JH, Fernandez-Duque E, Kanthaswamy S, Shiferaw F, Wu D, Zhou L, Shao Y, Zhang G, Keyyu JD, Knauf S, Le MD, Lizano E, Merker S, Navarro A, Nadler T, Khor CC, Lee J, Tan P, Lim WK, Kitchener AC, Zinner D, Gut I, Melin AD, Guschanski K, Schierup MH, Beck RMD, Karakikes I, Wang KC, Umapathy G, Roos C, Boubli JP, Siepel A, Kundaje A, Paten B, Lindblad-Toh K, Rogers J, Marques Bonet T, Farh KKH. Identification of constrained sequence elements across 239 primate genomes. Nature 2024; 625:735-742. [PMID: 38030727 PMCID: PMC10808062 DOI: 10.1038/s41586-023-06798-8] [Received: 02/09/2023] [Accepted: 10/30/2023] [Indexed: 12/01/2023]
Abstract
Noncoding DNA is central to our understanding of human gene regulation and complex diseases [1,2], and measuring the evolutionary sequence constraint can establish the functional relevance of putative regulatory elements in the human genome [3-9]. Identifying the genomic elements that have become constrained specifically in primates has been hampered by the faster evolution of noncoding DNA compared to protein-coding DNA [10], the relatively short timescales separating primate species [11], and the previously limited availability of whole-genome sequences [12]. Here we construct a whole-genome alignment of 239 species, representing nearly half of all extant species in the primate order. Using this resource, we identified human regulatory elements that are under selective constraint across primates and other mammals at a 5% false discovery rate. We detected 111,318 DNase I hypersensitivity sites and 267,410 transcription factor binding sites that are constrained specifically in primates but not across other placental mammals and validate their cis-regulatory effects on gene expression. These regulatory elements are enriched for human genetic variants that affect gene expression and complex traits and diseases. Our results highlight the important role of recent evolution in regulatory sequence elements differentiating primates, including humans, from other placental mammals.
Affiliation(s)
- Lukas F K Kuderna
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
| | - Jacob C Ulirsch
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
| | - Sabrina Rashid
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
| | - Mohamed Ameen
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
| | - Laksshman Sundaram
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
| | - Glenn Hickey
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Anthony J Cox
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
| | - Hong Gao
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
| | - Arvind Kumar
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
| | - Francois Aguet
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
| | - Matthew J Christmas
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Hiram Clawson
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | | | - Mareike C Janiak
- School of Science, Engineering and Environment, University of Salford, Salford, UK
| | - Martin Kuhlwilm
- Department of Evolutionary Anthropology, University of Vienna, Vienna, Austria
- Human Evolution and Archaeological Sciences (HEAS), University of Vienna, Vienna, Austria
| | - Joseph D Orkin
- Département d'Anthropologie, Université de Montréal, Montréal, Quebec, Canada
| | - Thomas Bataillon
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
| | - Shivakumara Manu
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India
| | - Alejandro Valenzuela
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
| | - Juraj Bergman
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Section for Ecoinformatics and Biodiversity, Department of Biology, Aarhus University, Aarhus, Denmark
| | | | - Felipe Ennes Silva
- Research Group on Primate Biology and Conservation, Mamirauá Institute for Sustainable Development, Tefé, Brazil
- Evolutionary Biology and Ecology (EBE), Département de Biologie des Organismes, Université libre de Bruxelles (ULB), Brussels, Belgium
| | - Lidia Agueda
- Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
| | - Julie Blanc
- Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
| | - Marta Gut
- Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
| | - Dorien de Vries
- School of Science, Engineering and Environment, University of Salford, Salford, UK
| | - Ian Goodhead
- School of Science, Engineering and Environment, University of Salford, Salford, UK
| | - R Alan Harris
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Muthuswamy Raveendran
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Axel Jensen
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, Uppsala, Sweden
| | | | - Julie E Horvath
- North Carolina Museum of Natural Sciences, Raleigh, NC, USA
- Department of Biological and Biomedical Sciences, North Carolina Central University, Durham, NC, USA
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, USA
- Department of Evolutionary Anthropology, Duke University, Durham, NC, USA
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | | | - David Juan
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
| | | | - Joshua G Schraiber
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
| | | | - Fabrício Bertuol
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Brazil
| | - Hazel Byrne
- Department of Anthropology, University of Utah, Salt Lake City, UT, USA
| | | | - Izeni Farias
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Brazil
| | - João Valsecchi
- Research Group on Terrestrial Vertebrate Ecology, Mamirauá Institute for Sustainable Development, Tefé, Brazil
- Rede de Pesquisa em Diversidade, Conservação e Uso da Fauna da Amazônia - RedeFauna, Manaus, Brazil
- Comunidad de Manejo de Fauna Silvestre en la Amazonía y en Latinoamérica-ComFauna, Iquitos, Peru
| | - Malu Messias
- Universidade Federal de Rondônia, Porto Velho, Brazil
| | | | - Mihir Trivedi
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India
| | - Rogerio Rossi
- Instituto de Biociências, Universidade Federal do Mato Grosso, Cuiabá, Brazil
| | - Tomas Hrbek
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Brazil
- Department of Biology, Trinity University, San Antonio, TX, USA
| | - Nicole Andriaholinirina
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
| | - Clément J Rabarivola
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
| | - Alphonse Zaramody
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
| | - Clifford J Jolly
- Department of Anthropology, New York University, New York, NY, USA
| | - Jane Phillips-Conroy
- Department of Neuroscience, Washington University School of Medicine in St Louis, St Louis, MO, USA
| | - Gregory Wilkerson
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop, TX, USA
| | - Christian Abee
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop, TX, USA
| | - Joe H Simmons
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop, TX, USA
| | | | - Sree Kanthaswamy
- School of Interdisciplinary Forensics, Arizona State University, Phoenix, AZ, USA
- California National Primate Research Center, University of California, Davis, CA, USA
| | - Fekadu Shiferaw
- Guinea Worm Eradication Program, The Carter Center Ethiopia, Addis Ababa, Ethiopia
| | - Dongdong Wu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Long Zhou
- Center for Evolutionary and Organismal Biology, Zhejiang University School of Medicine, Hangzhou, China
| | - Yong Shao
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Guojie Zhang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Center for Evolutionary and Organismal Biology, Zhejiang University School of Medicine, Hangzhou, China
- Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou, China
- Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Julius D Keyyu
- Tanzania Wildlife Research Institute (TAWIRI), Arusha, Tanzania
| | - Sascha Knauf
- Institute of International Animal Health/One Health, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Greifswald-Insel Riems, Germany
- Professorship for International Animal Health/One Health, Faculty of Veterinary Medicine, Justus Liebig University, Giessen, Germany
| | - Minh D Le
- Department of Environmental Ecology, Faculty of Environmental Sciences, University of Science and Central Institute for Natural Resources and Environmental Studies, Vietnam National University, Hanoi, Vietnam
| | - Esther Lizano
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Stefan Merker
- Department of Zoology, State Museum of Natural History Stuttgart, Stuttgart, Germany
| | - Arcadi Navarro
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Barcelonaβeta Brain Research Center, Pasqual Maragall Foundation, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
| | - Tilo Nadler
- Cuc Phuong Commune, Nho Quan District, Vietnam
| | - Chiea Chuen Khor
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
| | | | - Patrick Tan
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM), Singapore, Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, Singapore
| | - Weng Khong Lim
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM), Singapore, Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, Singapore
- SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Singapore
| | - Andrew C Kitchener
- Department of Natural Sciences, National Museums Scotland, Edinburgh, UK
- School of Geosciences, Edinburgh, UK
| | - Dietmar Zinner
- Cognitive Ethology Laboratory, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany
- Department of Primate Cognition, Georg-August-Universität Göttingen, Göttingen, Germany
- Leibniz ScienceCampus Primate Cognition, Göttingen, Germany
| | - Ivo Gut
- Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
| | - Amanda D Melin
- Department of Anthropology and Archaeology, University of Calgary, Calgary, Alberta, Canada
- Department of Medical Genetics, University of Calgary, Calgary, Alberta, Canada
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary, Alberta, Canada
| | - Katerina Guschanski
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, Uppsala, Sweden
- Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
| | | | - Robin M D Beck
- School of Science, Engineering and Environment, University of Salford, Salford, UK
| | - Ioannis Karakikes
- Cardiovascular Institute, Stanford University, Stanford, CA, USA
- Department of Cardiothoracic Surgery, Stanford University, Stanford, CA, USA
| | - Kevin C Wang
- Department of Cancer Biology, Stanford University, Stanford, CA, USA
- Department of Dermatology, Stanford University School of Medicine, Stanford, CA, USA
- Veterans Affairs Palo Alto Healthcare System, Palo Alto, CA, USA
| | - Govindhaswamy Umapathy
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India
| | - Christian Roos
- Gene Bank of Primates and Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany
| | - Jean P Boubli
- School of Science, Engineering and Environment, University of Salford, Salford, UK
| | - Adam Siepel
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Anshul Kundaje
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Kerstin Lindblad-Toh
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jeffrey Rogers
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
| | - Tomas Marques Bonet
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
| | - Kyle Kai-How Farh
- Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
| |
|
24
|
van der Sande M, Frölich S, Schäfers T, Smits JG, Snabel RR, Rinzema S, van Heeringen SJ. Seq2science: an end-to-end workflow for functional genomics analysis. PeerJ 2023; 11:e16380. [PMID: 38025697 PMCID: PMC10656911 DOI: 10.7717/peerj.16380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 10/09/2023] [Indexed: 12/01/2023] Open
Abstract
Sequencing databases contain enormous amounts of functional genomics data, making them an extensive resource for genome-scale analysis. Reanalyzing publicly available data, and integrating it with new, project-specific data sets, can be invaluable. With current technologies, genomic experiments have become feasible for virtually any species of interest. However, using and integrating this data comes with its challenges, such as standardized and reproducible analysis. Seq2science is a multi-purpose workflow that covers preprocessing, quality control, visualization, and analysis of functional genomics sequencing data. It facilitates the downloading of sequencing data from all major databases, including NCBI SRA, EBI ENA, DDBJ, GSA, and ENCODE. Furthermore, it automates the retrieval of any genome assembly available from Ensembl, NCBI, and UCSC. It has been tested on a variety of species, and includes diverse workflows such as ATAC-, RNA-, and ChIP-seq. It consists of both generic as well as advanced steps, such as differential gene expression or peak accessibility analysis and differential motif analysis. Seq2science is built on the Snakemake workflow language and thus can be run on a range of computing infrastructures. It is available at https://github.com/vanheeringen-lab/seq2science.
Affiliation(s)
- Maarten van der Sande
- Molecular Developmental Biology, Radboud University Nijmegen, Nijmegen, the Netherlands
| | - Siebren Frölich
- Molecular Developmental Biology, Radboud University Nijmegen, Nijmegen, the Netherlands
| | - Tilman Schäfers
- Molecular Developmental Biology, Radboud University Nijmegen, Nijmegen, the Netherlands
| | - Jos G.A. Smits
- Molecular Developmental Biology, Radboud University Nijmegen, Nijmegen, the Netherlands
| | - Rebecca R. Snabel
- Molecular Developmental Biology, Radboud University Nijmegen, Nijmegen, the Netherlands
| | - Sybren Rinzema
- Molecular Developmental Biology, Radboud University Nijmegen, Nijmegen, the Netherlands
| | | |
|
25
|
Gschwind AR, Mualim KS, Karbalayghareh A, Sheth MU, Dey KK, Jagoda E, Nurtdinov RN, Xi W, Tan AS, Jones H, Ma XR, Yao D, Nasser J, Avsec Ž, James BT, Shamim MS, Durand NC, Rao SSP, Mahajan R, Doughty BR, Andreeva K, Ulirsch JC, Fan K, Perez EM, Nguyen TC, Kelley DR, Finucane HK, Moore JE, Weng Z, Kellis M, Bassik MC, Price AL, Beer MA, Guigó R, Stamatoyannopoulos JA, Lieberman Aiden E, Greenleaf WJ, Leslie CS, Steinmetz LM, Kundaje A, Engreitz JM. An encyclopedia of enhancer-gene regulatory interactions in the human genome. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.11.09.563812. [PMID: 38014075 PMCID: PMC10680627 DOI: 10.1101/2023.11.09.563812] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/29/2023]
Abstract
Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the impact of human genetic variation on disease [1-6]. Here we create and evaluate a resource of >13 million enhancer-gene regulatory interactions across 352 cell types and tissues, by integrating predictive models, measurements of chromatin state and 3D contacts, and large-scale genetic perturbations generated by the ENCODE Consortium [7]. We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,411 element-gene pairs measured in CRISPR perturbation experiments, >30,000 fine-mapped eQTLs, and 569 fine-mapped GWAS variants linked to a likely causal gene. Using this framework, we develop a new predictive model, ENCODE-rE2G, that achieves state-of-the-art performance across multiple prediction tasks, demonstrating a strategy involving iterative perturbations and supervised machine learning to build increasingly accurate predictive models of enhancer regulation. Using the ENCODE-rE2G model, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, which reveals global properties of enhancer networks, identifies differences in the functions of genes that have more or less complex regulatory landscapes, and improves analyses to link noncoding variants to target genes and cell types for common, complex diseases. By interpreting the model, we find evidence that, beyond enhancer activity and 3D enhancer-promoter contacts, additional features guide enhancer-promoter communication including promoter class and enhancer-enhancer synergy. Altogether, these genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models, and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.
Affiliation(s)
- Andreas R. Gschwind
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | - Kristy S. Mualim
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Plant Biology, Carnegie Institute of Science, Stanford, CA, USA
| | - Alireza Karbalayghareh
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Maya U. Sheth
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
- Department of Bioengineering, Stanford University, Stanford, CA, USA
| | - Kushal K. Dey
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Evelyn Jagoda
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Ramil N. Nurtdinov
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
| | - Wang Xi
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Anthony S. Tan
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | - Hank Jones
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | - X. Rosa Ma
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | - David Yao
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Joseph Nasser
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Present Address: Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | | | - Benjamin T. James
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Muhammad S. Shamim
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Bioengineering, Rice University, Houston, TX, USA
- Medical Scientist Training Program, Baylor College of Medicine, Houston, Texas, USA
| | - Neva C. Durand
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Computer Science, Rice University, Houston, TX, USA
| | - Suhas S. P. Rao
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Department of Medicine, University of California San Francisco, San Francisco, CA, USA
- Department of Structural Biology, Stanford University, Stanford, CA, USA
| | - Ragini Mahajan
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Biosciences, Rice University, Houston, TX, USA
| | - Benjamin R. Doughty
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Kalina Andreeva
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Jacob C. Ulirsch
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Present Address: Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA
| | - Kaili Fan
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Present Address: Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, USA
| | | | - Tri C. Nguyen
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
| | | | - Hilary K. Finucane
- Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jill E. Moore
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Manolis Kellis
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Michael C. Bassik
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Alkes L. Price
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Michael A. Beer
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Roderic Guigó
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
| | - John A. Stamatoyannopoulos
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Clinical Research Division, Fred Hutch Cancer Center, Seattle, WA, USA
| | - Erez Lieberman Aiden
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Computer Science, Rice University, Houston, TX, USA
| | - William J. Greenleaf
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Applied Physics, Stanford University, Stanford, CA, USA
| | | | - Lars M. Steinmetz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Genome Technology Center, Palo Alto, CA, USA
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
| | - Anshul Kundaje
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Jesse M. Engreitz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA
| |
|
26
|
Bosten JM, Lawrance-Owen AJ, Bargary G, Goodbourn PT, Mollon JD. 13q32.1 as a candidate region for physiological anisocoria. Br J Ophthalmol 2023; 107:1730-1735. [PMID: 35273018 DOI: 10.1136/bjophthalmol-2021-319936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2021] [Accepted: 02/15/2022] [Indexed: 11/03/2022]
Abstract
BACKGROUND: Physiological anisocoria is an asymmetry of pupil size in the absence of pathology. METHODS: Images of the pupils under standard illumination were collected in the course of a whole-genome association study of a range of visual functions in 1060 healthy adults. DNA for each participant was extracted from saliva samples. RESULTS: We found no relationship between anisocoria and the difference in refraction between the eyes, nor between anisocoria and difference in acuity. There was a small but significant relationship with lightness of the iris, in that the eye with the smaller pupil was associated with the lighter iris. There was a strong association between anisocoria and a local region of chromosome 13 (13q32.1), a region lying between the genes GPR180 and SOX21. The strongest association was with the single-nucleotide polymorphism rs9524583. CONCLUSION: The very specific region associated with anisocoria is one where microdeletions (or microduplications) are known to lead to abnormal development of pupil dilator muscle and hence to the autosomal dominant condition of microcoria. It is possible that alterations at 13q32.1 act by altering the expression of SOX21, which encodes a nuclear transcription factor.
Affiliation(s)
- Jenny M Bosten
- School of Psychology, University of Sussex, Brighton, UK
| | | | - Gary Bargary
- Department of Psychology, University of Cambridge, Cambridge, UK
| | - Patrick T Goodbourn
- School of Psychology, University of Melbourne, Melbourne, Victoria, Australia
| | - John D Mollon
- Department of Psychology, University of Cambridge, Cambridge, UK
| |
|
27
|
Uttley K, Papanastasiou AS, Lahne M, Brisbane JM, MacDonald RB, Bickmore WA, Bhatia S. Unique activities of two overlapping PAX6 retinal enhancers. Life Sci Alliance 2023; 6:e202302126. [PMID: 37643867 PMCID: PMC10465922 DOI: 10.26508/lsa.202302126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2023] [Revised: 08/16/2023] [Accepted: 08/17/2023] [Indexed: 08/31/2023] Open
Abstract
Enhancers play a critical role in development by precisely modulating spatial, temporal, and cell type-specific gene expression. Sequence variants in enhancers have been implicated in diseases; however, establishing the functional consequences of these variants is challenging because of a lack of understanding of precise cell types and developmental stages where the enhancers are normally active. PAX6 is the master regulator of eye development, with a regulatory landscape containing multiple enhancers driving the expression in the eye. Whether these enhancers perform additive, redundant or distinct functions is unknown. Here, we describe the precise cell types and regulatory activity of two PAX6 retinal enhancers, HS5 and NRE. Using a unique combination of live imaging and single-cell RNA sequencing in dual enhancer-reporter zebrafish embryos, we uncover differences in the spatiotemporal activity of these enhancers. Our results show that although overlapping, these enhancers have distinct activities in different cell types and therefore likely nonredundant functions. This work demonstrates that unique cell type-specific activities can be uncovered for apparently similar enhancers when investigated at high resolution in vivo.
Affiliation(s)
- Kirsty Uttley
- https://ror.org/011jsc803 MRC Human Genetics Unit, https://ror.org/01nrxwf90 Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
| | - Andrew S Papanastasiou
- https://ror.org/011jsc803 MRC Human Genetics Unit, https://ror.org/01nrxwf90 Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
| | - Manuela Lahne
- https://ror.org/02jx3x895 UCL Institute of Ophthalmology, University College London, Greater London, UK
| | - Jennifer M Brisbane
- https://ror.org/011jsc803 MRC Human Genetics Unit, https://ror.org/01nrxwf90 Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
| | - Ryan B MacDonald
- https://ror.org/02jx3x895 UCL Institute of Ophthalmology, University College London, Greater London, UK
| | - Wendy A Bickmore
- https://ror.org/011jsc803 MRC Human Genetics Unit, https://ror.org/01nrxwf90 Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
| | - Shipra Bhatia
- https://ror.org/011jsc803 MRC Human Genetics Unit, https://ror.org/01nrxwf90 Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
| |
|
28
|
Grillo G, Keshavarzian T, Linder S, Arlidge C, Mout L, Nand A, Teng M, Qamra A, Zhou S, Kron KJ, Murison A, Hawley JR, Fraser M, van der Kwast TH, Raj GV, He HH, Zwart W, Lupien M. Transposable Elements Are Co-opted as Oncogenic Regulatory Elements by Lineage-Specific Transcription Factors in Prostate Cancer. Cancer Discov 2023; 13:2470-2487. [PMID: 37694973 PMCID: PMC10618745 DOI: 10.1158/2159-8290.cd-23-0331] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/21/2023] [Revised: 07/30/2023] [Accepted: 09/08/2023] [Indexed: 09/12/2023]
Abstract
Transposable elements hold regulatory functions that impact cell fate determination by controlling gene expression. However, little is known about the transcriptional machinery engaged at transposable elements in pluripotent and mature versus oncogenic cell states. Through positional analysis over repetitive DNA sequences of H3K27ac chromatin immunoprecipitation sequencing data from 32 normal cell states, we report pluripotent/stem and mature cell state-specific "regulatory transposable elements." Pluripotent/stem elements are binding sites for pluripotency factors (e.g., NANOG, SOX2, OCT4). Mature cell elements are docking sites for lineage-specific transcription factors, including AR and FOXA1 in prostate epithelium. Expanding the analysis to prostate tumors, we identify a subset of regulatory transposable elements shared with pluripotent/stem cells, including Tigger3a. Using chromatin editing technology, we show how such elements promote prostate cancer growth by regulating AR transcriptional activity. Collectively, our results suggest that oncogenesis arises from lineage-specific transcription factors hijacking pluripotent/stem cell regulatory transposable elements. SIGNIFICANCE: We show that oncogenesis relies on co-opting transposable elements from pluripotent stem cells as regulatory elements altering the recruitment of lineage-specific transcription factors. We further discover how co-option is dependent on active chromatin states with important implications for developing treatment options against drivers of oncogenesis across the repetitive DNA. This article is featured in Selected Articles from This Issue, p. 2293.
Affiliation(s)
- Giacomo Grillo
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Tina Keshavarzian
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
| | - Simon Linder
- Division of Oncogenomics, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, the Netherlands
| | - Christopher Arlidge
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Lisanne Mout
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Ankita Nand
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Mona Teng
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
| | - Aditi Qamra
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Stanley Zhou
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
| | - Ken J. Kron
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Alex Murison
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - James R. Hawley
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
| | - Michael Fraser
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Theodorus H. van der Kwast
- Laboratory Medicine Program, Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Ganesh V. Raj
- Department of Urology, University of Texas Southwestern Medical Center, Dallas, Texas
| | - Housheng Hansen He
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
| | - Wilbert Zwart
- Division of Oncogenomics, Oncode Institute, The Netherlands Cancer Institute, Amsterdam, the Netherlands
- Laboratory of Chemical Biology and Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, the Netherlands
| | - Mathieu Lupien
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| |
|
29
|
Baca SC, Seo JH, Davidsohn MP, Fortunato B, Semaan K, Sotudian S, Lakshminarayanan G, Diossy M, Qiu X, El Zarif T, Savignano H, Canniff J, Madueke I, Saliby RM, Zhang Z, Li R, Jiang Y, Taing L, Awad M, Chau CH, DeCaprio JA, Figg WD, Greten TF, Hata AN, Hodi FS, Hughes ME, Ligon KL, Lin N, Ng K, Oser MG, Meador C, Parsons HA, Pomerantz MM, Rajan A, Ritz J, Thakuria M, Tolaney SM, Wen PY, Long H, Berchuck JE, Szallasi Z, Choueiri TK, Freedman ML. Liquid biopsy epigenomic profiling for cancer subtyping. Nat Med 2023; 29:2737-2741. [PMID: 37865722 PMCID: PMC10695830 DOI: 10.1038/s41591-023-02605-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/15/2023] [Accepted: 09/21/2023] [Indexed: 10/23/2023]
Abstract
Although circulating tumor DNA (ctDNA) assays are increasingly used to inform clinical decisions in cancer care, they have limited ability to identify the transcriptional programs that govern cancer phenotypes and their dynamic changes during the course of disease. To address these limitations, we developed a method for comprehensive epigenomic profiling of cancer from 1 ml of patient plasma. Using an immunoprecipitation-based approach targeting histone modifications and DNA methylation, we measured 1,268 epigenomic profiles in plasma from 433 individuals with one of 15 cancers. Our assay provided a robust proxy for transcriptional activity, allowing us to infer the expression levels of diagnostic markers and drug targets, measure the activity of therapeutically targetable transcription factors and detect epigenetic mechanisms of resistance. This proof-of-concept study in advanced cancers shows how plasma epigenomic profiling has the potential to unlock clinically actionable information that is currently accessible only via direct tissue sampling.
Affiliation(s)
- Sylvan C Baca
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
- Eli and Edythe L. Broad Institute, Cambridge, MA, USA
| | - Ji-Heui Seo
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Matthew P Davidsohn
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Brad Fortunato
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
- Eli and Edythe L. Broad Institute, Cambridge, MA, USA
| | - Karl Semaan
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Eli and Edythe L. Broad Institute, Cambridge, MA, USA
| | - Shahabbedin Sotudian
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
- Eli and Edythe L. Broad Institute, Cambridge, MA, USA
| | - Gitanjali Lakshminarayanan
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Miklos Diossy
- Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA
| | - Xintao Qiu
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Talal El Zarif
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Hunter Savignano
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - John Canniff
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Ikenna Madueke
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Renee Maria Saliby
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Ziwei Zhang
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
- Eli and Edythe L. Broad Institute, Cambridge, MA, USA
| | - Rong Li
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Yijia Jiang
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Len Taing
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Mark Awad
- Lowe Center for Thoracic Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
| | - Cindy H Chau
- Molecular Pharmacology Section, Genitourinary Malignancies Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - James A DeCaprio
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - William D Figg
- Molecular Pharmacology Section, Genitourinary Malignancies Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Tim F Greten
- Liver Cancer Program, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Aaron N Hata
- Massachusetts General Hospital Cancer Center, Boston, MA, USA
- Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - F Stephen Hodi
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Melissa E Hughes
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Keith L Ligon
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Pathology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Nancy Lin
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Kimmie Ng
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Matthew G Oser
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Catherine Meador
- Massachusetts General Hospital Cancer Center, Boston, MA, USA
- Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Heather A Parsons
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Mark M Pomerantz
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Arun Rajan
- Thoracic and Gastrointestinal Malignancies Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Jerome Ritz
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Manisha Thakuria
- Department of Dermatology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Center for Cutaneous Oncology, Dana-Farber/Brigham and Women's Cancer Center, Boston, MA, USA
| | - Sara M Tolaney
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Patrick Y Wen
- Center for Neuro-Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Neurology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Henry Long
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Jacob E Berchuck
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Zoltan Szallasi
- Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA
- Danish Cancer Institute, Copenhagen, Denmark
- Department of Bioinformatics and Department of Pathology, Forensic and Insurance Medicine, Semmelweis University, Budapest, Hungary
| | - Toni K Choueiri
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Matthew L Freedman
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA.
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA.
- Eli and Edythe L. Broad Institute, Cambridge, MA, USA.
|
30
|
Mostafavi H, Spence JP, Naqvi S, Pritchard JK. Systematic differences in discovery of genetic effects on gene expression and complex traits. Nat Genet 2023; 55:1866-1875. [PMID: 37857933 DOI: 10.1038/s41588-023-01529-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 05/15/2022] [Accepted: 09/14/2023] [Indexed: 10/21/2023]
Abstract
Most signals in genome-wide association studies (GWAS) of complex traits implicate noncoding genetic variants with putative gene regulatory effects. However, currently identified regulatory variants, notably expression quantitative trait loci (eQTLs), explain only a small fraction of GWAS signals. Here, we show that GWAS and cis-eQTL hits are systematically different: eQTLs cluster strongly near transcription start sites, whereas GWAS hits do not. Genes near GWAS hits are enriched in key functional annotations, are under strong selective constraint and have complex regulatory landscapes across different tissue/cell types, whereas genes near eQTLs are depleted of most functional annotations, show relaxed constraint, and have simpler regulatory landscapes. We describe a model to understand these observations, including how natural selection on complex traits hinders discovery of functionally relevant eQTLs. Our results imply that GWAS and eQTL studies are systematically biased toward different types of variant, and support the use of complementary functional approaches alongside the next generation of eQTL studies.
Affiliation(s)
| | | | - Sahin Naqvi
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA, USA
| | - Jonathan K Pritchard
- Department of Genetics, Stanford University, Stanford, CA, USA.
- Department of Biology, Stanford University, Stanford, CA, USA.
|
31
|
Abatti LE, Lado-Fernández P, Huynh L, Collado M, Hoffman MM, Mitchell JA. Epigenetic reprogramming of a distal developmental enhancer cluster drives SOX2 overexpression in breast and lung adenocarcinoma. Nucleic Acids Res 2023; 51:10109-10131. [PMID: 37738673 PMCID: PMC10602899 DOI: 10.1093/nar/gkad734] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/09/2023] [Revised: 08/18/2023] [Accepted: 08/24/2023] [Indexed: 09/24/2023] Open
Abstract
Enhancer reprogramming has been proposed as a key source of transcriptional dysregulation during tumorigenesis, but the molecular mechanisms underlying this process remain unclear. Here, we identify an enhancer cluster required for normal development that is aberrantly activated in breast and lung adenocarcinoma. Deletion of the SRR124-134 cluster disrupts expression of the SOX2 oncogene, dysregulates genome-wide transcription and chromatin accessibility and reduces the ability of cancer cells to form colonies in vitro. Analysis of primary tumors reveals a correlation between chromatin accessibility at this cluster and SOX2 overexpression in breast and lung cancer patients. We demonstrate that FOXA1 is an activator and NFIB is a repressor of SRR124-134 activity and SOX2 transcription in cancer cells, revealing a co-opting of the regulatory mechanisms involved in early development. Notably, we show that the conserved SRR124 and SRR134 regions are essential during mouse development, where homozygous deletion results in the lethal failure of esophageal-tracheal separation. These findings provide insights into how developmental enhancers can be reprogrammed during tumorigenesis and underscore the importance of understanding enhancer dynamics during development and disease.
Affiliation(s)
- Luis E Abatti
- Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
| | - Patricia Lado-Fernández
- Laboratory of Cell Senescence, Cancer and Aging, Health Research Institute of Santiago de Compostela (IDIS), Xerencia de Xestión Integrada de Santiago (XXIS/SERGAS), Santiago de Compostela, Spain
- Department of Physiology and Center for Research in Molecular Medicine and Chronic Diseases (CiMUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
| | - Linh Huynh
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Manuel Collado
- Laboratory of Cell Senescence, Cancer and Aging, Health Research Institute of Santiago de Compostela (IDIS), Xerencia de Xestión Integrada de Santiago (XXIS/SERGAS), Santiago de Compostela, Spain
| | - Michael M Hoffman
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
| | - Jennifer A Mitchell
- Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
- Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
|
32
|
Nair S, Ameen M, Sundaram L, Pampari A, Schreiber J, Balsubramani A, Wang YX, Burns D, Blau HM, Karakikes I, Wang KC, Kundaje A. Transcription factor stoichiometry, motif affinity and syntax regulate single-cell chromatin dynamics during fibroblast reprogramming to pluripotency. bioRxiv 2023:2023.10.04.560808. [PMID: 37873116 PMCID: PMC10592962 DOI: 10.1101/2023.10.04.560808] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 10/25/2023]
Abstract
Ectopic expression of OCT4, SOX2, KLF4 and MYC (OSKM) transforms differentiated cells into induced pluripotent stem cells. To refine our mechanistic understanding of reprogramming, especially during the early stages, we profiled chromatin accessibility and gene expression at single-cell resolution across a densely sampled time course of human fibroblast reprogramming. Using neural networks that map DNA sequence to ATAC-seq profiles at base-resolution, we annotated cell-state-specific predictive transcription factor (TF) motif syntax in regulatory elements, inferred affinity- and concentration-dependent dynamics of Tn5-bias corrected TF footprints, linked peaks to putative target genes, and elucidated rewiring of TF-to-gene cis-regulatory networks. Our models reveal that early in reprogramming, OSK, at supraphysiological concentrations, rapidly open transient regulatory elements by occupying non-canonical low-affinity binding sites. As OSK concentration falls, the accessibility of these transient elements decays as a function of motif affinity. We find that these OSK-dependent transient elements sequester the somatic TF AP-1. This redistribution is strongly associated with the silencing of fibroblast-specific genes within individual nuclei. Together, our integrated single-cell resource and models reveal insights into the cis-regulatory code of reprogramming at unprecedented resolution, connect TF stoichiometry and motif syntax to diversification of cell fate trajectories, and provide new perspectives on the dynamics and role of transient regulatory elements in somatic silencing.
Affiliation(s)
- Surag Nair
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Mohamed Ameen
- Department of Cancer Biology, Stanford University, Stanford, CA, USA
- Cardiovascular Institute, Stanford University, Stanford, CA, USA
- Department of Dermatology, Stanford University, Stanford, CA, USA
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
| | | | - Anusri Pampari
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Jacob Schreiber
- Department of Genetics, Stanford University, Stanford, CA, USA
| | | | - Yu Xin Wang
- Baxter Laboratory for Stem Cell Biology, Stanford University, Stanford, CA, USA
| | - David Burns
- Baxter Laboratory for Stem Cell Biology, Stanford University, Stanford, CA, USA
| | - Helen M Blau
- Baxter Laboratory for Stem Cell Biology, Stanford University, Stanford, CA, USA
- Department of Microbiology and Immunology, Stanford University, Stanford, CA, USA
| | - Ioannis Karakikes
- Cardiovascular Institute, Stanford University, Stanford, CA, USA
- Department of Cardiothoracic Surgery, Stanford University, Stanford, CA, USA
| | - Kevin C Wang
- Department of Dermatology, Stanford University, Stanford, CA, USA
- Program in Epithelial Biology, Stanford University, Stanford, CA, USA
- Veterans Affairs Palo Alto Healthcare System, Palo Alto, CA, USA
| | - Anshul Kundaje
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
|
33
|
Li YE, Preissl S, Miller M, Johnson ND, Wang Z, Jiao H, Zhu C, Wang Z, Xie Y, Poirion O, Kern C, Pinto-Duarte A, Tian W, Siletti K, Emerson N, Osteen J, Lucero J, Lin L, Yang Q, Zhu Q, Zemke N, Espinoza S, Yanny AM, Nyhus J, Dee N, Casper T, Shapovalova N, Hirschstein D, Hodge RD, Linnarsson S, Bakken T, Levi B, Keene CD, Shang J, Lein E, Wang A, Behrens MM, Ecker JR, Ren B. A comparative atlas of single-cell chromatin accessibility in the human brain. Science 2023; 382:eadf7044. [PMID: 37824643 PMCID: PMC10852054 DOI: 10.1126/science.adf7044] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 11/09/2022] [Accepted: 09/14/2023] [Indexed: 10/14/2023]
Abstract
Recent advances in single-cell transcriptomics have illuminated the diverse neuronal and glial cell types within the human brain. However, the regulatory programs governing cell identity and function remain unclear. Using a single-nucleus assay for transposase-accessible chromatin using sequencing (snATAC-seq), we explored open chromatin landscapes across 1.1 million cells in 42 brain regions from three adults. Integrating this data unveiled 107 distinct cell types and their specific utilization of 544,735 candidate cis-regulatory DNA elements (cCREs) in the human genome. Nearly a third of the cCREs demonstrated conservation and chromatin accessibility in the mouse brain cells. We reveal strong links between specific brain cell types and neuropsychiatric disorders including schizophrenia, bipolar disorder, Alzheimer's disease (AD), and major depression, and have developed deep learning models to predict the regulatory roles of noncoding risk variants in these disorders.
Affiliation(s)
- Yang Eric Li
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA
| | - Sebastian Preissl
- Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA 92093, USA
| | - Michael Miller
- Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA 92093, USA
| | | | - Zihan Wang
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA
| | - Henry Jiao
- Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA 92093, USA
| | - Chenxu Zhu
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA
| | - Zhaoning Wang
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA
| | - Yang Xie
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA
| | - Olivier Poirion
- Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA 92093, USA
| | - Colin Kern
- Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA 92093, USA
| | | | - Wei Tian
- The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
| | - Kimberly Siletti
- Division of Molecular Neurobiology, Department of Medical Biochemistry and Biophysics, Karolinska Institute, 171 77 Stockholm, Sweden
| | - Nora Emerson
- The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
| | - Julia Osteen
- The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
| | - Jacinta Lucero
- The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
| | - Lin Lin
- Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA 92093, USA
| | - Qian Yang
- Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA 92093, USA
| | - Quan Zhu
- Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA 92093, USA
| | - Nathan Zemke
- Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA 92093, USA
| | - Sarah Espinoza
- Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA 92093, USA
| | | | - Julie Nyhus
- Allen Institute for Brain Science, Seattle, WA 98109, USA
| | - Nick Dee
- Allen Institute for Brain Science, Seattle, WA 98109, USA
| | - Tamara Casper
- Allen Institute for Brain Science, Seattle, WA 98109, USA
| | | | | | | | - Sten Linnarsson
- Division of Molecular Neurobiology, Department of Medical Biochemistry and Biophysics, Karolinska Institute, 171 77 Stockholm, Sweden
| | - Trygve Bakken
- Allen Institute for Brain Science, Seattle, WA 98109, USA
| | - Boaz Levi
- Allen Institute for Brain Science, Seattle, WA 98109, USA
| | - C Dirk Keene
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98104, USA
| | - Jingbo Shang
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA
| | - Ed Lein
- Allen Institute for Brain Science, Seattle, WA 98109, USA
| | - Allen Wang
- Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA 92093, USA
| | | | - Joseph R Ecker
- The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
| | - Bing Ren
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Center for Epigenomics, University of California San Diego, School of Medicine, La Jolla, CA 92093, USA
|
34
|
He J, Wen W, Ping J, Li Q, Chen Z, Perera D, Shu X, Long J, Cai Q, Shu XO, Zheng W, Long Q, Guo X. Enhancing Disease Risk Gene Discovery by Integrating Transcription Factor-Linked Trans-located Variants into Transcriptome-Wide Association Analyses. medRxiv 2023:2023.10.10.23295443. [PMID: 37873299 PMCID: PMC10593059 DOI: 10.1101/2023.10.10.23295443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Transcriptome-wide association studies (TWAS) have been successful in identifying putative disease susceptibility genes by integrating gene expression predictions with genome-wide association studies (GWAS) data. However, current TWAS models only consider cis-located variants to predict gene expression. Here, we introduce transTF-TWAS, which includes transcription factor (TF)-linked trans-located variants for model building. Using data from the Genotype-Tissue Expression project, we predicted alternative splicing and gene expression and applied these models to large GWAS datasets for breast, prostate, and lung cancers. Our analysis revealed 887 putative cancer susceptibility genes, including 465 in regions not yet reported by previous GWAS and 137 in known GWAS loci but not previously reported, at Bonferroni-corrected P < 0.05. We demonstrate that transTF-TWAS surpasses other approaches in both building gene prediction models and identifying disease-associated genes. These results have shed new light on several genetically driven key regulators and their associated regulatory networks underlying disease susceptibility.
|
35
|
de Langen P, Hammal F, Guéret E, Mouren JC, Spinelli L, Ballester B. Characterizing intergenic transcription at RNA polymerase II binding sites in normal and cancer tissues. Cell Genomics 2023; 3:100411. [PMID: 37868033 PMCID: PMC10589727 DOI: 10.1016/j.xgen.2023.100411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Revised: 06/29/2023] [Accepted: 09/04/2023] [Indexed: 10/24/2023]
Abstract
Intergenic transcription in normal and cancerous tissues is pervasive but incompletely understood. To investigate this, we constructed an atlas of over 180,000 consensus RNA polymerase II (RNAPII)-bound intergenic regions from 900 RNAPII chromatin immunoprecipitation sequencing (ChIP-seq) experiments in normal and cancer samples. Through unsupervised analysis, we identified 51 RNAPII consensus clusters, many of which mapped to specific biotypes and revealed tissue-specific regulatory signatures. We developed a meta-clustering methodology to integrate our RNAPII atlas with active transcription across 28,797 RNA sequencing (RNA-seq) samples from The Cancer Genome Atlas (TCGA), Genotype-Tissue Expression (GTEx), and Encyclopedia of DNA Elements (ENCODE). This analysis revealed strong tissue- and disease-specific interconnections between RNAPII occupancy and transcriptional activity. We demonstrate that intergenic transcription at RNAPII-bound regions is a novel per-cancer and pan-cancer biomarker. This biomarker displays genomic and clinically relevant characteristics, distinguishing cancer subtypes and linking to overall survival. Our results demonstrate the effectiveness of coherent data integration to uncover intergenic transcriptional activity in normal and cancer tissues.
Affiliation(s)
| | | | - Elise Guéret
- Aix Marseille Univ, INSERM, TAGC, Marseille, France
|
36
|
Yuan C, Tang L, Lopdell T, Petrov VA, Oget-Ebrad C, Moreira GCM, Gualdrón Duarte JL, Sartelet A, Cheng Z, Salavati M, Wathes DC, Crowe MA, Coppieters W, Littlejohn M, Charlier C, Druet T, Georges M, Takeda H. An organism-wide ATAC-seq peak catalog for the bovine and its use to identify regulatory variants. Genome Res 2023; 33:1848-1864. [PMID: 37751945 PMCID: PMC10691486 DOI: 10.1101/gr.277947.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2023] [Accepted: 09/19/2023] [Indexed: 09/28/2023]
Abstract
We report the generation of an organism-wide catalog of 976,813 cis-acting regulatory elements for the bovine detected by the assay for transposase accessible chromatin using sequencing (ATAC-seq). We group these regulatory elements into 16 components by nonnegative matrix factorization. Correlation between the genome-wide density of peaks and transcription start sites, correlation between peak accessibility and expression of neighboring genes, and enrichment in transcription factor binding motifs support their regulatory potential. Using a previously established catalog of 12,736,643 variants, we show that the proportion of single-nucleotide polymorphisms mapping to ATAC-seq peaks is higher than expected and that this is owing to an approximately 1.3-fold higher mutation rate within peaks. Their site frequency spectrum indicates that variants in ATAC-seq peaks are subject to purifying selection. We generate eQTL data sets for liver and blood and show that variants that drive eQTL fall into liver- and blood-specific ATAC-seq peaks more often than expected by chance. We combine ATAC-seq and eQTL data to estimate that the proportion of regulatory variants mapping to ATAC-seq peaks is approximately one in three and that the proportion of variants mapping to ATAC-seq peaks that are regulatory is approximately one in 25. We discuss the implications of these findings for the utility of ATAC-seq information to improve the accuracy of genomic selection.
Affiliation(s)
- Can Yuan
- Unit of Animal Genomics, GIGA-R and Faculty of Veterinary Medicine, University of Liège, 4000 Liège, Belgium
| | - Lijing Tang
- Unit of Animal Genomics, GIGA-R and Faculty of Veterinary Medicine, University of Liège, 4000 Liège, Belgium
| | - Thomas Lopdell
- Research and Development, Livestock Improvement Corporation, Hamilton 3240, New Zealand
| | - Vyacheslav A Petrov
- Unit of Animal Genomics, GIGA-R and Faculty of Veterinary Medicine, University of Liège, 4000 Liège, Belgium
| | - Claire Oget-Ebrad
- Unit of Animal Genomics, GIGA-R and Faculty of Veterinary Medicine, University of Liège, 4000 Liège, Belgium
| | | | - José Luis Gualdrón Duarte
- Unit of Animal Genomics, GIGA-R and Faculty of Veterinary Medicine, University of Liège, 4000 Liège, Belgium
| | - Arnaud Sartelet
- Clinical Department of Ruminant, University of Liège, 4000 Liège, Belgium
| | - Zhangrui Cheng
- Royal Veterinary College, Hatfield, Herts AL9 7TA, United Kingdom
| | - Mazdak Salavati
- Royal Veterinary College, Hatfield, Herts AL9 7TA, United Kingdom
| | - D Claire Wathes
- Royal Veterinary College, Hatfield, Herts AL9 7TA, United Kingdom
| | - Mark A Crowe
- School of Veterinary Medicine, University College Dublin, Dublin 4, Ireland
| | - Wouter Coppieters
- GIGA Genomics platform, GIGA Institute, University of Liège, 4000 Liège, Belgium
| | - Mathew Littlejohn
- Research and Development, Livestock Improvement Corporation, Hamilton 3240, New Zealand
| | - Carole Charlier
- Unit of Animal Genomics, GIGA-R and Faculty of Veterinary Medicine, University of Liège, 4000 Liège, Belgium
| | - Tom Druet
- Unit of Animal Genomics, GIGA-R and Faculty of Veterinary Medicine, University of Liège, 4000 Liège, Belgium
| | - Michel Georges
- Unit of Animal Genomics, GIGA-R and Faculty of Veterinary Medicine, University of Liège, 4000 Liège, Belgium
| | - Haruko Takeda
- Unit of Animal Genomics, GIGA-R and Faculty of Veterinary Medicine, University of Liège, 4000 Liège, Belgium
|
37
|
Vollger MR, Korlach J, Eldred KC, Swanson E, Underwood JG, Cheng YHH, Ranchalis J, Mao Y, Blue EE, Schwarze U, Munson KM, Saunders CT, Wenger AM, Allworth A, Chanprasert S, Duerden BL, Glass I, Horike-Pyne M, Kim M, Leppig KA, McLaughlin IJ, Ogawa J, Rosenthal EA, Sheppeard S, Sherman SM, Strohbehn S, Yuen AL, Reh TA, Byers PH, Bamshad MJ, Hisama FM, Jarvik GP, Sancak Y, Dipple KM, Stergachis AB. Synchronized long-read genome, methylome, epigenome, and transcriptome for resolving a Mendelian condition. bioRxiv 2023:2023.09.26.559521. [PMID: 37808736 PMCID: PMC10557686 DOI: 10.1101/2023.09.26.559521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2023]
Abstract
Resolving the molecular basis of a Mendelian condition (MC) remains challenging owing to the diverse mechanisms by which genetic variants cause disease. To address this, we developed a synchronized long-read genome, methylome, epigenome, and transcriptome sequencing approach, which enables accurate single-nucleotide, insertion-deletion, and structural variant calling and diploid de novo genome assembly, and permits the simultaneous elucidation of haplotype-resolved CpG methylation, chromatin accessibility, and full-length transcript information in a single long-read sequencing run. Application of this approach to an Undiagnosed Diseases Network (UDN) participant with a chromosome X;13 balanced translocation of uncertain significance revealed that this translocation disrupted the functioning of four separate genes (NBEA, PDK3, MAB21L1, and RB1) previously associated with single-gene MCs. Notably, the function of each gene was disrupted via a distinct mechanism that required integration of the four 'omes' to resolve. These included nonsense-mediated decay, fusion transcript formation, enhancer adoption, transcriptional readthrough silencing, and inappropriate X chromosome inactivation of autosomal genes. Overall, this highlights the utility of synchronized long-read multi-omic profiling for mechanistically resolving complex phenotypes.
Affiliation(s)
- Mitchell R. Vollger
- University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
- University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
| | | | - Kiara C. Eldred
- University of Washington School of Medicine, Department of Biological Structure, Seattle, WA, USA
| | - Elliott Swanson
- University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
| | | | - Yong-Han H. Cheng
- University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
| | - Jane Ranchalis
- University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
| | - Yizi Mao
- University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
| | - Elizabeth E. Blue
- University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
- Institute for Public Health Genetics, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
| | - Ulrike Schwarze
- University of Washington School of Medicine, Department of Laboratory Medicine and Pathology, Seattle, WA, USA
| | - Katherine M. Munson
- University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
| | | | | | - Aimee Allworth
- University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
| | - Sirisak Chanprasert
- University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
| | | | - Ian Glass
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- University of Washington, Department of Pediatrics, Seattle, WA, USA
| | - Martha Horike-Pyne
- University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
| | | | - Kathleen A. Leppig
- Genetic Services, Kaiser Permanente Washington, Seattle, Washington, USA
| | | | | | | | - Sam Sheppeard
- University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
| | - Stephanie M. Sherman
- University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
| | - Samuel Strohbehn
- University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
| | - Amy L. Yuen
- Genetic Services, Kaiser Permanente Washington, Seattle, Washington, USA
| | | | - Thomas A. Reh
- University of Washington School of Medicine, Department of Biological Structure, Seattle, WA, USA
| | - Peter H. Byers
- University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
- University of Washington School of Medicine, Department of Laboratory Medicine and Pathology, Seattle, WA, USA
| | - Michael J. Bamshad
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- University of Washington, Department of Pediatrics, Seattle, WA, USA
| | - Fuki M. Hisama
- University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
| | - Gail P. Jarvik
- University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
- University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
| | - Yasemin Sancak
- University of Washington School of Medicine, Department of Pharmacology, Seattle, WA, USA
| | - Katrina M. Dipple
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- University of Washington, Department of Pediatrics, Seattle, WA, USA
| | - Andrew B. Stergachis
- University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
- University of Washington School of Medicine, Department of Medicine, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
|
38
|
Pividori M, Lu S, Li B, Su C, Johnson ME, Wei WQ, Feng Q, Namjou B, Kiryluk K, Kullo IJ, Luo Y, Sullivan BD, Voight BF, Skarke C, Ritchie MD, Grant SFA, Greene CS. Projecting genetic associations through gene expression patterns highlights disease etiology and drug mechanisms. Nat Commun 2023; 14:5562. [PMID: 37689782 PMCID: PMC10492839 DOI: 10.1038/s41467-023-41057-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/05/2021] [Accepted: 08/18/2023] [Indexed: 09/11/2023] Open
Abstract
Genes act in concert with each other in specific contexts to perform their functions. Determining how these genes influence complex traits requires a mechanistic understanding of expression regulation across different conditions. It has been shown that this insight is critical for developing new therapies. Transcriptome-wide association studies have helped uncover the role of individual genes in disease-relevant mechanisms. However, modern models of the architecture of complex traits predict that gene-gene interactions play a crucial role in disease origin and progression. Here we introduce PhenoPLIER, a computational approach that maps gene-trait associations and pharmacological perturbation data into a common latent representation for a joint analysis. This representation is based on modules of genes with similar expression patterns across the same conditions. We observe that diseases are significantly associated with gene modules expressed in relevant cell types, and our approach is accurate in predicting known drug-disease pairs and inferring mechanisms of action. Furthermore, using a CRISPR screen to analyze lipid regulation, we find that functionally important players lack associations but are prioritized in trait-associated modules by PhenoPLIER. By incorporating groups of co-expressed genes, PhenoPLIER can contextualize genetic associations and reveal potential targets missed by single-gene strategies.
Affiliation(s)
- Milton Pividori
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Sumei Lu
  - Center for Spatial and Functional Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Binglan Li
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
- Chun Su
  - Center for Spatial and Functional Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Matthew E Johnson
  - Center for Spatial and Functional Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Wei-Qi Wei
  - Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Qiping Feng
  - Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Bahram Namjou
  - Cincinnati Children's Hospital Medical Center, Cincinnati, OH, 45229, USA
- Krzysztof Kiryluk
  - Department of Medicine, Division of Nephrology, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, 10032, USA
- Yuan Luo
  - Northwestern University, Chicago, IL, 60611, USA
- Blair D Sullivan
  - Kahlert School of Computing, University of Utah, Salt Lake City, UT, 84112, USA
- Benjamin F Voight
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Carsten Skarke
  - Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Marylyn D Ritchie
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Struan F A Grant
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Center for Spatial and Functional Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
  - Division of Endocrinology and Diabetes, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
  - Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
  - Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Casey S Greene
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA.
  - Center for Health AI, University of Colorado School of Medicine, Aurora, CO, 80045, USA.
|
39
|
Lundberg A, Zhang M, Aggarwal R, Li H, Zhang L, Foye A, Sjöström M, Chou J, Chang K, Moreno-Rodriguez T, Shrestha R, Baskin A, Zhu X, Weinstein AS, Younger N, Alumkal JJ, Beer TM, Chi KN, Evans CP, Gleave M, Lara PN, Reiter RE, Rettig MB, Witte ON, Wyatt AW, Feng FY, Small EJ, Quigley DA. The Genomic and Epigenomic Landscape of Double-Negative Metastatic Prostate Cancer. Cancer Res 2023; 83:2763-2774. [PMID: 37289025 PMCID: PMC10425725 DOI: 10.1158/0008-5472.can-23-0593] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 02/21/2023] [Revised: 04/20/2023] [Accepted: 06/02/2023] [Indexed: 06/09/2023]
Abstract
Systemic targeted therapy in prostate cancer is primarily focused on ablating androgen signaling. Androgen deprivation therapy and second-generation androgen receptor (AR)-targeted therapy selectively favor the development of treatment-resistant subtypes of metastatic castration-resistant prostate cancer (mCRPC), defined by AR and neuroendocrine (NE) markers. Molecular drivers of double-negative (AR-/NE-) mCRPC are poorly defined. In this study, we comprehensively characterized treatment-emergent mCRPC by integrating matched RNA sequencing, whole-genome sequencing, and whole-genome bisulfite sequencing from 210 tumors. AR-/NE- tumors were clinically and molecularly distinct from other mCRPC subtypes, with the shortest survival, amplification of the chromatin remodeler CHD7, and PTEN loss. Methylation changes in CHD7 candidate enhancers were linked to elevated CHD7 expression in AR-/NE+ tumors. Genome-wide methylation analysis nominated Krüppel-like factor 5 (KLF5) as a driver of the AR-/NE- phenotype, and KLF5 activity was linked to RB1 loss. These observations reveal the aggressiveness of AR-/NE- mCRPC and could facilitate the identification of therapeutic targets in this highly aggressive disease.
Significance: Comprehensive characterization of the five subtypes of metastatic castration-resistant prostate cancer identified transcription factors that drive each subtype and showed that the double-negative subtype has the worst prognosis.
Affiliation(s)
- Arian Lundberg
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Department of Radiation Oncology, University of California San Francisco, San Francisco, California
- Meng Zhang
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Department of Radiation Oncology, University of California San Francisco, San Francisco, California
- Rahul Aggarwal
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Division of Hematology and Oncology, Department of Medicine, University of California San Francisco, San Francisco, California
- Haolong Li
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Department of Radiation Oncology, University of California San Francisco, San Francisco, California
- Li Zhang
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Adam Foye
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Department of Radiation Oncology, University of California San Francisco, San Francisco, California
- Martin Sjöström
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Department of Radiation Oncology, University of California San Francisco, San Francisco, California
- Jonathan Chou
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Division of Hematology and Oncology, Department of Medicine, University of California San Francisco, San Francisco, California
- Kevin Chang
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Division of Hematology and Oncology, Department of Medicine, University of California San Francisco, San Francisco, California
- Thaidy Moreno-Rodriguez
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Department of Urology, University of California San Francisco, San Francisco, California
- Raunak Shrestha
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Department of Radiation Oncology, University of California San Francisco, San Francisco, California
- Avi Baskin
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Department of Radiation Oncology, University of California San Francisco, San Francisco, California
- Xiaolin Zhu
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Division of Hematology and Oncology, Department of Medicine, University of California San Francisco, San Francisco, California
- Alana S. Weinstein
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Department of Radiation Oncology, University of California San Francisco, San Francisco, California
- Noah Younger
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Division of Hematology and Oncology, Department of Medicine, University of California San Francisco, San Francisco, California
- Joshi J. Alumkal
  - Division of Hematology and Oncology, University of Michigan Rogel Cancer Center, Ann Arbor, Michigan
- Tomasz M. Beer
  - Knight Cancer Institute, Oregon Health and Science University, Portland, Oregon
- Kim N. Chi
  - Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, British Columbia, Canada
- Christopher P. Evans
  - Comprehensive Cancer Center, University of California Davis, Sacramento, California
  - Department of Urologic Surgery, University of California Davis, Sacramento, California
- Martin Gleave
  - Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, British Columbia, Canada
- Primo N. Lara
  - Comprehensive Cancer Center, University of California Davis, Sacramento, California
  - Division of Hematology Oncology, Department of Internal Medicine, University of California Davis, Sacramento, California
- Rob E. Reiter
  - Departments of Medicine, Hematology/Oncology and Urology, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California
  - Jonsson Comprehensive Cancer Center, University of California Los Angeles, Los Angeles, California
- Matthew B. Rettig
  - Departments of Medicine, Hematology/Oncology and Urology, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California
  - Jonsson Comprehensive Cancer Center, University of California Los Angeles, Los Angeles, California
  - VA Greater Los Angeles Healthcare System, Los Angeles, California
- Owen N. Witte
  - Department of Microbiology, Immunology, and Molecular Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California
- Alexander W. Wyatt
  - Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, British Columbia, Canada
  - Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
- Felix Y. Feng
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Department of Radiation Oncology, University of California San Francisco, San Francisco, California
  - Division of Hematology and Oncology, Department of Medicine, University of California San Francisco, San Francisco, California
  - Department of Urology, University of California San Francisco, San Francisco, California
- Eric J. Small
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Division of Hematology and Oncology, Department of Medicine, University of California San Francisco, San Francisco, California
- David A. Quigley
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Department of Urology, University of California San Francisco, San Francisco, California
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
|
40
|
Capauto D, Wang Y, Wu F, Norton S, Mariani J, Inoue F, Crawford GE, Ahituv N, Abyzov A, Vaccarino FM. Characterization of enhancer activity in early human neurodevelopment using Massively parallel reporter assay (MPRA) and forebrain organoids. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.08.14.553170. [PMID: 37645832 PMCID: PMC10461976 DOI: 10.1101/2023.08.14.553170] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/31/2023]
Abstract
Regulation of gene expression through enhancers is one of the major processes shaping the structure and function of the human brain during development. High-throughput assays have predicted thousands of enhancers involved in neurodevelopment, and confirming their activity through orthogonal functional assays is crucial. Here, we utilized Massively Parallel Reporter Assays (MPRAs) in stem cells and forebrain organoids to evaluate the activity of ~7,000 gene-linked enhancers previously identified in human fetal tissues and brain organoids. We used a Gaussian mixture model to evaluate the contribution of background noise in the measured activity signal to confirm the activity of ~35% of the tested enhancers, with most showing temporal-specific activity, suggesting their evolving role in neurodevelopment. The temporal specificity was further supported by the correlation of activity with gene expression. Our findings provide a valuable gene regulatory resource to the scientific community.
Affiliation(s)
- Davide Capauto
  - Child Study Center, Yale University, New Haven, CT 06520
- Yifan Wang
  - Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Feinan Wu
  - Child Study Center, Yale University, New Haven, CT 06520
- Scott Norton
  - Child Study Center, Yale University, New Haven, CT 06520
- Fumitaka Inoue
  - Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University; Kyoto, Japan
- Nadav Ahituv
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco; San Francisco, CA, USA
  - Institute for Human Genetics, University of California, San Francisco; San Francisco, CA, USA
- Alexej Abyzov
  - Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Flora M. Vaccarino
  - Child Study Center, Yale University, New Haven, CT 06520
  - Department of Neuroscience, Yale University, New Haven, CT 06520, USA
|
41
|
Varberg KM, Dominguez EM, Koseva B, Varberg JM, McNally RP, Moreno-Irusta A, Wesley ER, Iqbal K, Cheung WA, Schwendinger-Schreck C, Smail C, Okae H, Arima T, Lydic M, Holoch K, Marsh C, Soares MJ, Grundberg E. Extravillous trophoblast cell lineage development is associated with active remodeling of the chromatin landscape. Nat Commun 2023; 14:4826. [PMID: 37563143 PMCID: PMC10415281 DOI: 10.1038/s41467-023-40424-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Accepted: 07/27/2023] [Indexed: 08/12/2023] Open
Abstract
The extravillous trophoblast cell lineage is a key feature of placentation and successful pregnancy. Knowledge of transcriptional regulation driving extravillous trophoblast cell development is limited. Here, we map the transcriptome and epigenome landscape as well as chromatin interactions of human trophoblast stem cells and their transition into extravillous trophoblast cells. We show that integrating chromatin accessibility, long-range chromatin interactions, transcriptomic, and transcription factor binding motif enrichment enables identification of transcription factors and regulatory mechanisms critical for extravillous trophoblast cell development. We elucidate functional roles for TFAP2C, SNAI1, and EPAS1 in the regulation of extravillous trophoblast cell development. EPAS1 is identified as an upstream regulator of key extravillous trophoblast cell transcription factors, including ASCL2 and SNAI1 and together with its target genes, is linked to pregnancy loss and birth weight. Collectively, we reveal activation of a dynamic regulatory network and provide a framework for understanding extravillous trophoblast cell specification in trophoblast cell lineage development and human placentation.
Affiliation(s)
- Kaela M Varberg
  - Institute for Reproductive and Developmental Sciences, University of Kansas Medical Center, Kansas City, Kansas, 66160, USA.
  - Department of Pathology & Laboratory Medicine, University of Kansas Medical Center, Kansas City, KS, 66160, USA.
- Esteban M Dominguez
  - Institute for Reproductive and Developmental Sciences, University of Kansas Medical Center, Kansas City, Kansas, 66160, USA
  - Department of Pathology & Laboratory Medicine, University of Kansas Medical Center, Kansas City, KS, 66160, USA
- Boryana Koseva
  - Genomic Medicine Center, Children's Mercy Research Institute, Children's Mercy Kansas City, Kansas City, MO, 64108, USA
- Joseph M Varberg
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, USA
- Ross P McNally
  - Institute for Reproductive and Developmental Sciences, University of Kansas Medical Center, Kansas City, Kansas, 66160, USA
  - Department of Pathology & Laboratory Medicine, University of Kansas Medical Center, Kansas City, KS, 66160, USA
  - Department of Obstetrics and Gynecology, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Ayelen Moreno-Irusta
  - Institute for Reproductive and Developmental Sciences, University of Kansas Medical Center, Kansas City, Kansas, 66160, USA
  - Department of Pathology & Laboratory Medicine, University of Kansas Medical Center, Kansas City, KS, 66160, USA
- Emily R Wesley
  - Genomic Medicine Center, Children's Mercy Research Institute, Children's Mercy Kansas City, Kansas City, MO, 64108, USA
- Khursheed Iqbal
  - Institute for Reproductive and Developmental Sciences, University of Kansas Medical Center, Kansas City, Kansas, 66160, USA
  - Department of Pathology & Laboratory Medicine, University of Kansas Medical Center, Kansas City, KS, 66160, USA
- Warren A Cheung
  - Genomic Medicine Center, Children's Mercy Research Institute, Children's Mercy Kansas City, Kansas City, MO, 64108, USA
- Carl Schwendinger-Schreck
  - Genomic Medicine Center, Children's Mercy Research Institute, Children's Mercy Kansas City, Kansas City, MO, 64108, USA
- Craig Smail
  - Genomic Medicine Center, Children's Mercy Research Institute, Children's Mercy Kansas City, Kansas City, MO, 64108, USA
- Hiroaki Okae
  - Department of Informative Genetics, Environment and Genome Research Center, Tohoku University Graduate School of Medicine, Sendai, 980-8575, Japan
  - Department of Trophoblast Research, Institute of Molecular Embryology and Genetics, Kumamoto University, 2-2-1 Honjo, Chuo-ku, Kumamoto, 860-0811, Japan
- Takahiro Arima
  - Department of Informative Genetics, Environment and Genome Research Center, Tohoku University Graduate School of Medicine, Sendai, 980-8575, Japan
- Michael Lydic
  - Department of Obstetrics and Gynecology, University of Kansas Medical Center, Kansas City, KS, 66160, USA
- Kristin Holoch
  - Department of Obstetrics and Gynecology, University of Kansas Medical Center, Kansas City, KS, 66160, USA
- Courtney Marsh
  - Institute for Reproductive and Developmental Sciences, University of Kansas Medical Center, Kansas City, Kansas, 66160, USA
  - Department of Obstetrics and Gynecology, University of Kansas Medical Center, Kansas City, KS, 66160, USA
- Michael J Soares
  - Institute for Reproductive and Developmental Sciences, University of Kansas Medical Center, Kansas City, Kansas, 66160, USA.
  - Department of Pathology & Laboratory Medicine, University of Kansas Medical Center, Kansas City, KS, 66160, USA.
  - Department of Obstetrics and Gynecology, University of Kansas Medical Center, Kansas City, KS, 66160, USA.
  - Center for Perinatal Research, Children's Mercy Research Institute, Children's Mercy Kansas City, Kansas City, MO, 64108, USA.
- Elin Grundberg
  - Institute for Reproductive and Developmental Sciences, University of Kansas Medical Center, Kansas City, Kansas, 66160, USA.
  - Department of Pathology & Laboratory Medicine, University of Kansas Medical Center, Kansas City, KS, 66160, USA.
  - Genomic Medicine Center, Children's Mercy Research Institute, Children's Mercy Kansas City, Kansas City, MO, 64108, USA.
|
42
|
Choi J, Kim S, Kim J, Son HY, Yoo SK, Kim CU, Park YJ, Moon S, Cha B, Jeon MC, Park K, Yun JM, Cho B, Kim N, Kim C, Kwon NJ, Park YJ, Matsuda F, Momozawa Y, Kubo M, Kim HJ, Park JH, Seo JS, Kim JI, Im SW. A whole-genome reference panel of 14,393 individuals for East Asian populations accelerates discovery of rare functional variants. SCIENCE ADVANCES 2023; 9:eadg6319. [PMID: 37556544 PMCID: PMC10411914 DOI: 10.1126/sciadv.adg6319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Accepted: 07/06/2023] [Indexed: 08/11/2023]
Abstract
Underrepresentation of non-European (EUR) populations hinders growth of global precision medicine. Resources such as imputation reference panels that match the study population are necessary to find low-frequency variants with substantial effects. We created a reference panel consisting of 14,393 whole-genome sequences including more than 11,000 Asian individuals. Genome-wide association studies were conducted using the reference panel and a population-specific genotype array of 72,298 subjects for eight phenotypes. This panel yields improved imputation accuracy of rare and low-frequency variants within East Asian populations compared with the largest reference panel. Thirty-nine previously unidentified associations were found, and more than half of the variants were East Asian specific. We discovered genes with rare protein-altering variants, including LTBP1 for height and GPR75 for body mass index, as well as putative regulatory mechanisms for rare noncoding variants with cell type-specific effects. We suggest that this dataset will add to the potential value of Asian precision medicine.
Affiliation(s)
- Jaeyong Choi
  - Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Republic of Korea
- Juhyun Kim
  - Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Republic of Korea
- Ho-Young Son
  - Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
- Seong-Keun Yoo
  - The Marc and Jennifer Lipschultz Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Young Jun Park
  - Department of Translational Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
- Sungji Moon
  - Interdisciplinary Program in Cancer Biology, Seoul National University College of Medicine, Seoul, Republic of Korea
  - Cancer Research Institute, Seoul National University, Seoul, Republic of Korea
- Bukyoung Cha
  - Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
- Min Chul Jeon
  - Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Republic of Korea
- Kyunghyuk Park
  - Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
- Jae Moon Yun
  - Department of Family Medicine, Seoul National University Hospital, Seoul, Republic of Korea
- Belong Cho
  - Department of Family Medicine, Seoul National University Hospital, Seoul, Republic of Korea
  - Department of Family Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
- Young Joo Park
  - Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
  - Department of Internal Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
  - Department of Molecular Medicine and Biopharmaceutical Sciences, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, Republic of Korea
- Fumihiko Matsuda
  - Center for Genomic Medicine, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Michiaki Kubo
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Hyun-Jin Kim
  - National Cancer Control Institute, National Cancer Center, Goyang, Republic of Korea
- Jin-Ho Park
  - Department of Family Medicine, Seoul National University Hospital, Seoul, Republic of Korea
  - Department of Family Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
- Jeong-Sun Seo
  - Macrogen Inc., Seoul, Republic of Korea
  - Asian Genome Center, Seoul National University Bundang Hospital, Gyeonggi, Republic of Korea
- Jong-Il Kim
  - Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Republic of Korea
  - Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
  - Cancer Research Institute, Seoul National University, Seoul, Republic of Korea
  - Department of Biochemistry and Molecular Biology, Seoul National University College of Medicine, Seoul, Republic of Korea
- Sun-Wha Im
  - Department of Biochemistry and Molecular Biology, Kangwon National University School of Medicine, Gangwon, Republic of Korea
|
43
|
Jiang X, Boutin T, Vitart V. Colocalization of corneal resistance factor GWAS loci with GTEx e/sQTLs highlights plausible candidate causal genes for keratoconus postnatal corneal stroma weakening. Front Genet 2023; 14:1171217. [PMID: 37621707 PMCID: PMC10445647 DOI: 10.3389/fgene.2023.1171217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Accepted: 07/17/2023] [Indexed: 08/26/2023] Open
Abstract
Background: Genome-wide association studies (GWAS) for corneal resistance factor (CRF) have identified hundreds of loci and proved useful in uncovering genetic determinants of keratoconus, a corneal ectasia of early-adulthood onset and a common indication for corneal transplantation. In the current absence of studies probing the impact of candidate causal variants in the cornea, we aimed to fill some of this knowledge gap by leveraging tissue-shared genetic effects. Methods: 181 CRF signals were examined for evidence of colocalization with genetic signals affecting steady-state gene transcription and splicing in adult, non-eye tissues of the Genotype-Tissue Expression (GTEx) project. Expression of the candidate causal genes thus nominated was evaluated in single-cell transcriptomes from adult cornea, limbus and conjunctiva. Fine-mapping and colocalization of CRF and keratoconus GWAS signals were also deployed to support their sharing of causal variants. Results and discussion: 26.5% of CRF causal signals colocalized with GTEx v8 signals and nominated genes enriched in genes with high and specific expression in corneal stromal cells amongst the tissues examined. Enrichment analyses carried out with the nearest genes to all 181 CRF GWAS signals indicated that stromal cells of the limbus could be susceptible to signals that did not colocalize with GTEx's. These cells might not be well represented in GTEx and/or the genetic associations might have context-specific effects. The causal signals shared with GTEx provide new insights into mediation of CRF genetic effects, including modulation of splicing events. Functionally relevant roles for several implicated genes' products in providing tensile strength, mechano-sensing and signaling make the corresponding genes and regulatory variants prime candidates to be validated and their roles and effects across tissues elucidated. Colocalization of CRF and keratoconus GWAS signals strengthened support for shared causal variants but also highlighted many ways in which likely true shared signals could be missed when using readily available GWAS summary statistics.
Affiliation(s)
- Xinyi Jiang
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, United Kingdom
  - Centre for Genetics and Molecular Medicine, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, United Kingdom
- Thibaud Boutin
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, United Kingdom
- Veronique Vitart
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, United Kingdom
|
44
|
Gosai SJ, Castro RI, Fuentes N, Butts JC, Kales S, Noche RR, Mouri K, Sabeti PC, Reilly SK, Tewhey R. Machine-guided design of synthetic cell type-specific cis-regulatory elements. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.08.08.552077. [PMID: 37609287 PMCID: PMC10441439 DOI: 10.1101/2023.08.08.552077] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 08/24/2023]
Abstract
Cis-regulatory elements (CREs) control gene expression, orchestrating tissue identity, developmental timing, and stimulus responses, which collectively define the thousands of unique cell types in the body. While there is great potential for strategically incorporating CREs in therapeutic or biotechnology applications that require tissue specificity, there is no guarantee that an optimal CRE for an intended purpose has arisen naturally through evolution. Here, we present a platform to engineer and validate synthetic CREs capable of driving gene expression with programmed cell type specificity. We leverage innovations in deep neural network modeling of CRE activity across three cell types, efficient in silico optimization, and massively parallel reporter assays (MPRAs) to design and empirically test thousands of CREs. Through in vitro and in vivo validation, we show that synthetic sequences outperform natural sequences from the human genome in driving cell type-specific expression. Synthetic sequences leverage unique sequence syntax to promote activity in the on-target cell type and simultaneously reduce activity in off-target cells. Together, we provide a generalizable framework to prospectively engineer CREs and demonstrate the required literacy to write regulatory code that is fit-for-purpose in vivo across vertebrates.
Affiliation(s)
- SJ Gosai
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Harvard Graduate Program in Biological and Biomedical Science, Boston MA
  - Department Of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- RI Castro
  - The Jackson Laboratory, Bar Harbor, ME, USA
- N Fuentes
  - The Jackson Laboratory, Bar Harbor, ME, USA
  - Harvard College, Harvard University, Cambridge, MA, USA
- JC Butts
  - The Jackson Laboratory, Bar Harbor, ME, USA
  - Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
- S Kales
  - The Jackson Laboratory, Bar Harbor, ME, USA
- RR Noche
  - Department of Comparative Medicine, Yale School of Medicine, New Haven, CT, USA
  - Yale Zebrafish Research Core, Yale School of Medicine, New Haven, CT, USA
- K Mouri
  - The Jackson Laboratory, Bar Harbor, ME, USA
- PC Sabeti
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department Of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- SK Reilly
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
  - Wu Tsai Institute, Yale University, New Haven, CT, USA
- R Tewhey
  - The Jackson Laboratory, Bar Harbor, ME, USA
  - Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
  - Graduate School of Biomedical Sciences, Tufts University School of Medicine, Boston, MA, USA
|
45
|
Gaulton KJ, Preissl S, Ren B. Interpreting non-coding disease-associated human variants using single-cell epigenomics. Nat Rev Genet 2023; 24:516-534. [PMID: 37161089 PMCID: PMC10629587 DOI: 10.1038/s41576-023-00598-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Accepted: 03/27/2023] [Indexed: 05/11/2023]
Abstract
Genome-wide association studies (GWAS) have linked hundreds of thousands of sequence variants in the human genome to common traits and diseases. However, translating this knowledge into a mechanistic understanding of disease-relevant biology remains challenging, largely because such variants are predominantly in non-protein-coding sequences that still lack functional annotation at cell-type resolution. Recent advances in single-cell epigenomics assays have enabled the generation of cell type-, subtype- and state-resolved maps of the epigenome in heterogeneous human tissues. These maps have facilitated cell type-specific annotation of candidate cis-regulatory elements and their gene targets in the human genome, enhancing our ability to interpret the genetic basis of common traits and diseases.
Affiliation(s)
- Kyle J Gaulton
- Department of Paediatrics, Paediatric Diabetes Research Center, University of California San Diego School of Medicine, La Jolla, CA, USA.
| | - Sebastian Preissl
- Center for Epigenomics, University of California San Diego School of Medicine, La Jolla, CA, USA.
- Institute of Experimental and Clinical Pharmacology and Toxicology, Faculty of Medicine, University of Freiburg, Freiburg, Germany.
| | - Bing Ren
- Center for Epigenomics, University of California San Diego School of Medicine, La Jolla, CA, USA.
- Department of Cellular and Molecular Medicine, University of California San Diego School of Medicine, La Jolla, CA, USA.
- Ludwig Institute for Cancer Research, La Jolla, CA, USA.
|
46
|
Grampp S, Krüger R, Lauer V, Uebel S, Knaup KX, Naas J, Höffken V, Weide T, Schiffer M, Naas S, Schödel J. Hypoxia hits APOL1 in the kidney. Kidney Int 2023; 104:53-60. [PMID: 37098381 DOI: 10.1016/j.kint.2023.03.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Revised: 03/02/2023] [Accepted: 03/24/2023] [Indexed: 04/27/2023]
Abstract
Individuals of African ancestry carrying two pathogenic variants of apolipoprotein L1 (APOL1) have a substantially increased risk for developing chronic kidney disease. The course of APOL1 nephropathy is extremely heterogeneous and shaped by systemic factors such as a response to interferon. However, additional environmental factors operating in this second-hit model have been less well defined. Here, we reveal that stabilization of hypoxia-inducible transcription factors (HIF) by hypoxia or HIF prolyl hydroxylase inhibitors activates transcription of APOL1 in podocytes and tubular cells. We identified an active regulatory DNA element upstream of APOL1 that interacted with HIF. This enhancer was accessible preferentially in kidney cells. Importantly, upregulation of APOL1 by HIF was additive to the effects of interferon. Furthermore, HIF stimulated expression of APOL1 in tubular cells derived from the urine of an individual carrying a risk variant for kidney disease. Thus, hypoxic insults may serve as important modulators of APOL1 nephropathy.
Affiliation(s)
- Steffen Grampp
- Department of Nephrology and Hypertension, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - René Krüger
- Department of Nephrology and Hypertension, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Victoria Lauer
- Department of Nephrology and Hypertension, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Sebastian Uebel
- Department of Nephrology and Hypertension, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Karl X Knaup
- Department of Nephrology and Hypertension, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Julia Naas
- Center for Integrative Bioinformatics Vienna, Max Perutz Labs, Vienna BioCenter, University of Vienna and Medical University of Vienna, Wien, Austria
| | - Verena Höffken
- Medical Clinic D, Institute of Molecular Nephrology, University Hospital of Münster, Münster, Germany
| | - Thomas Weide
- Medical Clinic D, Institute of Molecular Nephrology, University Hospital of Münster, Münster, Germany
| | - Mario Schiffer
- Department of Nephrology and Hypertension, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Stephanie Naas
- Department of Nephrology and Hypertension, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Johannes Schödel
- Department of Nephrology and Hypertension, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany.
|
47
|
Als TD, Kurki MI, Grove J, Voloudakis G, Therrien K, Tasanko E, Nielsen TT, Naamanka J, Veerapen K, Levey DF, Bendl J, Bybjerg-Grauholm J, Zeng B, Demontis D, Rosengren A, Athanasiadis G, Bækved-Hansen M, Qvist P, Bragi Walters G, Thorgeirsson T, Stefánsson H, Musliner KL, Rajagopal VM, Farajzadeh L, Thirstrup J, Vilhjálmsson BJ, McGrath JJ, Mattheisen M, Meier S, Agerbo E, Stefánsson K, Nordentoft M, Werge T, Hougaard DM, Mortensen PB, Stein MB, Gelernter J, Hovatta I, Roussos P, Daly MJ, Mors O, Palotie A, Børglum AD. Depression pathophysiology, risk prediction of recurrence and comorbid psychiatric disorders using genome-wide analyses. Nat Med 2023; 29:1832-1844. [PMID: 37464041 PMCID: PMC10839245 DOI: 10.1038/s41591-023-02352-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 06/30/2022] [Accepted: 04/17/2023] [Indexed: 07/20/2023]
Abstract
Depression is a common psychiatric disorder and a leading cause of disability worldwide. Here we conducted a genome-wide association study meta-analysis of six datasets, including >1.3 million individuals (371,184 with depression) and identified 243 risk loci. Overall, 64 loci were new, including genes encoding glutamate and GABA receptors, which are targets for antidepressant drugs. Intersection with functional genomics data prioritized likely causal genes and revealed new enrichment of prenatal GABAergic neurons, astrocytes and oligodendrocyte lineages. We found depression to be highly polygenic, with ~11,700 variants explaining 90% of the single-nucleotide polymorphism heritability. We estimated that >95% of risk variants for other psychiatric disorders (anxiety, schizophrenia, bipolar disorder and attention deficit hyperactivity disorder) influenced depression risk when both concordant and discordant variants were considered, and that nearly all depression risk variants influenced educational attainment. Additionally, depression genetic risk was associated with impaired complex cognition domains. We dissected the genetic and clinical heterogeneity, revealing distinct polygenic architectures across subgroups of depression and demonstrating significantly increased absolute risks for recurrence and psychiatric comorbidity among cases of depression with the highest polygenic burden, with considerable sex differences. The risks were up to 5- and 32-fold higher than in cases with the lowest polygenic burden and the background population, respectively. These results deepen the understanding of the biology underlying depression and its disease progression, and inform precision medicine approaches to treatment.
Affiliation(s)
- Thomas D Als
- Department of Biomedicine, Aarhus University, Aarhus, Denmark.
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark.
- Center for Genomics and Personalized Medicine, Aarhus, Denmark.
| | - Mitja I Kurki
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jakob Grove
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Center for Genomics and Personalized Medicine, Aarhus, Denmark
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
| | - Georgios Voloudakis
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mental Illness Research, Education, and Clinical Center (VISN 2 South), James J Peters VA Medical Center, Bronx, NY, USA
| | - Karen Therrien
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mental Illness Research, Education, and Clinical Center (VISN 2 South), James J Peters VA Medical Center, Bronx, NY, USA
- Nash Family Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Elisa Tasanko
- Department of Psychology and Logopedics, SleepWell Research Program, University of Helsinki, Helsinki, Finland
| | - Trine Tollerup Nielsen
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Center for Genomics and Personalized Medicine, Aarhus, Denmark
| | - Joonas Naamanka
- Department of Psychology and Logopedics, SleepWell Research Program, University of Helsinki, Helsinki, Finland
| | - Kumar Veerapen
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
| | - Daniel F Levey
- Division of Human Genetics, Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA
- Department of Psychiatry, Veterans Affairs Connecticut Healthcare Center, West Haven, CT, USA
| | - Jaroslav Bendl
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Jonas Bybjerg-Grauholm
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Center for Neonatal Screening, Department for Congenital Disorders, Statens Serum Institut, Copenhagen, Denmark
| | - Biao Zeng
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Ditte Demontis
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Center for Genomics and Personalized Medicine, Aarhus, Denmark
| | - Anders Rosengren
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Mental Health Centre Sct. Hans, Capital Region of Denmark, Institute of Biological Psychiatry, Copenhagen University Hospital, Copenhagen, Denmark
| | - Georgios Athanasiadis
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Mental Health Centre Sct. Hans, Capital Region of Denmark, Institute of Biological Psychiatry, Copenhagen University Hospital, Copenhagen, Denmark
- Department of Evolutionary Biology, Ecology and Environmental Sciences, University of Barcelona, Barcelona, Spain
| | - Marie Bækved-Hansen
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Center for Neonatal Screening, Department for Congenital Disorders, Statens Serum Institut, Copenhagen, Denmark
| | - Per Qvist
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Center for Genomics and Personalized Medicine, Aarhus, Denmark
| | | | | | | | - Katherine L Musliner
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- National Centre for Register-Based Research (NCRR), Business and Social Sciences, Aarhus University, Aarhus, Denmark
- Department of Affective Disorders, Aarhus University Hospital-Psychiatry, Aarhus, Denmark
- The Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
| | - Veera M Rajagopal
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Center for Genomics and Personalized Medicine, Aarhus, Denmark
| | - Leila Farajzadeh
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Center for Genomics and Personalized Medicine, Aarhus, Denmark
| | - Janne Thirstrup
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Center for Genomics and Personalized Medicine, Aarhus, Denmark
| | - Bjarni J Vilhjálmsson
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
| | - John J McGrath
- National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
- Queensland Centre for Mental Health Research, The Park Centre for Mental Health, Brisbane, Queensland, Australia
- Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
| | - Manuel Mattheisen
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Institute of Psychiatric Phenomics and Genomics (IPPG), University Hospital, LMU Munich, Munich, Germany
- Department of Community Health and Epidemiology, Dalhousie University, Halifax, Nova Scotia, Canada
- Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada
| | - Sandra Meier
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Department of Community Health and Epidemiology, Dalhousie University, Halifax, Nova Scotia, Canada
- Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada
- Department of Psychiatry, Dalhousie University, Halifax, Nova Scotia, Canada
| | - Esben Agerbo
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- National Centre for Register-Based Research (NCRR), Business and Social Sciences, Aarhus University, Aarhus, Denmark
- Centre for Integrated Register-based Research, CIRRAU, Aarhus University, Aarhus, Denmark
| | | | - Merete Nordentoft
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Mental Health Centre Copenhagen, Capital Region of Denmark, Copenhagen University Hospital, Copenhagen, Denmark
| | - Thomas Werge
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Mental Health Centre Sct. Hans, Capital Region of Denmark, Institute of Biological Psychiatry, Copenhagen University Hospital, Copenhagen, Denmark
- Institute of Clinical Sciences and GLOBE Institute, LF Center for GeoGenetics, University of Copenhagen, Copenhagen, Denmark
| | - David M Hougaard
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Center for Neonatal Screening, Department for Congenital Disorders, Statens Serum Institut, Copenhagen, Denmark
| | - Preben B Mortensen
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- National Centre for Register-Based Research (NCRR), Business and Social Sciences, Aarhus University, Aarhus, Denmark
- Centre for Integrated Register-based Research, CIRRAU, Aarhus University, Aarhus, Denmark
| | - Murray B Stein
- Psychiatry Service, VA San Diego Healthcare System, San Diego, CA, USA
- Departments of Psychiatry and Herbert Wertheim School of Public Health, University of California, San Diego, La Jolla, CA, USA
| | - Joel Gelernter
- Division of Human Genetics, Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA
- Department of Psychiatry, Veterans Affairs Connecticut Healthcare Center, West Haven, CT, USA
| | - Iiris Hovatta
- Department of Psychology and Logopedics, SleepWell Research Program, University of Helsinki, Helsinki, Finland
| | - Panos Roussos
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mental Illness Research, Education, and Clinical Center (VISN 2 South), James J Peters VA Medical Center, Bronx, NY, USA
- Center for Dementia Research, Nathan Kline Institute for Psychiatric Research, Orangeburg, NY, USA
| | - Mark J Daly
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
| | - Ole Mors
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Psychosis Research Unit, Aarhus University Hospital-Psychiatry, Aarhus, Denmark
| | - Aarno Palotie
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
| | - Anders D Børglum
- Department of Biomedicine, Aarhus University, Aarhus, Denmark.
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark.
- Center for Genomics and Personalized Medicine, Aarhus, Denmark.
|
48
|
Benaglio P, Newsome J, Han JY, Chiou J, Aylward A, Corban S, Miller M, Okino ML, Kaur J, Preissl S, Gorkin DU, Gaulton KJ. Mapping genetic effects on cell type-specific chromatin accessibility and annotating complex immune trait variants using single nucleus ATAC-seq in peripheral blood. PLoS Genet 2023; 19:e1010759. [PMID: 37289818 DOI: 10.1371/journal.pgen.1010759] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/17/2022] [Accepted: 04/25/2023] [Indexed: 06/10/2023] Open
Abstract
Gene regulation is highly cell type-specific, and understanding the function of non-coding genetic variants associated with complex traits requires molecular phenotyping at cell type resolution. In this study we performed single nucleus ATAC-seq (snATAC-seq) and genotyping in peripheral blood mononuclear cells from 13 individuals. Clustering chromatin accessibility profiles of 96,002 total nuclei identified 17 immune cell types and sub-types. We mapped chromatin accessibility QTLs (caQTLs) in each immune cell type and sub-type using individuals of European ancestry, which identified 6,901 caQTLs at FDR < .10 and 4,220 caQTLs at FDR < .05, including caQTLs obscured in assays of bulk tissue, such as those with divergent effects on different cell types. For 3,941 caQTLs we further annotated putative target genes of variant activity using single cell co-accessibility, and caQTL variants were significantly correlated with the accessibility level of linked gene promoters. We fine-mapped loci associated with 16 complex immune traits and identified immune cell caQTLs at 622 candidate causal variants, including those with cell type-specific effects. At the 6q15 locus associated with type 1 diabetes, in line with previous reports, variant rs72928038 was a naïve CD4+ T cell caQTL linked to BACH2 and we validated the allelic effects of this variant on regulatory activity in Jurkat T cells. These results highlight the utility of snATAC-seq for mapping genetic effects on accessible chromatin in specific cell types.
Affiliation(s)
- Paola Benaglio
- Department of Pediatrics, University of California San Diego, San Diego, California, United States of America
| | - Jacklyn Newsome
- Bioinformatics and Systems Biology Program, University of California San Diego, San Diego, California, United States of America
| | - Jee Yun Han
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California San Diego, San Diego, California, United States of America
| | - Joshua Chiou
- Biomedical Sciences Graduate Program, University of California San Diego, San Diego, California, United States of America
| | - Anthony Aylward
- Bioinformatics and Systems Biology Program, University of California San Diego, San Diego, California, United States of America
| | - Sierra Corban
- Department of Pediatrics, University of California San Diego, San Diego, California, United States of America
| | - Michael Miller
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California San Diego, San Diego, California, United States of America
| | - Mei-Lin Okino
- Department of Pediatrics, University of California San Diego, San Diego, California, United States of America
| | - Jaspreet Kaur
- Department of Pediatrics, University of California San Diego, San Diego, California, United States of America
| | - Sebastian Preissl
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California San Diego, San Diego, California, United States of America
| | - David U Gorkin
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California San Diego, San Diego, California, United States of America
| | - Kyle J Gaulton
- Department of Pediatrics, University of California San Diego, San Diego, California, United States of America
|
49
|
Wang LS, Sun ZL. iDHS-FFLG: Identifying DNase I Hypersensitive Sites by Feature Fusion and Local-Global Feature Extraction Network. Interdiscip Sci 2023; 15:155-170. [PMID: 36166165 DOI: 10.1007/s12539-022-00538-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Revised: 09/12/2022] [Accepted: 09/12/2022] [Indexed: 05/01/2023]
Abstract
DNase I hypersensitive sites (DHSs) are active regions of chromatin that are highly sensitive to DNase I. These regions contain various cis-regulatory elements, including promoters, enhancers and silencers. Accurate identification of DHSs helps researchers better understand the transcriptional machinery of DNA and deepens the knowledge of functional DNA elements in non-coding sequences. Researchers have developed many methods based on traditional experiments and machine learning to identify DHSs. However, low prediction accuracy and robustness limit their application in genetics research. In this paper, a novel deep learning approach named iDHS-FFLG is proposed, which identifies DHSs in mouse using feature fusion and a local-global feature extraction network. First, multiple binary features of nucleotides are fused to better express sequence information. Then, a network consisting of a convolutional neural network (CNN), bidirectional long short-term memory (BiLSTM) and a self-attention mechanism is designed to extract local features and global contextual associations. Finally, the prediction module is applied to distinguish between DHSs and non-DHSs. The results of several experiments demonstrate the superior performance of iDHS-FFLG compared to the latest methods.
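The feature-fusion step mentioned in this abstract can be sketched roughly as follows. This is only an illustration: the abstract does not say which binary features are fused, so the one-hot and chemical-property encodings, the `ONE_HOT` and `CHEMICAL` tables, and the `fuse_binary_features` helper are all assumptions rather than the authors' implementation.

```python
# Hypothetical illustration: the encodings below are assumed for the sketch
# and are not the feature set reported for iDHS-FFLG.

# One-hot encoding: one bit per nucleotide identity.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}

# Binary chemical properties: (is purine, has amino group, strong H-bonding).
CHEMICAL = {
    "A": (1, 1, 0),
    "C": (0, 1, 1),
    "G": (1, 0, 1),
    "T": (0, 0, 0),
}


def fuse_binary_features(sequence):
    """Return one fused binary vector per nucleotide (one-hot + chemical)."""
    fused = []
    for base in sequence.upper():
        if base not in ONE_HOT:
            raise ValueError(f"unsupported nucleotide: {base!r}")
        # Fusion here is plain concatenation of the two binary encodings.
        fused.append(ONE_HOT[base] + CHEMICAL[base])
    return fused


# Each nucleotide becomes a 7-bit vector; a sequence becomes a length x 7 matrix.
for base, row in zip("ACGT", fuse_binary_features("ACGT")):
    print(base, row)
```

In a pipeline of the kind the abstract outlines, a matrix like this would be the input to the convolutional, recurrent and attention layers; concatenation is just one simple way to fuse encodings.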
Affiliation(s)
- Lei-Shan Wang
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, 230601, Anhui, China
- School of Electrical Engineering and Automation, Anhui University, Hefei, 230601, Anhui, China
| | - Zhan-Li Sun
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, 230601, Anhui, China.
- School of Electrical Engineering and Automation, Anhui University, Hefei, 230601, Anhui, China.
|
50
|
Cheung WA, Johnson AF, Rowell WJ, Farrow E, Hall R, Cohen ASA, Means JC, Zion TN, Portik DM, Saunders CT, Koseva B, Bi C, Truong TK, Schwendinger-Schreck C, Yoo B, Johnston JJ, Gibson M, Evrony G, Rizzo WB, Thiffault I, Younger ST, Curran T, Wenger AM, Grundberg E, Pastinen T. Direct haplotype-resolved 5-base HiFi sequencing for genome-wide profiling of hypermethylation outliers in a rare disease cohort. Nat Commun 2023; 14:3090. [PMID: 37248219 DOI: 10.1038/s41467-023-38782-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/19/2022] [Accepted: 05/15/2023] [Indexed: 05/31/2023] Open
Abstract
Long-read HiFi genome sequencing allows for accurate detection and direct phasing of single nucleotide variants, indels, and structural variants. Recent algorithmic development enables simultaneous detection of CpG methylation for analysis of regulatory element activity directly in HiFi reads. We present a comprehensive haplotype-resolved 5-base HiFi genome sequencing dataset from a rare disease cohort of 276 samples in 152 families to identify rare (~0.5%) hypermethylation events. We find that 80% of these events are allele-specific and predicted to cause loss of regulatory element activity. We demonstrate heritability of extreme hypermethylation including rare cis variants associated with short (~200 bp) and large hypermethylation events (>1 kb), respectively. We identify repeat expansions in proximal promoters predicting allelic gene silencing via hypermethylation and demonstrate allelic transcriptional events downstream. On average 30-40 rare hypermethylation tiles overlap rare disease genes per patient, providing indications for variation prioritization including a previously undiagnosed pathogenic allele in DIP2B causing global developmental delay. We propose that use of HiFi genome sequencing in unsolved rare disease cases will allow detection of unconventional disease alleles due to loss of regulatory element activity.
Affiliation(s)
- Warren A Cheung
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
| | - Adam F Johnson
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
| | | | - Emily Farrow
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
- Department of Pediatrics, School of Medicine, University of Missouri Kansas City, Kansas City, MO, USA
| | | | - Ana S A Cohen
- Department of Pediatrics, School of Medicine, University of Missouri Kansas City, Kansas City, MO, USA
- Department of Pathology and Laboratory Medicine, Children's Mercy Kansas City, Kansas City, MO, USA
| | - John C Means
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
| | - Tricia N Zion
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
| | | | | | - Boryana Koseva
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
| | - Chengpeng Bi
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
| | - Tina K Truong
- Center for Human Genetics and Genomics, Department of Pediatrics, Department of Neuroscience and Physiology, New York University Grossman School of Medicine, New York, NY, USA
| | - Carl Schwendinger-Schreck
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
| | - Byunggil Yoo
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
| | - Jeffrey J Johnston
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
| | - Margaret Gibson
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
| | - Gilad Evrony
- Center for Human Genetics and Genomics, Department of Pediatrics, Department of Neuroscience and Physiology, New York University Grossman School of Medicine, New York, NY, USA
| | - William B Rizzo
- Child Health Research Institute, Department of Pediatrics, Nebraska Medical Center, Omaha, NE, USA
| | - Isabelle Thiffault
- Department of Pediatrics, School of Medicine, University of Missouri Kansas City, Kansas City, MO, USA
- Department of Pathology and Laboratory Medicine, Children's Mercy Kansas City, Kansas City, MO, USA
| | - Scott T Younger
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
- Department of Pediatrics, School of Medicine, University of Missouri Kansas City, Kansas City, MO, USA
| | - Tom Curran
- Children's Mercy Research Institute, Kansas City, MO, USA
| | | | - Elin Grundberg
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA.
- Department of Pediatrics, School of Medicine, University of Missouri Kansas City, Kansas City, MO, USA.
| | - Tomi Pastinen
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA.
- Department of Pediatrics, School of Medicine, University of Missouri Kansas City, Kansas City, MO, USA.
|