1. Koido M, Tomizuka K, Terao C. Fundamentals for predicting transcriptional regulations from DNA sequence patterns. J Hum Genet 2024:10.1038/s10038-024-01256-3. [PMID: 38730006 DOI: 10.1038/s10038-024-01256-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 04/10/2024] [Accepted: 04/25/2024] [Indexed: 05/12/2024]
Abstract
Cell-type-specific regulatory elements, cataloged through extensive experiments and bioinformatics in large-scale consortiums, have enabled enrichment analyses of genetic associations that primarily utilize positional information of the regulatory elements. These analyses have identified cell types and pathways genetically associated with human complex traits. However, our understanding of detailed allelic effects on these elements' activities and on-off states remains incomplete, hampering the interpretation of human genetic study results. This review introduces machine learning methods to learn sequence-dependent transcriptional regulation mechanisms from DNA sequences for predicting such allelic effects (not associations). We provide a concise history of machine-learning-based approaches, the requirements, and the key computational processes, focusing on primers in machine learning. Convolution and self-attention, pivotal in modern deep-learning models, are explained through geometrical interpretations using dot products. This facilitates understanding of the concept and why these have been used for machine learning for DNA sequences. These will inspire further research in this genetics and genomics field.
Affiliation(s)
- Masaru Koido
  - Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Kohei Tomizuka
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Chikashi Terao
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
  - The Department of Applied Genetics, The School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
2. Popp JM, Rhodes K, Jangi R, Li M, Barr K, Tayeb K, Battle A, Gilad Y. Cell-type and dynamic state govern genetic regulation of gene expression in heterogeneous differentiating cultures. bioRxiv 2024:2024.05.02.592174. [PMID: 38746382 PMCID: PMC11092595 DOI: 10.1101/2024.05.02.592174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Identifying the molecular effects of human genetic variation across cellular contexts is crucial for understanding the mechanisms underlying disease-associated loci, yet many cell-types and developmental stages remain underexplored. Here we harnessed the potential of heterogeneous differentiating cultures (HDCs), an in vitro system in which pluripotent cells asynchronously differentiate into a broad spectrum of cell-types. We generated HDCs for 53 human donors and collected single-cell RNA-sequencing data from over 900,000 cells. We identified expression quantitative trait loci in 29 cell-types and characterized regulatory dynamics across diverse differentiation trajectories. This revealed novel regulatory variants for genes involved in key developmental and disease-related processes while replicating known effects from primary tissues, and dynamic regulatory effects associated with a range of complex traits.
3. Randolph HE, Aracena KA, Lin YL, Mu Z, Barreiro LB. Shaping immunity: The influence of natural selection on population immune diversity. Immunol Rev 2024; 323:227-240. [PMID: 38577999 DOI: 10.1111/imr.13329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/06/2024]
Abstract
Humans exhibit considerable variability in their immune responses to the same immune challenges. Such variation is widespread and affects individual and population-level susceptibility to infectious diseases and immune disorders. Although the factors influencing immune response diversity are partially understood, what mechanisms lead to the wide range of immune traits in healthy individuals remain largely unexplained. Here, we discuss the role that natural selection has played in driving phenotypic differences in immune responses across populations and present-day susceptibility to immune-related disorders. Further, we touch on future directions in the field of immunogenomics, highlighting the value of expanding this work to human populations globally, the utility of modeling the immune response as a dynamic process, and the importance of considering the potential polygenic nature of natural selection. Identifying loci acted upon by evolution may further pinpoint variants critically involved in disease etiology, and designing studies to capture these effects will enrich our understanding of the genetic contributions to immunity and immune dysregulation.
Affiliation(s)
- Haley E Randolph
  - Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, Illinois, USA
  - Department of Pediatrics, Columbia University Irving Medical Center, New York, New York, USA
- Yen-Lung Lin
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois, USA
- Zepeng Mu
  - Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, Illinois, USA
- Luis B Barreiro
  - Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, Illinois, USA
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, USA
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois, USA
  - Committee on Immunology, University of Chicago, Chicago, Illinois, USA
4. Farbehi N, Neavin DR, Cuomo ASE, Studer L, MacArthur DG, Powell JE. Integrating population genetics, stem cell biology and cellular genomics to study complex human diseases. Nat Genet 2024; 56:758-766. [PMID: 38741017 DOI: 10.1038/s41588-024-01731-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Accepted: 03/20/2024] [Indexed: 05/16/2024]
Abstract
Human pluripotent stem (hPS) cells can, in theory, be differentiated into any cell type, making them a powerful in vitro model for human biology. Recent technological advances have facilitated large-scale hPS cell studies that allow investigation of the genetic regulation of molecular phenotypes and their contribution to high-order phenotypes such as human disease. Integrating hPS cells with single-cell sequencing makes identifying context-dependent genetic effects during cell development or upon experimental manipulation possible. Here we discuss how the intersection of stem cell biology, population genetics and cellular genomics can help resolve the functional consequences of human genetic variation. We examine the critical challenges of integrating these fields and approaches to scaling them cost-effectively and practically. We highlight two areas of human biology that can particularly benefit from population-scale hPS cell studies, elucidating mechanisms underlying complex disease risk loci and evaluating relationships between common genetic variation and pharmacotherapeutic phenotypes.
Affiliation(s)
- Nona Farbehi
  - Garvan Weizmann Center for Cellular Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
  - Graduate School of Biomedical Engineering, University of New South Wales, Sydney, New South Wales, Australia
  - Aligning Science Across Parkinson's Collaborative Research Network, Chevy Chase, MD, USA
- Drew R Neavin
  - Garvan Weizmann Center for Cellular Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Anna S E Cuomo
  - Garvan Weizmann Center for Cellular Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
  - Centre for Population Genomics, Garvan Institute of Medical Research, University of New South Wales, Sydney, New South Wales, Australia
- Lorenz Studer
  - Aligning Science Across Parkinson's Collaborative Research Network, Chevy Chase, MD, USA
  - The Center for Stem Cell Biology and Developmental Biology Program, Sloan-Kettering Institute for Cancer Research, New York, NY, USA
- Daniel G MacArthur
  - Centre for Population Genomics, Garvan Institute of Medical Research, University of New South Wales, Sydney, New South Wales, Australia
  - Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
- Joseph E Powell
  - Garvan Weizmann Center for Cellular Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
  - Aligning Science Across Parkinson's Collaborative Research Network, Chevy Chase, MD, USA
  - UNSW Cellular Genomics Futures Institute, University of New South Wales, Sydney, New South Wales, Australia
5. Jeong R, Bulyk ML. Chromatin accessibility variation provides insights into missing regulation underlying immune-mediated diseases. bioRxiv 2024:2024.04.12.589213. [PMID: 38659802 PMCID: PMC11042205 DOI: 10.1101/2024.04.12.589213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Most genetic loci associated with complex traits and diseases through genome-wide association studies (GWAS) are noncoding, suggesting that the causal variants likely have gene regulatory effects. However, only a small number of loci have been linked to expression quantitative trait loci (eQTLs) detected currently. To better understand the potential reasons for many trait-associated loci lacking eQTL colocalization, we investigated whether chromatin accessibility QTLs (caQTLs) in lymphoblastoid cell lines (LCLs) explain immune-mediated disease associations that eQTLs in LCLs did not. The power to detect caQTLs was greater than that of eQTLs and was less affected by the distance from the transcription start site of the associated gene. Meta-analyzing LCL eQTL data to increase the sample size to over a thousand led to additional loci with eQTL colocalization, demonstrating that insufficient statistical power is still likely to be a factor. Moreover, further eQTL colocalization loci were uncovered by surveying eQTLs of other immune cell types. Altogether, insufficient power and context-specificity of eQTLs both contribute to the 'missing regulation.'
Affiliation(s)
- Raehoon Jeong
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
  - Bioinformatics and Integrative Genomics Graduate Program, Harvard University, Cambridge, MA 02138, USA
- Martha L. Bulyk
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
  - Bioinformatics and Integrative Genomics Graduate Program, Harvard University, Cambridge, MA 02138, USA
  - Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
6. Natri HM, Del Azodi CB, Peter L, Taylor CJ, Chugh S, Kendle R, Chung MI, Flaherty DK, Matlock BK, Calvi CL, Blackwell TS, Ware LB, Bacchetta M, Walia R, Shaver CM, Kropski JA, McCarthy DJ, Banovich NE. Cell-type-specific and disease-associated expression quantitative trait loci in the human lung. Nat Genet 2024; 56:595-604. [PMID: 38548990 PMCID: PMC11018522 DOI: 10.1038/s41588-024-01702-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 02/28/2024] [Indexed: 04/04/2024]
Abstract
Common genetic variants confer substantial risk for chronic lung diseases, including pulmonary fibrosis. Defining the genetic control of gene expression in a cell-type-specific and context-dependent manner is critical for understanding the mechanisms through which genetic variation influences complex traits and disease pathobiology. To this end, we performed single-cell RNA sequencing of lung tissue from 66 individuals with pulmonary fibrosis and 48 unaffected donors. Using a pseudobulk approach, we mapped expression quantitative trait loci (eQTLs) across 38 cell types, observing both shared and cell-type-specific regulatory effects. Furthermore, we identified disease interaction eQTLs and demonstrated that this class of associations is more likely to be cell-type-specific and linked to cellular dysregulation in pulmonary fibrosis. Finally, we connected lung disease risk variants to their regulatory targets in disease-relevant cell types. These results indicate that cellular context determines the impact of genetic variation on gene expression and implicates context-specific eQTLs as key regulators of lung homeostasis and disease.
Affiliation(s)
- Heini M Natri
  - Translational Genomics Research Institute, Phoenix, AZ, USA
- Christina B Del Azodi
  - St. Vincent's Institute of Medical Research, Melbourne, Victoria, Australia
  - Melbourne Integrative Genomics, University of Melbourne, Melbourne, Victoria, Australia
- Lance Peter
  - Translational Genomics Research Institute, Phoenix, AZ, USA
- Chase J Taylor
  - Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Sagrika Chugh
  - St. Vincent's Institute of Medical Research, Melbourne, Victoria, Australia
  - Melbourne Integrative Genomics, University of Melbourne, Melbourne, Victoria, Australia
  - School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Melbourne, Victoria, Australia
- Robert Kendle
  - Translational Genomics Research Institute, Phoenix, AZ, USA
- Mei-I Chung
  - Translational Genomics Research Institute, Phoenix, AZ, USA
- David K Flaherty
  - Flow Cytometry Shared Resource, Vanderbilt University Medical Center, Nashville, TN, USA
- Brittany K Matlock
  - Flow Cytometry Shared Resource, Vanderbilt University Medical Center, Nashville, TN, USA
- Carla L Calvi
  - Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Timothy S Blackwell
  - Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
  - Department of Veterans Affairs Medical Center, Nashville, TN, USA
- Lorraine B Ware
  - Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Pathology, Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, TN, USA
- Matthew Bacchetta
  - Department of Cardiac Surgery, Vanderbilt University Medical Center, Nashville, TN, USA
- Rajat Walia
  - Department of Thoracic Disease and Transplantation, Norton Thoracic Institute, Phoenix, AZ, USA
- Ciara M Shaver
  - Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Jonathan A Kropski
  - Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
  - Department of Veterans Affairs Medical Center, Nashville, TN, USA
- Davis J McCarthy
  - St. Vincent's Institute of Medical Research, Melbourne, Victoria, Australia
  - Melbourne Integrative Genomics, University of Melbourne, Melbourne, Victoria, Australia
  - School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Melbourne, Victoria, Australia
7. Lappalainen T, Li YI, Ramachandran S, Gusev A. Genetic and molecular architecture of complex traits. Cell 2024; 187:1059-1075. [PMID: 38428388 DOI: 10.1016/j.cell.2024.01.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/20/2023] [Accepted: 01/16/2024] [Indexed: 03/03/2024]
Abstract
Human genetics has emerged as one of the most dynamic areas of biology, with a broadening societal impact. In this review, we discuss recent achievements, ongoing efforts, and future challenges in the field. Advances in technology, statistical methods, and the growing scale of research efforts have all provided many insights into the processes that have given rise to the current patterns of genetic variation. Vast maps of genetic associations with human traits and diseases have allowed characterization of their genetic architecture. Finally, studies of molecular and cellular effects of genetic variants have provided insights into biological processes underlying disease. Many outstanding questions remain, but the field is well poised for groundbreaking discoveries as it increases the use of genetic data to understand both the history of our species and its applications to improve human health.
Affiliation(s)
- Tuuli Lappalainen
  - New York Genome Center, New York, NY, USA; Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Yang I Li
  - Section of Genetic Medicine, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Sohini Ramachandran
  - Ecology, Evolution and Organismal Biology, Center for Computational Molecular Biology, and the Data Science Institute, Brown University, Providence, RI 02912, USA
- Alexander Gusev
  - Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
8. McIntire E, Barr KA, Gonzales NM, Gilad Y. Guided Differentiation of Pluripotent Stem Cells for Cardiac Cell Diversity. bioRxiv 2024:2023.07.21.550072. [PMID: 37502898 PMCID: PMC10370173 DOI: 10.1101/2023.07.21.550072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
We have developed a guided differentiation protocol for induced pluripotent stem cells (iPSCs) that rapidly generates a temporally and functionally diverse set of cardiac-relevant cell types. By leveraging techniques used in embryoid body and cardiac organoid generation, we produce both progenitor and terminal cardiac cell types concomitantly in just 10 days. Our results show that guided differentiation generates functionally relevant cardiac cell types that closely align with the transcriptional profiles of cells from differentiation time-course collections, mature cardiac organoids, and in vivo heart tissue. Guided differentiation prioritizes simplicity by minimizing the number of reagents and steps required, thereby enabling rapid and cost-effective experimental throughput. We expect this approach will provide a scalable cardiac model for population-level studies of gene regulatory variation and gene-by-environment interactions.
Affiliation(s)
- Erik McIntire
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Kenneth A Barr
  - Department of Medicine, University of Chicago, Chicago, Illinois, United States of America
- Natalia M Gonzales
  - Department of Medicine, University of Chicago, Chicago, Illinois, United States of America
- Yoav Gilad
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
  - Department of Medicine, University of Chicago, Chicago, Illinois, United States of America
9. Matthews ER, Johnson OD, Horn KJ, Gutiérrez JA, Powell SR, Ward MC. Anthracyclines induce cardiotoxicity through a shared gene expression response signature. PLoS Genet 2024; 20:e1011164. [PMID: 38416769 PMCID: PMC10927150 DOI: 10.1371/journal.pgen.1011164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Revised: 03/11/2024] [Accepted: 01/31/2024] [Indexed: 03/01/2024]
Abstract
TOP2 inhibitors (TOP2i) are effective drugs for breast cancer treatment. However, they can cause cardiotoxicity in some women. The most widely used TOP2i include anthracyclines (AC) Doxorubicin (DOX), Daunorubicin (DNR), Epirubicin (EPI), and the anthraquinone Mitoxantrone (MTX). It is unclear whether women would experience the same adverse effects from all drugs in this class, or if specific drugs would be preferable for certain individuals based on their cardiotoxicity risk profile. To investigate this, we studied the effects of treatment of DOX, DNR, EPI, MTX, and an unrelated monoclonal antibody Trastuzumab (TRZ) on iPSC-derived cardiomyocytes (iPSC-CMs) from six healthy females. All TOP2i induce cell death at concentrations observed in cancer patient serum, while TRZ does not. A sub-lethal dose of all TOP2i induces limited cellular stress but affects calcium handling, a function critical for cardiomyocyte contraction. TOP2i induce thousands of gene expression changes over time, giving rise to four distinct gene expression response signatures, denoted as TOP2i early-acute, early-sustained, and late response genes, and non-response genes. There is no drug- or AC-specific signature. TOP2i early response genes are enriched in chromatin regulators, which mediate AC sensitivity across breast cancer patients. However, there is increased transcriptional variability between individuals following AC treatments. To investigate potential genetic effects on response variability, we first identified a reported set of expression quantitative trait loci (eQTLs) uncovered following DOX treatment in iPSC-CMs. Indeed, DOX response eQTLs are enriched in genes that respond to all TOP2i. Next, we identified 38 genes in loci associated with AC toxicity by GWAS or TWAS. Two thirds of the genes that respond to at least one TOP2i, respond to all ACs with the same direction of effect. 
Our data demonstrate that TOP2i induce thousands of shared gene expression changes in cardiomyocytes, including genes near SNPs associated with inter-individual variation in response to DOX treatment and AC-induced cardiotoxicity.
Affiliation(s)
- E. Renee Matthews
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, United States of America
- Omar D. Johnson
  - Biochemistry, Cellular and Molecular Biology Graduate Program, University of Texas Medical Branch, Galveston, Texas, United States of America
- Kandace J. Horn
  - John Sealy School of Medicine, University of Texas Medical Branch, Galveston, Texas, United States of America
- José A. Gutiérrez
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, United States of America
- Simon R. Powell
  - Neuroscience Graduate Program, University of Texas Medical Branch, Galveston, Texas, United States of America
- Michelle C. Ward
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, United States of America
10. Yang Y, Yang R, Kang B, Qian S, He X, Zhang X. Single-cell long-read sequencing in human cerebral organoids uncovers cell-type-specific and autism-associated exons. Cell Rep 2023; 42:113335. [PMID: 37889749 PMCID: PMC10842930 DOI: 10.1016/j.celrep.2023.113335] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/03/2023] [Revised: 09/12/2023] [Accepted: 10/09/2023] [Indexed: 10/29/2023]
Abstract
Dysregulation of alternative splicing has been repeatedly associated with neurodevelopmental disorders, but the extent of cell-type-specific splicing in human neural development remains largely uncharted. Here, single-cell long-read sequencing in induced pluripotent stem cell (iPSC)-derived cerebral organoids identifies over 31,000 uncatalogued isoforms and 4,531 cell-type-specific splicing events. Long reads uncover coordinated splicing and cell-type-specific intron retention events, which are challenging to study with short reads. Retained neuronal introns are enriched in RNA splicing regulators, showing shorter lengths, higher GC contents, and weaker 5' splice sites. We use this dataset to explore the biological processes underlying neurological disorders, focusing on autism. In comparison with prior transcriptomic data, we find that the splicing program in autistic brains is closer to the progenitor state than differentiated neurons. Furthermore, cell-type-specific exons harbor significantly more de novo mutations in autism probands than in siblings. Overall, these results highlight the importance of cell-type-specific splicing in autism and neuronal gene regulation.
Affiliation(s)
- Yalan Yang
  - Department of Human Genetics, Neuroscience Institute, The University of Chicago, Chicago, IL 60637, USA
- Runwei Yang
  - Department of Human Genetics, Neuroscience Institute, The University of Chicago, Chicago, IL 60637, USA
- Bowei Kang
  - Department of Human Genetics, Neuroscience Institute, The University of Chicago, Chicago, IL 60637, USA
- Sheng Qian
  - Department of Human Genetics, Neuroscience Institute, The University of Chicago, Chicago, IL 60637, USA
- Xin He
  - Department of Human Genetics, Neuroscience Institute, The University of Chicago, Chicago, IL 60637, USA
- Xiaochang Zhang
  - Department of Human Genetics, Neuroscience Institute, The University of Chicago, Chicago, IL 60637, USA
11. Gilmore RB, Liu Y, Stoddard CE, Chung MS, Carmichael GG, Cotney J. Identifying key underlying regulatory networks and predicting targets of orphan C/D box SNORD116 snoRNAs in Prader-Willi syndrome. bioRxiv 2023:2023.10.03.560773. [PMID: 37873184 PMCID: PMC10592975 DOI: 10.1101/2023.10.03.560773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Prader-Willi syndrome (PWS) is a rare neurodevelopmental disorder characterized principally by initial symptoms of neonatal hypotonia and failure-to-thrive in infancy, followed by hyperphagia and obesity. It is well established that PWS is caused by loss of paternal expression of the imprinted region on chromosome 15q11-q13. While most PWS cases exhibit megabase-scale deletions of the paternal chromosome 15q11-q13 allele, several PWS patients have been identified harboring a much smaller deletion encompassing primarily SNORD116. This finding suggests SNORD116 is a direct driver of PWS phenotypes. The SNORD116 gene cluster is composed of 30 copies of individual SNORD116 C/D box small nucleolar RNAs (snoRNAs). Many C/D box snoRNAs have been shown to guide chemical modifications of other RNA molecules, often ribosomal RNA (rRNA). However, SNORD116 snoRNAs are termed 'orphans' because no verified targets have been identified and their sequences show no significant complementarity to rRNA. It is crucial to identify the targets and functions of SNORD116 snoRNAs because all reported PWS cases lack their expression. To address this, we engineered two different deletions modelling PWS in two distinct human embryonic stem cell (hESC) lines to control for effects of genetic background. Utilizing an inducible expression system enabled quick, reproducible differentiation of these lines into neurons. Systematic comparisons of neuronal gene expression across deletion types and genetic backgrounds revealed a novel list of 42 consistently dysregulated genes. Employing the recently described computational tool snoGloBe, we discovered these dysregulated genes are significantly enriched for predicted SNORD116 targeting versus multiple control analyses. Importantly, our results showed it is critical to use multiple isogenic cell line pairs, as this eliminated many spuriously differentially expressed genes. 
Our results indicate a novel gene regulatory network controlled by SNORD116 is likely perturbed in PWS patients.
Affiliation(s)
- Rachel B. Gilmore
  - Department of Genetics and Genome Sciences, University of Connecticut School of Medicine, Farmington, CT, USA
- Yaling Liu
  - Department of Genetics and Genome Sciences, University of Connecticut School of Medicine, Farmington, CT, USA
- Christopher E. Stoddard
  - Department of Genetics and Genome Sciences, University of Connecticut School of Medicine, Farmington, CT, USA
- Michael S. Chung
  - Department of Genetics and Genome Sciences, University of Connecticut School of Medicine, Farmington, CT, USA
- Gordon G. Carmichael
  - Department of Genetics and Genome Sciences, University of Connecticut School of Medicine, Farmington, CT, USA
- Justin Cotney
  - Department of Genetics and Genome Sciences, University of Connecticut School of Medicine, Farmington, CT, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
12. Pollen AA, Kilik U, Lowe CB, Camp JG. Human-specific genetics: new tools to explore the molecular and cellular basis of human evolution. Nat Rev Genet 2023; 24:687-711. [PMID: 36737647 PMCID: PMC9897628 DOI: 10.1038/s41576-022-00568-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Accepted: 12/08/2022] [Indexed: 02/05/2023]
Abstract
Our ancestors acquired morphological, cognitive and metabolic modifications that enabled humans to colonize diverse habitats, develop extraordinary technologies and reshape the biosphere. Understanding the genetic, developmental and molecular bases for these changes will provide insights into how we became human. Connecting human-specific genetic changes to species differences has been challenging owing to an abundance of low-effect size genetic changes, limited descriptions of phenotypic differences across development at the level of cell types and lack of experimental models. Emerging approaches for single-cell sequencing, genetic manipulation and stem cell culture now support descriptive and functional studies in defined cell types with a human or ape genetic background. In this Review, we describe how the sequencing of genomes from modern and archaic hominins, great apes and other primates is revealing human-specific genetic changes and how new molecular and cellular approaches - including cell atlases and organoids - are enabling exploration of the candidate causal factors that underlie human-specific traits.
Affiliation(s)
- Alex A Pollen
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA.
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA.
- Umut Kilik
- Institute of Human Biology (IHB), Roche Pharma Research and Early Development, Roche Innovation Center Basel, Basel, Switzerland
- University of Basel, Basel, Switzerland
- Craig B Lowe
- Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, NC, USA.
- J Gray Camp
- Institute of Human Biology (IHB), Roche Pharma Research and Early Development, Roche Innovation Center Basel, Basel, Switzerland.
- University of Basel, Basel, Switzerland.
|
13
|
Smullen M, Olson MN, Reichert JM, Dawes P, Murray LF, Baer CE, Wang Q, Readhead B, Church GM, Lim ET, Chan Y. Reliable multiplex generation of pooled induced pluripotent stem cells. Cell Rep Methods 2023; 3:100570. [PMID: 37751688 PMCID: PMC10545906 DOI: 10.1016/j.crmeth.2023.100570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Revised: 06/23/2023] [Accepted: 08/04/2023] [Indexed: 09/28/2023]
Abstract
Reprogramming somatic cells into pluripotent stem cells (iPSCs) enables the study of systems in vitro. To increase the throughput of reprogramming, we present induction of pluripotency from pooled cells (iPPC)-an efficient, scalable, and reliable reprogramming procedure. Using our deconvolution algorithm that employs pooled sequencing of single-nucleotide polymorphisms (SNPs), we accurately estimated individual donor proportions of the pooled iPSCs. With iPPC, we concurrently reprogrammed over one hundred donor lymphoblastoid cell lines (LCLs) into iPSCs and found strong correlations of individual donors' reprogramming ability across multiple experiments. Individual donors' reprogramming ability remains consistent across both same-day replicates and multiple experimental runs, and the expression of certain immunoglobulin precursor genes may impact reprogramming ability. The pooled iPSCs were also able to differentiate into cerebral organoids. Our procedure enables a multiplex framework of using pooled libraries of donor iPSCs for downstream research and investigation of in vitro phenotypes.
Affiliation(s)
- Molly Smullen
- Department of Neurology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; NeuroNexus Institute, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Meagan N Olson
- Department of Neurology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; NeuroNexus Institute, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Julia M Reichert
- Department of Neurology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; NeuroNexus Institute, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Pepper Dawes
- Department of Neurology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; NeuroNexus Institute, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Liam F Murray
- Department of Neurology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; NeuroNexus Institute, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Christina E Baer
- Department of Microbiology and Physiological Systems, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Qi Wang
- ASU-Banner Neurodegenerative Disease Research Center, Arizona State University, Tempe, AZ 85281, USA
- Benjamin Readhead
- ASU-Banner Neurodegenerative Disease Research Center, Arizona State University, Tempe, AZ 85281, USA
- George M Church
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
- Elaine T Lim
- Department of Neurology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; NeuroNexus Institute, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Yingleong Chan
- Department of Neurology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA; NeuroNexus Institute, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA.
|
14
|
Barr KA, Rhodes KL, Gilad Y. The relationship between regulatory changes in cis and trans and the evolution of gene expression in humans and chimpanzees. Genome Biol 2023; 24:207. [PMID: 37697401 PMCID: PMC10496171 DOI: 10.1186/s13059-023-03019-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/22/2022] [Accepted: 07/21/2023] [Indexed: 09/13/2023]
Abstract
BACKGROUND: Comparative gene expression studies in apes are fundamentally limited by the challenges associated with sampling across different tissues. Here, we used single-cell RNA sequencing of embryoid bodies to collect transcriptomic data from over 70 cell types in three humans and three chimpanzees. RESULTS: We find hundreds of genes whose regulation is conserved across cell types, as well as genes whose regulation likely evolves under directional selection in one or a handful of cell types. Using embryoid bodies from a human-chimpanzee fused cell line, we also infer the proportion of inter-species regulatory differences due to changes in cis and trans elements between the species. Using the cis/trans inference and an analysis of transcription factor binding sites, we identify dozens of transcription factors whose inter-species differences in expression are affecting expression differences between humans and chimpanzees in hundreds of target genes. CONCLUSIONS: Here, we present the most comprehensive dataset of comparative gene expression from humans and chimpanzees to date, including a catalog of regulatory mechanisms associated with inter-species differences.
Affiliation(s)
- Kenneth A Barr
- Department of Medicine, University of Chicago, Chicago, IL, 60637, USA
- Yoav Gilad
- Department of Medicine, University of Chicago, Chicago, IL, 60637, USA.
- Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA.
|
15
|
Park S, Gwon Y, Khan SA, Jang KJ, Kim J. Engineering considerations of iPSC-based personalized medicine. Biomater Res 2023; 27:67. [PMID: 37420273 DOI: 10.1186/s40824-023-00382-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Accepted: 04/19/2023] [Indexed: 07/09/2023]
Abstract
Personalized medicine aims to provide tailored medical treatment that considers the clinical, genetic, and environmental characteristics of patients. iPSCs have attracted considerable attention in the field of personalized medicine; however, the inherent limitations of iPSCs prevent their widespread use in clinical applications. That is, it would be important to develop notable engineering strategies to overcome the current limitations of iPSCs. Such engineering approaches could lead to significant advances in iPSC-based personalized therapy by offering innovative solutions to existing challenges, from iPSC preparation to clinical applications. In this review, we summarize how engineering strategies have been used to advance iPSC-based personalized medicine by categorizing the development process into three distinctive steps: 1) the production of therapeutic iPSCs; 2) engineering of therapeutic iPSCs; and 3) clinical applications of engineered iPSCs. Specifically, we focus on engineering strategies and their implications for each step in the development of iPSC-based personalized medicine.
Affiliation(s)
- Sangbae Park
- Department of Convergence Biosystems Engineering, Chonnam National University, Gwangju, 61186, Republic of Korea
- Department of Rural and Biosystems Engineering, Chonnam National University, Gwangju, 61186, Republic of Korea
- Interdisciplinary Program in IT-Bio Convergence System, Chonnam National University, Gwangju, 61186, Republic of Korea
- Institute of Nano-Stem Cells Therapeutics, NANOBIOSYSTEM Co, Ltd, Gwangju, 61011, Republic of Korea
- Yonghyun Gwon
- Department of Convergence Biosystems Engineering, Chonnam National University, Gwangju, 61186, Republic of Korea
- Department of Rural and Biosystems Engineering, Chonnam National University, Gwangju, 61186, Republic of Korea
- Interdisciplinary Program in IT-Bio Convergence System, Chonnam National University, Gwangju, 61186, Republic of Korea
- Shahidul Ahmed Khan
- Department of Convergence Biosystems Engineering, Chonnam National University, Gwangju, 61186, Republic of Korea
- Department of Rural and Biosystems Engineering, Chonnam National University, Gwangju, 61186, Republic of Korea
- Interdisciplinary Program in IT-Bio Convergence System, Chonnam National University, Gwangju, 61186, Republic of Korea
- Kyoung-Je Jang
- Department of Bio-Systems Engineering, Institute of Smart Farm, Gyeongsang National University, Jinju, 52828, Republic of Korea.
- Institute of Agriculture & Life Science, Gyeongsang National University, Jinju, 52828, Republic of Korea.
- Jangho Kim
- Department of Convergence Biosystems Engineering, Chonnam National University, Gwangju, 61186, Republic of Korea.
- Department of Rural and Biosystems Engineering, Chonnam National University, Gwangju, 61186, Republic of Korea.
- Interdisciplinary Program in IT-Bio Convergence System, Chonnam National University, Gwangju, 61186, Republic of Korea.
- Institute of Nano-Stem Cells Therapeutics, NANOBIOSYSTEM Co, Ltd, Gwangju, 61011, Republic of Korea.
|
16
|
Balcı AT, Ebeid MM, Benos PV, Kostka D, Chikina M. An intrinsically interpretable neural network architecture for sequence-to-function learning. Bioinformatics 2023; 39:i413-i422. [PMID: 37387140 PMCID: PMC10311317 DOI: 10.1093/bioinformatics/btad271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023]
Abstract
MOTIVATION: Sequence-based deep learning approaches have been shown to predict a multitude of functional genomic readouts, including regions of open chromatin and RNA expression of genes. However, a major limitation of current methods is that model interpretation relies on computationally demanding post hoc analyses, and even then, one can often not explain the internal mechanics of highly parameterized models. Here, we introduce a deep learning architecture called totally interpretable sequence-to-function model (tiSFM). tiSFM improves upon the performance of standard multilayer convolutional models while using fewer parameters. Additionally, while tiSFM is itself technically a multilayer neural network, internal model parameters are intrinsically interpretable in terms of relevant sequence motifs. RESULTS: We analyze published open chromatin measurements across hematopoietic lineage cell-types and demonstrate that tiSFM outperforms a state-of-the-art convolutional neural network model custom-tailored to this dataset. We also show that it correctly identifies context-specific activities of transcription factors with known roles in hematopoietic differentiation, including Pax5 and Ebf1 for B-cells, and Rorc for innate lymphoid cells. tiSFM's model parameters have biologically meaningful interpretations, and we show the utility of our approach on a complex task of predicting the change in epigenetic state as a function of developmental transition. AVAILABILITY AND IMPLEMENTATION: The source code, including scripts for the analysis of key findings, can be found at https://github.com/boooooogey/ATAConv, implemented in Python.
Affiliation(s)
- Ali Tuğrul Balcı
- Joint Carnegie Mellon University-University of Pittsburgh Program in Computational Biology, Pittsburgh, PA 15213, United States
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15213, United States
- Mark Maher Ebeid
- Joint Carnegie Mellon University-University of Pittsburgh Program in Computational Biology, Pittsburgh, PA 15213, United States
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15213, United States
- Panayiotis V Benos
- Department of Epidemiology, University of Florida, Gainesville, FL 32610, United States
- Dennis Kostka
- Joint Carnegie Mellon University-University of Pittsburgh Program in Computational Biology, Pittsburgh, PA 15213, United States
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15213, United States
- Department of Developmental Biology, University of Pittsburgh, Pittsburgh, PA 15213, United States
- Maria Chikina
- Joint Carnegie Mellon University-University of Pittsburgh Program in Computational Biology, Pittsburgh, PA 15213, United States
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15213, United States
|
17
|
Natri HM, Del Azodi CB, Peter L, Taylor CJ, Chugh S, Kendle R, Chung MI, Flaherty DK, Matlock BK, Calvi CL, Blackwell TS, Ware LB, Bacchetta M, Walia R, Shaver CM, Kropski JA, McCarthy DJ, Banovich NE. Cell type-specific and disease-associated eQTL in the human lung. bioRxiv 2023:2023.03.17.533161. [PMID: 36993211 PMCID: PMC10055257 DOI: 10.1101/2023.03.17.533161] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 03/31/2023]
Abstract
Common genetic variants confer substantial risk for chronic lung diseases, including pulmonary fibrosis (PF). Defining the genetic control of gene expression in a cell-type-specific and context-dependent manner is critical for understanding the mechanisms through which genetic variation influences complex traits and disease pathobiology. To this end, we performed single-cell RNA-sequencing of lung tissue from 67 PF and 49 unaffected donors. Employing a pseudo-bulk approach, we mapped expression quantitative trait loci (eQTL) across 38 cell types, observing both shared and cell type-specific regulatory effects. Further, we identified disease-interaction eQTL and demonstrated that this class of associations is more likely to be cell-type specific and linked to cellular dysregulation in PF. Finally, we connected PF risk variants to their regulatory targets in disease-relevant cell types. These results indicate that cellular context determines the impact of genetic variation on gene expression, and implicates context-specific eQTL as key regulators of lung homeostasis and disease.
|
18
|
Ullah F, Jabeen S, Salton M, Reddy ASN, Ben-Hur A. Evidence for the role of transcription factors in the co-transcriptional regulation of intron retention. Genome Biol 2023; 24:53. [PMID: 36949544 PMCID: PMC10031921 DOI: 10.1186/s13059-023-02885-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 01/03/2022] [Accepted: 02/16/2023] [Indexed: 03/24/2023]
Abstract
BACKGROUND: Alternative splicing is a widespread regulatory phenomenon that enables a single gene to produce multiple transcripts. Among the different types of alternative splicing, intron retention is one of the least explored despite its high prevalence in both plants and animals. The recent discovery that the majority of splicing is co-transcriptional has led to the finding that chromatin state affects alternative splicing. Therefore, it is plausible that transcription factors can regulate splicing outcomes. RESULTS: We provide evidence for the hypothesis that transcription factors are involved in the regulation of intron retention by studying regions of open chromatin in retained and excised introns. Using deep learning models designed to distinguish between regions of open chromatin in retained introns and non-retained introns, we identified motifs enriched in IR events with significant hits to known human transcription factors. Our model predicts that the majority of transcription factors that affect intron retention come from the zinc finger family. We demonstrate the validity of these predictions using ChIP-seq data for multiple zinc finger transcription factors and find strong over-representation for their peaks in intron retention events. CONCLUSIONS: This work opens up opportunities for further studies that elucidate the mechanisms by which transcription factors affect intron retention and other forms of splicing. AVAILABILITY: Source code available at https://github.com/fahadahaf/chromir.
Affiliation(s)
- Fahad Ullah
- Department of Computer Science, Colorado State University, Fort Collins, CO, USA
- Saira Jabeen
- Department of Computer Science, Colorado State University, Fort Collins, CO, USA
- Maayan Salton
- Department of Biology, Colorado State University, Fort Collins, CO, USA
- Anireddy S N Reddy
- Biochemistry and Molecular Biology Department, The Hebrew University Faculty of Medicine, Jerusalem, Israel
- Asa Ben-Hur
- Department of Computer Science, Colorado State University, Fort Collins, CO, USA.
|
19
|
D'Antonio M, Nguyen JP, Arthur TD, Matsui H, D'Antonio-Chronowska A, Frazer KA. Fine mapping spatiotemporal mechanisms of genetic variants underlying cardiac traits and disease. Nat Commun 2023; 14:1132. [PMID: 36854752 PMCID: PMC9975214 DOI: 10.1038/s41467-023-36638-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Accepted: 02/10/2023] [Indexed: 03/02/2023]
Abstract
The causal variants and genes underlying thousands of cardiac GWAS signals have yet to be identified. Here, we leverage spatiotemporal information on 966 RNA-seq cardiac samples and perform an expression quantitative trait locus (eQTL) analysis detecting eQTLs considering both eGenes and eIsoforms. We identify 2,578 eQTLs associated with a specific developmental stage-, tissue- and/or cell type. Colocalization between eQTL and GWAS signals of five cardiac traits identified variants with high posterior probabilities for being causal in 210 GWAS loci. Pulse pressure GWAS loci are enriched for colocalization with fetal- and smooth muscle- eQTLs; pulse rate with adult- and cardiac muscle- eQTLs; and atrial fibrillation with cardiac muscle- eQTLs. Fine mapping identifies 79 credible sets with five or fewer SNPs, of which 15 were associated with spatiotemporal eQTLs. Our study shows that many cardiac GWAS variants impact traits and disease in a developmental stage-, tissue- and/or cell type-specific fashion.
Affiliation(s)
- Matteo D'Antonio
- Department of Pediatrics, University of California San Diego, La Jolla, CA, 92093, USA.
- Division of Biomedical Informatics, University of California, San Diego, La Jolla, CA, 92093, USA.
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA.
- Jennifer P Nguyen
- Division of Biomedical Informatics, University of California, San Diego, La Jolla, CA, 92093, USA
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA, 92093, USA
- Timothy D Arthur
- Division of Biomedical Informatics, University of California, San Diego, La Jolla, CA, 92093, USA
- Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA, 92093, USA
- Hiroko Matsui
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Kelly A Frazer
- Department of Pediatrics, University of California San Diego, La Jolla, CA, 92093, USA.
- Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA.
|
20
|
Tsujimoto H, Katagiri N, Ijiri Y, Sasaki B, Kobayashi Y, Mima A, Ryosaka M, Furuyama K, Kawaguchi Y, Osafune K. In vitro methods to ensure absence of residual undifferentiated human induced pluripotent stem cells intermingled in induced nephron progenitor cells. PLoS One 2022; 17:e0275600. [PMID: 36378656 PMCID: PMC9665373 DOI: 10.1371/journal.pone.0275600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Accepted: 09/20/2022] [Indexed: 11/16/2022]
Abstract
Cell therapies using human induced pluripotent stem cell (hiPSC)-derived nephron progenitor cells (NPCs) are expected to ameliorate acute kidney injury (AKI). However, using hiPSC-derived NPCs clinically is a challenge because hiPSCs themselves are tumorigenic. LIN28A, ESRG, CNMD and SFRP2 transcripts have been used as a marker of residual hiPSCs for a variety of cell types undergoing clinical trials. In this study, by reanalyzing public databases, we found a baseline expression of LIN28A, ESRG, CNMD and SFRP2 in hiPSC-derived NPCs and several other cell types, suggesting LIN28A, ESRG, CNMD and SFRP2 are not always reliable markers for iPSC detection. As an alternative, we discovered a lncRNA marker gene, MIR302CHG, among many known and unknown iPSC markers, as highly differentially expressed between hiPSCs and NPCs, by RNA sequencing and quantitative RT-PCR (qRT-PCR) analyses. Using MIR302CHG as an hiPSC marker, we constructed two assay methods, a combination of magnetic bead-based enrichment and qRT-PCR and digital droplet PCR alone, to detect a small number of residual hiPSCs in NPC populations. The use of these in vitro assays could contribute to patient safety in treatments using hiPSC-derived cells.
Affiliation(s)
- Hiraku Tsujimoto
- Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto, Japan
- Rege Nephro Co., Ltd., Med-Pharm Collaboration Building, Kyoto University, Kyoto, Japan
- * E-mail: (KO); (HT)
- Naoko Katagiri
- Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto, Japan
- Rege Nephro Co., Ltd., Med-Pharm Collaboration Building, Kyoto University, Kyoto, Japan
- Yoshihiro Ijiri
- Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto, Japan
- Rege Nephro Co., Ltd., Med-Pharm Collaboration Building, Kyoto University, Kyoto, Japan
- Ben Sasaki
- Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto, Japan
- Yoshifumi Kobayashi
- Rege Nephro Co., Ltd., Med-Pharm Collaboration Building, Kyoto University, Kyoto, Japan
- Akira Mima
- Rege Nephro Co., Ltd., Med-Pharm Collaboration Building, Kyoto University, Kyoto, Japan
- Makoto Ryosaka
- Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto, Japan
- Rege Nephro Co., Ltd., Med-Pharm Collaboration Building, Kyoto University, Kyoto, Japan
- Kenichiro Furuyama
- Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto, Japan
- Yoshiya Kawaguchi
- Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto, Japan
- Kenji Osafune
- Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto, Japan
- * E-mail: (KO); (HT)
|
21
|
Lamin A/C-dependent chromatin architecture safeguards naïve pluripotency to prevent aberrant cardiovascular cell fate and function. Nat Commun 2022; 13:6663. [PMID: 36333314 PMCID: PMC9636150 DOI: 10.1038/s41467-022-34366-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/24/2022] [Accepted: 10/24/2022] [Indexed: 11/06/2022]
Abstract
Tight control of cell fate choices is crucial for normal development. Here we show that lamin A/C plays a key role in chromatin organization in embryonic stem cells (ESCs), which safeguards naïve pluripotency and ensures proper cell fate choices during cardiogenesis. We report changes in chromatin compaction and localization of cardiac genes in Lmna-/- ESCs resulting in precocious activation of a transcriptional program promoting cardiomyocyte versus endothelial cell fate. This is accompanied by premature cardiomyocyte differentiation, cell cycle withdrawal and abnormal contractility. Gata4 is activated by lamin A/C loss and Gata4 silencing or haploinsufficiency rescues the aberrant cardiovascular cell fate choices induced by lamin A/C deficiency. We uncover divergent functions of lamin A/C in naïve pluripotent stem cells and cardiomyocytes, which have distinct contributions to the transcriptional alterations of patients with LMNA-associated cardiomyopathy. We conclude that disruption of lamin A/C-dependent chromatin architecture in ESCs is a primary event in LMNA loss-of-function cardiomyopathy.
|
22
|
Monosomy X in isogenic human iPSC-derived trophoblast model impacts expression modules preserved in human placenta. Proc Natl Acad Sci U S A 2022; 119:e2211073119. [PMID: 36161909 DOI: 10.1073/pnas.2211073119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Mammalian sex chromosomes encode homologous X/Y gene pairs that were retained on the Y chromosome in males and escape X chromosome inactivation (XCI) in females. Inferred to reflect X/Y pair dosage sensitivity, monosomy X is a leading cause of miscarriage in humans with near full penetrance. This phenotype is shared with many other mammals but not the mouse, which offers sophisticated genetic tools to generate sex chromosomal aneuploidy but also tolerates its developmental impact. To address this critical gap, we generated X-monosomic human induced pluripotent stem cells (hiPSCs) alongside otherwise isogenic euploid controls from male and female mosaic samples. Phased genomic variants in these hiPSC panels enable systematic investigation of X/Y dosage-sensitive features using in vitro models of human development. Here, we demonstrate the utility of these validated hiPSC lines to test how X/Y-linked gene dosage impacts a widely used model for human syncytiotrophoblast development. While these isogenic panels trigger a GATA2/3- and TFAP2A/C-driven trophoblast gene circuit irrespective of karyotype, differential expression implicates monosomy X in altered levels of placental genes and in secretion of placental growth factor (PlGF) and human chorionic gonadotropin (hCG). Remarkably, weighted gene coexpression network modules that significantly reflect these changes are also preserved in first-trimester chorionic villi and term placenta. Our results suggest monosomy X may skew trophoblast cell type composition and function, and that the combined haploinsufficiency of the pseudoautosomal region likely plays a key role in these changes.
|
23
|
Lea AJ, Peng J, Ayroles JF. Diverse environmental perturbations reveal the evolution and context-dependency of genetic effects on gene expression levels. Genome Res 2022; 32:1826-1839. [PMID: 36229124 PMCID: PMC9712631 DOI: 10.1101/gr.276430.121] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/23/2021] [Accepted: 09/07/2022] [Indexed: 01/18/2023]
Abstract
There is increasing appreciation that, in addition to being shaped by an individual's genotype and environment, most complex traits are also determined by poorly understood interactions between these two factors. So-called "genotype × environment" (G×E) interactions remain difficult to map at the organismal level but can be uncovered using molecular phenotypes. To do so at large scale, we used TM3'seq to profile transcriptomes across 12 cellular environments in 544 immortalized B cell lines from the 1000 Genomes Project. We mapped the genetic basis of gene expression levels across environments and revealed a context-dependent genetic architecture: The average heritability of gene expression levels increased in treatment relative to control conditions, and on average, each treatment revealed new expression quantitative trait loci (eQTLs) at 11% of genes. Across our experiments, 22% of all identified eQTLs were context-dependent, and this group was enriched for trait- and disease-associated loci. Further, evolutionary analyses suggested that positive selection has shaped G×E loci involved in responding to immune challenges and hormones but not to man-made chemicals. We hypothesize that this reflects a reduced opportunity for selection to act on responses to molecules recently introduced into human environments. Together, our work highlights the importance of considering an exposure's evolutionary history when studying and interpreting G×E interactions, and provides new insight into the evolutionary mechanisms that maintain G×E loci in human populations.
Affiliation(s)
- Amanda J. Lea
- Department of Ecology and Evolution, Princeton University, Princeton, New Jersey 08544, USA; Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA
- Julie Peng
- Department of Ecology and Evolution, Princeton University, Princeton, New Jersey 08544, USA; Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA
- Julien F. Ayroles
- Department of Ecology and Evolution, Princeton University, Princeton, New Jersey 08544, USA; Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA
| |
Collapse
|
24
|
Brooks IR, Garrone CM, Kerins C, Kiar CS, Syntaka S, Xu JZ, Spagnoli FM, Watt FM. Functional genomics and the future of iPSCs in disease modeling. Stem Cell Reports 2022; 17:1033-1047. [PMID: 35487213 PMCID: PMC9133703 DOI: 10.1016/j.stemcr.2022.03.019] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/08/2021] [Revised: 03/30/2022] [Accepted: 03/31/2022] [Indexed: 10/28/2022] Open
Abstract
Induced pluripotent stem cells (iPSCs) are valuable in disease modeling because of their potential to expand and differentiate into virtually any cell type and recapitulate key aspects of human biology. Functional genomics are genome-wide studies that aim to discover genotype-phenotype relationships, thereby revealing the impact of human genetic diversity on normal and pathophysiology. In this review, we make the case that human iPSCs (hiPSCs) are a powerful tool for functional genomics, since they provide an in vitro platform for the study of population genetics. We describe cutting-edge tools and strategies now available to researchers, including multi-omics technologies, advances in hiPSC culture techniques, and innovations in drug development. Functional genomics approaches based on hiPSCs hold great promise for advancing drug discovery, disease etiology, and the impact of genetic variation on human biology.
Affiliation(s)
- Imogen R Brooks
- St John's Institute of Dermatology, King's College London, London, SE1 9RT, UK
- Cristina M Garrone
- Centre for Gene Therapy and Regenerative Medicine, King's College London, London, SE1 9RT, UK
- Caoimhe Kerins
- Centre for Craniofacial and Regenerative Biology, King's College London, London, SE1 9RT, UK
- Cher Shen Kiar
- Peter Gorer Department of Immunobiology, King's College London, London, SE1 9RT, UK
- Sofia Syntaka
- Centre for Gene Therapy and Regenerative Medicine, King's College London, London, SE1 9RT, UK
- Jessie Z Xu
- Centre for Gene Therapy and Regenerative Medicine, King's College London, London, SE1 9RT, UK
- Francesca M Spagnoli
- Centre for Gene Therapy and Regenerative Medicine, King's College London, London, SE1 9RT, UK
- Fiona M Watt
- Centre for Gene Therapy and Regenerative Medicine, King's College London, London, SE1 9RT, UK; Directors' Research Unit, European Molecular Biology Laboratory, Heidelberg, Germany
25
|
Grandi FC, Modi H, Kampman L, Corces MR. Chromatin accessibility profiling by ATAC-seq. Nat Protoc 2022; 17:1518-1552. [PMID: 35478247 DOI: 10.1038/s41596-022-00692-9] [Citation(s) in RCA: 65] [Impact Index Per Article: 32.5] [Received: 06/01/2021] [Accepted: 02/22/2022] [Indexed: 12/13/2022]
Abstract
The assay for transposase-accessible chromatin using sequencing (ATAC-seq) provides a simple and scalable way to detect the unique chromatin landscape associated with a cell type and how it may be altered by perturbation or disease. ATAC-seq requires a relatively small number of input cells and does not require a priori knowledge of the epigenetic marks or transcription factors governing the dynamics of the system. Here we describe an updated and optimized protocol for ATAC-seq, called Omni-ATAC, that is applicable across a broad range of cell and tissue types. The ATAC-seq workflow has five main steps: sample preparation, transposition, library preparation, sequencing and data analysis. This protocol details the steps to generate and sequence ATAC-seq libraries, with recommendations for sample preparation and downstream bioinformatic analysis. ATAC-seq libraries for roughly 12 samples can be generated in 10 h by someone familiar with basic molecular biology, and downstream sequencing analysis can be implemented using benchmarked pipelines by someone with basic bioinformatics skills and with access to a high-performance computing environment.
Affiliation(s)
- Fiorella C Grandi
- Gladstone Institute of Neurological Disease, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Hailey Modi
- Gladstone Institute of Neurological Disease, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Lucas Kampman
- Gladstone Institute of Neurological Disease, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- M Ryan Corces
- Gladstone Institute of Neurological Disease, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
26
|
Kaplow IM, Schäffer DE, Wirthlin ME, Lawler AJ, Brown AR, Kleyman M, Pfenning AR. Inferring mammalian tissue-specific regulatory conservation by predicting tissue-specific differences in open chromatin. BMC Genomics 2022; 23:291. [PMID: 35410163 PMCID: PMC8996547 DOI: 10.1186/s12864-022-08450-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Accepted: 03/07/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Evolutionary conservation is an invaluable tool for inferring functional significance in the genome, including regions that are crucial across many species and those that have undergone convergent evolution. Computational methods to test for sequence conservation are dominated by algorithms that examine the ability of one or more nucleotides to align across large evolutionary distances. While these nucleotide alignment-based approaches have proven powerful for protein-coding genes and some non-coding elements, they fail to capture conservation of many enhancers, distal regulatory elements that control spatial and temporal patterns of gene expression. The function of enhancers is governed by a complex, often tissue- and cell type-specific code that links combinations of transcription factor binding sites and other regulation-related sequence patterns to regulatory activity. Thus, function of orthologous enhancer regions can be conserved across large evolutionary distances, even when nucleotide turnover is high.
RESULTS: We present a new machine learning-based approach for evaluating enhancer conservation that leverages the combinatorial sequence code of enhancer activity rather than relying on the alignment of individual nucleotides. We first train a convolutional neural network model that can predict tissue-specific open chromatin, a proxy for enhancer activity, across mammals. Next, we apply that model to distinguish instances where the genome sequence would predict conserved function versus a loss of regulatory activity in that tissue. We present criteria for systematically evaluating model performance for this task and use them to demonstrate that our models accurately predict tissue-specific conservation and divergence in open chromatin between primate and rodent species, vastly out-performing leading nucleotide alignment-based approaches. We then apply our models to predict open chromatin at orthologs of brain and liver open chromatin regions across hundreds of mammals and find that brain enhancers associated with neuron activity have a stronger tendency than the general population to have predicted lineage-specific open chromatin.
CONCLUSION: The framework presented here provides a mechanism to annotate tissue-specific regulatory function across hundreds of genomes and to study enhancer evolution using predicted regulatory differences rather than nucleotide-level conservation measurements.
Affiliation(s)
- Irene M Kaplow
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA; Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Daniel E Schäffer
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
- Morgan E Wirthlin
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA; Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Alyssa J Lawler
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Department of Biology, Carnegie Mellon University, Pittsburgh, PA, USA
- Ashley R Brown
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA; Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Michael Kleyman
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA; Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Andreas R Pfenning
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA; Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Department of Biology, Carnegie Mellon University, Pittsburgh, PA, USA
27
|
Gill D, Parry A, Santos F, Okkenhaug H, Todd CD, Hernando-Herraez I, Stubbs TM, Milagre I, Reik W. Multi-omic rejuvenation of human cells by maturation phase transient reprogramming. eLife 2022; 11:e71624. [PMID: 35390271 PMCID: PMC9023058 DOI: 10.7554/elife.71624] [Citation(s) in RCA: 60] [Impact Index Per Article: 30.0] [Received: 06/24/2021] [Accepted: 04/06/2022] [Indexed: 11/13/2022] Open
Abstract
Ageing is the gradual decline in organismal fitness that occurs over time leading to tissue dysfunction and disease. At the cellular level, ageing is associated with reduced function, altered gene expression and a perturbed epigenome. Recent work has demonstrated that the epigenome is already rejuvenated by the maturation phase of somatic cell reprogramming, which suggests full reprogramming is not required to reverse ageing of somatic cells. Here we have developed the first "maturation phase transient reprogramming" (MPTR) method, where reprogramming factors are selectively expressed until this rejuvenation point then withdrawn. Applying MPTR to dermal fibroblasts from middle-aged donors, we found that cells temporarily lose and then reacquire their fibroblast identity, possibly as a result of epigenetic memory at enhancers and/or persistent expression of some fibroblast genes. Excitingly, our method substantially rejuvenated multiple cellular attributes including the transcriptome, which was rejuvenated by around 30 years as measured by a novel transcriptome clock. The epigenome was rejuvenated to a similar extent, including H3K9me3 levels and the DNA methylation ageing clock. The magnitude of rejuvenation instigated by MPTR appears substantially greater than that achieved in previous transient reprogramming protocols. In addition, MPTR fibroblasts produced youthful levels of collagen proteins, and showed partial functional rejuvenation of their migration speed. Finally, our work suggests that optimal time windows exist for rejuvenating the transcriptome and the epigenome. Overall, we demonstrate that it is possible to separate rejuvenation from complete pluripotency reprogramming, which should facilitate the discovery of novel anti-ageing genes and therapies.
Affiliation(s)
- Diljeet Gill
- Epigenetics Programme, Babraham Institute, Cambridge, United Kingdom
- Aled Parry
- Epigenetics Programme, Babraham Institute, Cambridge, United Kingdom
- Fátima Santos
- Epigenetics Programme, Babraham Institute, Cambridge, United Kingdom
- Inês Milagre
- Laboratory for Epigenetic Mechanisms/Chromosome Dynamics Lab, Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Wolf Reik
- Epigenetics Programme, Babraham Institute, Cambridge, United Kingdom
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom
- Centre for Trophoblast Research, University of Cambridge, Cambridge, United Kingdom
28
|
Marwaha S, Knowles JW, Ashley EA. A guide for the diagnosis of rare and undiagnosed disease: beyond the exome. Genome Med 2022; 14:23. [PMID: 35220969 PMCID: PMC8883622 DOI: 10.1186/s13073-022-01026-w] [Citation(s) in RCA: 83] [Impact Index Per Article: 41.5] [Received: 06/09/2021] [Accepted: 02/10/2022] [Indexed: 02/07/2023] Open
Abstract
Rare diseases affect 30 million people in the USA and more than 300–400 million worldwide, often causing chronic illness, disability, and premature death. Traditional diagnostic techniques rely heavily on heuristic approaches, coupling clinical experience from prior rare disease presentations with the medical literature. A large number of rare disease patients remain undiagnosed for years and many even die without an accurate diagnosis. In recent years, gene panels, microarrays, and exome sequencing have helped to identify the molecular cause of such rare and undiagnosed diseases. These technologies have allowed diagnoses for a sizable proportion (25–35%) of undiagnosed patients, often with actionable findings. However, a large proportion of these patients remain undiagnosed. In this review, we focus on technologies that can be adopted if exome sequencing is unrevealing. We discuss the benefits of sequencing the whole genome and the additional benefit that may be offered by long-read technology, pan-genome reference, transcriptomics, metabolomics, proteomics, and methyl profiling. We highlight computational methods to help identify regionally distant patients with similar phenotypes or similar genetic mutations. Finally, we describe approaches to automate and accelerate genomic analysis. The strategies discussed here are intended to serve as a guide for clinicians and researchers in the next steps when encountering patients with non-diagnostic exomes.
29
|
Weighill D, Ben Guebila M, Glass K, Quackenbush J, Platig J. Predicting genotype-specific gene regulatory networks. Genome Res 2022; 32:524-533. [PMID: 35193937 PMCID: PMC8896459 DOI: 10.1101/gr.275107.120] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/23/2021] [Accepted: 01/11/2022] [Indexed: 11/25/2022]
Abstract
Understanding how each person's unique genotype influences their individual patterns of gene regulation has the potential to improve our understanding of human health and development, and to refine genotype-specific disease risk assessments and treatments. However, the effects of genetic variants are not typically considered when constructing gene regulatory networks, despite the fact that many disease-associated genetic variants are thought to have regulatory effects, including the disruption of transcription factor (TF) binding. We developed EGRET (Estimating the Genetic Regulatory Effect on TFs), which infers a genotype-specific gene regulatory network for each individual in a study population. EGRET begins by constructing a genotype-informed TF-gene prior network derived using TF motif predictions, expression quantitative trait locus (eQTL) data, individual genotypes, and the predicted effects of genetic variants on TF binding. It then uses a technique known as message passing to integrate this prior network with gene expression and TF protein–protein interaction data to produce a refined, genotype-specific regulatory network. We used EGRET to infer gene regulatory networks for two blood-derived cell lines and identified genotype-associated, cell line–specific regulatory differences that we subsequently validated using allele-specific expression, chromatin accessibility QTLs, and differential ChIP-seq TF binding. We also inferred EGRET networks for three cell types from each of 119 individuals and identified cell type–specific regulatory differences associated with diseases related to those cell types. EGRET is, to our knowledge, the first method that infers networks reflective of individual genetic variation in a way that provides insight into the genetic regulatory associations driving complex phenotypes.
Affiliation(s)
- Deborah Weighill
- Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA
- Kimberly Glass
- Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA; Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA; Harvard Medical School, Boston, Massachusetts 02115, USA
- John Quackenbush
- Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA; Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA
- John Platig
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA; Harvard Medical School, Boston, Massachusetts 02115, USA
30
|
Rhodes K, Barr KA, Popp JM, Strober BJ, Battle A, Gilad Y. Human embryoid bodies as a novel system for genomic studies of functionally diverse cell types. eLife 2022; 11:e71361. [PMID: 35142607 PMCID: PMC8830892 DOI: 10.7554/elife.71361] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/17/2021] [Accepted: 12/18/2021] [Indexed: 12/23/2022] Open
Abstract
Practically all studies of gene expression in humans to date have been performed in a relatively small number of adult tissues. Gene regulation is highly dynamic and context-dependent. In order to better understand the connection between gene regulation and complex phenotypes, including disease, we need to be able to study gene expression in more cell types, tissues, and states that are relevant to human phenotypes. In particular, we need to characterize gene expression in early development cell types, as mutations that affect developmental processes may be of particular relevance to complex traits. To address this challenge, we propose to use embryoid bodies (EBs), which are organoids that contain a multitude of cell types in dynamic states. EBs provide a system in which one can study dynamic regulatory processes at an unprecedentedly high resolution. To explore the utility of EBs, we systematically explored cellular and gene expression heterogeneity in EBs from multiple individuals. We characterized the various cell types that arise from EBs, the extent to which they recapitulate gene expression in vivo, and the relative contribution of technical and biological factors to variability in gene expression, cell composition, and differentiation efficiency. Our results highlight the utility of EBs as a new model system for mapping dynamic inter-individual regulatory differences in a large variety of cell types.
One major goal of human genetics is to understand how changes in the way genes are regulated affect human traits, including disease susceptibility. To date, most studies of gene regulation have been performed in adult tissues, such as liver or kidney tissue, that were collected at a single time point. Yet, gene regulation is highly dynamic and context-dependent, meaning that it is important to gather data from a greater variety of cell types at different stages of their development. Additionally, observing which genes switch on and off in response to external treatments can shed light on how genetic variation can drive errors in gene regulation and cause diseases.
Stem cells can produce more cells like themselves or differentiate into (that is, acquire the characteristics of) many cell types. These cells have been used in the laboratory to research gene regulation. Unfortunately, these studies often fail to capture the complex spatial and temporal dynamics of stem cell differentiation; in particular, these studies are unable to observe gene regulation in the transient cell types that appear early in embryonic development. To overcome these limitations, scientists developed systems such as embryoid bodies: three-dimensional aggregates of stem cells that, when grown under certain conditions, spontaneously develop into a variety of cell types.
Rhodes, Barr et al. wanted to assess the utility of embryoid bodies as a model to study how genes are dynamically regulated in different cell types, by different individuals who have distinct genetic makeups. To do this, they grew embryoid bodies made from human stem cells from different individuals to examine which genes switched on and off as the stem cells that formed the embryoid bodies differentiated into different types of cells. The results showed that it was possible to grow embryoid bodies derived from genetically distinct individuals that consistently produce diverse cell types, similar to those found during human fetal development.
Rhodes, Barr et al.’s findings suggest that embryoid bodies are a useful model to study gene regulation across individuals with different genetic backgrounds. This could accelerate research into how genetics are associated with disease by capturing gene regulatory dynamics at an unprecedentedly high spatial and temporal resolution. Additionally, embryoid bodies could be used to explore how exposure to different environmental factors during early development affects disease-related outcomes in adulthood in different individuals.
Affiliation(s)
- Katherine Rhodes
- Department of Medicine, University of Chicago, Chicago, United States
- Kenneth A Barr
- Department of Medicine, University of Chicago, Chicago, United States
- Joshua M Popp
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, United States
- Benjamin J Strober
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, United States
- Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, United States; Department of Computer Science, Johns Hopkins University, Baltimore, United States; Department of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland, United States
- Yoav Gilad
- Department of Medicine, University of Chicago, Chicago, United States
31
|
Elorbany R, Popp JM, Rhodes K, Strober BJ, Barr K, Qi G, Gilad Y, Battle A. Single-cell sequencing reveals lineage-specific dynamic genetic regulation of gene expression during human cardiomyocyte differentiation. PLoS Genet 2022; 18:e1009666. [PMID: 35061661 PMCID: PMC8809621 DOI: 10.1371/journal.pgen.1009666] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 06/30/2021] [Revised: 02/02/2022] [Accepted: 12/21/2021] [Indexed: 12/13/2022] Open
Abstract
Dynamic and temporally specific gene regulatory changes may underlie unexplained genetic associations with complex disease. During a dynamic process such as cellular differentiation, the overall cell type composition of a tissue (or an in vitro culture) and the gene regulatory profile of each cell can both experience significant changes over time. To identify these dynamic effects in high resolution, we collected single-cell RNA-sequencing data over a differentiation time course from induced pluripotent stem cells to cardiomyocytes, sampled at 7 unique time points in 19 human cell lines. We employed a flexible approach to map dynamic eQTLs whose effects vary significantly over the course of bifurcating differentiation trajectories, including many whose effects are specific to one of these two lineages. Our study design allowed us to distinguish true dynamic eQTLs affecting a specific cell lineage from expression changes driven by potentially non-genetic differences between cell lines such as cell composition. Additionally, we used the cell type profiles learned from single-cell data to deconvolve and re-analyze data from matched bulk RNA-seq samples. Using this approach, we were able to identify a large number of novel dynamic eQTLs in single cell data while also attributing dynamic effects in bulk to a particular lineage. Overall, we found that using single cell data to uncover dynamic eQTLs can provide new insight into the gene regulatory changes that occur among heterogeneous cell types during cardiomyocyte differentiation.
Affiliation(s)
- Reem Elorbany
- Interdisciplinary Scientist Training Program, University of Chicago, Chicago, Illinois, United States of America
- Joshua M. Popp
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
- Katherine Rhodes
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Benjamin J. Strober
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
- Kenneth Barr
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Guanghao Qi
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
- Yoav Gilad
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Department of Medicine, University of Chicago, Chicago, Illinois, United States of America
- Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, United States of America
- Department of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America
32
|
Perrin HJ, Currin KW, Vadlamudi S, Pandey GK, Ng KK, Wabitsch M, Laakso M, Love MI, Mohlke KL. Chromatin accessibility and gene expression during adipocyte differentiation identify context-dependent effects at cardiometabolic GWAS loci. PLoS Genet 2021; 17:e1009865. [PMID: 34699533 PMCID: PMC8570510 DOI: 10.1371/journal.pgen.1009865] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/25/2021] [Revised: 11/05/2021] [Accepted: 10/07/2021] [Indexed: 12/15/2022] Open
Abstract
Chromatin accessibility and gene expression in relevant cell contexts can guide identification of regulatory elements and mechanisms at genome-wide association study (GWAS) loci. To identify regulatory elements that display differential activity across adipocyte differentiation, we performed ATAC-seq and RNA-seq in a human cell model of preadipocytes and adipocytes at days 4 and 14 of differentiation. For comparison, we created a consensus map of ATAC-seq peaks in 11 human subcutaneous adipose tissue samples. We identified 58,387 context-dependent chromatin accessibility peaks and 3,090 context-dependent genes between all timepoint comparisons (log2 fold change > 1, FDR < 5%) with 15,919 adipocyte- and 18,244 preadipocyte-dependent peaks. Adipocyte-dependent peaks showed increased overlap (60.1%) with Roadmap Epigenomics adipocyte nuclei enhancers compared to preadipocyte-dependent peaks (11.5%). We linked context-dependent peaks to genes based on adipocyte promoter capture Hi-C data, overlap with adipose eQTL variants, and context-dependent gene expression. Of 16,167 context-dependent peaks linked to a gene, 5,145 were linked by two or more strategies to 1,670 genes. Among GWAS loci for cardiometabolic traits, adipocyte-dependent peaks, but not preadipocyte-dependent peaks, showed significant enrichment (LD score regression P < 0.005) for waist-to-hip ratio and modest enrichment (P < 0.05) for HDL-cholesterol. We identified 659 peaks linked to 503 genes by two or more approaches and overlapping a GWAS signal, suggesting a regulatory mechanism at these loci. To identify variants that may alter chromatin accessibility between timepoints, we identified 582 variants in 454 context-dependent peaks that demonstrated allelic imbalance in accessibility (FDR < 5%), of which 55 peaks also overlapped GWAS variants. At one GWAS locus for palmitoleic acid, rs603424 was located in an adipocyte-dependent peak linked to SCD and exhibited allelic differences in transcriptional activity in adipocytes (P = 0.003) but not preadipocytes (P = 0.09). These results demonstrate that context-dependent peaks and genes can guide discovery of regulatory variants at GWAS loci and aid identification of regulatory mechanisms.
Cardiovascular and metabolic diseases are widespread, and an increased understanding of genetic mechanisms behind these diseases could improve treatment. Chromatin accessibility and gene expression in relevant cell contexts can guide identification of regulatory elements and genetic mechanisms for disease traits. A relevant context for cardiovascular and metabolic disease traits is adipocyte differentiation. To identify regulatory elements and genes that display differences in activity during adipocyte differentiation, we profiled chromatin accessibility and gene expression in a human cell model of preadipocytes and adipocytes. We identified chromatin regions that change accessibility during differentiation and predicted genes they may affect. We also linked these chromatin regions to genetic variants associated with risk of disease. At one genomic region linked to fatty acids, a chromatin region more accessible in adipocytes linked to a fatty acid synthesis gene and exhibited allelic differences in transcriptional activity in adipocytes but not preadipocytes. These results demonstrate that chromatin regions and genes that change during cell context can guide discovery of regulatory variants and aid identification of disease mechanisms.
Affiliation(s)
- Hannah J. Perrin
- Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Kevin W. Currin
- Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Swarooparani Vadlamudi
- Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Gautam K. Pandey
- Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Kenneth K. Ng
- Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Martin Wabitsch
- Department of Pediatrics and Adolescent Medicine, Ulm University Hospital, Ulm, Germany
- Markku Laakso
- Department of Medicine, University of Eastern Finland and Kuopio University Hospital, Kuopio, Finland
- Michael I. Love
- Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Karen L. Mohlke
- Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, United States of America
33. Vinel C, Rosser G, Guglielmi L, Constantinou M, Pomella N, Zhang X, Boot JR, Jones TA, Millner TO, Dumas AA, Rakyan V, Rees J, Thompson JL, Vuononvirta J, Nadkarni S, El Assan T, Aley N, Lin YY, Liu P, Nelander S, Sheer D, Merry CLR, Marelli-Berg F, Brandner S, Marino S. Comparative epigenetic analysis of tumour initiating cells and syngeneic EPSC-derived neural stem cells in glioblastoma. Nat Commun 2021; 12:6130. [PMID: 34675201 PMCID: PMC8531305 DOI: 10.1038/s41467-021-26297-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/02/2021] [Accepted: 09/23/2021] [Indexed: 12/13/2022] Open
Abstract
Epigenetic mechanisms that play an essential role in normal developmental processes, such as self-renewal and fate specification of neural stem cells (NSC), are also responsible for some of the changes in the glioblastoma (GBM) genome. Here we develop a strategy to compare the epigenetic and transcriptional make-up of primary GBM cells (GIC) with patient-matched expanded potential stem cell (EPSC)-derived NSC (iNSC). Using a comparative analysis of the transcriptome of syngeneic GIC/iNSC pairs, we identify a glycosaminoglycan (GAG)-mediated mechanism of recruitment of regulatory T cells (Tregs) in GBM. Integrated analysis of the transcriptome and DNA methylome of GBM cells identifies druggable target genes and patient-specific prediction of drug response in primary GIC cultures, which is validated in 3D and in vivo models. Taken together, we provide a proof of principle that this experimental pipeline has the potential to identify patient-specific disease mechanisms and druggable targets in GBM.
Affiliation(s)
- Claire Vinel
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Gabriel Rosser
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Loredana Guglielmi
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Myrianni Constantinou
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Nicola Pomella
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Xinyu Zhang
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- James R Boot
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Tania A Jones
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Thomas O Millner
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Anaelle A Dumas
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Vardhman Rakyan
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Jeremy Rees
- Division of Neuropathology, The National Hospital for Neurology and Neurosurgery, University College London Hospitals NHS Foundation Trust, Queen Square, London, UK
- Jamie L Thompson
- Stem Cell Glycobiology Group, Biodiscovery Institute, University of Nottingham, Nottingham, UK
- Juho Vuononvirta
- The William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Suchita Nadkarni
- The William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Tedani El Assan
- Division of Neuropathology, The National Hospital for Neurology and Neurosurgery, University College London Hospitals NHS Foundation Trust, Queen Square, London, UK
- Natasha Aley
- Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, Queen Square, London, UK
- Yung-Yao Lin
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Stem Cell Laboratory, National Bowel Research Centre, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, 2 Newark Street, London, UK
- Pentao Liu
- Faculty of Medicine, School of Biomedical Sciences, The University of Hong Kong, Hong Kong, Hong Kong
- Sven Nelander
- Department of Immunology Genetics and Pathology, Uppsala University, Uppsala, Sweden
- Denise Sheer
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Catherine L R Merry
- Stem Cell Glycobiology Group, Biodiscovery Institute, University of Nottingham, Nottingham, UK
- Federica Marelli-Berg
- The William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
- Sebastian Brandner
- Division of Neuropathology, The National Hospital for Neurology and Neurosurgery, University College London Hospitals NHS Foundation Trust, Queen Square, London, UK
- Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, Queen Square, London, UK
- Silvia Marino
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University London, London, UK
34. A functional genomics pipeline identifies pleiotropy and cross-tissue effects within obesity-associated GWAS loci. Nat Commun 2021; 12:5253. [PMID: 34489471 PMCID: PMC8421397 DOI: 10.1038/s41467-021-25614-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/04/2020] [Accepted: 08/20/2021] [Indexed: 02/07/2023] Open
Abstract
Genome-wide association studies (GWAS) have identified many disease-associated variants, yet mechanisms underlying these associations remain unclear. To understand obesity-associated variants, we generate gene regulatory annotations in adipocytes and hypothalamic neurons across cellular differentiation stages. We then test variants in 97 obesity-associated loci using a massively parallel reporter assay and identify putatively causal variants that display cell type specific or cross-tissue enhancer-modulating properties. Integrating these variants with gene regulatory information suggests genes that underlie obesity GWAS associations. We also investigate a complex genomic interval on 16p11.2 where two independent loci exhibit megabase-range, cross-locus chromatin interactions. We demonstrate that variants within these two loci regulate a shared gene set. Together, our data support a model where GWAS loci contain variants that alter enhancer activity across tissues, potentially with temporally restricted effects, to impact the expression of multiple genes. This complex model has broad implications for ongoing efforts to understand GWAS.
35. Ullah F, Ben-Hur A. A self-attention model for inferring cooperativity between regulatory features. Nucleic Acids Res 2021; 49:e77. [PMID: 33950192 PMCID: PMC8287919 DOI: 10.1093/nar/gkab349] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 10/07/2020] [Revised: 04/15/2021] [Accepted: 04/20/2021] [Indexed: 11/14/2022] Open
Abstract
Deep learning has demonstrated its predictive power in modeling complex biological phenomena such as gene expression. The value of these models hinges not only on their accuracy, but also on the ability to extract biologically relevant information from the trained models. While there has been much recent work on developing feature attribution methods that discover the most important features for a given sequence, inferring cooperativity between regulatory elements, which is the hallmark of phenomena such as gene expression, remains an open problem. We present SATORI, a Self-ATtentiOn based model to detect Regulatory element Interactions. Our approach combines convolutional layers with a self-attention mechanism that helps us capture a global view of the landscape of interactions between regulatory elements in a sequence. A comprehensive evaluation demonstrates the ability of SATORI to identify numerous statistically significant TF-TF interactions, many of which have been previously reported. Our method is able to detect higher numbers of experimentally verified TF-TF interactions than existing methods, and has the advantage of not requiring a computationally expensive post-processing step. Finally, SATORI can be used for detection of any type of feature interaction in models that use a similar attention mechanism, and is not limited to the detection of TF-TF interactions.
Affiliation(s)
- Fahad Ullah
- Department of Computer Science, Colorado State University, Fort Collins, CO 80523, USA
- Asa Ben-Hur
- Department of Computer Science, Colorado State University, Fort Collins, CO 80523, USA
36. Bansal P, Ahern DT, Kondaveeti Y, Qiu CW, Pinter SF. Contiguous erosion of the inactive X in human pluripotency concludes with global DNA hypomethylation. Cell Rep 2021; 35:109215. [PMID: 34107261 PMCID: PMC8267460 DOI: 10.1016/j.celrep.2021.109215] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/25/2020] [Revised: 08/18/2020] [Accepted: 05/13/2021] [Indexed: 01/21/2023] Open
Abstract
Female human pluripotent stem cells (hPSCs) routinely undergo inactive X (Xi) erosion. This progressive loss of key repressive features follows the loss of XIST expression, the long non-coding RNA driving X inactivation, and causes reactivation of silenced genes across the eroding X (Xe). To date, the sporadic and progressive nature of erosion has obscured its scale, dynamics, and key transition events. To address this problem, we perform an integrated analysis of DNA methylation (DNAme), chromatin accessibility, and gene expression across hundreds of hPSC samples. Differential DNAme orders female hPSCs across a trajectory from initiation to terminal Xi erosion. Our results identify a cis-regulatory element crucial for XIST expression, trace contiguously growing reactivated domains to a few euchromatic origins, and indicate that the late-stage Xe impairs DNAme genome-wide. Surprisingly, from this altered regulatory landscape emerge select features of naive pluripotency, suggesting that its link to X dosage may be partially conserved in human embryonic development. Reactivation of the silenced X in human female iPSC/ESCs compromises their utility. Bansal et al. perform an integrated genomics analysis to reveal a prevalent X erosion trajectory that they validate in long-term culture. Starting with XIST loss, this trajectory indicates that reactivation may spread contiguously from escapees to silenced genes.
Affiliation(s)
- Prakhar Bansal
- Graduate Program in Genetics and Developmental Biology, UCONN Health, University of Connecticut, Farmington, CT, USA; Department of Genetics and Genome Sciences, UCONN Health, University of Connecticut, Farmington, CT, USA
- Darcy T Ahern
- Graduate Program in Genetics and Developmental Biology, UCONN Health, University of Connecticut, Farmington, CT, USA; Department of Genetics and Genome Sciences, UCONN Health, University of Connecticut, Farmington, CT, USA
- Yuvabharath Kondaveeti
- Department of Genetics and Genome Sciences, UCONN Health, University of Connecticut, Farmington, CT, USA
- Catherine W Qiu
- Department of Genetics and Genome Sciences, UCONN Health, University of Connecticut, Farmington, CT, USA
- Stefan F Pinter
- Graduate Program in Genetics and Developmental Biology, UCONN Health, University of Connecticut, Farmington, CT, USA; Department of Genetics and Genome Sciences, UCONN Health, University of Connecticut, Farmington, CT, USA; Institute for Systems Genomics, University of Connecticut, Farmington, CT, USA
37. Findley AS, Monziani A, Richards AL, Rhodes K, Ward MC, Kalita CA, Alazizi A, Pazokitoroudi A, Sankararaman S, Wen X, Lanfear DE, Pique-Regi R, Gilad Y, Luca F. Functional dynamic genetic effects on gene regulation are specific to particular cell types and environmental conditions. eLife 2021; 10:e67077. [PMID: 33988505 PMCID: PMC8248987 DOI: 10.7554/elife.67077] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 02/01/2021] [Accepted: 05/13/2021] [Indexed: 12/14/2022] Open
Abstract
Genetic effects on gene expression and splicing can be modulated by cellular and environmental factors; yet interactions between genotypes, cell type, and treatment have not been comprehensively studied together. We used an induced pluripotent stem cell system to study multiple cell types derived from the same individuals and exposed them to a large panel of treatments. Cellular responses involved different genes and pathways for gene expression and splicing and were highly variable across contexts. For thousands of genes, we identified variable allelic expression across contexts and characterized different types of gene-environment interactions, many of which are associated with complex traits. Promoter functional and evolutionary features distinguished genes with elevated allelic imbalance mean and variance. On average, half of the genes with dynamic regulatory interactions were missed by large eQTL mapping studies, indicating the importance of exploring multiple treatments to reveal previously unrecognized regulatory loci that may be important for disease.
Affiliation(s)
- Anthony S Findley
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, United States
- Alan Monziani
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, United States
- Allison L Richards
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, United States
- Katherine Rhodes
- Department of Human Genetics, University of Chicago, Chicago, United States
- Michelle C Ward
- Department of Medicine, University of Chicago, Chicago, United States
- Cynthia A Kalita
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, United States
- Adnan Alazizi
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, United States
- Sriram Sankararaman
- Department of Computer Science, UCLA, Los Angeles, United States
- Department of Human Genetics, UCLA, Los Angeles, United States
- Department of Computational Medicine, UCLA, Los Angeles, United States
- Xiaoquan Wen
- Department of Biostatistics, University of Michigan, Ann Arbor, United States
- David E Lanfear
- Center for Individualized and Genomic Medicine Research, Henry Ford Hospital, Detroit, United States
- Roger Pique-Regi
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, United States
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, United States
- Yoav Gilad
- Department of Human Genetics, University of Chicago, Chicago, United States
- Department of Medicine, University of Chicago, Chicago, United States
- Francesca Luca
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, United States
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, United States
38. Dannemann M, Gallego Romero I. Harnessing pluripotent stem cells as models to decipher human evolution. FEBS J 2021; 289:2992-3010. [PMID: 33876573 DOI: 10.1111/febs.15885] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/31/2021] [Revised: 03/18/2021] [Accepted: 04/16/2021] [Indexed: 12/13/2022]
Abstract
The study of human evolution, long constrained by a lack of experimental model systems, has been transformed by the emergence of the induced pluripotent stem cell (iPSC) field. iPSCs can be readily established from noninvasive tissue sources, from both humans and other primates; they can be maintained in the laboratory indefinitely, and they can be differentiated into other tissue types. These qualities mean that iPSCs are rapidly becoming established as viable and powerful model systems with which it is possible to address questions in human evolution that were until now logistically and ethically intractable, especially in the quest to understand humans' place among the great apes, and the genetic basis of human uniqueness. In this review, we discuss the key lessons and takeaways of this nascent field, from the types of research iPSCs make possible to lingering challenges and likely future directions. We provide a comprehensive overview of how the seemingly unlikely combination of iPSCs and explicit evolutionary frameworks is transforming what is possible in our understanding of humanity's past and present.
Affiliation(s)
- Irene Gallego Romero
- Institute of Genomics, University of Tartu, Estonia; Melbourne Integrative Genomics, The University of Melbourne, Parkville, Australia; School of BioSciences, The University of Melbourne, Parkville, Australia; The Centre for Stem Cell Systems, The University of Melbourne, Parkville, Australia
39. Atak ZK, Taskiran II, Demeulemeester J, Flerin C, Mauduit D, Minnoye L, Hulselmans G, Christiaens V, Ghanem GE, Wouters J, Aerts S. Interpretation of allele-specific chromatin accessibility using cell state-aware deep learning. Genome Res 2021; 31:1082-1096. [PMID: 33832990 PMCID: PMC8168584 DOI: 10.1101/gr.260851.120] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 01/30/2020] [Accepted: 04/05/2021] [Indexed: 12/26/2022]
Abstract
Genomic sequence variation within enhancers and promoters can have a significant impact on the cellular state and phenotype. However, sifting through the millions of candidate variants in a personal genome or a cancer genome, to identify those that impact cis-regulatory function, remains a major challenge. Interpretation of noncoding genome variation benefits from explainable artificial intelligence to predict and interpret the impact of a mutation on gene regulation. Here we generate phased whole genomes with matched chromatin accessibility, histone modifications, and gene expression for 10 melanoma cell lines. We find that training a specialized deep learning model, called DeepMEL2, on melanoma chromatin accessibility data can capture the various regulatory programs of the melanocytic and mesenchymal-like melanoma cell states. This model outperforms motif-based variant scoring, as well as more generic deep learning models. We detect hundreds to thousands of allele-specific chromatin accessibility variants (ASCAVs) in each melanoma genome, of which 15%-20% can be explained by gains or losses of transcription factor binding sites. A considerable fraction of ASCAVs are caused by changes in AP-1 binding, as confirmed by matched ChIP-seq data to identify allele-specific binding of JUN and FOSL1. Finally, by augmenting the DeepMEL2 model with ChIP-seq data for GABPA, the TERT promoter mutation, as well as additional ETS motif gains, can be identified with high confidence. In conclusion, we present a new integrative genomics approach and a deep learning model to identify and interpret functional enhancer mutations with allelic imbalance of chromatin accessibility and gene expression.
Affiliation(s)
- Zeynep Kalender Atak
- VIB-KU Leuven Center for Brain and Disease Research, 3000 Leuven, Belgium; KU Leuven, Department of Human Genetics KU Leuven, 3000 Leuven, Belgium
- Ibrahim Ihsan Taskiran
- VIB-KU Leuven Center for Brain and Disease Research, 3000 Leuven, Belgium; KU Leuven, Department of Human Genetics KU Leuven, 3000 Leuven, Belgium
- Jonas Demeulemeester
- VIB-KU Leuven Center for Brain and Disease Research, 3000 Leuven, Belgium; KU Leuven, Department of Human Genetics KU Leuven, 3000 Leuven, Belgium; Cancer Genomics Laboratory, The Francis Crick Institute, London NW1 1AT, United Kingdom
- Christopher Flerin
- VIB-KU Leuven Center for Brain and Disease Research, 3000 Leuven, Belgium; KU Leuven, Department of Human Genetics KU Leuven, 3000 Leuven, Belgium
- David Mauduit
- VIB-KU Leuven Center for Brain and Disease Research, 3000 Leuven, Belgium; KU Leuven, Department of Human Genetics KU Leuven, 3000 Leuven, Belgium
- Liesbeth Minnoye
- VIB-KU Leuven Center for Brain and Disease Research, 3000 Leuven, Belgium; KU Leuven, Department of Human Genetics KU Leuven, 3000 Leuven, Belgium
- Gert Hulselmans
- VIB-KU Leuven Center for Brain and Disease Research, 3000 Leuven, Belgium; KU Leuven, Department of Human Genetics KU Leuven, 3000 Leuven, Belgium
- Valerie Christiaens
- VIB-KU Leuven Center for Brain and Disease Research, 3000 Leuven, Belgium; KU Leuven, Department of Human Genetics KU Leuven, 3000 Leuven, Belgium
- Ghanem-Elias Ghanem
- Institut Jules Bordet, Université Libre de Bruxelles, 1000 Brussels, Belgium
- Jasper Wouters
- VIB-KU Leuven Center for Brain and Disease Research, 3000 Leuven, Belgium; KU Leuven, Department of Human Genetics KU Leuven, 3000 Leuven, Belgium
- Stein Aerts
- VIB-KU Leuven Center for Brain and Disease Research, 3000 Leuven, Belgium; KU Leuven, Department of Human Genetics KU Leuven, 3000 Leuven, Belgium
40. Bonder MJ, Smail C, Gloudemans MJ, Frésard L, Jakubosky D, D'Antonio M, Li X, Ferraro NM, Carcamo-Orive I, Mirauta B, Seaton DD, Cai N, Vakili D, Horta D, Zhao C, Zastrow DB, Bonner DE, Wheeler MT, Kilpinen H, Knowles JW, Smith EN, Frazer KA, Montgomery SB, Stegle O. Identification of rare and common regulatory variants in pluripotent cells using population-scale transcriptomics. Nat Genet 2021; 53:313-321. [PMID: 33664507 PMCID: PMC7944648 DOI: 10.1038/s41588-021-00800-7] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 10/04/2019] [Accepted: 01/25/2021] [Indexed: 12/18/2022]
Abstract
Induced pluripotent stem cells (iPSCs) are an established cellular system to study the impact of genetic variants in derived cell types and developmental contexts. However, in their pluripotent state, the disease impact of genetic variants is less known. Here, we integrate data from 1,367 human iPSC lines to comprehensively map common and rare regulatory variants in human pluripotent cells. Using this population-scale resource, we report hundreds of novel colocalization events for human traits specific to iPSCs, and find increased power to identify rare regulatory variants compared with somatic tissues. Finally, we demonstrate how iPSCs enable the identification of causal genes for rare diseases.
Affiliation(s)
- Marc Jan Bonder
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK; European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany; Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Craig Smail
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA; Genomic Medicine Center, Children's Mercy Research Institute and Children's Mercy Kansas City, Kansas City, MO, USA
- Michael J Gloudemans
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
- Laure Frésard
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- David Jakubosky
- Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA, USA; Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA, USA
- Matteo D'Antonio
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA, USA
- Xin Li
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Nicole M Ferraro
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
- Ivan Carcamo-Orive
- Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Bogdan Mirauta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Daniel D Seaton
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Na Cai
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK; Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK; Helmholtz Pioneer Campus, Helmholtz Zentrum München, Neuherberg, Germany
- Dara Vakili
- UCL Great Ormond Street Institute of Child Health, University College London, London, UK; Faculty of Medicine, Imperial College London, London, UK
- Danilo Horta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Chunli Zhao
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
- Diane B Zastrow
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
- Devon E Bonner
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
- Matthew T Wheeler
- Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA; Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
- Helena Kilpinen
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK; UCL Great Ormond Street Institute of Child Health, University College London, London, UK; Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland; Helsinki Institute of Life Science (HiLIFE), University of Helsinki, Helsinki, Finland
- Joshua W Knowles
- Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Erin N Smith
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Kelly A Frazer
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Stephen B Montgomery
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA; Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Oliver Stegle
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK; European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany; Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany; Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
41. Jerber J, Seaton DD, Cuomo ASE, Kumasaka N, Haldane J, Steer J, Patel M, Pearce D, Andersson M, Bonder MJ, Mountjoy E, Ghoussaini M, Lancaster MA, Marioni JC, Merkle FT, Gaffney DJ, Stegle O. Population-scale single-cell RNA-seq profiling across dopaminergic neuron differentiation. Nat Genet 2021; 53:304-312. [PMID: 33664506 PMCID: PMC7610897 DOI: 10.1038/s41588-021-00801-6] [Citation(s) in RCA: 100] [Impact Index Per Article: 33.3] [Received: 05/05/2020] [Accepted: 01/25/2021] [Indexed: 02/06/2023]
Abstract
Studying the function of common genetic variants in primary human tissues and during development is challenging. To address this, we use an efficient multiplexing strategy to differentiate 215 human induced pluripotent stem cell (iPSC) lines toward a midbrain neural fate, including dopaminergic neurons, and use single-cell RNA sequencing (scRNA-seq) to profile over 1 million cells across three differentiation time points. The proportion of neurons produced by each cell line is highly reproducible and is predictable by robust molecular markers expressed in pluripotent cells. Expression quantitative trait loci (eQTL) were characterized at different stages of neuronal development and in response to rotenone-induced oxidative stress. Of these, 1,284 eQTL colocalize with known neurological trait risk loci, and 46% are not found in the Genotype-Tissue Expression (GTEx) catalog. Our study illustrates how coupling scRNA-seq with long-term iPSC differentiation enables mechanistic studies of human trait-associated genetic variants in otherwise inaccessible cell states.
Collapse
Affiliation(s)
- Julie Jerber
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, UK
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Daniel D Seaton
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Anna S E Cuomo
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Natsuhiko Kumasaka
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- James Haldane
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Juliette Steer
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Minal Patel
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Daniel Pearce
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Malin Andersson
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Marc Jan Bonder
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Ed Mountjoy
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Maya Ghoussaini
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, UK
- John C Marioni
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK.
- Florian T Merkle
  - Metabolic Research Laboratories and Medical Research Council Metabolic Diseases Unit, Wellcome Trust-Medical Research Council Institute of Metabolic Science, University of Cambridge, Cambridge, UK.
  - Wellcome Trust-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK.
- Daniel J Gaffney
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.
- Oliver Stegle
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.
  - Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany.
  - Division of Computational Genomics and Systems Genetics, German Cancer Research Center, Heidelberg, Germany.
42
Ward MC, Banovich NE, Sarkar A, Stephens M, Gilad Y. Dynamic effects of genetic variation on gene expression revealed following hypoxic stress in cardiomyocytes. eLife 2021; 10:57345. [PMID: 33554857 PMCID: PMC7906610 DOI: 10.7554/elife.57345] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 03/28/2020] [Accepted: 02/06/2021] [Indexed: 12/13/2022]
Abstract
One life-threatening outcome of cardiovascular disease is myocardial infarction, where cardiomyocytes are deprived of oxygen. To study inter-individual differences in response to hypoxia, we established an in vitro model of induced pluripotent stem cell-derived cardiomyocytes from 15 individuals. We measured gene expression levels, chromatin accessibility, and methylation levels in four culturing conditions that correspond to normoxia, hypoxia, and short- or long-term re-oxygenation. We characterized thousands of gene regulatory changes as the cells transition between conditions. Using available genotypes, we identified 1,573 genes with a cis expression quantitative trait locus (eQTL) in at least one condition, as well as 367 dynamic eQTLs, which are classified as eQTLs in at least one, but not in all conditions. A subset of genes with dynamic eQTLs is associated with complex traits and disease. Our data demonstrate how dynamic genetic effects on gene expression, which are likely relevant for disease, can be uncovered under stress.
Affiliation(s)
- Michelle C Ward
  - Department of Medicine, University of Chicago, Chicago, United States
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, United States
- Nicholas E Banovich
  - Department of Human Genetics, University of Chicago, Chicago, United States
  - Integrated Cancer Genomics Division, Translational Genomics Research Institute, Phoenix, United States
- Abhishek Sarkar
  - Department of Human Genetics, University of Chicago, Chicago, United States
- Matthew Stephens
  - Department of Human Genetics, University of Chicago, Chicago, United States
  - Department of Statistics, University of Chicago, Chicago, United States
- Yoav Gilad
  - Department of Medicine, University of Chicago, Chicago, United States
  - Department of Human Genetics, University of Chicago, Chicago, United States
43
Bhatia S. Genetics of Anthracycline Cardiomyopathy in Cancer Survivors: JACC: CardioOncology State-of-the-Art Review. JACC: CardioOncology 2020; 2:539-552. [PMID: 33364618 PMCID: PMC7757557 DOI: 10.1016/j.jaccao.2020.09.006] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 12/31/2022]
Abstract
Anthracyclines are an integral part of chemotherapy regimens used to treat a variety of childhood-onset and adult-onset cancers. However, the development of cardiac dysfunction and heart failure often compromises the clinical utility of anthracyclines. The risk of cardiac dysfunction increases with anthracycline dose. This anthracycline-cardiac dysfunction association is modified by several demographic and clinical factors, such as age at anthracycline exposure (<4 years and ≥65 years); female sex; chest radiation; presence of cardiovascular risk factors (diabetes, hypertension); and concurrent use of cyclophosphamide, paclitaxel, and trastuzumab. However, the clinical variables alone yield modest predictive power in detecting cardiac dysfunction. Recently, attention has focused on the molecular basis of anthracycline-related cardiac dysfunction, providing an initial understanding of the mechanism of anthracycline-related cardiomyopathy. This review describes the current state of knowledge with respect to the pathogenesis of anthracycline-related cardiomyopathy and identifies the critical next steps to mitigate this problem. Anthracycline chemotherapy results in an increased risk of cardiac dysfunction. Most recent studies have suggested that there is a genetic basis for anthracycline-related cardiac dysfunction. Integration of genetics with the clinical characteristics may be used to enhance the ability to predict the risk for anthracycline-related cardiomyopathy.
Affiliation(s)
- Smita Bhatia
  - Institute for Cancer Outcomes and Survivorship, University of Alabama at Birmingham, Birmingham, Alabama, USA
44
Wood KA, Rowlands CF, Thomas HB, Woods S, O’Flaherty J, Douzgou S, Kimber SJ, Newman WG, O’Keefe RT. Modelling the developmental spliceosomal craniofacial disorder Burn-McKeown syndrome using induced pluripotent stem cells. PLoS One 2020; 15:e0233582. [PMID: 32735620 PMCID: PMC7394406 DOI: 10.1371/journal.pone.0233582] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 05/04/2020] [Accepted: 07/06/2020] [Indexed: 12/15/2022]
Abstract
The craniofacial developmental disorder Burn-McKeown Syndrome (BMKS) is caused by biallelic variants in the pre-messenger RNA splicing factor gene TXNL4A/DIB1. The majority of affected individuals with BMKS have a 34 base pair deletion in the promoter region of one allele of TXNL4A combined with a loss-of-function variant on the other allele, resulting in reduced TXNL4A expression. However, it is unclear how reduced expression of this ubiquitously expressed spliceosome protein results in craniofacial defects during development. Here we reprogrammed peripheral mononuclear blood cells from a BMKS patient and her unaffected mother into induced pluripotent stem cells (iPSCs) and differentiated the iPSCs into induced neural crest cells (iNCCs), the key cell type required for correct craniofacial development. BMKS patient-derived iPSCs proliferated more slowly than both mother- and unrelated control-derived iPSCs, and RNA-Seq analysis revealed significant differences in gene expression and alternative splicing. Patient iPSCs displayed defective differentiation into iNCCs compared to maternal and unrelated control iPSCs, in particular a delay in undergoing an epithelial-to-mesenchymal transition (EMT). RNA-Seq analysis of differentiated iNCCs revealed widespread gene expression changes and mis-splicing in genes relevant to craniofacial and embryonic development that highlight a dampened response to WNT signalling, the key pathway activated during iNCC differentiation. Furthermore, we identified the mis-splicing of TCF7L2 exon 4, a key gene in the WNT pathway, as a potential cause of the downregulated WNT response in patient cells. Additionally, mis-spliced genes shared common sequence properties such as length, branch point to 3’ splice site (BPS-3’SS) distance and splice site strengths, suggesting that splicing of particular subsets of genes is particularly sensitive to changes in TXNL4A expression. 
Together, these data provide the first insight into how reduced TXNL4A expression in BMKS patients might compromise splicing and NCC function, resulting in defective craniofacial development in the embryo.
Affiliation(s)
- Katherine A. Wood
  - Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
  - Manchester Centre for Genomic Medicine, Manchester Academic Health Science Centre, Manchester University NHS Foundation Trust, Manchester, United Kingdom
- Charlie F. Rowlands
  - Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
  - Manchester Centre for Genomic Medicine, Manchester Academic Health Science Centre, Manchester University NHS Foundation Trust, Manchester, United Kingdom
- Huw B. Thomas
  - Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
- Steven Woods
  - Division of Cell Matrix Biology and Regenerative Medicine, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
- Julieta O’Flaherty
  - Division of Cell Matrix Biology and Regenerative Medicine, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
- Sofia Douzgou
  - Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
  - Manchester Centre for Genomic Medicine, Manchester Academic Health Science Centre, Manchester University NHS Foundation Trust, Manchester, United Kingdom
- Susan J. Kimber
  - Division of Cell Matrix Biology and Regenerative Medicine, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
- William G. Newman
  - Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
  - Manchester Centre for Genomic Medicine, Manchester Academic Health Science Centre, Manchester University NHS Foundation Trust, Manchester, United Kingdom
- Raymond T. O’Keefe
  - Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
45
Dannemann M, He Z, Heide C, Vernot B, Sidow L, Kanton S, Weigert A, Treutlein B, Pääbo S, Kelso J, Camp JG. Human Stem Cell Resources Are an Inroad to Neandertal DNA Functions. Stem Cell Reports 2020; 15:214-225. [PMID: 32559457 PMCID: PMC7363959 DOI: 10.1016/j.stemcr.2020.05.018] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/30/2019] [Revised: 05/21/2020] [Accepted: 05/22/2020] [Indexed: 02/07/2023]
Abstract
Induced pluripotent stem cells (iPSCs) from diverse humans offer the potential to study human functional variation in controlled culture environments. A portion of this variation originates from an ancient admixture between modern humans and Neandertals, which introduced alleles that left a phenotypic legacy on individual humans today. Here, we show that a large iPSC repository harbors extensive Neandertal DNA, including alleles that contribute to human phenotypes and diseases, encode hundreds of amino acid changes, and alter gene expression in specific tissues. We provide a database of the inferred introgressed Neandertal alleles for each individual iPSC line, together with the annotation of the predicted functional variants. We also show that transcriptomic data from organoids generated from iPSCs can be used to track Neandertal-derived RNA over developmental processes. Human iPSC resources provide an opportunity to experimentally explore Neandertal DNA function and its contribution to present-day phenotypes, and potentially study Neandertal traits.
Affiliation(s)
- Michael Dannemann
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany; Institute of Genomics, University of Tartu, Tartu, Estonia
- Zhisong He
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Christian Heide
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Benjamin Vernot
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Leila Sidow
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Sabina Kanton
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Anne Weigert
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Barbara Treutlein
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany; Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Svante Pääbo
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Janet Kelso
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- J Gray Camp
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany; Institute of Molecular and Clinical Ophthalmology Basel, Basel, Switzerland; Department of Ophthalmology, University of Basel, Basel, Switzerland.
46
Jakubosky D, D'Antonio M, Bonder MJ, Smail C, Donovan MKR, Young Greenwald WW, Matsui H, D'Antonio-Chronowska A, Stegle O, Smith EN, Montgomery SB, DeBoever C, Frazer KA. Properties of structural variants and short tandem repeats associated with gene expression and complex traits. Nat Commun 2020; 11:2927. [PMID: 32522982 PMCID: PMC7286898 DOI: 10.1038/s41467-020-16482-4] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 07/26/2019] [Accepted: 05/05/2020] [Indexed: 12/14/2022]
Abstract
Structural variants (SVs) and short tandem repeats (STRs) comprise a broad group of diverse DNA variants which vastly differ in their sizes and distributions across the genome. Here, we identify genomic features of SV classes and STRs that are associated with gene expression and complex traits, including their locations relative to eGenes, likelihood of being associated with multiple eGenes, associated eGene types (e.g., coding, noncoding, level of evolutionary constraint), effect sizes, linkage disequilibrium with tagging single nucleotide variants used in GWAS, and likelihood of being associated with GWAS traits. We identify a set of high-impact SVs/STRs associated with the expression of three or more eGenes via chromatin loops and show that they are highly enriched for being associated with GWAS traits. Our study provides insights into the genomic properties of structural variant classes and short tandem repeats that are associated with gene expression and human traits.
Affiliation(s)
- David Jakubosky
  - Biomedical Sciences Graduate Program, University of California San Diego, La Jolla, CA, 92093-0419, USA
  - Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, 92093-0419, USA
- Matteo D'Antonio
  - Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Marc Jan Bonder
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
  - Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Craig Smail
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, 94305, USA
  - Department of Pathology, Stanford University, Stanford, California, 94305, USA
- Margaret K R Donovan
  - Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, 92093-0419, USA
  - Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA, USA
- William W Young Greenwald
  - Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA, USA
- Hiroko Matsui
  - Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Oliver Stegle
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
  - Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
  - Division of Computational Genomics and Systems Genetics, German Cancer Research Center, Heidelberg, Germany
- Erin N Smith
  - Department of Pediatrics, University of California San Diego, La Jolla, CA, 92093, USA
- Stephen B Montgomery
  - Department of Pathology, Stanford University, Stanford, California, 94305, USA
  - Department of Genetics, Stanford University, Stanford, California, 94305, USA
- Christopher DeBoever
  - Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Kelly A Frazer
  - Institute of Genomic Medicine, University of California San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA.
  - Department of Pediatrics, University of California San Diego, La Jolla, CA, 92093, USA.
47
Bar S, Seaton LR, Weissbein U, Eldar-Geva T, Benvenisty N. Global Characterization of X Chromosome Inactivation in Human Pluripotent Stem Cells. Cell Rep 2019; 27:20-29.e3. [PMID: 30943402 DOI: 10.1016/j.celrep.2019.03.019] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 09/27/2018] [Revised: 01/15/2019] [Accepted: 03/05/2019] [Indexed: 02/06/2023]
Abstract
Dosage compensation of sex-chromosome gene expression between male and female mammals is achieved via X chromosome inactivation (XCI) by employing epigenetic modifications to randomly silence one X chromosome during early embryogenesis. Human pluripotent stem cells (hPSCs) were reported to present various states of XCI that differ according to the expression of the long non-coding RNA XIST and the degree of X chromosome silencing. To obtain a comprehensive perspective on XCI in female hPSCs, we performed a large-scale analysis characterizing different XCI parameters in more than 700 RNA high-throughput sequencing samples. Our findings suggest differences in XCI status between most published samples of embryonic stem cells (ESCs) and induced PSCs (iPSCs). While the majority of iPSC lines maintain an inactive X chromosome, ESC lines tend to silence the expression of XIST and upregulate distal chromosomal regions. Our study highlights significant epigenetic heterogeneity within hPSCs, which may bear implications for their use in research and regenerative therapy.
Affiliation(s)
- Shiran Bar
  - The Azrieli Center for Stem Cells and Genetic Research, Department of Genetics, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel
- Lev Roz Seaton
  - The Azrieli Center for Stem Cells and Genetic Research, Department of Genetics, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel
- Uri Weissbein
  - The Azrieli Center for Stem Cells and Genetic Research, Department of Genetics, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel
- Talia Eldar-Geva
  - IVF Unit, Division of Obstetrics and Gynecology, Shaare Zedek Medical Center, Jerusalem, Israel; The Hebrew University School of Medicine, Jerusalem, Israel
- Nissim Benvenisty
  - The Azrieli Center for Stem Cells and Genetic Research, Department of Genetics, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel.
48
Characterizing and inferring quantitative cell cycle phase in single-cell RNA-seq data analysis. Genome Res 2020; 30:611-621. [PMID: 32312741 PMCID: PMC7197478 DOI: 10.1101/gr.247759.118] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 02/03/2019] [Accepted: 04/02/2020] [Indexed: 11/25/2022]
Abstract
Cellular heterogeneity in gene expression is driven by cellular processes, such as cell cycle and cell-type identity, and cellular environment such as spatial location. The cell cycle, in particular, is thought to be a key driver of cell-to-cell heterogeneity in gene expression, even in otherwise homogeneous cell populations. Recent advances in single-cell RNA-sequencing (scRNA-seq) facilitate detailed characterization of gene expression heterogeneity and can thus shed new light on the processes driving heterogeneity. Here, we combined fluorescence imaging with scRNA-seq to measure cell cycle phase and gene expression levels in human induced pluripotent stem cells (iPSCs). By using these data, we developed a novel approach to characterize cell cycle progression. Although standard methods assign cells to discrete cell cycle stages, our method goes beyond this and quantifies cell cycle progression on a continuum. We found that, on average, scRNA-seq data from only five genes predicted a cell's position on the cell cycle continuum to within 14% of the entire cycle and that using more genes did not improve this accuracy. Our data and predictor of cell cycle phase can directly help future studies to account for cell cycle-related heterogeneity in iPSCs. Our results and methods also provide a foundation for future work to characterize the effects of the cell cycle on expression heterogeneity in other cell types.
49
Li HT, Liu Y, Liu H, Sun X. Effect for Human Genomic Variation During the BMP4-Induced Conversion From Pluripotent Stem Cells to Trophoblast. Front Genet 2020; 11:230. [PMID: 32318089 PMCID: PMC7154154 DOI: 10.3389/fgene.2020.00230] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/07/2019] [Accepted: 02/26/2020] [Indexed: 12/19/2022]
Abstract
The role of genomic variation in differentiation is currently not well understood. Here, genomic variations were determined by whole-genome sequencing for three pairs of pluripotent stem cell lines and their corresponding BMP4-induced trophoblast cell lines. We identified ∼3,500 single nucleotide variations and ∼4,500 indels by comparing the genome sequencing data between the stem cell lines and the matched BMP4-induced trophoblast cell lines, and annotated them by integrating epigenomic and transcriptomic datasets. Introns were relatively enriched for variations. We found that ∼45% (42 genes) of the differentially expressed genes in trophoblasts were associated with genomic variations. Six variations, located at transcription factor binding sites where H3K4me3 and H3K27ac are enriched in both H1 and H1_BMP4, were identified. The epigenetic status around the genomic variations in H1 was similar to that in H1_BMP4. This means that the expression changes of the variation-associated genes cannot be attributed to epigenetic alteration. The genes associated with the six variations were upregulated during differentiation. We inferred that, during differentiation, the increase in the expression level of the MEF2C gene is due to a genomic variation on chromosome 5 (88179358 A > G), which lies at a binding site for the TFs KLF16, NR2C2, and ZNF740 at MEF2C. Allele G shows a higher affinity for the TFs in the induced cells. The increased expression of MEF2C leads to increased expression of the target genes of the TF MEF2C, subsequently affecting differentiation. Although genomic variation should not be a dominant factor in differentiation, we believe that genomic variation could indeed play a role in the differentiation from stem cells into trophoblast.
Affiliation(s)
- Hai-Tao Li
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Yajun Liu
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
  - The Second Affiliated Hospital of Zhengzhou University, Zhengzhou, China
  - Academy of Medical Sciences of Zhengzhou University Translational Medicine Platform, Zhengzhou University, Zhengzhou, China
- Hongde Liu
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Xiao Sun
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
50
Goubert C, Zevallos NA, Feschotte C. Contribution of unfixed transposable element insertions to human regulatory variation. Philos Trans R Soc Lond B Biol Sci 2020; 375:20190331. [PMID: 32075552 PMCID: PMC7061991 DOI: 10.1098/rstb.2019.0331] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Accepted: 12/09/2019] [Indexed: 12/11/2022]
Abstract
Thousands of unfixed transposable element (TE) insertions segregate in the human population, but little is known about their impact on genome function. Recently, a few studies associated unfixed TE insertions to mRNA levels of adjacent genes, but the biological significance of these associations, their replicability across cell types and the mechanisms by which they may regulate genes remain largely unknown. Here, we performed a TE-expression QTL analysis of 444 lymphoblastoid cell lines (LCL) and 289 induced pluripotent stem cells using a newly developed set of genotypes for 2743 polymorphic TE insertions. We identified 211 and 176 TE-eQTL acting in cis in each respective cell type. Approximately 18% were shared across cell types with strongly correlated effects. Furthermore, analysis of chromatin accessibility QTL in a subset of the LCL suggests that unfixed TEs often modulate the activity of enhancers and other distal regulatory DNA elements, which tend to lose accessibility when a TE inserts within them. We also document a case of an unfixed TE likely influencing gene expression at the post-transcriptional level. Our study points to broad and diverse cis-regulatory effects of unfixed TEs in the human population and underscores their plausible contribution to phenotypic variation. This article is part of a discussion meeting issue 'Crossroads between transposons and gene regulation'.
Affiliation(s)
- Cédric Feschotte
  - Department of Molecular Biology and Genetics, Cornell University, 526 Campus Road, Ithaca, NY 14853, USA