1
Frenkel M, Raman S. Discovering mechanisms of human genetic variation and controlling cell states at scale. Trends Genet 2024:S0168-9525(24)00074-X. [PMID: 38658256 DOI: 10.1016/j.tig.2024.03.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Revised: 03/29/2024] [Accepted: 03/29/2024] [Indexed: 04/26/2024]
Abstract
Population-scale sequencing efforts have catalogued substantial genetic variation in humans such that variant discovery dramatically outpaces interpretation. We discuss how single-cell sequencing is poised to reveal genetic mechanisms at a rate that may soon approach that of variant discovery. The functional genomics toolkit is sufficiently modular to systematically profile almost any type of variation within increasingly diverse contexts and with molecularly comprehensive and unbiased readouts. As a result, we can construct deep phenotypic atlases of variant effects that span the entire regulatory cascade. The same conceptual approach to interpreting genetic variation should be applied to engineering therapeutic cell states. In this way, variant mechanism discovery and cell state engineering will become reciprocating and iterative processes towards genomic medicine.
Affiliation(s)
- Max Frenkel
- Cellular and Molecular Biology Graduate Program, University of Wisconsin, Madison, WI, USA; Medical Scientist Training Program, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA; Department of Biochemistry, University of Wisconsin, Madison, WI, USA.
- Srivatsan Raman
- Department of Biochemistry, University of Wisconsin, Madison, WI, USA; Department of Bacteriology, University of Wisconsin, Madison, WI, USA; Department of Chemical and Biological Engineering, University of Wisconsin, Madison, WI, USA.
2
Rogers BB, Anderson AG, Lauzon SN, Davis MN, Hauser RM, Roberts SC, Rodriguez-Nunez I, Trausch-Lowther K, Barinaga EA, Hall PI, Knuesel MT, Taylor JW, Mackiewicz M, Roberts BS, Cooper SJ, Rizzardi LF, Myers RM, Cochran JN. Neuronal MAPT expression is mediated by long-range interactions with cis-regulatory elements. Am J Hum Genet 2024; 111:259-279. [PMID: 38232730 PMCID: PMC10870142 DOI: 10.1016/j.ajhg.2023.12.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 12/12/2023] [Accepted: 12/12/2023] [Indexed: 01/19/2024] Open
Abstract
Tauopathies are a group of neurodegenerative diseases defined by abnormal aggregates of tau, a microtubule-associated protein encoded by MAPT. MAPT expression is nearly absent in neural progenitor cells (NPCs) and increases during differentiation. This temporally dynamic expression pattern suggests that MAPT expression could be controlled by transcription factors and cis-regulatory elements specific to differentiated cell types. Given the relevance of MAPT expression to neurodegeneration pathogenesis, identification of such elements is relevant to understanding disease risk and pathogenesis. Here, we performed chromatin conformation assays (Hi-C and Capture-C), single-nucleus multiomics (RNA-seq+ATAC-seq), bulk ATAC-seq, and ChIP-seq for H3K27ac and CTCF in NPCs and differentiated neurons to nominate candidate cis-regulatory elements (cCREs). We assayed these cCREs using luciferase assays and CRISPR interference (CRISPRi) experiments to measure their effects on MAPT expression. Finally, we integrated cCRE annotations into an analysis of genetic variation in neurodegeneration-affected individuals and control subjects. We identified both proximal and distal regulatory elements for MAPT and confirmed the regulatory function for several regions, including three regions centromeric to MAPT beyond the H1/H2 haplotype inversion breakpoint. We also found that rare and predicted damaging genetic variation in nominated CREs was nominally depleted in dementia-affected individuals relative to control subjects, consistent with the hypothesis that variants that disrupt MAPT enhancer activity, and thereby reduce MAPT expression, may be protective against neurodegenerative disease. Overall, this study provides compelling evidence for pursuing detailed knowledge of CREs for genes of interest to permit better understanding of disease risk.
Affiliation(s)
- Brianne B Rogers
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA; University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Shelby N Lauzon
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
- M Natalie Davis
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
- Rebecca M Hauser
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
- Sydney C Roberts
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
- Erin A Barinaga
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
- Paige I Hall
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
- Jared W Taylor
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
- Mark Mackiewicz
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
- Brian S Roberts
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
- Sara J Cooper
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
- Richard M Myers
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA.
3
Sun J, Noss S, Banerjee D, Das M, Girirajan S. Strategies for dissecting the complexity of neurodevelopmental disorders. Trends Genet 2024; 40:187-202. [PMID: 37949722 PMCID: PMC10872993 DOI: 10.1016/j.tig.2023.10.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Revised: 09/20/2023] [Accepted: 10/16/2023] [Indexed: 11/12/2023]
Abstract
Neurodevelopmental disorders (NDDs) are associated with a wide range of clinical features, affecting multiple pathways involved in brain development and function. Recent advances in high-throughput sequencing have unveiled numerous genetic variants associated with NDDs, which further contribute to disease complexity and make it challenging to infer disease causation and underlying mechanisms. Herein, we review current strategies for dissecting the complexity of NDDs using model organisms, induced pluripotent stem cells, single-cell sequencing technologies, and massively parallel reporter assays. We further highlight single-cell CRISPR-based screening techniques that allow genomic investigation of cellular transcriptomes with high efficiency, accuracy, and throughput. Overall, we provide an integrated review of experimental approaches that can be applicable for investigating a broad range of complex disorders.
Affiliation(s)
- Jiawan Sun
- Molecular, Cellular, and Integrative Biosciences Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA
- Serena Noss
- Molecular, Cellular, and Integrative Biosciences Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA
- Deepro Banerjee
- Bioinformatics and Genomics Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA
- Maitreya Das
- Molecular, Cellular, and Integrative Biosciences Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA
- Santhosh Girirajan
- Molecular, Cellular, and Integrative Biosciences Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA; Bioinformatics and Genomics Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA; Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA; Department of Anthropology, Pennsylvania State University, University Park, PA 16802, USA.
4
Abstract
Enhancers and promoters are classically considered to be bound by a small set of TFs in a sequence-specific manner. This assumption has come under increasing skepticism as the datasets of ChIP-seq assays of TFs have expanded. In particular, high-occupancy target (HOT) loci attract hundreds of TFs with seemingly no detectable correlation between ChIP-seq peaks and DNA-binding motif presence. Here, we used a set of 1,003 TF ChIP-seq datasets (HepG2, K562, H1) to analyze the patterns of ChIP-seq peak co-occurrence in combination with functional genomics datasets. We identified 43,891 HOT loci forming at the promoter (53%) and enhancer (47%) regions. HOT promoters regulate housekeeping genes, whereas HOT enhancers are involved in tissue-specific process regulation. HOT loci form the foundation of human super-enhancers and evolve under strong negative selection, with some of these loci being located in ultraconserved regions. Sequence-based classification analysis of HOT loci suggested that their formation is driven by the sequence features, and the density of mapped ChIP-seq peaks across TF-bound loci correlates with sequence features and the expression level of flanking genes. Based on the affinities to bind to promoters and enhancers we detected 5 distinct clusters of TFs that form the core of the HOT loci. We report an abundance of HOT loci in the human genome and a commitment of 51% of all TF ChIP-seq binding events to HOT locus formation thus challenging the classical model of enhancer activity and propose a model of HOT locus formation based on the existence of large transcriptional condensates.
Affiliation(s)
- Sanjarbek Hudaiberdiev
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD
- Ivan Ovcharenko
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD
5
Wang QS, Edahiro R, Namkoong H, Hasegawa T, Shirai Y, Sonehara K, Kumanogoh A, Ishii M, Koike R, Kimura A, Imoto S, Miyano S, Ogawa S, Kanai T, Fukunaga K, Okada Y. Estimating gene-level false discovery probability improves eQTL statistical fine-mapping precision. NAR Genom Bioinform 2023; 5:lqad090. [PMID: 37915762 PMCID: PMC10616627 DOI: 10.1093/nargab/lqad090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 08/10/2023] [Accepted: 09/25/2023] [Indexed: 11/03/2023] Open
Abstract
Statistical fine-mapping prioritizes putative causal variants from a large number of candidate variants, and is widely used in expression quantitative trait loci (eQTL) studies. In eQTL fine-mapping, the existence of causal variants for gene expression is not guaranteed, since the genetic heritability of gene expression explained by nearby (cis-) variants is limited. Here we introduce a refined fine-mapping algorithm, named Knockoff-Finemap combination (KFc). KFc estimates the probability that the causal variant(s) exist in the cis-window of a gene through construction of knockoff genotypes (i.e. a set of synthetic genotypes that resembles the original genotypes), and uses it to adjust the posterior inclusion probabilities (PIPs). Utilizing simulated gene expression data, we show that KFc yields a calibrated PIP distribution with improved precision. When applied to gene expression data of 465 genotyped samples from the Japan COVID-19 Task Force (JCTF), KFc resulted in significant enrichment of a functional score as well as reporter assay hits in the top PIP bins. When combined with functional priors derived from an external fine-mapping study (GTEx), KFc resulted in a significantly higher proportion of hematopoietic trait putative causal variants in the top PIP bins. Our work presents improvements in the precision of a major fine-mapping algorithm.
Affiliation(s)
- Qingbo S Wang
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, 565-0871, Japan
- Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, 565-0871, Japan
- Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, 113-0033, Japan
- Ryuya Edahiro
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, 565-0871, Japan
- Department of Respiratory Medicine and Clinical Immunology, Osaka University Graduate School of Medicine, Suita, 565-0871, Japan
- Ho Namkoong
- Department of Infectious Diseases, Keio University School of Medicine, Tokyo, 160-8582, Japan
- Takanori Hasegawa
- M&D Data Science Center, Tokyo Medical and Dental University, Tokyo, 113-8510, Japan
- Yuya Shirai
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, 565-0871, Japan
- Department of Respiratory Medicine and Clinical Immunology, Osaka University Graduate School of Medicine, Suita, 565-0871, Japan
- Kyuto Sonehara
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, 565-0871, Japan
- Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, 113-0033, Japan
- Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, 565-0871, Japan
- Atsushi Kumanogoh
- Department of Respiratory Medicine and Clinical Immunology, Osaka University Graduate School of Medicine, Suita, 565-0871, Japan
- Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, 565-0871, Japan
- Department of Immunopathology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, 565-0871, Japan
- Center for Infectious Disease Education and Research (CiDER), Osaka University, Suita, 565-0871, Japan
- Makoto Ishii
- Department of Respiratory Medicine, Nagoya University Graduate School of Medicine, 65 Tsurumai, Showa-ku, Nagoya, 466-8550, Japan
- Ryuji Koike
- Health Science Research and Development Center (HeRD), Tokyo Medical and Dental University, Tokyo, 113-8510, Japan
- Akinori Kimura
- Institute of Research, Tokyo Medical and Dental University, Tokyo, 113-8510, Japan
- Seiya Imoto
- Division of Health Medical Intelligence, Human Genome Center, the Institute of Medical Science, the University of Tokyo, Tokyo, 108-8639, Japan
- Satoru Miyano
- M&D Data Science Center, Tokyo Medical and Dental University, Tokyo, 113-8510, Japan
- Seishi Ogawa
- Department of Pathology and Tumor Biology, Kyoto University, Kyoto, 606-8315, Japan
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, 606-8303, Japan
- Department of Medicine, Center for Hematology and Regenerative Medicine, Karolinska Institute, Stockholm, 171 77, Sweden
- Takanori Kanai
- Division of Gastroenterology and Hepatology, Department of Medicine, Keio University School of Medicine, Tokyo, 160-8582, Japan
- Koichi Fukunaga
- Division of Pulmonary Medicine, Department of Medicine, Keio University School of Medicine, Tokyo, 160-8582, Japan
- Yukinori Okada
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, 565-0871, Japan
- Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, 565-0871, Japan
- Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, 113-0033, Japan
- Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, 565-0871, Japan
- Center for Infectious Disease Education and Research (CiDER), Osaka University, Suita, 565-0871, Japan
- Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045, Japan
6
Wang J, Cheng X, Liang Q, Owen LA, Lu J, Zheng Y, Wang M, Chen S, DeAngelis MM, Li Y, Chen R. Single-cell multiomics of the human retina reveals hierarchical transcription factor collaboration in mediating cell type-specific effects of genetic variants on gene regulation. Genome Biol 2023; 24:269. [PMID: 38012720 PMCID: PMC10680294 DOI: 10.1186/s13059-023-03111-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Accepted: 11/15/2023] [Indexed: 11/29/2023] Open
Abstract
BACKGROUND Systematic characterization of how genetic variation modulates gene regulation in a cell type-specific context is essential for understanding complex traits. To address this question, we profile gene expression and chromatin accessibility in cells from healthy retinae of 20 human donors through single-cell multiomics and genomic sequencing. RESULTS We map eQTL, caQTL, allelic-specific expression, and allelic-specific chromatin accessibility in major retinal cell types. By integrating these results, we identify and characterize regulatory elements and genetic variants effective on gene regulation in individual cell types. The majority of identified sc-eQTLs and sc-caQTLs display cell type-specific effects, while the cis-elements containing genetic variants with cell type-specific effects are often accessible in multiple cell types. Furthermore, the transcription factors whose binding sites are perturbed by genetic variants tend to have higher expression levels in the cell types where the variants exert their effects, compared to the cell types where the variants have no impact. We further validate our findings with high-throughput reporter assays. Lastly, we identify the enriched cell types, candidate causal variants and genes, and cell type-specific regulatory mechanism underlying GWAS loci. CONCLUSIONS Overall, genetic effects on gene regulation are highly context dependent. Our results suggest that cell type-dependent genetic effect is driven by precise modulation of both trans-factor expression and chromatin accessibility of cis-elements. Our findings indicate hierarchical collaboration among transcription factors plays a crucial role in mediating cell type-specific effects of genetic variants on gene regulation.
Affiliation(s)
- Jun Wang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Xuesen Cheng
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Qingnan Liang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, USA
- Leah A Owen
- Department of Ophthalmology and Visual Sciences, John A. Moran Eye Center, University of Utah, Salt Lake City, UT, USA
- Jiaxiong Lu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, USA
- Yiqiao Zheng
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, MO, USA
- Meng Wang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Shiming Chen
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, MO, USA
- Department of Developmental Biology, Washington University in St Louis, Saint Louis, MO, USA
- Margaret M DeAngelis
- Department of Ophthalmology, University at Buffalo the State University of New York, Buffalo, NY, USA
- Yumei Li
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Rui Chen
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
7
Zhao J, Baltoumas FA, Konnaris MA, Mouratidis I, Liu Z, Sims J, Agarwal V, Pavlopoulos GA, Georgakopoulos-Soares I, Ahituv N. MPRAbase: A Massively Parallel Reporter Assay Database. bioRxiv 2023:2023.11.19.567742. [PMID: 38045264 PMCID: PMC10690217 DOI: 10.1101/2023.11.19.567742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]
Abstract
Massively parallel reporter assays (MPRAs) represent a set of high-throughput technologies that measure the functional effects of thousands of sequences/variants on gene regulatory activity. There are several different variations of MPRA technology and they are used for numerous applications, including regulatory element discovery, variant effect measurement, saturation mutagenesis, synthetic regulatory element generation or characterization of evolutionary gene regulatory differences. Despite their many designs and uses, there is no comprehensive database that incorporates the results of these experiments. To address this, we developed MPRAbase, a manually curated database that currently harbors 129 experiments, encompassing 17,718,677 elements tested across 35 cell types and 4 organisms. The MPRAbase web interface ( http://www.mprabase.com ) serves as a centralized user-friendly repository to download existing MPRA data for independent analysis and is designed with the ability to allow researchers to share their published data for rapid dissemination to the community.
8
Mollazadeh S, Abdolahzadeh N, Moghbeli M, Arab F, Saburi E. The crosstalk between non-coding RNA polymorphisms and resistance to lung cancer therapies. Heliyon 2023; 9:e20652. [PMID: 37829813 PMCID: PMC10565774 DOI: 10.1016/j.heliyon.2023.e20652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 09/23/2023] [Accepted: 10/03/2023] [Indexed: 10/14/2023] Open
Abstract
Lung cancer (LC) is one of the most common causes of cancer-related mortality in the world. Even with intensive multimodality therapies, lung cancer has a poor prognosis and a high morbidity rate. This review focuses on the role of polymorphisms in non-coding RNAs, such as lncRNAs and miRNAs, in resistance to LC therapies, which could open a promising avenue for better therapeutic response. Of note, there is currently no valid biomarker to predict lung cancer sensitivity in patients during treatment. Since genetic variations cause many challenges in treating patients, genotyping of known polymorphisms must be thoroughly explored to find desirable treatment platforms. With this knowledge, individualized treatments could become more feasible in the management of LC.
Affiliation(s)
- Samaneh Mollazadeh
- Natural Products and Medicinal Plants Research Center, North Khorasan University of Medical Sciences, Bojnurd, Iran
- Negar Abdolahzadeh
- Department of Advanced Sciences and Technologies, School of Medicine, North Khorasan University of Medical Sciences, Bojnurd, Iran
- Meysam Moghbeli
- Medical Genetics and Molecular Medicine Department, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Fatemeh Arab
- Medical Genetics and Molecular Medicine Department, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Ehsan Saburi
- Medical Genetics and Molecular Medicine Department, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
9
Antontseva EV, Degtyareva AO, Korbolina EE, Damarov IS, Merkulova TI. Human-genome single nucleotide polymorphisms affecting transcription factor binding and their role in pathogenesis. Vavilovskii Zhurnal Genet Selektsii 2023; 27:662-675. [PMID: 37965371 PMCID: PMC10641029 DOI: 10.18699/vjgb-23-77] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Revised: 03/24/2023] [Accepted: 03/30/2023] [Indexed: 11/16/2023] Open
Abstract
Single nucleotide polymorphisms (SNPs) are the most common type of variation in the human genome. The vast majority of SNPs identified in the human genome do not have any effect on the phenotype; however, some can lead to changes in the function of a gene or the level of its expression. Most SNPs associated with certain traits or pathologies are mapped to regulatory regions of the genome and affect gene expression by changing transcription factor binding sites. In recent decades, substantial effort has been invested in searching for such regulatory SNPs (rSNPs) and understanding the mechanisms by which they lead to phenotypic differences, primarily to individual differences in susceptibility to diseases and in sensitivity to drugs. The development of the NGS (next-generation sequencing) technology has contributed not only to the identification of a huge number of SNPs and to the search for their association (genome-wide association studies, GWASs) with certain diseases or phenotypic manifestations, but also to the development of more productive approaches to their functional annotation. It should be noted that the presence of an association does not allow one to identify a functional, truly disease-associated DNA sequence variant among multiple marker SNPs that are detected due to linkage disequilibrium. Moreover, determination of associations of genetic variants with a disease does not provide information about the functionality of these variants, which is necessary to elucidate the molecular mechanisms of the development of pathology and to design effective methods for its treatment and prevention. In this regard, the functional analysis of SNPs annotated in the GWAS catalog, both at the genome-wide level and at the level of individual SNPs, became especially relevant in recent years. A genome-wide search for potential rSNPs is possible without any prior knowledge of their association with a trait. 
Thus, mapping expression quantitative trait loci (eQTLs) makes it possible to identify an SNP for which - among transcriptomes of homozygotes and heterozygotes for its various alleles - there are differences in the expression level of certain genes, which can be located at various distances from the SNP. To predict rSNPs, approaches based on searches for allele-specific events in RNA-seq, ChIP-seq, DNase-seq, ATAC-seq, MPRA, and other data are also used. Nonetheless, for a more complete functional annotation of such rSNPs, it is necessary to establish their association with a trait, in particular, with a predisposition to a certain pathology or sensitivity to drugs. Thus, approaches to finding SNPs important for the development of a trait can be categorized into two groups: (1) starting from data on an association of SNPs with a certain trait, (2) starting from the determination of allele-specific changes at the molecular level (in a transcriptome or regulome). Only comprehensive use of strategically different approaches can considerably enrich our knowledge about the role of genetic determinants in the molecular mechanisms of trait formation, including predisposition to multifactorial diseases.
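The eQTL idea described in this abstract, comparing expression levels across genotype classes of a SNP, can be illustrated with a minimal sketch. It is not taken from the cited review: the function name `eqtl_slope`, the additive 0/1/2 dosage coding, and the toy numbers are all assumptions made for illustration, and a real analysis would add covariates, significance testing, and multiple-testing correction.

```python
from statistics import mean


def eqtl_slope(dosages, expression):
    """Least-squares slope of expression on allele dosage.

    dosages: alternate-allele counts per individual (0, 1 or 2);
             this additive coding is an assumption of the sketch.
    expression: expression level of one gene in the same individuals.
    """
    if len(dosages) != len(expression):
        raise ValueError("dosages and expression must have equal length")
    d_mean = mean(dosages)
    e_mean = mean(expression)
    # Sum of squared deviations of the genotype dosages.
    sxx = sum((d - d_mean) ** 2 for d in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    # Cross-product of dosage and expression deviations.
    sxy = sum((d - d_mean) * (e - e_mean) for d, e in zip(dosages, expression))
    return sxy / sxx


# Hypothetical toy data: two individuals per genotype class.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope = eqtl_slope(dosages, expression)
print(f"estimated effect per alternate allele: {slope:.2f}")
```

A non-zero slope suggests that expression differs between homozygotes and heterozygotes; whether the difference is meaningful depends on the statistical test and sample size, which this sketch leaves out.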
Affiliation(s)
- E V Antontseva
- Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- A O Degtyareva
- Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- E E Korbolina
- Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- I S Damarov
- Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- T I Merkulova
- Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
10
Frenkel M, Hujoel ML, Morris Z, Raman S. Discovering chromatin dysregulation induced by protein-coding perturbations at scale. bioRxiv 2023:2023.09.20.555752. [PMID: 37781603 PMCID: PMC10541138 DOI: 10.1101/2023.09.20.555752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/03/2023]
Abstract
Although population-scale databases have expanded to millions of protein-coding variants, insight into variant mechanisms has not kept pace. We present PROD-ATAC, a high-throughput method for discovering the effects of protein-coding variants on chromatin. A pooled library of variants is expressed in a disease-agnostic cell line, and single-cell ATAC resolves each variant's effect on chromatin. Using PROD-ATAC, we characterized the effects of >100 oncofusions (a class of cancer-causing chimeric proteins) and controls and revealed that pioneer activity is a common feature of fusions spanning an enormous range of fusion frequencies. Further, fusion-induced dysregulation can be context-agnostic as observed mechanisms often overlapped with cancer and cell-type specific prior knowledge. We also showed that gain-of-function pioneering is common among oncofusions. This work provides a global view of fusion-induced chromatin. We uncovered convergent mechanisms among disparate oncofusions and shared modes of dysregulation across different cancers. PROD-ATAC is generalizable to any set of protein-coding variants.
Affiliation(s)
- Max Frenkel
- Cellular and Molecular Biology Graduate Program, University of Wisconsin, Madison, Wisconsin, USA
- Medical Scientist Training Program, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Department of Biochemistry, University of Wisconsin, Madison, Wisconsin, USA
- Margaux L.A. Hujoel
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Center for Data Sciences, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Zachary Morris
- Department of Human Oncology, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
| | - Srivatsan Raman
- Department of Biochemistry, University of Wisconsin, Madison, Wisconsin, USA
- Department of Bacteriology, University of Wisconsin, Madison, Wisconsin, USA
- Department of Chemical and Biological Engineering, University of Wisconsin, Madison, Wisconsin, USA
| |
Collapse
|
11
|
Kleinschmidt H, Xu C, Bai L. Using Synthetic DNA Libraries to Investigate Chromatin and Gene Regulation. Chromosoma 2023; 132:167-189. [PMID: 37184694 PMCID: PMC10542970 DOI: 10.1007/s00412-023-00796-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Revised: 04/25/2023] [Accepted: 04/26/2023] [Indexed: 05/16/2023]
Abstract
Despite the recent explosion in genome-wide studies in chromatin and gene regulation, we are still far from extracting a set of genetic rules that can predict the function of the regulatory genome. One major reason for this deficiency is that gene regulation is a multi-layered process that involves an enormous variable space, which cannot be fully explored using native genomes. This problem can be partially solved by introducing synthetic DNA libraries into cells, a method that can test the regulatory roles of thousands to millions of sequences with limited variables. Here, we review recent applications of this method to study transcription factor (TF) binding, nucleosome positioning, and transcriptional activity. We discuss the design principles, experimental procedures, and major findings from these studies and compare the pros and cons of different approaches.
Affiliation(s)
- Holly Kleinschmidt
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
- Cheng Xu
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
- Lu Bai
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
  - Department of Physics, The Pennsylvania State University, University Park, PA, 16802, USA

12
Hudaiberdiev S, Taylor DL, Song W, Narisu N, Bhuiyan RM, Taylor HJ, Tang X, Yan T, Swift AJ, Bonnycastle LL, Consortium DIAMANTE, Chen S, Stitzel ML, Erdos MR, Ovcharenko I, Collins FS. Modeling islet enhancers using deep learning identifies candidate causal variants at loci associated with T2D and glycemic traits. Proc Natl Acad Sci U S A 2023; 120:e2206612120. [PMID: 37603758 PMCID: PMC10469333 DOI: 10.1073/pnas.2206612120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Accepted: 07/19/2023] [Indexed: 08/23/2023]
Abstract
Genetic association studies have identified hundreds of independent signals associated with type 2 diabetes (T2D) and related traits. Despite these successes, the identification of specific causal variants underlying a genetic association signal remains challenging. In this study, we describe a deep learning (DL) method to analyze the impact of sequence variants on enhancers. Focusing on pancreatic islets, a T2D relevant tissue, we show that our model learns islet-specific transcription factor (TF) regulatory patterns and can be used to prioritize candidate causal variants. At 101 genetic signals associated with T2D and related glycemic traits where multiple variants occur in linkage disequilibrium, our method nominates a single causal variant for each association signal, including three variants previously shown to alter reporter activity in islet-relevant cell types. For another signal associated with blood glucose levels, we biochemically test all candidate causal variants from statistical fine-mapping using a pancreatic islet beta cell line and show biochemical evidence of allelic effects on TF binding for the model-prioritized variant. To aid in future research, we publicly distribute our model and islet enhancer perturbation scores across ~67 million genetic variants. We anticipate that DL methods like the one presented in this study will enhance the prioritization of candidate causal variants for functional studies.
Affiliation(s)
- Sanjarbek Hudaiberdiev
  - Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20892
- D. Leland Taylor
  - Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892
- Wei Song
  - Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20892
- Narisu Narisu
  - Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892
- Redwan M. Bhuiyan
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032
  - Department of Genetics and Genome Sciences, University of Connecticut, Farmington, CT 06032
- Henry J. Taylor
  - Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892
  - British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
- Xuming Tang
  - Department of Surgery, Weill Cornell Medicine, New York, NY 10065
  - Center for Genomic Health, Weill Cornell Medicine, New York, NY 10065
- Tingfen Yan
  - Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892
- Amy J. Swift
  - Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892
- Lori L. Bonnycastle
  - Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892
- DIAMANTE Consortium
  - Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20892
  - Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032
  - Department of Genetics and Genome Sciences, University of Connecticut, Farmington, CT 06032
  - British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
  - Department of Surgery, Weill Cornell Medicine, New York, NY 10065
  - Center for Genomic Health, Weill Cornell Medicine, New York, NY 10065
  - Institute of Systems Genomics, University of Connecticut, Farmington, CT 06032
- Shuibing Chen
  - Department of Surgery, Weill Cornell Medicine, New York, NY 10065
  - Center for Genomic Health, Weill Cornell Medicine, New York, NY 10065
- Michael L. Stitzel
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032
  - Department of Genetics and Genome Sciences, University of Connecticut, Farmington, CT 06032
  - Institute of Systems Genomics, University of Connecticut, Farmington, CT 06032
- Michael R. Erdos
  - Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892
- Ivan Ovcharenko
  - Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20892
- Francis S. Collins
  - Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892

13
Yang M, Ali O, Bjørås M, Wang J. Identifying functional regulatory mutation blocks by integrating genome sequencing and transcriptome data. iScience 2023; 26:107266. [PMID: 37520692 PMCID: PMC10371843 DOI: 10.1016/j.isci.2023.107266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 04/05/2023] [Accepted: 06/28/2023] [Indexed: 08/01/2023]
Abstract
Millions of single nucleotide variants (SNVs) exist in the human genome; however, it remains challenging to identify functional SNVs associated with diseases. We propose a non-encoding SNVs analysis tool bpb3, BayesPI-BAR version 3, aiming to identify the functional mutation blocks (FMBs) by integrating genome sequencing and transcriptome data. The identified FMBs display high frequency SNVs, significant changes in transcription factors (TFs) binding affinity and are nearby the regulatory regions of differentially expressed genes. A two-level Bayesian approach with a biophysical model for protein-DNA interactions is implemented, to compute TF-DNA binding affinity changes based on clustered position weight matrices (PWMs) from over 1700 TF-motifs. The epigenetic data, such as the DNA methylome can also be integrated to scan FMBs. By testing the datasets from follicular lymphoma and melanoma, bpb3 automatically and robustly identifies FMBs, demonstrating that bpb3 can provide insight into patho-mechanisms, and therapeutic targets from transcriptomic and genomic data.
Affiliation(s)
- Mingyi Yang
  - Department of Microbiology, Oslo University Hospital and University of Oslo, Oslo, Norway
  - Department of Medical Biochemistry, Oslo University Hospital and University of Oslo, Oslo, Norway
- Omer Ali
  - Department of Pathology, Oslo University Hospital - Norwegian Radium Hospital, Oslo, Norway
  - Faculty of Medicine, University of Oslo, Oslo, Norway
- Magnar Bjørås
  - Department of Microbiology, Oslo University Hospital and University of Oslo, Oslo, Norway
  - Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
- Junbai Wang
  - Department of Clinical Molecular Biology (EpiGen), Akershus University Hospital and University of Oslo, Lørenskog, Norway

14
Oliveros W, Delfosse K, Lato DF, Kiriakopulos K, Mokhtaridoost M, Said A, McMurray BJ, Browning JW, Mattioli K, Meng G, Ellis J, Mital S, Melé M, Maass PG. Systematic characterization of regulatory variants of blood pressure genes. Cell Genom 2023; 3:100330. [PMID: 37492106 PMCID: PMC10363820 DOI: 10.1016/j.xgen.2023.100330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 03/29/2023] [Accepted: 04/28/2023] [Indexed: 07/27/2023]
Abstract
High blood pressure (BP) is the major risk factor for cardiovascular disease. Genome-wide association studies have identified genetic variants for BP, but functional insights into causality and related molecular mechanisms lag behind. We functionally characterize 4,608 genetic variants in linkage with 135 BP loci in vascular smooth muscle cells and cardiomyocytes by massively parallel reporter assays. High densities of regulatory variants at BP loci (i.e., ULK4, MAP4, CFDP1, PDE5A) indicate that multiple variants drive genetic association. Regulatory variants are enriched in repeats, alter cardiovascular-related transcription factor motifs, and spatially converge with genes controlling specific cardiovascular pathways. Using heuristic scoring, we define likely causal variants, and CRISPR prime editing finally determines causal variants for KCNK9, SFXN2, and PCGF6, which are candidates for developing high BP. Our systems-level approach provides a catalog of functionally relevant variants and their genomic architecture in two trait-relevant cell lines for a better understanding of BP gene regulation.
Affiliation(s)
- Winona Oliveros
  - Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Catalonia, Spain
- Kate Delfosse
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Daniella F. Lato
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Katerina Kiriakopulos
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Milad Mokhtaridoost
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Abdelrahman Said
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Brandon J. McMurray
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Jared W.L. Browning
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Kaia Mattioli
  - Division of Genetics, Department of Medicine, Brigham & Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Guoliang Meng
  - Developmental and Stem Cell Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- James Ellis
  - Developmental and Stem Cell Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Seema Mital
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Ted Rogers Centre for Heart Research, Toronto, ON M5G 1X8, Canada
  - Department of Pediatrics, The Hospital for Sick Children, University of Toronto, Toronto, ON M5G 0A4, Canada
- Marta Melé
  - Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Catalonia, Spain
- Philipp G. Maass
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada

15
Hill C, Hudaiberdiev S, Ovcharenko I. ChromDL: a next-generation regulatory DNA classifier. Bioinformatics 2023; 39:i377-i385. [PMID: 37387183 DOI: 10.1093/bioinformatics/btad217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023]
Abstract
MOTIVATION: Predicting the regulatory function of non-coding DNA using only the DNA sequence continues to be a major challenge in genomics. With the advent of improved optimization algorithms, faster GPU speeds, and more intricate machine-learning libraries, hybrid convolutional and recurrent neural network architectures can be constructed and applied to extract crucial information from non-coding DNA.
RESULTS: Using a comparative analysis of the performance of thousands of Deep Learning architectures, we developed ChromDL, a neural network architecture combining bidirectional gated recurrent units, convolutional neural networks, and bidirectional long short-term memory units, which significantly improves upon a range of prediction metrics compared to its predecessors in transcription factor binding site, histone modification, and DNase-I hyper-sensitive site detection. Combined with a secondary model, it can be utilized for accurate classification of gene regulatory elements. The model can also detect weak transcription factor binding as compared to previously developed methods and has the potential to help delineate transcription factor binding motif specificities.
AVAILABILITY AND IMPLEMENTATION: The ChromDL source code can be found at https://github.com/chrishil1/ChromDL.
Affiliation(s)
- Christopher Hill
  - Computational Biology Branch, Intramural Research Program, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, United States
  - School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA 19104, United States
- Sanjarbek Hudaiberdiev
  - Computational Biology Branch, Intramural Research Program, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, United States
- Ivan Ovcharenko
  - Computational Biology Branch, Intramural Research Program, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, United States

16
FitzPatrick VD, Leemans C, van Arensbergen J, van Steensel B, Bussemaker H. Defining the fine structure of promoter activity on a genome-wide scale with CISSECTOR. Nucleic Acids Res 2023; 51:5499-5511. [PMID: 37013986 PMCID: PMC10287907 DOI: 10.1093/nar/gkad232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2022] [Revised: 03/08/2023] [Accepted: 03/22/2023] [Indexed: 04/05/2023]
Abstract
Classic promoter mutagenesis strategies can be used to study how proximal promoter regions regulate the expression of particular genes of interest. This is a laborious process, in which the smallest sub-region of the promoter still capable of recapitulating expression in an ectopic setting is first identified, followed by targeted mutation of putative transcription factor binding sites. Massively parallel reporter assays such as survey of regulatory elements (SuRE) provide an alternative way to study millions of promoter fragments in parallel. Here we show how a generalized linear model (GLM) can be used to transform genome-scale SuRE data into a high-resolution genomic track that quantifies the contribution of local sequence to promoter activity. This coefficient track helps identify regulatory elements and can be used to predict promoter activity of any sub-region in the genome. It thus allows in silico dissection of any promoter in the human genome to be performed. We developed a web application, available at cissector.nki.nl, that lets researchers easily perform this analysis as a starting point for their research into any promoter of interest.
Affiliation(s)
- Vincent D FitzPatrick
  - Department of Biological Sciences, Columbia University, New York, NY, USA
  - Department of Systems Biology, Columbia University Medical Center, New York, NY, USA
- Christ Leemans
  - Division of Gene Regulation, Oncode Institute, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Joris van Arensbergen
  - Division of Gene Regulation, Oncode Institute, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Bas van Steensel
  - Division of Gene Regulation, Oncode Institute, Netherlands Cancer Institute, Amsterdam, The Netherlands
  - Department of Cell Biology, Erasmus University Medical Center, Rotterdam, The Netherlands
- Harmen J Bussemaker
  - Department of Biological Sciences, Columbia University, New York, NY, USA
  - Department of Systems Biology, Columbia University Medical Center, New York, NY, USA

17
Fabo T, Khavari P. Functional characterization of human genomic variation linked to polygenic diseases. Trends Genet 2023; 39:462-490. [PMID: 36997428 PMCID: PMC11025698 DOI: 10.1016/j.tig.2023.02.014] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/01/2022] [Revised: 02/22/2023] [Accepted: 02/23/2023] [Indexed: 03/30/2023]
Abstract
The burden of human disease lies predominantly in polygenic diseases. Since the early 2000s, genome-wide association studies (GWAS) have identified genetic variants and loci associated with complex traits. These have ranged from variants in coding sequences to mutations in regulatory regions, such as promoters and enhancers, as well as mutations affecting mediators of mRNA stability and other downstream regulators, such as 5' and 3'-untranslated regions (UTRs), long noncoding RNA (lncRNA), and miRNA. Recent research advances in genetics have utilized a combination of computational techniques, high-throughput in vitro and in vivo screening modalities, and precise genome editing to impute the function of diverse classes of genetic variants identified through GWAS. In this review, we highlight the vastness of genomic variants associated with polygenic disease risk and address recent advances in how genetic tools can be used to functionally characterize them.
Affiliation(s)
- Tania Fabo
  - Program in Epithelial Biology, Stanford University, Stanford, CA, USA; Stanford Cancer Institute, Stanford University, Stanford, CA, USA; Graduate Program in Genetics, Stanford University, Stanford, CA, USA; Stanford University School of Medicine, Stanford University, Stanford, CA, USA
- Paul Khavari
  - Program in Epithelial Biology, Stanford University, Stanford, CA, USA; Stanford Cancer Institute, Stanford University, Stanford, CA, USA; Graduate Program in Genetics, Stanford University, Stanford, CA, USA; Stanford University School of Medicine, Stanford University, Stanford, CA, USA; Veterans Affairs Palo Alto Healthcare System, Palo Alto, CA, USA

18
Shi FY, Wang Y, Huang D, Liang Y, Liang N, Chen XW, Gao G. Computational Assessment of the Expression-modulating Potential for Non-coding Variants. Genomics Proteomics Bioinformatics 2023; 21:662-673. [PMID: 34890839 PMCID: PMC10787178 DOI: 10.1016/j.gpb.2021.10.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2021] [Revised: 10/13/2021] [Accepted: 11/01/2021] [Indexed: 06/13/2023]
Abstract
Large-scale genome-wide association studies (GWAS) and expression quantitative trait locus (eQTL) studies have identified multiple non-coding variants associated with genetic diseases by affecting gene expression. However, pinpointing causal variants effectively and efficiently remains a serious challenge. Here, we developed CARMEN, a novel algorithm to identify functional non-coding expression-modulating variants. Multiple evaluations demonstrated CARMEN's superior performance over state-of-the-art tools. Applying CARMEN to GWAS and eQTL datasets further pinpointed several causal variants other than the reported lead single-nucleotide polymorphisms (SNPs). CARMEN scales well with the massive datasets, and is available online as a web server at http://carmen.gao-lab.org.
Affiliation(s)
- Fang-Yuan Shi
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Biomedical Pioneering Innovative Center (BIOPIC) & Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), Peking University, Beijing 100871, China
- Yu Wang
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Biomedical Pioneering Innovative Center (BIOPIC) & Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), Peking University, Beijing 100871, China
- Dong Huang
  - State Key Laboratory of Membrane Biology, Institute of Molecular Medicine, Peking University, Beijing 100871, China
- Yu Liang
  - Human Aging Research Institute, School of Life Science, Nanchang University, Nanchang 330031, China
- Nan Liang
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Biomedical Pioneering Innovative Center (BIOPIC) & Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), Peking University, Beijing 100871, China
- Xiao-Wei Chen
  - State Key Laboratory of Membrane Biology, Institute of Molecular Medicine, Peking University, Beijing 100871, China; Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Ge Gao
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Biomedical Pioneering Innovative Center (BIOPIC) & Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), Peking University, Beijing 100871, China

19
Rogers BB, Anderson AG, Lauzon SN, Davis MN, Hauser RM, Roberts SC, Rodriguez-Nunez I, Trausch-Lowther K, Barinaga EA, Taylor JW, Mackiewicz M, Roberts BS, Cooper SJ, Rizzardi LF, Myers RM, Cochran JN. MAPT expression is mediated by long-range interactions with cis-regulatory elements. bioRxiv 2023:2023.03.07.531520. [PMID: 37090552 PMCID: PMC10120716 DOI: 10.1101/2023.03.07.531520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
Background: Tauopathies are a group of neurodegenerative diseases driven by abnormal aggregates of tau, a microtubule associated protein encoded by the MAPT gene. MAPT expression is absent in neural progenitor cells (NPCs) and increases during differentiation. This temporally dynamic expression pattern suggests that MAPT expression is controlled by transcription factors and cis-regulatory elements specific to differentiated cell types. Given the relevance of MAPT expression to neurodegeneration pathogenesis, identification of such elements is relevant to understanding genetic risk factors.
Methods: We performed HiC, chromatin conformation capture (Capture-C), single-nucleus multiomics (RNA-seq+ATAC-seq), bulk ATAC-seq, and ChIP-seq for H3K27Ac and CTCF in NPCs and neurons differentiated from human iPSC cultures. We nominated candidate cis-regulatory elements (cCREs) for MAPT in human NPCs, differentiated neurons, and pure cultures of inhibitory and excitatory neurons. We then assayed these cCREs using luciferase assays and CRISPR interference (CRISPRi) experiments to measure their effects on MAPT expression. Finally, we integrated cCRE annotations into an analysis of genetic variation in AD cases and controls.
Results: Using orthogonal genomics approaches, we nominated 94 cCREs for MAPT, including the identification of cCREs specifically active in differentiated neurons. Eleven regions enhanced reporter gene transcription in luciferase assays. Using CRISPRi, 5 of the 94 regions tested were identified as necessary for MAPT expression as measured by RT-qPCR and RNA-seq. Rare and predicted damaging genetic variation in both nominated and confirmed CREs was depleted in AD cases relative to controls (OR = 0.40, p = 0.004), consistent with the hypothesis that variants that disrupt MAPT enhancer activity, and thereby reduce MAPT expression, may be protective against neurodegenerative disease.
Conclusions: We identified both proximal and distal regulatory elements for MAPT and confirmed the regulatory function for several regions, including three regions centromeric to MAPT beyond the well-described H1/H2 haplotype inversion breakpoint. This study provides compelling evidence for pursuing detailed knowledge of CREs for genes of interest to permit better understanding of disease risk.
Affiliation(s)
- Brianne B. Rogers
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
  - University of Alabama at Birmingham, Birmingham, AL, USA
- Jared W. Taylor
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Mark Mackiewicz
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Sara J. Cooper
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA

20
Ren N, Dai S, Ma S, Yang F. Strategies for activity analysis of single nucleotide polymorphisms associated with human diseases. Clin Genet 2023; 103:392-400. [PMID: 36527336 DOI: 10.1111/cge.14282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Revised: 12/10/2022] [Accepted: 12/13/2022] [Indexed: 12/23/2022]
Abstract
Genome-wide association studies (GWAS) have identified a large number of single nucleotide polymorphism (SNP) sites associated with human diseases. In the annotation of human diseases, especially cancers, SNPs, as an important component of genetic factors, have gained increasing attention. Given that most of the SNPs are located in non-coding regions, the functional verification of these SNPs is a great challenge. The key to functional annotation for risk SNPs is to screen SNPs with regulatory activity from thousands of disease associated-SNPs. In this review, we systematically recapitulate the characteristics and functional roles of SNP sites, discuss three parallel reporter screening strategies in detail based on barcode tag classification, and recommend the common in silico strategies to help supplement the annotation of SNP sites with epigenetic activity analysis, prediction of target genes and trans-acting factors. We hope that this review will contribute to this exuberant research field by providing robust activity analysis strategies that can facilitate the translation of GWAS results into personalized diagnosis and prevention measures for human diseases.
Affiliation(s)
- Naixia Ren
  - School of Life Sciences and Medicine, Shandong University of Technology, Zibo, China
- Shangkun Dai
  - School of Life Sciences and Medicine, Shandong University of Technology, Zibo, China
- Shumin Ma
  - School of Medicine and Pharmacy, Ocean University of China, Qingdao, China
- Fengtang Yang
  - School of Life Sciences and Medicine, Shandong University of Technology, Zibo, China

21
Anderson AG, Rogers BB, Loupe JM, Rodriguez-Nunez I, Roberts SC, White LM, Brazell JN, Bunney WE, Bunney BG, Watson SJ, Cochran JN, Myers RM, Rizzardi LF. Single nucleus multiomics identifies ZEB1 and MAFB as candidate regulators of Alzheimer's disease-specific cis-regulatory elements. Cell Genom 2023; 3:100263. [PMID: 36950385 PMCID: PMC10025452 DOI: 10.1016/j.xgen.2023.100263] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 10/05/2022] [Revised: 12/06/2022] [Accepted: 01/12/2023] [Indexed: 02/05/2023]
Abstract
Cell type-specific transcriptional differences between brain tissues from donors with Alzheimer's disease (AD) and unaffected controls have been well documented, but few studies have rigorously interrogated the regulatory mechanisms responsible for these alterations. We performed single nucleus multiomics (snRNA-seq plus snATAC-seq) on 105,332 nuclei isolated from cortical tissues from 7 AD and 8 unaffected donors to identify candidate cis-regulatory elements (CREs) involved in AD-associated transcriptional changes. We detected 319,861 significant correlations, or links, between gene expression and cell type-specific transposase accessible regions enriched for active CREs. Among these, 40,831 were unique to AD tissues. Validation experiments confirmed the activity of many regions, including several candidate regulators of APP expression. We identified ZEB1 and MAFB as candidate transcription factors playing important roles in AD-specific gene regulation in neurons and microglia, respectively. Microglia links were globally enriched for heritability of AD risk and previously identified active regulatory regions.
Affiliation(s)
- Brianne B. Rogers
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- University of Alabama at Birmingham, Birmingham, AL, USA
- Jacob M. Loupe
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Lauren M. White
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- William E. Bunney
- Department of Psychiatry and Human Behavior, College of Medicine, University of California, Irvine, Irvine, CA, USA
- Blynn G. Bunney
- Department of Psychiatry and Human Behavior, College of Medicine, University of California, Irvine, Irvine, CA, USA
- Stanley J. Watson
- Mental Health Research Institute, University of Michigan, Ann Arbor, MI, USA
22
Reddy AJ, Herschl MH, Kolli S, Lu AX, Geng X, Kumar A, Hsu PD, Levine S, Ioannidis NM. Pretraining strategies for effective promoter-driven gene expression prediction. bioRxiv 2023:2023.02.24.529941. [PMID: 36909524 PMCID: PMC10002662 DOI: 10.1101/2023.02.24.529941] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/03/2023]
Abstract
Advances in gene delivery technologies are enabling rapid progress in molecular medicine, but require precise expression of genetic cargo in desired cell types, which is predominantly achieved via a regulatory DNA sequence called a promoter; however, only a handful of cell type-specific promoters are known. Efficiently designing compact promoter sequences with a high density of regulatory information by leveraging machine learning models would therefore be broadly impactful for fundamental research and direct therapeutic applications. However, models of expression from such compact promoter sequences are lacking, despite the recent success of deep learning in modelling expression from endogenous regulatory sequences. Despite the lack of large datasets measuring promoter-driven expression in many cell types, data from a few well-studied cell types or from endogenous gene expression may provide relevant information for transfer learning, which has not yet been explored in this setting. Here, we evaluate a variety of pretraining tasks and transfer strategies for modelling cell type-specific expression from compact promoters and demonstrate the effectiveness of pretraining on existing promoter-driven expression datasets from other cell types. Our approach is broadly applicable for modelling promoter-driven expression in any data-limited cell type of interest, and will enable the use of model-based optimization techniques for promoter design for gene delivery applications. Our code and data are available at https://github.com/anikethjr/promoter_models.
23
Li S, Hannenhalli S, Ovcharenko I. De novo human brain enhancers created by single-nucleotide mutations. Sci Adv 2023; 9:eadd2911. [PMID: 36791193 PMCID: PMC9931207 DOI: 10.1126/sciadv.add2911] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/01/2022] [Accepted: 01/12/2023] [Indexed: 05/30/2023]
Abstract
Advanced human cognition is attributed to increased neocortex size and complexity, but the underlying evolutionary and regulatory mechanisms are largely unknown. Using human and macaque embryonic neocortical H3K27ac data coupled with a deep learning model of enhancers, we identified ~4000 enhancer gains in humans, which, per our model, can often be attributed to single-nucleotide essential mutations. Our analyses suggest that functional gains in embryonic brain development are associated with de novo enhancers whose putative target genes exhibit increased expression in progenitor cells and interneurons and partake in critical neural developmental processes. Essential mutations alter enhancer activity through altered binding of key transcription factors (TFs) of embryonic neocortex, including ISL1, POU3F2, PITX1/2, and several SOX TFs, and are associated with central nervous system disorders. Overall, our results suggest that essential mutations lead to gain of embryonic neocortex enhancers, which orchestrate expression of genes involved in critical developmental processes associated with human cognition.
Affiliation(s)
- Shan Li
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Sridhar Hannenhalli
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Ivan Ovcharenko
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA
24
Gallego Romero I, Lea AJ. Leveraging massively parallel reporter assays for evolutionary questions. Genome Biol 2023; 24:26. [PMID: 36788564 PMCID: PMC9926830 DOI: 10.1186/s13059-023-02856-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2022] [Accepted: 01/17/2023] [Indexed: 02/16/2023] Open
Abstract
A long-standing goal of evolutionary biology is to decode how gene regulation contributes to organismal diversity. Doing so is challenging because it is hard to predict function from non-coding sequence and to perform molecular research with non-model taxa. Massively parallel reporter assays (MPRAs) enable the testing of thousands to millions of sequences for regulatory activity simultaneously. Here, we discuss the execution, advantages, and limitations of MPRAs, with a focus on evolutionary questions. We propose solutions for extending MPRAs to rare taxa and those with limited genomic resources, and we underscore MPRA's broad potential for driving genome-scale, functional studies across organisms.
Affiliation(s)
- Irene Gallego Romero
- Melbourne Integrative Genomics, University of Melbourne, Royal Parade, Parkville, Victoria, 3010, Australia; School of BioSciences, The University of Melbourne, Royal Parade, Parkville, 3010, Australia; The Centre for Stem Cell Systems, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, 30 Royal Parade, Parkville, Victoria, 3010, Australia; Center for Genomics, Evolution and Medicine, Institute of Genomics, University of Tartu, Riia 23b, 51010, Tartu, Estonia.
- Amanda J. Lea
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37240 USA; Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN 37240 USA; Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37240 USA; Child and Brain Development Program, Canadian Institute for Advanced Study, Toronto, Canada
25
Stikker BS, Hendriks RW, Stadhouders R. Decoding the genetic and epigenetic basis of asthma. Allergy 2023; 78:940-956. [PMID: 36727912 DOI: 10.1111/all.15666] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 10/17/2022] [Revised: 01/17/2023] [Accepted: 01/30/2023] [Indexed: 02/03/2023]
Abstract
Asthma is a complex and heterogeneous chronic inflammatory disease of the airways. Alongside environmental factors, asthma susceptibility is strongly influenced by genetics. Given its high prevalence and our incomplete understanding of the mechanisms underlying disease susceptibility, asthma is frequently studied in genome-wide association studies (GWAS), which have identified thousands of genetic variants associated with asthma development. Virtually all these genetic variants reside in non-coding genomic regions, which has obscured the functional impact of asthma-associated variants and their translation into disease-relevant mechanisms. Recent advances in genomics technology and epigenetics now offer methods to link genetic variants to gene regulatory elements embedded within non-coding regions, which have started to unravel the molecular mechanisms underlying the complex (epi)genetics of asthma. Here, we provide an integrated overview of (epi)genetic variants associated with asthma, focusing on efforts to link these disease associations to biological insight into asthma pathophysiology using state-of-the-art genomics methodology. Finally, we provide a perspective as to how decoding the genetic and epigenetic basis of asthma has the potential to transform clinical management of asthma and to predict the risk of asthma development.
Affiliation(s)
- Bernard S Stikker
- Department of Pulmonary Medicine, Erasmus MC, University Medical Center, Rotterdam, The Netherlands
- Rudi W Hendriks
- Department of Pulmonary Medicine, Erasmus MC, University Medical Center, Rotterdam, The Netherlands
- Ralph Stadhouders
- Department of Pulmonary Medicine, Erasmus MC, University Medical Center, Rotterdam, The Netherlands; Department of Cell Biology, Erasmus MC, University Medical Center, Rotterdam, The Netherlands
26
Nguyen TV, Vander Jagt CJ, Wang J, Daetwyler HD, Xiang R, Goddard ME, Nguyen LT, Ross EM, Hayes BJ, Chamberlain AJ, MacLeod IM. In it for the long run: perspectives on exploiting long-read sequencing in livestock for population scale studies of structural variants. Genet Sel Evol 2023; 55:9. [PMID: 36721111 PMCID: PMC9887926 DOI: 10.1186/s12711-023-00783-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Accepted: 01/23/2023] [Indexed: 02/02/2023] Open
Abstract
Studies have demonstrated that structural variants (SV) play a substantial role in the evolution of species and have an impact on Mendelian traits in the genome. However, unlike small variants (< 50 bp), it has been challenging to accurately identify and genotype SV at the population scale using short-read sequencing. Long-read sequencing technologies are becoming competitively priced and can address several of the disadvantages of short-read sequencing for the discovery and genotyping of SV. In livestock species, analysis of SV at the population scale still faces challenges due to the lack of resources, high costs, technological barriers, and computational limitations. In this review, we summarize recent progress in the characterization of SV in the major livestock species, the obstacles that still need to be overcome, as well as the future directions in this growing field. It seems timely that research communities pool resources to build global population-scale long-read sequencing consortiums for the major livestock species for which the application of genomic tools has become cost-effective.
Affiliation(s)
- Tuan V. Nguyen
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083 Australia
- Christy J. Vander Jagt
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083 Australia
- Jianghui Wang
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083 Australia
- Hans D. Daetwyler
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083 Australia; School of Applied Systems Biology, La Trobe University, Bundoora, VIC 3083 Australia
- Ruidong Xiang
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083 Australia; Faculty of Veterinary & Agricultural Science, The University of Melbourne, Parkville, VIC 3052 Australia
- Michael E. Goddard
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083 Australia; Faculty of Veterinary & Agricultural Science, The University of Melbourne, Parkville, VIC 3052 Australia
- Loan T. Nguyen
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, St Lucia, QLD 4072 Australia
- Elizabeth M. Ross
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, St Lucia, QLD 4072 Australia
- Ben J. Hayes
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, St Lucia, QLD 4072 Australia
- Amanda J. Chamberlain
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083 Australia; School of Applied Systems Biology, La Trobe University, Bundoora, VIC 3083 Australia
- Iona M. MacLeod
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083 Australia
27
Hill C, Hudaiberdiev S, Ovcharenko I. ChromDL: A Next-Generation Regulatory DNA Classifier. bioRxiv 2023:2023.01.27.525971. [PMID: 36789431 PMCID: PMC9928050 DOI: 10.1101/2023.01.27.525971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
MOTIVATION Predicting the regulatory function of non-coding DNA using only the DNA sequence continues to be a major challenge in genomics. With the advent of improved optimization algorithms, faster GPU speeds, and more intricate machine learning libraries, hybrid convolutional and recurrent neural network architectures can be constructed and applied to extract crucial information from non-coding DNA. RESULTS Using a comparative analysis of the performance of thousands of Deep Learning (DL) architectures, we developed ChromDL, a neural network architecture combining bidirectional gated recurrent units (BiGRU), convolutional neural networks (CNNs), and bidirectional long short-term memory units (BiLSTM), which significantly improves upon a range of prediction metrics compared to its predecessors in transcription factor binding site (TFBS), histone modification (HM), and DNase-I hypersensitive site (DHS) detection. Combined with a secondary model, it can be utilized for accurate classification of gene regulatory elements. The model can also detect weak transcription factor (TF) binding with higher accuracy as compared to previously developed methods and has the potential to accurately delineate TF binding motif specificities. AVAILABILITY The ChromDL source code can be found at https://github.com/chrishil1/ChromDL .
Affiliation(s)
- Christopher Hill
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20892, USA
- School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Sanjarbek Hudaiberdiev
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20892, USA
- Ivan Ovcharenko
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20892, USA
28
Kircher M, Ludwig KU. Systematic assays and resources for the functional annotation of non-coding variants. MED GENET-BERLIN 2022; 34:275-286. [PMID: 37034418 PMCID: PMC10081529 DOI: 10.1515/medgen-2022-2161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Identification of genetic variation in individual genomes is now a routine procedure in human genetic research and diagnostics. For many variants, however, insufficient evidence is available to establish a pathogenic effect, particularly for variants in non-coding regions. Furthermore, the sheer number of candidate variants renders testing in individual assays virtually impossible. While scalable approaches are being developed, the selection of methods and resources and the application of a given framework to a particular disease or trait remain major challenges. This limits the translation of results from both genome-wide association studies and genome sequencing. Here, we discuss computational and experimental approaches available for functional annotation of non-coding variation.
Affiliation(s)
- Martin Kircher
- Institute of Human Genetics, University of Lübeck, Lübeck, Germany
- Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Kerstin U. Ludwig
- Institute of Human Genetics, University Hospital Bonn, University of Bonn, Venusberg-Campus 1, Building 76, Bonn, Germany
29
Zaugg JB, Sahlén P, Andersson R, Alberich-Jorda M, de Laat W, Deplancke B, Ferrer J, Mandrup S, Natoli G, Plewczynski D, Rada-Iglesias A, Spicuglia S. Current challenges in understanding the role of enhancers in disease. Nat Struct Mol Biol 2022; 29:1148-58. [PMID: 36482255 DOI: 10.1038/s41594-022-00896-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 09/30/2022] [Accepted: 11/04/2022] [Indexed: 12/13/2022]
Abstract
Enhancers play a central role in the spatiotemporal control of gene expression and tend to work in a cell-type-specific manner. In addition, they are suggested to be major contributors to phenotypic variation, evolution and disease. There is growing evidence that enhancer dysfunction due to genetic, structural or epigenetic mechanisms contributes to a broad range of human diseases referred to as enhanceropathies. Such mechanisms often underlie the susceptibility to common diseases, but can also play a direct causal role in cancer or Mendelian diseases. Despite the recent gain of insights into enhancer biology and function, we still have a limited ability to predict how enhancer dysfunction impacts gene expression. Here we discuss the major challenges that need to be overcome when studying the role of enhancers in disease etiology and highlight opportunities and directions for future studies, aiming to disentangle the molecular basis of enhanceropathies.
30
Long E, Yin J, Funderburk KM, Xu M, Feng J, Kane A, Zhang T, Myers T, Golden A, Thakur R, Kong H, Jessop L, Kim EY, Jones K, Chari R, Machiela MJ, Yu K, Iles MM, Landi MT, Law MH, Chanock SJ, Brown KM, Choi J. Massively parallel reporter assays and variant scoring identified functional variants and target genes for melanoma loci and highlighted cell-type specificity. Am J Hum Genet 2022; 109:2210-2229. [PMID: 36423637 PMCID: PMC9748337 DOI: 10.1016/j.ajhg.2022.11.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/04/2022] [Accepted: 11/02/2022] [Indexed: 11/24/2022] Open
Abstract
The most recent genome-wide association study (GWAS) of cutaneous melanoma identified 54 risk-associated loci, but functional variants and their target genes for most have not been established. Here, we performed massively parallel reporter assays (MPRAs) by using malignant melanoma and normal melanocyte cells and further integrated multi-layer annotation to systematically prioritize functional variants and susceptibility genes from these GWAS loci. Of 1,992 risk-associated variants tested in MPRAs, we identified 285 from 42 loci (78% of the known loci) displaying significant allelic transcriptional activities in either cell type (FDR < 1%). We further characterized MPRA-significant variants by motif prediction, epigenomic annotation, and statistical/functional fine-mapping to create integrative variant scores, which prioritized one to six plausible candidate variants per locus for the 42 loci and nominated a single variant for 43% of these loci. Overlaying the MPRA-significant variants with genome-wide significant expression or methylation quantitative trait loci (eQTLs or meQTLs, respectively) from melanocytes or melanomas identified candidate susceptibility genes for 60% of variants (172 of 285 variants). CRISPRi of top-scoring variants validated their cis-regulatory effect on the eQTL target genes, MAFF (22q13.1) and GPRC5A (12p13.1). Finally, we identified 36 melanoma-specific and 45 melanocyte-specific MPRA-significant variants, a subset of which are linked to cell-type-specific target genes. Analyses of transcription factor availability in MPRA datasets and variant-transcription-factor interaction in eQTL datasets highlighted the roles of transcription factors in cell-type-specific variant functionality. In conclusion, MPRAs along with variant scoring effectively prioritized plausible candidates for most melanoma GWAS loci and highlighted cellular contexts where the susceptibility variants are functional.
Affiliation(s)
- Erping Long
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Jinhu Yin
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Karen M. Funderburk
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Mai Xu
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- James Feng
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Alexander Kane
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Tongwu Zhang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Timothy Myers
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Alyxandra Golden
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Rohit Thakur
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Hyunkyung Kong
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Lea Jessop
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Eun Young Kim
- Department of Internal Medicine, Yonsei University College of Medicine, Seoul, Republic of Korea
- Kristine Jones
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Raj Chari
- Genome Modification Core, Frederick National Lab for Cancer Research, National Cancer Institute, Frederick, MD, USA
- Mitchell J. Machiela
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Kai Yu
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Mark M. Iles
- Leeds Institute for Data Analytics, School of Medicine, University of Leeds, Leeds LS2 9NL, UK
- Maria Teresa Landi
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Matthew H. Law
- Statistical Genetics, QIMR Berghofer Medical Research Institute, Brisbane, QLD 4006, Australia; Faculty of Health, Queensland University of Technology, Brisbane, QLD, Australia; School of Biomedical Sciences, University of Queensland, Brisbane, QLD, Australia
- Stephen J. Chanock
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Kevin M. Brown
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Jiyeon Choi
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA. Corresponding author
31
Tabet D, Parikh V, Mali P, Roth FP, Claussnitzer M. Scalable Functional Assays for the Interpretation of Human Genetic Variation. Annu Rev Genet 2022; 56:441-465. [PMID: 36055970 DOI: 10.1146/annurev-genet-072920-032107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Abstract
Scalable sequence-function studies have enabled the systematic analysis and cataloging of hundreds of thousands of coding and noncoding genetic variants in the human genome. This has improved clinical variant interpretation and provided insights into the molecular, biophysical, and cellular effects of genetic variants at an astonishing scale and resolution across the spectrum of allele frequencies. In this review, we explore current applications and prospects for the field and outline the principles underlying scalable functional assay design, with a focus on the study of single-nucleotide coding and noncoding variants.
Affiliation(s)
- Daniel Tabet
- Donnelly Centre, Department of Molecular Genetics, and Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada
- Victoria Parikh
- Center for Inherited Cardiovascular Disease, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, California, USA
- Prashant Mali
- Department of Bioengineering, University of California, San Diego, California, USA
- Frederick P Roth
- Donnelly Centre, Department of Molecular Genetics, and Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada
- Melina Claussnitzer
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Center for Genomic Medicine and Endocrine Division, Massachusetts General Hospital, Boston, Massachusetts, USA
- Harvard Medical School, Harvard University, Boston, Massachusetts, USA
32
Koido M, Hon CC, Koyama S, Kawaji H, Murakawa Y, Ishigaki K, Ito K, Sese J, Parrish NF, Kamatani Y, Carninci P, Terao C. Prediction of the cell-type-specific transcription of non-coding RNAs from genome sequences via machine learning. Nat Biomed Eng 2022:10.1038/s41551-022-00961-8. [PMID: 36411359 DOI: 10.1038/s41551-022-00961-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/25/2021] [Accepted: 10/12/2022] [Indexed: 11/22/2022]
Abstract
Gene transcription is regulated through complex mechanisms involving non-coding RNAs (ncRNAs). As the transcription of ncRNAs, especially of enhancer RNAs, is often low and cell type specific, how the levels of RNA transcription depend on genotype remains largely unexplored. Here we report the development and utility of a machine-learning model (MENTR) that reliably links genome sequence and ncRNA expression at the cell type level. Effects on ncRNA transcription predicted by the model were concordant with estimates from published studies in a cell-type-dependent manner, regardless of allele frequency and genetic linkage. Among 41,223 variants from genome-wide association studies, the model identified 7,775 enhancer RNAs and 3,548 long ncRNAs causally associated with complex traits across 348 major human primary cells and tissues, such as rare variants plausibly altering the transcription of enhancer RNAs to influence the risks of Crohn's disease and asthma. The model may aid the discovery of causal variants and the generation of testable hypotheses for biological mechanisms driving complex traits.
Affiliation(s)
- Masaru Koido
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan; Division of Molecular Pathology, Department of Cancer Biology, Institute of Medical Science, The University of Tokyo, Tokyo, Japan; Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Chung-Chau Hon
- Laboratory for Genome Information Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Satoshi Koyama
- Laboratory for Cardiovascular Genomics and Informatics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Hideya Kawaji
- Preventive Medicine and Applied Genomics Unit, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan; Research Center for Genome & Medical Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- Yasuhiro Murakawa
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan; IFOM ETS - The AIRC Institute of Molecular Oncology, Milan, Italy; Institute for the Advanced Study of Human Biology, Kyoto University, Kyoto, Japan
- Kazuyoshi Ishigaki
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan; Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Center for Data Sciences, Harvard Medical School, Boston, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Kaoru Ito
- Laboratory for Cardiovascular Genomics and Informatics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Jun Sese
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, Aomi, Koto-ku, Tokyo, Japan; Humanome Lab Inc., Tokyo, Japan
- Nicholas F Parrish
- Genome Immunobiology RIKEN Hakubi Research Team, RIKEN Cluster for Pioneering Research and RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Yoichiro Kamatani
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan; Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Piero Carninci
- Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan; Laboratory for Single Cell Technologies, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan; Human Technopole, Milan, Italy
- Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan; Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan; The Department of Applied Genetics, The School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan.
|
33
|
Pang B, van Weerd JH, Hamoen FL, Snyder MP. Identification of non-coding silencer elements and their regulation of gene expression. Nat Rev Mol Cell Biol 2022; 24:383-395. [DOI: 10.1038/s41580-022-00549-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/29/2022] [Indexed: 11/09/2022]
|
34
|
Cooper YA, Guo Q, Geschwind DH. Multiplexed functional genomic assays to decipher the noncoding genome. Hum Mol Genet 2022; 31:R84-R96. [PMID: 36057282 PMCID: PMC9585676 DOI: 10.1093/hmg/ddac194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2022] [Revised: 08/08/2022] [Accepted: 08/09/2022] [Indexed: 11/14/2022] Open
Abstract
Linkage disequilibrium and the incomplete regulatory annotation of the noncoding genome complicates the identification of functional noncoding genetic variants and their causal association with disease. Current computational methods for variant prioritization have limited predictive value, necessitating the application of highly parallelized experimental assays to efficiently identify functional noncoding variation. Here, we summarize two distinct approaches, massively parallel reporter assays and CRISPR-based pooled screens and describe their flexible implementation to characterize human noncoding genetic variation at unprecedented scale. Each approach provides unique advantages and limitations, highlighting the importance of multimodal methodological integration. These multiplexed assays of variant effects are undoubtedly poised to play a key role in the experimental characterization of noncoding genetic risk, informing our understanding of the underlying mechanisms of disease-associated loci and the development of more robust predictive classification algorithms.
Affiliation(s)
- Yonatan A Cooper
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Medical Scientist Training Program, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Center for Neurobehavioral Genetics, Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, USA
| | - Qiuyu Guo
- Center for Neurobehavioral Genetics, Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, USA
| | - Daniel H Geschwind
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Program in Neurogenetics, Department of Neurology, University of California Los Angeles, Los Angeles, CA, USA
- Center for Autism Research and Treatment, Semel Institute, University of California Los Angeles, Los Angeles, CA, USA
- Institute of Precision Health, University of California Los Angeles, Los Angeles, CA, USA
|
35
|
Wang C, Dai J, Qin N, Fan J, Ma H, Chen C, An M, Zhang J, Yan C, Gu Y, Xie Y, He Y, Jiang Y, Zhu M, Song C, Jiang T, Liu J, Zhou J, Wang N, Hua T, Liang S, Wang L, Xu J, Yin R, Chen L, Xu L, Jin G, Lin D, Hu Z, Shen H. Analyses of rare predisposing variants of lung cancer in 6,004 whole genomes in Chinese. Cancer Cell 2022; 40:1223-1239.e6. [PMID: 36113475 DOI: 10.1016/j.ccell.2022.08.013] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 03/10/2022] [Revised: 07/08/2022] [Accepted: 08/15/2022] [Indexed: 12/24/2022]
Abstract
We present the largest whole-genome sequencing (WGS) study of non-small cell lung cancer (NSCLC) to date among 6,004 individuals of Chinese ancestry, coupled with 23,049 individuals genotyped by SNP array. We construct a high-quality haplotype reference panel for imputation and identify 20 common and low-frequency loci (minor allele frequency [MAF] ≥ 0.5%), including five loci that have never been reported before. For rare loss-of-function (LoF) variants (MAF < 0.5%), we identify BRCA2 and 18 other cancer predisposition genes that affect 5.29% of individuals with NSCLC, and 98.91% (181 of 183) of LoF variants have not been linked previously to NSCLC risk. Promoter variants of BRCA2 also have a substantial effect on NSCLC risk, and their prevalence is comparable with BRCA2 LoF variants. The associations are validated in an independent case-control study including 4,410 individuals and a prospective cohort study including 23,826 individuals. Our findings not only provide a high-quality reference panel for future array-based association studies but depict the whole picture of rare pathogenic variants for NSCLC.
Affiliation(s)
- Cheng Wang
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Bioinformatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Juncheng Dai
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Na Qin
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Jingyi Fan
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Hongxia Ma
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China; State Key Laboratory of Reproductive Medicine (Suzhou Centre), Gusu School, Nanjing Medical University, Suzhou 215002, Jiangsu, China; Research Units of Cohort Study on Cardiovascular Diseases and Cancers, Chinese Academy of Medical Sciences, Beijing 100730, China
| | - Congcong Chen
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Mingxing An
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Jing Zhang
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Caiwang Yan
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Yayun Gu
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Yuan Xie
- Department of Bioinformatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Yuanlin He
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Yue Jiang
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Meng Zhu
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Ci Song
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Tao Jiang
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Jia Liu
- Department of Health Promotion & Chronic Non-Communicable Disease Control, Wuxi Center for Disease Control and Prevention, Affiliated Wuxi Center for Disease Control and Prevention of Nanjing Medical University, Wuxi 214145, Jiangsu, China
| | - Jun Zhou
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Nanxi Wang
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Tingting Hua
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Shuang Liang
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Lu Wang
- Department of Health Promotion & Chronic Non-Communicable Disease Control, Wuxi Center for Disease Control and Prevention, Affiliated Wuxi Center for Disease Control and Prevention of Nanjing Medical University, Wuxi 214145, Jiangsu, China
| | - Jing Xu
- Department of Thoracic Surgery, First Affiliated Hospital of Nanjing Medical University, Nanjing 210029, Jiangsu, China
| | - Rong Yin
- Jiangsu Key Laboratory of Molecular and Translational Cancer Research, Department of Thoracic Surgery Jiangsu Cancer Hospital, Jiangsu Institute of Cancer Research, Nanjing Medical University Affiliated Cancer Hospital, Nanjing 210029, Jiangsu, China
| | - Liang Chen
- Department of Thoracic Surgery, First Affiliated Hospital of Nanjing Medical University, Nanjing 210029, Jiangsu, China
| | - Lin Xu
- Jiangsu Key Laboratory of Molecular and Translational Cancer Research, Department of Thoracic Surgery Jiangsu Cancer Hospital, Jiangsu Institute of Cancer Research, Nanjing Medical University Affiliated Cancer Hospital, Nanjing 210029, Jiangsu, China
| | - Guangfu Jin
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China
| | - Dongxin Lin
- Department of Etiology and Carcinogenesis, National Cancer Center and Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China
| | - Zhibin Hu
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China; State Key Laboratory of Reproductive Medicine (Suzhou Centre), Gusu School, Nanjing Medical University, Suzhou 215002, Jiangsu, China.
| | - Hongbing Shen
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, Jiangsu, China; Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, Jiangsu, China; State Key Laboratory of Reproductive Medicine (Suzhou Centre), Gusu School, Nanjing Medical University, Suzhou 215002, Jiangsu, China; Research Units of Cohort Study on Cardiovascular Diseases and Cancers, Chinese Academy of Medical Sciences, Beijing 100730, China.
|
36
|
Raskó T, Pande A, Radscheit K, Zink A, Singh M, Sommer C, Wachtl G, Kolacsek O, Inak G, Szvetnik A, Petrakis S, Bunse M, Bansal V, Selbach M, Orbán TI, Prigione A, Hurst LD, Izsvák Z. A Novel Gene Controls a New Structure: PiggyBac Transposable Element-Derived 1, Unique to Mammals, Controls Mammal-Specific Neuronal Paraspeckles. Mol Biol Evol 2022; 39:6661922. [PMID: 36205081 PMCID: PMC9538788 DOI: 10.1093/molbev/msac175] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/15/2022] Open
Abstract
Although new genes can arrive from modes other than duplication, few examples are well characterized. Given high expression in some human brain subregions and a putative link to psychological disorders [e.g., schizophrenia (SCZ)], suggestive of brain functionality, here we characterize piggyBac transposable element-derived 1 (PGBD1). PGBD1 is nonmonotreme mammal-specific and under purifying selection, consistent with functionality. The gene body of human PGBD1 retains much of the original DNA transposon but has additionally captured SCAN and KRAB domains. Despite gene body retention, PGBD1 has lost transposition abilities, thus transposase functionality is absent. PGBD1 no longer recognizes piggyBac transposon-like inverted repeats, nonetheless PGBD1 has DNA binding activity. Genome scale analysis identifies enrichment of binding sites in and around genes involved in neuronal development, with association with both histone activating and repressing marks. We focus on one of the repressed genes, the long noncoding RNA NEAT1, also dysregulated in SCZ, the core structural RNA of paraspeckles. DNA binding assays confirm specific binding of PGBD1 both in the NEAT1 promoter and in the gene body. Depletion of PGBD1 in neuronal progenitor cells (NPCs) results in increased NEAT1/paraspeckles and differentiation. We conclude that PGBD1 has evolved core regulatory functionality for the maintenance of NPCs. As paraspeckles are a mammal-specific structure, the results presented here show a rare example of the evolution of a novel gene coupled to the evolution of a contemporaneous new structure.
Affiliation(s)
- Tamás Raskó
- Max Delbrück Center for Molecular Medicine in the Helmholtz Society, Berlin, Germany
| | | | | | - Annika Zink
- Department of General Pediatrics, Neonatology and Pediatric Cardiology, Medical Faculty, Heinrich Heine University, Duesseldorf, Germany
| | - Manvendra Singh
- Max Delbrück Center for Molecular Medicine in the Helmholtz Society, Berlin, Germany
| | - Christian Sommer
- Max Delbrück Center for Molecular Medicine in the Helmholtz Society, Berlin, Germany
| | - Gerda Wachtl
- Institute of Enzymology, Research Centre for Natural Sciences, ELKH, Budapest, Hungary; Doctoral School of Biology, Institute of Biology, ELTE Eötvös Loránd University, Budapest, Hungary
| | - Orsolya Kolacsek
- Institute of Enzymology, Research Centre for Natural Sciences, ELKH, Budapest, Hungary
| | - Gizem Inak
- Department of General Pediatrics, Neonatology and Pediatric Cardiology, Medical Faculty, Heinrich Heine University, Duesseldorf, Germany
| | - Attila Szvetnik
- Max Delbrück Center for Molecular Medicine in the Helmholtz Society, Berlin, Germany
| | - Spyros Petrakis
- Institute of Applied Biosciences/Centre for Research and Technology Hellas, 57001 Thessaloniki, Greece
| | - Mario Bunse
- Max Delbrück Center for Molecular Medicine in the Helmholtz Society, Berlin, Germany
| | - Vikas Bansal
- Biomedical Data Science and Machine Learning Group, German Center for Neurodegenerative Diseases, Tübingen 72076, Germany
| | - Matthias Selbach
- Max Delbrück Center for Molecular Medicine in the Helmholtz Society, Berlin, Germany
| | - Tamás I Orbán
- Institute of Enzymology, Research Centre for Natural Sciences, ELKH, Budapest, Hungary
| | - Alessandro Prigione
- Department of General Pediatrics, Neonatology and Pediatric Cardiology, Medical Faculty, Heinrich Heine University, Duesseldorf, Germany
|
37
|
Wang QS, Edahiro R, Namkoong H, Hasegawa T, Shirai Y, Sonehara K, Tanaka H, Lee H, Saiki R, Hyugaji T, Shimizu E, Katayama K, Kanai M, Naito T, Sasa N, Yamamoto K, Kato Y, Morita T, Takahashi K, Harada N, Naito T, Hiki M, Matsushita Y, Takagi H, Ichikawa M, Nakamura A, Harada S, Sandhu Y, Kabata H, Masaki K, Kamata H, Ikemura S, Chubachi S, Okamori S, Terai H, Morita A, Asakura T, Sasaki J, Morisaki H, Uwamino Y, Nanki K, Uchida S, Uno S, Nishimura T, Ishiguro T, Isono T, Shibata S, Matsui Y, Hosoda C, Takano K, Nishida T, Kobayashi Y, Takaku Y, Takayanagi N, Ueda S, Tada A, Miyawaki M, Yamamoto M, Yoshida E, Hayashi R, Nagasaka T, Arai S, Kaneko Y, Sasaki K, Tagaya E, Kawana M, Arimura K, Takahashi K, Anzai T, Ito S, Endo A, Uchimura Y, Miyazaki Y, Honda T, Tateishi T, Tohda S, Ichimura N, Sonobe K, Sassa CT, Nakajima J, Nakano Y, Nakajima Y, Anan R, Arai R, Kurihara Y, Harada Y, Nishio K, Ueda T, Azuma M, Saito R, Sado T, Miyazaki Y, Sato R, Haruta Y, Nagasaki T, Yasui Y, Hasegawa Y, Mutoh Y, Kimura T, Sato T, Takei R, Hagimoto S, Noguchi Y, Yamano Y, Sasano H, Ota S, Nakamori Y, Yoshiya K, Saito F, Yoshihara T, Wada D, Iwamura H, Kanayama S, Maruyama S, Yoshiyama T, Ohta K, Kokuto H, Ogata H, Tanaka Y, Arakawa K, Shimoda M, Osawa T, Tateno H, Hase I, Yoshida S, Suzuki S, Kawada M, Horinouchi H, Saito F, Mitamura K, Hagihara M, Ochi J, Uchida T, Baba R, Arai D, Ogura T, Takahashi H, Hagiwara S, Nagao G, Konishi S, Nakachi I, Murakami K, Yamada M, Sugiura H, Sano H, Matsumoto S, Kimura N, Ono Y, Baba H, Suzuki Y, Nakayama S, Masuzawa K, Namba S, Shiroyama T, Noda Y, Niitsu T, Adachi Y, Enomoto T, Amiya S, Hara R, Yamaguchi Y, Murakami T, Kuge T, Matsumoto K, Yamamoto Y, Yamamoto M, Yoneda M, Tomono K, Kato K, Hirata H, Takeda Y, Koh H, Manabe T, Funatsu Y, Ito F, Fukui T, Shinozuka K, Kohashi S, Miyazaki M, Shoko T, Kojima M, Adachi T, Ishikawa M, Takahashi K, Inoue T, Hirano T, Kobayashi K, Takaoka H, Watanabe K, Miyazawa N, Kimura Y, Sado R, Sugimoto H, Kamiya 
A, Kuwahara N, Fujiwara A, Matsunaga T, Sato Y, Okada T, Hirai Y, Kawashima H, Narita A, Niwa K, Sekikawa Y, Nishi K, Nishitsuji M, Tani M, Suzuki J, Nakatsumi H, Ogura T, Kitamura H, Hagiwara E, Murohashi K, Okabayashi H, Mochimaru T, Nukaga S, Satomi R, Oyamada Y, Mori N, Baba T, Fukui Y, Odate M, Mashimo S, Makino Y, Yagi K, Hashiguchi M, Kagyo J, Shiomi T, Fuke S, Saito H, Tsuchida T, Fujitani S, Takita M, Morikawa D, Yoshida T, Izumo T, Inomata M, Kuse N, Awano N, Tone M, Ito A, Nakamura Y, Hoshino K, Maruyama J, Ishikura H, Takata T, Odani T, Amishima M, Hattori T, Shichinohe Y, Kagaya T, Kita T, Ohta K, Sakagami S, Koshida K, Hayashi K, Shimizu T, Kozu Y, Hiranuma H, Gon Y, Izumi N, Nagata K, Ueda K, Taki R, Hanada S, Kawamura K, Ichikado K, Nishiyama K, Muranaka H, Nakamura K, Hashimoto N, Wakahara K, Koji S, Omote N, Ando A, Kodama N, Kaneyama Y, Maeda S, Kuraki T, Matsumoto T, Yokote K, Nakada TA, Abe R, Oshima T, Shimada T, Harada M, Takahashi T, Ono H, Sakurai T, Shibusawa T, Kimizuka Y, Kawana A, Sano T, Watanabe C, Suematsu R, Sageshima H, Yoshifuji A, Ito K, Takahashi S, Ishioka K, Nakamura M, Masuda M, Wakabayashi A, Watanabe H, Ueda S, Nishikawa M, Chihara Y, Takeuchi M, Onoi K, Shinozuka J, Sueyoshi A, Nagasaki Y, Okamoto M, Ishihara S, Shimo M, Tokunaga Y, Kusaka Y, Ohba T, Isogai S, Ogawa A, Inoue T, Fukuyama S, Eriguchi Y, Yonekawa A, Kan-o K, Matsumoto K, Kanaoka K, Ihara S, Komuta K, Inoue Y, Chiba S, Yamagata K, Hiramatsu Y, Kai H, Asano K, Oguma T, Ito Y, Hashimoto S, Yamasaki M, Kasamatsu Y, Komase Y, Hida N, Tsuburai T, Oyama B, Takada M, Kanda H, Kitagawa Y, Fukuta T, Miyake T, Yoshida S, Ogura S, Abe S, Kono Y, Togashi Y, Takoi H, Kikuchi R, Ogawa S, Ogata T, Ishihara S, Kanehiro A, Ozaki S, Fuchimoto Y, Wada S, Fujimoto N, Nishiyama K, Terashima M, Beppu S, Yoshida K, Narumoto O, Nagai H, Ooshima N, Motegi M, Umeda A, Miyagawa K, Shimada H, Endo M, Ohira Y, Watanabe M, Inoue S, Igarashi A, Sato M, Sagara H, Tanaka A, Ohta S, Kimura T, 
Shibata Y, Tanino Y, Nikaido T, Minemura H, Sato Y, Yamada Y, Hashino T, Shinoki M, Iwagoe H, Takahashi H, Fujii K, Kishi H, Kanai M, Imamura T, Yamashita T, Yatomi M, Maeno T, Hayashi S, Takahashi M, Kuramochi M, Kamimaki I, Tominaga Y, Ishii T, Utsugi M, Ono A, Tanaka T, Kashiwada T, Fujita K, Saito Y, Seike M, Watanabe H, Matsuse H, Kodaka N, Nakano C, Oshio T, Hirouchi T, Makino S, Egi M, Omae Y, Nannya Y, Ueno T, Takano T, Katayama K, Ai M, Kumanogoh A, Sato T, Hasegawa N, Tokunaga K, Ishii M, Koike R, Kitagawa Y, Kimura A, Imoto S, Miyano S, Ogawa S, Kanai T, Fukunaga K, Okada Y. The whole blood transcriptional regulation landscape in 465 COVID-19 infected samples from Japan COVID-19 Task Force. Nat Commun 2022; 13:4830. [PMID: 35995775 PMCID: PMC9395416 DOI: 10.1038/s41467-022-32276-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2021] [Accepted: 07/25/2022] [Indexed: 11/12/2022] Open
Abstract
Coronavirus disease 2019 (COVID-19) is a recently-emerged infectious disease that has caused millions of deaths, where comprehensive understanding of disease mechanisms is still unestablished. In particular, studies of gene expression dynamics and regulation landscape in COVID-19 infected individuals are limited. Here, we report on a thorough analysis of whole blood RNA-seq data from 465 genotyped samples from the Japan COVID-19 Task Force, including 359 severe and 106 non-severe COVID-19 cases. We discover 1169 putative causal expression quantitative trait loci (eQTLs) including 34 possible colocalizations with biobank fine-mapping results of hematopoietic traits in a Japanese population, 1549 putative causal splice QTLs (sQTLs; e.g. two independent sQTLs at TOR1AIP1), as well as biologically interpretable trans-eQTL examples (e.g., REST and STING1), all fine-mapped at single variant resolution. We perform differential gene expression analysis to elucidate 198 genes with increased expression in severe COVID-19 cases and enriched for innate immune-related functions. Finally, we evaluate the limited but non-zero effect of COVID-19 phenotype on eQTL discovery, and highlight the presence of COVID-19 severity-interaction eQTLs (ieQTLs; e.g., CLEC4C and MYBL2). Our study provides a comprehensive catalog of whole blood regulatory variants in Japanese, as well as a reference for transcriptional landscapes in response to COVID-19 infection. Genetic mechanisms influencing COVID-19 susceptibility are not well understood. Here, the authors analyzed whole blood RNA-seq data of 465 Japanese individuals with COVID-19, highlighting thousands of fine-mapped variants affecting expression and splicing of genes, as well as the presence of COVID-19 severity-interaction eQTLs.
|
38
|
Zhou Y, Tremmel R, Schaeffeler E, Schwab M, Lauschke VM. Challenges and opportunities associated with rare-variant pharmacogenomics. Trends Pharmacol Sci 2022; 43:852-865. [PMID: 36008164 DOI: 10.1016/j.tips.2022.07.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 04/13/2022] [Revised: 06/15/2022] [Accepted: 07/29/2022] [Indexed: 12/26/2022]
Abstract
Recent advances in next-generation sequencing (NGS) have resulted in the identification of tens of thousands of rare pharmacogenetic variations with unknown functional effects. However, although such pharmacogenetic variations have been estimated to account for a considerable amount of the heritable variability in drug response and toxicity, accurate interpretation at the level of the individual patient remains challenging. We discuss emerging strategies and concepts to close this translational gap. We illustrate how massively parallel experimental assays, artificial intelligence (AI), and machine learning can synergize with population-scale biobank projects to facilitate the interpretation of NGS data to individualize clinical decision-making and personalized medicine.
Affiliation(s)
- Yitian Zhou
- Department of Physiology and Pharmacology, Karolinska Institutet, 171 77 Stockholm, Sweden
| | - Roman Tremmel
- Dr Margarete Fischer-Bosch Institute of Clinical Pharmacology, Stuttgart, Germany; University of Tübingen, Tübingen, Germany
| | - Elke Schaeffeler
- Dr Margarete Fischer-Bosch Institute of Clinical Pharmacology, Stuttgart, Germany; University of Tübingen, Tübingen, Germany; Cluster of Excellence iFIT (EXC2180) Image-Guided and Functionally Instructed Tumor Therapies, University of Tübingen, Tübingen, Germany
| | - Matthias Schwab
- Dr Margarete Fischer-Bosch Institute of Clinical Pharmacology, Stuttgart, Germany; Cluster of Excellence iFIT (EXC2180) Image-Guided and Functionally Instructed Tumor Therapies, University of Tübingen, Tübingen, Germany; Department of Clinical Pharmacology, and Department of Biochemistry and Pharmacy, University of Tübingen, Tübingen, Germany
| | - Volker M Lauschke
- Department of Physiology and Pharmacology, Karolinska Institutet, 171 77 Stockholm, Sweden; Dr Margarete Fischer-Bosch Institute of Clinical Pharmacology, Stuttgart, Germany; University of Tübingen, Tübingen, Germany.
|
39
|
Dixon PH, Levine AP, Cebola I, Chan MMY, Amin AS, Aich A, Mozere M, Maude H, Mitchell AL, Zhang J, Chambers J, Syngelaki A, Donnelly J, Cooley S, Geary M, Nicolaides K, Thorsell M, Hague WM, Estiu MC, Marschall HU, Gale DP, Williamson C. GWAS meta-analysis of intrahepatic cholestasis of pregnancy implicates multiple hepatic genes and regulatory elements. Nat Commun 2022; 13:4840. [PMID: 35977952 PMCID: PMC9385867 DOI: 10.1038/s41467-022-29931-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/25/2020] [Accepted: 04/08/2022] [Indexed: 12/15/2022] Open
Abstract
Intrahepatic cholestasis of pregnancy (ICP) is a pregnancy-specific liver disorder affecting 0.5-2% of pregnancies. The majority of cases present in the third trimester with pruritus, elevated serum bile acids and abnormal serum liver tests. ICP is associated with an increased risk of adverse outcomes, including spontaneous preterm birth and stillbirth. Whilst rare mutations affecting hepatobiliary transporters contribute to the aetiology of ICP, the role of common genetic variation in ICP has not been systematically characterised to date. Here, we perform genome-wide association studies (GWAS) and meta-analyses for ICP across three studies including 1138 cases and 153,642 controls. Eleven loci achieve genome-wide significance and have been further investigated and fine-mapped using functional genomics approaches. Our results pinpoint common sequence variation in liver-enriched genes and liver-specific cis-regulatory elements as contributing mechanisms to ICP susceptibility.
Affiliation(s)
- Peter H Dixon
- Department of Women and Children's Health, School of Life Course Sciences, King's College London, London, UK
| | - Adam P Levine
- Department of Renal Medicine, University College London, London, UK
- Research Department of Pathology, University College London, London, UK
| | - Inês Cebola
- Section of Genetics and Genomics, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, UK
| | - Melanie M Y Chan
- Department of Renal Medicine, University College London, London, UK
| | - Aliya S Amin
- Department of Women and Children's Health, School of Life Course Sciences, King's College London, London, UK
| | - Anshul Aich
- Department of Renal Medicine, University College London, London, UK
| | - Monika Mozere
- Department of Renal Medicine, University College London, London, UK
| | - Hannah Maude
- Section of Genetics and Genomics, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, UK
| | - Alice L Mitchell
- Department of Women and Children's Health, School of Life Course Sciences, King's College London, London, UK
| | - Jun Zhang
- Department of Renal Medicine, University College London, London, UK
- Division of Nephrology, Department of Medicine, Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, Guangdong, China
| | - Jenny Chambers
- ICP Support, 69 Mere Green Road, Sutton Coldfield, UK
- Women's Health Research Centre, Imperial College London, London, UK
| | - Argyro Syngelaki
- Harris Birthright Research Centre for Fetal Medicine, King's College Hospital, London, UK
| | - Kypros Nicolaides
- Harris Birthright Research Centre for Fetal Medicine, King's College Hospital, London, UK
| | - William M Hague
- Robinson Research Institute, The University of Adelaide, Adelaide, SA, Australia
| | - Hanns-Ulrich Marschall
- Department of Molecular and Clinical Medicine/Wallenberg Laboratory, University of Gothenburg, Gothenburg, Sweden
| | - Daniel P Gale
- Department of Renal Medicine, University College London, London, UK
| | - Catherine Williamson
- Department of Women and Children's Health, School of Life Course Sciences, King's College London, London, UK.
40
Abstract
Enhancers confer precise spatiotemporal patterns of gene expression in response to developmental and environmental stimuli. Over the last decade, the transcription of enhancer RNAs (eRNAs) – nascent RNAs transcribed from active enhancers – has emerged as a key factor regulating enhancer activity. eRNAs are relatively short-lived RNA species that are transcribed at very high rates but also quickly degraded. Nevertheless, eRNAs are deeply intertwined within enhancer regulatory networks and are implicated in a number of transcriptional control mechanisms. Enhancers show changes in function and sequence over evolutionary time, raising questions about the relationship between enhancer sequences and eRNA function. Moreover, the vast majority of single nucleotide polymorphisms associated with human complex diseases map to the non-coding genome, with causal disease variants enriched within enhancers. In this Primer, we survey the diverse roles played by eRNAs in enhancer-dependent gene expression, evaluating different models for eRNA function. We also explore questions surrounding the genetic conservation of enhancers and how this relates to eRNA function and dysfunction. Summary: This Primer evaluates the ideas that underpin developing models for eRNA function, exploring cases in which perturbed eRNA function contributes to disease.
Affiliation(s)
- Laura J. Harrison
- Molecular and Cellular Biology, School of Biosciences, Sheffield Institute For Nucleic Acids, The University of Sheffield, Firth Court, Western Bank, Sheffield S10 2TN, UK
| | - Daniel Bose
- Molecular and Cellular Biology, School of Biosciences, Sheffield Institute For Nucleic Acids, The University of Sheffield, Firth Court, Western Bank, Sheffield S10 2TN, UK
41
Wang L, Ding X, Huang Q, Hu B, Liang L, Wang Q. Gllac7 Is Induced by Agricultural and Forestry Residues and Exhibits Allelic Expression Bias in Ganoderma lucidum. Front Microbiol 2022; 13:890686. [PMID: 35847055 PMCID: PMC9279560 DOI: 10.3389/fmicb.2022.890686] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/06/2022] [Accepted: 06/06/2022] [Indexed: 11/13/2022]
Abstract
Ganoderma lucidum can use a wide spectrum of carbon sources, but the expression profiles of key carbon-metabolism genes on different carbon sources have seldom been studied. Here, the transcriptomes of G. lucidum mycelia cultured on each of 19 carbon sources were profiled. In comparison with glucose, 16 to 1,006 genes were upregulated and 7 to 1,865 genes were downregulated. Significant gene expression dynamics and induced activity were observed for laccase genes when agricultural and forestry residues (AFRs) were used as sole carbon sources. Furthermore, the laccase gene family was studied in two haploids of G. lucidum GL0102. In total, 15 and 16 laccase genes were identified in GL0102_53 and GL0102_8, respectively, among which 15 pairs were allelic genes. Gene structures were conserved between allelic laccase genes, although sequence variations (mostly SNPs) existed. Nine laccase genes were rarely expressed on any of the tested carbon sources, while the other seven genes showed high expression levels on AFRs, especially Gllac2 and Gllac7, which showed 5- to 1,149-fold and 4- to 94-fold upregulation, respectively, in mycelia cultured for 5 days. The expression of H53lac7 was consistently higher than that of H8lac7_1 on all the carbon sources except XM, exhibiting a case of allelic expression bias. A total of 47 SNPs and 3 insertions/deletions were observed between the promoters of H53lac7 and H8lac7_1, which lead to differences in predicted binding sites of zinc fingers. These results provide scientific data for understanding the gene expression profile and regulatory role on different carbon sources and may support further functional research of laccase.
Affiliation(s)
- Lining Wang
- Guangdong Engineering Laboratory of Biomass High-Value Utilization, Guangdong Plant Fiber Comprehensive Utilization Engineering Technology Research and Development Center, Guangzhou Key Laboratory of Biomass Comprehensive Utilization, Institute of Biological and Medical Engineering, Guangdong Academy of Sciences, Guangzhou, China
| | - Xiaoxia Ding
- Key Laboratory of Quality Evaluation of Chinese Medicine of the Guangdong Provincial Medical Products Administration, the Second Clinical College, Guangzhou University of Chinese Medicine, Guangzhou, China
| | - Qinghua Huang
- Guangdong Engineering Laboratory of Biomass High-Value Utilization, Guangdong Plant Fiber Comprehensive Utilization Engineering Technology Research and Development Center, Guangzhou Key Laboratory of Biomass Comprehensive Utilization, Institute of Biological and Medical Engineering, Guangdong Academy of Sciences, Guangzhou, China
| | - Biao Hu
- Guangdong Engineering Laboratory of Biomass High-Value Utilization, Guangdong Plant Fiber Comprehensive Utilization Engineering Technology Research and Development Center, Guangzhou Key Laboratory of Biomass Comprehensive Utilization, Institute of Biological and Medical Engineering, Guangdong Academy of Sciences, Guangzhou, China
| | - Lei Liang
- Guangdong Engineering Laboratory of Biomass High-Value Utilization, Guangdong Plant Fiber Comprehensive Utilization Engineering Technology Research and Development Center, Guangzhou Key Laboratory of Biomass Comprehensive Utilization, Institute of Biological and Medical Engineering, Guangdong Academy of Sciences, Guangzhou, China
| | - Qingfu Wang
- Guangdong Engineering Laboratory of Biomass High-Value Utilization, Guangdong Plant Fiber Comprehensive Utilization Engineering Technology Research and Development Center, Guangzhou Key Laboratory of Biomass Comprehensive Utilization, Institute of Biological and Medical Engineering, Guangdong Academy of Sciences, Guangzhou, China
42
Umeton R, Bellucci G, Bigi R, Romano S, Buscarinu MC, Reniè R, Rinaldi V, Pizzolato Umeton R, Morena E, Romano C, Mechelli R, Salvetti M, Ristori G. Multiple sclerosis genetic and non-genetic factors interact through the transient transcriptome. Sci Rep 2022; 12:7536. [PMID: 35534508 PMCID: PMC9085834 DOI: 10.1038/s41598-022-11444-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2021] [Accepted: 04/22/2022] [Indexed: 11/20/2022]
Abstract
A clinically actionable understanding of multiple sclerosis (MS) etiology depends on GWAS interpretation, prompting research on new gene regulatory models. Our previous investigations suggested heterogeneity in etiology components and stochasticity in the interaction between genetic and non-genetic factors. To find a unifying model for this evidence, we focused on the recently mapped transient transcriptome (TT), which is mostly coded by intergenic and intronic regions and has a half-life of minutes. Through a colocalization analysis, here we demonstrate that genomic regions coding for the TT are significantly enriched for MS-associated GWAS variants and DNA binding sites for molecular transducers mediating putative non-genetic determinants of MS (vitamin D deficiency, Epstein-Barr virus latent infection, B cell dysfunction), indicating TT-coding regions as MS etiopathogenetic hotspots. Future research comparing cell-specific transient and stable transcriptomes may clarify the interplay between genetic variability and non-genetic factors causing MS. To this end, our colocalization analysis provides a freely available data resource at www.mscoloc.com.
Affiliation(s)
- Renato Umeton
- Department of Informatics and Analytics, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Biological Engineering, Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA.
| | - Gianmarco Bellucci
- Department of Neurosciences, Mental Health and Sensory Organs, Centre for Experimental Neurological Therapies (CENTERS), Sapienza University of Rome, Rome, Italy
| | - Rachele Bigi
- Department of Neurosciences, Mental Health and Sensory Organs, Centre for Experimental Neurological Therapies (CENTERS), Sapienza University of Rome, Rome, Italy; Neuroimmunology Unit, IRCCS Fondazione Santa Lucia, Rome, Italy
| | - Silvia Romano
- Department of Neurosciences, Mental Health and Sensory Organs, Centre for Experimental Neurological Therapies (CENTERS), Sapienza University of Rome, Rome, Italy
| | - Maria Chiara Buscarinu
- Department of Neurosciences, Mental Health and Sensory Organs, Centre for Experimental Neurological Therapies (CENTERS), Sapienza University of Rome, Rome, Italy; Neuroimmunology Unit, IRCCS Fondazione Santa Lucia, Rome, Italy
| | - Roberta Reniè
- Department of Neurosciences, Mental Health and Sensory Organs, Centre for Experimental Neurological Therapies (CENTERS), Sapienza University of Rome, Rome, Italy
| | - Virginia Rinaldi
- Department of Neurosciences, Mental Health and Sensory Organs, Centre for Experimental Neurological Therapies (CENTERS), Sapienza University of Rome, Rome, Italy
| | - Raffaella Pizzolato Umeton
- Department of Neurology, UMass Memorial Health Care, Worcester, MA, USA; University of Massachusetts Medical School, Worcester, MA, USA; Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
| | - Emanuele Morena
- Department of Neurosciences, Mental Health and Sensory Organs, Centre for Experimental Neurological Therapies (CENTERS), Sapienza University of Rome, Rome, Italy
| | - Carmela Romano
- Department of Neurosciences, Mental Health and Sensory Organs, Centre for Experimental Neurological Therapies (CENTERS), Sapienza University of Rome, Rome, Italy
| | - Rosella Mechelli
- IRCCS San Raffaele Pisana, Rome, Italy; San Raffaele Roma Open University, Rome, Italy
| | - Marco Salvetti
- Department of Neurosciences, Mental Health and Sensory Organs, Centre for Experimental Neurological Therapies (CENTERS), Sapienza University of Rome, Rome, Italy; IRCCS Istituto Neurologico Mediterraneo Neuromed, Pozzilli, Italy.
| | - Giovanni Ristori
- Department of Neurosciences, Mental Health and Sensory Organs, Centre for Experimental Neurological Therapies (CENTERS), Sapienza University of Rome, Rome, Italy; Neuroimmunology Unit, IRCCS Fondazione Santa Lucia, Rome, Italy.
43
Abstract
Thousands of common genetic variants in the human population have been associated with disease risk and phenotypic variation by genome-wide association studies (GWAS). However, the majority of GWAS variants fall into noncoding regions of the genome, complicating our understanding of their regulatory functions, and few molecular mechanisms of GWAS variant effects have been clearly elucidated. Here, we set out to review genetic variant effects, focusing on expression quantitative trait loci (eQTLs), including their utility in interpreting GWAS variant mechanisms. We discuss the interrelated challenges and opportunities for eQTL analysis, covering determining causal variants, elucidating molecular mechanisms of action, and understanding context variability. Addressing these questions can enable better functional characterization of disease-associated loci and provide insights into fundamental biological questions of the noncoding genetic regulatory code and its control of gene expression. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 5 is August 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Elise Flynn
- New York Genome Center, New York, NY, USA; Department of Systems Biology, Columbia University, New York, NY, USA
| | - Tuuli Lappalainen
- New York Genome Center, New York, NY, USA; Department of Systems Biology, Columbia University, New York, NY, USA; Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
44
Chai T, Tian M, Yang X, Qiu Z, Lin X, Chen L. Genome-Wide Identification of Associations of Circulating Molecules With Spontaneous Coronary Artery Dissection and Aortic Aneurysm and Dissection. Front Cardiovasc Med 2022; 9:874912. [PMID: 35571188 PMCID: PMC9091499 DOI: 10.3389/fcvm.2022.874912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2022] [Accepted: 04/07/2022] [Indexed: 11/16/2022]
Abstract
Circulating proteins play functional roles in various biological processes and disease pathogenesis. The aim of this study was to highlight circulating proteins associated with aortic aneurysm and dissection (AAD) and spontaneous coronary artery dissection (SCAD). We examined the associations of circulating molecule levels with SCAD by integrating data from a genome-wide association study (GWAS) of CanSCAD and 7 pQTL studies. Mendelian randomization (MR) analysis was applied to examine the associations between circulating molecule levels and AAD by using data from UK Biobank GWAS and pQTL studies. The SCAD-associated SNPs in 1q21.2 were strongly associated with circulating levels of extracellular matrix protein 1 (ECM1) and 25 other proteins (encoded by CTSS, CAT, CNDP1, KNG1, SLAMF7, TIE1, CXCL1, MBL2, ESD, CXCL16, CCL14, KCNE5, CST7, PSME1, GPC3, MAP2K4, SPOCK3, LRPPRC, CLEC4M, NOG, C1QTNF9, CX3CL1, SCP2D1, SERPINF2, and FN1). These proteins were enriched in biological processes such as regulation of peptidase activity and regulation of cellular protein metabolic processes. Proteins (FGF6, FGF9, HGF, BCL2L1, and VEGFA) involved in the Ras signaling pathway were identified to be related to AAD. In addition, SCAD- and AAD-associated SNPs were associated with cytokine and lipid levels. MR analysis showed that circulating ECM1, SPOCK3 and IL1b levels were associated with AAD. Circulating levels of low-density lipoprotein cholesterol and small very-low-density lipoprotein particles were strongly associated with AAD. The present study found associations between circulating proteins and lipids and SCAD and AAD. Circulating ECM1 and low-density lipoprotein cholesterol may play a role in the pathology of SCAD and AAD.
Affiliation(s)
- Tianci Chai
- Department of Cardiac Surgery, Fujian Medical University Union Hospital, Fuzhou, China
- Fujian Key Laboratory of Cardio-Thoracic Surgery (Fujian Medical University), Fuzhou, China
- Department of Anesthesiology, Xinyi People’s Hospital, Xuzhou, China
| | - Mengyue Tian
- Key Laboratory of Ministry of Education for Gastrointestinal Cancer, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, China
| | - Xiaojie Yang
- Fujian Key Laboratory of Cardio-Thoracic Surgery (Fujian Medical University), Fuzhou, China
- Department of Thoracic Surgery, Fujian Medical University Union Hospital, Fuzhou, China
| | - Zhihuang Qiu
- Department of Cardiac Surgery, Fujian Medical University Union Hospital, Fuzhou, China
- Fujian Key Laboratory of Cardio-Thoracic Surgery (Fujian Medical University), Fuzhou, China
| | - Xinjian Lin
- Key Laboratory of Ministry of Education for Gastrointestinal Cancer, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, China
| | - Liangwan Chen
- Department of Cardiac Surgery, Fujian Medical University Union Hospital, Fuzhou, China
- Fujian Key Laboratory of Cardio-Thoracic Surgery (Fujian Medical University), Fuzhou, China
- Correspondence: Liangwan Chen
45
Nisar S, Torres M, Thiam A, Pouvelle B, Rosier F, Gallardo F, Ka O, Mbengue B, Diallo RN, Brosseau L, Spicuglia S, Dieye A, Marquet S, Rihet P. Identification of ATP2B4 Regulatory Element Containing Functional Genetic Variants Associated with Severe Malaria. Int J Mol Sci 2022; 23:4849. [PMID: 35563239 DOI: 10.3390/ijms23094849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Revised: 04/15/2022] [Accepted: 04/22/2022] [Indexed: 12/04/2022]
Abstract
Genome-wide association studies for severe malaria (SM) have identified 30 genetic variants mostly located in non-coding regions. Here, we aimed to identify potential causal genetic variants located in these loci and demonstrate their functional activity. We systematically investigated the regulatory effect of the SNPs in linkage disequilibrium (LD) with the malaria-associated genetic variants. Annotating and prioritizing genetic variants led to the identification of a regulatory region containing five ATP2B4 SNPs in LD with rs10900585. We found significant associations between SM and rs10900585 and our candidate SNPs (rs11240734, rs1541252, rs1541253, rs1541254, and rs1541255) in a Senegalese population. Then, we demonstrated that both individual SNPs and the combination of SNPs had regulatory effects. Moreover, CRISPR/Cas9-mediated deletion of this region decreased ATP2B4 transcript and protein levels and increased Ca2+ intracellular concentration in the K562 cell line. Our data demonstrate that severe malaria-associated genetic variants alter the expression of ATP2B4 encoding a plasma membrane calcium-transporting ATPase 4 (PMCA4) expressed on red blood cells. Altering the activity of this regulatory element affects the risk of SM, likely through calcium concentration effect on parasitaemia.
46
Spielmann M, Kircher M. Computational and experimental methods for classifying variants of unknown clinical significance. Cold Spring Harb Mol Case Stud 2022; 8:mcs.a006196. [PMID: 35483875 PMCID: PMC9059783 DOI: 10.1101/mcs.a006196] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/26/2022]
Abstract
The increase in sequencing capacity, reduction in costs, and national and international coordinated efforts have led to the widespread introduction of next-generation sequencing (NGS) technologies in patient care. More generally, human genetics and genomic medicine are gaining importance for more and more patients. Some communities are already discussing the prospect of sequencing each individual's genome at time of birth. Together with digital health records, this shall enable individualized treatments and preventive measures, so-called precision medicine. A central step in this process is the identification of disease causal mutations or variant combinations that make us more susceptible for diseases. Although various technological advances have improved the identification of genetic alterations, the interpretation and ranking of the identified variants remains a major challenge. Based on our knowledge of molecular processes or previously identified disease variants, we can identify potentially functional genetic variants and, using different lines of evidence, we are sometimes able to demonstrate their pathogenicity directly. However, the vast majority of variants are classified as variants of uncertain clinical significance (VUSs) with not enough experimental evidence to determine their pathogenicity. In these cases, computational methods may be used to improve the prioritization and an increasing toolbox of experimental methods is emerging that can be used to assay the molecular effects of VUSs. Here, we discuss how computational and experimental methods can be used to create catalogs of variant effects for a variety of molecular and cellular phenotypes. We discuss the prospects of integrating large-scale functional data with machine learning and clinical knowledge for the development of accurate pathogenicity predictions for clinical applications.
Affiliation(s)
- Malte Spielmann
- Institute of Human Genetics, University of Lübeck, 23562 Lübeck, Germany; Institute of Human Genetics, Christian-Albrechts-Universität, 24105 Kiel, Germany; Human Molecular Genomics Group, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), partner site Hamburg/Lübeck/Kiel, 23562 Lübeck, Germany
| | - Martin Kircher
- Institute of Human Genetics, University of Lübeck, 23562 Lübeck, Germany; Berlin Institute of Health at Charité—Universitätsmedizin Berlin, 10117 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), partner site Berlin, 10115 Berlin, Germany
47
Martinez-Ara M, Comoglio F, van Arensbergen J, van Steensel B. Systematic analysis of intrinsic enhancer-promoter compatibility in the mouse genome. Mol Cell 2022; 82:2519-2531.e6. [PMID: 35594855 PMCID: PMC9278412 DOI: 10.1016/j.molcel.2022.04.009] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 10/29/2021] [Revised: 02/17/2022] [Accepted: 04/05/2022] [Indexed: 12/12/2022]
Affiliation(s)
- Miguel Martinez-Ara
- Division of Gene Regulation and Oncode Institute, Netherlands Cancer Institute, 1066 CX Amsterdam, the Netherlands
| | - Federico Comoglio
- Division of Gene Regulation and Oncode Institute, Netherlands Cancer Institute, 1066 CX Amsterdam, the Netherlands
| | - Joris van Arensbergen
- Division of Gene Regulation and Oncode Institute, Netherlands Cancer Institute, 1066 CX Amsterdam, the Netherlands
| | - Bas van Steensel
- Division of Gene Regulation and Oncode Institute, Netherlands Cancer Institute, 1066 CX Amsterdam, the Netherlands.
48
Galouzis CC, Furlong EEM. Regulating specificity in enhancer-promoter communication. Curr Opin Cell Biol 2022; 75:102065. [PMID: 35240372 DOI: 10.1016/j.ceb.2022.01.010] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 12/20/2021] [Revised: 01/23/2022] [Accepted: 01/25/2022] [Indexed: 12/14/2022]
Abstract
Enhancers are cis-regulatory elements that can activate transcription remotely to regulate a specific pattern of a gene's expression. Genes typically have many enhancers that are often intermingled in the loci of other genes. To regulate expression, enhancers must therefore activate their correct promoter while ignoring others that may be in closer linear proximity. In this review, we discuss mechanisms by which enhancers engage with promoters, including recent findings on the role of cohesin and the Mediator complex, and how this specificity in enhancer-promoter communication is encoded. Genetic dissection of model loci, in addition to more recent findings using genome-wide approaches, highlight the core promoter sequence, its accessibility, cofactor-promoter preference, in addition to the surrounding genomic context, as key components.
Affiliation(s)
| | - Eileen E M Furlong
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, D-69117, Heidelberg, Germany.
49
Toropainen A, Stolze LK, Örd T, Whalen MB, Torrell PM, Link VM, Kaikkonen MU, Romanoski CE. Functional noncoding SNPs in human endothelial cells fine-map vascular trait associations. Genome Res 2022; 32:409-424. [PMID: 35193936 PMCID: PMC8896458 DOI: 10.1101/gr.276064.121] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/30/2021] [Accepted: 01/06/2022] [Indexed: 11/25/2022]
Abstract
Functional consequences of genetic variation in the noncoding human genome are difficult to ascertain despite demonstrated associations to common, complex disease traits. To elucidate properties of functional noncoding SNPs with effects in human endothelial cells (ECs), we utilized our previous molecular quantitative trait locus (molQTL) analysis for transcription factor binding, chromatin accessibility, and H3K27 acetylation to nominate a set of likely functional noncoding SNPs. Together with information from genome-wide association studies (GWASs) for vascular disease traits, we tested the ability of 34,344 variants to perturb enhancer function in ECs using the highly multiplexed STARR-seq assay. Of these, 5711 variants validated, whose enriched attributes included: (1) mutations to TF binding motifs for ETS or AP-1 that are regulators of the EC state; (2) location in accessible and H3K27ac-marked EC chromatin; and (3) molQTL associations whereby alleles associate with differences in chromatin accessibility and TF binding across genetically diverse ECs. Next, using pro-inflammatory IL1B as an activator of cell state, we observed robust evidence (>50%) of context-specific SNP effects, underscoring the prevalence of noncoding gene-by-environment (GxE) effects. Lastly, using these cumulative data, we fine-mapped vascular disease loci and highlighted evidence suggesting mechanisms by which noncoding SNPs at two loci affect risk for pulse pressure/large artery stroke and abdominal aortic aneurysm through respective effects on transcriptional regulation of POU4F1 and LDAH. Together, we highlight the attributes and context dependence of functional noncoding SNPs and provide new mechanisms underlying vascular disease risk.
Affiliation(s)
- Anu Toropainen
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio 70211, Finland
| | - Lindsey K Stolze
- The Department of Cellular and Molecular Medicine, The University of Arizona, Tucson, Arizona 85721, USA; The Genetics Interdisciplinary Graduate Program, The University of Arizona, Tucson, Arizona 85721, USA
| | - Tiit Örd
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio 70211, Finland
| | - Michael B Whalen
- The Department of Cellular and Molecular Medicine, The University of Arizona, Tucson, Arizona 85721, USA
| | - Paula Martí Torrell
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio 70211, Finland
| | - Verena M Link
- Metaorganism Immunity Section, Laboratory of Host Immunity and Microbiome, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Minna U Kaikkonen
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio 70211, Finland
| | - Casey E Romanoski
- The Department of Cellular and Molecular Medicine, The University of Arizona, Tucson, Arizona 85721, USA; The Genetics Interdisciplinary Graduate Program, The University of Arizona, Tucson, Arizona 85721, USA
50
Osman N, Shawky AEM, Brylinski M. Exploring the effects of genetic variation on gene regulation in cancer in the context of 3D genome structure. BMC Genom Data 2022; 23:13. [PMID: 35176995 PMCID: PMC8851830 DOI: 10.1186/s12863-021-01021-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/17/2021] [Accepted: 12/23/2021] [Indexed: 12/31/2022]
Abstract
Background: Numerous genome-wide association studies (GWAS) conducted to date revealed genetic variants associated with various diseases, including breast and prostate cancers. Despite the availability of these large-scale data, relatively few variants have been functionally characterized, mainly because the majority of single-nucleotide polymorphisms (SNPs) map to the non-coding regions of the human genome. The functional characterization of these non-coding variants and the identification of their target genes remain challenging. Results: In this communication, we explore the potential functional mechanisms of non-coding SNPs by integrating GWAS with the high-resolution chromosome conformation capture (Hi-C) data for breast and prostate cancers. We show that more genetic variants map to regulatory elements through the 3D genome structure than the 1D linear genome lacking physical chromatin interactions. Importantly, the association of enhancers, transcription factors, and their target genes with breast and prostate cancers tends to be higher when these regulatory elements are mapped to high-risk SNPs through spatial interactions compared to simply using a linear proximity. Finally, we demonstrate that topologically associating domains (TADs) carrying high-risk SNPs also contain gene regulatory elements whose association with cancer is generally higher than those belonging to control TADs containing no high-risk variants. Conclusions: Our results suggest that many SNPs may contribute to the cancer development by affecting the expression of certain tumor-related genes through long-range chromatin interactions with gene regulatory elements. Integrating large-scale genetic datasets with the 3D genome structure offers an attractive and unique approach to systematically investigate the functional mechanisms of genetic variants in disease risk and progression.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12863-021-01021-x.
Affiliation(s)
- Noha Osman
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA; Department of Cell Biology, National Research Centre, Giza, 12622, Egypt; Department of Medicine, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Abd-El-Monsif Shawky
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
| | - Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA; Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA.