1. Dudnyk K, Cai D, Shi C, Xu J, Zhou J. Sequence basis of transcription initiation in the human genome. Science 2024; 384:eadj0116. [PMID: 38662817 DOI: 10.1126/science.adj0116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Accepted: 02/28/2024] [Indexed: 05/03/2024]
Abstract
Transcription initiation is a process that is essential to ensuring the proper function of any gene, yet we still lack a unified understanding of sequence patterns and rules that explain most transcription start sites in the human genome. By predicting transcription initiation at base-pair resolution from sequences with a deep learning-inspired explainable model called Puffin, we show that a small set of simple rules can explain transcription initiation at most human promoters. We identify key sequence patterns that contribute to human promoter activity, each activating transcription with distinct position-specific effects. Furthermore, we explain the sequence basis of bidirectional transcription at promoters, identify the links between promoter sequence and gene expression variation across cell types, and explore the conservation of sequence determinants of transcription initiation across mammalian species.
Affiliation(s)
- Kseniia Dudnyk
- Lyda Hill Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Donghong Cai
- Lyda Hill Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Center of Excellence for Leukemia Studies (CELS), Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Chenlai Shi
- Lyda Hill Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Jian Xu
- Center of Excellence for Leukemia Studies (CELS), Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Jian Zhou
- Lyda Hill Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, USA
2. Vaknin I, Willinger O, Mandl J, Heuberger H, Ben-Ami D, Zeng Y, Goldberg S, Orenstein Y, Amit R. A universal system for boosting gene expression in eukaryotic cell-lines. Nat Commun 2024; 15:2394. [PMID: 38493141 PMCID: PMC10944472 DOI: 10.1038/s41467-024-46573-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 03/04/2024] [Indexed: 03/18/2024]
Abstract
We demonstrate a transcriptional regulatory design algorithm that can boost expression in yeast and mammalian cell lines. The system consists of a simplified transcriptional architecture composed of a minimal core promoter and a synthetic upstream regulatory region (sURS) composed of up to three motifs selected from a list of 41 motifs conserved in the eukaryotic lineage. The sURS system was first characterized using an oligo-library containing 189,990 variants. We validate the resultant expression model using a set of 43 unseen sURS designs. The validation sURS experiments indicate that a generic set of grammar rules for boosting and attenuation may exist in yeast cells. Finally, we demonstrate that this generic set of grammar rules functions similarly in mammalian CHO-K1 and HeLa cells. Consequently, our work provides a design algorithm for boosting the expression of promoters used for expressing industrially relevant proteins in yeast and mammalian cell lines.
Affiliation(s)
- Inbal Vaknin
- Department of Biotechnology and Food Engineering, Technion, Haifa, Israel
- Or Willinger
- Department of Biotechnology and Food Engineering, Technion, Haifa, Israel
- Jonathan Mandl
- Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
- Hadar Heuberger
- School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Dan Ben-Ami
- School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Yi Zeng
- Department of Biotechnology and Food Engineering, Technion, Haifa, Israel
- Sarah Goldberg
- Department of Biotechnology and Food Engineering, Technion, Haifa, Israel
- Yaron Orenstein
- Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel
- Roee Amit
- Department of Biotechnology and Food Engineering, Technion, Haifa, Israel
- The Russell Berrie Nanotechnology Institute, Technion, Haifa, Israel
3. Kwak IY, Kim BC, Lee J, Kang T, Garry DJ, Zhang J, Gong W. Proformer: a hybrid macaron transformer model predicts expression values from promoter sequences. BMC Bioinformatics 2024; 25:81. [PMID: 38378442 PMCID: PMC10877777 DOI: 10.1186/s12859-024-05645-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Accepted: 01/08/2024] [Indexed: 02/22/2024]
Abstract
The breakthrough high-throughput measurement of the cis-regulatory activity of millions of randomly generated promoters provides an unprecedented opportunity to systematically decode the cis-regulatory logic that determines the expression values. We developed an end-to-end transformer encoder architecture named Proformer to predict the expression values from DNA sequences. Proformer used a Macaron-like Transformer encoder architecture, where two half-step feed-forward (FFN) layers were placed at the beginning and the end of each encoder block, and a separable 1D convolution layer was inserted after the first FFN layer and in front of the multi-head attention layer. The sliding k-mers from one-hot encoded sequences were mapped onto a continuous embedding, combined with the learned positional embedding and strand embedding (forward strand vs. reverse complemented strand) as the sequence input. Moreover, Proformer introduced multiple expression heads with mask filling to prevent the transformer models from collapsing when training on a relatively small amount of data. We empirically determined that this design had significantly better performance than conventional designs, such as using a global pooling layer as the output layer for the regression task. These analyses support the notion that Proformer provides a novel method of learning and enhances our understanding of how cis-regulatory sequences determine the expression values.
Affiliation(s)
- Il-Youp Kwak
- Department of Applied Statistics, Chung‑Ang University, Seoul, Republic of Korea
- Byeong-Chan Kim
- Department of Applied Statistics, Chung‑Ang University, Seoul, Republic of Korea
- Juhyun Lee
- Department of Applied Statistics, Chung‑Ang University, Seoul, Republic of Korea
- Taein Kang
- Department of Applied Statistics, Chung‑Ang University, Seoul, Republic of Korea
- Daniel J Garry
- Cardiovascular Division, Department of Medicine, Lillehei Heart Institute, University of Minnesota, 2231 6th St SE, Minneapolis, MN, 55455, USA.
- Stem Cell Institute, University of Minnesota, Minneapolis, MN, 55455, USA.
- Paul and Sheila Wellstone Muscular Dystrophy Center, University of Minnesota, Minneapolis, MN, 55455, USA.
- Jianyi Zhang
- Department of Biomedical Engineering, The University of Alabama at Birmingham, Birmingham, AL, 35233, USA
- Wuming Gong
- Cardiovascular Division, Department of Medicine, Lillehei Heart Institute, University of Minnesota, 2231 6th St SE, Minneapolis, MN, 55455, USA.
4. Martyn GE, Montgomery MT, Jones H, Guo K, Doughty BR, Linder J, Chen Z, Cochran K, Lawrence KA, Munson G, Pampari A, Fulco CP, Kelley DR, Lander ES, Kundaje A, Engreitz JM. Rewriting regulatory DNA to dissect and reprogram gene expression. bioRxiv 2023:2023.12.20.572268. [PMID: 38187584 PMCID: PMC10769263 DOI: 10.1101/2023.12.20.572268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Regulatory DNA sequences within enhancers and promoters bind transcription factors to encode cell type-specific patterns of gene expression. However, the regulatory effects and programmability of such DNA sequences remain difficult to map or predict because we have lacked scalable methods to precisely edit regulatory DNA and quantify the effects in an endogenous genomic context. Here we present an approach to measure the quantitative effects of hundreds of designed DNA sequence variants on gene expression, by combining pooled CRISPR prime editing with RNA fluorescence in situ hybridization and cell sorting (Variant-FlowFISH). We apply this method to mutagenize and rewrite regulatory DNA sequences in an enhancer and the promoter of PPIF in two immune cell lines. Of 672 variant-cell type pairs, we identify 497 that affect PPIF expression. These variants appear to act through a variety of mechanisms including disruption or optimization of existing transcription factor binding sites, as well as creation of de novo sites. Disrupting a single endogenous transcription factor binding site often led to large changes in expression (up to -40% in the enhancer, and -50% in the promoter). The same variant often had different effects across cell types and states, demonstrating a highly tunable regulatory landscape. We use these data to benchmark performance of sequence-based predictive models of gene regulation, and find that certain types of variants are not accurately predicted by existing models. Finally, we computationally design 185 small sequence variants (≤10 bp) and optimize them for specific effects on expression in silico. 84% of these rationally designed edits showed the intended direction of effect, and some had dramatic effects on expression (-100% to +202%). 
Variant-FlowFISH thus provides a powerful tool to map the effects of variants and transcription factor binding sites on gene expression, test and improve computational models of gene regulation, and reprogram regulatory DNA.
Affiliation(s)
- Gabriella E Martyn
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
- Michael T Montgomery
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
- Hank Jones
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
- Katherine Guo
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
- Benjamin R Doughty
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Ziwei Chen
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Kelly Cochran
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Kathryn A Lawrence
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Glen Munson
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Anusri Pampari
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Charles P Fulco
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Present Address: Sanofi, Cambridge, MA, USA
- Eric S Lander
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biology, MIT, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Anshul Kundaje
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Jesse M Engreitz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA
5. Fu ZH, He SZ, Wu Y, Zhao GR. Design and deep learning of synthetic B-cell-specific promoters. Nucleic Acids Res 2023; 51:11967-11979. [PMID: 37889080 PMCID: PMC10681721 DOI: 10.1093/nar/gkad930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Revised: 09/20/2023] [Accepted: 10/10/2023] [Indexed: 10/28/2023]
Abstract
Synthetic biology and deep learning synergistically revolutionize our ability to decode and recode DNA regulatory grammar. B-cell-specific transcriptional regulation is intricate, and unlocking the potential of B-cell-specific promoters as synthetic elements is important for B-cell engineering. Here, we designed and pool-synthesized 23,640 B-cell-specific promoters that cover a larger sequence space, show B-cell-specific expression, and enable diverse transcriptional patterns in B-cells. Using massively parallel reporter assays (MPRAs), we deciphered the sequence features that regulate promoter transcription, including motifs and motif syntax (their combination and distance). Finally, we built and trained a deep learning model capable of predicting the transcriptional strength of the immunoglobulin V gene promoter directly from sequence. Prediction of thousands of promoter variants identified in the global human population shows that polymorphisms in promoters influence the transcription of immunoglobulin V genes, which may contribute to individual differences in adaptive humoral immune responses. Our work helps to decipher the transcription mechanism in immunoglobulin genes and offers thousands of non-similar promoters for B-cell engineering.
Affiliation(s)
- Zong-Heng Fu
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin 300072, China
- Si-Zhe He
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin 300072, China
- Yi Wu
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin 300072, China
- Guang-Rong Zhao
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Frontiers Research Institute for Synthetic Biology, Tianjin University, Tianjin 300072, China
6. Zhang P, Wang H, Xu H, Wei L, Liu L, Hu Z, Wang X. Deep flanking sequence engineering for efficient promoter design using DeepSEED. Nat Commun 2023; 14:6309. [PMID: 37813854 PMCID: PMC10562447 DOI: 10.1038/s41467-023-41899-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2023] [Accepted: 09/20/2023] [Indexed: 10/11/2023]
Abstract
Designing promoters with desirable properties is essential in synthetic biology. Human experts are skilled at identifying strong explicit patterns in small samples, while deep learning models excel at detecting implicit weak patterns in large datasets. Biologists have described the sequence patterns of promoters via transcription factor binding sites (TFBSs). However, the flanking sequences of cis-regulatory elements have long been overlooked and often arbitrarily decided in promoter design. To address this limitation, we introduce DeepSEED, an AI-aided framework that efficiently designs synthetic promoters by combining expert knowledge with deep learning techniques. DeepSEED has demonstrated success in improving the properties of Escherichia coli constitutive, IPTG-inducible, and mammalian cell doxycycline (Dox)-inducible promoters. Furthermore, our results show that DeepSEED captures the implicit features in flanking sequences, such as k-mer frequencies and DNA shape features, which are crucial for determining promoter properties.
Affiliation(s)
- Pengcheng Zhang
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, China
- Haochen Wang
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, China
- Hanwen Xu
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, China
- Lei Wei
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, China
- Liyang Liu
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, China
- Zhirui Hu
- Center for Statistical Science, Tsinghua University, Beijing, China
- Xiaowo Wang
- Ministry of Education Key Laboratory of Bioinformatics; Center for Synthetic and Systems Biology; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Department of Automation, Tsinghua University, Beijing, China.
7. Weingarten-Gabbay S, Bauer MR, Stanton AC, Klaeger S, Verzani EK, López D, Clauser KR, Carr SA, Abelin JG, Rice CM, Sabeti PC. Pan-viral ORFs discovery using Massively Parallel Ribosome Profiling. bioRxiv 2023:2023.09.26.559641. [PMID: 37808651 PMCID: PMC10557741 DOI: 10.1101/2023.09.26.559641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2023]
Abstract
Unveiling the complete proteome of viruses is crucial to our understanding of the viral life cycle and interaction with the host. We developed Massively Parallel Ribosome Profiling (MPRP) to experimentally determine open reading frames (ORFs) in 20,170 designed oligonucleotides across 679 human-associated viral genomes. We identified 5,381 ORFs, including 4,208 non-canonical ORFs, and show successful detection of both annotated coding sequences (CDSs) and reported non-canonical ORFs. By examining immunopeptidome datasets of infected cells, we found class I human leukocyte antigen (HLA-I) peptides originating from non-canonical ORFs identified through MPRP. By inspecting ribosome occupancies on the 5'UTR and CDS regions of annotated viral genes, we identified hundreds of upstream ORFs (uORFs) that negatively regulate the synthesis of canonical viral proteins. The unprecedented source of viral ORFs across a wide range of viral families, including highly pathogenic viruses, expands the repertoire of vaccine targets and exposes new cis-regulatory sequences in viral genomes.
8. Kleinschmidt H, Xu C, Bai L. Using Synthetic DNA Libraries to Investigate Chromatin and Gene Regulation. Chromosoma 2023; 132:167-189. [PMID: 37184694 PMCID: PMC10542970 DOI: 10.1007/s00412-023-00796-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Revised: 04/25/2023] [Accepted: 04/26/2023] [Indexed: 05/16/2023]
Abstract
Despite the recent explosion in genome-wide studies in chromatin and gene regulation, we are still far from extracting a set of genetic rules that can predict the function of the regulatory genome. One major reason for this deficiency is that gene regulation is a multi-layered process that involves an enormous variable space, which cannot be fully explored using native genomes. This problem can be partially solved by introducing synthetic DNA libraries into cells, a method that can test the regulatory roles of thousands to millions of sequences with limited variables. Here, we review recent applications of this method to study transcription factor (TF) binding, nucleosome positioning, and transcriptional activity. We discuss the design principles, experimental procedures, and major findings from these studies and compare the pros and cons of different approaches.
Affiliation(s)
- Holly Kleinschmidt
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
- Cheng Xu
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
- Lu Bai
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
- Department of Physics, The Pennsylvania State University, University Park, PA, 16802, USA
9. Guzman C, Duttke S, Zhu Y, De Arruda Saldanha C, Downes N, Benner C, Heinz S. Combining TSS-MPRA and sensitive TSS profile dissimilarity scoring to study the sequence determinants of transcription initiation. Nucleic Acids Res 2023; 51:e80. [PMID: 37403796 PMCID: PMC10450201 DOI: 10.1093/nar/gkad562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2022] [Revised: 06/13/2023] [Accepted: 06/20/2023] [Indexed: 07/06/2023]
Abstract
Cis-regulatory elements (CREs) can be classified by the shapes of their transcription start site (TSS) profiles, which are indicative of distinct regulatory mechanisms. Massively parallel reporter assays (MPRAs) are increasingly being used to study CRE regulatory mechanisms, yet the degree to which MPRAs replicate individual endogenous TSS profiles has not been determined. Here, we present a new low-input MPRA protocol (TSS-MPRA) that enables measuring TSS profiles of episomal reporters as well as after lentiviral reporter chromatinization. To sensitively compare MPRA and endogenous TSS profiles, we developed a novel dissimilarity scoring algorithm (WIP score) that outperforms the frequently used earth mover's distance on experimental data. Using TSS-MPRA and WIP scoring on 500 unique reporter inserts, we found that short (153 bp) MPRA promoter inserts replicate the endogenous TSS patterns of ∼60% of promoters. Lentiviral reporter chromatinization did not improve fidelity of TSS-MPRA initiation patterns, and increasing insert size frequently led to activation of extraneous TSS in the MPRA that are not active in vivo. We discuss the implications of our findings, which highlight important caveats when using MPRAs to study transcription mechanisms. Finally, we illustrate how TSS-MPRA and WIP scoring can provide novel insights into the impact of transcription factor motif mutations and genetic variants on TSS patterns and transcription levels.
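For context, the earth mover's distance that this abstract names as the baseline metric has a simple form for one-dimensional profiles on a shared position grid: it equals the summed absolute difference between the two cumulative distributions. The sketch below illustrates only that baseline. It is not the WIP score (which the abstract does not define) and not the authors' implementation; the function name, the 1 bp spacing, and the example counts are assumptions made for illustration.

```python
from itertools import accumulate


def earth_movers_distance(profile_a, profile_b):
    """1D earth mover's distance between two TSS count profiles.

    Assumes both profiles cover the same positions at unit (1 bp) spacing.
    Each profile is normalized to sum to 1, so only its shape is compared.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same positions")
    total_a, total_b = sum(profile_a), sum(profile_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each profile needs a positive total signal")
    cdf_a = accumulate(x / total_a for x in profile_a)
    cdf_b = accumulate(x / total_b for x in profile_b)
    # In 1D, the EMD is the summed absolute difference of the two CDFs.
    return sum(abs(a - b) for a, b in zip(cdf_a, cdf_b))


# Hypothetical read counts over five positions (not data from the paper).
focused = [0, 0, 10, 0, 0]
shifted = [0, 0, 0, 10, 0]
print(earth_movers_distance(focused, shifted))
```

Because the profiles are normalized first, two profiles with the same shape but different sequencing depth score a distance of zero under this sketch.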
Affiliation(s)
- Carlos Guzman
- Department of Medicine, Division of Endocrinology, U.C. San Diego School of Medicine, La Jolla, CA 92093, USA
- Department of Bioengineering, Graduate Program in Bioinformatics & Systems Biology, U.C. San Diego, La Jolla, CA 92093, USA
- Sascha Duttke
- Department of Medicine, Division of Endocrinology, U.C. San Diego School of Medicine, La Jolla, CA 92093, USA
- Yixin Zhu
- Department of Medicine, Division of Endocrinology, U.C. San Diego School of Medicine, La Jolla, CA 92093, USA
- Camila De Arruda Saldanha
- Department of Medicine, Division of Endocrinology, U.C. San Diego School of Medicine, La Jolla, CA 92093, USA
- Nicholas L Downes
- Department of Medicine, Division of Endocrinology, U.C. San Diego School of Medicine, La Jolla, CA 92093, USA
- Christopher Benner
- Department of Medicine, Division of Endocrinology, U.C. San Diego School of Medicine, La Jolla, CA 92093, USA
- Sven Heinz
- Department of Medicine, Division of Endocrinology, U.C. San Diego School of Medicine, La Jolla, CA 92093, USA
10. Das S, Singh A, Shah P. Evaluating single-cell variability in proteasomal decay. bioRxiv 2023:2023.08.22.554358. [PMID: 37662347 PMCID: PMC10473619 DOI: 10.1101/2023.08.22.554358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2023]
Abstract
Gene expression is a stochastic process that leads to variability in mRNA and protein abundances even within an isogenic population of cells grown in the same environment. This variation, often called gene-expression noise, has typically been attributed to transcriptional and translational processes while ignoring the contributions of protein decay variability across cells. Here we estimate the single-cell protein decay rates of two degron GFPs in Saccharomyces cerevisiae using time-lapse microscopy. We find substantial cell-to-cell variability in the decay rates of the degron GFPs. We evaluate cellular features that explain the variability in proteasomal decay and find that the amount of the 20S catalytic beta subunit of the proteasome marginally explains the observed variability in the degron GFP half-lives. We propose alternate hypotheses that might explain the observed variability in the decay of the two degron GFPs. Overall, our study highlights the importance of studying the kinetics of the decay process at single-cell resolution and shows that decay rates vary at the single-cell level and that the decay process is stochastic. A complex model of decay dynamics must be included when modeling stochastic gene expression to estimate gene expression noise.
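To make the quantities in this abstract concrete, the sketch below estimates a decay rate and half-life for a single cell's fluorescence trace. It assumes the simplest possible model, first-order decay F(t) = F0·exp(−k·t) with half-life ln 2 / k. That model is an assumption made here for illustration only; the abstract does not state the model the authors fitted and argues that more complex decay dynamics matter. The function names and the trace are hypothetical.

```python
import math


def fit_decay_rate(times, intensities):
    """Estimate k in F0 * exp(-k * t) by least squares on ln(intensity)."""
    if len(times) != len(intensities) or len(times) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(v) for v in intensities]  # requires positive intensities
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    # The slope of ln F versus t is -k under first-order decay.
    return -sxy / sxx


def half_life(rate):
    """Half-life implied by a first-order decay rate."""
    return math.log(2) / rate


# Hypothetical noise-free trace for one cell (arbitrary time units).
times = [0, 5, 10, 15, 20]
trace = [1000 * math.exp(-0.1 * t) for t in times]
k = fit_decay_rate(times, trace)
print(k, half_life(k))
```

Repeating the fit per cell would give a distribution of rates whose spread is one way to express the cell-to-cell variability the abstract describes.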
Affiliation(s)
- Abhyudai Singh
- Department of Electrical and Computer Engineering, Biomedical Engineering, University of Delaware
11. Hussain S, Sadouni N, van Essen D, Dao LTM, Ferré Q, Charbonnier G, Torres M, Gallardo F, Lecellier CH, Sexton T, Saccani S, Spicuglia S. Short tandem repeats are important contributors to silencer elements in T cells. Nucleic Acids Res 2023; 51:4845-4866. [PMID: 36929452 PMCID: PMC10250210 DOI: 10.1093/nar/gkad187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Revised: 02/26/2023] [Accepted: 03/15/2023] [Indexed: 03/18/2023]
Abstract
The action of cis-regulatory elements with either activation or repression functions underpins the precise regulation of gene expression during normal development and cell differentiation. Gene activation by the combined activities of promoters and distal enhancers has been extensively studied in normal and pathological contexts. In sharp contrast, gene repression by cis-acting silencers, defined as genetic elements that negatively regulate gene transcription in a position-independent fashion, is less well understood. Here, we repurpose the STARR-seq approach as a novel high-throughput reporter strategy to quantitatively assess silencer activity in mammals. We assessed silencer activity from DNase I hypersensitive sites in a mouse T cell line. Identified silencers were associated with either repressive or active chromatin marks and enriched for binding motifs of known transcriptional repressors. CRISPR-mediated genomic deletions validated the repressive function of distinct silencers involved in the repression of non-T cell genes and genes regulated during T cell differentiation. Finally, we unravel an association of silencer activity with short tandem repeats, highlighting the role of repetitive elements in silencer activity. Our results provide a general strategy for genome-wide identification and characterization of silencer elements.
Affiliation(s)
- Saadat Hussain
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
- Nori Sadouni
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
- Dominic van Essen
- Institute for Research on Cancer and Ageing, IRCAN, 06107 Nice, France
- Lan T M Dao
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
- Quentin Ferré
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
- Guillaume Charbonnier
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
- Magali Torres
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
- Frederic Gallardo
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
- Charles-Henri Lecellier
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France
- LIRMM, University of Montpellier, CNRS, Montpellier, France
- Tom Sexton
- Institut de Génétique et de Biologie Moléculaire et Cellulaire – IGBMC (CNRS UMR 7104, INSERM U1258, Université de Strasbourg), 67404 Illkirch, France
| | - Simona Saccani
- Institute for Research on Cancer and Ageing, IRCAN, 06107 Nice, France
| | - Salvatore Spicuglia
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, Marseille, France
| |
|
12
|
Georgakopoulos-Soares I, Deng C, Agarwal V, Chan CSY, Zhao J, Inoue F, Ahituv N. Transcription factor binding site orientation and order are major drivers of gene regulatory activity. Nat Commun 2023; 14:2333. [PMID: 37087538 PMCID: PMC10122648 DOI: 10.1038/s41467-023-37960-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 10/04/2022] [Accepted: 04/06/2023] [Indexed: 04/24/2023] Open
Abstract
The gene regulatory code and grammar remain largely unknown, precluding our ability to link phenotype to genotype in regulatory sequences. Here, using a massively parallel reporter assay (MPRA) of 209,440 sequences, we examine all possible pair and triplet combinations, permutations and orientations of eighteen liver-associated transcription factor binding sites (TFBS). We find that TFBS orientation and order have a major effect on gene regulatory activity. Corroborating these results with genomic analyses, we find clear human promoter TFBS orientation biases and similar TFBS orientation and order transcriptional effects in an MPRA that tested 164,307 liver candidate regulatory elements. Additionally, by adding TFBS orientation to a model that predicts expression from sequence we improve performance by 7.7%. Collectively, our results show that TFBS orientation and order have a significant effect on gene regulatory activity and need to be considered when analyzing the functional effect of variants on the activity of these sequences.
Affiliation(s)
- Ilias Georgakopoulos-Soares
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA.
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA.
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA.
| | - Chengyu Deng
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Vikram Agarwal
- mRNA Center of Excellence, Sanofi Pasteur Inc., Waltham, MA, USA
| | - Candace S Y Chan
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Jingjing Zhao
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Fumitaka Inoue
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA.
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA.
| |
|
13
|
Durrant MG, Fanton A, Tycko J, Hinks M, Chandrasekaran SS, Perry NT, Schaepe J, Du PP, Lotfy P, Bassik MC, Bintu L, Bhatt AS, Hsu PD. Systematic discovery of recombinases for efficient integration of large DNA sequences into the human genome. Nat Biotechnol 2023; 41:488-499. [PMID: 36217031 PMCID: PMC10083194 DOI: 10.1038/s41587-022-01494-w] [Citation(s) in RCA: 39] [Impact Index Per Article: 39.0] [Received: 11/09/2021] [Accepted: 09/01/2022] [Indexed: 11/09/2022]
Abstract
Large serine recombinases (LSRs) are DNA integrases that facilitate the site-specific integration of mobile genetic elements into bacterial genomes. Only a few LSRs, such as Bxb1 and PhiC31, have been characterized to date, with limited efficiency as tools for DNA integration in human cells. In this study, we developed a computational approach to identify thousands of LSRs and their DNA attachment sites, expanding known LSR diversity by >100-fold and enabling the prediction of their insertion site specificities. We tested their recombination activity in human cells, classifying them as landing pad, genome-targeting or multi-targeting LSRs. Overall, we achieved up to seven-fold higher recombination than Bxb1 and genome integration efficiencies of 40-75% with cargo sizes over 7 kb. We also demonstrate virus-free, direct integration of plasmid or amplicon libraries for improved functional genomics applications. This systematic discovery of recombinases directly from microbial sequencing data provides a resource of over 60 LSRs experimentally characterized in human cells for large-payload genome insertion without exposed DNA double-stranded breaks.
Affiliation(s)
- Matthew G Durrant
- Arc Institute, Palo Alto, CA, USA
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Alison Fanton
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- University of California, Berkeley-University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
| | - Josh Tycko
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Michaela Hinks
- Department of Bioengineering, Stanford University, Stanford, CA, USA
| | - Sita S Chandrasekaran
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- University of California, Berkeley-University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
| | - Nicholas T Perry
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- University of California, Berkeley-University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
| | - Julia Schaepe
- Department of Bioengineering, Stanford University, Stanford, CA, USA
| | - Peter P Du
- Department of Genetics, Stanford University, Stanford, CA, USA
- Cancer Biology Program, Stanford University, Stanford, CA, USA
| | - Peter Lotfy
- Laboratory of Molecular and Cell Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
| | | | - Lacramioara Bintu
- Department of Bioengineering, Stanford University, Stanford, CA, USA.
| | - Ami S Bhatt
- Department of Genetics, Stanford University, Stanford, CA, USA.
- Department of Medicine (Hematology), Stanford University, Stanford, CA, USA.
| | - Patrick D Hsu
- Arc Institute, Palo Alto, CA, USA.
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA.
- Laboratory of Molecular and Cell Biology, Salk Institute for Biological Studies, La Jolla, CA, USA.
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA.
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA.
| |
|
14
|
Karollus A, Mauermeier T, Gagneur J. Current sequence-based models capture gene expression determinants in promoters but mostly ignore distal enhancers. Genome Biol 2023; 24:56. [PMID: 36973806 PMCID: PMC10045630 DOI: 10.1186/s13059-023-02899-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 09/14/2022] [Accepted: 03/16/2023] [Indexed: 03/29/2023] Open
Abstract
BACKGROUND The largest sequence-based models of transcription control to date are obtained by predicting genome-wide gene regulatory assays across the human genome. This setting is fundamentally correlative, as those models are exposed during training solely to the sequence variation between human genes that arose through evolution, questioning the extent to which those models capture genuine causal signals. RESULTS Here we confront predictions of state-of-the-art models of transcription regulation against data from two large-scale observational studies and five deep perturbation assays. The most advanced of these sequence-based models, Enformer, by and large, captures causal determinants of human promoters. However, models fail to capture the causal effects of enhancers on expression, notably in medium to long distances and particularly for highly expressed promoters. More generally, the predicted impact of distal elements on gene expression predictions is small and the ability to correctly integrate long-range information is significantly more limited than the receptive fields of the models suggest. This is likely caused by the escalating class imbalance between actual and candidate regulatory elements as distance increases. CONCLUSIONS Our results suggest that sequence-based models have advanced to the point that in silico study of promoter regions and promoter variants can provide meaningful insights and we provide practical guidance on how to use them. Moreover, we foresee that it will require significantly more and particularly new kinds of data to train models accurately accounting for distal elements.
Affiliation(s)
- Alexander Karollus
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany.
| | - Thomas Mauermeier
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
| | - Julien Gagneur
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany.
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany.
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany.
- Munich Data Science Institute, Technical University of Munich, Garching, Germany.
| |
|
15
|
Agarwal V, Inoue F, Schubach M, Martin BK, Dash PM, Zhang Z, Sohota A, Noble WS, Yardimci GG, Kircher M, Shendure J, Ahituv N. Massively parallel characterization of transcriptional regulatory elements in three diverse human cell types. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.05.531189. [PMID: 36945371 PMCID: PMC10028905 DOI: 10.1101/2023.03.05.531189] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 03/11/2023]
Abstract
The human genome contains millions of candidate cis-regulatory elements (CREs) with cell-type-specific activities that shape both health and myriad disease states. However, we lack a functional understanding of the sequence features that control the activity and cell-type-specific features of these CREs. Here, we used lentivirus-based massively parallel reporter assays (lentiMPRAs) to test the regulatory activity of over 680,000 sequences, representing a nearly comprehensive set of all annotated CREs among three cell types (HepG2, K562, and WTC11), finding 41.7% to be functional. By testing sequences in both orientations, we find promoters to have significant strand orientation effects. We also observe that their 200 nucleotide cores function as non-cell-type-specific 'on switches' providing similar expression levels to their associated gene. In contrast, enhancers have weaker orientation effects, but increased tissue-specific characteristics. Utilizing our lentiMPRA data, we develop sequence-based models to predict CRE function with high accuracy and delineate regulatory motifs. Testing an additional lentiMPRA library encompassing 60,000 CREs in all three cell types, we further identified factors that determine cell-type specificity. Collectively, our work provides an exhaustive catalog of functional CREs in three widely used cell lines, and showcases how large-scale functional measurements can be used to dissect regulatory grammar.
Affiliation(s)
- Vikram Agarwal
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- mRNA Center of Excellence, Sanofi Pasteur Inc., Waltham, MA 02451, USA
| | - Fumitaka Inoue
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
| | - Max Schubach
- Berlin Institute of Health at Charité - Universitätsmedizin Berlin, 10178, Berlin, Germany
| | - Beth K. Martin
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Pyaree Mohan Dash
- Berlin Institute of Health at Charité - Universitätsmedizin Berlin, 10178, Berlin, Germany
| | - Zicong Zhang
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
| | - Ajuni Sohota
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
| | - William Stafford Noble
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
| | - Galip Gürkan Yardimci
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Knight Cancer Institute, Oregon Health and Science University, Portland, OR, USA
- Cancer Early Detection Advanced Research Center, Oregon Health and Science University, Portland, OR, USA
| | - Martin Kircher
- Berlin Institute of Health at Charité - Universitätsmedizin Berlin, 10178, Berlin, Germany
- Institute of Human Genetics, University Medical Center Schleswig-Holstein, University of Lübeck, Lübeck, Germany
| | - Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, Seattle, WA 98195, USA
- Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA 98195, USA
- Allen Center for Cell Lineage Tracing, University of Washington, Seattle, WA 98195, USA
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
| |
|
16
|
Deng S. The origin of genetic and metabolic systems: Evolutionary structural insights. Heliyon 2023; 9:e14466. [PMID: 36967965 PMCID: PMC10036676 DOI: 10.1016/j.heliyon.2023.e14466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2022] [Revised: 02/27/2023] [Accepted: 03/06/2023] [Indexed: 03/16/2023] Open
Abstract
DNA is derived from reverse transcription and its origin is related to reverse transcriptase, DNA polymerase and integrase. The gene structure originated from the evolution of the first RNA polymerase. Thus, an explanation of the origin of the genetic system must also explain the evolution of these enzymes. This paper proposes a polymer structure model, termed the stable complex evolution model, which explains the evolution of enzymes and functional molecules. Enzymes evolved their functions by forming locally tightly packed complexes with specific substrates. A metabolic reaction can therefore be considered to be the result of adaptive evolution in this way when a certain essential molecule is lacking in a cell. The evolution of the primitive genetic and metabolic systems was thus coordinated and synchronized. According to the stable complex model, almost all functional molecules establish binding affinity and specific recognition through complementary interactions, and functional molecules therefore have the nature of being auto-reactive. This is thermodynamically favorable and leads to functional duplication and self-organization. Therefore, it can be speculated that biological systems have a certain tendency to maintain functional stability or are influenced by an inherent selective power. The evolution of dormant bacteria may support this hypothesis, and inherent selectivity can be unified with natural selection at the molecular level.
|
17
|
Moeckel C, Zaravinos A, Georgakopoulos-Soares I. Strand Asymmetries Across Genomic Processes. Comput Struct Biotechnol J 2023; 21:2036-2047. [PMID: 36968020 PMCID: PMC10030826 DOI: 10.1016/j.csbj.2023.03.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 03/08/2023] [Accepted: 03/08/2023] [Indexed: 03/12/2023] Open
Abstract
Across biological systems, a number of genomic processes, including transcription, replication, DNA repair, and transcription factor binding, display intrinsic directionalities. These directionalities are reflected in the asymmetric distribution of nucleotides, motifs, genes, transposon integration sites, and other functional elements across the two complementary strands. Strand asymmetries, including GC skews and mutational biases, have shaped the nucleotide composition of diverse organisms. The investigation of strand asymmetries often serves as a method to understand underlying biological mechanisms, including protein binding preferences, transcription factor interactions, retrotransposition, DNA damage and repair preferences, transcription-replication collisions, and mutagenesis mechanisms. Research into this subject also enables the identification of functional genomic sites, such as replication origins and transcription start sites. Improvements in our ability to detect and quantify DNA strand asymmetries will provide insights into diverse functionalities of the genome, the contribution of different mutational mechanisms in germline and somatic mutagenesis, and our knowledge of genome instability and evolution, which all have significant clinical implications in human disease, including cancer. In this review, we describe key developments that have been made across the field of genomic strand asymmetries, as well as the discovery of associated mechanisms.
Affiliation(s)
- Camille Moeckel
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
| | - Apostolos Zaravinos
- Department of Life Sciences, European University Cyprus, Diogenis Str., 6, Nicosia 2404, Cyprus
- Cancer Genetics, Genomics and Systems Biology laboratory, Basic and Translational Cancer Research Center (BTCRC), Nicosia 1516, Cyprus
- Corresponding author at: Department of Life Sciences, European University Cyprus, Diogenis Str., 6, Nicosia 2404, Cyprus.
| | - Ilias Georgakopoulos-Soares
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Corresponding author.
| |
|
18
|
Carrasco Pro S, Hook H, Bray D, Berenzy D, Moyer D, Yin M, Labadorf AT, Tewhey R, Siggers T, Fuxman Bass JI. Widespread perturbation of ETS factor binding sites in cancer. Nat Commun 2023; 14:913. [PMID: 36808133 PMCID: PMC9938127 DOI: 10.1038/s41467-023-36535-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Accepted: 02/03/2023] [Indexed: 02/19/2023] Open
Abstract
Although >90% of somatic mutations reside in non-coding regions, few have been reported as cancer drivers. To predict driver non-coding variants (NCVs), we present a transcription factor (TF)-aware burden test based on a model of coherent TF function in promoters. We apply this test to NCVs from the Pan-Cancer Analysis of Whole Genomes cohort and predict 2555 driver NCVs in the promoters of 813 genes across 20 cancer types. These genes are enriched in cancer-related gene ontologies, essential genes, and genes associated with cancer prognosis. We find that 765 candidate driver NCVs alter transcriptional activity, 510 lead to differential binding of TF-cofactor regulatory complexes, and that they primarily impact the binding of ETS factors. Finally, we show that different NCVs within a promoter often affect transcriptional activity through shared mechanisms. Our integrated computational and experimental approach shows that cancer NCVs are widespread and that ETS factors are commonly disrupted.
Affiliation(s)
| | - Heather Hook
- Department of Biology, Boston University, Boston, MA, USA
| | - David Bray
- Bioinformatics Program, Boston University, Boston, MA, USA
| | | | - Devlin Moyer
- Bioinformatics Program, Boston University, Boston, MA, USA
| | - Meimei Yin
- Department of Biology, Boston University, Boston, MA, USA
| | - Adam Thomas Labadorf
- Bioinformatics Hub, Boston University, Boston, MA, USA
- Boston University School of Medicine, Department of Neurology, Boston, MA, USA
| | | | - Trevor Siggers
- Bioinformatics Program, Boston University, Boston, MA, USA.
- Department of Biology, Boston University, Boston, MA, USA.
- Biological Design Center, Boston University, Boston, MA, USA.
| | - Juan Ignacio Fuxman Bass
- Bioinformatics Program, Boston University, Boston, MA, USA.
- Department of Biology, Boston University, Boston, MA, USA.
| |
|
19
|
Kari H, Bandi SMS, Kumar A, Yella VR. DeePromClass: Delineator for Eukaryotic Core Promoters Employing Deep Neural Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:802-807. [PMID: 35353704 DOI: 10.1109/tcbb.2022.3163418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Computational promoter identification in eukaryotes is a classical biological problem that should be refurbished with the availability of an avalanche of experimental data and emerging deep learning technologies. The current knowledge indicates that eukaryotic core promoters display multifarious signals such as TATA-Box, Inr element, TCT, and Pause-button, and structural motifs such as G-quadruplexes. In the present study, we combined the power of deep learning with a plethora of promoter motifs to delineate promoters and non-promoters gleaned from the statistical properties of DNA sequence arrangement. To this end, we implemented convolutional neural network (CNN) and long short-term memory (LSTM) recurrent neural network architecture for five model systems with [-100 to +50] segments relative to the transcription start site being the core promoter. Unlike previous state-of-the-art tools, which furnish a binary decision of promoter or non-promoter, we classify a chunk of 151mer sequence into a promoter along with the consensus signal type or a non-promoter. The combined CNN-LSTM model, which we call "DeePromClass", achieved testing accuracy of 90.6%, 93.6%, 91.8%, 86.5%, and 84.0% for S. cerevisiae, C. elegans, D. melanogaster, Mus musculus, and Homo sapiens respectively. In total, our tool provides an insightful update on next-generation promoter prediction tools for promoter biologists.
|
20
|
Cabrera A, Edelstein HI, Glykofrydis F, Love KS, Palacios S, Tycko J, Zhang M, Lensch S, Shields CE, Livingston M, Weiss R, Zhao H, Haynes KA, Morsut L, Chen YY, Khalil AS, Wong WW, Collins JJ, Rosser SJ, Polizzi K, Elowitz MB, Fussenegger M, Hilton IB, Leonard JN, Bintu L, Galloway KE, Deans TL. The sound of silence: Transgene silencing in mammalian cell engineering. Cell Syst 2022; 13:950-973. [PMID: 36549273 PMCID: PMC9880859 DOI: 10.1016/j.cels.2022.11.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/04/2022] [Revised: 09/22/2022] [Accepted: 11/22/2022] [Indexed: 12/24/2022]
Abstract
To elucidate principles operating in native biological systems and to develop novel biotechnologies, synthetic biology aims to build and integrate synthetic gene circuits within native transcriptional networks. The utility of synthetic gene circuits for cell engineering relies on the ability to control the expression of all constituent transgene components. Transgene silencing, defined as the loss of expression over time, persists as an obstacle for engineering primary cells and stem cells with transgenic cargos. In this review, we highlight the challenge that transgene silencing poses to the robust engineering of mammalian cells, outline potential molecular mechanisms of silencing, and present approaches for preventing transgene silencing. We conclude with a perspective identifying future research directions for improving the performance of synthetic gene circuits.
Affiliation(s)
- Alan Cabrera
- Department of Bioengineering, Rice University, Houston, TX 77005, USA; Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Hailey I Edelstein
- Center for Synthetic Biology, Northwestern University, Evanston, IL 60208, USA; The Eli and Edythe Broad CIRM Center, Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
| | - Fokion Glykofrydis
- Department of Stem Cell Biology and Regenerative Medicine, University of Southern California, Los Angeles, CA 90033-9080, USA
| | - Kasey S Love
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Sebastian Palacios
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Josh Tycko
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Meng Zhang
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Champaign, Urbana, IL 61801, USA
| | - Sarah Lensch
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Cara E Shields
- Wallace H. Coulter Department of Biomedical Engineering, Emory University, Atlanta, GA 30322, USA
| | - Mark Livingston
- Department of Biomedical Engineering, University of Utah, Salt Lake City, UT 84112, USA
| | - Ron Weiss
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Huimin Zhao
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Champaign, Urbana, IL 61801, USA
| | - Karmella A Haynes
- Wallace H. Coulter Department of Biomedical Engineering, Emory University, Atlanta, GA 30322, USA
| | - Leonardo Morsut
- Department of Stem Cell Biology and Regenerative Medicine, University of Southern California, Los Angeles, CA 90033-9080, USA
| | - Yvonne Y Chen
- Department of Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, Los Angeles, CA 90095, USA; Parker Institute for Cancer Immunotherapy Center at UCLA, Los Angeles, CA 90095, USA
| | - Ahmad S Khalil
- Biological Design Center and Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
| | - Wilson W Wong
- Biological Design Center and Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
| | - James J Collins
- Department of Stem Cell Biology and Regenerative Medicine, University of Southern California, Los Angeles, CA 90033-9080, USA; Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA; Harvard-MIT Program in Health Sciences and Technology, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Susan J Rosser
- School of Biological Sciences, University of Edinburgh, Edinburgh, UK
| | - Karen Polizzi
- Department of Chemical Engineering, Imperial College London, South Kensington Campus, London, UK; Imperial College Centre for Synthetic Biology, South Kensington Campus, London, UK
| | - Michael B Elowitz
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA; Howard Hughes Medical Institute, California Institute of Technology, Pasadena, CA 91125, USA
| | - Martin Fussenegger
- Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, Basel 4058, Switzerland; Faculty of Science, University of Basel, Mattenstrasse 26, Basel 4058, Switzerland
| | - Isaac B Hilton
- Department of Bioengineering, Rice University, Houston, TX 77005, USA
| | - Joshua N Leonard
- Center for Synthetic Biology, Northwestern University, Evanston, IL 60208, USA; The Eli and Edythe Broad CIRM Center, Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
| | - Lacramioara Bintu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Kate E Galloway
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Tara L Deans
- Department of Biomedical Engineering, University of Utah, Salt Lake City, UT 84112, USA.
| |
|
21
|
Yin XY, Chen HX, Chen Z, Yang Q, Han J, He GW. Identification and functional analysis of genetic variants of ISL1 gene promoter in human atrial septal defects. J Gene Med 2022; 24:e3450. [PMID: 36170181 DOI: 10.1002/jgm.3450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 08/16/2022] [Accepted: 09/25/2022] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Atrial septal defect (ASD) is a common type of congenital heart disease. A gene promoter plays a pivotal role in the disease development. This study was designed to investigate the pathological role of variants of the ISL1 gene promoter region in ASD patients. METHODS Total DNA extracted from 625 subjects, including 332 ASD patients and 293 healthy controls, was sequenced to identify variants in the promoter region of the ISL1 gene. Further functional analyses of the variants were performed with dual luciferase reporter assay and electrophoretic mobility shift assay (EMSA). All possible binding sites of transcription factors affected by the identified variants were predicted using the JASPAR database. RESULTS Four variants in the ISL1 gene promoter were found only in patients with ASD by sequencing. Three of the four variants [g.4923 G > C (rs541081886), g.5079 A > G (rs1371835943) and g.5309 G > A (rs116222082)] significantly decreased the transcriptional activities compared with the wild-type ISL1 gene promoter (p < 0.05). The EMSA revealed that these variants [g.4923 G > C (rs541081886), g.5079 A > G (rs1371835943) and g.5309 G > A (rs116222082)] in the ISL1 gene promoter affected the number and affinity of binding sites of transcription factors. Further analysis with the online JASPAR database demonstrated that a cluster of putative binding sites for transcription factors may be altered by these variants. CONCLUSIONS These sequence variants identified from the promoter region of the ISL1 gene in ASD patients are probably involved in the development of ASD by affecting the transcriptional activity and altering ISL1 levels. Therefore, these findings may provide new insights into the molecular etiology and potential therapeutic strategy of ASD.
Affiliation(s)
- Xiu-Yun Yin
- School of Pharmacy, Drug Research & Development Center, Wannan Medical College, Wuhu, Anhui, China & The Institute of Cardiovascular Diseases, TEDA International Cardiovascular Hospital, Tianjin University & Chinese Academy of Medical Sciences, Tianjin, China
| | - Huan-Xin Chen
- The Institute of Cardiovascular Diseases & Department Cardiovascular Surgery, TEDA International Cardiovascular Hospital, Tianjin University & Chinese Academy of Medical Sciences, Tianjin, China
| | - Zhuo Chen
- School of Pharmacy, Drug Research & Development Center, Wannan Medical College, Wuhu, Anhui, China & The Institute of Cardiovascular Diseases, TEDA International Cardiovascular Hospital, Tianjin University & Chinese Academy of Medical Sciences, Tianjin, China
| | - Qin Yang
- The Institute of Cardiovascular Diseases & Department Cardiovascular Surgery, TEDA International Cardiovascular Hospital, Tianjin University & Chinese Academy of Medical Sciences, Tianjin, China
| | - Jun Han
- School of Pharmacy, Drug Research & Development Center, Wannan Medical College, Wuhu, Anhui, China
| | - Guo-Wei He
- The Institute of Cardiovascular Diseases & Department Cardiovascular Surgery, TEDA International Cardiovascular Hospital, Tianjin University & Chinese Academy of Medical Sciences, Tianjin, China
|
22
|
Yang Y, Shao Y, Chaffin TA, Lee JH, Poindexter MR, Ahkami AH, Blumwald E, Stewart CN. Performance of abiotic stress-inducible synthetic promoters in genetically engineered hybrid poplar (Populus tremula × Populus alba). Front Plant Sci 2022; 13:1011939. [PMID: 36330242 PMCID: PMC9623294 DOI: 10.3389/fpls.2022.1011939] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/04/2022] [Accepted: 09/28/2022] [Indexed: 05/27/2023]
Abstract
Abiotic stresses can cause significant damage to plants. For sustainable bioenergy crop production, it is critical to generate crops resistant to such stress. Engineering promoters to control the precise expression of stress resistance genes is a very effective way to address the problem. Here we developed stably transformed Populus tremula × Populus alba hybrid poplar (INRA 717-1B4) containing one of six synthetic drought stress-inducible promoters (SDs; SD9-1, SD9-2, SD9-3, SD13-1, SD18-1, and SD18-3) identified previously by transient transformation assays. We screened green fluorescent protein (GFP) induction in poplar under osmotic stress conditions. Of six transgenic lines containing a synthetic promoter, three lines (SD18-1, SD9-2, and SD9-3) had significant GFP expression in both salt and osmotic stress treatments. Each synthetic promoter employed heptamerized repeats of specific and short cis-regulatory elements (7 repeats of 7-8 bases). To test whether repeats of longer sequences can improve osmotic stress responsiveness, a transgenic poplar containing a synthetic promoter made of the heptamerized entire SD9 motif (20 bases, containing all partial SD9 motifs) was generated and measured for GFP induction under osmotic stress. The heptamerized entire SD9 motif did not result in higher GFP expression than the shorter promoters consisting of heptamerized SD9-1, SD9-2, and SD9-3 (partial SD9) motifs. This result indicates that shorter synthetic promoters (~50 bp) can be used for versatile control of gene expression in transgenic poplar. These synthetic promoters will be useful tools to engineer stress-resilient bioenergy tree crops in the future.
Affiliation(s)
- Yongil Yang
- Center for Agricultural Synthetic Biology, University of Tennessee Institute of Agriculture, Knoxville, TN, United States
- Department of Plant Sciences, University of Tennessee Institute of Agriculture, Knoxville, TN, United States
| | - Yuanhua Shao
- Center for Agricultural Synthetic Biology, University of Tennessee Institute of Agriculture, Knoxville, TN, United States
- Department of Plant Sciences, University of Tennessee Institute of Agriculture, Knoxville, TN, United States
| | - Timothy A. Chaffin
- Center for Agricultural Synthetic Biology, University of Tennessee Institute of Agriculture, Knoxville, TN, United States
- Department of Plant Sciences, University of Tennessee Institute of Agriculture, Knoxville, TN, United States
| | - Jun Hyung Lee
- Center for Agricultural Synthetic Biology, University of Tennessee Institute of Agriculture, Knoxville, TN, United States
- Department of Plant Sciences, University of Tennessee Institute of Agriculture, Knoxville, TN, United States
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
| | - Magen R. Poindexter
- Center for Agricultural Synthetic Biology, University of Tennessee Institute of Agriculture, Knoxville, TN, United States
- Department of Plant Sciences, University of Tennessee Institute of Agriculture, Knoxville, TN, United States
| | - Amir H. Ahkami
- Environmental Molecular Sciences Laboratory (EMSL), Pacific Northwest National Laboratory (PNNL), Richland, WA, United States
| | - Eduardo Blumwald
- Department of Plant Sciences, University of California, Davis, Davis, CA, United States
| | - C. Neal Stewart
- Center for Agricultural Synthetic Biology, University of Tennessee Institute of Agriculture, Knoxville, TN, United States
- Department of Plant Sciences, University of Tennessee Institute of Agriculture, Knoxville, TN, United States
|
23
|
Mellul M, Lahav S, Imashimizu M, Tokunaga Y, Lukatsky DB, Ram O. Repetitive DNA symmetry elements negatively regulate gene expression in embryonic stem cells. Biophys J 2022; 121:3126-3135. [PMID: 35810331 PMCID: PMC9463640 DOI: 10.1016/j.bpj.2022.07.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2021] [Revised: 06/13/2022] [Accepted: 07/07/2022] [Indexed: 11/30/2022] Open
Abstract
Transcription factor (TF) binding to genomic DNA elements constitutes one of the key mechanisms that regulate gene expression programs in cells. Both consensus and nonconsensus DNA sequence elements influence the recognition specificity of TFs. Based on the analysis of experimentally determined c-Myc binding preferences to genomic DNA, here we statistically predict that certain repetitive, nonconsensus DNA symmetry elements can relatively reduce TF-DNA binding preferences. This is in contrast to a different set of repetitive, nonconsensus symmetry elements that can increase the strength of TF-DNA binding. Using a c-Myc enhancer reporter system containing a consensus motif flanked by nonconsensus sequences in embryonic stem cells, we directly demonstrate that enrichment in such negatively regulating repetitive symmetry elements is sufficient to reduce the gene expression level compared with native genomic sequences. Sequences with negatively regulating repetitive symmetry elements around the consensus c-Myc motif and sequences with the consensus c-Myc motif flanked by entirely randomized sequences show a similar expression baseline. A possible explanation for this observation is that rather than causing complete repression, negatively regulating repetitive symmetry elements play a regulatory role in fine-tuning the reduction of gene expression, most probably by binding TFs other than c-Myc.
Affiliation(s)
- Meir Mellul
- Department of Biological Chemistry, The Institute of Life Sciences, The Hebrew University of Jerusalem, Edmond J. Safra Campus, Jerusalem, Israel
| | - Shlomtzion Lahav
- Department of Biological Chemistry, The Institute of Life Sciences, The Hebrew University of Jerusalem, Edmond J. Safra Campus, Jerusalem, Israel
| | - Masahiko Imashimizu
- Cellular and Molecular Biotechnology Research Institute, National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan
| | - Yuji Tokunaga
- Graduate School of Pharmaceutical Sciences, the University of Tokyo, Tokyo, Japan
| | - David B Lukatsky
- Department of Chemistry, Ben-Gurion University of the Negev, Beer-Sheva, Israel.
| | - Oren Ram
- Department of Biological Chemistry, The Institute of Life Sciences, The Hebrew University of Jerusalem, Edmond J. Safra Campus, Jerusalem, Israel.
|
24
|
Isbel L, Grand RS, Schübeler D. Generating specificity in genome regulation through transcription factor sensitivity to chromatin. Nat Rev Genet 2022; 23:728-740. [PMID: 35831531 DOI: 10.1038/s41576-022-00512-6] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Accepted: 05/30/2022] [Indexed: 12/11/2022]
Abstract
Cell type-specific gene expression relies on transcription factors (TFs) binding DNA sequence motifs embedded in chromatin. Understanding how motifs are accessed in chromatin is crucial to comprehend differential transcriptional responses and the phenotypic impact of sequence variation. Chromatin obstacles to TF binding range from DNA methylation to restriction of DNA access by nucleosomes depending on their position, composition and modification. In vivo and in vitro approaches now enable the study of TF binding in chromatin at unprecedented resolution. Emerging insights suggest that TFs vary in their ability to navigate chromatin states. However, it remains challenging to link binding and transcriptional outcomes to molecular characteristics of TFs or the local chromatin substrate. Here, we discuss our current understanding of how TFs access DNA in chromatin and novel techniques and directions towards a better understanding of this critical step in genome regulation.
Affiliation(s)
- Luke Isbel
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland; School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
| | - Ralph S Grand
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland; Zentrum für Molekulare Biologie der Universität Heidelberg, Heidelberg, Germany
| | - Dirk Schübeler
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland; Faculty of Sciences, University of Basel, Basel, Switzerland
|
25
|
Bergman DT, Jones TR, Liu V, Ray J, Jagoda E, Siraj L, Kang HY, Nasser J, Kane M, Rios A, Nguyen TH, Grossman SR, Fulco CP, Lander ES, Engreitz JM. Compatibility rules of human enhancer and promoter sequences. Nature 2022; 607:176-184. [PMID: 35594906 PMCID: PMC9262863 DOI: 10.1038/s41586-022-04877-w] [Citation(s) in RCA: 56] [Impact Index Per Article: 28.0] [Received: 09/28/2021] [Accepted: 05/17/2022] [Indexed: 01/03/2023]
Abstract
Gene regulation in the human genome is controlled by distal enhancers that activate specific nearby promoters. A proposed model for this specificity is that promoters have sequence-encoded preferences for certain enhancers, for example, mediated by interacting sets of transcription factors or cofactors. This 'biochemical compatibility' model has been supported by observations at individual human promoters and by genome-wide measurements in Drosophila. However, the degree to which human enhancers and promoters are intrinsically compatible has not yet been systematically measured, and how their activities combine to control RNA expression remains unclear. Here we designed a high-throughput reporter assay called enhancer × promoter self-transcribing active regulatory region sequencing (ExP STARR-seq) and applied it to examine the combinatorial compatibilities of 1,000 enhancer and 1,000 promoter sequences in human K562 cells. We identify simple rules for enhancer-promoter compatibility, whereby most enhancers activate all promoters by similar amounts, and intrinsic enhancer and promoter activities multiplicatively combine to determine RNA output (R² = 0.82). In addition, two classes of enhancers and promoters show subtle preferential effects. Promoters of housekeeping genes contain built-in activating motifs for factors such as GABPA and YY1, which decrease the responsiveness of promoters to distal enhancers. Promoters of variably expressed genes lack these motifs and show stronger responsiveness to enhancers. Together, this systematic assessment of enhancer-promoter compatibility suggests a multiplicative model tuned by enhancer and promoter class to control gene transcription in the human genome.
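The multiplicative rule described in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical enhancer × promoter expression matrix (simulated here, not data from the study) and fits it as a product of one intrinsic activity per enhancer and one per promoter, which becomes an additive row-plus-column model in log space. The function name `fit_multiplicative` and all simulation parameters are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np

def fit_multiplicative(expr):
    """Fit expr[i, j] ~ scale * enhancer[i] * promoter[j].

    Taking logs turns the product into a sum, so the least-squares fit
    for a complete matrix is the grand mean plus row and column effects.
    Returns intrinsic activities (relative units) and R^2 in log space.
    """
    log_expr = np.log(expr)
    grand = log_expr.mean()
    enh = log_expr.mean(axis=1) - grand   # per-enhancer log effect
    pro = log_expr.mean(axis=0) - grand   # per-promoter log effect
    pred = grand + enh[:, None] + pro[None, :]
    ss_res = ((log_expr - pred) ** 2).sum()
    ss_tot = ((log_expr - grand) ** 2).sum()
    return np.exp(enh), np.exp(pro), 1.0 - ss_res / ss_tot

# Simulated example (assumption): 20 enhancers x 30 promoters with
# log-normal intrinsic activities and multiplicative measurement noise.
rng = np.random.default_rng(0)
true_enh = rng.lognormal(mean=0.0, sigma=1.0, size=20)
true_pro = rng.lognormal(mean=0.0, sigma=1.0, size=30)
noise = rng.lognormal(mean=0.0, sigma=0.3, size=(20, 30))
expr = np.outer(true_enh, true_pro) * noise

enh_hat, pro_hat, r2 = fit_multiplicative(expr)
print(f"R^2 of multiplicative fit (log space): {r2:.2f}")
```

A high R² in such a fit means that a single activity per enhancer and per promoter accounts for most of the variation, which is the sense in which the abstract calls the combination multiplicative. Class-specific preferences would show up as structured residuals.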
Affiliation(s)
- Drew T Bergman
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| | | | - Vincent Liu
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Judhajeet Ray
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Evelyn Jagoda
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Layla Siraj
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Biophysics Graduate Program, Harvard University, Cambridge, MA, USA
| | - Helen Y Kang
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- BASE Initiative, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford University School of Medicine, Stanford, CA, USA
| | - Joseph Nasser
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Michael Kane
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Antonio Rios
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Tung H Nguyen
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Charles P Fulco
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Bristol Myers Squibb, Cambridge, MA, USA
| | - Eric S Lander
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biology, MIT, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Jesse M Engreitz
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA.
- BASE Initiative, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford University School of Medicine, Stanford, CA, USA.
|
26
|
Al‑Obaide M, Al‑Obaidi I, Vasylyeva T. The potential consequences of bidirectional promoter methylation on GLA and HNRNPH2 expression in Fabry disease phenotypes in a family of patients carrying a GLA deletion variant. Biomed Rep 2022; 17:71. [PMID: 35910704 PMCID: PMC9326966 DOI: 10.3892/br.2022.1554] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/31/2022] [Accepted: 06/10/2022] [Indexed: 11/11/2022] Open
Abstract
Fabry disease (FD) is a rare inherited disease characterized by a wide range of symptoms attributed to GLA mutations resulting in defective α-galactosidase A (α-Gal A) and accumulation of glycosphingolipids. The GLA locus is paired in a divergent manner with the heterogeneous nuclear ribonucleoprotein HNRNPH2 locus mapped in the RPL36A-HNRNPH2 readthrough locus. As a follow-up to our recent finding of the co-regulation of GLA and HNRNPH2 via a bidirectional promoter (BDP) in normal kidney and skin cells, the present study addressed the potential accumulative influence of BDP methylation and GLA mutation on the severity of FD in patients from the same family, two males and two females carrying a GLA deletion mutation, c.1033_1034delTC (p.Ser345Argfs). The molecular analyses of the FD patients compared with the control revealed that the expression of GLA was significantly low (P<0.05), and HNRNPH2 showed a tendency of low expression (P=0.1) when BDP methylation was elevated in FD patients, compared with low BDP methylation and high GLA expression (P<0.05), and a high trend of HNRNPH2 expression in normal individuals. The accumulative effects of the mutation and BDP methylation on the severity of the disease were observed in three patients. One male member of the FD family, who was diagnosed with progressive loss of kidney function, hypertension, and eventually a stroke, and who had the lowest level of α-Gal A enzyme activity, showed the highest BDP DNA methylation level. It is concluded that the DNA methylation of GLA-HNRNPH2 BDP may serve a role in diagnosing and treating FD.
Affiliation(s)
- Mohammed Al‑Obaide
- Department of Pediatrics, School of Medicine, Texas Tech University Health Sciences Center, Amarillo, TX 79106, USA
| | - Ibtisam Al‑Obaidi
- Department of Pediatrics, School of Medicine, Texas Tech University Health Sciences Center, Amarillo, TX 79106, USA
| | - Tetyana Vasylyeva
- Department of Pediatrics, School of Medicine, Texas Tech University Health Sciences Center, Amarillo, TX 79106, USA
|
27
|
Perkins ML, Gandara L, Crocker J. A synthetic synthesis to explore animal evolution and development. Philos Trans R Soc Lond B Biol Sci 2022; 377:20200517. [PMID: 35634925 PMCID: PMC9149795 DOI: 10.1098/rstb.2020.0517] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022] Open
Abstract
Identifying the general principles by which genotypes are converted into phenotypes remains a challenge in the post-genomic era. We still lack a predictive understanding of how genes shape interactions among cells and tissues in response to signalling and environmental cues, and hence how regulatory networks generate the phenotypic variation required for adaptive evolution. Here, we discuss how techniques borrowed from synthetic biology may facilitate a systematic exploration of evolvability across biological scales. Synthetic approaches permit controlled manipulation of both endogenous and fully engineered systems, providing a flexible platform for investigating causal mechanisms in vivo. Combining synthetic approaches with multi-level phenotyping (phenomics) will supply a detailed, quantitative characterization of how internal and external stimuli shape the morphology and behaviour of living organisms. We advocate integrating high-throughput experimental data with mathematical and computational techniques from a variety of disciplines in order to pursue a comprehensive theory of evolution. This article is part of the theme issue ‘Genetic basis of adaptation and speciation: from loci to causative mutations’.
Affiliation(s)
- Mindy Liu Perkins
- Developmental Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Lautaro Gandara
- Developmental Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Justin Crocker
- Developmental Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
|
28
|
Nir R, Hoernes TP, Muramatsu H, Faserl K, Karikó K, Erlacher MD, Sas-Chen A, Schwartz S. A systematic dissection of determinants and consequences of snoRNA-guided pseudouridylation of human mRNA. Nucleic Acids Res 2022; 50:4900-4916. [PMID: 35536311 PMCID: PMC9122591 DOI: 10.1093/nar/gkac347] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/29/2021] [Revised: 04/18/2022] [Accepted: 04/24/2022] [Indexed: 12/25/2022] Open
Abstract
RNA can be extensively modified post-transcriptionally with >170 covalent modifications, expanding its functional and structural repertoire. Pseudouridine (Ψ), the most abundant modified nucleoside in rRNA and tRNA, has recently been found within mRNA molecules. It remains unclear whether pseudouridylation of mRNA can be snoRNA-guided, bearing important implications for understanding the physiological target spectrum of snoRNAs and for their potential therapeutic exploitation in genetic diseases. Here, using a massively parallel reporter based strategy we simultaneously interrogate Ψ levels across hundreds of synthetic constructs with predesigned complementarity against endogenous snoRNAs. Our results demonstrate that snoRNA-mediated pseudouridylation can occur on mRNA targets. However, this is typically achieved at relatively low efficiencies, and is constrained by mRNA localization, snoRNA expression levels and the length of the snoRNA:mRNA complementarity stretches. We exploited these insights for the design of snoRNAs targeting pseudouridylation at premature termination codons, which was previously shown to suppress translational termination. However, in this and follow-up experiments in human cells we observe no evidence for significant levels of readthrough of pseudouridylated stop codons. Our study enhances our understanding of the scope, 'design rules', constraints and consequences of snoRNA-mediated pseudouridylation.
Affiliation(s)
- Ronit Nir
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
| | - Thomas Philipp Hoernes
- Institute of Genomics and RNomics, Medical University of Innsbruck, 6020 Innsbruck, Austria
| | - Hiromi Muramatsu
- Department of Neurosurgery, University of Pennsylvania, Philadelphia, PA, USA; Department of Microbiology, University of Pennsylvania, Philadelphia, PA, USA
| | - Klaus Faserl
- Institute of Clinical Biochemistry, Biocenter, Medical University of Innsbruck, 6020 Innsbruck, Austria
| | - Katalin Karikó
- Department of Neurosurgery, University of Pennsylvania, Philadelphia, PA, USA; BioNTech RNA Pharmaceuticals, Mainz, Germany
| | | | - Aldema Sas-Chen
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel; The Shmunis School of Biomedicine and Cancer Research, The George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Schraga Schwartz
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
|
29
|
Vaknin I, Amit R. Molecular and experimental tools to design synthetic enhancers. Curr Opin Biotechnol 2022; 76:102728. [PMID: 35525178 DOI: 10.1016/j.copbio.2022.102728] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/16/2021] [Revised: 03/16/2022] [Accepted: 04/03/2022] [Indexed: 11/03/2022]
Abstract
Understanding the grammar of enhancers and how they regulate gene expression is key both for basic research and for the pharma and biotech industries. The design and characterization of synthetic enhancers can expand the known regulatory space. This is achieved through DNA Oligo Libraries (OLs), which allow as many as millions of synthetic enhancer variants to be screened simultaneously. This review covers the latest commercial DNA OL synthesis technology and its capabilities, and provides a general 'know-how' guide for the design, construction, and analysis of OL-based synthetic enhancer characterization experiments. Specifically, we focus on synthetic-enhancer-based massively parallel reporter assays, Sort-seq methodologies (e.g. flow cytometry, deep sequencing), and a brief description of machine learning-based attempts at OL analysis and follow-up validation experiments.
Affiliation(s)
- Inbal Vaknin
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, Haifa 3200000, Israel
| | - Roee Amit
- Department of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, Haifa 3200000, Israel; The Russell Berrie Nanotechnology Institute, Technion - Israel Institute of Technology, Haifa 3200000, Israel.
|
30
|
Johnson AO, Fowler SB, Webster CI, Brown AJ, James DC. Bioinformatic Design of Dendritic Cell-Specific Synthetic Promoters. ACS Synth Biol 2022; 11:1613-1626. [PMID: 35389220 PMCID: PMC9016764 DOI: 10.1021/acssynbio.2c00027] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/17/2022]
Abstract
Next-generation DNA vectors for cancer immunotherapies and vaccine development require promoters eliciting predefined transcriptional activities specific to target cell types, such as dendritic cells (DCs), which underpin immune response. In this study, we describe the de novo design of DC-specific synthetic promoters via in silico assembly of cis-transcription factor response elements (TFREs) that harness the DC transcriptional landscape. Using computational genome mining approaches, candidate TFREs were identified within promoter sequences of highly expressed DC-specific genes or those exhibiting an upregulated expression during DC maturation. Individual TFREs were then screened in vitro in a target DC line and off-target cell lines derived from skeletal muscle, fibroblast, epithelial, and endothelial cells using homotypic (TFRE repeats in series) reporter constructs. Based on these data, a library of heterotypic promoter assemblies varying in the TFRE composition, copy number, and sequential arrangement was constructed and tested in vitro to identify DC-specific promoters. Analysis of the transcriptional activity and specificity of these promoters unraveled underlying design rules, primarily TFRE composition, which govern the DC-specific synthetic promoter activity. Using these design rules, a second library of exclusively DC-specific promoters exhibiting varied transcriptional activities was generated. All DC-specific synthetic promoter assemblies exhibited >5-fold activity in the target DC line relative to off-target cell lines, with transcriptional activities ranging from 8 to 67% of the nonspecific human cytomegalovirus (hCMV-IE1) promoter. We show that bioinformatic analysis of a mammalian cell transcriptional landscape is an effective strategy for de novo design of cell-type-specific synthetic promoters with precisely controllable transcriptional activities.
Affiliation(s)
- Abayomi O. Johnson
- Department of Chemical and Biological Engineering, University of Sheffield, Mappin Street, Sheffield S1 3JD, U.K
- SynGenSys Limited, Freeths LLP, Norfolk Street, Sheffield S1 2JE, U.K
| | - Susan B. Fowler
- Antibody Discovery and Protein Engineering, R&D, AstraZeneca, Cambridge CB21 6GH, U.K
| | - Carl I. Webster
- Discovery Sciences, R&D, AstraZeneca, Cambridge CB21 6GH, U.K
| | - Adam J. Brown
- Department of Chemical and Biological Engineering, University of Sheffield, Mappin Street, Sheffield S1 3JD, U.K
- SynGenSys Limited, Freeths LLP, Norfolk Street, Sheffield S1 2JE, U.K
| | - David C. James
- Department of Chemical and Biological Engineering, University of Sheffield, Mappin Street, Sheffield S1 3JD, U.K
- SynGenSys Limited, Freeths LLP, Norfolk Street, Sheffield S1 2JE, U.K
|
31
|
Kreimer A, Ashuach T, Inoue F, Khodaverdian A, Deng C, Yosef N, Ahituv N. Massively parallel reporter perturbation assays uncover temporal regulatory architecture during neural differentiation. Nat Commun 2022; 13:1504. [PMID: 35315433 PMCID: PMC8938438 DOI: 10.1038/s41467-022-28659-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/08/2021] [Accepted: 02/04/2022] [Indexed: 02/08/2023] Open
Abstract
Gene regulatory elements play a key role in orchestrating gene expression during cellular differentiation, but what determines their function over time remains largely unknown. Here, we perform perturbation-based massively parallel reporter assays at seven early time points of neural differentiation to systematically characterize how regulatory elements and motifs within them guide cellular differentiation. By perturbing over 2,000 putative DNA binding motifs in active regulatory regions, we delineate four categories of functional elements, and observe that activity direction is mostly determined by the sequence itself, while the magnitude of effect depends on the cellular environment. We also find that fine-tuning transcription rates is often achieved by a combined activity of adjacent activating and repressing elements. Our work provides a blueprint for the sequence components needed to induce different transcriptional patterns in general and specifically during neural differentiation. How gene regulatory elements regulate gene expression during cellular differentiation remains largely unknown. Here the authors use perturbation-based massively parallel reporter assays at early time points of neural differentiation to systematically characterize how regulatory elements and motifs within them guide different transcriptional patterns.
|
32
|
Uzonyi A, Nir R, Schwartz S. Cloning of DNA oligo pools for in vitro expression. STAR Protoc 2022; 3:101103. [PMID: 35462793 PMCID: PMC9019715 DOI: 10.1016/j.xpro.2021.101103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022] Open
Abstract
Oligo library pools are powerful tools for systematic investigation of genetic and transcriptomic machinery such as promoter function and gene regulation, non-coding RNAs, or RNA modifications. Here, we provide a detailed protocol for cloning DNA oligo pools made up of tens of thousands of different constructs, aiming to preserve the complexity of the pools. This system is suitable for expression in cell lines and can be followed up by next-generation sequencing analysis. For complete details on the use and execution of this protocol, please refer to Uzonyi et al. (2021). Highlights: restriction-based cloning of DNA pools; preservation of the complexity of thousands of constructs; used to investigate genetic and transcriptomic machineries; to be expressed in cell lines and followed up by NGS analysis.
Affiliation(s)
- Anna Uzonyi
- Department of Molecular Genetics, Weizmann Institute of Science, 7610001 Rehovot, Israel
| | - Ronit Nir
- Department of Molecular Genetics, Weizmann Institute of Science, 7610001 Rehovot, Israel
| | - Schraga Schwartz
- Department of Molecular Genetics, Weizmann Institute of Science, 7610001 Rehovot, Israel
|
33
|
Abstract
DNA can determine where and when genes are expressed, but the full set of sequence determinants that control gene expression is unknown. Here, we measured the transcriptional activity of DNA sequences that represent an ~100 times larger sequence space than the human genome using massively parallel reporter assays (MPRAs). Machine learning models revealed that transcription factors (TFs) generally act in an additive manner with weak grammar and that most enhancers increase expression from a promoter by a mechanism that does not appear to involve specific TF–TF interactions. The enhancers themselves can be classified into three types: classical, closed chromatin and chromatin dependent. We also show that few TFs are strongly active in a cell, with most activities being similar between cell types. Individual TFs can have multiple gene regulatory activities, including chromatin opening and enhancing, promoting and determining transcription start site (TSS) activity, consistent with the view that the TF binding motif is the key atomic unit of gene expression. Analysis of massively parallel reporter assays measuring the transcriptional activity of DNA sequences indicates that most transcription factor (TF) activity is additive and does not rely on specific TF–TF interactions. Individual TFs can have different gene regulatory activities.
|
34
|
Bakoulis S, Krautz R, Alcaraz N, Salvatore M, Andersson R. OUP accepted manuscript. Nucleic Acids Res 2022; 50:2111-2127. [PMID: 35166831 PMCID: PMC8887488 DOI: 10.1093/nar/gkac088] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/21/2021] [Revised: 01/22/2022] [Accepted: 01/27/2022] [Indexed: 11/12/2022] Open
Affiliation(s)
| | | | - Nicolas Alcaraz
- The Bioinformatics Centre, Department of Biology, University of Copenhagen, 2200 Copenhagen, Denmark
- Novo Nordisk Foundation Center for Protein Research (CPR), University of Copenhagen, 2200 Copenhagen, Denmark
| | - Marco Salvatore
- The Bioinformatics Centre, Department of Biology, University of Copenhagen, 2200 Copenhagen, Denmark
| | - Robin Andersson
- To whom correspondence should be addressed. Tel: +45 35330245;
|
35
|
Cabaj A, Moszyńska A, Charzyńska A, Bartoszewski R, Dąbrowski M. Functional and HRE motifs count analysis of induction of selected hypoxia-responsive genes by HIF-1 and HIF-2 in human umbilical endothelial cells. Cell Signal 2021; 90:110209. [PMID: 34890779 DOI: 10.1016/j.cellsig.2021.110209] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/05/2021] [Revised: 11/12/2021] [Accepted: 11/27/2021] [Indexed: 12/19/2022]
Abstract
We analyzed the effects of selective knockdown of either HIF-1α or HIF-2α on the transcriptional response to hypoxia of human umbilical endothelial cells at two time-points (2 h and 8 h) of hypoxia. We focused on 13 previously identified hypoxia-responsive genes, pre-selected to have different activation kinetics and different proportions of HRE motifs annotated to either HIF-1 or HIF-2 in open promoters - open chromatin DNase-hypersensitive sites (DHS) regions within ±1 kb of the gene start. We report that genes activated by both HIF-1 and 2 tend to be activated earlier than genes activated by HIF-1 only, which, in turn, tend to be activated earlier than genes activated by HIF-2 only. Moreover, for the 13 analyzed genes, we found that the effect of silencing HIF1A on the gene induction by hypoxia is greater for the genes with more HRE motifs annotated to HIF-1 in their promoter open chromatin DHS regions within ±1 kb and also within ±10 kb of the gene start. We corroborated and extended this finding by showing that among 232 genes previously identified as activated by hypoxia, the genes with ChIP-seq peak(s) for HIF-1α within a ±10 kb flank of the gene start contain more HRE motifs annotated to HIF-1 in the DHS regions within this flank than the genes with no ChIP-seq peaks. Also in the whole genome, the DHS regions intersecting ChIP-seq peaks for HIF-1α contain more HRE motifs annotated to HIF-1 than the DHS regions not intersecting the ChIP-seq peaks. This suggests a mechanism, by which higher promoter content of HRE motifs in DHS regions increases HIF-1 binding, which in turn increases gene induction by hypoxia.
Affiliation(s)
- Aleksandra Cabaj
- Laboratory of Bioinformatics, Nencki Institute of Experimental Biology, Polish Academy of Sciences, ul. Pasteura 3, 02-093 Warsaw, Poland
| | - Adrianna Moszyńska
- Department of Biology and Pharmaceutical Botany, Medical University of Gdansk, Al. Gen. J. Hallera 107, 80-416 Gdansk, Poland
| | - Agata Charzyńska
- Laboratory of Bioinformatics, Nencki Institute of Experimental Biology, Polish Academy of Sciences, ul. Pasteura 3, 02-093 Warsaw, Poland
| | - Rafał Bartoszewski
- Department of Biology and Pharmaceutical Botany, Medical University of Gdansk, Al. Gen. J. Hallera 107, 80-416 Gdansk, Poland
| | - Michał Dąbrowski
- Laboratory of Bioinformatics, Nencki Institute of Experimental Biology, Polish Academy of Sciences, ul. Pasteura 3, 02-093 Warsaw, Poland.
|
36
|
Gebert M, Sobolewska A, Bartoszewska S, Cabaj A, Crossman DK, Króliczewski J, Madanecki P, Dąbrowski M, Collawn JF, Bartoszewski R. Genome-wide mRNA profiling identifies X-box-binding protein 1 (XBP1) as an IRE1 and PUMA repressor. Cell Mol Life Sci 2021; 78:7061-7080. [PMID: 34636989 PMCID: PMC8558229 DOI: 10.1007/s00018-021-03952-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 05/14/2021] [Revised: 09/17/2021] [Accepted: 09/28/2021] [Indexed: 02/06/2023]
Abstract
Accumulation of misfolded proteins in ER activates the unfolded protein response (UPR), a multifunctional signaling pathway that is important for cell survival. The UPR is regulated by three ER transmembrane sensors, one of which is inositol-requiring protein 1 (IRE1). IRE1 activates a transcription factor, X-box-binding protein 1 (XBP1), by removing a 26-base intron from XBP1 mRNA that generates spliced XBP1 mRNA (XBP1s). To search for XBP1 transcriptional targets, we utilized an XBP1s-inducible human cell line to limit XBP1 expression in a controlled manner. We also verified the identified XBP1-dependent genes with specific silencing of this transcription factor during pharmacological ER stress induction with both an N-linked glycosylation inhibitor (tunicamycin) and a non-competitive inhibitor of the sarco/endoplasmic reticulum Ca2+ ATPase (SERCA) (thapsigargin). We then compared those results to the XBP1s-induced cell line without pharmacological ER stress induction. Using next‐generation sequencing followed by bioinformatic analysis of XBP1-binding motifs, we defined an XBP1 regulatory network and identified XBP1 as a repressor of PUMA (a proapoptotic gene) and IRE1 mRNA expression during the UPR. Our results indicate impairing IRE1 activity during ER stress conditions accelerates cell death in ER-stressed cells, whereas elevating XBP1 expression during ER stress using an inducible cell line correlated with a clear prosurvival effect and reduced PUMA protein expression. Although further studies will be required to test the underlying molecular mechanisms involved in the relationship between these genes with XBP1, these studies identify a novel repressive role of XBP1 during the UPR.
Affiliation(s)
- Magdalena Gebert
- Department of Biology and Pharmaceutical Botany, Medical University of Gdansk, Hallera 107, 80-416, Gdansk, Poland
| | - Aleksandra Sobolewska
- Department of Biology and Pharmaceutical Botany, Medical University of Gdansk, Hallera 107, 80-416, Gdansk, Poland
| | - Sylwia Bartoszewska
- Department of Inorganic Chemistry, Medical University of Gdansk, Gdansk, Poland
| | - Aleksandra Cabaj
- Laboratory of Bioinformatics, Nencki Institute of Experimental Biology of the Polish Academy of Sciences, Warsaw, Poland
| | - David K Crossman
- Department of Genetics, Heflin Center for Genomic Science, University of Alabama at Birmingham, Birmingham, AL, 35233, USA
| | - Jarosław Króliczewski
- Department of Biology and Pharmaceutical Botany, Medical University of Gdansk, Hallera 107, 80-416, Gdansk, Poland
| | - Piotr Madanecki
- Department of Biology and Pharmaceutical Botany, Medical University of Gdansk, Hallera 107, 80-416, Gdansk, Poland
| | - Michał Dąbrowski
- Laboratory of Bioinformatics, Nencki Institute of Experimental Biology of the Polish Academy of Sciences, Warsaw, Poland
| | - James F Collawn
- Department of Cell, Developmental and Integrative Biology, University of Alabama at Birmingham, Birmingham, AL, 35233, USA
| | - Rafal Bartoszewski
- Department of Biology and Pharmaceutical Botany, Medical University of Gdansk, Hallera 107, 80-416, Gdansk, Poland.
|
37
|
Umarov R, Li Y, Arakawa T, Takizawa S, Gao X, Arner E. ReFeaFi: Genome-wide prediction of regulatory elements driving transcription initiation. PLoS Comput Biol 2021; 17:e1009376. [PMID: 34491989 PMCID: PMC8448322 DOI: 10.1371/journal.pcbi.1009376] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/22/2021] [Revised: 09/17/2021] [Accepted: 08/23/2021] [Indexed: 11/19/2022] Open
Abstract
Regulatory elements control gene expression through transcription initiation (promoters) and by enhancing transcription at distant regions (enhancers). Accurate identification of regulatory elements is fundamental for annotating genomes and understanding gene expression patterns. While there are many attempts to develop computational promoter and enhancer identification methods, reliable tools to analyze long genomic sequences are still lacking. Prediction methods often perform poorly on the genome-wide scale because the number of negatives is much higher than that in the training sets. To address this issue, we propose a dynamic negative set updating scheme with a two-model approach, using one model for scanning the genome and the other one for testing candidate positions. The developed method achieves good genome-level performance and maintains robust performance when applied to other vertebrate species, without re-training. Moreover, the unannotated predicted regulatory regions made on the human genome are enriched for disease-associated variants, suggesting them to be potentially true regulatory elements rather than false positives. We validated high scoring "false positive" predictions using reporter assay and all tested candidates were successfully validated, demonstrating the ability of our method to discover novel human regulatory regions.
Affiliation(s)
- Ramzan Umarov
- Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Japan
| | - Yu Li
- Department of Computer Science and Engineering (CSE), The Chinese University of Hong Kong (CUHK), Hong Kong, People’s Republic of China
| | - Takahiro Arakawa
- Laboratory for Applied Regulatory Genomics Network Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
| | - Satoshi Takizawa
- Laboratory for Applied Regulatory Genomics Network Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
| | - Xin Gao
- King Abdullah University of Science and Technology, Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences and Engineering Division, Thuwal, Saudi Arabia
| | - Erik Arner
- Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Japan
- Laboratory for Applied Regulatory Genomics Network Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
|
38
|
Pimmett VL, Dejean M, Fernandez C, Trullo A, Bertrand E, Radulescu O, Lagha M. Quantitative imaging of transcription in living Drosophila embryos reveals the impact of core promoter motifs on promoter state dynamics. Nat Commun 2021; 12:4504. [PMID: 34301936 PMCID: PMC8302612 DOI: 10.1038/s41467-021-24461-6] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 08/04/2020] [Accepted: 03/31/2021] [Indexed: 11/09/2022] Open
Abstract
Genes are expressed in stochastic transcriptional bursts linked to alternating active and inactive promoter states. A major challenge in transcription is understanding how promoter composition dictates bursting, particularly in multicellular organisms. We investigate two key Drosophila developmental promoter motifs, the TATA box (TATA) and the Initiator (INR). Using live imaging in Drosophila embryos and new computational methods, we demonstrate that bursting occurs on multiple timescales ranging from seconds to minutes. TATA-containing promoters and INR-containing promoters exhibit distinct dynamics, with one or two separate rate-limiting steps respectively. A TATA box is associated with long active states, high rates of polymerase initiation, and short-lived, infrequent inactive states. In contrast, the INR motif leads to two inactive states, one of which relates to promoter-proximal polymerase pausing. Surprisingly, the model suggests pausing is not obligatory, but occurs stochastically for a subset of polymerases. Overall, our results provide a rationale for promoter switching during zygotic genome activation.
Affiliation(s)
- Virginia L Pimmett
- Institut de Génétique Moléculaire de Montpellier, Univ Montpellier, CNRS, Montpellier, France
| | - Matthieu Dejean
- Institut de Génétique Moléculaire de Montpellier, Univ Montpellier, CNRS, Montpellier, France
| | - Carola Fernandez
- Institut de Génétique Moléculaire de Montpellier, Univ Montpellier, CNRS, Montpellier, France
| | - Antonio Trullo
- Institut de Génétique Moléculaire de Montpellier, Univ Montpellier, CNRS, Montpellier, France
| | - Edouard Bertrand
- Institut de Génétique Moléculaire de Montpellier, Univ Montpellier, CNRS, Montpellier, France
- Institut de Génétique Humaine, Univ Montpellier, CNRS, Montpellier, France
| | - Ovidiu Radulescu
- Laboratory of Pathogen Host Interactions, Univ Montpellier, CNRS, Montpellier, France
| | - Mounia Lagha
- Institut de Génétique Moléculaire de Montpellier, Univ Montpellier, CNRS, Montpellier, France.
|
39
|
Zheng SQ, Chen HX, Liu XC, Yang Q, He GW. Identification of variants of ISL1 gene promoter and cellular functions in isolated ventricular septal defects. Am J Physiol Cell Physiol 2021; 321:C443-C452. [PMID: 34260301 DOI: 10.1152/ajpcell.00167.2021] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/22/2022]
Abstract
Ventricular septal defects (VSDs) are the most common congenital heart defects (CHDs). Studies have documented that ISL1 has a crucial impact on cardiac growth, but the role of variants in the ISL1 gene promoter in patients with VSD has not been explored. In 400 subjects (200 patients with isolated and sporadic VSDs and 200 healthy controls), we investigated ISL1 gene promoter variants and performed cellular functional experiments by using the dual-luciferase reporter assay to verify the impact on gene expression. In the ISL1 promoter, five variants were found only in patients with VSD by sequencing. Cellular functional experiments demonstrated that three variants decreased the transcriptional activity of the ISL1 promoter (P < 0.05). Further analysis with the online JASPAR database demonstrated that a cluster of putative binding sites for transcription factors may be altered by these variants, possibly resulting in a change in ISL1 protein expression and VSD formation. Our study has, for the first time, identified novel variants in the ISL1 gene promoter region in Han Chinese patients with isolated and sporadic VSD. In addition, the cellular functional experiments, electrophoretic mobility shift assay, and bioinformatic analysis have demonstrated that these variants significantly alter the expression of the ISL1 gene and affect the binding of transcription factors, likely resulting in VSD. Therefore, this study may provide new insights into the role of the gene promoter region for a better understanding of the genetic basis of the formation of CHDs and may promote further investigations on the mechanism of the formation of CHDs.
Affiliation(s)
- Si-Qiang Zheng
- The Institute of Cardiovascular Diseases & Department of Cardiovascular Surgery, TEDA International Cardiovascular Hospital, Tianjin University & Chinese Academy of Medical Sciences, Tianjin, People's Republic of China
| | - Huan-Xin Chen
- The Institute of Cardiovascular Diseases & Department of Cardiovascular Surgery, TEDA International Cardiovascular Hospital, Tianjin University & Chinese Academy of Medical Sciences, Tianjin, People's Republic of China
| | - Xiao-Cheng Liu
- The Institute of Cardiovascular Diseases & Department of Cardiovascular Surgery, TEDA International Cardiovascular Hospital, Tianjin University & Chinese Academy of Medical Sciences, Tianjin, People's Republic of China
| | - Qin Yang
- The Institute of Cardiovascular Diseases & Department of Cardiovascular Surgery, TEDA International Cardiovascular Hospital, Tianjin University & Chinese Academy of Medical Sciences, Tianjin, People's Republic of China
| | - Guo-Wei He
- The Institute of Cardiovascular Diseases & Department of Cardiovascular Surgery, TEDA International Cardiovascular Hospital, Tianjin University & Chinese Academy of Medical Sciences, Tianjin, People's Republic of China
- Drug Research and Development Center, Wannan Medical College, Wuhu, People's Republic of China
- Department of Surgery, Oregon Health and Science University, Portland, Oregon
|
40
|
Fan K, Moore JE, Zhang XO, Weng Z. Genetic and epigenetic features of promoters with ubiquitous chromatin accessibility support ubiquitous transcription of cell-essential genes. Nucleic Acids Res 2021; 49:5705-5725. [PMID: 33978759 PMCID: PMC8191798 DOI: 10.1093/nar/gkab345] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/02/2021] [Revised: 03/19/2021] [Accepted: 05/01/2021] [Indexed: 12/04/2022] Open
Abstract
Gene expression is controlled by regulatory elements within accessible chromatin. Although most regulatory elements are cell type-specific, a subset is accessible in nearly all the 517 human and 94 mouse cell and tissue types assayed by the ENCODE consortium. We systematically analyzed 9000 human and 8000 mouse ubiquitously-accessible candidate cis-regulatory elements (cCREs) with promoter-like signatures (PLSs) from ENCODE, which we denote ubi-PLSs. These are more CpG-rich than non-ubi-PLSs and correspond to genes with ubiquitously high transcription, including a majority of cell-essential genes. ubi-PLSs are enriched with motifs of ubiquitously-expressed transcription factors and preferentially bound by transcriptional cofactors regulating ubiquitously-expressed genes. They are highly conserved between human and mouse at the synteny level but exhibit frequent turnover of motif sites; accordingly, ubi-PLSs show increased variation at their centers compared with flanking regions among the ∼186 thousand human genomes sequenced by the TOPMed project. Finally, ubi-PLSs are enriched in genes implicated in Mendelian diseases, especially diseases broadly impacting most cell types, such as deficiencies in mitochondrial functions. Thus, a set of roughly 9000 mammalian promoters are actively maintained in an accessible state across cell types by a distinct set of transcription factors and cofactors to ensure the transcriptional programs of cell-essential genes.
Affiliation(s)
- Kaili Fan
- Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
| | - Jill E Moore
- Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
| | - Xiao-ou Zhang
- Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
|
41
|
Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network. Nat Commun 2021; 12:3297. [PMID: 34078885 PMCID: PMC8172540 DOI: 10.1038/s41467-021-23143-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/15/2020] [Accepted: 04/13/2021] [Indexed: 02/04/2023] Open
Abstract
Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism.
|
42
|
Jores T, Tonnies J, Wrightsman T, Buckler ES, Cuperus JT, Fields S, Queitsch C. Synthetic promoter designs enabled by a comprehensive analysis of plant core promoters. Nat Plants 2021; 7:842-855. [PMID: 34083762 PMCID: PMC10246763 DOI: 10.1038/s41477-021-00932-y] [Citation(s) in RCA: 54] [Impact Index Per Article: 18.0] [Received: 01/13/2021] [Accepted: 04/27/2021] [Indexed: 05/24/2023]
Abstract
Targeted engineering of plant gene expression holds great promise for ensuring food security and for producing biopharmaceuticals in plants. However, this engineering requires thorough knowledge of cis-regulatory elements to precisely control either endogenous or introduced genes. To generate this knowledge, we used a massively parallel reporter assay to measure the activity of nearly complete sets of promoters from Arabidopsis, maize and sorghum. We demonstrate that core promoter elements, notably the TATA box, as well as promoter GC content and promoter-proximal transcription factor binding sites influence promoter strength. By performing the experiments in two assay systems, leaves of the dicot tobacco and protoplasts of the monocot maize, we detect species-specific differences in the contributions of GC content and transcription factors to promoter strength. Using these observations, we built computational models to predict promoter strength in both assay systems, allowing us to design highly active promoters comparable in activity to the viral 35S minimal promoter. Our results establish a promising experimental approach to optimize native promoter elements and generate synthetic ones with desirable features.
Affiliation(s)
- Tobias Jores
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Jackson Tonnies
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Graduate Program in Biology, University of Washington, Seattle, WA, USA
| | - Travis Wrightsman
- Section of Plant Breeding and Genetics, Cornell University, Ithaca, NY, USA
| | - Edward S Buckler
- Section of Plant Breeding and Genetics, Cornell University, Ithaca, NY, USA
- Agricultural Research Service, United States Department of Agriculture, Ithaca, NY, USA
- Institute for Genomic Diversity, Cornell University, Ithaca, NY, USA
| | - Josh T Cuperus
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
| | - Stanley Fields
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
- Department of Medicine, University of Washington, Seattle, WA, USA.
| | - Christine Queitsch
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
|
43
|
van den Akker GGH, Zacchini F, Housmans BAC, van der Vloet L, Caron MMJ, Montanaro L, Welting TJM. Current Practice in Bicistronic IRES Reporter Use: A Systematic Review. Int J Mol Sci 2021; 22:5193. [PMID: 34068921 PMCID: PMC8156625 DOI: 10.3390/ijms22105193] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/25/2021] [Revised: 05/05/2021] [Accepted: 05/12/2021] [Indexed: 12/26/2022] Open
Abstract
Bicistronic reporter assays have been instrumental for transgene expression, understanding of internal ribosomal entry site (IRES) translation, and identification of novel cap-independent translational elements (CITE). We observed a large methodological variability in the use of bicistronic reporter assays and data presentation or normalization procedures. Therefore, we systematically searched the literature for bicistronic IRES reporter studies and analyzed methodological details, data visualization, and normalization procedures. Two hundred fifty-seven publications were identified using our search strategy (published 1994-2020). Experimental studies on eukaryotic adherent cell systems and the cell-free translation assay were included for further analysis. We evaluated the following methodological details for 176 full text articles: the bicistronic reporter design, the cell line or type, transfection methods, and time point of analyses post-transfection. For the cell-free translation assay, we focused on methods of in vitro transcription, type of translation lysate, and incubation times and assay temperature. Data can be presented in multiple ways: raw data from individual cistrons, a ratio of the two, or fold changes thereof. In addition, many different normalization and control experiments have been suggested when studying IRES-mediated translation. Therefore, we also categorized and summarized their use. Our unbiased analyses provide a representative overview of bicistronic IRES reporter use. We identified parameters that were reported inconsistently or incompletely, which could hamper data reproduction and interpretation. On the basis of our analyses, we encourage adhering to a number of practices that should improve transparency of bicistronic reporter data presentation and improve methodological descriptions to facilitate data replication.
Affiliation(s)
- Guus Gijsbertus Hubert van den Akker
- Department of Orthopedic Surgery, Maastricht University, Medical Center+, 6229 ER Maastricht, The Netherlands
| | - Federico Zacchini
- Department of Experimental, Diagnostic and Specialty Medicine, Bologna University, I-40138 Bologna, Italy
- Centro di Ricerca Biomedica Applicata—CRBA, Bologna University, Policlinico di Sant’Orsola, I-40138 Bologna, Italy
| | - Bas Adrianus Catharina Housmans
- Department of Orthopedic Surgery, Maastricht University, Medical Center+, 6229 ER Maastricht, The Netherlands
| | - Laura van der Vloet
- Department of Orthopedic Surgery, Maastricht University, Medical Center+, 6229 ER Maastricht, The Netherlands
| | - Marjolein Maria Johanna Caron
- Department of Orthopedic Surgery, Maastricht University, Medical Center+, 6229 ER Maastricht, The Netherlands
| | - Lorenzo Montanaro
- Department of Experimental, Diagnostic and Specialty Medicine, Bologna University, I-40138 Bologna, Italy
- Centro di Ricerca Biomedica Applicata—CRBA, Bologna University, Policlinico di Sant’Orsola, I-40138 Bologna, Italy
- Programma Dipartimentale in Medicina di Laboratorio, IRCCS Azienda Ospedaliero-Universitaria di Bologna, Via Albertoni 15, I-40138 Bologna, Italy
| | - Tim Johannes Maria Welting
- Department of Orthopedic Surgery, Maastricht University, Medical Center+, 6229 ER Maastricht, The Netherlands
|
44
|
Parisi C, Vashisht S, Winata CL. Fish-Ing for Enhancers in the Heart. Int J Mol Sci 2021; 22:3914. [PMID: 33920121 PMCID: PMC8069060 DOI: 10.3390/ijms22083914] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/15/2021] [Revised: 04/07/2021] [Accepted: 04/08/2021] [Indexed: 12/19/2022] Open
Abstract
Precise control of gene expression is crucial to ensure proper development and biological functioning of an organism. Enhancers are non-coding DNA elements which play an essential role in regulating gene expression. They contain specific sequence motifs serving as binding sites for transcription factors which interact with the basal transcription machinery at their target genes. Heart development is regulated by an intricate gene regulatory network ensuring a precise spatiotemporal gene expression program. Mutations affecting enhancers have been shown to result in devastating forms of congenital heart defect. Therefore, identifying enhancers implicated in heart biology and understanding their mechanism is key to improving diagnosis and therapeutic options. Despite their crucial role, enhancers are poorly studied, mainly due to a lack of reliable ways to identify them and determine their function. Nevertheless, recent technological advances have allowed rapid progress in enhancer discovery. Model organisms such as the zebrafish have contributed significant insights into the genetics of heart development through enabling functional analyses of genes and their regulatory elements in vivo. Here, we summarize the current state of knowledge on heart enhancers gained through studies in model organisms, discuss various approaches to discover and study their function, and finally suggest methods that could further advance research in this field.
Affiliation(s)
- Costantino Parisi
- International Institute of Molecular and Cell Biology in Warsaw, 02-109 Warsaw, Poland
| | - Shikha Vashisht
- International Institute of Molecular and Cell Biology in Warsaw, 02-109 Warsaw, Poland
| | - Cecilia Lanny Winata
- International Institute of Molecular and Cell Biology in Warsaw, 02-109 Warsaw, Poland
- Max Planck Institute for Heart and Lung Research, 61231 Bad Nauheim, Germany
|
45
|
What do Transcription Factors Interact With? J Mol Biol 2021; 433:166883. [PMID: 33621520 DOI: 10.1016/j.jmb.2021.166883] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 12/02/2020] [Revised: 02/09/2021] [Accepted: 02/13/2021] [Indexed: 12/11/2022]
Abstract
Although we have made significant progress, we still possess a limited understanding of how genomic and epigenomic information directs gene expression programs through sequence-specific transcription factors (TFs). Extensive research has settled on three general classes of TF targets in metazoans: promoter accessibility via chromatin regulation (e.g., SAGA), assembly of the general transcription factors on promoter DNA (e.g., TFIID), and recruitment of RNA polymerase (Pol) II (e.g., Mediator) to establish a transcription pre-initiation complex (PIC). Here we discuss TFs and their targets. We also place this in the context of our current work with Saccharomyces (yeast), where we find that promoters typically lack an architecture that supports TF function. Moreover, yeast promoters that support TF binding also display interactions with cofactors like SAGA and Mediator, but not TFIID. It is unknown to what extent all genes in metazoans require TFs and their cofactors.
|
46
|
Bonny AR, Fonseca JP, Park JE, El-Samad H. Orthogonal control of mean and variability of endogenous genes in a human cell line. Nat Commun 2021; 12:292. [PMID: 33436569 PMCID: PMC7804932 DOI: 10.1038/s41467-020-20467-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/05/2020] [Accepted: 11/25/2020] [Indexed: 12/11/2022] Open
Abstract
Stochastic fluctuations at the transcriptional level contribute to isogenic cell-to-cell heterogeneity in mammalian cell populations. However, we still have no clear understanding of the repercussions of this heterogeneity, given the lack of tools to independently control mean expression and variability of a gene. Here, we engineer a synthetic circuit to modulate mean expression and heterogeneity of transgenes and endogenous human genes. The circuit, a Tunable Noise Rheostat (TuNR), consists of a transcriptional cascade of two inducible transcriptional activators, where the output mean and variance can be modulated by two orthogonal small molecule inputs. In this fashion, different combinations of the inputs can achieve the same mean but with different population variability. With TuNR, we achieve low basal expression, over 1000-fold expression of a transgene product, and up to 7-fold induction of the endogenous gene NGFR. Importantly, for the same mean expression level, we are able to establish varying degrees of heterogeneity in expression within an isogenic population, thereby decoupling gene expression noise from its mean. TuNR is therefore a modular tool that can be used in mammalian cells to enable direct interrogation of the implications of cell-to-cell variability.
Affiliation(s)
- Alain R Bonny
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, 94158, USA
| | - João Pedro Fonseca
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, 94158, USA
- Amyris Bio Products Portugal, Porto, Portugal
| | - Jesslyn E Park
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, 94158, USA
| | - Hana El-Samad
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA, 94158, USA
- Chan Zuckerberg Biohub, San Francisco, CA, 94158, USA
| |
|
47
|
Yu TC, Liu WL, Brinck MS, Davis JE, Shek J, Bower G, Einav T, Insigne KD, Phillips R, Kosuri S, Urtecho G. Multiplexed characterization of rationally designed promoter architectures deconstructs combinatorial logic for IPTG-inducible systems. Nat Commun 2021; 12:325. [PMID: 33436562 PMCID: PMC7804116 DOI: 10.1038/s41467-020-20094-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 02/14/2020] [Accepted: 11/04/2020] [Indexed: 12/21/2022] Open
Abstract
A crucial step towards engineering biological systems is the ability to precisely tune the genetic response to environmental stimuli. In the case of Escherichia coli inducible promoters, our incomplete understanding of the relationship between sequence composition and gene expression hinders our ability to predictably control transcriptional responses. Here, we profile the expression dynamics of 8269 rationally designed, IPTG-inducible promoters that collectively explore the individual and combinatorial effects of RNA polymerase and LacI repressor binding site strengths. We then fit a statistical mechanics model to measured expression that accurately models gene expression and reveals properties of theoretically optimal inducible promoters. Furthermore, we characterize three alternative promoter architectures and show that repositioning binding sites within promoters influences the types of combinatorial effects observed between promoter elements. In total, this approach enables us to deconstruct relationships between inducible promoter elements and discover practical insights for engineering inducible promoters with desirable characteristics.
Affiliation(s)
- Timothy C Yu
- Department of Bioengineering, University of California, Los Angeles, CA, 90095, USA
| | - Winnie L Liu
- Department of Molecular, Cell, and Developmental Biology, University of California, Los Angeles, CA, 90095, USA
| | - Marcia S Brinck
- Department of Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, CA, 90095, USA
| | - Jessica E Davis
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, 90095, USA
| | - Jeremy Shek
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, 90095, USA
| | - Grace Bower
- Department of Molecular, Cell, and Developmental Biology, University of California, Los Angeles, CA, 90095, USA
| | - Tal Einav
- Department of Physics, California Institute of Technology, Pasadena, CA, 91125, USA
| | - Kimberly D Insigne
- Bioinformatics Interdepartmental Graduate Program, University of California, Los Angeles, CA, 90095, USA
| | - Rob Phillips
- Department of Physics, California Institute of Technology, Pasadena, CA, 91125, USA
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
- Department of Applied Physics, California Institute of Technology, Pasadena, CA, 91125, USA
| | - Sriram Kosuri
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA, 90095, USA
- Institute for Quantitative and Computational Biosciences (QCB), University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, 90095, USA
- Molecular Biology Interdepartmental Doctoral Program, University of California, Los Angeles, CA, 90095, USA
| | - Guillaume Urtecho
- Molecular Biology Interdepartmental Doctoral Program, University of California, Los Angeles, CA, 90095, USA
| |
|
48
|
Al-Obaide MAI, Al-Obaidi II, Vasylyeva TL. Unexplored regulatory sequences of divergently paired GLA and HNRNPH2 loci pertinent to Fabry disease in human kidney and skin cells: Presence of an active bidirectional promoter. Exp Ther Med 2020; 21:154. [PMID: 33456521 PMCID: PMC7792484 DOI: 10.3892/etm.2020.9586] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/09/2020] [Accepted: 12/01/2020] [Indexed: 12/21/2022] Open
Abstract
Fabry disease (FD) is a rare hereditary disorder characterized by a wide range of symptoms caused by a variety of mutations in the galactosidase α (GLA) gene. The heterogeneous nuclear ribonucleoprotein (HNRNPH2) gene is divergently paired with GLA on chromosome X and is thought to be implicated in FD. However, insufficient information is available on the regulatory mechanisms associated with the expression of HNRNPH2 and the GLA loci. Therefore, the current study performed bioinformatics analyses to assess the GLA and HNRNPH2 loci and investigate the regulatory mechanisms involved in the expression of each gene. The regulatory mechanisms underlying GLA and HNRNPH2 were revealed. The expression of each gene was associated with a bidirectional promoter (BDP) characterized by the absence of TATA box motifs and the presence of specific transcription factor binding sites (TFBSs) and a CpG Island (CGI). The nuclear run-on transcription assay confirmed the activity of BDP GLA and HNRNPH2 transcription in 293T cells. Methylation-specific PCR analysis demonstrated a statistically significant variation in the DNA methylation pattern of BDP in several cell lines, including human adult epidermal keratinocytes (AEKs), human renal glomerular endothelial cells, human renal epithelial cells and 293T cells. The highest observed significance was demonstrated in AEKs (P<0.05). The results of the chromatin-immunoprecipitation assay using 293T cells identified specific TFBS motifs for Yin Yang 1 and nuclear respiratory factor 1 transcription factors in BDPs. The National Center for Biotechnology Information-single nucleotide polymorphism database revealed pathogenic variants in the BDP sequence. Additionally, a previously reported variant associated with a severe heterozygous female case of GLA FD was mapped in BDP. The results of the present study suggested that the expression of the divergent paired loci, GLA and HNRNPH2, was controlled by BDP. Mutations in BDP may also serve a role in FD and may explain clinical disease diversity.
Affiliation(s)
- Mohammed A Ibrahim Al-Obaide
- Department of Pediatrics, School of Medicine, Texas Tech University Health Sciences Center, Amarillo, TX 79106, USA
| | - Ibtisam I Al-Obaidi
- Department of Pediatrics, School of Medicine, Texas Tech University Health Sciences Center, Amarillo, TX 79106, USA
| | - Tetyana L Vasylyeva
- Department of Pediatrics, School of Medicine, Texas Tech University Health Sciences Center, Amarillo, TX 79106, USA
| |
|
49
|
Molecular and evolutionary processes generating variation in gene expression. Nat Rev Genet 2020; 22:203-215. [PMID: 33268840 DOI: 10.1038/s41576-020-00304-w] [Citation(s) in RCA: 103] [Impact Index Per Article: 25.8] [Accepted: 10/21/2020] [Indexed: 12/18/2022]
Abstract
Heritable variation in gene expression is common within and between species. This variation arises from mutations that alter the form or function of molecular gene regulatory networks that are then filtered by natural selection. High-throughput methods for introducing mutations and characterizing their cis- and trans-regulatory effects on gene expression (particularly, transcription) are revealing how different molecular mechanisms generate regulatory variation, and studies comparing these mutational effects with variation seen in the wild are teasing apart the role of neutral and non-neutral evolutionary processes. This integration of molecular and evolutionary biology allows us to understand how the variation in gene expression we see today came to be and to predict how it is most likely to evolve in the future.
|
50
|
Renganaath K, Chong R, Day L, Kosuri S, Kruglyak L, Albert FW. Systematic identification of cis-regulatory variants that cause gene expression differences in a yeast cross. eLife 2020; 9:e62669. [PMID: 33179598 PMCID: PMC7685706 DOI: 10.7554/elife.62669] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/02/2020] [Accepted: 11/11/2020] [Indexed: 02/06/2023] Open
Abstract
Sequence variation in regulatory DNA alters gene expression and shapes genetically complex traits. However, the identification of individual, causal regulatory variants is challenging. Here, we used a massively parallel reporter assay to measure the cis-regulatory consequences of 5832 natural DNA variants in the promoters of 2503 genes in the yeast Saccharomyces cerevisiae. We identified 451 causal variants, which underlie genetic loci known to affect gene expression. Several promoters harbored multiple causal variants. In five promoters, pairs of variants showed non-additive, epistatic interactions. Causal variants were enriched at conserved nucleotides, tended to have low derived allele frequency, and were depleted from promoters of essential genes, which is consistent with the action of negative selection. Causal variants were also enriched for alterations in transcription factor binding sites. Models integrating these features provided modest, but statistically significant, ability to predict causal variants. This work revealed a complex molecular basis for cis-acting regulatory variation.
Affiliation(s)
- Kaushik Renganaath
- Department of Genetics, Cell Biology, & Development, University of Minnesota, Minneapolis, United States
| | - Rockie Chong
- Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, United States
| | - Laura Day
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, United States
- Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, United States
- Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, United States
| | - Sriram Kosuri
- Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, United States
| | - Leonid Kruglyak
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, United States
- Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, United States
- Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, United States
| | - Frank W Albert
- Department of Genetics, Cell Biology, & Development, University of Minnesota, Minneapolis, United States
| |
|