1
Asma H, Liu L, Halfon MS. SCRMshaw: Supervised cis-regulatory module prediction for insect genomes. PLoS One 2024; 19:e0311752. [PMID: 39637210 PMCID: PMC11620701 DOI: 10.1371/journal.pone.0311752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2024] [Accepted: 09/24/2024] [Indexed: 12/07/2024] Open
Abstract
As the number of sequenced insect genomes continues to grow, there is a pressing need for rapid and accurate annotation of their regulatory component. SCRMshaw is a computational tool designed to predict cis-regulatory modules ("enhancers") in the genomes of various insect species. A key advantage of SCRMshaw is its accessibility. It requires minimal resources: just a genome sequence and training data from known Drosophila regulatory sequences, which are readily available for download. Even users with modest computational skills can run SCRMshaw on a desktop computer for basic applications, although a high-performance computing cluster is recommended for optimal results. SCRMshaw can be tailored to specific needs: users can employ a single set of training data to predict enhancers associated with a particular gene expression pattern, or utilize multiple sets to provide a first-pass regulatory annotation for a newly sequenced genome. This protocol provides an extensive update to the previously published SCRMshaw protocol and aligns with the methods used in a recent annotation of over 30 insect regulatory genomes. It includes the most recent modifications to the SCRMshaw protocol and details an end-to-end pipeline that begins with a sequenced genome and ends with a fully annotated regulatory genome. Relevant scripts are available via GitHub, and a living protocol that will be updated as necessary is linked to this article at protocols.io.
Affiliation(s)
- Hasiba Asma
  - Departments of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY, United States of America
- Luna Liu
  - Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY, United States of America
- Marc S. Halfon
  - Departments of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY, United States of America
  - Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY, United States of America
  - Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY, United States of America
2
Asma H, Tieke E, Deem KD, Rahmat J, Dong T, Huang X, Tomoyasu Y, Halfon MS. Regulatory genome annotation of 33 insect species. eLife 2024; 13:RP96738. [PMID: 39392676 PMCID: PMC11469670 DOI: 10.7554/elife.96738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2024] Open
Abstract
Annotation of newly sequenced genomes frequently includes genes, but rarely covers important non-coding genomic features such as the cis-regulatory modules (e.g., enhancers and silencers) that regulate gene expression. Here, we begin to remedy this situation by developing a workflow for rapid initial annotation of insect regulatory sequences, and provide a searchable database resource with enhancer predictions for 33 genomes. Using our previously developed SCRMshaw computational enhancer prediction method, we predict over 2.8 million regulatory sequences along with the tissues where they are expected to be active, in a set of insect species ranging over 360 million years of evolution. Extensive analysis and validation of the data provides several lines of evidence suggesting that we achieve a high true-positive rate for enhancer prediction. One, we show that our predictions target specific loci, rather than random genomic locations. Two, we predict enhancers in orthologous loci across a diverged set of species to a significantly higher degree than random expectation would allow. Three, we demonstrate that our predictions are highly enriched for regions of accessible chromatin. Four, we achieve a validation rate in excess of 70% using in vivo reporter gene assays. As we continue to annotate both new tissues and new species, our regulatory annotation resource will provide a rich source of data for the research community and will have utility for both small-scale (single gene, single species) and large-scale (many genes, many species) studies of gene regulation. In particular, the ability to search for functionally related regulatory elements in orthologous loci should greatly facilitate studies of enhancer evolution even among distantly related species.
Affiliation(s)
- Hasiba Asma
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, United States
- Ellen Tieke
  - Department of Biology, Miami University, Oxford, United States
- Kevin D Deem
  - Department of Biology, Miami University, Oxford, United States
- Jabale Rahmat
  - Department of Biology, Miami University, Oxford, United States
- Tiffany Dong
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, United States
- Xinbo Huang
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, United States
- Marc S Halfon
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, United States
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, United States
  - Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, United States
  - Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, United States
3
Schember I, Reid W, Sterling-Lentsch G, Halfon MS. Conserved and novel enhancers in the Aedes aegypti single-minded locus recapitulate embryonic ventral midline gene expression. PLoS Genet 2024; 20:e1010891. [PMID: 38683842 PMCID: PMC11081499 DOI: 10.1371/journal.pgen.1010891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 05/09/2024] [Accepted: 04/16/2024] [Indexed: 05/02/2024] Open
Abstract
Transcriptional cis-regulatory modules, e.g., enhancers, control the time and location of metazoan gene expression. While changes in enhancers can provide a powerful force for evolution, there is also significant deep conservation of enhancers for developmentally important genes, with function and sequence characteristics maintained over hundreds of millions of years of divergence. Not well understood, however, is how the overall regulatory composition of a locus evolves, with important outstanding questions such as how many enhancers are conserved vs. novel, and to what extent are the locations of conserved enhancers within a locus maintained? We begin here to address these questions with a comparison of the respective single-minded (sim) loci in the two dipteran species Drosophila melanogaster (fruit fly) and Aedes aegypti (mosquito). sim encodes a highly conserved transcription factor that mediates development of the arthropod embryonic ventral midline. We identify two enhancers in the A. aegypti sim locus and demonstrate that they function equivalently in both transgenic flies and transgenic mosquitoes. One A. aegypti enhancer is highly similar to known Drosophila counterparts in its activity, location, and autoregulatory capability. The other differs from any known Drosophila sim enhancers with a novel location, failure to autoregulate, and regulation of expression in a unique subset of midline cells. Our results suggest that the conserved pattern of sim expression in the two species is the result of both conserved and novel regulatory sequences. Further examination of this locus will help to illuminate how the overall regulatory landscape of a conserved developmental gene evolves.
Affiliation(s)
- Isabella Schember
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, New York, United States of America
- William Reid
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, New York, United States of America
- Geyenna Sterling-Lentsch
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, New York, United States of America
- Marc S. Halfon
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, New York, United States of America
  - Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, New York, United States of America
  - Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, New York, United States of America
  - New York State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, New York, United States of America
4
Guffart E, Prinz M. Evolution of Microglia. Advances in Neurobiology 2024; 37:39-51. [PMID: 39207685 DOI: 10.1007/978-3-031-55529-9_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024]
Abstract
Microglial cells are unique tissue-resident macrophages located in the parenchyma of the central nervous system (CNS). A recent comparative transcriptional study on microglia across more than 20 species, from leech through chicken and many more up to humans, revealed multiple conserved features. The results indicate the imperative role of microglia over the last 500 million years (Geirsdottir et al. Cell 181:746, 2020). Improved understanding of microglial evolution provides essential insights into conserved and divergent microglial pathways and will have implications for future development of microglia-based therapies to treat CNS disorders. Not only may therapeutic approaches be rethought, but the understanding of sex specificity of the immune system within the CNS also needs to be renewed. Besides revealing the highly detailed characteristics of microglia, the former paradigm of microglia being the only CNS-resident immune cells was rendered outdated by the identification of CNS-associated macrophages (CAMs) as CNS interface residents, which, most likely, accompanied microglia in evolution over the past million years.
Affiliation(s)
- Elena Guffart
  - Institute of Neuropathology, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Marco Prinz
  - Institute of Neuropathology, Faculty of Medicine, University of Freiburg, Freiburg, Germany
  - Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Freiburg, Germany
  - Center for Basics in NeuroModulation (NeuroModulBasics), Faculty of Medicine, University of Freiburg, Freiburg, Germany
5
Garza AB, Garcia R, Solis LM, Halfon MS, Girgis HZ. EnhancerTracker: Comparing cell-type-specific enhancer activity of DNA sequence triplets via an ensemble of deep convolutional neural networks. bioRxiv: The Preprint Server for Biology 2023:2023.12.23.573198. [PMID: 38187673 PMCID: PMC10769370 DOI: 10.1101/2023.12.23.573198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Motivation: Transcriptional enhancers - unlike promoters - are unrestrained by distance or strand orientation with respect to their target genes, making their computational identification a challenge. Further, there are insufficient numbers of confirmed enhancers for many cell types, preventing robust training of machine-learning-based models for enhancer prediction for such cell types. Results: We present EnhancerTracker, a novel tool that leverages an ensemble of deep separable convolutional neural networks to identify cell-type-specific enhancers while requiring only two confirmed enhancers. EnhancerTracker is trained, validated, and tested on 52,789 putative enhancers obtained from the FANTOM5 Project and control sequences derived from the human genome. Unlike available tools, which accept one sequence at a time, the input to our tool is three sequences; the first two are enhancers active in the same cell type. EnhancerTracker outputs 1 if the third sequence is an enhancer active in the same cell type(s) where the first two enhancers are active. It outputs 0 otherwise. On a held-out set (15%), EnhancerTracker achieved an accuracy of 64%, a specificity of 93%, a recall of 35%, a precision of 84%, and an F1 score of 49%. Availability and implementation: https://github.com/BioinformaticsToolsmith/EnhancerTracker. Contact: hani.girgis@tamuk.edu.
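The metrics reported in this abstract can be cross-checked with simple arithmetic. The sketch below is not taken from the EnhancerTracker paper or repository; the function names are illustrative. It recomputes F1 from the stated precision and recall using the standard harmonic-mean definition, and computes the mean of recall and specificity, which equals overall accuracy only under the assumption (not stated in the abstract) that the held-out set contains equal numbers of positive and negative triplets.

```python
def f1_score(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def balanced_accuracy(recall: float, specificity: float) -> float:
    """Mean of recall and specificity.

    Equals plain accuracy only when positives and negatives are
    equally frequent (an assumption, not stated in the abstract).
    """
    return (recall + specificity) / 2


# Values as reported in the abstract.
precision, recall, specificity = 0.84, 0.35, 0.93

f1 = f1_score(precision, recall)
bal_acc = balanced_accuracy(recall, specificity)

print(f"F1 = {f1:.2f}")
print(f"balanced accuracy = {bal_acc:.2f}")
```

Both values round to the figures given in the abstract (49% and 64%), so the reported numbers are internally consistent under the balanced-set assumption.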
6
Gonçalves TM, Stewart CL, Baxley SD, Xu J, Li D, Gabel HW, Wang T, Avraham O, Zhao G. Towards a comprehensive regulatory map of Mammalian Genomes. Research Square 2023:rs.3.rs-3294408. [PMID: 37841836 PMCID: PMC10571623 DOI: 10.21203/rs.3.rs-3294408/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2023]
Abstract
Genome mapping studies have generated a nearly complete collection of genes for the human genome, but we still lack an equivalently vetted inventory of human regulatory sequences. Cis-regulatory modules (CRMs) play important roles in controlling when, where, and how much a gene is expressed. We developed a training data-free CRM-prediction algorithm, the Mammalian Regulatory MOdule Detector (MrMOD), for accurate CRM prediction in mammalian genomes. MrMOD provides genome position-fixed CRM models similar to the fixed gene models for the mouse and human genomes using only genomic sequences as the inputs with one adjustable parameter - the significance p-value. Importantly, MrMOD predicts a comprehensive set of high-resolution CRMs in the mouse and human genomes including all types of regulatory modules not limited to any tissue, cell type, developmental stage, or condition. We computationally validated MrMOD predictions using a compendium of 21 orthogonal experimental data sets including thousands of experimentally defined CRMs and millions of putative regulatory elements derived from hundreds of different tissues, cell types, and stimulus conditions obtained from multiple databases. An in ovo transgenic reporter assay demonstrates the power of our prediction in guiding experimental design. We analyzed CRMs located on chromosome 17 using unsupervised machine learning and identified groups of CRMs with multiple lines of evidence supporting their functionality, linking CRMs with upstream binding transcription factors and downstream target genes. Our work provides a comprehensive base pair resolution annotation of the functional regulatory elements and non-functional regions in the mammalian genomes.
Affiliation(s)
- Jason Xu
  - Missouri University of Science & Technology
- Daofeng Li
  - Washington University School of Medicine
- Ting Wang
  - Washington University School of Medicine
7
Weinstein ML, Jaenke CM, Asma H, Spangler M, Kohnen KA, Konys CC, Williams ME, Williams AV, Rebeiz M, Halfon MS, Williams TM. A novel role for trithorax in the gene regulatory network for a rapidly evolving fruit fly pigmentation trait. PLoS Genet 2023; 19:e1010653. [PMID: 36795790 PMCID: PMC9977049 DOI: 10.1371/journal.pgen.1010653] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/08/2022] [Revised: 03/01/2023] [Accepted: 02/03/2023] [Indexed: 02/17/2023] Open
Abstract
Animal traits develop through the expression and action of numerous regulatory and realizator genes that comprise a gene regulatory network (GRN). For each GRN, its underlying patterns of gene expression are controlled by cis-regulatory elements (CREs) that bind activating and repressing transcription factors. These interactions drive cell-type and developmental stage-specific transcriptional activation or repression. Most GRNs remain incompletely mapped, and a major barrier to this daunting task is CRE identification. Here, we used an in silico method to identify predicted CREs (pCREs) that comprise the GRN which governs sex-specific pigmentation of Drosophila melanogaster. Through in vivo assays, we demonstrate that many pCREs activate expression in the correct cell-type and developmental stage. We employed genome editing to demonstrate that two CREs control the pupal abdomen expression of trithorax, whose function is required for the dimorphic phenotype. Surprisingly, trithorax had no detectable effect on this GRN's key trans-regulators, but shapes the sex-specific expression of two realizator genes. Comparison of sequences orthologous to these CREs supports an evolutionary scenario where these trithorax CREs predated the origin of the dimorphic trait. Collectively, this study demonstrates how in silico approaches can shed novel insights on the GRN basis for a trait's development and evolution.
Affiliation(s)
- Michael L. Weinstein
  - Department of Biology, University of Dayton, 300 College Park, Dayton, Ohio, United States of America
- Chad M. Jaenke
  - Department of Biology, University of Dayton, 300 College Park, Dayton, Ohio, United States of America
- Hasiba Asma
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, New York, United States of America
- Matthew Spangler
  - Department of Biology, University of Dayton, 300 College Park, Dayton, Ohio, United States of America
- Katherine A. Kohnen
  - Department of Biology, University of Dayton, 300 College Park, Dayton, Ohio, United States of America
- Claire C. Konys
  - Department of Biology, University of Dayton, 300 College Park, Dayton, Ohio, United States of America
- Melissa E. Williams
  - Department of Biology, University of Dayton, 300 College Park, Dayton, Ohio, United States of America
- Ashley V. Williams
  - West Carrollton High School, 5833 Student St., Dayton, Ohio, United States of America
- Mark Rebeiz
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Marc S. Halfon
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, New York, United States of America
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, New York, United States of America
- Thomas M. Williams
  - Department of Biology, University of Dayton, 300 College Park, Dayton, Ohio, United States of America
  - The Integrative Science and Engineering Center, University of Dayton, 300 College Park, Dayton, Ohio, United States of America
8
Alonso-Alvarez C, Andrade P, Cantarero A, Morales J, Carneiro M. Relocation to avoid costs: A hypothesis on red carotenoid-based signals based on recent CYP2J19 gene expression data. Bioessays 2022; 44:e2200037. [PMID: 36209392 DOI: 10.1002/bies.202200037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2022] [Revised: 07/25/2022] [Accepted: 09/22/2022] [Indexed: 11/11/2022]
Abstract
In many vertebrates, the enzymatic oxidation of dietary yellow carotenoids generates red keto-carotenoids giving color to ornaments. The oxidase CYP2J19 is here a key effector. Its purported intracellular location suggests a shared biochemical pathway between trait expression and cell functioning. This might guarantee the reliability of red colorations as individual quality signals independent of production costs. We hypothesize that the ornament type (feathers vs. bare parts) and production costs (probably CYP2J19 activity compromising vital functions) could have promoted tissue-specific gene relocation. We review current avian tissue-specific CYP2J19 expression data. Among the ten red-billed species showing CYP2J19 bill expression, only one showed strong hepatic expression. Moreover, a phylogenetically-controlled analysis of 25 red-colored species shows that those producing red bare parts are less likely to have strong hepatic CYP2J19 expression than species with only red plumages. Thus, both production costs and shared pathways might have contributed to the evolution of red signals.
Affiliation(s)
- Carlos Alonso-Alvarez
  - Department of Evolutionary Ecology, National Museum of Natural Sciences - CSIC. C/ José Gutiérrez Abascal 2, Madrid, Spain
- Pedro Andrade
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO, Universidade do Porto, Vairão, Portugal
  - BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
- Alejandro Cantarero
  - Department of Evolutionary Ecology, National Museum of Natural Sciences - CSIC. C/ José Gutiérrez Abascal 2, Madrid, Spain
  - Department of Physiology, Veterinary School, Complutense University of Madrid, Madrid, Spain
- Judith Morales
  - Department of Evolutionary Ecology, National Museum of Natural Sciences - CSIC. C/ José Gutiérrez Abascal 2, Madrid, Spain
- Miguel Carneiro
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO, Universidade do Porto, Vairão, Portugal
  - BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
9
Luo L, Gribskov M, Wang S. Bibliometric review of ATAC-Seq and its application in gene expression. Brief Bioinform 2022; 23:6543486. [PMID: 35255493 PMCID: PMC9116206 DOI: 10.1093/bib/bbac061] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 11/17/2021] [Revised: 02/06/2022] [Accepted: 02/09/2022] [Indexed: 11/30/2022] Open
Abstract
With recent advances in high-throughput next-generation sequencing, it is possible to describe the regulation and expression of genes at multiple levels. An assay for transposase-accessible chromatin using sequencing (ATAC-seq), which uses Tn5 transposase to sequence protein-free binding regions of the genome, can be combined with chromatin immunoprecipitation coupled with deep sequencing (ChIP-seq) and ribonucleic acid sequencing (RNA-seq) to provide a detailed description of gene expression. Here, we reviewed the literature on ATAC-seq and described the characteristics of ATAC-seq publications. We then briefly introduced the principles of RNA-seq, ChIP-seq and ATAC-seq, focusing on the main features of the techniques. We built a phylogenetic tree from species that had been previously studied by using ATAC-seq. Studies of Mus musculus and Homo sapiens account for approximately 90% of the total ATAC-seq data, while other species are still in the process of accumulating data. We summarized the findings from human diseases and other species, illustrating the cutting-edge discoveries and the role of multi-omics data analysis in current research. Moreover, we collected and compared ATAC-seq analysis pipelines, which allowed biological researchers who lack programming skills to better analyze and explore ATAC-seq data. Through this review, it is clear that multi-omics analysis and single-cell sequencing technology will become the mainstream approach in future research.
Affiliation(s)
- Liheng Luo
  - School of Life Sciences, Northwestern Polytechnical University, Xi'an, Shaanxi, China, 710072
- Michael Gribskov
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Sufang Wang
  - School of Life Sciences, Northwestern Polytechnical University, Xi'an, Shaanxi, China, 710072
10
Umarov R, Li Y, Arakawa T, Takizawa S, Gao X, Arner E. ReFeaFi: Genome-wide prediction of regulatory elements driving transcription initiation. PLoS Comput Biol 2021; 17:e1009376. [PMID: 34491989 PMCID: PMC8448322 DOI: 10.1371/journal.pcbi.1009376] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/22/2021] [Revised: 09/17/2021] [Accepted: 08/23/2021] [Indexed: 11/19/2022] Open
Abstract
Regulatory elements control gene expression through transcription initiation (promoters) and by enhancing transcription at distant regions (enhancers). Accurate identification of regulatory elements is fundamental for annotating genomes and understanding gene expression patterns. While there are many attempts to develop computational promoter and enhancer identification methods, reliable tools to analyze long genomic sequences are still lacking. Prediction methods often perform poorly on the genome-wide scale because the number of negatives is much higher than that in the training sets. To address this issue, we propose a dynamic negative set updating scheme with a two-model approach, using one model for scanning the genome and the other one for testing candidate positions. The developed method achieves good genome-level performance and maintains robust performance when applied to other vertebrate species, without re-training. Moreover, the unannotated predicted regulatory regions made on the human genome are enriched for disease-associated variants, suggesting them to be potentially true regulatory elements rather than false positives. We validated high scoring "false positive" predictions using reporter assay and all tested candidates were successfully validated, demonstrating the ability of our method to discover novel human regulatory regions.
Affiliation(s)
- Ramzan Umarov
  - Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Japan
- Yu Li
  - Department of Computer Science and Engineering (CSE), The Chinese University of Hong Kong (CUHK), Hong Kong, People’s Republic of China
- Takahiro Arakawa
  - Laboratory for Applied Regulatory Genomics Network Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
- Satoshi Takizawa
  - Laboratory for Applied Regulatory Genomics Network Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
- Xin Gao
  - King Abdullah University of Science and Technology, Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences and Engineering Division, Thuwal, Saudi Arabia
- Erik Arner
  - Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Japan
  - Laboratory for Applied Regulatory Genomics Network Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
11
Schember I, Halfon MS. Identification of new Anopheles gambiae transcriptional enhancers using a cross-species prediction approach. Insect Molecular Biology 2021; 30:410-419. [PMID: 33866636 PMCID: PMC8266755 DOI: 10.1111/imb.12705] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/05/2020] [Revised: 02/09/2021] [Accepted: 03/31/2021] [Indexed: 06/12/2023]
Abstract
The success of transgenic mosquito vector control approaches relies on well-targeted gene expression, requiring the identification and characterization of a diverse set of mosquito promoters and transcriptional enhancers. However, few enhancers have been characterized in Anopheles gambiae to date. Here, we employ the SCRMshaw method we previously developed to predict enhancers in the A. gambiae genome, preferentially targeting vector-relevant tissues such as the salivary glands, midgut and nervous system. We demonstrate a high overall success rate, with at least 8 of 11 (73%) tested sequences validating as enhancers in an in vivo xenotransgenic assay. Four tested sequences drive expression in either the salivary gland or the midgut, making them directly useful for probing the biology of these infection-relevant tissues. The success of our study suggests that computational enhancer prediction should serve as an effective means for identifying A. gambiae enhancers with activity in tissues involved in malaria propagation and transmission.
Affiliation(s)
- Isabella Schember
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203
- Marc S. Halfon
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203
  - Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY 14203
  - Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY 14203
  - NY State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, NY 14203
  - Department of Molecular and Cellular Biology and Program in Cancer Genetics, Roswell Park Comprehensive Cancer Center, Buffalo, NY 14263
12
Global patterns of enhancer activity during sea urchin embryogenesis assessed by eRNA profiling. Genome Res 2021; 31:1680-1692. [PMID: 34330790 PMCID: PMC8415375 DOI: 10.1101/gr.275684.121] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/22/2021] [Accepted: 07/28/2021] [Indexed: 11/25/2022]
Abstract
We used capped analysis of gene expression with sequencing (CAGE-seq) to profile eRNA expression and enhancer activity during embryogenesis of a model echinoderm: the sea urchin, Strongylocentrotus purpuratus. We identified more than 18,000 enhancers that were active in mature oocytes and developing embryos and documented a burst of enhancer activation during cleavage and early blastula stages. We found that a large fraction (73.8%) of all enhancers active during the first 48 h of embryogenesis were hyperaccessible no later than the 128-cell stage and possibly even earlier. Most enhancers were located near gene bodies, and temporal patterns of eRNA expression tended to parallel those of nearby genes. Furthermore, enhancers near lineage-specific genes contained signatures of inputs from developmental gene regulatory networks deployed in those lineages. A large fraction (60%) of sea urchin enhancers previously shown to be active in transgenic reporter assays was associated with eRNA expression. Moreover, a large fraction (50%) of a representative subset of enhancers identified by eRNA profiling drove tissue-specific gene expression in isolation when tested by reporter assays. Our findings provide an atlas of developmental enhancers in a model sea urchin and support the utility of eRNA profiling as a tool for enhancer discovery and regulatory biology. The data generated in this study are available at Echinobase, the public database of information related to echinoderm genomics.
13
Asma H, Halfon MS. Annotating the Insect Regulatory Genome. Insects 2021; 12:591. [PMID: 34209769 PMCID: PMC8305585 DOI: 10.3390/insects12070591] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/28/2021] [Revised: 06/23/2021] [Accepted: 06/25/2021] [Indexed: 11/17/2022]
Abstract
An ever-growing number of insect genomes is being sequenced across the evolutionary spectrum. Comprehensive annotation of not only genes but also regulatory regions is critical for reaping the full benefits of this sequencing. Driven by developments in sequencing technologies and in both empirical and computational discovery strategies, the past few decades have witnessed dramatic progress in our ability to identify cis-regulatory modules (CRMs), sequences such as enhancers that play a major role in regulating transcription. Nevertheless, providing a timely and comprehensive regulatory annotation of newly sequenced insect genomes is an ongoing challenge. We review here the methods being used to identify CRMs in both model and non-model insect species, and focus on two tools that we have developed, REDfly and SCRMshaw. These resources can be paired together in a powerful combination to facilitate insect regulatory annotation over a broad range of species, with an accuracy equal to or better than that of other state-of-the-art methods.
Affiliation(s)
- Hasiba Asma
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Marc S. Halfon
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - NY State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, NY 14203, USA
|
14
|
Hong J, Gao R, Yang Y. CrepHAN: Cross-species prediction of enhancers by using hierarchical attention networks. Bioinformatics 2021; 37:3436-3443. [PMID: 33978703 DOI: 10.1093/bioinformatics/btab349] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/21/2020] [Revised: 04/21/2021] [Accepted: 05/06/2021] [Indexed: 01/17/2023] Open
Abstract
MOTIVATION: Enhancers are important functional elements in genome sequences. The identification of enhancers is a very challenging task due to the great diversity of enhancer sequences and the flexible localization on genomes. Till now, the interactions between enhancers and genes have not been fully understood yet. To speed up the studies of the regulatory roles of enhancers, computational tools for the prediction of enhancers have emerged in recent years. Especially, thanks to the ENCODE project and the advances of high-throughput experimental techniques, a large amount of experimentally verified enhancers have been annotated on the human genome, which allows large-scale predictions of unknown enhancers using data-driven methods. However, except for human and some model organisms, the validated enhancer annotations are scarce for most species, leading to more difficulties in the computational identification of enhancers for their genomes.
RESULTS: In this study, we propose a deep learning-based predictor for enhancers, named CrepHAN, which is featured by a hierarchical attention neural network and word embedding-based representations for DNA sequences. We use the experimentally-supported data of the human genome to train the model, and perform experiments on human and other mammals, including mouse, cow, and dog. The experimental results show that CrepHAN has more advantages on cross-species predictions, and outperforms the existing models by a large margin. Especially, for human-mouse cross-predictions, the AUC score of ROC curve is increased by 0.033∼0.145 on the combined tissue dataset and 0.032∼0.109 on tissue-specific datasets.
AVAILABILITY: bcmi.sjtu.edu.cn/~yangyang/CrepHAN.html.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jianwei Hong
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Rd., Shanghai 200240, China
  - School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
- Ruitian Gao
  - Department of Bioinformatics and Biostatistics, Shanghai Jiao Tong University, Shanghai 200240, China
- Yang Yang
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Rd., Shanghai 200240, China
  - Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai, 200240, China
|
15
|
Amezian D, Nauen R, Le Goff G. Transcriptional regulation of xenobiotic detoxification genes in insects - An overview. Pesticide Biochemistry and Physiology 2021; 174:104822. [PMID: 33838715 DOI: 10.1016/j.pestbp.2021.104822] [Citation(s) in RCA: 111] [Impact Index Per Article: 27.8] [Received: 12/16/2020] [Revised: 02/08/2021] [Accepted: 03/02/2021] [Indexed: 05/21/2023]
Abstract
Arthropods have well adapted to the vast array of chemicals they encounter in their environment. Whether these xenobiotics are plant allelochemicals or anthropogenic insecticides one of the strategies they have developed to defend themselves is the induction of detoxification enzymes. Although upregulation of detoxification enzymes and efflux transporters in response to specific inducers has been well described, in insects, yet, little is known on the transcriptional regulation of these genes. Over the past twenty years, an increasing number of studies with insects have used advanced genetic tools such as RNAi, CRISPR/Cas9 and reporter gene assays to dissect the genomic grounds of their xenobiotic response and hence contributed substantially in improving our knowledge on the players involved. Xenobiotics are partly recognized by various "xenobiotic sensors" such as membrane-bound or nuclear receptors. This initiates a molecular reaction cascade ultimately leading to the translocation of a transcription factor to the nucleus that recognizes and binds to short sequences located upstream their target genes to activate transcription. To date, a number of signaling pathways were shown to mediate the upregulation of detoxification enzymes in arthropods and to play a role in either metabolic resistance to insecticides or host-plant adaptation. These include nuclear receptors AhR/ARNT and HR96, GPCRs, CncC and MAPK/CREB. Recent work reveals that upregulation and activation of some components of these pathways as well as polymorphism in the binding motifs of transcription factors are linked to insects' adaptive processes. The aim of this mini-review is to summarize and describe recent work that shed some light on the main regulatory routes of detoxification gene expression in insects.
Affiliation(s)
- Dries Amezian
  - Université Côte d'Azur, INRAE, CNRS, ISA, F-06903 Sophia Antipolis, France
- Ralf Nauen
  - Bayer AG, Crop Science Division, R&D, Alfred Nobel-Strasse 50, 40789 Monheim, Germany
- Gaëlle Le Goff
  - Université Côte d'Azur, INRAE, CNRS, ISA, F-06903 Sophia Antipolis, France
|
16
|
Avsec Ž, Weilert M, Shrikumar A, Krueger S, Alexandari A, Dalal K, Fropf R, McAnany C, Gagneur J, Kundaje A, Zeitlinger J. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat Genet 2021; 53:354-366. [PMID: 33603233 PMCID: PMC8812996 DOI: 10.1038/s41588-021-00782-6] [Citation(s) in RCA: 314] [Impact Index Per Article: 78.5] [Received: 07/19/2020] [Accepted: 01/07/2021] [Indexed: 01/30/2023]
Abstract
The arrangement (syntax) of transcription factor (TF) binding motifs is an important part of the cis-regulatory code, yet remains elusive. We introduce a deep learning model, BPNet, that uses DNA sequence to predict base-resolution chromatin immunoprecipitation (ChIP)-nexus binding profiles of pluripotency TFs. We develop interpretation tools to learn predictive motif representations and identify soft syntax rules for cooperative TF binding interactions. Strikingly, Nanog preferentially binds with helical periodicity, and TFs often cooperate in a directional manner, which we validate using clustered regularly interspaced short palindromic repeat (CRISPR)-induced point mutations. Our model represents a powerful general approach to uncover the motifs and syntax of cis-regulatory sequences in genomics data.
Affiliation(s)
- Žiga Avsec
  - Department of Informatics, Technical University of Munich, Garching, Germany
  - Graduate School of Quantitative Biosciences (QBM), Ludwig-Maximilians-Universität München, Munich, Germany
  - Currently at DeepMind, London, UK
- Melanie Weilert
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Avanti Shrikumar
  - Department of Computer Science, Stanford University, Stanford, CA, USA
- Sabrina Krueger
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Amr Alexandari
  - Department of Computer Science, Stanford University, Stanford, CA, USA
- Khyati Dalal
  - Stowers Institute for Medical Research, Kansas City, MO, USA
  - The University of Kansas Medical Center, Kansas City, KS, USA
- Robin Fropf
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Charles McAnany
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Julien Gagneur
  - Department of Informatics, Technical University of Munich, Garching, Germany
- Anshul Kundaje
  - Department of Computer Science, Stanford University, Stanford, CA, USA
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Julia Zeitlinger
  - Stowers Institute for Medical Research, Kansas City, MO, USA
  - The University of Kansas Medical Center, Kansas City, KS, USA
|
17
|
Real FM, Haas SA, Franchini P, Xiong P, Simakov O, Kuhl H, Schöpflin R, Heller D, Moeinzadeh MH, Heinrich V, Krannich T, Bressin A, Hartmann MF, Wudy SA, Dechmann DKN, Hurtado A, Barrionuevo FJ, Schindler M, Harabula I, Osterwalder M, Hiller M, Wittler L, Visel A, Timmermann B, Meyer A, Vingron M, Jiménez R, Mundlos S, Lupiáñez DG. The mole genome reveals regulatory rearrangements associated with adaptive intersexuality. Science 2020; 370:208-214. [PMID: 33033216 PMCID: PMC8243244 DOI: 10.1126/science.aaz2582] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 08/27/2019] [Revised: 04/19/2020] [Accepted: 08/17/2020] [Indexed: 01/01/2023]
Abstract
Linking genomic variation to phenotypical traits remains a major challenge in evolutionary genetics. In this study, we use phylogenomic strategies to investigate a distinctive trait among mammals: the development of masculinizing ovotestes in female moles. By combining a chromosome-scale genome assembly of the Iberian mole, Talpa occidentalis, with transcriptomic, epigenetic, and chromatin interaction datasets, we identify rearrangements altering the regulatory landscape of genes with distinct gonadal expression patterns. These include a tandem triplication involving CYP17A1, a gene controlling androgen synthesis, and an intrachromosomal inversion involving the pro-testicular growth factor gene FGF9, which is heterochronically expressed in mole ovotestes. Transgenic mice with a knock-in mole CYP17A1 enhancer or overexpressing FGF9 showed phenotypes recapitulating mole sexual features. Our results highlight how integrative genomic approaches can reveal the phenotypic impact of noncoding sequence changes.
Affiliation(s)
- Francisca M Real
  - RG Development & Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
  - Institute for Medical and Human Genetics, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Stefan A Haas
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Paolo Franchini
  - Chair in Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, 78457 Konstanz, Germany
- Peiwen Xiong
  - Chair in Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, 78457 Konstanz, Germany
- Oleg Simakov
  - Department of Molecular Evolution and Development, University of Vienna, 1090 Vienna, Austria
- Heiner Kuhl
  - Department of Ecophysiology and Aquaculture, Leibniz-Institute of Freshwater Ecology and Inland Fisheries, Berlin, Germany
- Robert Schöpflin
  - RG Development & Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
  - Institute for Medical and Human Genetics, Charité - Universitätsmedizin Berlin, Berlin, Germany
- David Heller
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
- M-Hossein Moeinzadeh
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Verena Heinrich
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Thomas Krannich
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Annkatrin Bressin
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Michaela F Hartmann
  - Steroid Research & Mass Spectrometry Unit, Laboratory for Translational Hormone Analytics in Paediatric Endocrinology, Division of Paediatric Endocrinology & Diabetology, Center of Child and Adolescent Medicine, Justus Liebig University, Giessen, Germany
- Stefan A Wudy
  - Steroid Research & Mass Spectrometry Unit, Laboratory for Translational Hormone Analytics in Paediatric Endocrinology, Division of Paediatric Endocrinology & Diabetology, Center of Child and Adolescent Medicine, Justus Liebig University, Giessen, Germany
- Dina K N Dechmann
  - Department of Migration and Immuno-Ecology, Max Planck Institute for Animal Behavior, Radolfzell, Germany
  - Department of Biology, University of Konstanz, Konstanz, Germany
- Alicia Hurtado
  - Departamento de Genética, Universidad de Granada, Granada, Spain
  - Instituto de Biotecnología, Centro de Investigación Biomédica, Universidad de Granada, Armilla, Granada, Spain
- Francisco J Barrionuevo
  - Departamento de Genética, Universidad de Granada, Granada, Spain
  - Instituto de Biotecnología, Centro de Investigación Biomédica, Universidad de Granada, Armilla, Granada, Spain
- Magdalena Schindler
  - RG Development & Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
  - Institute for Medical and Human Genetics, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Izabela Harabula
  - RG Development & Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Marco Osterwalder
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
  - Department for BioMedical Research (DBMR), University of Bern, 3008 Bern, Switzerland
- Michael Hiller
  - Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
  - Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
  - Center for Systems Biology Dresden, 01307 Dresden, Germany
- Lars Wittler
  - Department of Developmental Genetics, Transgenic Unit, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Axel Visel
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
  - U.S. Department of Energy Joint Genome Institute, Berkeley, CA 94720, USA
  - School of Natural Sciences, University of California, Merced, CA 95343, USA
- Bernd Timmermann
  - RG Development & Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Axel Meyer
  - Chair in Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, 78457 Konstanz, Germany
- Martin Vingron
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Rafael Jiménez
  - Departamento de Genética, Universidad de Granada, Granada, Spain
  - Instituto de Biotecnología, Centro de Investigación Biomédica, Universidad de Granada, Armilla, Granada, Spain
- Stefan Mundlos
  - RG Development & Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
  - Institute for Medical and Human Genetics, Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité - Universitätsmedizin Berlin, Berlin, Germany
- Darío G Lupiáñez
  - RG Development & Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
  - Institute for Medical and Human Genetics, Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Epigenetics and Sex Development Group, Berlin Institute for Medical Systems Biology, Max-Delbrück Center for Molecular Medicine, Berlin, Germany
|
18
|
Cong Q, Zhang J, Shen J, Cao X, Brévignon C, Grishin NV. Speciation in North American Junonia from a genomic perspective. Systematic Entomology 2020; 45:803-837. [PMID: 34744257 PMCID: PMC8570557 DOI: 10.1111/syen.12428] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/13/2023]
Abstract
Delineating species boundaries in phylogenetic groups undergoing recent radiation is a daunting challenge akin to discretizing continuity. Here, we propose a general approach exemplified by American butterflies from the genus Junonia Hübner notorious for the variety of similar phenotypes, ease of hybridization, and the lack of consensus about their classification. We obtain whole-genome shotgun sequences of about 200 specimens. We reason that discreteness emerges from continuity by means of a small number of key players, and search for the proteins that diverged markedly between sympatric populations of different species, while keeping low polymorphism within these species. Being 0.25% of the total number, these three dozen 'speciation' proteins indeed partition pairs of Junonia populations into two clusters with a prominent break in between, while all proteins taken together fail to reveal this discontinuity. Populations with larger divergence from each other, comparable to that between two sympatric species, form the first cluster and correspond to different species. The other cluster is characterized by smaller divergence, similar to that between allopatric populations of the same species and comprise conspecific pairs. Using this method, we conclude that J. genoveva (Cramer), J. litoralis Brévignon, J. evarete (Cramer), and J. divaricata C. & R. Felder are restricted to South America. We find that six species of Junonia are present in the United States, one of which is new: Junonia stemosa Grishin, sp.n. (i), found in south Texas and phenotypically closest to J. nigrosuffusa W. Barnes & McDunnough (ii) in its dark appearance. In the pale nudum of the antennal club, these two species resemble J. zonalis C. & R. Felder (iii) from Florida and the Caribbean Islands. The pair of sister species, J. grisea Austin & J. Emmel (iv) and J. coenia Hübner (v), represent the classic west/east U.S.A. split. The mangrove feeder (as caterpillar), dark nudum J. neildi Brévignon (vi) enters south Texas as a new subspecies Junonia neildi varia Grishin ssp.n. characterized by more extensive hybridization with and introgression from J. coenia, and, as a consequence, more variable wing patterns compared with the nominal J. n. neildi in Florida. Furthermore, a new mangrove-feeding species from the Pacific Coast of Mexico is described as Junonia pacoma Grishin sp.n. Finally, genomic analysis suggests that J. nigrosuffusa may be a hybrid species formed by the ancestors of J. grisea and J. stemosa sp.n.
Affiliation(s)
- Qian Cong
  - Departments of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, U.S.A
- Jing Zhang
  - Departments of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, U.S.A
- Jinhui Shen
  - Departments of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, U.S.A
- Xiaolong Cao
  - Departments of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, U.S.A
- Christian Brévignon
  - Villa A7 Rochambeau, Matoury, French Guiana, University of Texas Southwestern Medical Center, Dallas, TX, U.S.A
- Nick V Grishin
  - Departments of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, U.S.A
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX, U.S.A
|
19
|
Tobias IC, Abatti LE, Moorthy SD, Mullany S, Taylor T, Khader N, Filice MA, Mitchell JA. Transcriptional enhancers: from prediction to functional assessment on a genome-wide scale. Genome 2020; 64:426-448. [PMID: 32961076 DOI: 10.1139/gen-2020-0104] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/20/2022]
Abstract
Enhancers are cis-regulatory sequences located distally to target genes. These sequences consolidate developmental and environmental cues to coordinate gene expression in a tissue-specific manner. Enhancer function and tissue specificity depend on the expressed set of transcription factors, which recognize binding sites and recruit cofactors that regulate local chromatin organization and gene transcription. Unlike other genomic elements, enhancers are challenging to identify because they function independently of orientation, are often distant from their promoters, have poorly defined boundaries, and display no reading frame. In addition, there are no defined genetic or epigenetic features that are unambiguously associated with enhancer activity. Over recent years there have been developments in both empirical assays and computational methods for enhancer prediction. We review genome-wide tools, CRISPR advancements, and high-throughput screening approaches that have improved our ability to both observe and manipulate enhancers in vitro at the level of primary genetic sequences, chromatin states, and spatial interactions. We also highlight contemporary animal models and their importance to enhancer validation. Together, these experimental systems and techniques complement one another and broaden our understanding of enhancer function in development, evolution, and disease.
Affiliation(s)
- Ian C Tobias
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
- Luis E Abatti
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
- Sakthi D Moorthy
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
- Shanelle Mullany
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
- Tiegh Taylor
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
- Nawrah Khader
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
- Mario A Filice
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
- Jennifer A Mitchell
  - Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
|
20
|
Geirsdottir L, David E, Keren-Shaul H, Weiner A, Bohlen SC, Neuber J, Balic A, Giladi A, Sheban F, Dutertre CA, Pfeifle C, Peri F, Raffo-Romero A, Vizioli J, Matiasek K, Scheiwe C, Meckel S, Mätz-Rensing K, van der Meer F, Thormodsson FR, Stadelmann C, Zilkha N, Kimchi T, Ginhoux F, Ulitsky I, Erny D, Amit I, Prinz M. Cross-Species Single-Cell Analysis Reveals Divergence of the Primate Microglia Program. Cell 2020; 179:1609-1622.e16. [PMID: 31835035 DOI: 10.1016/j.cell.2019.11.010] [Citation(s) in RCA: 289] [Impact Index Per Article: 57.8] [Received: 03/12/2019] [Revised: 07/30/2019] [Accepted: 11/06/2019] [Indexed: 02/08/2023]
Abstract
Microglia, the brain-resident immune cells, are critically involved in many physiological and pathological brain processes, including neurodegeneration. Here we characterize microglia morphology and transcriptional programs across ten species spanning more than 450 million years of evolution. We find that microglia express a conserved core gene program of orthologous genes from rodents to humans, including ligands and receptors associated with interactions between glia and neurons. In most species, microglia show a single dominant transcriptional state, whereas human microglia display significant heterogeneity. In addition, we observed notable differences in several gene modules of rodents compared with primate microglia, including complement, phagocytic, and susceptibility genes to neurodegeneration, such as Alzheimer's and Parkinson's disease. Our study provides an essential resource of conserved and divergent microglia pathways across evolution, with important implications for future development of microglia-based therapies in humans.
Affiliation(s)
- Laufey Geirsdottir
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Eyal David
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Hadas Keren-Shaul
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
  - Life Science Core Facility-Israel National Center for Personalized Medicine (G-INCPM), Weizmann Institute of Science, Rehovot, Israel
- Assaf Weiner
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Jana Neuber
  - Institute of Neuropathology, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Adam Balic
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, EH25 9RG, United Kingdom
- Amir Giladi
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Fadi Sheban
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Charles-Antoine Dutertre
  - Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A∗STAR), Singapore, Singapore
  - Program in Emerging Infectious Disease, Duke-NUS Medical School, 8 College Road, Singapore, Singapore
- Christine Pfeifle
  - Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Biology, Ploen, Germany
- Francesca Peri
  - Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Antonella Raffo-Romero
  - Universite Lille, Inserm, U-1192-Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse-PRISM, Lille, France
- Jacopo Vizioli
  - Universite Lille, Inserm, U-1192-Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse-PRISM, Lille, France
- Kaspar Matiasek
  - Section of Clinical & Comparative Neuropathology, Centre for Clinical Veterinary Medicine, Ludwig-Maximilians-Universität München, Munich, Germany
- Christian Scheiwe
  - Clinic for Neurosurgery, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Stephan Meckel
  - Department of Neuroradiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Kerstin Mätz-Rensing
  - German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany
- Christine Stadelmann
  - Institute of Neuropathology, University Medical Center Göttingen, Göttingen, Germany
- Noga Zilkha
  - Department of Neurobiology, Weizmann Institute of Science, Rehovot, Israel
- Tali Kimchi
  - Department of Neurobiology, Weizmann Institute of Science, Rehovot, Israel
- Florent Ginhoux
  - Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A∗STAR), Singapore, Singapore
  - Shanghai Institute of Immunology, Shanghai JiaoTong University School of Medicine, Shanghai, China
  - Translational Immunology Institute, Singhealth/Duke-NUS Academic Medical Centre, the Academia, Singapore, Singapore
- Igor Ulitsky
  - Department of Biological Regulation, Weizmann Institute of Science, Rehovot, Israel
- Daniel Erny
  - Institute of Neuropathology, Faculty of Medicine, University of Freiburg, Freiburg, Germany
  - Berta-Ottenstein-Programme, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Ido Amit
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Marco Prinz
  - Institute of Neuropathology, Faculty of Medicine, University of Freiburg, Freiburg, Germany
  - Signaling Research Centres BIOSS and CIBSS, University of Freiburg, Freiburg, Germany
  - Center for NeuroModulation, Faculty of Medicine, University of Freiburg, Freiburg, Germany
|
21
|
Levitsky V, Zemlyanskaya E, Oshchepkov D, Podkolodnaya O, Ignatieva E, Grosse I, Mironova V, Merkulova T. A single ChIP-seq dataset is sufficient for comprehensive analysis of motifs co-occurrence with MCOT package. Nucleic Acids Res 2020; 47:e139. [PMID: 31750523 PMCID: PMC6868382 DOI: 10.1093/nar/gkz800] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/30/2019] [Revised: 08/12/2019] [Accepted: 09/09/2019] [Indexed: 01/20/2023] Open
Abstract
Recognition of composite elements consisting of two transcription factor binding sites gets behind the studies of tissue-, stage- and condition-specific transcription. Genome-wide data on transcription factor binding generated with ChIP-seq method facilitate an identification of composite elements, but the existing bioinformatics tools either require ChIP-seq datasets for both partner transcription factors, or omit composite elements with motifs overlapping. Here we present an universal Motifs Co-Occurrence Tool (MCOT) that retrieves maximum information about overrepresented composite elements from a single ChIP-seq dataset. This includes homo- and heterotypic composite elements of four mutual orientations of motifs, separated with a spacer or overlapping, even if recognition of motifs within composite element requires various stringencies. Analysis of 52 ChIP-seq datasets for 18 human transcription factors confirmed that for over 60% of analyzed datasets and transcription factors predicted co-occurrence of motifs implied experimentally proven protein-protein interaction of respecting transcription factors. Analysis of 164 ChIP-seq datasets for 57 mammalian transcription factors showed that abundance of predicted composite elements with an overlap of motifs compared to those with a spacer more than doubled; and they had 1.5-fold increase of asymmetrical pairs of motifs with one more conservative 'leading' motif and another one 'guided'.
Affiliation(s)
- Victor Levitsky
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia; Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
- Elena Zemlyanskaya
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia; Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
- Dmitry Oshchepkov
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia
- Olga Podkolodnaya
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia
- Elena Ignatieva
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia; Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
- Ivo Grosse
- Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia; Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany; German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Leipzig, Germany
- Victoria Mironova
- Department of Systems Biology, Institute of Cytology and Genetics, Novosibirsk 630090, Russia; Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia
- Tatyana Merkulova
- Department of Natural Science, Novosibirsk State University, Novosibirsk 630090, Russia; Department of Molecular Genetics, Institute of Cytology and Genetics, Novosibirsk 630090, Russia
|
22
|
Rivera J, Keränen SVE, Gallo SM, Halfon MS. REDfly: the transcriptional regulatory element database for Drosophila. Nucleic Acids Res 2020; 47:D828-D834. [PMID: 30329093 PMCID: PMC6323911 DOI: 10.1093/nar/gky957] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 09/06/2018] [Accepted: 10/04/2018] [Indexed: 12/21/2022] Open
Abstract
The REDfly database provides a comprehensive curation of experimentally-validated Drosophila transcriptional cis-regulatory elements and includes information on DNA sequence, experimental evidence, patterns of regulated gene expression, and more. Now in its thirteenth year, REDfly has grown to over 23 000 records of tested reporter gene constructs and 2200 tested transcription factor binding sites. Recent developments include the start of curation of predicted cis-regulatory modules in addition to experimentally-verified ones, improved search and filtering, and increased interaction with the authors of curated papers. An expanded data model that will capture information on temporal aspects of gene regulation, regulation in response to environmental and other non-developmental cues, sexually dimorphic gene regulation, and non-endogenous (ectopic) aspects of reporter gene expression is under development and expected to be in place within the coming year. REDfly is freely accessible at http://redfly.ccr.buffalo.edu, and news about database updates and new features can be followed on Twitter at @REDfly_database.
Affiliation(s)
- John Rivera
- Center for Computational Research, State University of New York at Buffalo, Buffalo, NY 14203, USA; New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Steven M Gallo
- Center for Computational Research, State University of New York at Buffalo, Buffalo, NY 14203, USA; New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Marc S Halfon
- New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA; Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14203, USA; Department of Biomedical Informatics, State University of New York at Buffalo, Buffalo, NY 14203, USA; Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA; Department of Molecular and Cellular Biology and Program in Cancer Genetics, Roswell Park Cancer Institute, Buffalo, NY 14263, USA
|
23
|
Choe S, Huh TL, Rhee M. Trim45 is essential to the development of the diencephalon and eye in zebrafish embryos. Anim Cells Syst (Seoul) 2020; 24:99-106. [PMID: 32489689 PMCID: PMC7241540 DOI: 10.1080/19768354.2020.1751281] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 03/25/2020] [Accepted: 03/27/2020] [Indexed: 01/06/2023] Open
Abstract
Trim45 is a RING (really interesting new gene) finger-containing E3 ligase that belongs to the TRIM (tripartite motif) protein family. Its molecular functions have been well characterized, but not from a developmental perspective. Here, we report its expression patterns and developmental functions in zebrafish embryos. First, maternal transcripts of trim45 were found at the one-cell stage, while its zygotic transcripts appeared at 30% epiboly. trim45 transcripts were restricted to the optic tectum, hypothalamus, hindbrain, and pharyngeal endoderm at 24 hpf (hours post-fertilization), and further to the retinal ganglion cell layer and cranial ganglion at 36 hpf. Second, ectopic expression of trim45 by injecting its mRNA into embryos at the one-cell stage caused significant expansion of the diencephalon and eye fields at 24 hpf. In contrast, knock-down of trim45 with antisense trim45 morpholinos reduced the size of the two tissues at 24 hpf. Finally, the spatial distribution of transcripts from olig2 and rx1/rx3, markers for the midbrain and eye respectively, was significantly decreased in the thalamus and eye fields respectively at 24 hpf. Based on these observations, we propose possible roles for Trim45 in the development of the diencephalon and eye in zebrafish embryos.
Affiliation(s)
- Seoyeon Choe
- Department of Biological Sciences, College of Biosciences and Biotechnology, Brain Korea 21 Plus, Chungnam National University, Daejeon, South Korea
- Tae-Lin Huh
- School of Life Sciences and Biotechnology, College of Natural Sciences, Kyungpook National University, Daegu, South Korea
- Myungchull Rhee
- Department of Biological Sciences, College of Biosciences and Biotechnology, Brain Korea 21 Plus, Chungnam National University, Daejeon, South Korea
|
24
|
Mitchelmore J, Grinberg NF, Wallace C, Spivakov M. Functional effects of variation in transcription factor binding highlight long-range gene regulation by epromoters. Nucleic Acids Res 2020; 48:2866-2879. [PMID: 32112106 PMCID: PMC7102942 DOI: 10.1093/nar/gkaa123] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/28/2019] [Revised: 02/14/2020] [Accepted: 02/17/2020] [Indexed: 02/06/2023] Open
Abstract
Identifying DNA cis-regulatory modules (CRMs) that control the expression of specific genes is crucial for deciphering the logic of transcriptional control. Natural genetic variation can point to the possible gene regulatory function of specific sequences through their allelic associations with gene expression. However, comprehensive identification of causal regulatory sequences in brute-force association testing without incorporating prior knowledge is challenging due to limited statistical power and effects of linkage disequilibrium. Sequence variants affecting transcription factor (TF) binding at CRMs have a strong potential to influence gene regulatory function, which provides a motivation for prioritizing such variants in association testing. Here, we generate an atlas of CRMs showing predicted allelic variation in TF binding affinity in human lymphoblastoid cell lines and test their association with the expression of their putative target genes inferred from Promoter Capture Hi-C and immediate linear proximity. We reveal >1300 CRM TF-binding variants associated with target gene expression, the majority of them undetected with standard association testing. A large proportion of CRMs showing associations with the expression of genes they contact in 3D localize to the promoter regions of other genes, supporting the notion of 'epromoters': dual-action CRMs with promoter and distal enhancer activity.
Affiliation(s)
- Joanna Mitchelmore
- Nuclear Dynamics Programme, Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT, UK
- Nastasiya F Grinberg
- Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), University of Cambridge, Cambridge Biomedical Campus, Cambridge CB2 0AW, UK
- Chris Wallace
- Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), University of Cambridge, Cambridge Biomedical Campus, Cambridge CB2 0AW, UK
- MRC Biostatistics Unit, University of Cambridge, Cambridge Biomedical Campus, Cambridge CB2 0SR, UK
- Mikhail Spivakov
- Nuclear Dynamics Programme, Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT, UK
- MRC London Institute of Medical Sciences, Du Cane Road, London W12 0NN, UK
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College, Du Cane Road, London W12 0NN, UK
|
25
|
Tomoyasu Y, Halfon MS. How to study enhancers in non-traditional insect models. J Exp Biol 2020; 223(Suppl 1):jeb212241. [PMID: 32034049 DOI: 10.1242/jeb.212241] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
Abstract
Transcriptional enhancers are central to the function and evolution of genes and gene regulation. At the organismal level, enhancers play a crucial role in coordinating tissue- and context-dependent gene expression. At the population level, changes in enhancers are thought to be a major driving force that facilitates evolution of diverse traits. An amazing array of diverse traits seen in insect morphology, physiology and behavior has been the subject of research for centuries. Although enhancer studies in insects outside of Drosophila have been limited, recent advances in functional genomic approaches have begun to make such studies possible in an increasing selection of insect species. Here, instead of comprehensively reviewing currently available technologies for enhancer studies in established model organisms such as Drosophila, we focus on a subset of computational and experimental approaches that are likely applicable to non-Drosophila insects, and discuss the pros and cons of each approach. We discuss the importance of validating enhancer function and evaluate several possible validation methods, such as reporter assays and genome editing. Key points and potential pitfalls when establishing a reporter assay system in non-traditional insect models are also discussed. We close with a discussion of how to advance enhancer studies in insects, both by improving computational approaches and by expanding the genetic toolbox in various insects. Through these discussions, this Review provides a conceptual framework for studying the function and evolution of enhancers in non-traditional insect models.
Affiliation(s)
- Marc S Halfon
- Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
|
26
|
Mahmud AKMF, Yang D, Stenberg P, Ioshikhes I, Nandi S. Exploring a Drosophila Transcription Factor Interaction Network to Identify Cis-Regulatory Modules. J Comput Biol 2019; 27:1313-1328. [PMID: 31855461 DOI: 10.1089/cmb.2018.0160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Multiple transcription factors (TFs) bind to specific sites in the genome and interact among themselves to form cis-regulatory modules (CRMs). These modules are essential in modulating the expression of genes, and it is important to study this interplay to understand gene regulation. In the present study, we integrated experimentally identified TF binding sites collected from published studies with computationally predicted TF binding sites to identify Drosophila CRMs. Along with the detection of previously known CRMs, this approach identified novel protein combinations. We determined high-occupancy target sites, where a large number of TFs bind. Investigating these sites revealed that Giant, Dichaete, and Knirps are highly enriched in these locations. A common TAG team motif was observed at these sites, which might play a role in recruiting other TFs. While comparing the binding sites at distal and proximal promoters, we found that certain regulatory TFs, such as Zelda, were highly enriched in enhancers. Our study has shown that, from the information available concerning TF binding sites, real CRMs can be predicted accurately and efficiently. Although we may claim only co-occurrence of these proteins in this study, it may actually point to their interaction (as known interacting proteins typically co-occur). Such an integrative approach can, therefore, help to provide a better understanding of the interplay among the factors, even though further experimental verification is required.
Affiliation(s)
- Doo Yang
- Ottawa Institute of Computational Biology and Bioinformatics (OICBB) and Ottawa Institute of Systems Biology (OISB) and Department of Biochemistry, Microbiology and Immunology (BMI), Faculty of Medicine, University of Ottawa, Ottawa, Canada
- Per Stenberg
- Department of Molecular Biology, Umeå University, Umeå, Sweden
- Ilya Ioshikhes
- Ottawa Institute of Computational Biology and Bioinformatics (OICBB) and Ottawa Institute of Systems Biology (OISB) and Department of Biochemistry, Microbiology and Immunology (BMI), Faculty of Medicine, University of Ottawa, Ottawa, Canada
- Soumyadeep Nandi
- Life Sciences Division, Institute of Advanced Study in Science and Technology, Vigyan Path, Paschim Boragaon, Guwahati, India; Amity University Haryana, Gurugram, India
|
27
|
Quinn PM, Wijnholds J. Retinogenesis of the Human Fetal Retina: An Apical Polarity Perspective. Genes (Basel) 2019; 10:E987. [PMID: 31795518 PMCID: PMC6947654 DOI: 10.3390/genes10120987] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 08/21/2019] [Revised: 11/25/2019] [Accepted: 11/26/2019] [Indexed: 12/20/2022] Open
Abstract
The Crumbs complex has prominent roles in the control of apical cell polarity, in the coupling of cell density sensing to downstream cell signaling pathways, and in regulating junctional structures and cell adhesion. The Crumbs complex acts as a conductor orchestrating multiple downstream signaling pathways in epithelial and neuronal tissue development. These pathways lead to the regulation of cell size, cell fate, cell self-renewal, proliferation, differentiation, migration, mitosis, and apoptosis. In retinogenesis, these are all pivotal processes with important roles for the Crumbs complex to maintain proper spatiotemporal cell processes. Loss of Crumbs function in the retina results in loss of the stratified appearance resulting in retinal degeneration and loss of visual function. In this review, we begin by discussing the physiology of vision. We continue by outlining the processes of retinogenesis and how well this is recapitulated between the human fetal retina and human embryonic stem cell (ESC) or induced pluripotent stem cell (iPSC)-derived retinal organoids. Additionally, we discuss the functionality of in utero and preterm human fetal retina and the current level of functionality as detected in human stem cell-derived organoids. We discuss the roles of apical-basal cell polarity in retinogenesis with a focus on Leber congenital amaurosis which leads to blindness shortly after birth. Finally, we discuss Crumbs homolog (CRB)-based gene augmentation.
Affiliation(s)
- Peter M.J. Quinn
- Department of Ophthalmology, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
- Jan Wijnholds
- Department of Ophthalmology, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
- The Netherlands Institute for Neuroscience, Royal Netherlands Academy of Arts and Sciences, 1105 BA Amsterdam, The Netherlands
28
|
Alanni R, Hou J, Azzawi H, Xiang Y. Deep gene selection method to select genes from microarray datasets for cancer classification. BMC Bioinformatics 2019; 20:608. [PMID: 31775613 PMCID: PMC6880643 DOI: 10.1186/s12859-019-3161-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 06/06/2019] [Accepted: 10/15/2019] [Indexed: 12/15/2022] Open
Abstract
Background: Microarray datasets consist of complex and high-dimensional samples and genes, and generally the number of samples is much smaller than the number of genes. Due to this data imbalance, gene selection is a demanding task in microarray expression data analysis. Results: The gene set selected by the deep gene selection (DGS) method showed superior performance in cancer classification. DGS is highly capable of reducing the number of genes in the original microarray datasets. Experimental comparisons with other representative and state-of-the-art gene selection methods also showed that DGS achieved the best performance in terms of the number of selected genes, classification accuracy, and computational cost. Conclusions: We provide an efficient gene selection algorithm that can select relevant genes which are significantly sensitive to the samples' classes. With few discriminative genes and low computation time, the proposed algorithm achieved high prediction accuracy on several public microarray datasets, which in turn verifies the efficiency and effectiveness of the proposed gene selection method.
Affiliation(s)
- Russul Alanni
- School of Information Technology, Deakin University, Geelong, Victoria, Australia.
- Jingyu Hou
- School of Information Technology, Deakin University, Geelong, Victoria, Australia
- Hasseeb Azzawi
- School of Information Technology, Deakin University, Geelong, Victoria, Australia
- Yong Xiang
- School of Information Technology, Deakin University, Geelong, Victoria, Australia
29
|
Perenthaler E, Yousefi S, Niggl E, Barakat TS. Beyond the Exome: The Non-coding Genome and Enhancers in Neurodevelopmental Disorders and Malformations of Cortical Development. Front Cell Neurosci 2019; 13:352. [PMID: 31417368 PMCID: PMC6685065 DOI: 10.3389/fncel.2019.00352] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 05/31/2019] [Accepted: 07/16/2019] [Indexed: 12/22/2022] Open
Abstract
The development of the human cerebral cortex is a complex and dynamic process, in which neural stem cell proliferation, neuronal migration, and post-migratory neuronal organization need to occur in a well-organized fashion. Alterations at any of these crucial stages can result in malformations of cortical development (MCDs), a group of genetically heterogeneous neurodevelopmental disorders that present with developmental delay, intellectual disability and epilepsy. Recent progress in genetic technologies, such as next generation sequencing, most often focusing on all protein-coding exons (e.g., whole exome sequencing), has allowed the discovery of more than 100 genes associated with various types of MCDs. Although this has considerably increased the diagnostic yield, most MCD cases remain unexplained. As whole exome sequencing investigates only a minor part of the human genome (1-2%), it is likely that patients in whom no disease-causing mutation has been identified could harbor mutations in genomic regions beyond the exome. Even though functional annotation of non-coding regions is still lagging behind that of protein-coding genes, tremendous progress has been made in the field of gene regulation. One group of non-coding regulatory regions is enhancers, which can be located far upstream or downstream of genes and which can mediate temporal and tissue-specific transcriptional control via long-distance interactions with promoter regions. Although some examples exist in the literature that link alterations of enhancers to genetic disorders, a widespread appreciation of the putative roles of these sequences in MCDs is still lacking. Here, we summarize the current state of knowledge on cis-regulatory regions and discuss novel technologies such as massively parallel reporter assay systems, CRISPR-Cas9-based screens and computational approaches that help to further elucidate the emerging role of the non-coding genome in disease. Moreover, we discuss existing literature on mutations or copy number alterations of regulatory regions involved in brain development. We foresee that the future implementation of the knowledge obtained through ongoing gene regulation studies will benefit patients and will provide an explanation for part of the missing heritability of MCDs and other genetic disorders.
Affiliation(s)
- Tahsin Stefan Barakat
- Department of Clinical Genetics, Erasmus MC – University Medical Center, Rotterdam, Netherlands
|
30
|
Anderson AP, Jones AG. erefinder: Genome-wide detection of oestrogen response elements. Mol Ecol Resour 2019; 19:1366-1373. [PMID: 31177626 DOI: 10.1111/1755-0998.13046] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 10/25/2018] [Revised: 05/31/2019] [Accepted: 05/31/2019] [Indexed: 11/28/2022]
Abstract
Oestrogen response elements (EREs) are specific DNA sequences to which ligand-bound oestrogen receptors (ERs) physically bind, allowing them to act as transcription factors for target genes. Locating EREs and ER responsive regions is therefore a potentially important component of the study of oestrogen-regulated pathways. Here, we report the development of a novel software tool, erefinder, which conducts a genome-wide, sliding-window analysis of oestrogen receptor binding affinity. We demonstrate the effects of adjusting window size and highlight the program's general agreement with ChIP studies. We further provide two examples of how erefinder can be used for comparative approaches. erefinder can handle large input files, has settings to allow for broad and narrow searches, and provides the full output to allow for greater data manipulation. These features facilitate a wide range of hypothesis testing for researchers and make erefinder an excellent tool to aid in oestrogen-related research.
Affiliation(s)
- Andrew P Anderson
- Department of Biology, Texas A&M University, College Station, TX, USA
- Adam G Jones
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
|
31
|
Asma H, Halfon MS. Computational enhancer prediction: evaluation and improvements. BMC Bioinformatics 2019; 20:174. [PMID: 30953451 PMCID: PMC6451241 DOI: 10.1186/s12859-019-2781-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 02/15/2019] [Accepted: 03/27/2019] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND: Identifying transcriptional enhancers and other cis-regulatory modules (CRMs) is an important goal of post-sequencing genome annotation. Computational approaches provide a useful complement to empirical methods for CRM discovery, but it is critical that we develop effective means to evaluate their performance in terms of estimating their sensitivity and specificity. RESULTS: We introduce here pCRMeval, a pipeline for in silico evaluation of any enhancer prediction tools that are flexible enough to be applied to the Drosophila melanogaster genome. pCRMeval compares the result of predictions with the extensive existing knowledge of experimentally-validated Drosophila CRMs in order to estimate the precision and relative sensitivity of the prediction method. In the case of supervised prediction methods (when training data composed of validated CRMs are used), pCRMeval can also assess the sensitivity of specific training sets. We demonstrate the utility of pCRMeval through evaluation of our SCRMshaw CRM prediction method and training data. By measuring the impact of different parameters on SCRMshaw performance, as assessed by pCRMeval, we develop a more robust version of SCRMshaw, SCRMshaw_HD, that improves the number of predictions while maintaining sensitivity and specificity. Our analysis also demonstrates that SCRMshaw_HD, when applied to increasingly less well-assembled genomes, maintains its strong predictive power with only a minor drop-off in performance. CONCLUSION: Our pCRMeval pipeline provides a general framework for evaluation that can be applied to any CRM prediction method, particularly a supervised method. While we make use of it here primarily to test and improve a particular method for CRM prediction, SCRMshaw, pCRMeval should provide a valuable platform to the research community not only for evaluating individual methods, but also for comparing between competing methods.
Affiliation(s)
- Hasiba Asma
- Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, 701 Ellicott St, Buffalo, NY, 14203, USA
- Marc S Halfon
- Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, 701 Ellicott St, Buffalo, NY, 14203, USA
- Department of Biochemistry, University at Buffalo-State University of New York, 701 Ellicott St, Buffalo, NY, 14203, USA
- Department of Biological Sciences, University at Buffalo-State University of New York, 701 Ellicott St, Buffalo, NY, 14203, USA
- Department of Biomedical Informatics, University at Buffalo-State University of New York, 701 Ellicott St, Buffalo, NY, 14203, USA
- NY State Center of Excellence in Bioinformatics and Life Sciences, 701 Ellicott St, Buffalo, NY, 14203, USA
- Molecular and Cellular Biology Department and Program in Cancer Genetics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, 14263, USA
|
32
|
Cremer M, Cremer T. Nuclear compartmentalization, dynamics, and function of regulatory DNA sequences. Genes Chromosomes Cancer 2019; 58:427-436. [DOI: 10.1002/gcc.22714] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 08/24/2018] [Revised: 11/23/2018] [Accepted: 11/27/2018] [Indexed: 12/15/2022] Open
Affiliation(s)
- Marion Cremer
- Biocenter, Department Biology II, Ludwig Maximilians-Universität (LMU Munich), Munich, Germany
- Thomas Cremer
- Biocenter, Department Biology II, Ludwig Maximilians-Universität (LMU Munich), Munich, Germany
|
33
|
Alanni R, Hou J, Azzawi H, Xiang Y. A novel gene selection algorithm for cancer classification using microarray datasets. BMC Med Genomics 2019; 12:10. [PMID: 30646919 PMCID: PMC6334429 DOI: 10.1186/s12920-018-0447-6] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 09/27/2018] [Accepted: 12/07/2018] [Indexed: 12/18/2022] Open
Abstract
Background: Microarray datasets are an important medical diagnostic tool as they represent the states of a cell at the molecular level. Available microarray datasets for classifying cancer types generally have a fairly small sample size compared to the large number of genes involved. This is known as the curse of dimensionality, which is a challenging problem. Gene selection is a promising approach that addresses this problem and plays an important role in the development of efficient cancer classification, because only a small number of genes are related to the classification problem. Gene selection addresses many problems in microarray datasets, such as reducing the number of irrelevant and noisy genes and selecting the most relevant genes to improve the classification results. Methods: An innovative Gene Selection Programming (GSP) method is proposed to select relevant genes for effective and efficient cancer classification. GSP is based on the Gene Expression Programming (GEP) method with a newly defined population initialization algorithm, a new fitness function definition, and improved mutation and recombination operators. A Support Vector Machine (SVM) with a linear kernel serves as the classifier of the GSP. Results: Experimental results on ten microarray cancer datasets demonstrate that GSP is effective and efficient in eliminating irrelevant and redundant genes/features from microarray datasets. The comprehensive evaluations and comparisons with other methods show that GSP gives a better compromise in terms of all three evaluation criteria, i.e., classification accuracy, number of selected genes, and computational cost. The gene set selected by GSP showed superior performance in cancer classification compared to those selected by up-to-date representative gene selection methods. Conclusion: The gene subset selected by GSP can achieve a higher classification accuracy with less processing time.
Affiliation(s)
- Russul Alanni
- School of Information Technology, Deakin University, Burwood, 3125, VIC, Australia.
- Jingyu Hou
- School of Information Technology, Deakin University, Burwood, 3125, VIC, Australia
- Hasseeb Azzawi
- School of Information Technology, Deakin University, Burwood, 3125, VIC, Australia
- Yong Xiang
- School of Information Technology, Deakin University, Burwood, 3125, VIC, Australia
|
34
|
Abstract
Although sequenced insect genomes number in the hundreds, little is known about gene regulatory sequences in any species other than the well-studied Drosophila melanogaster. We provide here a detailed protocol for using SCRMshaw, a computational method for predicting cis-regulatory modules (CRMs, also "enhancers") in sequenced insect genomes. SCRMshaw is effective for CRM discovery throughout the range of holometabolous insects and potentially in even more diverged species, with true-positive prediction rates of 75% or better. Minimal requirements for using SCRMshaw are a genome sequence and training data in the form of known Drosophila CRMs; a comprehensive set of the latter can be obtained from the SCRMshaw download site. For basic applications, a user with only modest computational know-how can run SCRMshaw on a desktop computer. SCRMshaw can be run with a single, narrow set of training data to predict CRMs regulating a specific pattern of gene expression, or with multiple sets of training data covering a broad range of CRM activities to provide an initial rough regulatory annotation of a complete, newly-sequenced genome.
Affiliation(s)
- Majid Kazemian
- Departments of Biochemistry and Computer Science, Purdue University, West Lafayette, IN, USA.
| | - Marc S Halfon
- Departments of Biochemistry, Biomedical Informatics, and Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY, USA.
- NY State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, NY, USA.
- Department of Molecular and Cellular Biology and Program in Cancer Genetics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA.
| |
|
35
|
Halfon MS. Studying Transcriptional Enhancers: The Founder Fallacy, Validation Creep, and Other Biases. Trends Genet 2018; 35:93-103. [PMID: 30553552 DOI: 10.1016/j.tig.2018.11.004] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 09/25/2018] [Revised: 11/15/2018] [Accepted: 11/21/2018] [Indexed: 12/21/2022]
Abstract
Transcriptional enhancers play a major role in regulating metazoan gene expression. Recent developments in genomics and next-generation sequencing have accelerated and revitalized the study of this important class of sequence elements. Increased interest and attention, however, has also led to troubling trends in the enhancer literature. In this Opinion, I describe some of these issues and show how they arise from shifting and nonuniform enhancer definitions, and genome-era biases. I discuss how they can lead to interpretative errors and an unduly narrow focus on certain aspects of enhancer biology to the potential exclusion of others.
Affiliation(s)
- Marc S Halfon
- Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY, USA; NY State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, NY, USA; Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY, USA; Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY, USA; Department of Molecular and Cellular Biology and Program in Cancer Genetics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA.
| |
|
36
|
Kawakami H, Johnson A, Fujita Y, Swearer A, Wada N, Kawakami Y. Characterization of cis-regulatory elements for Fgf10 expression in the chick embryo. Dev Dyn 2018; 247:1253-1263. [PMID: 30325084 DOI: 10.1002/dvdy.24682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2018] [Revised: 09/28/2018] [Accepted: 10/11/2018] [Indexed: 11/11/2022] Open
Abstract
BACKGROUND Fgf10 is expressed in various tissues and organs, such as the limb bud, heart, inner ear, and head mesenchyme. Previous studies identified Fgf10 enhancers for the inner ear and heart. However, Fgf10 enhancers for other tissues have not been identified. RESULTS By using primary culture chick embryo lateral plate mesoderm cells, we compared activities of deletion constructs of the Fgf10 promoter region, cloned into a promoter-less luciferase reporter vector. We identified a 0.34-kb proximal promoter that can activate luciferase expression. Then, we cloned 11 evolutionarily conserved sequences located within or outside of the Fgf10 gene into the 0.34-kb promoter-luciferase vector, and tested their activities in vitro using primary cultured cells. Two sequences showed the highest activities. By using the Tol2 system and electroporation into chick embryos, activities of the 0.34-kb promoter with and without the two sequences were tested in vivo. No activities were detected in limb buds. However, the 0.34-kb promoter exhibited activities in the dorsal midline of the brain, while Fgf10 is detected in a broader region of the brain. The two noncoding sequences negatively acted on the 0.34-kb promoter in the brain. CONCLUSIONS The proximal 0.34-kb promoter has activities to drive expression in restricted areas of the brain. Developmental Dynamics 247:1253-1263, 2018. © 2018 Wiley Periodicals, Inc.
Affiliation(s)
- Hiroko Kawakami
- Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota; Stem Cell Institute, University of Minnesota, Minneapolis, Minnesota; Developmental Biology Center, University of Minnesota, Minneapolis, Minnesota
| | - Austin Johnson
- Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota
| | - Yu Fujita
- Department of Applied Biological Science, Tokyo University of Science, Noda, Chiba, Japan
| | - Avery Swearer
- Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota
| | - Naoyuki Wada
- Department of Applied Biological Science, Tokyo University of Science, Noda, Chiba, Japan
| | - Yasuhiko Kawakami
- Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota; Stem Cell Institute, University of Minnesota, Minneapolis, Minnesota; Developmental Biology Center, University of Minnesota, Minneapolis, Minnesota
| |
|
37
|
Wilding CS. Regulating resistance: CncC:Maf, antioxidant response elements and the overexpression of detoxification genes in insecticide resistance. Curr Opin Insect Sci 2018; 27:89-96. [PMID: 30025640 DOI: 10.1016/j.cois.2018.04.006] [Citation(s) in RCA: 93] [Impact Index Per Article: 13.3] [Received: 12/11/2017] [Revised: 03/12/2018] [Accepted: 04/09/2018] [Indexed: 05/24/2023]
Abstract
Although genetic and genomic tools have greatly furthered our understanding of resistance-associated mutations in molecular target sites of insecticides, the genomic basis of transcriptional regulation of detoxification loci in insect pests and vectors remains relatively unexplored. Recent work using RNAi, reporter assays and comparative genomics is beginning to reveal the molecular architecture of this response, identifying critical transcription factors and their binding sites. Central to this is the insect ortholog of the mammalian transcription factor Nrf2, Cap 'n' Collar isoform-C (CncC), which as a heterodimer with Maf-S regulates the transcription of phase I, II and III detoxification loci in a range of insects, with CncC knockdown or upregulation directly affecting phenotypic resistance. CncC:Maf binds to specific antioxidant response element sequences upstream of detoxification genes to initiate transcription. Recent work is now identifying these binding sites for resistance-associated loci and, coupled with genome sequence data and reporter assays, enabling identification of polymorphisms in the CncC:Maf binding site which regulate the insecticide resistance phenotype.
Affiliation(s)
- Craig S Wilding
- School of Natural Sciences and Psychology, Liverpool John Moores University, Liverpool L3 3AF, UK.
| |
|
38
|
Dang LT, Tondl M, Chiu MHH, Revote J, Paten B, Tano V, Tokolyi A, Besse F, Quaife-Ryan G, Cumming H, Drvodelic MJ, Eichenlaub MP, Hallab JC, Stolper JS, Rossello FJ, Bogoyevitch MA, Jans DA, Nim HT, Porrello ER, Hudson JE, Ramialison M. TrawlerWeb: an online de novo motif discovery tool for next-generation sequencing datasets. BMC Genomics 2018; 19:238. [PMID: 29621972 PMCID: PMC5887194 DOI: 10.1186/s12864-018-4630-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 01/30/2018] [Accepted: 03/27/2018] [Indexed: 12/14/2022] Open
Abstract
Background A strong focus of the post-genomic era is mining of the non-coding regulatory genome in order to unravel the function of regulatory elements that coordinate gene expression (Nat 489:57–74, 2012; Nat 507:462–70, 2014; Nat 507:455–61, 2014; Nat 518:317–30, 2015). Whole-genome approaches based on next-generation sequencing (NGS) have provided insight into the genomic location of regulatory elements throughout different cell types, organs and organisms. These technologies are now widespread and commonly used in laboratories from various fields of research. This highlights the need for fast and user-friendly software tools dedicated to extracting cis-regulatory information contained in these regulatory regions; for instance transcription factor binding site (TFBS) composition. Ideally, such tools should not require prior programming knowledge to ensure they are accessible for all users. Results We present TrawlerWeb, a web-based version of the Trawler_standalone tool (Nat Methods 4:563–5, 2007; Nat Protoc 5:323–34, 2010), to allow for the identification of enriched motifs in DNA sequences obtained from next-generation sequencing experiments in order to predict their TFBS composition. TrawlerWeb is designed for online queries with standard options common to web-based motif discovery tools. In addition, TrawlerWeb provides three unique new features: 1) TrawlerWeb allows the input of BED files directly generated from NGS experiments, 2) it automatically generates an input-matched biologically relevant background, and 3) it displays resulting conservation scores for each instance of the motif found in the input sequences, which assists the researcher in prioritising the motifs to validate experimentally. Finally, to date, this web-based version of Trawler_standalone remains the fastest online de novo motif discovery tool compared to other popular web-based software, while generating predictions with high accuracy. 
Conclusions TrawlerWeb provides users with a fast, simple and easy-to-use web interface for de novo motif discovery. This will assist in rapidly analysing NGS datasets that are now being routinely generated. TrawlerWeb is freely available and accessible at: http://trawler.erc.monash.edu.au.
Affiliation(s)
- Louis T Dang
- Australian Regenerative Medicine Institute, Systems Biology Institute Australia, Monash University, Clayton, VIC, Australia
| | - Markus Tondl
- Australian Regenerative Medicine Institute, Systems Biology Institute Australia, Monash University, Clayton, VIC, Australia
| | - Man Ho H Chiu
- Australian Regenerative Medicine Institute, Systems Biology Institute Australia, Monash University, Clayton, VIC, Australia
| | - Jerico Revote
- eResearch, Monash University, Clayton, VIC, Australia
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Vincent Tano
- Department of Biochemistry and Molecular Biology, Bio21 Institute and Cell Signalling Research Laboratories, The University of Melbourne, Melbourne, VIC, Australia
| | - Alex Tokolyi
- Australian Regenerative Medicine Institute, Systems Biology Institute Australia, Monash University, Clayton, VIC, Australia
| | - Florence Besse
- CNRS, Inserm, Institute of Biology Valrose, Université Côte d'Azur, Parc Valrose, Nice, France
| | - Greg Quaife-Ryan
- School of Biomedical Sciences, The University of Queensland, QLD, Brisbane, Australia
| | - Helen Cumming
- Centre for Innate Immunity and Infectious Diseases, Hudson Institute of Medical Research, Monash University, Clayton, VIC, Australia
| | - Mark J Drvodelic
- Australian Regenerative Medicine Institute, Systems Biology Institute Australia, Monash University, Clayton, VIC, Australia
| | - Michael P Eichenlaub
- Australian Regenerative Medicine Institute, Systems Biology Institute Australia, Monash University, Clayton, VIC, Australia
| | - Jeannette C Hallab
- Australian Regenerative Medicine Institute, Systems Biology Institute Australia, Monash University, Clayton, VIC, Australia
| | - Julian S Stolper
- Australian Regenerative Medicine Institute, Systems Biology Institute Australia, Monash University, Clayton, VIC, Australia
| | - Fernando J Rossello
- Australian Regenerative Medicine Institute, Systems Biology Institute Australia, Monash University, Clayton, VIC, Australia
| | - Marie A Bogoyevitch
- Department of Biochemistry and Molecular Biology, Bio21 Institute and Cell Signalling Research Laboratories, The University of Melbourne, Melbourne, VIC, Australia
| | - David A Jans
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC, Australia
| | - Hieu T Nim
- Australian Regenerative Medicine Institute, Systems Biology Institute Australia, Monash University, Clayton, VIC, Australia; Faculty of Information Technology, Monash University, Clayton, VIC, Australia
| | - Enzo R Porrello
- Murdoch Children's Research Institute, The Royal Children's Hospital, Parkville, VIC, Australia; Department of Physiology, School of Biomedical Sciences, The University of Melbourne, Parkville, VIC, Australia
| | - James E Hudson
- School of Biomedical Sciences, The University of Queensland, QLD, Brisbane, Australia
| | - Mirana Ramialison
- Australian Regenerative Medicine Institute, Systems Biology Institute Australia, Monash University, Clayton, VIC, Australia.
| |
|
39
|
Lai YT, Deem KD, Borràs-Castells F, Sambrani N, Rudolf H, Suryamohan K, El-Sherif E, Halfon MS, McKay DJ, Tomoyasu Y. Enhancer identification and activity evaluation in the red flour beetle, Tribolium castaneum. Development 2018; 145:dev160663. [PMID: 29540499 PMCID: PMC11736658 DOI: 10.1242/dev.160663] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 10/24/2017] [Accepted: 03/09/2018] [Indexed: 12/13/2022]
Abstract
Evolution of cis-regulatory elements (such as enhancers) plays an important role in the production of diverse morphology. However, a mechanistic understanding is often limited by the absence of methods for studying enhancers in species other than established model systems. Here, we sought to establish methods to identify and test enhancer activity in the red flour beetle, Tribolium castaneum. To identify possible enhancer regions, we first obtained genome-wide chromatin profiles from various tissues and stages of Tribolium using FAIRE (formaldehyde-assisted isolation of regulatory elements)-sequencing. Comparison of these profiles revealed a distinct set of open chromatin regions in each tissue and at each stage. In addition, comparison of the FAIRE data with sets of computationally predicted (i.e. supervised cis-regulatory module-predicted) enhancers revealed a very high overlap between the two datasets. Second, using nubbin in the wing and hunchback in the embryo as case studies, we established the first universal reporter assay system that works in various contexts in Tribolium, and in a cross-species context. Together, these advances will facilitate investigation of cis-evolution and morphological diversity in Tribolium and other insects.
Affiliation(s)
- Yi-Ting Lai
- Department of Biology, Miami University, Oxford, OH 45056, USA
| | - Kevin D Deem
- Department of Biology, Miami University, Oxford, OH 45056, USA
| | | | - Nagraj Sambrani
- Department of Biology, Miami University, Oxford, OH 45056, USA
| | - Heike Rudolf
- Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen 91058, Germany
| | - Kushal Suryamohan
- Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14214, USA
| | - Ezzat El-Sherif
- Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen 91058, Germany
| | - Marc S Halfon
- Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14214, USA
| | - Daniel J McKay
- Department of Biology, Department of Genetics, Integrative Program for Biological and Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | | |
|
40
|
A New Algorithm for Identifying Cis-Regulatory Modules Based on Hidden Markov Model. Biomed Res Int 2018; 2017:6274513. [PMID: 28497059 PMCID: PMC5405574 DOI: 10.1155/2017/6274513] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/23/2016] [Revised: 03/06/2017] [Accepted: 03/23/2017] [Indexed: 11/24/2022]
Abstract
The discovery of cis-regulatory modules (CRMs) is the key to understanding mechanisms of transcription regulation. Since CRMs have specific regulatory structures that are the basis for the regulation of gene expression, how to model the regulatory structure of CRMs has a considerable impact on the performance of CRM identification. The paper proposes a CRM discovery algorithm called ComSPS. ComSPS builds a regulatory structure model of CRMs based on HMM by exploring the rules of CRM transcriptional grammar that governs the internal motif site arrangement of CRMs. We test ComSPS on three benchmark datasets and compare it with five existing methods. Experimental results show that ComSPS performs better than them.
|
41
|
Cumbo F, Vergni D, Santoni D. Investigating transcription factor synergism in humans. DNA Res 2017; 25:103-112. [PMID: 29069301 PMCID: PMC5824945 DOI: 10.1093/dnares/dsx041] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 02/22/2017] [Accepted: 09/19/2017] [Indexed: 01/13/2023] Open
Abstract
Proteins are the core and the engine of every process in cells; thus the study of the mechanisms that drive the regulation of protein expression is essential. Transcription factors play a central role in this extremely complex task, and they cooperate synergistically to fine-tune protein expression. In the present study, we designed a mathematically well-founded procedure to investigate the mutual positioning of the transcription factor binding sites related to a given couple of transcription factors in order to evaluate the possible association between them. We obtained a list of highly related transcription factor couples, whose binding site occurrences significantly group together for a given set of gene promoters, identifying the biological contexts in which the couples are involved and the processes they should contribute to regulating.
Affiliation(s)
- Fabio Cumbo
- Institute for Systems Analysis and Computer Science 'Antonio Ruberti', National Research Council of Italy, 00185 Rome, Italy; Department of Engineering, Third University of Rome, 00146 Rome, Italy; SYSBIO.IT-Centre of Systems Biology, 20126 Milan, Italy
| | - Davide Vergni
- Institute for Applied Mathematics 'Mauro Picone', National Research Council of Italy, 00185 Rome, Italy
| | - Daniele Santoni
- Institute for Systems Analysis and Computer Science 'Antonio Ruberti', National Research Council of Italy, 00185 Rome, Italy
| |
|
42
|
Aschenbrenner AC, Bassler K, Brondolin M, Bonaguro L, Carrera P, Klee K, Ulas T, Schultze JL, Hoch M. A cross-species approach to identify transcriptional regulators exemplified for Dnajc22 and Hnf4a. Sci Rep 2017; 7:4056. [PMID: 28642491 PMCID: PMC5481429 DOI: 10.1038/s41598-017-04370-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/16/2016] [Accepted: 05/05/2017] [Indexed: 12/03/2022] Open
Abstract
There is an enormous need to make better use of the ever increasing wealth of publicly available genomic information and to utilize the tremendous progress in computational approaches in the life sciences. Transcriptional regulation of protein-coding genes is a major mechanism of controlling cellular functions. However, the myriad of transcription factors potentially controlling transcription of any given gene often makes it difficult to quickly identify the biologically relevant transcription factors. Here, we report on the identification of Hnf4a as a major transcription factor of the so far unstudied DnaJ heat shock protein family (Hsp40) member C22 (Dnajc22). We propose an approach utilizing recent advances in computational biology and the wealth of publicly available genomic information guiding the identification of potential transcription factor candidates together with wet-lab experiments validating computational models. More specifically, the combined use of co-expression analyses based on self-organizing maps with sequence-based transcription factor binding prediction led to the identification of Hnf4a as the potential transcriptional regulator for Dnajc22, which was further corroborated using publicly available datasets on Hnf4a. Following this procedure, we determined its functional binding site in the murine Dnajc22 locus using ChIP-qPCR and luciferase assays and verified this regulatory loop in fruitfly, zebrafish, and humans.
Affiliation(s)
- A C Aschenbrenner
- Developmental Genetics & Molecular Physiology, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany.
| | - K Bassler
- Genomics and Immunoregulation, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
| | - M Brondolin
- Developmental Genetics & Molecular Physiology, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
- Department of Craniofacial Development and Stem Cell Biology, Dental Institute, King's College London, SE1 9RT, London, United Kingdom
| | - L Bonaguro
- Developmental Genetics & Molecular Physiology, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
| | - P Carrera
- Developmental Genetics & Molecular Physiology, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
| | - K Klee
- Genomics and Immunoregulation, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
| | - T Ulas
- Genomics and Immunoregulation, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
| | - J L Schultze
- Genomics and Immunoregulation, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
- Single Cell Genomics and Epigenomics Unit at the German Center for Neurodegenerative Diseases and the University of Bonn, 53175, Bonn, Germany
| | - M Hoch
- Developmental Genetics & Molecular Physiology, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
| |
|
43
|
Perspectives on Gene Regulatory Network Evolution. Trends Genet 2017; 33:436-447. [PMID: 28528721 DOI: 10.1016/j.tig.2017.04.005] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Received: 03/07/2017] [Revised: 04/24/2017] [Accepted: 04/25/2017] [Indexed: 11/23/2022]
Abstract
Animal development proceeds through the activity of genes and their cis-regulatory modules (CRMs) working together in sets of gene regulatory networks (GRNs). The emergence of species-specific traits and novel structures results from evolutionary changes in GRNs. Recent work in a wide variety of animal models, and particularly in insects, has started to reveal the modes and mechanisms of GRN evolution. I discuss here various aspects of GRN evolution and argue that developmental system drift (DSD), in which conserved phenotype is nevertheless a result of changed genetic interactions, should regularly be viewed from the perspective of GRN evolution. Advances in methods to discover related CRMs in diverse insect species, a critical requirement for detailed GRN characterization, are also described.
|
44
|
Shahmuradov IA, Umarov RK, Solovyev VV. TSSPlant: a new tool for prediction of plant Pol II promoters. Nucleic Acids Res 2017; 45:e65. [PMID: 28082394 PMCID: PMC5416875 DOI: 10.1093/nar/gkw1353] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 09/18/2016] [Revised: 12/16/2016] [Accepted: 12/27/2016] [Indexed: 11/22/2022] Open
Abstract
Our current knowledge of eukaryotic promoters indicates a complex architecture that is often composed of numerous functional motifs. Most known promoters include multiple and in some cases mutually exclusive transcription start sites (TSSs). Moreover, TSS selection depends on cell/tissue, development stage and environmental conditions. Such complex promoter structures make their computational identification notoriously difficult. Here, we present TSSPlant, a novel tool that predicts both TATA and TATA-less promoters in sequences of a wide spectrum of plant genomes. The tool was developed by using large promoter collections from ppdb and PlantProm DB. It utilizes eighteen significant compositional and signal features of plant promoter sequences selected in this study, which feed an artificial neural network-based model trained by the backpropagation algorithm. TSSPlant achieves significantly higher accuracy compared to the next best promoter prediction program for both TATA promoters (MCC≃0.84 and F1-score≃0.91 versus MCC≃0.51 and F1-score≃0.71) and TATA-less promoters (MCC≃0.80, F1-score≃0.89 versus MCC≃0.29 and F1-score≃0.50). TSSPlant is available to download as a standalone program at http://www.cbrc.kaust.edu.sa/download/.
Affiliation(s)
- Ilham A. Shahmuradov
- King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
- Institute of Molecular Biology and Biotechnologies, ANAS, 2 Matbuat strasse, Baku AZ1073, Azerbaijan
| | - Ramzan Kh. Umarov
- King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
| | | |
|
45
|
Hope CM, Rebay I, Reinitz J. DNA Occupancy of Polymerizing Transcription Factors: A Chemical Model of the ETS Family Factor Yan. Biophys J 2017; 112:180-192. [PMID: 28076810 PMCID: PMC5232354 DOI: 10.1016/j.bpj.2016.11.901] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 07/20/2016] [Revised: 10/18/2016] [Accepted: 11/11/2016] [Indexed: 11/28/2022] Open
Abstract
Transcription factors use both protein-DNA and protein-protein interactions to assemble appropriate complexes to regulate gene expression. Although most transcription factors operate as monomers or dimers, a few, including the E26 transformation-specific family repressors Drosophila melanogaster Yan and its human homolog TEL/ETV6, can polymerize. Although polymerization is required for both the normal and oncogenic function of Yan and TEL/ETV6, the mechanisms by which it influences the recruitment, organization, and stability of transcriptional complexes remain poorly understood. Further, a quantitative description of the DNA occupancy of a polymerizing transcription factor is lacking, and such a description would have broader applications to the conceptually related area of polymerizing chromatin regulators. To expand the theoretical basis for understanding how the oligomeric state of a transcriptional regulator influences its chromatin occupancy and function, we leveraged the extensive biochemical characterization of E26 transformation-specific factors to develop a mathematical model of Yan occupancy at chemical equilibrium. We find that spreading condensation from a specific binding site can take place in a path-independent manner given reasonable values of the free energies of specific and non-specific DNA binding and protein-protein cooperativity. Our calculations show that polymerization confers upon a transcription factor the unique ability to extend occupancy across DNA regions far from specific binding sites. In contrast, dimerization promotes recruitment to clustered binding sites and maximizes discrimination between specific and non-specific sites. We speculate that the association with non-specific DNA afforded by polymerization may enable regulatory behaviors that are well-suited to transcriptional repressors but perhaps incompatible with precise activation.
Affiliation(s)
- C Matthew Hope
- Department of Biochemistry and Molecular Biophysics, The University of Chicago, Chicago, Illinois
| | - Ilaria Rebay
- Department of Molecular Genetics and Cell Biology, The University of Chicago, Chicago, Illinois; Ben May Department for Cancer Research, The University of Chicago, Chicago, Illinois.
| | - John Reinitz
- Department of Molecular Genetics and Cell Biology, The University of Chicago, Chicago, Illinois; Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois; Department of Statistics, The University of Chicago, Chicago, Illinois.
| |
|
46
|
Buffry AD, Mendes CC, McGregor AP. The Functionality and Evolution of Eukaryotic Transcriptional Enhancers. Adv Genet 2016; 96:143-206. [PMID: 27968730 DOI: 10.1016/bs.adgen.2016.08.004] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 12/11/2022]
Abstract
Enhancers regulate precise spatial and temporal patterns of gene expression in eukaryotes and, moreover, evolutionary changes in these modular cis-regulatory elements may represent the predominant genetic basis for phenotypic evolution. Here, we review approaches to identify and functionally analyze enhancers and their transcription factor binding sites, including assay for transposable-accessible chromatin-sequencing (ATAC-Seq) and clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9, respectively. We also explore enhancer functionality, including how transcription factor binding sites combine to regulate transcription, as well as research on shadow and super enhancers, and how enhancers can act over great distances and even in trans. Finally, we discuss recent theoretical and empirical data on how transcription factor binding sites and enhancers evolve. This includes how the function of enhancers is maintained despite the turnover of transcription factor binding sites as well as reviewing studies where mutations in enhancers have been shown to underlie morphological change.
Affiliation(s)
- A D Buffry
- Oxford Brookes University, Oxford, United Kingdom
| | - C C Mendes
- Oxford Brookes University, Oxford, United Kingdom
| | - A P McGregor
- Oxford Brookes University, Oxford, United Kingdom
| |
|
47
|
Guo H, Huo H, Yu Q. SMCis: An Effective Algorithm for Discovery of Cis-Regulatory Modules. PLoS One 2016; 11:e0162968. [PMID: 27637070 PMCID: PMC5026350 DOI: 10.1371/journal.pone.0162968] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/17/2016] [Accepted: 08/31/2016] [Indexed: 12/02/2022] Open
Abstract
The discovery of cis-regulatory modules (CRMs) is a challenging problem in computational biology. Limited by the difficulty of using an HMM to model dependent features in transcriptional regulatory sequences (TRSs), the probabilistic modeling methods based on HMMs cannot accurately represent the distance between regulatory elements in TRSs and are cumbersome to model the prevailing dependencies between motifs within CRMs. We propose a probabilistic modeling algorithm called SMCis, which builds a more powerful CRM discovery model based on a hidden semi-Markov model. Our model characterizes the regulatory structure of CRMs and effectively models dependencies between motifs at a higher level of abstraction based on segments rather than nucleotides. Experimental results on three benchmark datasets indicate that our method performs better than the compared algorithms.
Affiliation(s)
- Haitao Guo
  - School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi, China
- Hongwei Huo (corresponding author)
  - School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi, China
- Qiang Yu
  - School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi, China
|
48
|
Streamlined scanning for enhancer elements in Drosophila melanogaster. Biotechniques 2016; 60:141-4. [PMID: 26956092 DOI: 10.2144/000114391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Key Words] [Received: 09/16/2015] [Accepted: 12/09/2015] [Indexed: 11/23/2022] Open
Abstract
Enhancer elements in most eukaryotic organisms are often positioned at a great distance away from the transcription start site of the gene they regulate. Complex three-dimensional chromatin organization and insulators usually guide and limit the range of an enhancer's regulatory activity to a specific genetic locus. Rigorous testing of an entire genomic locus is often required in order to uncover the complete set of cis-regulatory modules (CRMs) regulating a gene, especially those with complex and dynamic expression patterns. Here we report a fast and efficient method for enhancer element identification by scanning large genomic regions using transgenic reporter genes.
|
49
|
Salas EN, Shu J, Cserhati MF, Weeks DP, Ladunga I. Pluralistic and stochastic gene regulation: examples, models and consistent theory. Nucleic Acids Res 2016; 44:4595-609. [PMID: 26823500 PMCID: PMC4889914 DOI: 10.1093/nar/gkw042] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/03/2015] [Accepted: 01/12/2016] [Indexed: 12/17/2022] Open
Abstract
We present a theory of pluralistic and stochastic gene regulation. To bridge the gap between empirical studies and mathematical models, we integrate pre-existing observations with our meta-analyses of the ENCODE ChIP-Seq experiments. Earlier evidence includes fluctuations in levels, location, activity, and binding of transcription factors, variable DNA motifs, and bursts in gene expression. Stochastic regulation is also indicated by the frequently subdued effects of knockout mutants of regulators, their evolutionary losses/gains and massive rewiring of regulatory sites. We report widespread pluralistic regulation in ≈800 000 tightly co-expressed pairs of diverse human genes. Typically, half of ≈50 observed regulators bind to both genes reproducibly, twice as many as in independently expressed gene pairs. We also examine the largest set of co-expressed genes, which code for cytoplasmic ribosomal proteins. Numerous regulatory complexes are highly significantly enriched in ribosomal genes compared to highly expressed non-ribosomal genes. We could not find any DNA-associated, strict-sense master regulator. Despite major fluctuations in transcription factor binding, our machine learning model accurately predicted transcript levels using binding sites of 20+ regulators. Our pluralistic and stochastic theory is consistent with partially random binding patterns, redundancy, stochastic regulator binding, burst-like expression, degeneracy of binding motifs and massive regulatory rewiring during evolution.
Affiliation(s)
- Elisa N Salas
  - Department of Statistics, University of Nebraska, Lincoln, NE 68583-0963, USA
  - Department of Biochemistry, University of Nebraska, Lincoln, NE 68588-0665, USA
- Jiang Shu
  - Department of Statistics, University of Nebraska, Lincoln, NE 68583-0963, USA
- Matyas F Cserhati
  - Department of Statistics, University of Nebraska, Lincoln, NE 68583-0963, USA
- Donald P Weeks
  - Department of Biochemistry, University of Nebraska, Lincoln, NE 68588-0665, USA
- Istvan Ladunga
  - Department of Statistics, University of Nebraska, Lincoln, NE 68583-0963, USA
  - Department of Biochemistry, University of Nebraska, Lincoln, NE 68588-0665, USA
|
50
|
Gurdziel K, Lorberbaum DS, Udager AM, Song JY, Richards N, Parker DS, Johnson LA, Allen BL, Barolo S, Gumucio DL. Identification and Validation of Novel Hedgehog-Responsive Enhancers Predicted by Computational Analysis of Ci/Gli Binding Site Density. PLoS One 2015; 10:e0145225. [PMID: 26710299 PMCID: PMC4692483 DOI: 10.1371/journal.pone.0145225] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 05/01/2015] [Accepted: 12/01/2015] [Indexed: 01/20/2023] Open
Abstract
The Hedgehog (Hh) signaling pathway directs a multitude of cellular responses during embryogenesis and adult tissue homeostasis. Stimulation of the pathway results in activation of Hh target genes by the transcription factor Ci/Gli, which binds to specific motifs in genomic enhancers. In Drosophila, only a few enhancers (patched, decapentaplegic, wingless, stripe, knot, hairy, orthodenticle) have been shown by in vivo functional assays to depend on direct Ci/Gli regulation. All but one (orthodenticle) contain more than one Ci/Gli site, prompting us to directly test whether homotypic clustering of Ci/Gli binding sites is sufficient to define a Hh-regulated enhancer. We therefore developed a computational algorithm to identify Ci/Gli clusters that are enriched over random expectation, within a given region of the genome. Candidate genomic regions containing Ci/Gli clusters were functionally tested in chicken neural tube electroporation assays and in transgenic flies. Of the 22 Ci/Gli clusters tested, seven novel enhancers (and the previously known patched enhancer) were identified as Hh-responsive and Ci/Gli-dependent in one or both of these assays, including: Cuticular protein 100A (Cpr100A); invected (inv), which encodes an engrailed-related transcription factor expressed at the anterior/posterior wing disc boundary; roadkill (rdx), the fly homolog of vertebrate Spop; the segment polarity gene gooseberry (gsb); and two previously untested regions of the Hh receptor-encoding patched (ptc) gene. We conclude that homotypic Ci/Gli clustering is not sufficient information to ensure Hh-responsiveness; however, it can provide a clue for enhancer recognition within putative Hedgehog target gene loci.
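As a rough illustration of the cluster-scanning idea described in this abstract, the Python sketch below counts exact matches to a binding-site consensus on both strands and reports windows holding more sites than a simple random background would predict. It is not the authors' algorithm: the consensus `GACCACCCA`, the 500 bp window, the uniform-base Poisson background, the significance cutoff, and all function names are assumptions chosen for illustration.

```python
import math
import re

SITE = "GACCACCCA"  # assumed consensus, used here for illustration only
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def site_positions(seq, site=SITE):
    """Start positions of exact site matches on either strand."""
    # The lookahead lets overlapping matches be reported as well.
    pattern = "(?=(%s|%s))" % (site, revcomp(site))
    return [m.start() for m in re.finditer(pattern, seq.upper())]


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - below


def find_clusters(seq, window=500, min_sites=2, alpha=0.01, site=SITE):
    """Return (start, end, n_sites, p_value) for windows enriched in sites.

    Background assumption: all four bases are equally likely, so the
    expected number of sites (both strands) in a window is
    window * 2 * 0.25 ** len(site).
    """
    positions = site_positions(seq, site)
    lam = window * 2 * 0.25 ** len(site)
    clusters = []
    last_member = -1
    for i, start in enumerate(positions):
        members = [p for p in positions[i:] if p < start + window]
        # Skip sparse windows and windows already covered by a reported cluster.
        if len(members) < min_sites or members[-1] <= last_member:
            continue
        p_value = poisson_tail(len(members), lam)
        if p_value < alpha:
            clusters.append((start, members[-1] + len(site), len(members), p_value))
            last_member = members[-1]
    return clusters


if __name__ == "__main__":
    # Toy sequence: two sites 50 bp apart, then an isolated third site.
    demo = "A" * 100 + SITE + "A" * 50 + revcomp(SITE) + "A" * 2000 + SITE + "A" * 100
    for start, end, n, p in find_clusters(demo):
        print(start, end, n, p)
```

A realistic scan would replace the exact-match consensus with a position weight matrix and estimate the background from the genome being scanned. As the abstract itself concludes, site clustering alone is a clue rather than proof of Hh-responsiveness, so any hits from a scan like this would still need functional testing.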
Affiliation(s)
- Katherine Gurdziel
  - Department of Cell and Developmental Biology, The University of Michigan, Ann Arbor, MI 48109, United States of America
  - Department of Computational Medicine and Bioinformatics, The University of Michigan, Ann Arbor, MI 48109, United States of America
- David S. Lorberbaum
  - Department of Cell and Developmental Biology, The University of Michigan, Ann Arbor, MI 48109, United States of America
  - Cellular and Molecular Biology Program, The University of Michigan, Ann Arbor, MI 48109, United States of America
- Aaron M. Udager
  - Department of Cell and Developmental Biology, The University of Michigan, Ann Arbor, MI 48109, United States of America
- Jane Y. Song
  - Department of Cell and Developmental Biology, The University of Michigan, Ann Arbor, MI 48109, United States of America
  - Cellular and Molecular Biology Program, The University of Michigan, Ann Arbor, MI 48109, United States of America
- Neil Richards
  - Department of Cell and Developmental Biology, The University of Michigan, Ann Arbor, MI 48109, United States of America
- David S. Parker
  - Department of Cell and Developmental Biology, The University of Michigan, Ann Arbor, MI 48109, United States of America
- Lisa A. Johnson
  - Department of Cell and Developmental Biology, The University of Michigan, Ann Arbor, MI 48109, United States of America
- Benjamin L. Allen (corresponding author)
  - Department of Cell and Developmental Biology, The University of Michigan, Ann Arbor, MI 48109, United States of America
- Scott Barolo (corresponding author)
  - Department of Cell and Developmental Biology, The University of Michigan, Ann Arbor, MI 48109, United States of America
- Deborah L. Gumucio (corresponding author)
  - Department of Cell and Developmental Biology, The University of Michigan, Ann Arbor, MI 48109, United States of America
|