1
Zeitlinger J. Seven myths of how transcription factors read the cis-regulatory code. Current Opinion in Systems Biology 2020; 23:22-31. [PMID: 33134611 PMCID: PMC7592701 DOI: 10.1016/j.coisb.2020.08.002] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Indexed: 02/07/2023]
Abstract
Genomics data are now being generated in large quantities, at exquisitely high resolution and from single cells. They offer a unique opportunity to develop powerful machine learning algorithms, including neural networks, to uncover the rules of the cis-regulatory code. However, current modeling assumptions are often not based on state-of-the-art knowledge of the cis-regulatory code from transcription, developmental genetics, imaging and structural studies. Here I aim to fill this gap by giving a brief historical overview of the field, describing common misconceptions and providing knowledge that might help to guide computational approaches. I will describe the principles and mechanisms involved in the combinatorial requirement of transcription factor binding motifs for enhancer activity, including the role of chromatin accessibility, repressors and low-affinity motifs in the cis-regulatory code. Deciphering the cis-regulatory code would unlock an enormous amount of regulatory information in the genome and would allow us to locate cis-regulatory genetic variants involved in development and disease.
Affiliation(s)
- Julia Zeitlinger
- Stowers Institute for Medical Research, Kansas City, MO, USA
- The University of Kansas Medical Center, Kansas City, KS, USA
2
Tomoyasu Y, Halfon MS. How to study enhancers in non-traditional insect models. J Exp Biol 2020; 223:223/Suppl_1/jeb212241. [PMID: 32034049 DOI: 10.1242/jeb.212241] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
Abstract
Transcriptional enhancers are central to the function and evolution of genes and gene regulation. At the organismal level, enhancers play a crucial role in coordinating tissue- and context-dependent gene expression. At the population level, changes in enhancers are thought to be a major driving force that facilitates evolution of diverse traits. An amazing array of diverse traits seen in insect morphology, physiology and behavior has been the subject of research for centuries. Although enhancer studies in insects outside of Drosophila have been limited, recent advances in functional genomic approaches have begun to make such studies possible in an increasing selection of insect species. Here, instead of comprehensively reviewing currently available technologies for enhancer studies in established model organisms such as Drosophila, we focus on a subset of computational and experimental approaches that are likely applicable to non-Drosophila insects, and discuss the pros and cons of each approach. We discuss the importance of validating enhancer function and evaluate several possible validation methods, such as reporter assays and genome editing. Key points and potential pitfalls when establishing a reporter assay system in non-traditional insect models are also discussed. We close with a discussion of how to advance enhancer studies in insects, both by improving computational approaches and by expanding the genetic toolbox in various insects. Through these discussions, this Review provides a conceptual framework for studying the function and evolution of enhancers in non-traditional insect models.
Affiliation(s)
- Marc S Halfon
- Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
3
Saul MC, Blatti C, Yang W, Bukhari SA, Shpigler HY, Troy JM, Seward CH, Sloofman L, Chandrasekaran S, Bell AM, Stubbs L, Robinson GE, Zhao SD, Sinha S. Cross-species systems analysis of evolutionary toolkits of neurogenomic response to social challenge. Genes Brain and Behavior 2018; 18:e12502. [PMID: 29968347 DOI: 10.1111/gbb.12502] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 04/18/2018] [Revised: 06/18/2018] [Accepted: 06/20/2018] [Indexed: 12/15/2022]
Abstract
Social challenges like territorial intrusions evoke behavioral responses in widely diverging species. Recent work has shown that evolutionary "toolkits" (genes and modules with lineage-specific variations but deep conservation of function) participate in the behavioral response to social challenge. Here, we develop a multispecies computational-experimental approach to characterize such a toolkit at a systems level. Brain transcriptomic responses to social challenge were probed via RNA-seq profiling in three diverged species (honey bees, mice and three-spined stickleback fish) following a common methodology, allowing fair comparisons across species. Data were collected from multiple brain regions and multiple time points after social challenge exposure, achieving anatomical and temporal resolution substantially greater than previous work. We developed statistically rigorous analyses equipped to find homologous functional groups among these species at the levels of individual genes, functional and coexpressed gene modules, and transcription factor subnetworks. We identified six orthogroups involved in response to social challenge, including groups represented by mouse genes Npas4 and Nr4a1, as well as common modulation of systems such as transcriptional regulators, ion channels, G-protein-coupled receptors and synaptic proteins. We also identified conserved coexpression modules enriched for mitochondrial fatty acid metabolism and heat shock that constitute the shared neurogenomic response. Our analysis suggests a toolkit wherein nuclear receptors, interacting with chaperones, induce transcriptional changes in mitochondrial activity, neural cytoarchitecture and synaptic transmission after social challenge. It shows systems-level mechanisms that have been repeatedly co-opted during evolution of analogous behaviors, thus advancing the genetic toolkit concept beyond individual genes.
Affiliation(s)
- Michael C Saul
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Charles Blatti
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Wei Yang
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Syed A Bukhari
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Interdisciplinary Informatics Program, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Hagai Y Shpigler
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Department of Ecology, Evolution and Behavior, Hebrew University, Jerusalem, Israel
- Joseph M Troy
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Interdisciplinary Informatics Program, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Christopher H Seward
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Department of Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Laura Sloofman
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Genetics and Genomic Sciences, Mount Sinai Health System, New York, New York
- Alison M Bell
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Interdisciplinary Informatics Program, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Department of Animal Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Neuroscience Program, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Lisa Stubbs
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Interdisciplinary Informatics Program, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Department of Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Neuroscience Program, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Gene E Robinson
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Neuroscience Program, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Sihai D Zhao
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Department of Statistics, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Saurabh Sinha
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, Illinois
4
Shen L, Liu G, Zou Y, Zhou Z, Su Z, Gu X. The evolutionary panorama of organ-specifically expressed or repressed orthologous genes in nine vertebrate species. PLoS One 2015; 10:e0116872. [PMID: 25679776 PMCID: PMC4332667 DOI: 10.1371/journal.pone.0116872] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/29/2014] [Accepted: 12/15/2014] [Indexed: 12/29/2022]
Abstract
RNA sequencing (RNA-Seq) technology provides detailed transcriptomic information for a biological sample. Using the RNA-Seq data of six organs from nine vertebrate species, we identified a number of organ-specifically expressed or repressed orthologous genes whose expression patterns are mostly conserved across the nine species. Our analyses show the following results: (i) About 80% of these genes have a chordate or more ancient origin and more than half of them are the legacy of one or multiple rounds of large-scale gene duplication events. (ii) Their evolutionary rates are shaped by the organ in which they are expressed or repressed, e.g. the genes specifically expressed in testis and liver generally evolve more than twice as fast as the ones specifically expressed in brain and cerebellum. The organ-specific transcription factors were discriminated from these genes. The ChIP-seq data from the ENCODE project also revealed the transcription-related factors that might be involved in regulating human organ-specifically expressed or repressed genes. Some of them are shared by all six human organs. The comparison of ENCODE data with mouse/chicken ChIP-seq data suggests that organ-specifically expressed or repressed orthologous genes are regulated in various combinatorial fashions in different species, although their expression features are conserved among these species. We found that the duplication events in some gene families might help explain the quick organ/tissue divergence in the vertebrate lineage. The phylogenetic analysis of testis-specifically expressed genes suggests that some of them are prone to develop new functions for other organs/tissues.
Affiliation(s)
- Libing Shen
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, PR China
- Gangbiao Liu
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, PR China
- Yangyun Zou
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, PR China
- Zhan Zhou
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, PR China
- Zhixi Su
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, PR China
- Xun Gu
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, PR China
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, Iowa, United States of America
5
Meng Y, Shao C, Chen M. Toward microRNA-mediated gene regulatory networks in plants. Brief Bioinform 2011; 12:645-59. [DOI: 10.1093/bib/bbq091] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.6] [Indexed: 01/24/2023]
6
Exact p-value calculation for heterotypic clusters of regulatory motifs and its application in computational annotation of cis-regulatory modules. Algorithms Mol Biol 2007; 2:13. [PMID: 17927813 PMCID: PMC2174486 DOI: 10.1186/1748-7188-2-13] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.4] [Received: 07/13/2007] [Accepted: 10/10/2007] [Indexed: 11/15/2022]
Abstract
BACKGROUND cis-Regulatory modules (CRMs) of eukaryotic genes often contain multiple binding sites for transcription factors. The phenomenon that binding sites form clusters in CRMs is exploited in many algorithms to locate CRMs in a genome. This gives rise to the problem of calculating the statistical significance of the event that multiple sites, recognized by different factors, would be found simultaneously in a text of a fixed length. The main difficulty comes from overlapping occurrences of motifs. So far, no tools have been developed allowing the computation of p-values for simultaneous occurrences of different motifs which can overlap. RESULTS We developed and implemented an algorithm computing the p-value that $s$ different motifs occur respectively $k_1, \ldots, k_s$ or more times, possibly overlapping, in a random text. Motifs can be represented with a majority of popular motif models, but in all cases, without indels. Zero or first order Markov chains can be adopted as a model for the random text. The computational tool was tested on the set of cis-regulatory modules involved in D. melanogaster early development, for which there exists an annotation of binding sites for transcription factors. Our test allowed us to correctly identify transcription factors cooperatively/competitively binding to DNA. METHOD The algorithm that precisely computes the probability of simultaneous motif occurrences is inspired by the Aho-Corasick automaton and employs a prefix tree together with a transition function. The algorithm runs with $O\big(n|\Sigma|(m|\mathcal{H}| + K|\sigma|^{K})\prod_i k_i\big)$ time complexity, where $n$ is the length of the text, $|\Sigma|$ is the alphabet size, $m$ is the maximal motif length, $|\mathcal{H}|$ is the total number of words in motifs, $K$ is the order of Markov model, and $k_i$ is the number of occurrences of the $i$th motif. CONCLUSION The primary objective of the program is to assess the likelihood that a given DNA segment is CRM regulated with a known set of regulatory factors. In addition, the program can also be used to select the appropriate threshold for PWM scanning. Another application is assessing similarity of different motifs. AVAILABILITY Project web page, stand-alone version and documentation can be found at
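To make the problem in the abstract above concrete, here is a minimal sketch of the simplest special case only: the probability that one exact word occurs at least once, overlaps allowed, in a random text of length n under a zero-order (independent letters) model. It is not the published algorithm, which handles several motifs, occurrence counts and first-order Markov texts through an Aho-Corasick-style automaton. The function name `occurrence_pvalue` and the uniform letter probabilities in the usage lines are assumptions made for illustration.

```python
def occurrence_pvalue(word, n, probs):
    """P(word occurs at least once in a random text of length n).

    Zero-order model: letters are drawn independently with the
    probabilities given in `probs` (hypothetical helper, illustration only).
    """
    m = len(word)

    def step(state, char):
        # Next automaton state: length of the longest prefix of `word`
        # that is a suffix of the text read so far (brute-force search).
        seen = word[:state] + char
        for k in range(min(m, len(seen)), -1, -1):
            if seen.endswith(word[:k]):
                return k

    # dist[s] = probability of being in state s with no occurrence so far.
    dist = [0.0] * m
    dist[0] = 1.0
    hit = 0.0  # probability mass absorbed on the first occurrence
    for _ in range(n):
        new = [0.0] * m
        for s, p in enumerate(dist):
            if p == 0.0:
                continue
            for char, pc in probs.items():
                t = step(s, char)
                if t == m:
                    hit += p * pc
                else:
                    new[t] += p * pc
        dist = new
    return hit


uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(occurrence_pvalue("AA", 3, uniform))  # 7 of the 64 length-3 texts contain "AA"
```

Tracking the automaton state rather than text positions is what lets overlapping occurrences be counted without double counting, which is the difficulty the abstract highlights.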
7
Li L, Zhu Q, He X, Sinha S, Halfon MS. Large-scale analysis of transcriptional cis-regulatory modules reveals both common features and distinct subclasses. Genome Biol 2007; 8:R101. [PMID: 17550599 PMCID: PMC2394749 DOI: 10.1186/gb-2007-8-6-r101] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.9] [Received: 04/11/2007] [Revised: 05/23/2007] [Accepted: 06/05/2007] [Indexed: 02/01/2023]
Abstract
BACKGROUND Transcriptional cis-regulatory modules (for example, enhancers) play a critical role in regulating gene expression. While many individual regulatory elements have been characterized, they have never been analyzed as a class. RESULTS We have performed the first such large-scale study of cis-regulatory modules in order to determine whether they have common properties that might aid in their identification and contribute to our understanding of the mechanisms by which they function. A total of 280 individual, experimentally verified cis-regulatory modules from Drosophila were analyzed for a range of sequence-level and functional properties. We report here that regulatory modules do indeed share common properties, among them an elevated GC content, an increased level of interspecific sequence conservation, and a tendency to be transcribed into RNA. However, we find that dense clustering of transcription factor binding sites, especially homotypic clustering, which is commonly believed to be a general characteristic of regulatory modules, is rather a feature that belongs chiefly to a specific subclass. This has important implications for current computational approaches, many of which are biased toward this subset. We explore two new strategies to assess binding site clustering and gauge their performances with respect to their ability to detect all 280 modules and various functionally coherent subsets. CONCLUSION Our findings demonstrate that cis-regulatory modules share common features that help to define them as a class and that may lead to new insights into mechanisms of gene regulation. However, these properties alone may not be sufficient to reliably distinguish regulatory from non-regulatory sequences. We also demonstrate that there are distinct subclasses of cis-regulatory modules that are more amenable to in silico detection than others and that these differences must be taken into account when attempting genome-wide regulatory element discovery.
Affiliation(s)
- Long Li
- Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14214, USA
- Qianqian Zhu
- Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14214, USA
- Xin He
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Saurabh Sinha
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Marc S Halfon
- Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14214, USA
- Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY 14214, USA
- New York State Center of Excellence in Bioinformatics and the Life Sciences, Buffalo, NY 14203, USA
- Department of Molecular and Cellular Biology, Roswell Park Cancer Institute, Buffalo, NY 14263, USA
8
Qian J, Lin J, Zack DJ. Characterization of binding sites of eukaryotic transcription factors. Genomics Proteomics & Bioinformatics 2006; 4:67-79. [PMID: 16970547 PMCID: PMC5054036 DOI: 10.1016/s1672-0229(06)60019-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 02/02/2023]
Abstract
To explore the nature of eukaryotic transcription factor (TF) binding sites and determine how they differ from surrounding DNA sequences, we examined four features associated with DNA binding sites: G+C content, pattern complexity, palindromic structure, and Markov sequence ordering. Our analysis of the regulatory motifs obtained from the TRANSFAC database, using yeast intergenic sequences as background, revealed that these four features show variable enrichment in motif sequences. For example, motif sequences were more likely to have palindromic structure than were background sequences. In addition, these features were tightly localized to the regulatory motifs, indicating that they are a property of the motif sequences themselves and are not shared by the general promoter “environment” in which the regulatory motifs reside. By breaking down the motif sequences according to the TF classes to which they bind, more specific associations were identified. Finally, we found that some correlations, such as G+C content enrichment, were species-specific, while others, such as complexity enrichment, were universal across the species examined. The quantitative analysis provided here should increase our understanding of protein-DNA interactions and also help facilitate the discovery of regulatory motifs through bioinformatics.
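Two of the four features named in the abstract above, G+C content and palindromic structure, are simple enough to sketch. The snippet below is an illustration under stated assumptions, not the authors' scoring scheme: it treats "palindromic" as exact reverse-complement symmetry, and the helper names are hypothetical.

```python
# Complement table for unambiguous DNA bases (assumes A/C/G/T input only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq):
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the site reads the same on both strands."""
    return seq.upper() == reverse_complement(seq)


print(gc_content("GAATTC"), is_palindromic("GAATTC"))
```

Comparing such per-site values against the same statistics computed on background intergenic sequence is the kind of enrichment test the abstract describes; the background model and the statistics used there are not reproduced here.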
Affiliation(s)
- Jiang Qian
- The Wilmer Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA.
9
Wilkins AS. Recasting developmental evolution in terms of genetic pathway and network evolution … and the implications for comparative biology. Brain Res Bull 2005; 66:495-509. [PMID: 16144639 DOI: 10.1016/j.brainresbull.2005.04.001] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 11/17/2022]
Abstract
The morphological features of complex organisms are the outcomes of developmental processes. Developmental processes, in turn, reflect the genetic networks that underlie them. Differences in morphology must ultimately, therefore, reflect differences in the underlying genetic networks. A mutation that affects a developmental process does so by affecting either a gene whose product acts as an upstream controlling element, an intermediary connecting link, or as a downstream output of the network that governs the trait's development. Although the immense diversity of gene networks in the animal and plant kingdoms would seem to preclude any general "rules" of network evolution, the material discussed here suggests that the patterns of genetic pathway and network evolution actually fall into a number of discrete modes. The potential utility of this conceptual framework in reconstructing instances of developmental evolution and for comparative neurobiology will be discussed.
10
Liang M, Cowley AW, Hessner MJ, Lazar J, Basile DP, Pietrusz JL. Transcriptome analysis and kidney research: Toward systems biology. Kidney Int 2005; 67:2114-22. [PMID: 15882254 DOI: 10.1111/j.1523-1755.2005.00315.x] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022]
Abstract
An enormous amount of data has been generated in kidney research using transcriptome analysis techniques. In this review article, we first describe briefly the principles and major characteristics of several of these techniques. We then summarize the progress in kidney research that has been made by using transcriptome analysis, emphasizing the experience gained and the lessons learned. Several technical issues regarding DNA microarray are highlighted because of the rapidly increased use of this technology. It appears clear from this brief survey that transcriptome analysis is an effective and important tool for question-driven exploratory science. To further enhance the power of this and other high throughput, as well as conventional approaches, in future studies of the kidney, we propose a multidimensional systems biology paradigm that integrates investigation at multiple levels of biologic regulation toward the goal of achieving a global understanding of physiology and pathophysiology.
Affiliation(s)
- Mingyu Liang
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin 53226, USA.
11
Moll PR, Duschl J, Richter K. Optimized RNA amplification using T7-RNA-polymerase based in vitro transcription. Anal Biochem 2005; 334:164-74. [PMID: 15464965 DOI: 10.1016/j.ab.2004.07.013] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Received: 05/14/2004] [Indexed: 11/30/2022]
Abstract
The use of expression profiling to explore a cell's transcriptional landscape has exploded in recent years. In many cases, however, the very limited amount of starting material poses a major problem, making the amplification of the isolated RNA obligatory. The most prominent amplification method used was developed by the Eberwine lab in 1990: cDNA synthesis is started with an oligo(dT) primer containing a T7 RNA polymerase promoter. After second-strand synthesis RNA is transcribed in vitro using T7 RNA polymerase. It has been demonstrated that antisense RNA amplification not only preserves the fidelity of RNA-based microarray analysis but even improves the sensitivity. In our aim to improve the yield of in vitro transcription reactions and to facilitate the use of amplified RNA for the construction of cDNA libraries we tested a series of T7 primers with different 3' flanking sequences containing restriction sites. In addition we tested the impact of different DNA polymerases used for synthesizing the templates on the efficiency of the in vitro transcription reaction. A total of 28 different oligo(dT)-T7 promoter primers were tested. Two of them showed a dramatically increased yield of RNA from the in vitro transcription reaction. The combination of the improved second-strand synthesis with the new T7 primer increased the RNA yield 60-fold compared to the yield of standard procedures.
Affiliation(s)
- Pamela R Moll
- Fachbereich Zellbiologie, University of Salzburg, Hellbrunnerstrasse 34, A-5020 Salzburg, Austria
12
Combined Literature Mining and Gene Expression Analysis for Modeling Neuro-endocrine-immune Interactions. Lecture Notes in Computer Science 2005. [DOI: 10.1007/11538356_4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/25/2023]
13
Kusakabe T, Yoshida R, Ikeda Y, Tsuda M. Computational discovery of DNA motifs associated with cell type-specific gene expression in Ciona. Dev Biol 2004; 276:563-80. [PMID: 15581886 DOI: 10.1016/j.ydbio.2004.09.037] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.7] [Received: 05/24/2004] [Revised: 08/30/2004] [Accepted: 09/28/2004] [Indexed: 10/26/2022]
Abstract
Temporally and spatially co-expressed genes are expected to be regulated by common transcription factors and therefore to share cis-regulatory elements. In the ascidian Ciona intestinalis, the whole-genome sequences and genome-scale gene expression profiles allow the use of computational techniques to investigate cis-elements that control transcription. We collected 5' flanking sequences of 50 tissue-specific genes from genome databases of C. intestinalis and a closely related species Ciona savignyi. We searched for DNA motifs over-represented in upstream regions of a group of co-expressed genes. Several motifs were distributed predominantly in upstream regions of photoreceptor, pan-neuronal, or muscle-specific gene groups. One muscle-specific motif, M2, was distributed preferentially in regions from -200 to -100 bp relative to the translational start sites. Promoters of muscle-specific genes of C. intestinalis were isolated, connected with a green fluorescent protein gene (GFP), and introduced into C. intestinalis embryos. In muscle cells, these promoters specifically drove GFP expression, which mutations of the M2 sites greatly reduced. When M2 sites were located upstream of a basal promoter, the reporter GFP was specifically expressed in muscle cells. These results suggest the validity of our computational prediction of cis-regulatory elements. Thus, bioinformatics can help identify cis-regulatory elements involved in chordate development.
Affiliation(s)
- Takehiro Kusakabe
- Department of Life Science, Graduate School of Life Science, University of Hyogo, Kamigori, Ako-gun, Hyogo 678-1297, Japan.
14
Thompson W, Palumbo MJ, Wasserman WW, Liu JS, Lawrence CE. Decoding human regulatory circuits. Genome Res 2004; 14:1967-74. [PMID: 15466295 PMCID: PMC524421 DOI: 10.1101/gr.2589004] [Citation(s) in RCA: 76] [Impact Index Per Article: 3.6] [Received: 03/17/2004] [Accepted: 07/22/2004] [Indexed: 11/24/2022]
Abstract
Clusters of transcription factor binding sites (TFBSs) which direct gene expression constitute cis-regulatory modules (CRMs). We present a novel algorithm, based on Gibbs sampling, which locates, de novo, the cis features of these CRMs, their component TFBSs, and the properties of their spatial distribution. The algorithm finds 69% of experimentally reported TFBSs and 85% of the CRMs in a reference data set of regions upstream of genes differentially expressed in skeletal muscle cells. A discriminant procedure based on the output of the model specifically discriminated regulatory sequences in muscle-specific genes in an independent test set. Application of the method to the analysis of 2710 10-kb fragments upstream of annotated human genes identified 17 novel candidate modules with a false discovery rate =0.05, demonstrating the applicability of the method to genome-scale data.
Affiliation(s)
- William Thompson
- Center for Bioinformatics, Wadsworth Center, New York State Department of Health, Albany, New York 12208, USA.
15
Yanai I, Benjamin H, Shmoish M, Chalifa-Caspi V, Shklar M, Ophir R, Bar-Even A, Horn-Saban S, Safran M, Domany E, Lancet D, Shmueli O. Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification. Bioinformatics 2004; 21:650-9. [PMID: 15388519 DOI: 10.1093/bioinformatics/bti042] [Citation(s) in RCA: 803] [Impact Index Per Article: 38.2] [Indexed: 11/13/2022]
Abstract
MOTIVATION Genes are often characterized dichotomously as either housekeeping or single-tissue specific. We conjectured that crucial functional information resides in genes with midrange profiles of expression. RESULTS To obtain such novel information genome-wide, we have determined the mRNA expression levels for one of the largest hitherto analyzed sets of 62 839 probesets in 12 representative normal human tissues. Indeed, when using a newly defined graded tissue specificity index tau, valued between 0 for housekeeping genes and 1 for tissue-specific genes, genes with midrange profiles having 0.15 < tau < 0.85 were found to constitute >50% of all expression patterns. We developed a binary classification, indicating for every gene the I(B) tissues in which it is overly expressed, and the 12-I(B) tissues in which it shows low expression. The 85 dominant midrange patterns with I(B)=2-11 were found to be bimodally distributed, and to contribute most significantly to the definition of tissue specification dendrograms. Our analyses provide a novel route to infer expression profiles for presumed ancestral nodes in the tissue dendrogram. Such definition has uncovered an unsuspected correlation, whereby de novo enhancement and diminution of gene expression go hand in hand. These findings highlight the importance of gene suppression events, with implications for the course of tissue specification in ontogeny and phylogeny. AVAILABILITY All data and analyses are publicly available at the GeneNote website, http://genecards.weizmann.ac.il/genenote/ and under GEO accession GSE803. CONTACT doron.lancet@weizmann.ac.il SUPPLEMENTARY INFORMATION Four tables available at the above site.
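The abstract above gives only the range of the tissue specificity index tau, not its formula. As a rough sketch, the form in which this index is commonly stated is tau = sum(1 - x_i / x_max) / (N - 1) over N tissues, and the code below assumes that form; how raw expression values are preprocessed before the index is computed is not covered. The function names are hypothetical, and the 0.15 and 0.85 cutoffs are the midrange bounds quoted in the abstract.

```python
def tissue_specificity_tau(expression):
    """Tau index: 0 for uniform expression, 1 for a single-tissue gene.

    `expression` holds one non-negative expression level per tissue.
    Assumed formula: sum(1 - x_i / x_max) / (N - 1).
    """
    n = len(expression)
    if n < 2:
        raise ValueError("need at least two tissues")
    x_max = max(expression)
    if x_max <= 0:
        raise ValueError("gene is not expressed in any tissue")
    return sum(1 - x / x_max for x in expression) / (n - 1)


def is_midrange(tau, low=0.15, high=0.85):
    """Midrange profile as described in the abstract: low < tau < high."""
    return low < tau < high


profile = [10, 5, 0]  # made-up expression levels in three tissues
tau = tissue_specificity_tau(profile)
print(tau, is_midrange(tau))
```

Dividing by N - 1 rather than N is what makes a gene expressed in exactly one tissue score 1, since the tissue holding the maximum always contributes zero to the sum.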
Affiliation(s)
- Itai Yanai
- Department of Molecular Genetics, Weizmann Institute of Science 76100 Rehovot, Israel
|
16
|
Liang M, Cowley AW, Greene AS. High throughput gene expression profiling: a molecular approach to integrative physiology. J Physiol 2004; 554:22-30. [PMID: 14678487 PMCID: PMC1664740 DOI: 10.1113/jphysiol.2003.049395] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.6] [Received: 06/25/2003] [Accepted: 09/25/2003] [Indexed: 12/12/2022]
Abstract
Integrative physiology emphasizes the importance of understanding multiple pathways with overlapping, complementary, or opposing effects and their interactions in the context of intact organisms. The DNA microarray technology, the most commonly used method for high-throughput gene expression profiling, has been touted as an integrative tool that provides insights into regulatory pathways. However, the physiology community has been slow in acceptance of these techniques because of early failure in generating useful data and the lack of a cohesive theoretical framework in which experiments can be analysed. With recent advances in both technology and analysis, we propose a concept of multidimensional integration of physiology that incorporates data generated by DNA microarray and other functional, genomic, and proteomic approaches to achieve a truly integrative understanding of physiology. Analysis of several studies performed in simpler organisms or in mammalian model animals supports the feasibility of such multidimensional integration and demonstrates the power of DNA microarray as an indispensable molecular tool for such integration. Evaluation of DNA microarray techniques indicates that these techniques, despite limitations, have advanced to a point where the question-driven profiling research has become a feasible complement to the conventional, hypothesis-driven research. With a keen sense of homeostasis, global regulation, and quantitative analysis, integrative physiologists are uniquely positioned to apply these techniques to enhance the understanding of complex physiological functions.
Affiliation(s)
- Mingyu Liang
- Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA.
|
17
|
Pan W, Lin J, Le CT. A mixture model approach to detecting differentially expressed genes with microarray data. Funct Integr Genomics 2003; 3:117-24. [PMID: 12844246 DOI: 10.1007/s10142-003-0085-7] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.2] [Received: 11/25/2002] [Accepted: 04/16/2003] [Indexed: 11/28/2022]
Abstract
An exciting biological advancement over the past few years is the use of microarray technologies to measure simultaneously the expression levels of thousands of genes. The bottleneck now is how to extract useful information from the resulting large amounts of data. An important and common task in analyzing microarray data is to identify genes with altered expression under two experimental conditions. We propose a nonparametric statistical approach, called the mixture model method (MMM), to handle the problem when there are a small number of replicates under each experimental condition. Specifically, we propose estimating the distributions of a t-type test statistic and its null statistic using finite normal mixture models. A comparison of these two distributions by means of a likelihood ratio test, or simply using the tail distribution of the null statistic, can identify genes with significantly changed expression. Several methods are proposed to effectively control the false positives. The methodology is applied to a data set containing expression levels of 1,176 genes of rats with and without pneumococcal middle ear infection.
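The building block named in the abstract, a finite normal mixture fitted to a set of statistics, can be illustrated with a generic expectation-maximization (EM) fit. This is a minimal sketch rather than the authors' MMM: the two-component choice, the initialization, the function names, and the toy data are all assumptions, and the construction of the t-type and null statistics and the likelihood ratio test are not reproduced.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_normal_mixture(values, iterations=100, min_var=1e-6):
    """Fit a two-component normal mixture to 1-D data with plain EM.

    Returns (weights, means, variances), each a list of length 2.
    """
    data = [float(v) for v in values]
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    grand_mean = sum(data) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in data) / n, min_var)
    # Assumed initialization: components start at the extremes of the data.
    weights = [0.5, 0.5]
    means = [min(data), max(data)]
    variances = [grand_var, grand_var]
    for _ in range(iterations):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in data:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens] if total > 0 else [0.5, 0.5])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            mass = sum(r[k] for r in resp)
            if mass == 0:
                continue  # leave an empty component unchanged
            weights[k] = mass / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / mass
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / mass
            variances[k] = max(spread, min_var)  # floor avoids a collapsed component
    return weights, means, variances


# Hypothetical statistics: one group near 0 (unchanged) and one near 5 (changed).
toy = [-0.2, -0.1, 0.0, 0.1, 0.2, 4.8, 4.9, 5.0, 5.1, 5.2]
weights, means, variances = fit_two_normal_mixture(toy)
print(weights, means, variances)
```

In practice the number of components would be chosen by a model selection criterion, and a library implementation would usually be preferable to hand-written EM.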
Affiliation(s)
- Wei Pan
- Division of Biostatistics, School of Public Health, University of Minnesota, A460 Mayo, MMC 303, 420 Delaware Street SE, Minneapolis, MN 55455-0378, USA.
|
18
|
Sandelin A, Höglund A, Lenhard B, Wasserman WW. Integrated analysis of yeast regulatory sequences for biologically linked clusters of genes. Funct Integr Genomics 2003; 3:125-34. [PMID: 12827523 DOI: 10.1007/s10142-003-0086-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.1] [Received: 12/17/2002] [Revised: 04/07/2003] [Accepted: 04/29/2003] [Indexed: 10/26/2022]
Abstract
Dramatic progress in deciphering the regulatory controls in Saccharomyces cerevisiae has been enabled by the fusion of high-throughput genomics technologies with advanced sequence analysis algorithms. Sets of genes likely to function together and with similar expression profiles have been identified in diverse studies. By fusing an advanced pattern recognition algorithm for identification of transcription factor binding sites with a new method for the quantitative comparison of binding properties of transcription factors, we provide an integrated means to move from expression data to biological insights. The Yeast Regulatory Sequence Analysis system, YRSA, combines standard functions with a novel pattern characterization procedure in an intuitive interface designed for use by a broad range of scientists. The features of the system include automated retrieval of user-defined promoter sequences, binding site discovery by pattern recognition, graphical displays of the observed pattern and positions of similar sequences in the specified genes, and comparison of the new pattern against a collection of binding patterns for characterized transcription factors. The comprehensive YRSA system was used to study the regulatory mechanisms of yeast regulons. Analysis of the regulatory controls of a battery of genes induced by DNA damaging agents supports a putative mediating role for the cell-cycle checkpoint regulatory element MCB. YRSA is available at http://yrsa.cgb.ki.se. [YRSA: ancient Scandinavian name meaning old she-bear (Latin Ursus arctos = brown bear/grizzly).]
Affiliation(s)
- Albin Sandelin
- Center for Genomics and Bioinformatics, Karolinska Institutet, Stockholm, Sweden
|
19
|
Wasserman WW, Krivan W. In silico identification of metazoan transcriptional regulatory regions. The Science of Nature - Naturwissenschaften 2003; 90:156-66. [PMID: 12712249 DOI: 10.1007/s00114-003-0409-4] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.1] [Indexed: 10/25/2022]
Abstract
Transcriptional regulation remains one of the most intriguing and challenging subjects in biomedical research. The catalysis of transcription is a clear example of multiple proteins interacting to orchestrate a biological process, offering a starting point for the study of biological systems. Transcriptional regulation is viewed as one of the principal mechanisms governing the spatial and temporal distribution of gene expression, thus the field of transcriptional regulation provides a natural stage for quantitative studies of multiple gene systems. Building on the body of focused experimental studies and new genomics-driven data, computational biologists are making significant strides in accelerating our understanding of the transcriptional regulatory process in metazoan cells. Recent advances in the computational analysis of the interplay between factors have been fueled by well-defined computational methods for the modeling of the binding of individual transcription factors. We present here an overview of advances in the analysis of regulatory systems and the fundamental methods that underlie the recent developments.
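The abstract does not name a specific model for the binding of an individual transcription factor. A position weight matrix (PWM) with log-odds scoring is one widely used choice and is assumed here purely for illustration. The count matrix, the uniform background frequency, the pseudocount, and all function names below are hypothetical.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix


def scan(sequence, matrix):
    """Score every window on the forward strand; returns (start, score) pairs."""
    seq = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(ch not in BASES for ch in window):
            continue  # skip windows containing ambiguous bases such as N
        hits.append((start, sum(matrix[i][ch] for i, ch in enumerate(window))))
    return hits


def best_hit(sequence, matrix):
    """Highest-scoring window, or None if no window could be scored."""
    hits = scan(sequence, matrix)
    return max(hits, key=lambda hit: hit[1]) if hits else None


# Hypothetical counts from 10 aligned sites of a toy 4-bp motif (consensus GATA).
toy_counts = [
    {"G": 9, "A": 1},
    {"A": 10},
    {"T": 9, "C": 1},
    {"A": 10},
]
pwm = log_odds_matrix(toy_counts)
print(best_hit("CCGATACC", pwm))
```

A fuller scan would also score the reverse complement and apply a score threshold calibrated against background sequence. Both are omitted to keep the sketch short.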
Affiliation(s)
- Wyeth W Wasserman
- Centre for Molecular Medicine and Therapeutics, University of British Columbia, 950 West 28th Avenue, Vancouver, BC, V5Z 4H4, Canada.
|