152
Kantorovitz MR, Kazemian M, Kinston S, Miranda-Saavedra D, Zhu Q, Robinson GE, Göttgens B, Halfon MS, Sinha S. Motif-blind, genome-wide discovery of cis-regulatory modules in Drosophila and mouse. Dev Cell 2009; 17:568-79. [PMID: 19853570 DOI: 10.1016/j.devcel.2009.09.002] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.1] [Received: 03/26/2009] [Revised: 07/02/2009] [Accepted: 09/09/2009] [Indexed: 12/24/2022]
Abstract
We present new approaches to cis-regulatory module (CRM) discovery in the common scenario where relevant transcription factors and/or motifs are unknown. Beginning with a small list of CRMs mediating a common gene expression pattern, we search genome-wide for CRMs with similar functionality, using new statistical scores and without requiring known motifs or accurate motif discovery. We cross-validate our predictions on 31 regulatory networks in Drosophila and through correlations with gene expression data. Five predicted modules tested using an in vivo reporter gene assay all show tissue-specific regulatory activity. We also demonstrate our methods' ability to predict mammalian tissue-specific enhancers. Finally, we predict human CRMs that regulate early blood and cardiovascular development. In vivo transgenic mouse analysis of two predicted CRMs demonstrates that both have appropriate enhancer activity. Overall, 7/7 predictions were validated successfully in vivo, demonstrating the effectiveness of our approach for insect and mammalian genomes.
Affiliation(s)
- Miriam R Kantorovitz
- Department of Mathematics, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
153
Ho MCW, Johnsen H, Goetz SE, Schiller BJ, Bae E, Tran DA, Shur AS, Allen JM, Rau C, Bender W, Fisher WW, Celniker SE, Drewell RA. Functional evolution of cis-regulatory modules at a homeotic gene in Drosophila. PLoS Genet 2009; 5:e1000709. [PMID: 19893611 PMCID: PMC2763271 DOI: 10.1371/journal.pgen.1000709] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Received: 06/30/2009] [Accepted: 10/05/2009] [Indexed: 11/19/2022] Open
Abstract
It is a long-held belief in evolutionary biology that the rate of molecular evolution for a given DNA sequence is inversely related to the level of functional constraint. This belief holds true for the protein-coding homeotic (Hox) genes originally discovered in Drosophila melanogaster. Expression of the Hox genes in Drosophila embryos is essential for body patterning and is controlled by an extensive array of cis-regulatory modules (CRMs). How the regulatory modules functionally evolve in different species is not clear. A comparison of the CRMs for the Abdominal-B gene from different Drosophila species reveals relatively low levels of overall sequence conservation. However, embryonic enhancer CRMs from other Drosophila species direct transgenic reporter gene expression in the same spatial and temporal patterns during development as their D. melanogaster orthologs. Bioinformatic analysis reveals the presence of short conserved sequences within defined CRMs, representing gap and pair-rule transcription factor binding sites. One predicted binding site for the gap transcription factor KRUPPEL in the IAB5 CRM was found to be altered in Superabdominal (Sab) mutations. In Sab mutant flies, the third abdominal segment is transformed into a copy of the fifth abdominal segment. A model for KRUPPEL-mediated repression at this binding site is presented. These findings challenge our current understanding of the relationship between sequence evolution at the molecular level and functional activity of a CRM. While the overall sequence conservation at Drosophila CRMs is not distinctive from neighboring genomic regions, functionally critical transcription factor binding sites within embryonic enhancer CRMs are highly conserved. These results have implications for understanding mechanisms of gene expression during embryonic development, enhancer function, and the molecular evolution of eukaryotic regulatory modules.
Affiliation(s)
- Margaret C. W. Ho
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Holly Johnsen
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Sara E. Goetz
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Benjamin J. Schiller
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Esther Bae
- College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona, California, United States of America
- Diana A. Tran
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Andrey S. Shur
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- John M. Allen
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Christoph Rau
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
- Welcome Bender
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts, United States of America
- William W. Fisher
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Susan E. Celniker
- Berkeley Drosophila Genome Project, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Robert A. Drewell
- Biology Department, Harvey Mudd College, Claremont, California, United States of America
154
Patterns of DNA-sequence divergence between Drosophila miranda and D. pseudoobscura. J Mol Evol 2009; 69:601-11. [PMID: 19859648 DOI: 10.1007/s00239-009-9298-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 02/03/2009] [Accepted: 10/07/2009] [Indexed: 12/22/2022]
Abstract
Contrary to the classical view, a large amount of non-coding DNA seems to be selectively constrained in Drosophila and other species. Here, using Drosophila miranda BAC sequences and the Drosophila pseudoobscura genome sequence, we aligned coding and non-coding sequences between D. pseudoobscura and D. miranda, and investigated their patterns of evolution. We found two patterns that have previously been observed in comparisons between Drosophila melanogaster and its relatives. First, there is a negative correlation between intron divergence and intron length, suggesting that longer non-coding sequences may contain more regulatory elements than shorter sequences. Our other main finding is a negative correlation between the rate of non-synonymous substitutions (d(N)) and codon usage bias (F(op)), showing that fast-evolving genes have a lower codon usage bias, consistent with strong positive selection interfering with weak selection for codon usage.
155
Vandenbon A, Nakai K. Modeling tissue-specific structural patterns in human and mouse promoters. Nucleic Acids Res 2009; 38:17-25. [PMID: 19850720 PMCID: PMC2800225 DOI: 10.1093/nar/gkp866] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 12/12/2022] Open
Abstract
Sets of genes expressed in the same tissue are believed to be under the regulation of a similar set of transcription factors, and can thus be assumed to contain similar structural patterns in their regulatory regions. Here we present a study of the structural patterns in promoters of genes expressed specifically in 26 human and 34 mouse tissues. For each tissue we constructed promoter structure models, taking into account the presence of motifs, their position relative to the transcription start site, and the pairwise positioning of motifs. We found that 35 out of 60 models (58%) were able to distinguish positive test promoter sequences from control promoter sequences with statistical significance. Models with high performance include those for liver, skeletal muscle, kidney and tongue. Many of the important structural patterns in these models involve transcription factors of known importance in the tissues in question, and structural patterns tend to be conserved between human and mouse. In addition, promoter models for related tissues tend to have high inter-tissue performance, indicating that their promoters share common structural patterns. Together, these results illustrate the validity of our models, but also indicate that the promoter structures for some tissues are easier to model than those of others.
Affiliation(s)
- Alexis Vandenbon
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
156
Tian F, Chen J, Bao S, Shi L, Liu X, Grossman R. A graph model based study on regulatory impacts of transcription factors of Drosophila melanogaster and comparison across species. Biochem Biophys Res Commun 2009; 386:559-562. [PMID: 19538943 DOI: 10.1016/j.bbrc.2009.06.055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2009] [Accepted: 06/12/2009] [Indexed: 05/27/2023]
Abstract
Transcription factor binding sites and the cis-regulatory modules they compose are central determinants of gene regulation. Gene regulation in some model species has been well characterized. However, less is known about the fly because of a lack of experimental data. To study the transcriptional regulation of Drosophila melanogaster genes, we analyzed the regulation data from ChIP-chip experiments as well as the regulatory database. A graph-based approach is applied to study the impact of each transcription factor on the regulatory network. The model is also applied to Saccharomyces cerevisiae and Homo sapiens to study the behavior of transcription factors in different species. Gene Ontology annotations were used to further study the biological significance of the transcription factors examined.
Affiliation(s)
- Feng Tian
- School of Medicine, Tsinghua University, Beijing 100084, PR China
157
Bullaughey K, Chavarria CI, Coop G, Gilad Y. Expression quantitative trait loci detected in cell lines are often present in primary tissues. Hum Mol Genet 2009; 18:4296-303. [PMID: 19671653 DOI: 10.1093/hmg/ddp382] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Indexed: 11/12/2022] Open
Abstract
Expression quantitative trait loci (eQTL) mapping is a powerful tool for identifying genetic regulatory variation. However, at present, most eQTLs in humans were identified using gene expression data from cell lines, and it remains unknown whether these eQTLs also have a regulatory function in other expression contexts, such as human primary tissues. Here we investigate this question using a targeted strategy. Specifically, we selected a subset of large-effect eQTLs identified in the HapMap lymphoblastoid cell lines, and examined the association of these eQTLs with gene expression levels across individuals in five human primary tissues (heart, kidney, liver, lung and testes). We show that genotypes at the eQTLs we selected are often predictive of variation in gene expression levels in one or more of the five primary tissues. The genotype effects in the primary tissues are consistently in the same direction as the effects inferred in the cell lines. Additionally, a number of the eQTLs we tested are found in more than one of the tissues. Our results indicate that functional studies in cell lines may uncover a substantial amount of genetic variation that affects gene expression levels in human primary tissues.
Affiliation(s)
- Kevin Bullaughey
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
158
Kaderali L, Dazert E, Zeuge U, Frese M, Bartenschlager R. Reconstructing signaling pathways from RNAi data using probabilistic Boolean threshold networks. Bioinformatics 2009; 25:2229-35. [PMID: 19542154 DOI: 10.1093/bioinformatics/btp375] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Indexed: 11/13/2022] Open
Affiliation(s)
- Lars Kaderali
- Viroquant Research Group Modeling, University of Heidelberg, Bioquant BQ26, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany.
159
Pape UJ, Klein H, Vingron M. Statistical detection of cooperative transcription factors with similarity adjustment. Bioinformatics 2009; 25:2103-9. [PMID: 19286833 PMCID: PMC2722994 DOI: 10.1093/bioinformatics/btp143] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/30/2022] Open
Abstract
Motivation: Statistical assessment of cis-regulatory modules (CRMs) is a crucial task in computational biology. Usually, one concludes from exceptional co-occurrences of DNA motifs that the corresponding transcription factors (TFs) are cooperative. However, similar DNA motifs tend to co-occur in random sequences due to the high probability of overlapping occurrences. Therefore, it is important to consider similarity of DNA motifs in the statistical assessment. Results: Based on previous work, we propose to adjust the window size for co-occurrence detection. Using the derived approximation, one obtains different window sizes for different sets of DNA motifs depending on their similarities. This ensures that the probabilities of co-occurrence in random sequences are equal. Applying the approach to selected similar and dissimilar DNA motifs from human TFs shows the necessity of adjustment and confirms the accuracy of the approximation by comparison to simulated data. Furthermore, it becomes clear that approaches ignoring similarities strongly underestimate P-values for cooperativity of TFs with similar DNA motifs. In addition, the approach is extended to deal with overlapping windows. We derive Chen–Stein error bounds for the approximation. Comparing the error bounds for similar and dissimilar DNA motifs shows that the approximation for similar DNA motifs yields large bounds. Hence, one has to be careful using overlapping windows. Based on the error bounds, one can precompute the approximation errors and select an appropriate overlap scheme before running the analysis. Availability: Software to perform the calculation for pairs of position frequency matrices (PFMs) is available at http://mosta.molgen.mpg.de as well as C++ source code for downloading. Contact: utz.pape@molgen.mpg.de
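The point about similar motifs co-occurring by chance can be illustrated with a small simulation. The sketch below is not the authors' method: it assumes exact consensus strings instead of position frequency matrices, fixed non-overlapping windows, and a uniform random background. The function names, motif strings and window size are all hypothetical choices made for illustration.

```python
import random


def find_hits(seq, motif):
    """Return start positions of exact matches of motif in seq (overlaps allowed)."""
    return [i for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)]


def cooccurring_windows(seq, motif_a, motif_b, window):
    """Count non-overlapping windows of length `window` containing both motifs."""
    count = 0
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        if find_hits(chunk, motif_a) and find_hits(chunk, motif_b):
            count += 1
    return count


def random_sequence(length, rng):
    """Draw a uniform random DNA sequence (an assumed, simplistic background model)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


if __name__ == "__main__":
    rng = random.Random(0)
    seq = random_sequence(200_000, rng)
    # "ACGTA" and "CGTAC" can overlap by four bases; "TTGGC" shares no such overlap.
    similar = cooccurring_windows(seq, "ACGTA", "CGTAC", window=20)
    dissimilar = cooccurring_windows(seq, "ACGTA", "TTGGC", window=20)
    print("similar pair:", similar, "dissimilar pair:", dissimilar)
```

With overlapping-capable motifs one would expect noticeably more chance co-occurrences than with the dissimilar pair, which is the effect the window-size adjustment in the abstract is meant to correct for.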
Affiliation(s)
- Utz J Pape
- Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Ihnestr. 73 and Mathematics and Computer Science, Free University of Berlin, Takustr. 9, 14195 Berlin, Germany.
160
Suárez-Díaz E. Molecular evolution: concepts and the origin of disciplines. Studies in History and Philosophy of Biological and Biomedical Sciences 2009; 40:43-53. [PMID: 19268873 DOI: 10.1016/j.shpsc.2008.12.006] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Indexed: 05/27/2023]
Abstract
This paper focuses on the consolidation of Molecular Evolution, a field originating in the 1960s at the interface of molecular biology, biochemistry, evolutionary biology, biophysics and studies on the origin of life and exobiology. The claim is made that Molecular Evolution became a discipline by integrating different sorts of scientific traditions: experimental, theoretical and comparative. The author critically incorporates Timothy Lenoir's treatment of disciplines (1997), as well as ideas developed by Stephen Toulmin (1962) on the same subject. On their account disciplines are spaces where the social and epistemic dimensions of science are deeply and complexly interwoven. However, a more detailed account of discipline formation and the dynamics of an emerging disciplinary field is lacking in their analysis. The present essay suggests focusing on the role of scientific concepts in the double configuration of disciplines: the social/political and the epistemic order. In the case of Molecular Evolution the concepts of molecular clock and informational molecules played a central role, both in differentiating molecular from classical evolutionists, and in promoting communication between the different sorts of traditions integrated in Molecular Evolution. The paper finishes with a reflection on the historicity of disciplines, and the historicity of our concepts of disciplines.
Affiliation(s)
- Edna Suárez-Díaz
- Facultad de Ciencias, Universidad Nacional Autónoma de México. Circuito Exterior de Ciudad Universitaria, Coyoacán, DF 04510, México.
161
Bhadra S, Bhattacharyya C, Chandra NR, Mian IS. A linear programming approach for estimating the structure of a sparse linear genetic network from transcript profiling data. Algorithms Mol Biol 2009; 4:5. [PMID: 19239685 PMCID: PMC2654898 DOI: 10.1186/1748-7188-4-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 05/30/2008] [Accepted: 02/24/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graph representations of genetic networks, graphs with many thousands of nodes where an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data. RESULTS The structure learning task is cast as a sparse linear regression problem which is then posed as a LASSO (l1-constrained fitting) problem and solved finally by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs are assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the INSILICO1, INSILICO2 and INSILICO3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs estimated from two published Saccharomyces cerevisiae cell cycle transcript profiling data sets capture known regulatory associations. In each S. cerevisiae LP-SLGN, the number of nodes with a particular degree follows an approximate power law, suggesting that its degree distribution is similar to that observed in real-world networks. Inspection of these LP-SLGNs suggests biological hypotheses amenable to experimental verification. CONCLUSION A statistically robust and computationally efficient LP-based method for estimating the topology of a large sparse undirected graph from high-dimensional data yields representations of genetic networks that are biologically plausible and useful abstractions of the structures of real genetic networks. Analysis of the statistical and topological properties of learned LP-SLGNs may have practical value; for example, genes with high random walk betweenness, a measure of the centrality of a node in a graph, are good candidates for intervention studies and hence integrated computational-experimental investigations designed to infer more realistic and sophisticated probabilistic directed graphical model representations of genetic networks. The LP-based solutions of the sparse linear regression problem described here may provide a method for learning the structure of transcription factor networks from transcript profiling and transcription factor binding motif data.
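As a rough illustration of how an l1-penalized fit can be written as a linear program, the sketch below assumes an absolute-error loss and a trade-off constant $C$. The symbols $y_i$ (abundance of one target gene in sample $i$), $x_{ij}$ (abundance of gene $j$ in sample $i$), $u$, $v$ and $\xi$ are introduced here for illustration only and are not taken from the abstract; the paper's exact formulation may differ.

```latex
\begin{aligned}
\min_{u,\,v,\,\xi}\quad & \sum_{j=1}^{p} (u_j + v_j) \;+\; C \sum_{i=1}^{n} \xi_i \\
\text{s.t.}\quad & -\xi_i \;\le\; y_i - \sum_{j=1}^{p} (u_j - v_j)\, x_{ij} \;\le\; \xi_i,
  \qquad i = 1, \dots, n, \\
& u_j \ge 0,\quad v_j \ge 0,\quad \xi_i \ge 0 .
\end{aligned}
```

Writing the weight vector as $w = u - v$ with $u, v \ge 0$ makes $\sum_j (u_j + v_j)$ equal to $\lVert w \rVert_1$ at the optimum, so both the objective and the constraints are linear. Under this reading, nonzero entries of $w$ would correspond to edges incident to the target gene.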
162
Christiaen L, Stolfi A, Davidson B, Levine M. Spatio-temporal intersection of Lhx3 and Tbx6 defines the cardiac field through synergistic activation of Mesp. Dev Biol 2009; 328:552-60. [PMID: 19389354 DOI: 10.1016/j.ydbio.2009.01.033] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Received: 11/18/2008] [Revised: 01/16/2009] [Accepted: 01/23/2009] [Indexed: 11/18/2022]
Abstract
Mesp encodes a bHLH transcription factor required for specification of the cardiac mesoderm in Ciona embryos. The activities of Macho-1 and beta-catenin, two essential maternal determinants, are required for Mesp expression in the B7.5 blastomeres, which constitute the heart field. The T-box transcription factor Tbx6 functions downstream of Macho-1 as a direct activator of Mesp expression. However, Tbx6 cannot account for the restricted expression of Mesp in the B7.5 lineage since it is expressed throughout the presumptive tail muscles. Here we present evidence that the LIM-homeobox gene Lhx3, a direct target of beta-catenin, is essential for localized Mesp expression. Lhx3 is expressed throughout the presumptive endoderm and B7.5 blastomeres. Thus, the B7.5 blastomeres are the only cells to express sustained levels of the Tbx6 and Lhx3 activators. Like mammalian Lhx3 genes, Ci-Lhx3 encodes two isoforms with distinct N-terminal peptides. The Lhx3a isoform appears to be expressed both maternally and zygotically, while the Lhx3b isoform is exclusively zygotic. Misexpression of Lhx3b is sufficient to induce ectopic Mesp activation in cells expressing Tbx6b. Injection of antisense morpholino oligonucleotides showed that the Lhx3b isoform is required for endogenous Mesp expression. Mutations in the Lhx3 half-site of Tbx6/Lhx3 composite elements strongly reduced the activity of a minimal Mesp enhancer. We discuss the delineation of the heart field by the synergistic action of muscle and gut determinants.
Affiliation(s)
- Lionel Christiaen
- Department of Molecular & Cell Biology, Division of Genetics, Genomics and Development, Center for Integrative Genomics, University of California Berkeley, CA 94720-3200, USA.
163
Affiliation(s)
- Albert Erives
- Department of Biological Sciences, Dartmouth College, Hanover, New Hampshire
164
Gene regulatory network inference: data integration in dynamic models-a review. Biosystems 2008; 96:86-103. [PMID: 19150482 DOI: 10.1016/j.biosystems.2008.12.004] [Citation(s) in RCA: 414] [Impact Index Per Article: 24.4] [Received: 03/12/2008] [Revised: 11/05/2008] [Accepted: 12/09/2008] [Indexed: 12/19/2022]
Abstract
Systems biology aims to develop mathematical models of biological systems by integrating experimental and theoretical techniques. During the last decade, many systems biological approaches that are based on genome-wide data have been developed to unravel the complexity of gene regulation. This review deals with the reconstruction of gene regulatory networks (GRNs) from experimental data through computational methods. Standard GRN inference methods primarily use gene expression data derived from microarrays. However, the incorporation of additional information from heterogeneous data sources, e.g. genome sequence and protein-DNA interaction data, clearly supports the network inference process. This review focuses on promising modelling approaches that use such diverse types of molecular biological information. In particular, approaches are discussed that enable the modelling of the dynamics of gene regulatory systems. The review provides an overview of common modelling schemes and learning algorithms and outlines current challenges in GRN modelling.
165
Zhang HY, He H, Chen LB, Li L, Liang MZ, Wang XF, Liu XG, He GM, Chen RS, Ma LG, Deng XW. A genome-wide transcription analysis reveals a close correlation of promoter INDEL polymorphism and heterotic gene expression in rice hybrids. Molecular Plant 2008; 1:720-31. [PMID: 19825576 DOI: 10.1093/mp/ssn022] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.3] [Indexed: 05/23/2023]
Abstract
Heterosis, or hybrid vigor, refers to the phenomenon in which hybrid progeny of two inbred varieties exhibits enhanced growth or agronomic performance. Although a century-long history of research has generated several hypotheses regarding the genetic basis of heterosis, the molecular mechanisms underlying heterosis and heterotic gene expression remain elusive. Here, we report a genome-wide gene expression analysis of two heterotic crosses in rice, taking advantage of its fully sequenced genomes. Approximately 7-9% of the genes were differentially expressed in the seedling shoots from two sets of heterotic crosses, including many transcription factor genes, and exhibited multiple modes of gene action. Comparison of the putative promoter regions of the ortholog genes between inbred parents revealed extensive sequence variation, particularly small insertions/deletions (INDELs), many of which result in the formation/disruption of putative cis-regulatory elements. Together, these results suggest that a combinatorial interplay between expression of transcription factors and polymorphic promoter cis-regulatory elements in the hybrids is one plausible molecular mechanism underlying heterotic gene action and thus heterosis in rice.
Affiliation(s)
- Hui-Yong Zhang
- National Institute of Biological Sciences, Zhongguancun Life Science Park, Beijing 102206, China
166
Corbo JC. The role of cis-regulatory elements in the design of gene therapy vectors for inherited blindness. Expert Opin Biol Ther 2008; 8:599-608. [PMID: 18407764 DOI: 10.1517/14712598.8.5.599] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 12/23/2022]
Abstract
BACKGROUND Hereditary retinal disease is currently known to involve nearly 200 different genetic loci. There has been remarkable recent progress in the treatment of retinal disease via gene therapy in animal models using virus-based vectors. The majority of retinal diseases affect one of several cell types. In order to target expression of a rescue transgene specifically to the cells in need of therapy, it is necessary to employ a cis-regulatory element (CRE) to drive expression of the transgene specifically in those cells. OBJECTIVE/METHODS This review discusses the repertoire of CREs currently available for use in gene therapy vectors for treatment of retinal disease and outlines the issues that must be taken into consideration in the development of novel CREs for the purpose of gene therapy in the retina. CONCLUSION There have been a number of important recent advances in the identification and characterization of retinal CREs and their utilization in gene therapy vectors. Nevertheless, future efforts to rationally manipulate existing CREs and design novel synthetic CREs for therapeutic purposes will require a better understanding of the cis-regulatory rules that govern CRE activity in vivo.
Affiliation(s)
- Joseph C Corbo
- Washington University School of Medicine, Department of Pathology and Immunology, Campus Box 8118, 660 South Euclid Avenue, St. Louis, MO 63110, USA.
167
Lalancette C, Platts AE, Lu Y, Lu S, Krawetz SA. Computational identification of transcription frameworks of early committed spermatogenic cells. Mol Genet Genomics 2008; 280:263-74. [PMID: 18615256 DOI: 10.1007/s00438-008-0361-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 05/08/2008] [Accepted: 06/17/2008] [Indexed: 11/28/2022]
Abstract
It is known that transcription factors (TFs) work in cooperation with each other to govern gene expression and thus single TF studies may not always reflect the underlying biology. Using microarray data obtained from two independent studies of the first wave of spermatogenesis, we tested the hypothesis that co-expressed spermatogenic genes in cells committed to differentiation are regulated by a set of distinct combinations of TF modules. A computational approach was designed to identify over-represented module combinations in the promoter regions of genes associated with transcripts that either increase or decrease in abundance between the first two major spermatogenic cell types: spermatogonia and spermatocytes. We identified five TFs constituting four module combinations that were correlated with expression and repression of similarly regulated genes. These modules were biologically assessed in the context that they represent the key transcriptional mediators in the developmental transition from the spermatogonia to spermatocyte.
Affiliation(s)
- Claudia Lalancette
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, 275 East Hancock, Detroit, MI 48201, USA.
168
Tuteja G, Jensen ST, White P, Kaestner KH. Cis-regulatory modules in the mammalian liver: composition depends on strength of Foxa2 consensus site. Nucleic Acids Res 2008; 36:4149-57. [PMID: 18556755 PMCID: PMC2475634 DOI: 10.1093/nar/gkn366] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Indexed: 12/16/2022] Open
Abstract
Foxa2 is a critical transcription factor that controls liver development and plays an important role in hepatic gluconeogenesis in adult mice. Here, we use genome-wide location analysis for Foxa2 to identify its targets in the adult liver. We then show by computational analyses that Foxa2-containing cis-regulatory modules are not constructed from a random assortment of binding sites for other transcription factors expressed in the liver, but rather that their composition depends on the strength of the Foxa2 consensus site present. Genes containing a cis-regulatory module with a medium or weak Foxa2 consensus site are much more liver-specific than the genes with a strong consensus site. We not only provide a better understanding of the mechanisms of Foxa2 regulation but also introduce a novel method for identification of different cis-regulatory modules involving a single factor.
Affiliation(s)
- Geetu Tuteja
- Department of Genetics, Genomics and Computational Biology Graduate Group, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
|
169
|
Matusik RJ, Jin RJ, Sun Q, Wang Y, Yu X, Gupta A, Nandana S, Case TC, Paul M, Mirosevich J, Oottamasathien S, Thomas J. Prostate epithelial cell fate. Differentiation 2008; 76:682-98. [PMID: 18462434 DOI: 10.1111/j.1432-0436.2008.00276.x] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Abstract
Androgen receptor (AR) within prostatic mesenchymal cells, with the absence of AR in the epithelium, is still sufficient to induce prostate development. AR in the luminal epithelium is required to express the secretory markers associated with differentiation. Nkx3.1 is expressed in the epithelium in early prostatic embryonic development and expression is maintained in the adult. Induction of the mouse prostate gland by the embryonic mesenchymal cells results in the organization of a sparse basal layer below the luminal epithelium with rare neuroendocrine cells that are interspersed within this basal layer. The human prostate shows similar glandular organization; however, the basal layer is continuous. The strong inductive nature of embryonic prostatic and bladder mesenchymal cells is demonstrated in grafts where embryonic stem (ES) cells are induced to differentiate and organize as a prostate and bladder, respectively. Further, the ES cells can be driven by the correct embryonic mesenchymal cells to form epithelium that differentiates into secretory prostate glands and differentiated bladders that produce uroplakin. This requires the ES cells to mature into endoderm that gives rise to differentiated epithelium. This process is controlled by transcription factors in both the inductive mesenchymal cells (AR) and the responding epithelium (FoxA1 and Nkx3.1) that allow for organ development and differentiation. In this review, we explore a molecular mechanism where the pattern of transcription factor expression controls cell determination, where the cell is assigned a developmental fate and subsequently cell differentiation, and where the assigned cell now emerges with its own unique character.
Affiliation(s)
- Robert J Matusik
- Department of Urologic Surgery, Vanderbilt University Medical Center, A-1302 Medical Center North, 1161 21st Ave South, Nashville, TN 37232-2765, USA.
|
170
|
Abstract
As the number of sequenced genomes increases, the ability to deduce genome function becomes increasingly salient. For many genome sequences, the only annotation that will be available for the foreseeable future will be based on computational predictions and comparisons with functional elements in related species. Here we discuss computational approaches for automated genome-wide annotation of functional elements in mammalian genomes. These include methods for ab initio and comparative gene-structure predictions. Gene features such as intron splice sites, 3' untranslated regions, promoters, and cis-regulatory elements are discussed, as is a novel method for predicting DNaseI hypersensitive sites. Recent methodologies for predicting noncoding RNA genes, including microRNA genes and their targets, are also reviewed.
Affiliation(s)
- Steven J M Jones
- Genome Sciences Centre, British Columbia Cancer Research Center, Vancouver, British Columbia, V5Z 1L3, Canada.
|
171
|
Vuori KA, Nordlund E, Kallio J, Salakoski T, Nikinmaa M. Tissue-specific expression of aryl hydrocarbon receptor and putative developmental regulatory modules in Baltic salmon yolk-sac fry. Aquatic Toxicology (Amsterdam, Netherlands) 2008; 87:19-27. [PMID: 18294709 DOI: 10.1016/j.aquatox.2008.01.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/22/2007] [Revised: 12/21/2007] [Accepted: 01/03/2008] [Indexed: 05/25/2023]
Abstract
The aryl hydrocarbon receptor (AhR) is an ancient protein that is conserved in vertebrates and invertebrates, indicating its important function throughout evolution. AhR has been studied largely because of its role in toxicology: gene expression via AhR is induced by many aromatic hydrocarbons in mammals. Recently, however, it has become clear that AhR is involved in various aspects of development such as cell proliferation and differentiation, and cell motility and migration. The mechanisms by which AhR regulates these various functions remain poorly understood. Cross-species comparative studies of AhR in invertebrates, non-mammalian vertebrates and mammals may help to reveal the multiple functions of AhR. Here, we have studied AhR during larval development of Baltic salmon (Salmo salar). Our results indicate that AhR protein is expressed in nervous system, liver and muscle tissues. We also present putative regulatory modules and module-matching genes, produced by chromatin immunoprecipitation (ChIP) cloning and in silico analysis, which may be associated with evolutionarily conserved functions of AhR during development. For example, the module NFKB-AHRR-CREB found in salmon ChIP sequences is present in the promoters of human ULK3 (regulating formation of granule cell axons in mouse and axon outgrowth in Caenorhabditis elegans) and SRGAP1 (GTPase-activating protein involved in the Slit/Robo pathway). We suggest that AhR may have an evolutionarily conserved role in neuronal development and nerve cell targeting, and in the Wnt signaling pathway.
Affiliation(s)
- Kristiina A Vuori
- Centre of Excellence in Evolutionary Genetics and Physiology, Department of Biology, University of Turku, FI-20014 Turku, Finland.
|
172
|
Murakami K, Imanishi T, Gojobori T, Nakai K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 2008; 9:112. [PMID: 18312685 PMCID: PMC2292176 DOI: 10.1186/1471-2164-9-112] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/22/2007] [Accepted: 03/01/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND It is essential in modern biology to understand how transcriptional regulatory regions are composed of cis-elements, yet we have limited knowledge of, for example, the combinational uses of these elements and their positional distribution. RESULTS We predicted the positions of 228 known binding motifs for transcription factors in phylogenetically conserved regions within -2000 and +1000 bp of transcriptional start sites (TSSs) of human genes and visualized their correlated non-overlapping occurrences. In the 8,454 significantly correlated motif pairs, two major classes were observed: 248 pairs in Class 1 were mainly found around TSSs, whereas 4,020 Class 2 pairs appear at rather arbitrary distances from TSSs. These classes are distinct in a number of aspects. First, the positional distribution of the Class 1 constituent motifs shows a single peak near the TSSs, whereas Class 2 motifs show a relatively broad distribution. Second, genes that harbor the Class 1 pairs are more likely to be CpG-rich and to be expressed ubiquitously than those that harbor Class 2 pairs. Third, the 'hub' motifs, which are used in many different motif pairs, are different between the two classes. In addition, many of the transcription factors that correspond to the Class 2 hub motifs contain domains rich in specific amino acids; these domains may form disordered regions important for protein-protein interaction. CONCLUSION There exist at least two classes of motif pairs with respect to TSSs in human promoters, possibly reflecting compositional differences between promoters and enhancers. We anticipate that our visualization method may be useful for the further characterisation of promoters.
Affiliation(s)
- Katsuhiko Murakami
- Integrated Database Group, Japan Biological Information Research Center (JBIRC), Japan Biological Informatics Consortium, Aomi 2-41, Koto-ku, Tokyo, 135-0064, Japan.
|
173
|
Schilstra MJ, Nehaniv CL. Bio-logic: gene expression and the laws of combinatorial logic. Artificial Life 2008; 14:121-133. [PMID: 18171135 DOI: 10.1162/artl.2008.14.1.121] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 05/25/2023]
Abstract
At the heart of the development of fertilized eggs into fully formed organisms and the adaptation of cells to changed conditions are genetic regulatory networks (GRNs). In higher multicellular organisms, signal selection and multiplexing are performed at the cis-regulatory domains of genes, where combinations of transcription factors (TFs) regulate the rates at which the genes are transcribed into mRNA. To be able to act as activators or repressors of gene transcription, TFs must first bind to target sequences on the regulatory domains. Two TFs that act in concert may bind entirely independently of each other, but more often binding of the first one will alter the affinity of the other for its binding site. This article presents a systematic investigation into the effect of TF binding dependences on the predicted regulatory function of this bio-logic. Four extreme scenarios, commonly used to classify enzyme activation and inhibition patterns, for the binding of two TFs were explored: independent (the TFs bind without affecting each other's affinities), competitive (the TFs compete for the same binding site), ordered (the TFs bind in a compulsory order), and joint binding (the TFs either bind as a preformed complex, or binding of one is virtually impossible in the absence of the other). The conclusions are: (1) the laws of combinatorial logic hold only for systems with independently binding TFs; (2) systems formed according to the other scenarios can mimic the functions of their Boolean logical counterparts, but cannot be combined or decomposed in the same way; and (3) the continuously scaled output of systems consisting of competitively binding activators and repressors can be controlled more robustly than that of single TF or (quasi-)logical multi-TF systems.
Affiliation(s)
- Maria J Schilstra
- Biological and Neural Computation Group, Science and Technology Research Institute, University of Hertfordshire, College Lane, Hatfield, Hertfordshire AL10 9AB, United Kingdom.
|
174
|
Simpson P, Ayyar S. Chapter 3 Evolution of Cis-Regulatory Sequences in Drosophila. Long-Range Control of Gene Expression 2008; 61:67-106. [DOI: 10.1016/s0065-2660(07)00003-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 12/30/2022]
|
175
|
Coulibaly I, Page GP. Bioinformatic tools for inferring functional information from plant microarray data II: Analysis beyond single gene. International Journal of Plant Genomics 2008; 2008:893941. [PMID: 18615189 PMCID: PMC2443398 DOI: 10.1155/2008/893941] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 11/02/2007] [Accepted: 05/05/2008] [Indexed: 05/26/2023]
Abstract
While it is possible to interpret microarray experiments a single gene at a time, most studies generate long lists of differentially expressed genes whose interpretation requires the integration of prior biological knowledge. This prior knowledge is stored in various public and private databases and covers several aspects of gene function and biological information. In this review, we describe the tools and resources for finding accurate prior biological information, and how to process and incorporate that information when interpreting microarray data analyses. Here, we highlight selected tools and resources for gene class level ontology analysis (Section 2), gene coexpression analysis (Section 3), gene network analysis (Section 4), biological pathway analysis (Section 5), analysis of transcriptional regulation (Section 6), and omics data integration (Section 7). The overall goal of this review is to provide researchers with tools and information to facilitate the interpretation of microarray data.
Affiliation(s)
- Issa Coulibaly
- Department of Biostatistics, University of Alabama at Birmingham, 1665 University Blvd Ste 327, Birmingham, AL 35294-0022, USA
- Grier P. Page
- Department of Biostatistics, University of Alabama at Birmingham, 1665 University Blvd Ste 327, Birmingham, AL 35294-0022, USA
|
176
|
Zammit PS, Cohen A, Buckingham ME, Kelly RG. Integration of embryonic and fetal skeletal myogenic programs at the myosin light chain 1f/3f locus. Dev Biol 2007; 313:420-33. [PMID: 18062958 DOI: 10.1016/j.ydbio.2007.10.044] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 06/13/2007] [Revised: 10/16/2007] [Accepted: 10/26/2007] [Indexed: 12/25/2022]
Abstract
The genetic control of skeletal muscle differentiation at the onset of myogenesis in the embryo is relatively well understood compared to the formation of muscle during the fetal period giving rise to the bulk of skeletal muscle fibers at birth. The Mlc1f/3f (Myl1) locus encodes two alkali myosin light chains, Mlc1f and Mlc3f, from two promoters that are differentially regulated during development. The Mlc1f promoter is active in embryonic, fetal and adult fast skeletal muscle whereas the Mlc3f promoter is upregulated during fetal development and remains on in adult fast skeletal muscle. Two enhancer elements have been identified at the mammalian Mlc1f/3f locus, a 3' element active at all developmental stages and an intronic enhancer activated during fetal development. Here, using transgenesis, we demonstrate that these enhancers act combinatorially to confer the spatial, temporal and quantitative expression profile of the endogenous Mlc3f promoter. Using double reporter transgenes we demonstrate that each enhancer can activate both Mlc1f and Mlc3f promoters in vivo, revealing enhancer sharing rather than exclusive enhancer-promoter interactions. Finally, we demonstrate that the fetal activated enhancer contains critical E-box myogenic regulatory factor binding sites and that enhancer activation is impaired in vivo in the absence of myogenin but not in the absence of innervation. Together our observations provide insights into the regulation of fetal myogenesis and the mechanisms by which temporally distinct genetic programs are integrated at a single locus.
Affiliation(s)
- Peter S Zammit
- Department of Developmental Biology, CNRS URA 2578, Pasteur Institute, 28 Rue du Dr Roux, 75724 Paris Cedex 15, France
|
177
|
|
178
|
Noman N, Iba H. Inferring gene regulatory networks using differential evolution with local search heuristics. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2007; 4:634-647. [PMID: 17975274 DOI: 10.1109/tcbb.2007.1058] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.5] [Indexed: 05/25/2023]
Abstract
We present a memetic algorithm for evolving the structure of biomolecular interactions and inferring the effective kinetic parameters from the time series data of gene expression using the decoupled S-system formalism. We propose an information criteria-based fitness evaluation for gene network model selection instead of the conventional mean squared error (MSE)-based fitness evaluation. A hill-climbing local-search method has been incorporated in our evolutionary algorithm for efficiently attaining the skeletal architecture which is most frequently observed in biological networks. The suitability of the method is tested in gene circuit reconstruction experiments, varying the network dimension and/or characteristics, the amount of gene expression data used for inference and the noise level present in expression profiles. The reconstruction method inferred the network topology and the regulatory parameters with high accuracy. Nevertheless, the performance is limited by the amount of expression data used and the noise level present in the data. The proposed fitness function has been found more suitable for identifying the correct network topology and for estimating accurate parameter values than the existing ones. Finally, we applied the methodology to analyze the cell-cycle gene expression data of budding yeast and reconstructed the network of some key regulators.
Affiliation(s)
- Nasimul Noman
- Iba Laboratory, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan.
|
179
|
Bussemaker HJ, Foat BC, Ward LD. Predictive modeling of genome-wide mRNA expression: from modules to molecules. Annu Rev Biophys Biomol Struct 2007; 36:329-47. [PMID: 17311525 DOI: 10.1146/annurev.biophys.36.040306.132725] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.4] [Indexed: 11/09/2022]
Abstract
Various algorithms are available for predicting mRNA expression and modeling gene regulatory processes. They differ in whether they rely on the existence of modules of coregulated genes or build a model that applies to all genes, whether they represent regulatory activities as hidden variables or as mRNA levels, and whether they implicitly or explicitly model the complex cis-regulatory logic of multiple interacting transcription factors binding the same DNA. The fact that functional genomics data of different types reflect the same molecular processes provides a natural strategy for integrative computational analysis. One promising avenue toward an accurate and comprehensive model of gene regulation combines biophysical modeling of the interactions among proteins, DNA, and RNA with the use of large-scale functional genomics data to estimate regulatory network connectivity and activity parameters. As the ability of these models to represent complex cis-regulatory logic increases, the need for approaches based on cross-species conservation may diminish.
Affiliation(s)
- Harmen J Bussemaker
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA.
|
180
|
Genomewide computational analysis of nitrate response elements in rice and Arabidopsis. Mol Genet Genomics 2007. [PMID: 17680272 DOI: 10.1007/s00438-007-0268-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 09/29/2022]
|
181
|
Landry CR, Hartl DL, Ranz JM. Genome clashes in hybrids: insights from gene expression. Heredity (Edinb) 2007; 99:483-93. [PMID: 17687247 DOI: 10.1038/sj.hdy.6801045] [Citation(s) in RCA: 114] [Impact Index Per Article: 6.3] [Indexed: 11/09/2022]
Abstract
In interspecific hybrids, novel phenotypes often emerge from the interaction of two divergent genomes. Interactions between the two transcriptional networks are assumed to contribute to these unpredicted new phenotypes by inducing novel patterns of gene expression. Here we provide a review of the recent literature on the accumulation of regulatory incompatibilities. We review specific examples of regulatory incompatibilities reported at particular loci as well as genome-scale surveys of gene expression in interspecific hybrids. Finally, we consider and preview novel technologies that could help decipher how divergent transcriptional networks interact in hybrids between species.
Affiliation(s)
- C R Landry
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA.
|
182
|
Das SK, Pathak RR, Choudhury D, Raghuram N. Genomewide computational analysis of nitrate response elements in rice and Arabidopsis. Mol Genet Genomics 2007; 278:519-25. [PMID: 17680272 DOI: 10.1007/s00438-007-0268-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 03/06/2007] [Accepted: 06/11/2007] [Indexed: 10/23/2022]
Abstract
Nitrate response element (NRE) was originally reported to be comprised of an Ag/cTCA core sequence motif preceded by a 7-bp AT rich region, based on promoter deletion analyses in nitrate and nitrite reductases from Arabidopsis thaliana and birch. In view of hundreds of new nitrate responsive genes discovered recently, we sought to computationally verify whether the above motif indeed qualifies to be the cis-acting NRE for all the responsive genes. We searched for the specific occurrence of at least two copies of the above motif in and around the nitrate responsive genes and elsewhere in the Arabidopsis and rice (Oryza sativa) genomes, with respect to their positional, orientational and strand-specific bias. This is the first comprehensive analysis of NREs for 625 nitrate responsive genes of Arabidopsis and their rice homologs, representing dicots and monocots, respectively. We report that the above motifs are present almost randomly throughout these genomes and do not reveal any specificity or bias towards nitrate responsive genes. This also seems to be true for smaller subsets of nitrate responsive genes in Arabidopsis, such as the 21 early responsive genes, 261 and 90 genes for root-specific and shoot-specific response, respectively, and 25 housekeeping genes. This necessitates a fresh search for candidate sequences that qualify to be NREs in these and other plants.
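As a rough illustration of the kind of motif scan this abstract describes, the sketch below looks for candidate NRE sites in a DNA string. It rests on assumptions that are not in the abstract: "AT rich" is read strictly as seven consecutive A or T bases, "Ag/cTCA" is read as A, then G or C, then TCA, and only the given strand is scanned. The names `NRE_PATTERN`, `find_nre_sites`, `has_two_copies` and the toy sequence are hypothetical, and the authors' actual search criteria may well differ.

```python
import re

# Assumption: 7 consecutive A/T bases, then A, then G or C, then TCA.
# The lookahead lets overlapping candidate sites be reported separately.
NRE_PATTERN = re.compile(r"(?=([AT]{7}A[GC]TCA))")


def find_nre_sites(sequence):
    """Return 0-based start positions of candidate NRE motifs on the given strand."""
    return [match.start() for match in NRE_PATTERN.finditer(sequence.upper())]


def has_two_copies(sequence):
    """True if the sequence carries at least two candidate motif copies."""
    return len(find_nre_sites(sequence)) >= 2


# Toy sequence (made up for illustration) containing two candidate sites.
toy = "GGATATTAAAGTCACCCCTTTTAAAACTCAGG"
print(find_nre_sites(toy))
print(has_two_copies(toy))
```

A fuller scan would also search the reverse complement and compare motif counts near responsive genes against a genome-wide background, which is the comparison the abstract reports as showing no enrichment.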
Affiliation(s)
- Suman K Das
- School of Biotechnology, Guru Gobind Singh Indraprastha University, Kashmiri Gate, Delhi, 110 006, India
|
183
|
Ettwiller L, Paten B, Ramialison M, Birney E, Wittbrodt J. Trawler: de novo regulatory motif discovery pipeline for chromatin immunoprecipitation. Nat Methods 2007; 4:563-5. [PMID: 17589518 DOI: 10.1038/nmeth1061] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.1] [Received: 03/13/2007] [Accepted: 05/18/2007] [Indexed: 11/08/2022]
Abstract
We developed Trawler, the fastest computational pipeline to date, to efficiently discover over-represented motifs in chromatin immunoprecipitation (ChIP) experiments and to predict their functional instances. When we applied Trawler to data from yeast and mammals, 83% of the known binding sites were accurately called, often with other additional binding sites, providing hints of combinatorial input. Newly discovered motifs and their features (identity, conservation, position in sequence) are displayed on a web interface.
|
184
|
Wang Z, Wei GH, Liu DP, Liang CC. Unravelling the world of cis-regulatory elements. Med Biol Eng Comput 2007; 45:709-18. [PMID: 17541666 DOI: 10.1007/s11517-007-0195-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 11/26/2006] [Accepted: 05/03/2007] [Indexed: 12/16/2022]
Abstract
Genome-wide comparisons indicate that studying only the coding regions is not enough to explain the biological complexity of an organism, and that the genetic variants and epigenetic differences of cis-regulatory elements are crucial for elucidating many complicated biological phenomena. Their various regulatory functions also play indispensable roles in forming organismal polymorphism. Recent studies showed that cis-regulatory elements can regulate gene expression as nuclear organizers, are involved in functional noncoding transcription, and produce regulatory noncoding RNA molecules. Novel high-throughput strategies and in silico analysis make a great amount of data on cis-regulatory elements available. In particular, computational methods could help to combine reductionist studies with network biomedical investigations and open an era of understanding organismal regulatory events at the systems biology level.
Affiliation(s)
- Zhao Wang
- National Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences and Peking Union Medical College, Dong Dan San Tiao 5, 100005 Beijing, China
|
185
|
Cheung TH, Barthel KKB, Kwan YL, Liu X. Identifying pattern-defined regulatory islands in mammalian genomes. Proc Natl Acad Sci U S A 2007; 104:10116-21. [PMID: 17535887 PMCID: PMC1891267 DOI: 10.1073/pnas.0704028104] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 02/04/2023]
Abstract
Identifying cis-regulatory regions in mammalian genomes is a key challenge toward understanding transcriptional regulation. However, identification and functional characterization of those regulatory elements governing differential gene expression has been hampered by the limited understanding of their organization and locations in genomes. We hypothesized that genes that are conserved across species will also display conservation at the level of their transcriptional regulation and that this will be reflected in the organization of cis-elements mediating this regulation. Using a computational approach, clusters of transcription factor binding sites that are absolutely conserved in order and in spacing across human, rat, and mouse genomes were identified. We term these regions pattern-defined regulatory islands (PRIs). We discovered that these sequences are frequently active sites of transcriptional regulation. These PRIs occur in approximately 1.1% of the half-billion base pairs covered in the search and are located mainly in noncoding regions of the genome. We show that the premise of PRIs can be used to identify previously known and novel cis-regulatory regions controlling genes regulated by myogenic differentiation. Thus, PRIs may represent a fundamental property of the architecture of cis-regulatory elements in mammalian genomes, and this feature can be exploited to pinpoint critical transcriptional regulatory elements governing cell type-specific gene expression.
Affiliation(s)
- Tom H. Cheung
- Department of Chemistry and Biochemistry, University of Colorado, Boulder, CO 80309
- Yin Lam Kwan
- Department of Chemistry and Biochemistry, University of Colorado, Boulder, CO 80309
- Xuedong Liu
- Department of Chemistry and Biochemistry, University of Colorado, Boulder, CO 80309
- To whom correspondence should be addressed. E-mail:
|
186
|
Multiple non-collinear TF-map alignments of promoter regions. BMC Bioinformatics 2007; 8:138. [PMID: 17456238 PMCID: PMC1878506 DOI: 10.1186/1471-2105-8-138] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 01/19/2007] [Accepted: 04/24/2007] [Indexed: 12/25/2022]
Abstract
Background The analysis of the promoter sequence of genes with similar expression patterns is a basic tool to annotate common regulatory elements. Multiple sequence alignments are the basis of most comparative approaches. The characterization of regulatory regions from co-expressed genes at the sequence level, however, does not yield satisfactory results on many occasions, as promoter regions of genes sharing similar expression programs often do not show nucleotide sequence conservation. Results In a recent approach to circumvent this limitation, we proposed to align the maps of predicted transcription factors (referred to as TF-maps) instead of the nucleotide sequence of two related promoters, taking into account the label of the corresponding factor and the position in the primary sequence. We have now extended the basic algorithm to permit multiple promoter comparisons using the progressive alignment paradigm. In addition, non-collinear conservation blocks might now be identified in the resulting alignments. We have optimized the parameters of the algorithm in a small but well-characterized collection of human-mouse-chicken-zebrafish orthologous gene promoters. Conclusion Results in this dataset indicate that TF-map alignments are able to detect high-level regulatory conservation at the promoter and the 3'UTR gene regions, which cannot be detected by typical sequence alignments. Three particular examples are introduced here to illustrate the power of multiple TF-map alignments to characterize conserved regulatory elements in the absence of sequence similarity. We consider that this kind of approach can be extremely useful in the future to annotate potential transcription factor binding sites on sets of co-regulated genes from high-throughput expression experiments.
|
187
|
Abnizova I, Walter K, Te Boekhorst R, Elgar G, Gilks WR. Statistical information characterization of conserved non-coding elements in vertebrates. J Bioinform Comput Biol 2007; 5:533-47. [PMID: 17636860 DOI: 10.1142/s0219720007002898] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 10/26/2006] [Revised: 02/23/2007] [Accepted: 02/23/2007] [Indexed: 01/28/2023]
Abstract
Recently, a set of highly conserved non-coding elements (CNEs) has been derived from a comparison between the genomes of the puffer fish, Takifugu or Fugu rubripes, and man. In order to facilitate the identification of these conserved elements in silico, we characterize them by a number of statistical features. We found a pronounced information pattern around CNE borders; although the CNEs themselves are AT rich and have high entropy (complexity), they are flanked by GC-rich regions of low entropy (complexity). We also identified the most abundant motifs within and around of CNEs, and identified those that group around their borders. Like in human promoter regions, the TBP, NF-Y and some other binding motifs are clustered around CNE boundaries, which may suggest a possible transcription regulatory function of CNEs.
|
188
|
Zhao G, Schriefer LA, Stormo GD. Identification of muscle-specific regulatory modules in Caenorhabditis elegans. Genome Res 2007; 17:348-57. [PMID: 17284674 PMCID: PMC1800926 DOI: 10.1101/gr.5989907] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Indexed: 12/20/2022]
Abstract
Transcriptional regulation is the major regulatory mechanism that controls the spatial and temporal expression of genes during development. This is carried out by transcription factors (TFs), which recognize and bind to their cognate binding sites. Recent studies suggest a modular organization of TF-binding sites, in which clusters of transcription-factor binding sites cooperate in the regulation of downstream gene expression. In this study, we report our computational identification and experimental verification of muscle-specific cis-regulatory modules in Caenorhabditis elegans. We first identified a set of motifs that are correlated with muscle-specific gene expression. We then predicted muscle-specific regulatory modules based on clusters of those motifs with characteristics similar to a collection of well-studied modules in other species. The method correctly identifies 88% of the experimentally characterized modules with a positive predictive value of at least 65%. The prediction accuracy of muscle-specific expression on an independent test set is highly significant (P<0.0001). We performed in vivo experimental tests of 12 predicted modules, and 10 of those drive muscle-specific gene expression. These results suggest that our method is highly accurate in identifying functional sequences important for muscle-specific gene expression and is a valuable tool for guiding experimental designs.
Affiliation(s)
- Guoyan Zhao
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Lawrence A. Schriefer
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Gary D. Stormo
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Corresponding author. E-mail ; fax (314) 362-7855
|
189
|
Chan ZSH, Collins L, Kasabov N. Bayesian learning of sparse gene regulatory networks. Biosystems 2007; 87:299-306. [PMID: 17223483 DOI: 10.1016/j.biosystems.2006.09.026] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 02/28/2005] [Revised: 07/08/2006] [Accepted: 07/15/2006] [Indexed: 11/21/2022]
Abstract
Differential equations (DEs) have been the most widespread formalism for gene regulatory network (GRN) modeling, as they offer natural interpretation of biological processes, easy elucidation of gene relationships, and the capability of using efficient parameter estimation methods. However, an important limitation of DEs is their requirement of O(d^2) parameters, where d is the number of genes modeled, which often causes over-parameterization for large d, leading to the over-fitting of data and dense parameter sets that are hard to interpret. This paper presents the first effort to address the over-parameterization problem by applying the sparse Bayesian learning (SBL) method to sparsify the GRN model of DEs. SBL operates on the parsimony principle, with the objective of reducing the number of effective parameters by driving the redundant parameters to zero. The resulting sparse parameter set offers three important advantages for GRN inference: first, the inferred GRNs are more plausible, since the biological counterparts are known to be sparse; second, gene relationships can be more easily elucidated from sparse sets than from dense sets; and third, the solutions become more optimal and consistent, due to the reduction in the volume of solution space. Experiments are conducted on the yeast Saccharomyces cerevisiae time-series gene expression data, in which known regulatory events related to the cell cycle G1/S phase are reliably reproduced.
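To make the parameter count concrete, a linear DE model of the kind commonly used for GRNs can be written as follows. This is an assumed illustrative form; the abstract does not state the exact equations used, and the symbols x_i, w_ij and b_i are introduced here only for illustration.

```latex
\frac{dx_i}{dt} = \sum_{j=1}^{d} w_{ij}\, x_j + b_i, \qquad i = 1, \dots, d
```

Here x_i would be the expression level of gene i, w_ij the influence of gene j on gene i, and b_i a constant term. The d-by-d weight matrix accounts for the O(d^2) parameters, and a sparse solution is one in which most w_ij are driven to zero.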
Affiliation(s)
- Zeke S H Chan
- Knowledge Engineering and Discovery Research Institute (KEDRI), Auckland University of Technology, Auckland, New Zealand.
|
190
|
Down TA, Bergman CM, Su J, Hubbard TJP. Large-scale discovery of promoter motifs in Drosophila melanogaster. PLoS Comput Biol 2007; 3:e7. [PMID: 17238282 PMCID: PMC1779301 DOI: 10.1371/journal.pcbi.0030007] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.2] [Received: 08/17/2006] [Accepted: 12/01/2006] [Indexed: 11/28/2022] Open
Abstract
A key step in understanding gene regulation is to identify the repertoire of transcription factor binding motifs (TFBMs) that form the building blocks of promoters and other regulatory elements. Identifying these experimentally is very laborious, and the number of TFBMs discovered remains relatively small, especially when compared with the hundreds of transcription factor genes predicted in metazoan genomes. We have used a recently developed statistical motif discovery approach, NestedMICA, to detect candidate TFBMs from a large set of Drosophila melanogaster promoter regions. Of the 120 motifs inferred in our initial analysis, 25 were statistically significant matches to previously reported motifs, while 87 appeared to be novel. Analysis of sequence conservation and motif positioning suggested that the great majority of these discovered motifs are predictive of functional elements in the genome. Many motifs showed associations with specific patterns of gene expression in the D. melanogaster embryo, and we were able to obtain confident annotation of expression patterns for 25 of our motifs, including eight of the novel motifs. The motifs are available through Tiffin, a new database of DNA sequence motifs. We have discovered many new motifs that are overrepresented in D. melanogaster promoter regions, and offer several independent lines of evidence that these are novel TFBMs. Our motif dictionary provides a solid foundation for further investigation of regulatory elements in Drosophila, and demonstrates techniques that should be applicable in other species. We suggest that further improvements in computational motif discovery should narrow the gap between the set of known motifs and the total number of transcription factors in metazoan genomes.
Affiliation(s)
- Thomas A Down
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom.
|
191
|
Abstract
Beginning in the late 1980s, Eric Davidson's group at Caltech developed a modularity hypothesis of developmental gene regulation, showing that in an expanding number of cases, particular aspects of development were governed by compact 'modules' of transcription factor binding sites (TFBSs), and that these modules were separable, complex and interconnected. Davidson made no attempt to further generalize the hypothesis, but others took up the idea, transported it out of development and extended it to a general rule of clustering. Despite such misbegotten origins, the 'extended' modularity hypothesis--that TFBSs in general tend to come in compact clusters--has been highly productive, yet it has never been challenged with a large, diverse and unbiased dataset to see how universal it actually is. The aim of the present paper is to do so. Applying human-mouse-rat phylogenetic footprinting to neighbourhoods of a diverse set of TFBSs, including both developmental and non-developmental signals, we find that the extended hypothesis holds in at least 93.5% of cases. Based on this particular sample, we found a mean module length of 609 nucleotides containing, on average, 24.5 presumptive regulatory signals of length greater than 5 and averaging 8.5 nucleotides each.
Affiliation(s)
- Rune Blomhoff
- Author and address for correspondence: PO Box 1046 Blindern, 0316 Oslo, Norway
|
192
|
Müller F, Borycki AG. Sequence analyses to study the evolutionary history and cis-regulatory elements of Hedgehog genes. Methods Mol Biol 2007; 397:231-250. [PMID: 18025724 DOI: 10.1007/978-1-59745-516-9_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2023]
Abstract
Sequence analysis and comparative genomics are powerful tools to gain knowledge on multiple aspects of gene and protein regulation and function. These have been widely used to understand the evolutionary history and the biochemistry of Hedgehog (Hh) proteins, and the molecular control of Hedgehog gene expression. Here, we report on some of the methods available to retrieve protein and genomic sequences. We describe how protein sequence comparison can produce information on the evolutionary history of Hh proteins. Moreover, we describe the use of genomic sequence analysis including phylogenetic footprinting and transcription factor-binding site search tools, techniques that allow for the characterization of cis-regulatory elements of developmental genes such as the Hedgehog genes.
|
193
|
Abstract
DNA sequences that regulate expression of the insulin gene are located within a region spanning approximately 400 bp that flanks the transcription start site. This region, the insulin promoter, contains a number of cis-acting elements that bind transcription factors, some of which are expressed only in the beta-cell and a few other endocrine or neural cell types, while others have a widespread tissue distribution. The sequencing of the genomes of a number of species has allowed us to examine the manner in which the insulin promoter has evolved over a 450 million-year period. The major findings are that the A-box sites that bind PDX-1 are among the most highly conserved regulatory sequences, and that the conservation of the C1, E1, and CRE sequences emphasizes the importance of MafA, E47/beta2, and cAMP-associated regulation. The review also reveals that of all the insulin gene promoters studied, the rodent insulin promoters are considerably dissimilar to the human, leading to the conclusion that extreme care should be taken when extrapolating rodent-based data on the insulin gene to humans.
Affiliation(s)
- Colin W Hay
- School of Medical Sciences, University of Aberdeen, Institute of Medical Sciences, Aberdeen, AB25 2ZD, UK
|
194
|
Pasquet S, Naye F, Faucheux C, Bronchain O, Chesneau A, Thiébaud P, Thézé N. Transcription Enhancer Factor-1-dependent Expression of the α-Tropomyosin Gene in the Three Muscle Cell Types. J Biol Chem 2006; 281:34406-20. [PMID: 16959782 DOI: 10.1074/jbc.m602282200] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 01/10/2023] Open
Abstract
In vertebrates, the actin-binding proteins tropomyosins are encoded by four distinct genes that are expressed in a complex pattern during development and muscle differentiation. In this study, we have characterized the transcriptional machinery of the alpha-tropomyosin (alpha-Tm) gene in muscle cells. Promoter analysis revealed that a 284-bp proximal promoter region of the Xenopus laevis alpha-Tm gene is sufficient for maximal activity in the three muscle cell types. The transcriptional activity of this promoter in the three muscle cell types depends on both distinct and common cis-regulatory sequences. We have identified a 30-bp conserved sequence unique to all vertebrate alpha-Tm genes that contains an MCAT site that is critical for expression of the gene in all muscle cell types. This site can bind transcription enhancer factor-1 (TEF-1) present in muscle cells both in vitro and in vivo. In serum-deprived differentiated smooth muscle cells, TEF-1 was redistributed to the nucleus, and this correlated with increased activity of the alpha-Tm promoter. Overexpression of TEF-1 mRNA in Xenopus embryonic cells led to activation of both the endogenous alpha-Tm gene and the exogenous 284-bp promoter. Finally, we show that, in transgenic embryos and juveniles, an intact MCAT sequence is required for correct temporal and spatial expression of the 284-bp gene promoter. This study represents the first analysis of the transcriptional regulation of the alpha-Tm gene in vivo and highlights a common TEF-1-dependent regulatory mechanism necessary for expression of the gene in the three muscle lineages.
|
195
|
Amemiya CT, Gomez-Chiarri M. Comparative genomics in vertebrate evolution and development. J Exp Zool A Comp Exp Biol 2006; 305:672-82. [PMID: 16902957 DOI: 10.1002/jez.a.308] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/08/2022]
Abstract
The vast quantities of publicly available DNA sequencing data and genome resources are enabling biologists to investigate age-old problems in biology that were not addressable previously. In this review, we discuss how comparative genomics is practiced and how the data can be used to make biological inferences with respect to vertebrate evolution and development. Examples are taken from the well-known HOX clusters, which are always a high-priority target for genomic analyses due to their inferred role in the evolution of metazoans. In addition, we briefly discuss the application of genomic approaches to problems in comparative endocrinology.
Affiliation(s)
- Chris T Amemiya
- Molecular Genetics Program, Benaroya Research Institute at Virginia Mason, Seattle, Washington 98101, USA.
|
196
|
Just W. Reverse Engineering Discrete Dynamical Systems from Data Sets with Random Input Vectors. J Comput Biol 2006; 13:1435-56. [PMID: 17061920 DOI: 10.1089/cmb.2006.13.1435] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022] Open
Abstract
Recently a new algorithm for reverse engineering of biochemical networks was developed by Laubenbacher and Stigler. It is based on methods from computational algebra and finds most parsimonious models for a given data set. We derive mathematically rigorous estimates for the expected amount of data needed by this algorithm to find the correct model. In particular, we demonstrate that for one type of input parameter (graded term orders), the expected data requirements scale polynomially with the number n of chemicals in the network, while for another type of input parameters (randomly chosen lex orders) this number scales exponentially in n. We also show that, for a modification of the algorithm, the expected data requirements scale as the logarithm of n.
Affiliation(s)
- Winfried Just
- Department of Mathematics, Ohio University, Athens, 45701, USA.
|
197
|
Bishop KJM, Gray TP, Fialkowski M, Grzybowski BA. Microchameleons: nonlinear chemical microsystems for amplification and sensing. Chaos 2006; 16:037102. [PMID: 17014236 DOI: 10.1063/1.2240142] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/12/2023]
Abstract
In biological systems, the coupling of nonlinear biochemical kinetics and molecular transport enables functional sensing and "signal" amplification across many length scales. Drawing on biological inspiration, we describe how artificial reaction-diffusion (RD) microsystems can provide a basis for sensing applications, capable of amplifying micro- and nanoscopic events into macroscopic visual readouts. The RD applications reviewed here are based on a novel experimental technique, WETS (Wet Stamping), which offers unprecedented control over RD processes in microscopic and complex geometries. We discuss how RD can be used to sense subtle differences in the thickness and/or absorptivity of thin absorptive films, amplify macromolecular phase transitions, detect the presence and quality of self-assembled monolayers, and provide dynamic spatiotemporal readouts of chemical "metabolites."
Affiliation(s)
- K J M Bishop
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, Illinois 60208, USA
|
198
|
Abstract
Cis-regulatory sequences direct patterns of gene expression essential for development and physiology. Evolutionary changes in these sequences contribute to phenotypic divergence. Despite their importance, cis-regulatory regions remain one of the most enigmatic features of the genome. Patterns of sequence evolution can be used to identify cis-regulatory elements, but the power of this approach depends upon the relationship between sequence and function. Comparative studies of gene regulation among Diptera reveal that divergent sequences can underlie conserved expression, and that expression differences can evolve despite largely similar sequences. This complex structure-function relationship is the primary impediment to computational identification and interpretation of cis-regulatory sequences. Biochemical characterization and in vivo assays of cis-regulatory sequences on a genomic scale will relieve this barrier.
Affiliation(s)
- P J Wittkopp
- Department of Ecology and Evolutionary Biology, University of Michigan, 1061 Natural Science Building, 830 North University Ave., Ann Arbor, MI 48109-1048, USA.
|
199
|
Perco P, Rapberger R, Siehs C, Lukas A, Oberbauer R, Mayer G, Mayer B. Transforming omics data into context: Bioinformatics on genomics and proteomics raw data. Electrophoresis 2006; 27:2659-75. [PMID: 16739231 DOI: 10.1002/elps.200600064] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.5] [Indexed: 11/09/2022]
Abstract
Differential gene expression analysis and proteomics have exerted significant impact on the elucidation of concerted cellular processes, as simultaneous measurement of hundreds to thousands of individual objects on the level of RNA and protein ensembles became technically feasible. The availability of such data sets has promised a profound understanding of phenomena on an aggregate level, expressed as the phenotypic response (observables) of cells, e.g., in the presence of drugs, or characterization of cells and tissue displaying distinct patho-physiological states. However, the step of transforming these data into context, i.e., linking distinct expression or abundance patterns with phenotypic observables and, furthermore, enabling a sound biological interpretation on the level of reaction networks and concerted pathways, is still a major shortcoming. This shortcoming certainly stems from the enormous complexity embedded in cellular reaction networks, but a variety of computational approaches have been developed over the last few years to overcome these issues. This review provides an overview of computational procedures for the analysis of genomic and proteomic data, introducing a sequential analysis workflow: explorative statistics for deriving a first candidate gene/protein list that is relevant from a purely statistical viewpoint, followed by co-regulation and network analysis to biologically expand this core list toward functional networks and pathways. The review of these procedures is complemented by example applications tailored to the identification of disease-associated proteins. Optimization of the computational procedures involved, in conjunction with the continuous increase in additional biological data, clearly has the potential to boost our understanding of processes on a cell-wide level.
Affiliation(s)
- Paul Perco
- Department of Nephrology, Medical University of Vienna, Austria
|
200
|
Yu X, Suzuki K, Wang Y, Gupta A, Jin R, Orgebin-Crist MC, Matusik R. The role of forkhead box A2 to restrict androgen-regulated gene expression of lipocalin 5 in the mouse epididymis. Mol Endocrinol 2006; 20:2418-31. [PMID: 16740652 DOI: 10.1210/me.2006-0008] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 11/19/2022] Open
Abstract
Murine epididymal retinoic acid-binding protein [or lipocalin 5 (Lcn5)] is synthesized and secreted by the principal cells of the mouse middle/distal caput epididymidis. A 5-kb promoter fragment of the Lcn5 gene can dictate androgen-dependent and epididymis region-specific gene expression in transgenic mice. Here, we report that the 1.8-kb Lcn5 promoter confers epididymis region-specific gene expression in transgenic mice. To decipher the mechanism that directs transcription, 14 chimeric constructs that sequentially removed 100 bp of the 1.8-kb Lcn5 promoter were generated and transfected into epididymal cells and nonepididymal cells. Transient transfection analysis revealed that the 1.3-kb promoter fragment gave the strongest response to androgens. Within the 1.2-kb to 1.3-kb region, two androgen receptor (AR) binding sites were identified. Adjacent to the AR binding sites, a Foxa2 [Fox (Forkhead box) subclass A] binding site was confirmed by gel shift assay. Similar Foxa binding sites were also found on the promoters of human and rat Lcn5, indicating that the Foxa binding site is conserved among species. We previously reported that among the three members of the Foxa family, Foxa1 and Foxa3 were absent in the epididymis whereas Foxa2 was detected in epididymal principal cells. Here, we report that Foxa2 displays a region-specific expression pattern along the epididymis: no staining in the initial segment, light staining in the proximal caput, progressively heavier staining in the middle and distal caput, and strongest staining in the corpus and cauda, regions with little or no expression of Lcn5. In transient transfection experiments, Foxa2 expression inhibits AR induction of the Lcn5 promoter, which is consistent with the lack of expression of Lcn5 in the corpus and cauda. We conclude that Foxa2 functions as a repressor that restricts AR regulation of Lcn5 to a segment-specific pattern in the epididymis.
Affiliation(s)
- Xiuping Yu
- Department of Urologic Surgery, Vanderbilt University School of Medicine, Nashville, Tennessee 37232, USA
|