1
Konuma J, Fujisawa T, Nishiyama T, Kasahara M, Shibata TF, Nozawa M, Shigenobu S, Toyoda A, Hasebe M, Sota T. Odd-Paired is Involved in Morphological Divergence of Snail-Feeding Beetles. Mol Biol Evol 2024; 41:msae110. [PMID: 38857185 PMCID: PMC11214159 DOI: 10.1093/molbev/msae110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 04/15/2024] [Accepted: 06/04/2024] [Indexed: 06/12/2024] Open
Abstract
Body shape and size diversity and their evolutionary rates correlate with species richness at the macroevolutionary scale. However, the molecular genetic mechanisms underlying the morphological diversification across related species are poorly understood. In beetles, which account for one-fourth of the known species, adaptation to different trophic niches through morphological diversification appears to have contributed to species radiation. Here, we explored the key genes for the morphological divergence of the slender to stout body shape related to divergent feeding methods on large to small snails within the genus Carabus. We show that the zinc-finger transcription factor encoded by odd-paired (opa) controls morphological variation in the snail-feeding ground beetle Carabus blaptoides. Specifically, opa was identified as the gene underlying the slender to stout morphological difference between subspecies through genetic mapping and functional analysis via gene knockdown. Further analyses revealed that changes in opa cis-regulatory sequences likely contributed to the differences in body shape and size between C. blaptoides subspecies. Among opa cis-regulatory sequences, single nucleotide polymorphisms on the transcription factor binding sites may be associated with the morphological differences between C. blaptoides subspecies. opa was highly conserved in a wide range of taxa, especially in beetles. Therefore, opa may play an important role in adaptive morphological divergence in beetles.
Affiliation(s)
- Junji Konuma
  - Department of Biology, Faculty of Science, Toho University, Funabashi, Chiba, Japan
- Tomochika Fujisawa
  - Center for Data Science Education and Research, Shiga University, Hikone, Shiga, Japan
- Tomoaki Nishiyama
  - Research Center for Experimental Modeling of Human Disease, Kanazawa University, Ishikawa, Japan
- Masahiro Kasahara
  - Graduate School of Frontier Science, The University of Tokyo, Kashiwa, Chiba, Japan
- Masafumi Nozawa
  - Department of Biological Sciences, Tokyo Metropolitan University, Hachioji, Tokyo, Japan
  - Research Center for Genomics and Bioinformatics, Tokyo Metropolitan University, Hachioji, Tokyo, Japan
- Atsushi Toyoda
  - Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Shizuoka, Japan
- Mitsuyasu Hasebe
  - National Institute for Basic Biology, Okazaki, Aichi, Japan
  - Department of Basic Biology, The Graduate School for Advanced Studies (SOKENDAI), Okazaki, Aichi, Japan
- Teiji Sota
  - Department of Zoology, Graduate School of Science, Kyoto University, Sakyo, Kyoto, Japan
2
Fitzgerald DM, Stringer AM, Smith C, Lapierre P, Wade JT. Genome-Wide Mapping of the Escherichia coli PhoB Regulon Reveals Many Transcriptionally Inert, Intragenic Binding Sites. mBio 2023; 14:e0253522. [PMID: 37067422 PMCID: PMC10294691 DOI: 10.1128/mbio.02535-22] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/05/2022] [Accepted: 03/23/2023] [Indexed: 04/18/2023] Open
Abstract
Genome-scale analyses have revealed many transcription factor binding sites within, rather than upstream of, genes, raising questions as to the function of these binding sites. Here, we use complementary approaches to map the regulon of the Escherichia coli transcription factor PhoB, a response regulator that controls transcription of genes involved in phosphate homeostasis. Strikingly, the majority of PhoB binding sites are located within genes, but these intragenic sites are not associated with detectable transcription regulation and are not evolutionarily conserved. Many intragenic PhoB sites are located in regions bound by H-NS, likely due to shared sequence preferences of PhoB and H-NS. However, these PhoB binding sites are not associated with transcription regulation even in the absence of H-NS. We propose that for many transcription factors, including PhoB, binding sites not associated with promoter sequences are transcriptionally inert and hence are tolerated as genomic "noise." IMPORTANCE Recent studies have revealed large numbers of transcription factor binding sites within the genes of bacteria. The function, if any, of the vast majority of these binding sites has not been investigated. Here, we map the binding of the transcription factor PhoB across the Escherichia coli genome, revealing that the majority of PhoB binding sites are within genes. We show that PhoB binding sites within genes are not associated with regulation of the overlapping genes. Indeed, our data suggest that bacteria tolerate the presence of large numbers of nonregulatory, intragenic binding sites for transcription factors and that these binding sites are not under selective pressure.
Affiliation(s)
- Devon M. Fitzgerald
  - Wadsworth Center, New York State Department of Health, Albany, New York, USA
  - Department of Biomedical Sciences, School of Public Health, University at Albany, Albany, New York, USA
- Anne M. Stringer
  - Wadsworth Center, New York State Department of Health, Albany, New York, USA
- Carol Smith
  - Wadsworth Center, New York State Department of Health, Albany, New York, USA
- Pascal Lapierre
  - Wadsworth Center, New York State Department of Health, Albany, New York, USA
- Joseph T. Wade
  - Wadsworth Center, New York State Department of Health, Albany, New York, USA
  - Department of Biomedical Sciences, School of Public Health, University at Albany, Albany, New York, USA
3
Fitzgerald D, Stringer A, Smith C, Lapierre P, Wade JT. Genome-wide mapping of the Escherichia coli PhoB regulon reveals many transcriptionally inert, intragenic binding sites. bioRxiv 2023:2023.02.07.527549. [PMID: 36798257 PMCID: PMC9934606 DOI: 10.1101/2023.02.07.527549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2023]
Abstract
Genome-scale analyses have revealed many transcription factor binding sites within, rather than upstream of, genes, raising questions as to the function of these binding sites. Here, we use complementary approaches to map the regulon of the Escherichia coli transcription factor PhoB, a response regulator that controls transcription of genes involved in phosphate homeostasis. Strikingly, the majority of PhoB binding sites are located within genes, but these intragenic sites are not associated with detectable transcription regulation and are not evolutionarily conserved. Many intragenic PhoB sites are located in regions bound by H-NS, likely due to shared sequence preferences of PhoB and H-NS. However, these PhoB binding sites are not associated with transcription regulation even in the absence of H-NS. We propose that for many transcription factors, including PhoB, binding sites not associated with promoter sequences are transcriptionally inert, and hence are tolerated as genomic "noise". IMPORTANCE Recent studies have revealed large numbers of transcription factor binding sites within the genes of bacteria. The function, if any, of the vast majority of these binding sites has not been investigated. Here, we map the binding of the transcription factor PhoB across the Escherichia coli genome, revealing that the majority of PhoB binding sites are within genes. We show that PhoB binding sites within genes are not associated with regulation of the overlapping genes. Indeed, our data suggest that bacteria tolerate the presence of large numbers of non-regulatory, intragenic binding sites for transcription factors, and that these binding sites are not under selective pressure.
Affiliation(s)
- Devon Fitzgerald
  - Wadsworth Center, New York State Department of Health, Albany, New York, USA
  - Department of Biomedical Sciences, School of Public Health, University at Albany, Albany, New York, USA
- Anne Stringer
  - Wadsworth Center, New York State Department of Health, Albany, New York, USA
- Carol Smith
  - Wadsworth Center, New York State Department of Health, Albany, New York, USA
- Pascal Lapierre
  - Wadsworth Center, New York State Department of Health, Albany, New York, USA
- Joseph T. Wade
  - Wadsworth Center, New York State Department of Health, Albany, New York, USA
  - Department of Biomedical Sciences, School of Public Health, University at Albany, Albany, New York, USA
4
Du X, McManus DP, French JD, Collinson N, Sivakumaran H, MacGregor SR, Fogarty CE, Jones MK, You H. CRISPR interference for sequence-specific regulation of fibroblast growth factor receptor A in Schistosoma mansoni. Front Immunol 2023; 13:1105719. [PMID: 36713455 PMCID: PMC9880433 DOI: 10.3389/fimmu.2022.1105719] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/23/2022] [Accepted: 12/28/2022] [Indexed: 01/15/2023] Open
Abstract
Employing the flatworm parasite Schistosoma mansoni as a model, we report the first application of CRISPR interference (CRISPRi) in parasitic helminths for loss-of-function studies targeting the SmfgfrA gene which encodes the stem cell marker, fibroblast growth factor receptor A (FGFRA). SmFGFRA is essential for maintaining schistosome stem cells and critical in the schistosome-host interplay. The SmfgfrA gene was targeted in S. mansoni adult worms, eggs and schistosomula using a catalytically dead Cas9 (dCas9) fused to a transcriptional repressor KRAB. We showed that SmfgfrA repression resulted in considerable phenotypic differences in the modulated parasites compared with controls, including reduced levels of SmfgfrA transcription and decreased protein expression of SmFGFRA, a decline in EdU (thymidine analog 5-ethynyl-2'-deoxyuridine, which specifically stains schistosome stem cells) signal, and an increase in cell apoptosis. Notably, reduced SmfgfrA transcription was evident in miracidia hatched from SmfgfrA-repressed eggs, and resulted in a significant change in miracidial behavior, indicative of a durable repression effect caused by CRISPRi. Intravenous injection of mice with SmfgfrA-repressed eggs resulted in granulomas that were markedly reduced in size and a decline in the level of serum IgE, emphasizing the importance of SmFGFRA in regulating the host immune response induced during schistosome infection. Our findings show the feasibility of applying CRISPRi for effective, targeted transcriptional repression in schistosomes, and provide the basis for employing CRISPRi to selectively perturb gene expression in parasitic helminths on a genome-wide scale.
Affiliation(s)
- Xiaofeng Du
  - Infection and Inflammation Program, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
  - Faculty of Medicine, The University of Queensland, Brisbane, QLD, Australia
- Donald P. McManus
  - Infection and Inflammation Program, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
  - Faculty of Medicine, The University of Queensland, Brisbane, QLD, Australia
- Juliet D. French
  - Genetics & Computational Biology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Natasha Collinson
  - Infection and Inflammation Program, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Haran Sivakumaran
  - Genetics & Computational Biology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Skye R. MacGregor
  - Infection and Inflammation Program, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Conor E. Fogarty
  - Genecology Research Centre, University of the Sunshine Coast, Sunshine Coast, QLD, Australia
- Malcolm K. Jones
  - School of Veterinary Science, The University of Queensland, Gatton, QLD, Australia
- Hong You
  - Infection and Inflammation Program, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
  - School of Veterinary Science, The University of Queensland, Gatton, QLD, Australia
5
You H, Mayer JU, Johnston RL, Sivakumaran H, Ranasinghe S, Rivera V, Kondrashova O, Koufariotis LT, Du X, Driguez P, French JD, Waddell N, Duke MG, Ittiprasert W, Mann VH, Brindley PJ, Jones MK, McManus DP. CRISPR/Cas9-mediated genome editing of Schistosoma mansoni acetylcholinesterase. FASEB J 2021; 35:e21205. [PMID: 33337558 DOI: 10.1096/fj.202001745rr] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 07/19/2020] [Revised: 10/16/2020] [Accepted: 11/03/2020] [Indexed: 12/21/2022]
Abstract
CRISPR/Cas9-mediated genome editing shows cogent potential for the genetic modification of helminth parasites. We report successful gene knock-in (KI) into the genome of the egg of Schistosoma mansoni by combining CRISPR/Cas9 with single-stranded oligodeoxynucleotides (ssODNs). We edited the acetylcholinesterase (AChE) gene of S. mansoni using two guide RNAs (gRNAs), X5 and X7, targeting exon 5 and exon 7 of Smp_154600, respectively. Eggs recovered from livers of experimentally infected mice were transfected by electroporation with a CRISPR/Cas9 vector encoding gRNA X5 or X7, with or without an ssODN donor. Next-generation sequencing analysis of reads of amplicon libraries spanning targeted regions revealed that the major modifications induced by CRISPR/Cas9 in the eggs were generated by homology-directed repair (HDR). Furthermore, soluble egg antigen from AChE-edited eggs exhibited markedly reduced AChE activity, indicating that programmed Cas9 cleavage mutated the AChE gene. Following injection of AChE-edited schistosome eggs into the tail veins of mice, a significantly enhanced Th2 response involving IL-4, -5, -10, and -13 was detected in lung cells and splenocytes in mice injected with X5-KI eggs in comparison to control mice injected with unmutated eggs. A Th2-predominant response, with increased levels of IL-4, -13, and GATA3, was also induced by X5-KI eggs in small intestine-draining mesenteric lymph node cells when the gene-edited eggs were introduced into the subserosa of the ileum of the mice. These findings confirmed the potential and the utility of CRISPR/Cas9-mediated genome editing for functional genomics in schistosomes.
Affiliation(s)
- Hong You
  - Immunology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Rebecca L Johnston
  - Genetics & Computational Biology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Haran Sivakumaran
  - Genetics & Computational Biology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Shiwanthi Ranasinghe
  - Immunology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Vanessa Rivera
  - Immunology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
  - School of Medicine, Deakin University, Geelong, VIC, Australia
- Olga Kondrashova
  - Genetics & Computational Biology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Lambros T Koufariotis
  - Genetics & Computational Biology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Xiaofeng Du
  - Immunology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Patrick Driguez
  - King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Juliet D French
  - Genetics & Computational Biology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Nicola Waddell
  - Genetics & Computational Biology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Mary G Duke
  - Immunology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
- Wannaporn Ittiprasert
  - Department of Microbiology, Immunology & Tropical Medicine, & Research Center for Neglected Diseases of Poverty, School of Medicine & Health Sciences, George Washington University, Washington, DC, USA
- Victoria H Mann
  - Department of Microbiology, Immunology & Tropical Medicine, & Research Center for Neglected Diseases of Poverty, School of Medicine & Health Sciences, George Washington University, Washington, DC, USA
- Paul J Brindley
  - Department of Microbiology, Immunology & Tropical Medicine, & Research Center for Neglected Diseases of Poverty, School of Medicine & Health Sciences, George Washington University, Washington, DC, USA
- Malcolm K Jones
  - Immunology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
  - School of Veterinary Science, The University of Queensland, Gatton, QLD, Australia
- Donald P McManus
  - Immunology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
6
Molecular and evolutionary processes generating variation in gene expression. Nat Rev Genet 2020; 22:203-215. [PMID: 33268840 DOI: 10.1038/s41576-020-00304-w] [Citation(s) in RCA: 143] [Impact Index Per Article: 28.6] [Accepted: 10/21/2020] [Indexed: 12/18/2022]
Abstract
Heritable variation in gene expression is common within and between species. This variation arises from mutations that alter the form or function of molecular gene regulatory networks that are then filtered by natural selection. High-throughput methods for introducing mutations and characterizing their cis- and trans-regulatory effects on gene expression (particularly, transcription) are revealing how different molecular mechanisms generate regulatory variation, and studies comparing these mutational effects with variation seen in the wild are teasing apart the role of neutral and non-neutral evolutionary processes. This integration of molecular and evolutionary biology allows us to understand how the variation in gene expression we see today came to be and to predict how it is most likely to evolve in the future.
7
Oti M, Pane A, Sammeth M. Comparative Genomics in Drosophila. Methods in Molecular Biology (Clifton, N.J.) 2018; 1704:433-450. [PMID: 29277877 DOI: 10.1007/978-1-4939-7463-4_17] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/18/2022]
Abstract
Since the pioneering studies of Thomas Hunt Morgan and coworkers at the dawn of the twentieth century, Drosophila melanogaster and its sister species have contributed tremendously to unveiling the rules underlying animal genetics, development, behavior, evolution, and human disease. Recent advances in DNA sequencing technologies launched Drosophila into the post-genomic era and paved the way for unprecedented comparative genomics investigations. The complete sequencing and systematic comparison of the genomes of 12 Drosophila species represents a milestone achievement in modern biology, which allowed a plethora of different studies ranging from the annotation of known and novel genomic features to the evolution of chromosomes and, ultimately, of entire genomes. Despite the efforts of countless laboratories worldwide, the vast amount of data produced over the past 15 years is far from being fully explored.
In this chapter, we will review some of the bioinformatic approaches that were developed to interrogate the genomes of the 12 Drosophila species. Starting from alignments of the entire genomic sequences, the degree of conservation can be evaluated separately for every region of the genome, already providing first hints about elements that are under purifying selection and therefore likely functional. Furthermore, the careful analysis of repeated sequences sheds light on the evolutionary dynamics of transposons, an enigmatic and fascinating class of mobile elements housed in the genomes of animals and plants. Comparative genomics also aids in the computational identification of the transcriptionally active part of the genome, first and foremost of protein-coding loci, but also of transcribed yet apparently noncoding regions, which were once considered "junk" DNA. Finally, the synergy between functional and comparative genomics also facilitates in silico and in vivo studies on cis-acting regulatory elements, like transcription factor binding sites, which, owing to their high degree of sequence variability, usually pose greater challenges for bioinformatics approaches.
Affiliation(s)
- Martin Oti
  - Institute of Biophysics Carlos Chagas Filho (IBCCF), Federal University of Rio de Janeiro (UFRJ), Avenida Carlos Chagas Filho 373, 21941-902, Rio de Janeiro, RJ, Brazil
- Attilio Pane
  - Institute of Biomedical Sciences (ICB), Federal University of Rio de Janeiro (UFRJ), 21941-902, Rio de Janeiro, RJ, Brazil
- Michael Sammeth
  - Institute of Biophysics Carlos Chagas Filho (IBCCF), Federal University of Rio de Janeiro (UFRJ), Avenida Carlos Chagas Filho 373, 21941-902, Rio de Janeiro, RJ, Brazil
8
Buffry AD, Mendes CC, McGregor AP. The Functionality and Evolution of Eukaryotic Transcriptional Enhancers. Advances in Genetics 2016; 96:143-206. [PMID: 27968730 DOI: 10.1016/bs.adgen.2016.08.004] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 12/11/2022]
Abstract
Enhancers regulate precise spatial and temporal patterns of gene expression in eukaryotes and, moreover, evolutionary changes in these modular cis-regulatory elements may represent the predominant genetic basis for phenotypic evolution. Here, we review approaches to identify and functionally analyze enhancers and their transcription factor binding sites, including assay for transposase-accessible chromatin sequencing (ATAC-Seq) and clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9, respectively. We also explore enhancer functionality, including how transcription factor binding sites combine to regulate transcription, as well as research on shadow and super enhancers, and how enhancers can act over great distances and even in trans. Finally, we discuss recent theoretical and empirical data on how transcription factor binding sites and enhancers evolve. This includes how the function of enhancers is maintained despite the turnover of transcription factor binding sites, as well as a review of studies where mutations in enhancers have been shown to underlie morphological change.
Affiliation(s)
- A D Buffry
  - Oxford Brookes University, Oxford, United Kingdom
- C C Mendes
  - Oxford Brookes University, Oxford, United Kingdom
- A P McGregor
  - Oxford Brookes University, Oxford, United Kingdom
9
de los Reyes BG, Mohanty B, Yun SJ, Park MR, Lee DY. Upstream regulatory architecture of rice genes: summarizing the baseline towards genus-wide comparative analysis of regulatory networks and allele mining. Rice (New York, N.Y.) 2015; 8:14. [PMID: 25844119 PMCID: PMC4385054 DOI: 10.1186/s12284-015-0041-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/02/2014] [Accepted: 01/12/2015] [Indexed: 05/23/2023]
Abstract
Dissecting the upstream regulatory architecture of rice genes and their cognate regulator proteins is at the core of network biology and its applications to comparative functional genomics. With the rapidly advancing comparative genomics resources in the genus Oryza, a reference genome annotation that defines the various cis-elements and trans-acting factors that interface each gene locus with various intrinsic and extrinsic signals for growth, development, reproduction and adaptation must be established to facilitate the understanding of phenotypic variation in the context of regulatory networks. Such information is also important to establish the foundation for mining non-coding sequence variation that defines novel alleles and epialleles across the enormous phenotypic diversity represented in rice germplasm. This review presents a synthesis of the state of knowledge and consensus trends regarding the various cis-acting and trans-acting components that define spatio-temporal regulation of rice genes based on representative examples from both foundational studies in other model and non-model plants, and more recent studies in rice. The goal is to summarize the baseline for systematic upstream sequence annotation of the rapidly advancing genome sequence resources in Oryza in preparation for genus-wide functional genomics. Perspectives on the potential applications of such information for gene discovery, network engineering and genomics-enabled rice breeding are also discussed.
Affiliation(s)
- Bijayalaxmi Mohanty
  - Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore, 117576 Singapore
- Song Joong Yun
  - Department of Crop Science and Institute of Agricultural Science and Technology, Chonbuk National University, Chonju, 561-756 Korea
- Myoung-Ryoul Park
  - School of Biology and Ecology, University of Maine, Orono, ME 04469 USA
- Dong-Yup Lee
  - Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore, 117576 Singapore
10
Wan D, Ludolf F, Alanine DGW, Stretton O, Ali Ali E, Al-Barwary N, Wang X, Doenhoff MJ, Mari A, Fitzsimmons CM, Dunne DW, Nakamura R, Oliveira GC, Alcocer MJC, Falcone FH. Use of humanised rat basophilic leukaemia cell line RS-ATL8 for the assessment of allergenicity of Schistosoma mansoni proteins. PLoS Negl Trop Dis 2014; 8:e3124. [PMID: 25254513 PMCID: PMC4177753 DOI: 10.1371/journal.pntd.0003124] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 05/07/2014] [Accepted: 07/17/2014] [Indexed: 12/31/2022] Open
Abstract
Background: Parasite-specific IgE is thought to correlate with protection against Schistosoma mansoni infection or re-infection. Only a few molecular targets of the IgE response in S. mansoni infection have been characterised. A better insight into the basic mechanisms of anti-parasite immunity could be gained from a genome-wide characterisation of such S. mansoni allergens. This would have repercussions on our understanding of allergy and the development of safe and efficacious vaccinations against helminthic parasites.
Methodology/Principal Findings: A complete medium- to high-throughput amenable workflow, including important quality controls, is described, which enables the rapid translation of S. mansoni proteins using wheat germ lysate and subsequent assessment of potential allergenicity with a humanised Rat Basophilic Leukemia (RBL) reporter cell line. Cell-free translation is completed within 90 minutes, generating sufficient amounts of parasitic protein for rapid screening of allergenicity without any need for purification. Antigenic integrity is demonstrated using Western Blotting. After overnight incubation with infected individuals' serum, the RS-ATL8 reporter cell line is challenged with the complete wheat germ translation mixture and Luciferase activity measured, reporting cellular activation by the suspected allergen. The suitability of this system for characterization of novel S. mansoni allergens is demonstrated using well characterised plant and parasitic allergens such as Par j 2, SmTAL-1 and the IgE binding factor IPSE/alpha-1, expressed in wheat germ lysates and/or E. coli. SmTAL-1, but not SmTAL2 (used as a negative control), was able to activate the basophil reporter cell line.
Conclusion/Significance: This method offers an accessible way for assessment of potential allergenicity of anti-helminthic vaccine candidates and is suitable for medium- to high-throughput studies using infected individual sera. It is also suitable for the study of the basis of allergenicity of helminthic proteins.
Infection with parasitic helminths is characterised by a marked elevation of total and parasite-specific Immunoglobulin E (IgE). It is widely believed that this IgE response has evolved to protect hosts against large metazoan parasites. Such a protective function has been well characterised in particular against members of the genus Schistosoma. However, with a few notable exceptions, the molecular targets of the IgE response and the downstream immunological mechanisms leading to host protection are not well understood. The molecular targets of a specific IgE response are by definition called allergens. While almost 3,000 different allergens, contained in e.g. plant pollen or seeds, moulds or animal materials, have been characterised at the molecular level, and are listed and described in databases such as the Allergome database (www.allergome.org), only a few dozen allergens have been characterised in parasitic helminths. A more detailed understanding of the molecular targets of the anti-helminth IgE response can not only be expected to further our basic understanding of protective immune responses and allergy in general; such knowledge can also be expected to have important repercussions on the production of safe and effective anti-helminthic vaccines. This research describes a novel approach suitable for genome-wide functional identification of allergens in S. mansoni and other parasites, paving the way for the identification of the Schistosoma allergome.
Affiliation(s)
- Daniel Wan
  - Division of Molecular and Cellular Science, School of Pharmacy, University of Nottingham, Nottingham, United Kingdom
  - School of Biosciences, University of Nottingham, Sutton Bonington Campus, Loughborough, United Kingdom
- Fernanda Ludolf
  - Genomics and Computational Biology Group, Centro de Pesquisas René Rachou, National Institute of Science and Technology in Tropical Diseases, Fundação Oswaldo Cruz - FIOCRUZ, Belo Horizonte, Minas Gerais, Brazil
- Daniel G. W. Alanine
  - Division of Molecular and Cellular Science, School of Pharmacy, University of Nottingham, Nottingham, United Kingdom
  - School of Biosciences, University of Nottingham, Sutton Bonington Campus, Loughborough, United Kingdom
- Owen Stretton
  - Division of Molecular and Cellular Science, School of Pharmacy, University of Nottingham, Nottingham, United Kingdom
- Eman Ali Ali
  - Division of Molecular and Cellular Science, School of Pharmacy, University of Nottingham, Nottingham, United Kingdom
- Nafal Al-Barwary
  - Division of Molecular and Cellular Science, School of Pharmacy, University of Nottingham, Nottingham, United Kingdom
- Xiaowei Wang
  - School of Biosciences, University of Nottingham, Sutton Bonington Campus, Loughborough, United Kingdom
- Michael J. Doenhoff
  - School of Life Sciences, University of Nottingham, Nottingham, United Kingdom
- Adriano Mari
  - Center for Molecular Allergology, IDI-IRCCS, Rome, Italy
  - Associated Centres for Molecular Allergology, Rome, Italy
- David W. Dunne
  - Department of Pathology, University of Cambridge, Cambridge, United Kingdom
- Ryosuke Nakamura
  - Division of Medicinal Safety Science, National Institute of Health Sciences, Setagaya-ku, Tokyo, Japan
- Guilherme C. Oliveira
  - Genomics and Computational Biology Group, Centro de Pesquisas René Rachou, National Institute of Science and Technology in Tropical Diseases, Fundação Oswaldo Cruz - FIOCRUZ, Belo Horizonte, Minas Gerais, Brazil
- Marcos J. C. Alcocer
  - School of Biosciences, University of Nottingham, Sutton Bonington Campus, Loughborough, United Kingdom
- Franco H. Falcone
  - Division of Molecular and Cellular Science, School of Pharmacy, University of Nottingham, Nottingham, United Kingdom
11
|
Thompson JA, Congdon CB. An Exploration Into Improving DNA Motif Inference by Looking for Highly Conserved Core Regions. IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY PROCEEDINGS. IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY 2013; 2013:60-67. [PMID: 31008453 PMCID: PMC6474685 DOI: 10.1109/cibcb.2013.6595389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Although most verified functional elements in noncoding DNA contain a highly conserved core region, this concept is not generally incorporated into de novo motif inference systems. In this work, we explore the utility of adding the notion of conserved core regions into a comparative genomics approach for the search for putative functional elements in noncoding DNA. By modifying the scoring function for GAMI, Genetic Algorithms for Motif Inference, we investigate tradeoffs between the strength of conservation of the full motif vs. the strength of conservation of a core region. This work illustrates that incorporating information about the structure of transcription factor binding sites can be helpful in identifying biologically functional elements.
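The trade-off this abstract describes, between conservation of the full motif and conservation of a core region, can be illustrated with a small sketch. This is not GAMI's scoring function, which the abstract does not specify. The function names, the `core_len` and `core_weight` parameters, and the use of a best ungapped match fraction as the conservation measure are all assumptions made for illustration.

```python
def best_match_fraction(motif, seq):
    """Highest fraction of matching positions over all ungapped windows of seq."""
    width = len(motif)
    best = 0.0
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        matches = sum(a == b for a, b in zip(motif, window))
        best = max(best, matches / width)
    return best


def conservation(motif, seqs):
    """Mean best-match fraction of the motif across a set of sequences."""
    return sum(best_match_fraction(motif, s) for s in seqs) / len(seqs)


def core_weighted_score(motif, seqs, core_len=4, core_weight=0.5):
    """Blend full-motif conservation with that of the best-conserved core.

    Simplification (assumption): the core is scored independently of where
    the full motif matched, so the two matches need not be co-located.
    """
    full = conservation(motif, seqs)
    core = max(
        conservation(motif[i:i + core_len], seqs)
        for i in range(len(motif) - core_len + 1)
    )
    return (1 - core_weight) * full + core_weight * core


# Toy orthologous sequences (invented for illustration).
seqs = ["TTGACGTCATT", "CCGACGTAACC"]
motif = "GACGTCA"
score = core_weighted_score(motif, seqs)
```

Setting `core_weight` to 0 recovers plain full-motif conservation, and raising it favours candidates with a strongly conserved core even when the flanking positions vary.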
Affiliation(s)
- Jeffrey A Thompson
- Department of Computer Science, University of Southern Maine, Portland, Maine 04104
- Clare Bates Congdon
- Department of Computer Science, University of Southern Maine, Portland, Maine 04104
|
12
|
Overmars L, Kerkhoven R, Siezen RJ, Francke C. MGcV: the microbial genomic context viewer for comparative genome analysis. BMC Genomics 2013; 14:209. [PMID: 23547764 PMCID: PMC3639932 DOI: 10.1186/1471-2164-14-209] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2012] [Accepted: 03/22/2013] [Indexed: 01/22/2023] Open
Abstract
Background Conserved gene context is used in many types of comparative genome analyses. It is used to provide leads on gene function, to guide the discovery of regulatory sequences, but also to aid in the reconstruction of metabolic networks. We present the Microbial Genomic context Viewer (MGcV), an interactive, web-based application tailored to strengthen the practice of manual comparative genome context analysis for bacteria. Results MGcV is a versatile, easy-to-use tool that renders a visualization of the genomic context of any set of selected genes, genes within a phylogenetic tree, genomic segments, or regulatory elements. It is tailored to facilitate laborious tasks such as the interactive annotation of gene function, the discovery of regulatory elements, or the sequence-based reconstruction of gene regulatory networks. We illustrate that MGcV can be used in gene function annotation by visually integrating information on prokaryotic genes, like their annotation as available from NCBI with other annotation data such as Pfam domains, sub-cellular location predictions and gene-sequence characteristics such as GC content. We also illustrate the usefulness of the interactive features that allow the graphical selection of genes to facilitate data gathering (e.g. upstream regions, ID’s or annotation), in the analysis and reconstruction of transcription regulation. Moreover, putative regulatory elements and their corresponding scores or data from RNA-seq and microarray experiments can be uploaded, visualized and interpreted in (ranked-) comparative context maps. The ranked maps allow the interpretation of predicted regulatory elements and experimental data in light of each other. Conclusion MGcV advances the manual comparative analysis of genes and regulatory elements by providing fast and flexible integration of gene related data combined with straightforward data retrieval. MGcV is available at http://mgcv.cmbi.ru.nl.
Affiliation(s)
- Lex Overmars
- Centre for Molecular and Biomolecular Informatics, Radboud University Nijmegen Medical Centre, Geert Grooteplein Zuid 26-28, Nijmegen, 6525GA, The Netherlands.
|
13
|
Garfield D, Haygood R, Nielsen WJ, Wray GA. Population genetics of cis-regulatory sequences that operate during embryonic development in the sea urchin Strongylocentrotus purpuratus. Evol Dev 2013; 14:152-67. [PMID: 23017024 DOI: 10.1111/j.1525-142x.2012.00532.x] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]
Abstract
Despite the fact that noncoding sequences comprise a substantial fraction of functional sites within all genomes, the evolutionary mechanisms that operate on genetic variation within regulatory elements remain poorly understood. In this study, we examine the population genetics of the core, upstream cis-regulatory regions of eight genes (AN, CyIIa, CyIIIa, Endo16, FoxB, HE, SM30 a, and SM50) that function during the early development of the purple sea urchin, Strongylocentrotus purpuratus. Quantitative and qualitative measures of segregating variation are not conspicuously different between cis-regulatory and closely linked "proxy neutral" noncoding regions containing no known functional sites. Length and compound mutations are common in noncoding sequences; conventional descriptive statistics ignore such mutations, under-representing true genetic variation by approximately 28% for these loci in this population. Patterns of variation in the cis-regulatory regions of six of the genes examined (CyIIa, CyIIIa, Endo16, FoxB, AN, and HE) are consistent with directional selection. Genetic variation within annotated transcription factor binding sites is comparable to, and frequently greater than, that of surrounding sequences. Comparisons of two paralog pairs (CyIIa/CyIIIa and AN/HE) suggest that distinct evolutionary processes have operated on their cis-regulatory regions following gene duplication. Together, these analyses provide a detailed view of the evolutionary mechanisms operating on noncoding sequences within a natural population, and underscore how little is known about how these processes operate on cis-regulatory sequences.
Affiliation(s)
- David Garfield
- Department of Biology and Institute for Genome Sciences & Policy, Duke University, Box 90338, Durham, NC 27708, USA
|
14
|
|
15
|
Bompfünewerer AF, Flamm C, Fried C, Fritzsch G, Hofacker IL, Lehmann J, Missal K, Mosig A, Müller B, Prohaska SJ, Stadler BMR, Stadler PF, Tanzer A, Washietl S, Witwer C. Evolutionary patterns of non-coding RNAs. Theory Biosci 2012; 123:301-69. [PMID: 18202870 DOI: 10.1016/j.thbio.2005.01.002] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2004] [Accepted: 01/24/2005] [Indexed: 01/04/2023]
Abstract
A plethora of new functions of non-coding RNAs (ncRNAs) have been discovered in the past few years. In fact, RNA is emerging as the central player in cellular regulation, taking on active roles in multiple regulatory layers from transcription, RNA maturation, and RNA modification to translational regulation. Nevertheless, very little is known about the evolution of this "Modern RNA World" and its components. In this contribution, we attempt to provide at least a cursory overview of the diversity of ncRNAs and functional RNA motifs in non-translated regions of regular messenger RNAs (mRNAs) with an emphasis on evolutionary questions. This survey is complemented by an in-depth analysis of examples from different classes of RNAs focusing mostly on their evolution in the vertebrate lineage. We present a survey of Y RNA genes in vertebrates and study the molecular evolution of the U7 snRNA, the snoRNAs E1/U17, E2, and E3, the Y RNA family, the let-7 microRNA (miRNA) family, and the mRNA-like evf-1 gene. We furthermore discuss the statistical distribution of miRNAs in metazoans, which suggests an explosive increase in the miRNA repertoire in vertebrates. The analysis of the transcription of ncRNAs suggests that small RNAs in general are genetically mobile in the sense that their association with a host gene (e.g. when transcribed from introns of an mRNA) can change on evolutionary time scales. The let-7 family demonstrates that even the mode of transcription (as intron or as exon) can change among paralogous ncRNAs.
|
16
|
Xu F, Park MR, Kitazumi A, Herath V, Mohanty B, Yun SJ, de los Reyes BG. Cis-regulatory signatures of orthologous stress-associated bZIP transcription factors from rice, sorghum and Arabidopsis based on phylogenetic footprints. BMC Genomics 2012; 13:497. [PMID: 22992304 PMCID: PMC3522565 DOI: 10.1186/1471-2164-13-497] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2012] [Accepted: 09/14/2012] [Indexed: 01/10/2023] Open
Abstract
Background The potential contribution of upstream sequence variation to the unique features of orthologous genes is just beginning to be unraveled. A core subset of stress-associated bZIP transcription factors from rice (Oryza sativa) formed ten clusters of orthologous groups (COG) with genes from the monocot sorghum (Sorghum bicolor) and dicot Arabidopsis (Arabidopsis thaliana). The total cis-regulatory information content of each stress-associated COG was examined by phylogenetic footprinting to reveal ortholog-specific, lineage-specific and species-specific conservation patterns. Results The most apparent pattern observed was the occurrence of spatially conserved ‘core modules’ among the COGs but not among paralogs. These core modules are comprised of various combinations of two to four putative transcription factor binding site (TFBS) classes associated with either developmental or stress-related functions. Outside the core modules are specific stress (ABA, oxidative, abiotic, biotic) or organ-associated signals, which may be functioning as ‘regulatory fine-tuners’ and further define lineage-specific and species-specific cis-regulatory signatures. Orthologous monocot and dicot promoters have distinct TFBS classes involved in disease and oxidative-regulated expression, while the orthologous rice and sorghum promoters have distinct combinations of root-specific signals, a pattern that is not particularly conserved in Arabidopsis. Conclusions Patterns of cis-regulatory conservation imply that each ortholog has distinct signatures, further suggesting that they are potentially unique in a regulatory context despite the presumed conservation of broad biological function during speciation. 
Based on the observed patterns of conservation, we postulate that core modules are likely primary determinants of basal developmental programming, which may be integrated with and further elaborated by additional intrinsic or extrinsic signals in conjunction with lineage-specific or species-specific regulatory fine-tuners. This synergy may be critical for finer-scale spatio-temporal regulation, hence unique expression profiles of homologous transcription factors from different species with distinct zones of ecological adaptation such as rice, sorghum and Arabidopsis. The patterns revealed from these comparisons set the stage for further empirical validation by functional genomics.
Affiliation(s)
- Fuyu Xu
- School of Biology and Ecology, University of Maine, 5735 Hitchner Hall, Orono, ME 04469, USA
|
17
|
Hooghe B, Broos S, van Roy F, De Bleser P. A flexible integrative approach based on random forest improves prediction of transcription factor binding sites. Nucleic Acids Res 2012; 40:e106. [PMID: 22492513 PMCID: PMC3413102 DOI: 10.1093/nar/gks283] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023] Open
Abstract
Transcription factor binding sites (TFBSs) are DNA sequences of 6–15 base pairs. Interaction of these TFBSs with transcription factors (TFs) is largely responsible for most spatiotemporal gene expression patterns. Here, we evaluate to what extent sequence-based prediction of TFBSs can be improved by taking into account the positional dependencies of nucleotides (NPDs) and the nucleotide sequence-dependent structure of DNA. We make use of the random forest algorithm to flexibly exploit both types of information. Results in this study show that both the structural method and the NPD method can be valuable for the prediction of TFBSs. Moreover, their predictive values seem to be complementary, even to the widely used position weight matrix (PWM) method. This led us to combine all three methods. Results obtained for five eukaryotic TFs with different DNA-binding domains show that our method improves classification accuracy for all five eukaryotic TFs compared with other approaches. Additionally, we contrast the results of seven smaller prokaryotic sets with high-quality data and show that with the use of high-quality data we can significantly improve prediction performance. Models developed in this study can be of great use for gaining insight into the mechanisms of TF binding.
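For readers unfamiliar with the position weight matrix (PWM) method that this abstract uses as a baseline, the following is a minimal sketch of PWM construction and scanning. It does not reproduce the authors' random forest model. The toy binding sites, the uniform background of 0.25, the pseudocount, and all function names are assumptions made for illustration.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from aligned, equal-length binding sites."""
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log2 ratio of observed base frequency to the background frequency.
        pwm.append(
            {b: math.log2((counts[b] / total) / background) for b in "ACGT"}
        )
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds for one window of the PWM's width."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    hits = [
        (score_window(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(hits)


# Toy aligned sites (invented for illustration).
sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
pwm = build_pwm(sites)
best_score, best_pos = best_hit(pwm, "GGCGTATAATGCGC")
```

A PWM scores each position independently, which is the limitation the abstract addresses by adding positional dependencies between nucleotides and DNA structural features.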
Affiliation(s)
- Bart Hooghe
- Department of Biomedical Molecular Biology, Ghent University, B-9052 Ghent, Belgium
|
18
|
Bogdanović O, van Heeringen SJ, Veenstra GJC. The epigenome in early vertebrate development. Genesis 2012; 50:192-206. [PMID: 22139962 PMCID: PMC3294079 DOI: 10.1002/dvg.20831] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2011] [Revised: 11/22/2011] [Accepted: 11/23/2011] [Indexed: 01/04/2023]
Abstract
Epigenetic regulation defines the commitment and potential of cells, including the limitations in their competence to respond to inducing signals. This review discusses the developmental origins of chromatin state in Xenopus and other vertebrate species and provides an overview of its use in genome annotation. In most metazoans the embryonic genome is transcriptionally quiescent after fertilization. This involves nucleosome-dense chromatin, repressors and a temporal deficiency in the transcription machinery. Active histone modifications such as H3K4me3 appear in pluripotent blastula embryos, whereas repressive marks such as H3K27me3 show a major increase in enrichment during late blastula and gastrula stages. The H3K27me3 modification set by Polycomb restricts ectopic lineage-specific gene expression. Pluripotent chromatin in Xenopus embryos is relatively unconstrained, whereas the pluripotent cell lineage in mammalian embryos harbors a more enforced type of pluripotent chromatin.
Affiliation(s)
- Ozren Bogdanović
- Radboud University Nijmegen, Dept. Molecular Biology, Faculty of Science, Nijmegen Centre of Molecular Life Sciences, Nijmegen, The Netherlands
- Simon J. van Heeringen
- Radboud University Nijmegen, Dept. Molecular Biology, Faculty of Science, Nijmegen Centre of Molecular Life Sciences, Nijmegen, The Netherlands
- Gert Jan C. Veenstra
- Radboud University Nijmegen, Dept. Molecular Biology, Faculty of Science, Nijmegen Centre of Molecular Life Sciences, Nijmegen, The Netherlands
|
19
|
Roepcke S, Stahlberg S, Klein H, Schulz MH, Theobald L, Gohlke S, Vingron M, Walther DJ. A tandem sequence motif acts as a distance-dependent enhancer in a set of genes involved in translation by binding the proteins NonO and SFPQ. BMC Genomics 2011; 12:624. [PMID: 22185324 PMCID: PMC3262029 DOI: 10.1186/1471-2164-12-624] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2011] [Accepted: 12/20/2011] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Bioinformatic analyses of expression control sequences in promoters of co-expressed or functionally related genes enable the discovery of common regulatory sequence motifs that might be involved in co-ordinated gene expression. By studying promoter sequences of the human ribosomal protein genes we recently identified a novel highly specific Localized Tandem Sequence Motif (LTSM). In this work we sought to identify additional genes and LTSM-binding proteins to elucidate potential regulatory mechanisms. RESULTS Genome-wide analyses allowed finding a considerable number of additional LTSM-positive genes, the products of which are involved in translation, among them, translation initiation and elongation factors, and 5S rRNA. Electromobility shift assays then showed specific signals demonstrating the binding of protein complexes to LTSM in ribosomal protein gene promoters. Pull-down assays with LTSM-containing oligonucleotides and subsequent mass spectrometric analysis identified the related multifunctional nucleotide binding proteins NonO and SFPQ in the binding complex. Functional characterization then revealed that LTSM enhances the transcriptional activity of the promoters in dependency of the distance from the transcription start site. CONCLUSIONS Our data demonstrate the power of bioinformatic analyses for the identification of biologically relevant sequence motifs. LTSM and the here found LTSM-binding proteins NonO and SFPQ were discovered through a synergistic combination of bioinformatic and biochemical methods and are regulators of the expression of a set of genes of the translational apparatus in a distance-dependent manner.
Affiliation(s)
- Stefan Roepcke
- Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany
|
20
|
Abstract
We tested whether functionally important sites in bacterial, yeast, and animal promoters are more conserved than their neighbors. We found that substitutions are predominantly seen in less important sites and that those that occurred tended to have less impact on gene expression than possible alternatives. These results suggest that purifying selection operates on promoter sequences.
|
21
|
Degnan PH, Ochman H, Moran NA. Sequence conservation and functional constraint on intergenic spacers in reduced genomes of the obligate symbiont Buchnera. PLoS Genet 2011; 7:e1002252. [PMID: 21912528 PMCID: PMC3164680 DOI: 10.1371/journal.pgen.1002252] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2011] [Accepted: 07/05/2011] [Indexed: 11/18/2022] Open
Abstract
Analyses of genome reduction in obligate bacterial symbionts typically focus on the removal and retention of protein-coding regions, which are subject to ongoing inactivation and deletion. However, these same forces operate on intergenic spacers (IGSs) and affect their contents, maintenance, and rates of evolution. IGSs comprise both non-coding, non-functional regions, including decaying pseudogenes at varying stages of recognizability, as well as functional elements, such as genes for sRNAs and regulatory control elements. The genomes of Buchnera and other small genome symbionts display biased nucleotide compositions and high rates of sequence evolution and contain few recognizable regulatory elements. However, IGS lengths are highly correlated across divergent Buchnera genomes, suggesting the presence of functional elements. To identify functional regions within the IGSs, we sequenced two Buchnera genomes (from aphid species Uroleucon ambrosiae and Acyrthosiphon kondoi) and applied a phylogenetic footprinting approach to alignments of orthologous IGSs from a total of eight Buchnera genomes corresponding to six aphid species. Inclusion of these new genomes allowed comparative analyses at intermediate levels of divergence, enabling the detection of both conserved elements and previously unrecognized pseudogenes. Analyses of these genomes revealed that 232 of 336 IGS alignments over 50 nucleotides in length displayed substantial sequence conservation. Conserved alignment blocks within these IGSs encompassed 88 Shine-Dalgarno sequences, 55 transcriptional terminators, 5 Sigma-32 binding sites, and 12 novel small RNAs. Although pseudogene formation, and thus IGS formation, are ongoing processes in these genomes, a large proportion of intergenic spacers contain functional sequences.
Affiliation(s)
- Patrick H Degnan
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America.
|
22
|
Soccio RE, Tuteja G, Everett LJ, Li Z, Lazar MA, Kaestner KH. Species-specific strategies underlying conserved functions of metabolic transcription factors. Mol Endocrinol 2011; 25:694-706. [PMID: 21292830 DOI: 10.1210/me.2010-0454] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022] Open
Abstract
The winged helix protein FOXA2 and the nuclear receptor peroxisome proliferator-activated receptor-γ (PPARγ) are highly conserved, regionally expressed transcription factors (TFs) that regulate networks of genes controlling complex metabolic functions. Cistrome analysis for Foxa2 in mouse liver and PPARγ in mouse adipocytes has previously produced consensus-binding sites that are nearly identical to those used by the corresponding TFs in human cells. We report here that, despite the conservation of the canonical binding motif, the great majority of binding regions for FOXA2 in human liver and for PPARγ in human adipocytes are not in the orthologous locations corresponding to the mouse genome, and vice versa. Of note, TF binding can be absent in one species despite sequence conservation, including motifs that do support binding in the other species, demonstrating a major limitation of in silico binding site prediction. Whereas only approximately 10% of binding sites are conserved, gene-centric analysis reveals that about 50% of genes with nearby TF occupancy are shared across species for both hepatic FOXA2 and adipocyte PPARγ. Remarkably, for both TFs, many of the shared genes function in tissue-specific metabolic pathways, whereas species-unique genes fail to show enrichment for these pathways. Nonetheless, the species-unique genes, like the shared genes, showed the expected transcriptional regulation by the TFs in loss-of-function experiments. Thus, species-specific strategies underlie the biological functions of metabolic TFs that are highly conserved across mammalian species. Analysis of factor binding in multiple species may be necessary to distinguish apparent species-unique noise and reveal functionally relevant information.
Affiliation(s)
- Raymond E Soccio
- Division of Endocrinology, Diabetes, and Metabolism, Department of Medicine, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104-6149, USA
|
23
|
Transcription factor binding variation in the evolution of gene regulation. Trends Genet 2010; 26:468-75. [PMID: 20864205 DOI: 10.1016/j.tig.2010.08.005] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2010] [Revised: 08/22/2010] [Accepted: 08/22/2010] [Indexed: 01/17/2023]
Abstract
Transcription factor interactions with DNA are one of the primary mechanisms by which expression is modulated, yet their evolution remains poorly understood. Chromatin immunoprecipitation followed by microarray (ChIP-chip) or sequencing (ChIP-Seq) has revolutionized the study of protein-DNA interactions. However, only recently has attention focused on determining to what extent these regulatory interactions vary between species across entire genomes. A series of recent studies have compared in vivo binding data across a range of evolutionary distances. Binding events diverge rapidly, indicating gene regulation is an evolutionarily flexible process.
|
24
|
Tian S, Haney RA, Feder ME. Phylogeny disambiguates the evolution of heat-shock cis-regulatory elements in Drosophila. PLoS One 2010; 5:e10669. [PMID: 20498853 PMCID: PMC2871787 DOI: 10.1371/journal.pone.0010669] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2010] [Accepted: 04/23/2010] [Indexed: 11/19/2022] Open
Abstract
Heat-shock genes have a well-studied control mechanism for their expression that is mediated through cis-regulatory motifs known as heat-shock elements (HSEs). The evolution of important features of this control mechanism has not been investigated in detail, however. Here we exploit the genome sequencing of multiple Drosophila species, combined with a wealth of available information on the structure and function of HSEs in D. melanogaster, to undertake this investigation. We find that in single-copy heat shock genes, entire HSEs have evolved or disappeared 14 times, and the phylogenetic approach bounds the timing and direction of these evolutionary events in relation to speciation. In contrast, in the multi-copy gene Hsp70, the number of HSEs is nearly constant across species. HSEs evolve in size, position, and sequence within heat-shock promoters. In turn, functional significance of certain features is implicated by preservation despite this evolutionary change; these features include tail-to-tail arrangements of HSEs, gapped HSEs, and the presence or absence of entire HSEs. The variation among Drosophila species indicates that the cis-regulatory encoding of responsiveness to heat and other stresses is diverse. The broad dimensions of variation uncovered are particularly important as they suggest a substantial challenge for functional studies.
Affiliation(s)
- Sibo Tian
- Department of Organismal Biology and Anatomy, University of Chicago, Chicago, Illinois, United States of America
- Robert A. Haney
- Department of Organismal Biology and Anatomy, University of Chicago, Chicago, Illinois, United States of America
- Martin E. Feder
- Department of Organismal Biology and Anatomy, University of Chicago, Chicago, Illinois, United States of America
|
25
|
Oleksyk TK, Smith MW, O'Brien SJ. Genome-wide scans for footprints of natural selection. Philos Trans R Soc Lond B Biol Sci 2010; 365:185-205. [PMID: 20008396 PMCID: PMC2842710 DOI: 10.1098/rstb.2009.0219] [Citation(s) in RCA: 246] [Impact Index Per Article: 16.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023] Open
Abstract
Detecting recent selected ‘genomic footprints’ applies directly to the discovery of disease genes and in the imputation of the formative events that molded modern population genetic structure. The imprints of historic selection/adaptation episodes left in human and animal genomes allow one to interpret modern and ancestral gene origins and modifications. Current approaches to reveal selected regions applied in genome-wide selection scans (GWSSs) fall into eight principal categories: (I) phylogenetic footprinting, (II) detecting increased rates of functional mutations, (III) evaluating divergence versus polymorphism, (IV) detecting extended segments of linkage disequilibrium, (V) evaluating local reduction in genetic variation, (VI) detecting changes in the shape of the frequency distribution (spectrum) of genetic variation, (VII) assessing differentiating between populations (FST), and (VIII) detecting excess or decrease in admixture contribution from one population. Here, we review and compare these approaches using available human genome-wide datasets to provide independent verification (or not) of regions found by different methods and using different populations. The lessons learned from GWSSs will be applied to identify genome signatures of historic selective pressures on genes and gene regions in other species with emerging genome sequences. This would offer considerable potential for genome annotation in functional, developmental and evolutionary contexts.
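Of the eight categories listed in this abstract, population differentiation (FST) is the simplest to show numerically. The sketch below computes Wright's FST for one biallelic locus in two populations as (H_T - H_S) / H_T. It assumes equal population sizes and known allele frequencies, and it is not the estimator used in the review, which the abstract does not specify. The function name and inputs are hypothetical.

```python
def fst_two_populations(p1, p2):
    """Wright's FST at a biallelic locus for two equally sized populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    """
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity if both populations were pooled.
    h_total = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within each population.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_total == 0:
        return 0.0  # Locus is monomorphic overall; no differentiation to measure.
    return (h_total - h_within) / h_total


identical = fst_two_populations(0.5, 0.5)   # no differentiation
fixed = fst_two_populations(0.0, 1.0)       # alternate alleles fixed
partial = fst_two_populations(0.2, 0.8)     # intermediate differentiation
```

In a genome-wide scan, loci whose FST is unusually high relative to the genomic background are candidates for population-specific selection.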
Affiliation(s)
- Taras K Oleksyk
- Biology Department, University of Puerto Rico at Mayaguez, Mayaguez 00681, Puerto Rico.
|
26
|
Identification of transcription factor binding sites derived from transposable element sequences using ChIP-seq. Methods Mol Biol 2010; 674:225-40. [PMID: 20827595 DOI: 10.1007/978-1-60761-854-6_14] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/27/2023]
Abstract
Transposable elements (TEs) form a substantial fraction of the non-coding DNA of many eukaryotic genomes. There are numerous examples of TEs being exapted for regulatory function by the host, many of which were identified through their high conservation. However, given that TEs are often the youngest part of a genome and typically exhibit a high turnover, conservation-based methods will fail to identify lineage- or species-specific exaptations. ChIP-seq has become a very popular and effective method for identifying in vivo DNA-protein interactions, such as those seen at transcription factor binding sites (TFBS), and has been used to show that there are a large number of TE-derived TFBS. Many of these TE-derived TFBS show poor conservation and would go unnoticed using conservation screens. Here, we describe a simple pipeline method for using data generated through ChIP-seq to identify TE-derived TFBS.
|
27
|
Nathanson JL, Jappelli R, Scheeff ED, Manning G, Obata K, Brenner S, Callaway EM. Short Promoters in Viral Vectors Drive Selective Expression in Mammalian Inhibitory Neurons, but do not Restrict Activity to Specific Inhibitory Cell-Types. Front Neural Circuits 2009; 3:19. [PMID: 19949461 PMCID: PMC2783723 DOI: 10.3389/neuro.04.019.2009] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2009] [Accepted: 10/13/2009] [Indexed: 12/05/2022] Open
Abstract
Short cell-type specific promoter sequences are important for targeted gene therapy and studies of brain circuitry. We report on the ability of short promoter sequences to drive fluorescent protein expression in specific types of mammalian cortical inhibitory neurons using adeno-associated virus (AAV) and lentivirus (LV) vectors. We tested many gene regulatory sequences derived from fugu (Takifugu rubripes), mouse, human, and synthetic composite regulatory elements. All fugu compact promoters expressed in mouse cortex, with only the somatostatin (SST) and the neuropeptide Y (NPY) promoters largely restricting expression to GABAergic neurons. However, these promoters did not control expression in inhibitory cells in a subtype-specific manner. We also tested mammalian promoter sequences derived from genes putatively coexpressed or coregulated within three major inhibitory interneuron classes (PV, SST, VIP). In contrast to the fugu promoters, many of the mammalian sequences failed to express, and only the promoter from gene A930038C07Rik conferred restricted expression, although, as in the case of the fugu sequences, this too was not inhibitory neuron subtype specific. Lastly and more promisingly, a synthetic sequence consisting of a composite regulatory element assembled with PAX6 E1.1 binding sites, NRSE and a minimal CMV promoter showed expression markedly restricted to a small subset of mostly inhibitory neurons whose commonalities are unknown.
Affiliation(s)
- Jason L Nathanson
- Systems Neurobiology Laboratories, Salk Institute for Biological Studies, La Jolla, CA, USA
|
28
|
Abstract
We present CisFinder software, which generates a comprehensive list of motifs enriched in a set of DNA sequences and describes them with position frequency matrices (PFMs). A new algorithm was designed to estimate PFMs directly from counts of n-mer words with and without gaps; then PFMs are extended over gaps and flanking regions and clustered to generate non-redundant sets of motifs. The algorithm successfully identified binding motifs for 12 transcription factors (TFs) in embryonic stem cells based on published chromatin immunoprecipitation sequencing data. Furthermore, CisFinder successfully identified alternative binding motifs of TFs (e.g. POU5F1, ESRRB, and CTCF) and motifs for known and unknown co-factors of genes associated with the pluripotent state of ES cells. CisFinder also showed robust performance in the identification of motifs that were only slightly enriched in a set of DNA sequences.
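To make the term concrete, a position frequency matrix can be tallied from a set of aligned binding sites as sketched below. This is only an illustration of what a PFM is: CisFinder, per the abstract, estimates PFMs from n-mer word counts, which this sketch does not attempt. The function name and the example sites are assumptions.

```python
def position_frequency_matrix(sites):
    """Count A/C/G/T at each position of equal-length, pre-aligned sites."""
    if not sites:
        raise ValueError("need at least one site")
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("sites must all have the same length")

    pfm = {base: [0] * length for base in "ACGT"}
    for site in sites:
        for i, base in enumerate(site.upper()):
            if base in pfm:  # ambiguity codes such as N are skipped
                pfm[base][i] += 1
    return pfm

# Three made-up aligned sites of length 4.
pfm = position_frequency_matrix(["ACGT", "ACGA", "TCGT"])
```

Each column of the matrix sums to the number of sites (minus any skipped ambiguity codes), so dividing by that total gives per-position base frequencies.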
Affiliation(s)
- Alexei A Sharov
- Developmental Genomics and Aging Section, Laboratory of Genetics, National Institute on Aging, NIH, Baltimore, MD 21224, USA
|
29
|
Stanescu H, Wolfsberg T, Moreland R, Ayub M, Erickson E, Westbroek W, Huizing M, Gahl W, Helip-Wooley A. Identifying putative promoter regions of Hermansky-Pudlak syndrome genes by means of phylogenetic footprinting. Ann Hum Genet 2009; 73:422-8. [PMID: 19523149 PMCID: PMC2730976 DOI: 10.1111/j.1469-1809.2009.00525.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]
Abstract
HPS is an autosomal recessive disorder characterized by oculocutaneous albinism and prolonged bleeding. Eight human genes are described resulting in the HPS subtypes 1-8. Certain HPS proteins combine to form Biogenesis of Lysosome-related Organelles Complexes (BLOCs), thought to function in the formation of intracellular vesicles such as melanosomes, platelet dense bodies, and lytic granules. Specifically, BLOC-2 contains the HPS3, HPS5 and HPS6 proteins. We used phylogenetic footprinting to identify conserved regions in the upstream sequences of HPS3, HPS5 and HPS6. These conserved regions were verified to have in vitro transcription activation activity using luciferase reporter assays. Transcription factor binding site analyses of the regions identified 52 putative sites shared by all three genes. When analysis was limited to the conserved footprints, seven binding sites were found shared among all three genes: Pax-5, AIRE, CACD, ZF5, Zic1, E2F and Churchill. The HPS3 conserved upstream region was sequenced in four patients with decreased fibroblast HPS3 RNA levels and only one HPS3 mutation in the coding exons and surrounding exon/intron boundaries; no mutation was found. These findings illustrate the power of phylogenetic footprinting for identifying potential regulatory regions in non-coding sequences and define the first putative promoter elements for any HPS genes.
Affiliation(s)
- H. Stanescu
- Section on Human Biochemical Genetics, Medical Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- T.G. Wolfsberg
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- R.T. Moreland
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- M.H. Ayub
- Section on Human Biochemical Genetics, Medical Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- E. Erickson
- Section on Human Biochemical Genetics, Medical Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- W. Westbroek
- Section on Human Biochemical Genetics, Medical Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- M. Huizing
- Section on Human Biochemical Genetics, Medical Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- W.A. Gahl
- Section on Human Biochemical Genetics, Medical Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- A. Helip-Wooley
- Section on Human Biochemical Genetics, Medical Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
|
30
|
D'Elia AV, Bregant E, Passon N, Puppin C, Meneghel A, Damante G. Conservation across species identifies several transcriptional enhancers in the HEX genomic region. Mol Cell Biochem 2009; 332:67-75. [PMID: 19554426 DOI: 10.1007/s11010-009-0175-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2008] [Accepted: 06/09/2009] [Indexed: 10/20/2022]
Abstract
The HEX gene encodes a homeodomain-containing transcription factor that controls various phases of vertebrate development. During development, as well as in the adult, HEX is expressed in several different tissues including thyroid, liver, lung, mammary gland, haematopoietic progenitors, and endothelial cells, suggesting that this gene is subject to complex transcriptional regulation. In this study, we evaluated the presence of different enhancers in the HEX gene region using a phylogenetic approach. Several non-coding sequences, conserved between human and mouse, were selected. Four conserved sequences showed enhancer activity in MCF-7 cells. Two of these enhancers (located in the first and third intron, respectively) have been previously identified by other experimental approaches. These elements, as well as one of the newly identified enhancers (located 2 kb 3' to the HEX gene), are able to activate the HEX minimal promoter "in trans." The activity of the 3' enhancer was strongly reduced by overexpression of HDAC3.
|
31
|
Terenius O, Marinotti O, Sieglaff D, James AA. Molecular genetic manipulation of vector mosquitoes. Cell Host Microbe 2008; 4:417-23. [PMID: 18996342 PMCID: PMC2656434 DOI: 10.1016/j.chom.2008.09.002] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2008] [Revised: 08/29/2008] [Accepted: 09/09/2008] [Indexed: 01/01/2023]
Abstract
Genetic strategies for reducing populations of vector mosquitoes or replacing them with those that are not able to transmit pathogens benefit greatly from molecular tools that allow gene manipulation and transgenesis. Mosquito genome sequences and associated EST (expressed sequence tags) databases enable large-scale investigations to provide new insights into evolutionary, biochemical, genetic, metabolic, and physiological pathways. Additionally, comparative genomics reveals the bases for evolutionary mechanisms with particular focus on specific interactions between vectors and pathogens. We discuss how this information may be exploited for the optimization of transgenes that interfere with the propagation and development of pathogens in their mosquito hosts.
Affiliation(s)
- Olle Terenius
- Department of Molecular Biology and Biochemistry, 3205 McGaugh Hall, University of California, Irvine, CA 92697, USA
- Osvaldo Marinotti
- Department of Molecular Biology and Biochemistry, 3205 McGaugh Hall, University of California, Irvine, CA 92697, USA
- Douglas Sieglaff
- Department of Molecular Biology and Biochemistry, 3205 McGaugh Hall, University of California, Irvine, CA 92697, USA
- Institute for Genomics and Bioinformatics, University of California, Irvine
- Anthony A. James
- Department of Molecular Biology and Biochemistry, 3205 McGaugh Hall, University of California, Irvine, CA 92697, USA
- Department of Microbiology & Molecular Genetics, University of California, Irvine, CA 92697, USA
|
32
|
Mages J, Freimüller K, Lang R, Hatzopoulos AK, Guggemoos S, Koszinowski UH, Adler H. Proteins of the secretory pathway govern virus productivity during lytic gammaherpesvirus infection. J Cell Mol Med 2008; 12:1974-89. [PMID: 18194452 PMCID: PMC2673020 DOI: 10.1111/j.1582-4934.2008.00235.x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2007] [Accepted: 01/08/2007] [Indexed: 11/30/2022] Open
Abstract
BACKGROUND Diseases caused by gammaherpesviruses continue to be a challenge for human health and antiviral treatment. Most of the commonly used antiviral drugs are directed against viral gene products. However, the emergence of drug-resistant mutations may limit the effectiveness of these drugs. Since viruses require a host cell to propagate, the search for host cell targets is an interesting alternative. METHODS In this study, we infected three different cell types (fibroblasts, endothelial precursor cells and macrophages) with a murine gammaherpesvirus and analysed the host cell response for changes either common to all or unique to a particular cell type using oligonucleotide microarrays. RESULTS The analysis revealed a number of genes whose transcription was significantly up- or down-regulated in either one or two of the cell types tested. After infection, only two genes, Lman1 (also known as ERGIC53) and synaptobrevin-like 1 (sybl1), were significantly up-regulated in all three cell types, suggesting a general role in the virus life cycle independent of the cell type. Both proteins have been implicated in cellular exocytosis and transport of glycoproteins through the secretory pathway. To test the significance of the observed up-regulation, the functionality of these proteins was modulated, and the effect on virus replication was monitored. Inhibition of either Lman1 or sybl1 resulted in a significant reduction in virus production. CONCLUSIONS This suggests that proteins of the secretory pathway which appear to be rate limiting for virus production may represent new targets for intervention.
Affiliation(s)
- J Mages
- Institute of Medical Microbiology, Immunology and Hygiene, Technical University Munich, Munich, Germany
- K Freimüller
- Institute of Molecular Immunology, Clinical Cooperation Group Hematopoietic Cell Transplantation, GSF, National Research Center for Environment and Health, Munich, Germany
- R Lang
- Institute of Medical Microbiology, Immunology and Hygiene, Technical University Munich, Munich, Germany
- A K Hatzopoulos
- Institute of Clinical Molecular Biology and Tumor Genetics, GSF, National Research Center for Environment and Health, Munich, Germany
- Vanderbilt University Medical Center, Departments of Medicine and Cell & Developmental Biology, Nashville, TN, USA
- S Guggemoos
- Institute of Molecular Immunology, Clinical Cooperation Group Hematopoietic Cell Transplantation, GSF, National Research Center for Environment and Health, Munich, Germany
- U H Koszinowski
- Vanderbilt University Medical Center, Departments of Medicine and Cell & Developmental Biology, Nashville, TN, USA
- Max von Pettenkofer-Institute, Ludwig-Maximilians-University Munich, Munich, Germany
- H Adler
- Institute of Molecular Immunology, Clinical Cooperation Group Hematopoietic Cell Transplantation, GSF, National Research Center for Environment and Health, Munich, Germany
|
33
|
Polavarapu N, Mariño-Ramírez L, Landsman D, McDonald JF, Jordan IK. Evolutionary rates and patterns for human transcription factor binding sites derived from repetitive DNA. BMC Genomics 2008; 9:226. [PMID: 18485226 PMCID: PMC2397414 DOI: 10.1186/1471-2164-9-226] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2008] [Accepted: 05/17/2008] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND The majority of human non-protein-coding DNA is made up of repetitive sequences, mainly transposable elements (TEs). It is becoming increasingly apparent that many of these repetitive DNA sequence elements encode gene regulatory functions. This fact has important evolutionary implications, since repetitive DNA is the most dynamic part of the genome. We set out to assess the evolutionary rate and pattern of experimentally characterized human transcription factor binding sites (TFBS) that are derived from repetitive versus non-repetitive DNA to test whether repeat-derived TFBS are in fact rapidly evolving. We also evaluated the position-specific patterns of variation among TFBS to look for signs of functional constraint on TFBS derived from repetitive and non-repetitive DNA. RESULTS We found numerous experimentally characterized TFBS in the human genome, 7-10% of all mapped sites, which are derived from repetitive DNA sequences including simple sequence repeats (SSRs) and TEs. TE-derived TFBS sequences are far less conserved between species than TFBS derived from SSRs and non-repetitive DNA. Despite their rapid evolution, several lines of evidence indicate that TE-derived TFBS are functionally constrained. First of all, ancient TE families, such as MIR and L2, are enriched for TFBS relative to younger families like Alu and L1. Secondly, functionally important positions in TE-derived TFBS, specifically those residues thought to physically interact with their cognate protein binding factors (TF), are more evolutionarily conserved than adjacent TFBS positions. Finally, TE-derived TFBS show position-specific patterns of sequence variation that are highly distinct from random patterns and similar to the variation seen for non-repeat derived sequences of the same TFBS. 
CONCLUSION The abundance of experimentally characterized human TFBS that are derived from repetitive DNA speaks to the substantial regulatory effects that this class of sequence has on the human genome. The unique evolutionary properties of repeat-derived TFBS are perhaps even more intriguing. TE-derived TFBS in particular, while clearly functionally constrained, evolve extremely rapidly relative to non-repeat derived sites. Such rapidly evolving TFBS are likely to confer species-specific regulatory phenotypes, i.e. divergent expression patterns, on the human evolutionary lineage. This result has practical implications with respect to the widespread use of evolutionary conservation as a surrogate for functionally relevant non-coding DNA. Most TE-derived TFBS would be missed using the kinds of sequence conservation-based screens, such as phylogenetic footprinting, that are used to help characterize non-coding DNA. Thus, the very TFBS that are most likely to yield human-specific characteristics will be neglected by the comparative genomic techniques that are currently de rigueur for the identification of novel regulatory sites.
Affiliation(s)
- Nalini Polavarapu
- School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Leonardo Mariño-Ramírez
- National Center for Biotechnology Information, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
- David Landsman
- National Center for Biotechnology Information, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
- John F McDonald
- School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA
- I King Jordan
- School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA
|
34
|
Choi SH, Lee G, Monahan P, Park JH. Spatial regulation of Corazonin neuropeptide expression requires multiple cis-acting elements in Drosophila melanogaster. J Comp Neurol 2008; 507:1184-95. [PMID: 18181151 DOI: 10.1002/cne.21594] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Although most invertebrate neuropeptide-encoding genes display distinct expression patterns in the central nervous system (CNS), the molecular mechanisms underlying spatial regulation of the neuropeptide genes are largely unknown. Expression of the neuropeptide Corazonin (Crz) is limited to only 24 neurons in the larval CNS of Drosophila melanogaster, and these neurons have been categorized into three groups, namely, DL, DM, and vCrz. To identify cis-regulatory elements that control transcription of Crz in each neuronal group, reporter gene expression patterns driven by various 5' flanking sequences of Crz were analyzed to assess their promoter activities in the CNS. We show that the 504-bp 5' upstream sequence is the shortest promoter directing reporter activities in all Crz neurons. Further dissection of this sequence revealed two important regions responsible for group specificity: -504::-419 for DM expression and -380::-241 for DL and vCrz expression. The latter region is further subdivided into three sites (proximal, center, and distal), in which any combinations of the two are sufficient for DL expression, whereas both proximal and distal sites are required for vCrz expression. Interestingly, the TATA box does not play a role in Crz transcription in most neurons. We also show that a 434-bp 5' upstream sequence of the D. virilis Crz gene, when introduced into the D. melanogaster genome, drives reporter expression in the DL and vCrz neurons, suggesting that regulatory mechanisms for Crz expression in at least two such neuronal groups are conserved between the two species.
Affiliation(s)
- Seung-Hoon Choi
- Laboratory of Neurogenetics, Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, Tennessee 37996, USA
|
35
|
Creux NM, Ranik M, Berger DK, Myburg AA. Comparative analysis of orthologous cellulose synthase promoters from Arabidopsis, Populus and Eucalyptus: evidence of conserved regulatory elements in angiosperms. THE NEW PHYTOLOGIST 2008; 179:722-737. [PMID: 18547376 DOI: 10.1111/j.1469-8137.2008.02517.x] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/11/2023]
Abstract
* The cellulose synthase (CesA) gene family encodes the catalytic subunits of a large protein complex responsible for the deposition of cellulose into plant cell walls. Early in vascular plant evolution, the gene family diverged into distinct members with conserved structures and functions (e.g. primary or secondary cell wall biosynthesis). Although the functions and expression domains of CesA genes have been extensively studied in plants, little is known about transcriptional regulation and promoter evolution in this gene family. * Here, comparative sequence analysis of orthologous CesA promoters from three angiosperm genera, Arabidopsis, Populus and Eucalyptus, was performed to identify putative cis-regulatory sequences. The promoter sequences of groups of Arabidopsis genes that are co-expressed with the primary or secondary cell wall-related CesA genes were also analyzed. * Reporter gene analysis of newly isolated promoter regions of six E. grandis CesA genes in Arabidopsis revealed the conserved functionality of the promoter sequences. Comparative sequence analysis identified 71 conserved sequence motifs, of which 66 were significantly over-represented in either primary or secondary wall-associated promoters. * The presence of conserved cis-regulatory elements in the evolutionarily distant CesA promoters of Arabidopsis, Populus and Eucalyptus suggests an ancient transcriptional network regulating cellulose biosynthesis in vascular plants.
Affiliation(s)
- David Kenneth Berger
- Department of Plant Science, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, 0002, South Africa
|
36
|
Boily G, Beaulieu P, Healy J, Sinnett D. Connections between ETV6-modulated genes: identification of shared features. Cancer Inform 2008; 6:183-201. [PMID: 19259410 PMCID: PMC2623305 DOI: 10.4137/cin.s556] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022] Open
Abstract
Accumulating genetic and functional evidence points to ETV6 as being the tumour suppressor gene targeted by the deletions at chromosome 12p12-13 found in various cancers, particularly childhood leukemia. ETV6 is a ubiquitously expressed transcription factor (TF) of the ETS family with very few known targeted genes. We recently compiled a list of 87 ETV6-modulated genes that can be classified into a number of subgroups based on their coordinated expression patterns. In the present report, we hypothesized that genes presenting a similar profile of modulation could also share biological features, promoter sequence similarities and/or common transcription factor binding sites (TFBSs). Using an exploratory approach based on hierarchical clustering of expression data, Gene Ontology (GO) terms, sequence similarity and evolutionarily conserved putative TFBSs, we found that many genes presenting a similar expression profile also share biological features and/or conserved predicted TFBSs but rarely show detectable promoter sequence similarities. We also calculated the proportion of ETV6-modulated genes that have any conserved TFBSs of the Jaspar database in their regulatory sequence and compared these proportions to those calculated for two other gene lists, ETV6 non-modulated and ETS-regulated. We found that the NF-kB, c-REL and p65 TFBSs, which all bind TFs of the REL class, were under-represented among the ETV6-modulated genes compared to the ETV6-non-modulated genes, while the Broad-complex 1 TFBS appeared to be over-represented. NF-Y and Chop/cEBP TFBSs were over-represented in the promoters of ETV6-modulated genes compared to ETS-regulated genes. These analyses will help direct further studies intending to understand the role of ETV6 as a transcriptional regulator and aid in constructing the ETV6-regulatory gene network.
Affiliation(s)
- Gino Boily
- Division of Hematology-Oncology, Charles-Bruneau Cancer Center, Research Center, CHU Sainte-Justine, Montreal, Quebec, Canada
|
37
|
Moroni E, Caselle M, Fogolari F. Identification of DNA-binding protein target sequences by physical effective energy functions: free energy analysis of lambda repressor-DNA complexes. BMC STRUCTURAL BIOLOGY 2007; 7:61. [PMID: 17900341 PMCID: PMC2194778 DOI: 10.1186/1472-6807-7-61] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/20/2007] [Accepted: 09/27/2007] [Indexed: 11/26/2022]
Abstract
Background Specific binding of proteins to DNA is one of the most common ways gene expression is controlled. Although general rules for the DNA-protein recognition can be derived, the ambiguous and complex nature of this mechanism precludes a simple recognition code; therefore, the prediction of DNA target sequences is not straightforward. DNA-protein interactions can be studied using computational methods which can complement the current experimental methods and offer some advantages. In the present work we use physical effective potentials to evaluate the DNA-protein binding affinities for the λ repressor-DNA complex for which structural and thermodynamic experimental data are available. Results The binding free energy of two molecules can be expressed as the sum of an intermolecular energy (evaluated using a molecular mechanics forcefield), a solvation free energy term and an entropic term. Different solvation models are used including distance dependent dielectric constants, solvent accessible surface tension models and the Generalized Born model. The effect of conformational sampling by Molecular Dynamics simulations on the computed binding energy is assessed; results show that this effect is in general negative and the reproducibility of the experimental values decreases with the increase of simulation time considered. The free energy of binding for non-specific complexes, estimated using the best energetic model, agrees with earlier theoretical suggestions. As a result of these analyses, we propose a protocol for the prediction of DNA-binding target sequences. The possibility of searching regulatory elements within the bacteriophage λ genome using this protocol is explored. Our analysis shows good prediction capabilities, even in the absence of any thermodynamic data and information on the naturally recognized sequence.
Conclusion This study supports the conclusion that physics-based methods can offer a completely complementary methodology to sequence-based methods for the identification of DNA-binding protein target sequences.
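Written out, the three-term decomposition of the binding free energy described in the Results has the form below. The symbols are assumed notation introduced here for illustration and are not taken from the paper: \Delta E_{inter} stands for the intermolecular (molecular mechanics) energy, \Delta G_{solv} for the solvation free energy term, and -T\Delta S for the entropic term.

```latex
\Delta G_{\mathrm{bind}} \;=\; \Delta E_{\mathrm{inter}} \;+\; \Delta G_{\mathrm{solv}} \;-\; T\,\Delta S
```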
Affiliation(s)
- Elisabetta Moroni
- Dipartimento di Fisica Teorica, Università di Torino and INFN, Via P. Giuria 1, 10125 Torino, Italy
- Dipartimento di Fisica G. Occhialini, Università di Milano-Bicocca and INFN, Piazza delle Scienze 3, 20156 Milano, Italy
- Michele Caselle
- Dipartimento di Fisica Teorica, Università di Torino and INFN, Via P. Giuria 1, 10125 Torino, Italy
- Federico Fogolari
- Dipartimento di Scienze e Tecnologie Biomediche, Università di Udine, P.le Kolbe 4, 33100 Udine, Italy
|
38
|
Song N, Sedgewick RD, Durand D. Domain architecture comparison for multidomain homology identification. J Comput Biol 2007; 14:496-516. [PMID: 17572026 DOI: 10.1089/cmb.2007.a009] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022] Open
Abstract
Homology identification is the first step for many genomic studies. Current methods, based on sequence comparison, can result in a substantial number of mis-assignments due to the similarity of homologous domains in otherwise unrelated sequences. Here we propose methods to detect homologs through explicit comparison of protein domain content. We developed several schemes for scoring the homology of a pair of protein sequences based on methods used in the field of information retrieval. We evaluate the proposed methods and methods used in the literature using a benchmark of fifteen sequence families of known evolutionary history. The results of these studies demonstrate the effectiveness of comparing domain architectures using these similarity measures. We also demonstrate the importance of both weighting promiscuous domains and of compensating for the statistical effect of having a large number of domains in a protein. Using logistic regression, we demonstrate the benefit of combining similarity measures based on domain content with sequence similarity measures.
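One way to picture an information-retrieval style comparison of domain content is a weighted cosine similarity, sketched below. This is an assumed illustration, not the scoring schemes the authors developed: the inverse-document-frequency weighting stands in for "weighting promiscuous domains", the cosine normalisation stands in for compensating for proteins with many domains, and the protein and domain names are made up.

```python
import math
from collections import Counter

def idf_weights(proteins):
    """Down-weight domains that occur in many proteins (promiscuous domains)."""
    n = len(proteins)
    df = Counter(d for domains in proteins.values() for d in set(domains))
    return {d: math.log(n / count) for d, count in df.items()}

def weighted_cosine(a, b, weights):
    """Cosine similarity of two domain lists using per-domain weights."""
    va, vb = Counter(a), Counter(b)
    shared = va.keys() & vb.keys()
    dot = sum(va[d] * vb[d] * weights.get(d, 0.0) ** 2 for d in shared)
    norm_a = math.sqrt(sum((c * weights.get(d, 0.0)) ** 2 for d, c in va.items()))
    norm_b = math.sqrt(sum((c * weights.get(d, 0.0)) ** 2 for d, c in vb.items()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical proteins: "kinase" is common, "SH2" is rarer.
proteins = {
    "p1": ["kinase", "SH2"],
    "p2": ["kinase", "SH3"],
    "p3": ["SH2", "PDZ"],
    "p4": ["kinase"],
}
w = idf_weights(proteins)
sim_rare = weighted_cosine(proteins["p1"], proteins["p3"], w)    # share SH2
sim_common = weighted_cosine(proteins["p1"], proteins["p2"], w)  # share kinase
```

In this toy set, sharing the rarer domain yields the higher score, which is the intended effect of down-weighting promiscuous domains.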
Affiliation(s)
- N Song
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
|
39
|
Horvath MM, Wang X, Resnick MA, Bell DA. Divergent evolution of human p53 binding sites: cell cycle versus apoptosis. PLoS Genet 2007; 3:e127. [PMID: 17677004 PMCID: PMC1934401 DOI: 10.1371/journal.pgen.0030127] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2007] [Accepted: 06/15/2007] [Indexed: 12/12/2022] Open
Abstract
The p53 tumor suppressor is a sequence-specific pleiotropic transcription factor that coordinates cellular responses to DNA damage and stress, initiating cell-cycle arrest or triggering apoptosis. Although the human p53 binding site sequence (or response element [RE]) is well characterized, some genes have consensus-poor REs that are nevertheless both necessary and sufficient for transactivation by p53. Identification of new functional gene regulatory elements under these conditions is problematic, and evolutionary conservation is often employed. We evaluated the comparative genomics approach for assessing evolutionary conservation of putative binding sites by examining conservation of 83 experimentally validated human p53 REs against mouse, rat, rabbit, and dog genomes and detected pronounced conservation differences among p53 REs and p53-regulated pathways. Bona fide NRF2 (nuclear factor [erythroid-derived 2]-like 2 nuclear factor) and NFkappaB (nuclear factor of kappa light chain gene enhancer in B cells) binding sites, which direct oxidative stress and innate immunity responses, were used as controls, and both exhibited high interspecific conservation. Surprisingly, the average p53 RE was not significantly more conserved than background genomic sequence, and p53 REs in apoptosis genes as a group showed very little conservation. The common bioinformatics practice of filtering RE predictions by 80% rodent sequence identity would not only give a false positive rate of approximately 19%, but miss up to 57% of true p53 REs. Examination of interspecific DNA base substitutions as a function of position in the p53 consensus sequence reveals an unexpected excess of diversity in apoptosis-regulating REs versus cell-cycle controlling REs (rodent comparisons: p < 1.0 e-12). While some p53 REs show relatively high levels of conservation, REs in many genes such as BAX, FAS, PCNA, CASP6, SIVA1, and P53AIP1 show little if any homology to rodent sequences. 
This difference suggests that among mammalian species, evolutionary conservation differs among p53 REs, with some having ancient ancestry and others of more recent origin. Overall our results reveal divergent evolutionary pressure among the binding targets of p53 and emphasize that comparative genomics methods must be used judiciously and tailored to the evolutionary history of the targeted functional regulatory regions.
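The kind of identity-threshold filter the abstract cautions against can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the sequences are made-up pre-aligned strings rather than real p53 REs, and only the 80% default comes from the abstract.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned columns with identical bases; gap columns count as mismatches."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    matches = sum(
        a == b and a != "-" for a, b in zip(seq_a.upper(), seq_b.upper())
    )
    return 100.0 * matches / len(seq_a)

def passes_conservation_filter(human_re, rodent_re, threshold=80.0):
    """True if the aligned rodent sequence meets the identity threshold."""
    return percent_identity(human_re, rodent_re) >= threshold

# Made-up 10-column alignments: 8/10 and 7/10 identical positions.
kept = passes_conservation_filter("ACGTACGTAC", "ACGTACGTTT")
dropped = passes_conservation_filter("ACGTACGTAC", "ACGTACGGTT")
```

A functional site that sits just below the cutoff is discarded by such a filter regardless of experimental support, which is the failure mode the abstract quantifies.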
Affiliation(s)
- Monica M Horvath
- Laboratory of Molecular Genetics, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, United States of America
| | - Xuting Wang
- Laboratory of Molecular Genetics, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, United States of America
| | - Michael A Resnick
- Laboratory of Molecular Genetics, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, United States of America
| | - Douglas A Bell
- Laboratory of Molecular Genetics, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, United States of America
|
40
|
Piriyapongsa J, Mariño-Ramírez L, Jordan IK. Origin and evolution of human microRNAs from transposable elements. Genetics 2007; 176:1323-37. [PMID: 17435244 PMCID: PMC1894593 DOI: 10.1534/genetics.107.072553] [Citation(s) in RCA: 261] [Impact Index Per Article: 14.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2007] [Accepted: 04/12/2007] [Indexed: 12/19/2022] Open
Abstract
We sought to evaluate the extent of the contribution of transposable elements (TEs) to human microRNA (miRNA) genes along with the evolutionary dynamics of TE-derived human miRNAs. We found 55 experimentally characterized human miRNA genes that are derived from TEs, and these TE-derived miRNAs have the potential to regulate thousands of human genes. Sequence comparisons revealed that TE-derived human miRNAs are less conserved, on average, than non-TE-derived miRNAs. However, there are 18 TE-derived miRNAs that are relatively conserved, and 14 of these are related to the ancient L2 and MIR families. Comparison of miRNA vs. mRNA expression patterns for TE-derived miRNAs and their putative target genes showed numerous cases of anti-correlated expression that are consistent with regulation via mRNA degradation. In addition to the known human miRNAs that we show to be derived from TE sequences, we predict an additional 85 novel TE-derived miRNA genes. TE sequences are typically disregarded in genomic surveys for miRNA genes and target sites; this is a mistake. Our results indicate that TEs provide a natural mechanism for the origination of miRNAs that can contribute to regulatory divergence between species as well as a rich source for the discovery of as yet unknown miRNA genes.
Affiliation(s)
- Jittima Piriyapongsa
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia 30332 and National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland 20894
| | - Leonardo Mariño-Ramírez
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia 30332 and National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland 20894
| | - I. King Jordan
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia 30332 and National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland 20894
|
41
|
Kuraku S, Kuratani S. Time scale for cyclostome evolution inferred with a phylogenetic diagnosis of hagfish and lamprey cDNA sequences. Zoolog Sci 2007; 23:1053-64. [PMID: 17261918 DOI: 10.2108/zsj.23.1053] [Citation(s) in RCA: 145] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
The Cyclostomata consists of the two orders Myxiniformes (hagfishes) and Petromyzoniformes (lampreys), and its monophyly has been unequivocally supported by recent molecular phylogenetic studies. Under this updated vertebrate phylogeny, we performed in silico evolutionary analyses using currently available cDNA sequences of cyclostomes. We first calculated the GC-content at four-fold degenerate sites (GC(4)), which revealed that an extremely high GC-content is shared by all the lamprey species we surveyed, whereas no striking pattern in GC-content was observed in any of the hagfish species surveyed. We then estimated the timing of diversification in cyclostome evolution using nucleotide and amino acid sequences. We obtained divergence times of 470-390 million years ago (Mya) in the Ordovician-Silurian-Devonian Periods for the interordinal split between Myxiniformes and Petromyzoniformes; 90-60 Mya in the Cretaceous-Tertiary Periods for the split between the two hagfish subfamilies, Myxininae and Eptatretinae; 280-220 Mya in the Permian-Triassic Periods for the split between the two lamprey subfamilies, Geotriinae and Petromyzoninae; and 30-10 Mya in the Tertiary Period for the split between the two lamprey genera, Petromyzon and Lethenteron. This evolutionary configuration indicates that Myxiniformes and Petromyzoniformes diverged shortly after the common ancestor of cyclostomes split from the future gnathostome lineage. Our results also suggest that intra-subfamilial diversification in hagfish and lamprey lineages (especially those distributed in the northern hemisphere) occurred in the Cretaceous or Tertiary Periods.
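As a rough illustration of the GC(4) statistic mentioned in the abstract above, the sketch below counts G or C at the third position of four-fold degenerate codons. It assumes the standard genetic code and an in-frame coding sequence; the function name `gc4` and the prefix table are illustrative and are not taken from the paper.

```python
# Codon prefixes whose third position is four-fold degenerate under the
# standard genetic code (assumption): Leu CTN, Val GTN, Ser TCN, Pro CCN,
# Thr ACN, Ala GCN, Arg CGN, Gly GGN.
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def gc4(cds: str) -> float:
    """GC fraction at four-fold degenerate third codon positions."""
    seq = cds.upper().replace("U", "T")
    gc = total = 0
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        # Skip codons with ambiguous third bases such as N.
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            total += 1
            if codon[2] in "GC":
                gc += 1
    if total == 0:
        raise ValueError("no four-fold degenerate sites found")
    return gc / total
```

Because any base at these sites leaves the encoded amino acid unchanged, the statistic is commonly read as reflecting mutational or compositional bias rather than selection on the protein.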
Affiliation(s)
- Shigehiro Kuraku
- Laboratory for Evolutionary Morphology, RIKEN Center for Developmental Biology, Kobe 650-0047, Japan.
|
42
|
van Deursen D, Botma GJ, Jansen H, Verhoeven AJM. Comparative genomics and experimental promoter analysis reveal functional liver-specific elements in mammalian hepatic lipase genes. BMC Genomics 2007; 8:99. [PMID: 17428321 PMCID: PMC1853088 DOI: 10.1186/1471-2164-8-99] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2007] [Accepted: 04/11/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Mammalian hepatic lipase (HL) genes are transcribed almost exclusively in hepatocytes. The basis for this liver-restricted expression is not completely understood. We hypothesized that the responsible cis-acting elements are conserved among mammalian HL genes. To identify these elements, we made a genomic comparison of 30 kb of 5'-flanking region of the rat, mouse, rhesus monkey, and human HL genes. The in silico data were verified by promoter-reporter assays in transfected hepatoma HepG2 and non-hepatoma HeLa cells using serial 5'-deletions of the rat HL (-2287/+9) and human HL (-685/+13) promoter region. RESULTS Highly conserved elements were present at the proximal promoter region, and at 14 and 22 kb upstream of the transcriptional start site. Both of these upstream elements increased transcriptional activity of the human HL (-685/+13) promoter region 2-3 fold. Within the proximal HL promoter region, conserved clusters of transcription factor binding sites (TFBS) were identified at -240/-200 (module A), -80/-40 (module B), and -25/+5 (module C) by the rVista software. In HepG2 cells, modules B and C, but not module A, were important for basal transcription. Module B contains putative binding sites for hepatocyte nuclear factors HNF1alpha. In the presence of module B, transcription from the minimal HL promoter was increased 1.5-2 fold in HepG2 cells, but inhibited 2-4 fold in HeLa cells. CONCLUSION Our data demonstrate that searching for conserved non-coding sequences by comparative genomics is a valuable tool in identifying candidate enhancer elements. With this approach, we found two putative enhancer elements in the far upstream region of the HL gene. In addition, we obtained evidence that the -80/-40 region of the HL gene is responsible for enhanced HL promoter activity in hepatoma cells, and for silencing HL promoter activity in non-liver cells.
Affiliation(s)
- Diederik van Deursen
- Department of Biochemistry, Cardiovascular Research School COEUR, Erasmus MC, PO Box 1738, 3000 DR Rotterdam, The Netherlands
| | - Gert-Jan Botma
- Department of Biochemistry, Cardiovascular Research School COEUR, Erasmus MC, PO Box 1738, 3000 DR Rotterdam, The Netherlands
| | - Hans Jansen
- Department of Biochemistry, Cardiovascular Research School COEUR, Erasmus MC, PO Box 1738, 3000 DR Rotterdam, The Netherlands
- Department of Clinical Chemistry, Cardiovascular Research School COEUR, Erasmus MC, PO Box 1738, 3000 DR Rotterdam, The Netherlands
| | - Adrie JM Verhoeven
- Department of Biochemistry, Cardiovascular Research School COEUR, Erasmus MC, PO Box 1738, 3000 DR Rotterdam, The Netherlands
|
43
|
Boily G, Ouellet S, Langlois S, Larivière M, Drouin R, Sinnett D. In vivo footprinting analysis of the Glypican 3 (GPC3) promoter region in neuroblastoma cells. Biochim Biophys Acta 2007; 1769:182-93. [PMID: 17350117 DOI: 10.1016/j.bbaexp.2007.01.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2006] [Revised: 01/24/2007] [Accepted: 01/29/2007] [Indexed: 11/17/2022]
Abstract
Glypican 3 (GPC3) is an X-linked gene that has its peak expression during development and is down-regulated in all studied tissues after birth. We have shown that GPC3 was expressed in neuroblastoma and Wilms' tumor. To understand the mechanisms regulating the transcription of this gene in neuroblastoma cells, we have focused our study on the identification of putative transcription factors binding the promoter. In this report we performed in vivo dimethylsulfate, UV type C irradiation and DNaseI footprinting analyses coupled with ligation-mediated PCR on nearly 1000 bp of promoter in two neuroblastoma cell lines, SJNB-7 (expressing GPC3) and SK-N-FI (not expressing GPC3). Nucleosome signature footprints were observed in the most distal part of the studied region in both cell lines. We detected eight large differentially protected regions, suggesting the presence of binding proteins in both cell lines but more DNA-protein interactions in GPC3-expressing cells. Sp1 was previously shown to be able to bind some of these regions. Here by combining electromobility shift assays and chromatin immunoprecipitations we showed that the transcription factor NFY was part of the DNA-protein complex found in footprinted regions upstream of the described minimal promoter. These studies performed on chromatin in situ suggest that NFY and yet unknown cell type-specific factors may play an important role in the regulation of GPC3.
Affiliation(s)
- Gino Boily
- Division of Hemato-Oncology, Charles-Bruneau Cancer Center, Research Center, CHU Sainte-Justine, Montreal, QC, Canada H3T 1C5
|
44
|
Bowser PRF, Tobe SS. Comparative genomic analysis of allatostatin-encoding (Ast) genes in Drosophila species and prediction of regulatory elements by phylogenetic footprinting. Peptides 2007; 28:83-93. [PMID: 17175069 DOI: 10.1016/j.peptides.2006.08.033] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/27/2006] [Revised: 08/04/2006] [Accepted: 08/04/2006] [Indexed: 01/02/2023]
Abstract
The role of the YXFGLa family of allatostatin (AST) peptides in dipterans is not well-established. The recent completion of sequencing of genomes for multiple Drosophila species provides an opportunity to study the evolutionary variation of the allatostatins and to examine regulatory elements that control gene expression. We performed comparative analyses of Ast genes from seven Drosophila species (Drosophila melanogaster, Drosophila simulans, Drosophila ananassae, Drosophila yakuba, Drosophila pseudoobscura, Drosophila mojavensis, and Drosophila grimshawi) and used phylogenetic footprinting methods to identify conserved noncoding motifs, which are candidates for regulatory regions. The peptides encoded by the Ast precursor are nearly identical across species with the exception of AST-1, in which the leading residue may be either methionine or valine. Phylogenetic footprinting predicts as few as 3, to as many as 17 potential regulatory sites depending on the parameters used during analysis. These include a Hunchback motif approximately 1.2 kb upstream of the open reading frame (ORF), overlapping motifs for two Broad-complex isoforms in the first intron, and a CF2-II motif located in the 3'-UTR. Understanding the regulatory elements involved in Ast expression may provide insight into the function of this neuropeptide family.
Affiliation(s)
- P R F Bowser
- Department of Zoology, University of Toronto, 25 Harbord Street, Toronto, Ont. M5S 3G5, Canada
|
45
|
Yan B, Yang X, Lee TL, Friedman J, Tang J, Van Waes C, Chen Z. Genome-wide identification of novel expression signatures reveal distinct patterns and prevalence of binding motifs for p53, nuclear factor-kappaB and other signal transcription factors in head and neck squamous cell carcinoma. Genome Biol 2007; 8:R78. [PMID: 17498291 PMCID: PMC1929156 DOI: 10.1186/gb-2007-8-5-r78] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2006] [Revised: 02/07/2007] [Accepted: 05/11/2007] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Differentially expressed gene profiles have previously been observed among pathologically defined cancers by microarray technologies, including head and neck squamous cell carcinomas (HNSCCs). However, the molecular expression signatures and transcriptional regulatory controls that underlie the heterogeneity in HNSCCs are not well defined. RESULTS Genome-wide cDNA microarray profiling of ten HNSCC cell lines revealed novel gene expression signatures that distinguished cancer cell subsets associated with p53 status. Three major clusters of over-expressed genes (A to C) were defined through hierarchical clustering, Gene Ontology, and statistical modeling. The promoters of genes in these clusters exhibited different patterns and prevalence of transcription factor binding sites for p53, nuclear factor-kappaB (NF-kappaB), activator protein (AP)-1, signal transducer and activator of transcription (STAT)3 and early growth response (EGR)1, as compared with the frequency in vertebrate promoters. Cluster A genes involved in chromatin structure and function exhibited enrichment for p53 and decreased AP-1 binding sites, whereas clusters B and C, containing cytokine and antiapoptotic genes, exhibited a significant increase in prevalence of NF-kappaB binding sites. An increase in STAT3 and EGR1 binding sites was distributed among the over-expressed clusters. Novel regulatory modules containing p53 or NF-kappaB concomitant with other transcription factor binding motifs were identified, and experimental data supported the predicted transcriptional regulation and binding activity. CONCLUSION The transcription factors p53, NF-kappaB, and AP-1 may be important determinants of the heterogeneous pattern of gene expression, whereas STAT3 and EGR1 may broadly enhance gene expression in HNSCCs. 
Defining these novel gene signatures and regulatory mechanisms will be important for establishing new molecular classifications and subtyping, which in turn will promote development of targeted therapeutics for HNSCC.
Affiliation(s)
- Bin Yan
- Head and Neck Surgery Branch, National Institute on Deafness and Other Communication Disorders, National Institutes of Health, Center Drive, Bethesda, Maryland 20892, USA
| | - Xinping Yang
- Head and Neck Surgery Branch, National Institute on Deafness and Other Communication Disorders, National Institutes of Health, Center Drive, Bethesda, Maryland 20892, USA
| | - Tin-Lap Lee
- Laboratory of Clinical Genomics, National Institute of Child Health and Human Development, National Institutes of Health, Convent Drive, Bethesda, MD 20892, USA
| | - Jay Friedman
- Head and Neck Surgery Branch, National Institute on Deafness and Other Communication Disorders, National Institutes of Health, Center Drive, Bethesda, Maryland 20892, USA
| | - Jun Tang
- Department of Preventive Medicine, University of Tennessee, Health Science Center, N Pauline St., Memphis, TN 38163, USA
| | - Carter Van Waes
- Head and Neck Surgery Branch, National Institute on Deafness and Other Communication Disorders, National Institutes of Health, Center Drive, Bethesda, Maryland 20892, USA
| | - Zhong Chen
- Head and Neck Surgery Branch, National Institute on Deafness and Other Communication Disorders, National Institutes of Health, Center Drive, Bethesda, Maryland 20892, USA
|
46
|
Haberer G, Mader MT, Kosarev P, Spannagl M, Yang L, Mayer KFX. Large-scale cis-element detection by analysis of correlated expression and sequence conservation between Arabidopsis and Brassica oleracea. PLANT PHYSIOLOGY 2006; 142:1589-602. [PMID: 17028152 PMCID: PMC1676041 DOI: 10.1104/pp.106.085639] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]
Abstract
The rapidly increasing amount of plant genomic sequences allows for the detection of cis-elements through comparative methods. In addition, large-scale gene expression data for Arabidopsis (Arabidopsis thaliana) have recently become available. Coexpression and evolutionarily conserved sequences are criteria widely used to identify shared cis-regulatory elements. In our study, we employ an integrated approach to combine two sources of information, coexpression and sequence conservation. Best-candidate orthologous promoter sequences were identified by a bidirectional best blast hit strategy in genome survey sequences from Brassica oleracea. The analysis of 779 microarrays from 81 different experiments provided detailed expression information for Arabidopsis genes coexpressed in multiple tissues and under various conditions and developmental stages. We discovered candidate transcription factor binding sites in 64% of the Arabidopsis genes analyzed. Among them, we detected experimentally verified binding sites and showed strong enrichment of shared cis-elements within functionally related genes. This study demonstrates the value of partially shotgun sequenced genomes and their combinatorial use with functional genomics data to address complex questions in comparative genomics.
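The bidirectional best BLAST hit strategy named in the abstract above can be sketched as follows. The sketch assumes BLAST results have already been parsed into dictionaries mapping each query to a list of (subject, bit score) pairs; the type alias, the function names, and the use of bit score as the ranking key are illustrative assumptions rather than the authors' implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical parsed-hit structure: query id -> [(subject id, bit score)].
Hits = Dict[str, List[Tuple[str, float]]]


def best_hit(hits: Hits) -> Dict[str, str]:
    """Map each query to its single highest-scoring subject."""
    best = {}
    for query, candidates in hits.items():
        if candidates:
            best[query] = max(candidates, key=lambda c: c[1])[0]
    return best


def reciprocal_best_hits(a_vs_b: Hits, b_vs_a: Hits) -> List[Tuple[str, str]]:
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hit(a_vs_b)
    best_ba = best_hit(b_vs_a)
    return sorted(
        (a, b) for a, b in best_ab.items() if best_ba.get(b) == a
    )
```

Requiring agreement in both search directions is a conservative way to pick best-candidate orthologs, which matters when one genome is available only as fragmentary survey sequence.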
Affiliation(s)
- Georg Haberer
- Munich Information Center for Protein Sequences, Institute for Bioinformatics, GSF National Research Center for Environment and Health, 85764 Neuherberg, Germany
|
47
|
|
48
|
Karro JE, Yan Y, Zheng D, Zhang Z, Carriero N, Cayting P, Harrison P, Gerstein M. Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation. Nucleic Acids Res 2006; 35:D55-60. [PMID: 17099229 PMCID: PMC1669708 DOI: 10.1093/nar/gkl851] [Citation(s) in RCA: 138] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022] Open
Abstract
The Pseudogene.org knowledgebase serves as a comprehensive repository for pseudogene annotation. The definition of a pseudogene varies within the literature, resulting in significantly different approaches to the problem of identification. Consequently, it is difficult to maintain a consistent collection of pseudogenes in the detail necessary for their effective use. Our database is designed to address this issue. It integrates a variety of heterogeneous resources and supports a subset structure that highlights specific groups of pseudogenes that are of interest to the research community. Tools are provided for the comparison of sets and the creation of layered set unions, enabling researchers to derive a current ‘consensus’ set of pseudogenes. Additional features include versatile search, the capacity for robust interaction with other databases, the ability to reconstruct older versions of the database (accounting for changing genome builds) and an underlying object-oriented interface designed for researchers with a minimal knowledge of programming. At the present time, the database contains more than 100 000 pseudogenes spanning 64 prokaryote and 11 eukaryote genomes, including a collection of human annotations compiled from 16 sources.
Affiliation(s)
- John E Karro
- Center for Comparative Genomics and Bioinformatics, 506B Wartik, Pennsylvania State University, University Park, PA 16802, USA.
|
49
|
Sharov AA, Dudekula DB, Ko MSH. CisView: a browser and database of cis-regulatory modules predicted in the mouse genome. DNA Res 2006; 13:123-34. [PMID: 16980320 DOI: 10.1093/dnares/dsl005] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022] Open
Abstract
To facilitate the analysis of gene regulatory regions of the mouse genome, we developed CisView (http://lgsun.grc.nia.nih.gov/cisview), a browser and database of genome-wide potential transcription factor binding sites (TFBSs). The sites were identified using 134 position-weight matrices and 219 sequence patterns from various sources and were presented with information about sequence conservation, neighboring genes and their structures, GO annotations, protein domains, DNA repeats and CpG islands. Analysis of the distribution of TFBSs revealed that many TFBSs (N = 145) were over-represented near transcription start sites. We also identified potential cis-regulatory modules (CRMs), defined as clusters of conserved TFBSs, in the entire mouse genome. Out of 739 074 CRMs, 157 442 had a significantly higher regulatory potential score than semi-random sequences generated with a 3rd-order Markov process. The CisView browser provides a user-friendly computer environment for studying transcription regulation on a whole-genome scale and can also be used for interpreting microarray experiments and identifying putative targets of transcription factors.
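The idea of a background of semi-random sequences generated with a 3rd-order Markov process can be sketched as below: each new base is drawn according to how often it followed the preceding three bases in a training sequence. The function names, the uniform fallback for unseen contexts, and the seeding scheme are assumptions made for illustration and do not describe the CisView implementation.

```python
import random
from collections import defaultdict


def train_markov(seq: str, order: int = 3):
    """Count which base follows each k-mer context (k = order)."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(seq) - order):
        counts[seq[i:i + order]][seq[i + order]] += 1
    return counts


def generate(counts, length: int, seed_context: str, rng: random.Random) -> str:
    """Emit a sequence whose k-mer transitions follow the trained counts."""
    order = len(seed_context)
    out = list(seed_context)
    while len(out) < length:
        context = "".join(out[-order:])
        successors = counts.get(context)
        if not successors:
            # Assumption: fall back to a uniform draw for unseen contexts.
            out.append(rng.choice("ACGT"))
            continue
        bases = list(successors)
        weights = [successors[b] for b in bases]
        out.append(rng.choices(bases, weights=weights)[0])
    return "".join(out[:length])
```

Sequences generated this way preserve the short-range composition of the training genome, so a score that stands out against them is less likely to be explained by local base composition alone.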
Affiliation(s)
- Alexei A Sharov
- Developmental Genomics and Aging Section, Laboratory of Genetics, National Institute on Aging, National Institutes of Health, 333 Cassell Drive, Suite 3000, Baltimore, MD 21224, USA
|
50
|
Mariño-Ramírez L, Jordan IK. Transposable element derived DNaseI-hypersensitive sites in the human genome. Biol Direct 2006; 1:20. [PMID: 16857058 PMCID: PMC1538576 DOI: 10.1186/1745-6150-1-20] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2006] [Accepted: 07/20/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Transposable elements (TEs) are abundant genomic sequences that have been found to contribute to genome evolution in unexpected ways. Here, we characterize the evolutionary and functional characteristics of TE-derived human genome regulatory sequences uncovered by the high-throughput mapping of DNaseI-hypersensitive (HS) sites. RESULTS Human genome TEs were found to contribute substantially to HS regulatory sequences characterized in CD4+ T cells: 23% of HS sites contain TE-derived sequences. While HS sites are far more evolutionarily conserved than non-HS sites in the human genome, consistent with their functional importance, TE-derived HS sites are highly divergent. Nevertheless, TE-derived HS sites were shown to be functionally relevant in terms of driving gene expression in CD4+ T cells. Genes involved in immune response are statistically over-represented among genes with TE-derived HS sites. A number of genes with both TE-derived HS sites and immune tissue-related expression patterns were found to encode proteins involved in immune response, such as T cell-specific receptor antigens and secreted cytokines, as well as proteins with clinical relevance to HIV and cancer. Genes with TE-derived HS sites have higher average levels of sequence and expression divergence between human and mouse orthologs compared to genes with non-TE-derived HS sites. CONCLUSION The results reported here support the notion that TEs provide a specific genome-wide mechanism for generating functionally relevant gene regulatory divergence between evolutionary lineages. REVIEWERS This article was reviewed by Wolfgang J. Miller (nominated by Jerzy Jurka), Itai Yanai and Mikhail S. Gelfand.
Affiliation(s)
- Leonardo Mariño-Ramírez
- National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20894, USA
| | - I King Jordan
- National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20894, USA
|