1. Mekkaoui F, Drewell RA, Dresch JM, Spratt DE. Experimental approaches to investigate biophysical interactions between homeodomain transcription factors and DNA. Biochimica et Biophysica Acta. Gene Regulatory Mechanisms 2025; 1868:195074. [PMID: 39644990 PMCID: PMC11832328 DOI: 10.1016/j.bbagrm.2024.195074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2024] [Revised: 11/26/2024] [Accepted: 12/01/2024] [Indexed: 12/09/2024]
Abstract
Homeodomain transcription factors (TFs) bind to specific DNA sequences to regulate the expression of target genes. Structural work has provided insight into molecular identities and aided in unraveling structural features of these TFs. However, the detailed affinity and specificity with which these TFs bind to DNA sequences are still largely unknown. Qualitative methods, such as DNA footprinting, Electrophoretic Mobility Shift Assays (EMSAs), Systematic Evolution of Ligands by Exponential Enrichment (SELEX), Bacterial One Hybrid (B1H) systems, Surface Plasmon Resonance (SPR), and Protein Binding Microarrays (PBMs), have been widely used to investigate the biochemical characteristics of TF-DNA binding events. In addition to these qualitative methods, bioinformatic approaches have also assisted in TF binding site discovery. Here we discuss the advantages and limitations of these different approaches, as well as the benefits of utilizing more quantitative approaches, such as Mechanically Induced Trapping of Molecular Interactions (MITOMI), Microscale Thermophoresis (MST) and Isothermal Titration Calorimetry (ITC), in determining the biophysical basis of binding specificity of TF-DNA complexes and improving upon existing computational approaches aimed at affinity predictions.
Affiliation(s)
- Fadwa Mekkaoui
  - Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, MA 01610, United States of America
- Robert A Drewell
  - Biology Department, Clark University, 950 Main Street, Worcester, MA 01610, United States of America
- Jacqueline M Dresch
  - Biology Department, Clark University, 950 Main Street, Worcester, MA 01610, United States of America
- Donald E Spratt
  - Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, MA 01610, United States of America
2. Kondratova L, Vallejos CE, Conesa A. Profiling conserved transcription factor binding motifs in Phaseolus vulgaris through comparative genomics. BMC Genomics 2025; 26:169. [PMID: 39979816 PMCID: PMC11841308 DOI: 10.1186/s12864-025-11309-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2024] [Accepted: 01/29/2025] [Indexed: 02/22/2025] Open
Abstract
Common bean (Phaseolus vulgaris), a staple food in Latin America and Africa, serves as a vital source of energy, protein, and essential minerals for millions of people. However, genomics knowledge that breeders could leverage for improvement of this crop is scarce. We have developed and validated a comparative genomics approach to predict conserved transcription factor binding sites (TFBS) in common bean and studied gene regulatory networks. We analyzed promoter regions and identified TFBS for 12,631 bean genes with an average of 6 conserved motifs per gene. Moreover, we discovered a statistically significant relationship between the number of conserved motifs and the amount of available experimental evidence of gene regulation. Notably, ERF, MYB, and bHLH transcription factor families dominated conserved motifs, with implications for starch biosynthesis regulation. Furthermore, we provide gene regulatory data as a resource that can be interrogated for the regulatory landscape of any set of genes. Our results underscore the significance of TFBS conservation in legumes and align with the notion that core genes often exhibit a more conserved regulatory makeup. The study demonstrates the effectiveness of a comparative genomics approach for addressing genome information gaps in non-model organisms and provides valuable insights into the regulatory networks governing starch biosynthesis genes that can support crop improvement programs.
Affiliation(s)
- Liudmyla Kondratova
  - Genetics & Genomics Graduate Program, University of Florida, Gainesville, FL, USA
- C Eduardo Vallejos
  - Genetics & Genomics Graduate Program, University of Florida, Gainesville, FL, USA
  - Horticultural Sciences Department, University of Florida, Gainesville, FL, USA
- Ana Conesa
  - Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Spain
3. Cao Y, Hong J, Zhao Y, Li X, Feng X, Wang H, Zhang L, Lin M, Cai Y, Han Y. De novo gene integration into regulatory networks via interaction with conserved genes in peach. Horticulture Research 2024; 11:uhae252. [PMID: 39664695 PMCID: PMC11630308 DOI: 10.1093/hr/uhae252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2024] [Accepted: 08/29/2024] [Indexed: 12/13/2024]
Abstract
De novo genes can evolve "from scratch" from noncoding sequences, acquiring novel functions in organisms and integrating into regulatory networks during evolution to drive innovations in important phenotypes and traits. However, identifying de novo genes is challenging, as it requires high-quality genomes from closely related species. Based on a comparison with nine closely related Prunus genomes, we identified at least 178 de novo genes in P. persica "baifeng". Distinct differences were observed between de novo and conserved genes in gene characteristics and expression patterns. Gene ontology enrichment analysis suggested that Type I de novo genes originated from sequences related to plastid modification functions, while Type II genes were inferred to have derived from sequences related to reproductive functions. Finally, transcriptome sequencing across different tissues and developmental stages suggested that de novo genes have been evolutionarily recruited into existing regulatory networks, playing important roles in plant growth and development, a conclusion also supported by WGCNA and quantitative trait loci data. This study lays the groundwork for future research on the origins and functions of genes in Prunus and related taxa.
Affiliation(s)
- Yunpeng Cao
  - State Key Laboratory of Plant Diversity and Specialty Crops, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
- Jiayi Hong
  - College of Life Sciences, Anhui Agricultural University, Hefei 230036, China
- Yun Zhao
  - State Key Laboratory of Plant Diversity and Specialty Crops, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
- Xiaoxu Li
  - Beijing Life Science Academy, Beijing 102209, China
- Xiaofeng Feng
  - College of Life Sciences, Anhui Agricultural University, Hefei 230036, China
- Han Wang
  - Key Laboratory of Horticultural Crop Germplasm Innovation and Utilization (Co-construction by Ministry and Province), Institute of Horticulture, Anhui Academy of Agricultural Sciences, Hefei 230000, China
- Lin Zhang
  - Hubei Shizhen Laboratory, Hubei Key Laboratory of Theory and Application Research of Liver and Kidney in Traditional Chinese Medicine, School of Basic Medical Sciences, Hubei University of Chinese Medicine, Wuhan 430065, China
- Mengfei Lin
  - Jiangxi Provincial Key Laboratory of Plantation and High Valued Utilization of Specialty Fruit Tree and Tea, Institute of Biological Resources, Jiangxi Academy of Sciences, Nanchang 330224 Jiangxi, China
- Yongping Cai
  - College of Life Sciences, Anhui Agricultural University, Hefei 230036, China
- Yuepeng Han
  - State Key Laboratory of Plant Diversity and Specialty Crops, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
4. Keränen SVE, Villahoz-Baleta A, Bruno AE, Halfon MS. REDfly: An Integrated Knowledgebase for Insect Regulatory Genomics. Insects 2022; 13:618. [PMID: 35886794 PMCID: PMC9323752 DOI: 10.3390/insects13070618] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/16/2022] [Revised: 07/01/2022] [Accepted: 07/06/2022] [Indexed: 11/29/2022]
Abstract
We provide here an updated description of the REDfly (Regulatory Element Database for Fly) database of transcriptional regulatory elements, a unique resource that provides regulatory annotation for the genome of Drosophila and other insects. The genomic sequences regulating insect gene expression, namely transcriptional cis-regulatory modules (CRMs, e.g., "enhancers") and transcription factor binding sites (TFBSs), are not currently curated by any other major database resources. However, knowledge of such sequences is important, as CRMs play critical roles with respect to disease as well as normal development, phenotypic variation, and evolution. Characterized CRMs also provide useful tools for both basic and applied research, including developing methods for insect control. REDfly, which is the most detailed existing platform for metazoan regulatory-element annotation, includes over 40,000 experimentally verified CRMs and TFBSs along with their DNA sequences, their associated genes, and the expression patterns they direct. Here, we briefly describe REDfly's contents and data model, with an emphasis on the new features implemented since 2020. We then provide an illustrated walk-through of several common REDfly search use cases.
Affiliation(s)
- Angel Villahoz-Baleta
  - Center for Computational Research, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Andrew E. Bruno
  - Center for Computational Research, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Marc S. Halfon
  - New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Biomedical Informatics, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Molecular and Cellular Biology and Program in Cancer Genetics, Roswell Park Cancer Institute, Buffalo, NY 14263, USA
5. Krieger G, Lupo O, Wittkopp P, Barkai N. Evolution of transcription factor binding through sequence variations and turnover of binding sites. Genome Res 2022; 32:1099-1111. [PMID: 35618416 PMCID: PMC9248875 DOI: 10.1101/gr.276715.122] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/28/2022] [Accepted: 05/20/2022] [Indexed: 01/08/2023]
Abstract
Variations in noncoding regulatory sequences play a central role in evolution. Interpreting such variations, however, remains difficult even in the context of defined attributes such as transcription factor (TF) binding sites. Here, we systematically link variations in cis-regulatory sequences to TF binding by profiling the allele-specific binding of 27 TFs expressed in a yeast hybrid, in which two related genomes are present within the same nucleus. TFs localize preferentially to sites containing their known consensus motifs but occupy only a small fraction of the motif-containing sites available within the genomes. Differential binding of TFs to the orthologous alleles was well explained by variations that alter motif sequence, whereas differences in chromatin accessibility between alleles were of little apparent effect. Motif variations that abolished binding when present in only one allele were still bound when present in both alleles, suggesting evolutionary compensation, with a potential role for sequence conservation at the motif's vicinity. At the level of the full promoter, we identify cases of binding-site turnover, in which binding sites are reciprocally gained and lost, yet most interspecific differences remained uncompensated. Our results show the flexibility of TFs to bind imprecise motifs and the fast evolution of TF binding sites between related species.
Affiliation(s)
- Gat Krieger
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Offir Lupo
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Patricia Wittkopp
  - Department of Ecology and Evolutionary Biology, Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
- Naama Barkai
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
6. Krinsky BH, Arthur RK, Xia S, Sosa D, Arsala D, White KP, Long M. Rapid Cis-Trans Coevolution Driven by a Novel Gene Retroposed from a Eukaryotic Conserved CCR4-NOT Component in Drosophila. Genes (Basel) 2021; 13:57. [PMID: 35052398 PMCID: PMC8774992 DOI: 10.3390/genes13010057] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/20/2021] [Revised: 12/10/2021] [Accepted: 12/23/2021] [Indexed: 12/11/2022] Open
Abstract
Young, or newly evolved, genes arise ubiquitously across the tree of life, and they can rapidly acquire novel functions that influence a diverse array of biological processes. Previous work identified a young regulatory duplicate gene in Drosophila, Zeus, that unexpectedly diverged rapidly from its parent, Caf40, an extremely conserved component of the CCR4-NOT machinery in post-transcriptional and post-translational regulation of eukaryotic cells, and took on roles in the male reproductive system. This neofunctionalization was accompanied by differential binding of the Zeus protein to loci throughout the Drosophila melanogaster genome. However, the way in which new DNA-binding proteins acquire and coevolve with their targets in the genome is not understood. Here, by comparing Zeus ChIP-Seq data from D. melanogaster and D. simulans to the ancestral Caf40 binding events from D. yakuba, a species that diverged before the duplication event, we found a dynamic pattern in which Zeus binding rapidly coevolved with a previously unknown DNA motif, which we term Caf40 and Zeus-Associated Motif (CAZAM), under the influence of positive selection. Interestingly, while both copies of Zeus acquired targets at male-biased and testis-specific genes, D. melanogaster and D. simulans proteins have specialized binding on different chromosomes, a pattern echoed in the evolution of the associated motif. Using CRISPR-Cas9-mediated gene knockout of Zeus and RNA-Seq, we found that Zeus regulated the expression of 661 differentially expressed genes (DEGs). Our results suggest that the evolution of young regulatory genes can be coupled to substantial rewiring of the transcriptional networks into which they integrate, even over short evolutionary timescales. Our results thus uncover dynamic genome-wide evolutionary processes associated with new genes.
Affiliation(s)
- Benjamin H. Krinsky
  - Committee on Evolutionary Biology, University of Chicago, Chicago, IL 60637, USA
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Robert K. Arthur
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
  - Institute for Genomics and Systems Biology, Department of Human Genetics, University of Chicago and Argonne National Laboratory, Chicago, IL 60637, USA
- Shengqian Xia
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Dylan Sosa
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Deanna Arsala
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Kevin P. White
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
  - Institute for Genomics and Systems Biology, Department of Human Genetics, University of Chicago and Argonne National Laboratory, Chicago, IL 60637, USA
- Manyuan Long
  - Committee on Evolutionary Biology, University of Chicago, Chicago, IL 60637, USA
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
7. Schweizer G, Wagner A. Both Binding Strength and Evolutionary Accessibility Affect the Population Frequency of Transcription Factor Binding Sequences in Arabidopsis thaliana. Genome Biol Evol 2021; 13:6459646. [PMID: 34894231 PMCID: PMC8712246 DOI: 10.1093/gbe/evab273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/06/2021] [Indexed: 11/22/2022] Open
Abstract
Mutations in DNA sequences that bind transcription factors and thus modulate gene expression are a source of adaptive variation in gene expression. To understand how transcription factor binding sequences evolve in natural populations of the thale cress Arabidopsis thaliana, we integrated genomic polymorphism data for loci bound by transcription factors with in vitro data on binding affinity for these transcription factors. Specifically, we studied 19 different transcription factors, and the allele frequencies of 8,333 genomic loci bound in vivo by these transcription factors in 1,135 A. thaliana accessions. We find that transcription factor binding sequences show very low genetic diversity, suggesting that they are subject to purifying selection. High frequency alleles of such binding sequences tend to bind transcription factors strongly. Conversely, alleles that are absent from the population tend to bind them weakly. In addition, alleles with high frequencies also tend to be the endpoints of many accessible evolutionary paths leading to these alleles. We show that both high affinity and high evolutionary accessibility contribute to high allele frequency for at least some transcription factors. Although binding sequences with stronger affinity are more frequent, we did not find them to be associated with higher gene expression levels. Epistatic interactions among individual mutations that alter binding affinity are pervasive and can help explain variation in accessibility among binding sequences. In summary, combining in vitro binding affinity data with in vivo binding sequence data can help understand the forces that affect the evolution of transcription factor binding sequences in natural populations.
Affiliation(s)
- Gabriel Schweizer
  - Department of Evolutionary Biology and Environmental Studies, University of Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Quartier Sorge-Batiment Genopode, Lausanne, Switzerland
- Andreas Wagner
  - Department of Evolutionary Biology and Environmental Studies, University of Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Quartier Sorge-Batiment Genopode, Lausanne, Switzerland
  - Santa Fe Institute, Santa Fe, New Mexico, USA
  - Stellenbosch Institute for Advanced Study (STIAS), Wallenberg Research Centre at Stellenbosch University, South Africa
8. Joshi M, Kapopoulou A, Laurent S. Impact of Genetic Variation in Gene Regulatory Sequences: A Population Genomics Perspective. Front Genet 2021; 12:660899. [PMID: 34276769 PMCID: PMC8282999 DOI: 10.3389/fgene.2021.660899] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/29/2021] [Accepted: 05/31/2021] [Indexed: 01/06/2023] Open
Abstract
The unprecedented rise of high-throughput sequencing and assay technologies has provided a detailed insight into the non-coding sequences and their potential role as gene expression regulators. These regulatory non-coding sequences are also referred to as cis-regulatory elements (CREs). Genetic variants occurring within CREs have been shown to be associated with altered gene expression and phenotypic changes. Such variants are known to occur spontaneously and ultimately get fixed, due to selection and genetic drift, in natural populations and, in some cases, pave the way for speciation. Hence, the study of genetic variation at CREs has improved our overall understanding of the processes of local adaptation and evolution. Recent advances in high-throughput sequencing and better annotations of CREs have enabled the evaluation of the impact of such variation on gene expression, phenotypic alteration and fitness. Here, we review recent research on the evolution of CREs and concentrate on studies that have investigated genetic variation occurring in these regulatory sequences within the context of population genetics.
Affiliation(s)
- Manas Joshi
  - Department of Comparative Development and Genetics, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Stefan Laurent
  - Department of Comparative Development and Genetics, Max Planck Institute for Plant Breeding Research, Cologne, Germany
9. Asma H, Halfon MS. Annotating the Insect Regulatory Genome. Insects 2021; 12:591. [PMID: 34209769 PMCID: PMC8305585 DOI: 10.3390/insects12070591] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/28/2021] [Revised: 06/23/2021] [Accepted: 06/25/2021] [Indexed: 11/17/2022]
Abstract
An ever-growing number of insect genomes is being sequenced across the evolutionary spectrum. Comprehensive annotation of not only genes but also regulatory regions is critical for reaping the full benefits of this sequencing. Driven by developments in sequencing technologies and in both empirical and computational discovery strategies, the past few decades have witnessed dramatic progress in our ability to identify cis-regulatory modules (CRMs), sequences such as enhancers that play a major role in regulating transcription. Nevertheless, providing a timely and comprehensive regulatory annotation of newly sequenced insect genomes is an ongoing challenge. We review here the methods being used to identify CRMs in both model and non-model insect species, and focus on two tools that we have developed, REDfly and SCRMshaw. These resources can be paired together in a powerful combination to facilitate insect regulatory annotation over a broad range of species, with an accuracy equal to or better than that of other state-of-the-art methods.
Affiliation(s)
- Hasiba Asma
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Marc S. Halfon
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - NY State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, NY 14203, USA
10. Hatleberg WL, Hinman VF. Modularity and hierarchy in biological systems: Using gene regulatory networks to understand evolutionary change. Curr Top Dev Biol 2021; 141:39-73. [DOI: 10.1016/bs.ctdb.2020.11.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/13/2022]
11. Zrimec J, Börlin CS, Buric F, Muhammad AS, Chen R, Siewers V, Verendel V, Nielsen J, Töpel M, Zelezniak A. Deep learning suggests that gene expression is encoded in all parts of a co-evolving interacting gene regulatory structure. Nat Commun 2020; 11:6141. [PMID: 33262328 PMCID: PMC7708451 DOI: 10.1038/s41467-020-19921-4] [Citation(s) in RCA: 88] [Impact Index Per Article: 17.6] [Received: 12/18/2019] [Accepted: 11/02/2020] [Indexed: 12/31/2022] Open
Abstract
Understanding the genetic regulatory code governing gene expression is an important challenge in molecular biology. However, how individual coding and non-coding regions of the gene regulatory structure interact and contribute to mRNA expression levels remains unclear. Here we apply deep learning to over 20,000 mRNA datasets to examine the genetic regulatory code controlling mRNA abundance in 7 model organisms ranging from bacteria to human. In all organisms, we can predict mRNA abundance directly from DNA sequence, with up to 82% of the variation of transcript levels encoded in the gene regulatory structure. By searching for DNA regulatory motifs across the gene regulatory structure, we discover that motif interactions could explain the whole dynamic range of mRNA levels. Co-evolution across coding and non-coding regions suggests that it is not single motifs or regions, but the entire gene regulatory structure and specific combination of regulatory elements that define gene expression levels.
Affiliation(s)
- Jan Zrimec
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Kemivägen 10, SE-412 96, Gothenburg, Sweden
- Christoph S Börlin
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Kemivägen 10, SE-412 96, Gothenburg, Sweden
  - Novo Nordisk Foundation Center for Biosustainability, Chalmers University of Technology, Kemivägen 10, SE-412 96, Gothenburg, Sweden
- Filip Buric
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Kemivägen 10, SE-412 96, Gothenburg, Sweden
- Azam Sheikh Muhammad
  - Computer Science and Engineering, Chalmers University of Technology, Kemivägen 10, SE-412 96, Gothenburg, Sweden
- Rhongzen Chen
  - Computer Science and Engineering, Chalmers University of Technology, Kemivägen 10, SE-412 96, Gothenburg, Sweden
- Verena Siewers
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Kemivägen 10, SE-412 96, Gothenburg, Sweden
  - Novo Nordisk Foundation Center for Biosustainability, Chalmers University of Technology, Kemivägen 10, SE-412 96, Gothenburg, Sweden
- Vilhelm Verendel
  - Computer Science and Engineering, Chalmers University of Technology, Kemivägen 10, SE-412 96, Gothenburg, Sweden
- Jens Nielsen
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Kemivägen 10, SE-412 96, Gothenburg, Sweden
  - Novo Nordisk Foundation Center for Biosustainability, Chalmers University of Technology, Kemivägen 10, SE-412 96, Gothenburg, Sweden
- Mats Töpel
  - Department of Marine Sciences, University of Gothenburg, Box 461, SE-405 30, Gothenburg, Sweden
  - Gothenburg Global Biodiversity Center (GGBC), Box 461, 40530, Gothenburg, Sweden
- Aleksej Zelezniak
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Kemivägen 10, SE-412 96, Gothenburg, Sweden
  - Science for Life Laboratory, Tomtebodavägen 23a, SE-171 65, Stockholm, Sweden
12. Rivera J, Keränen SVE, Gallo SM, Halfon MS. REDfly: the transcriptional regulatory element database for Drosophila. Nucleic Acids Res 2019; 47:D828-D834. [PMID: 30329093 PMCID: PMC6323911 DOI: 10.1093/nar/gky957] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 09/06/2018] [Accepted: 10/04/2018] [Indexed: 12/21/2022] Open
Abstract
The REDfly database provides a comprehensive curation of experimentally-validated Drosophila transcriptional cis-regulatory elements and includes information on DNA sequence, experimental evidence, patterns of regulated gene expression, and more. Now in its thirteenth year, REDfly has grown to over 23 000 records of tested reporter gene constructs and 2200 tested transcription factor binding sites. Recent developments include the start of curation of predicted cis-regulatory modules in addition to experimentally-verified ones, improved search and filtering, and increased interaction with the authors of curated papers. An expanded data model that will capture information on temporal aspects of gene regulation, regulation in response to environmental and other non-developmental cues, sexually dimorphic gene regulation, and non-endogenous (ectopic) aspects of reporter gene expression is under development and expected to be in place within the coming year. REDfly is freely accessible at http://redfly.ccr.buffalo.edu, and news about database updates and new features can be followed on Twitter at @REDfly_database.
Affiliation(s)
- John Rivera
  - Center for Computational Research, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Steven M Gallo
  - Center for Computational Research, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Marc S Halfon
  - New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Biomedical Informatics, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
  - Department of Molecular and Cellular Biology and Program in Cancer Genetics, Roswell Park Cancer Institute, Buffalo, NY 14263, USA
13. Mehrotra R, Loake G, Mehrotra S. Promoter choice: Selection vs. rejection. Gene Reports 2018. [DOI: 10.1016/j.genrep.2018.08.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
14. Combs PA, Fraser HB. Spatially varying cis-regulatory divergence in Drosophila embryos elucidates cis-regulatory logic. PLoS Genet 2018; 14:e1007631. [PMID: 30383747 PMCID: PMC6211617 DOI: 10.1371/journal.pgen.1007631] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 06/15/2018] [Accepted: 08/14/2018] [Indexed: 12/30/2022] Open
Abstract
Spatial patterning of gene expression is a key process in development, yet how it evolves is still poorly understood. Both cis- and trans-acting changes could participate in complex interactions, so to isolate the cis-regulatory component of patterning evolution, we measured allele-specific spatial gene expression patterns in D. melanogaster × simulans hybrid embryos. RNA-seq of cryo-sectioned slices revealed 66 genes with strong spatially varying allele-specific expression. We found that hunchback, a major regulator of developmental patterning, had reduced expression of the D. simulans allele specifically in the anterior tip of hybrid embryos. Mathematical modeling of hunchback cis-regulation suggested a candidate transcription factor binding site variant, which we verified as causal using CRISPR-Cas9 genome editing. In sum, even comparing morphologically near-identical species we identified surprisingly extensive spatial variation in gene expression, suggesting not only that development is robust to many such changes, but also that natural selection may have ample raw material for evolving new body plans via changes in spatial patterning.
Affiliation(s)
- Peter A. Combs
- Department of Biology, Stanford University, Stanford, California, United States of America
| | - Hunter B. Fraser
- Department of Biology, Stanford University, Stanford, California, United States of America
| |
|
15
|
Characterization of dFOXO binding sites upstream of the Insulin Receptor P2 promoter across the Drosophila phylogeny. PLoS One 2017; 12:e0188357. [PMID: 29200426 PMCID: PMC5714339 DOI: 10.1371/journal.pone.0188357] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/06/2017] [Accepted: 11/06/2017] [Indexed: 01/01/2023]
Abstract
The insulin/TOR signal transduction pathway plays a critical role in determining such important traits as body and organ size, metabolic homeostasis and life span. Although this pathway is highly conserved across the animal kingdom, the affected traits can exhibit important differences even between closely related species. Evolutionary studies of regulatory regions require the reliable identification of transcription factor binding sites. Here we have focused on the Insulin Receptor (InR) expression from its P2 promoter in the Drosophila genus, which in D. melanogaster is up-regulated by hypophosphorylated Drosophila FOXO (dFOXO). We have finely characterized this transcription factor's binding sites in vitro along the 1.3 kb region upstream of the InR P2 promoter in five Drosophila species. Moreover, we have tested the effect of mutations in the characterized dFOXO sites of D. melanogaster in transgenic flies. The number of experimentally established binding sites varies across the 1.3 kb region of any particular species, and their distribution also differs among species. In D. melanogaster, InR expression from P2 is differentially affected by dFOXO binding sites at the proximal and distal halves of the species' 1.3 kb fragment. The observed uneven distribution of binding sites across this fragment might underlie their differential contribution to regulating InR transcription.
|
16
|
Barr KA, Martinez C, Moran JR, Kim AR, Ramos AF, Reinitz J. Synthetic enhancer design by in silico compensatory evolution reveals flexibility and constraint in cis-regulation. BMC SYSTEMS BIOLOGY 2017; 11:116. [PMID: 29187214 PMCID: PMC5708098 DOI: 10.1186/s12918-017-0485-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 07/05/2017] [Accepted: 11/09/2017] [Indexed: 11/12/2022]
Abstract
BACKGROUND Models that incorporate specific chemical mechanisms have been successful in describing the activity of Drosophila developmental enhancers as a function of underlying transcription factor binding motifs. Despite this, the minimum set of mechanisms required to reconstruct an enhancer from its constituent parts is not known. Synthetic biology offers the potential to test the sufficiency of known mechanisms to describe the activity of enhancers, as well as to uncover constraints on the number, order, and spacing of motifs. RESULTS Using a functional model and in silico compensatory evolution, we generated putative synthetic even-skipped stripe 2 enhancers with varying degrees of similarity to the natural enhancer. These elements represent the evolutionary trajectories of the natural stripe 2 enhancer towards two synthetic enhancers designed ab initio. In the first trajectory, spatially regulated expression was maintained, even after more than a third of binding sites were lost. In the second, sequences with high similarity to the natural element did not drive expression, but a highly diverged sequence about half the length of the minimal stripe 2 enhancer drove ten times greater expression. Additionally, homotypic clusters of Zelda or Stat92E motifs, but not Bicoid, drove expression in developing embryos. CONCLUSIONS Here, we present a functional model of gene regulation to test the degree to which the known transcription factors and their interactions explain the activity of the Drosophila even-skipped stripe 2 enhancer. Initial success in the first trajectory showed that the gene regulation model explains much of the function of the stripe 2 enhancer. Cases where expression deviated from prediction indicate that undescribed factors likely act to modulate expression. We also showed that activation driven by Bicoid and Hunchback is highly sensitive to the spatial organization of binding motifs. In contrast, Zelda and Stat92E drive expression from simple homotypic clusters, suggesting that activation driven by these factors is less constrained. Collectively, the 40 sequences generated in this work provide a powerful training set for building future models of gene regulation.
Affiliation(s)
- Kenneth A Barr
- Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Zoology 111, 1101 E 57th St, Chicago, 60637, Illinois, USA.
- Department of Ecology and Evolution, The University of Chicago, Chicago, 60637, Illinois, USA.
| | - Carlos Martinez
- Department of Biochemistry and Molecular Genetics, Northwestern University, Chicago, 60611, Illinois, USA
| | - Jennifer R Moran
- Department of Human Genetics, The University of Chicago, Chicago, 60637, Illinois, USA
- Institute for Genomics & Systems Biology, The University of Chicago, Chicago, 60637, Illinois, USA
| | - Ah-Ram Kim
- School of Life Science, Handong Global University, Pohang, 37554, Gyeongbuk, South Korea
| | - Alexandre F Ramos
- Departamento de Radiologia - Faculdade de Medicina, Universidade de São Paulo & Instituto do Câncer do Estado de São Paulo, São Paulo, SP CEP, 05403-911, Brazil
- Escola de Artes, Ciências e Humanidades & Núcleo de Estudos Interdisciplinares em Sistemas Complexos, Universidade de São Paulo, Av. Arlindo Béttio, São Paulo, 1000 CEP 03828-000, SP, Brazil
| | - John Reinitz
- Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Zoology 111, 1101 E 57th St, Chicago, 60637, Illinois, USA
- Department of Ecology and Evolution, The University of Chicago, Chicago, 60637, Illinois, USA
- Institute for Genomics & Systems Biology, The University of Chicago, Chicago, 60637, Illinois, USA
- Department of Statistics, The University of Chicago, 5747 S. Ellis Avenue Jones 312, Chicago, 60637, IL, USA
| |
|
17
|
Gursky VV, Kozlov KN, Kulakovskiy IV, Zubair A, Marjoram P, Lawrie DS, Nuzhdin SV, Samsonova MG. Translating natural genetic variation to gene expression in a computational model of the Drosophila gap gene regulatory network. PLoS One 2017; 12:e0184657. [PMID: 28898266 PMCID: PMC5595321 DOI: 10.1371/journal.pone.0184657] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/05/2017] [Accepted: 08/28/2017] [Indexed: 11/18/2022]
Abstract
Annotating the genotype-phenotype relationship, and developing a proper quantitative description of the relationship, requires understanding the impact of natural genomic variation on gene expression. We apply a sequence-level model of gap gene expression in the early development of Drosophila to analyze single nucleotide polymorphisms (SNPs) in a panel of natural sequenced D. melanogaster lines. Using a thermodynamic modeling framework, we provide both analytical and computational descriptions of how single-nucleotide variants affect gene expression. The analysis reveals that the sequence variants increase (decrease) gene expression if located within binding sites of repressors (activators). We show that the sign of SNP influence (activation or repression) may change in time and space and elucidate the origin of this change in specific examples. The thermodynamic modeling approach predicts non-local and non-linear effects arising from SNPs, and combinations of SNPs, in individual fly genotypes. Simulation of individual fly genotypes using our model reveals that this non-linearity reduces to almost additive inputs from multiple SNPs. Further, we see signatures of the action of purifying selection in the gap gene regulatory regions. To infer the specific targets of purifying selection, we analyze the patterns of polymorphism in the data at two phenotypic levels: the strengths of binding and expression. We find that combinations of SNPs show evidence of being under selective pressure, while individual SNPs do not. The model predicts that SNPs appear to accumulate in the genotypes of the natural population in a way biased towards small increases in activating action on the expression pattern. Taken together, these results provide a systems-level view of how genetic variation translates to the level of gene regulatory networks via combinatorial SNP effects.
Affiliation(s)
- Vitaly V. Gursky
- Theoretical Department, Ioffe Institute, Saint Petersburg, Russia
- Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
| | - Konstantin N. Kozlov
- Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
| | - Ivan V. Kulakovskiy
- Engelhardt Institute of Molecular Biology, Moscow, Russia
- Vavilov Institute of General Genetics, Moscow, Russia
- Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, Moscow, Russia
| | - Asif Zubair
- Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
| | - Paul Marjoram
- Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
| | - David S. Lawrie
- Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
| | - Sergey V. Nuzhdin
- Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
| | - Maria G. Samsonova
- Systems Biology and Bioinformatics Laboratory, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg, Russia
| |
|
18
|
Chertkova AA, Schiffman JS, Nuzhdin SV, Kozlov KN, Samsonova MG, Gursky VV. In silico evolution of the Drosophila gap gene regulatory sequence under elevated mutational pressure. BMC Evol Biol 2017; 17:4. [PMID: 28251865 PMCID: PMC5333172 DOI: 10.1186/s12862-016-0866-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022]
Abstract
BACKGROUND Cis-regulatory sequences are often composed of many low-affinity transcription factor binding sites (TFBSs). Determining the evolutionary and functional importance of regulatory sequence composition is impeded without a detailed knowledge of the genotype-phenotype map. RESULTS We simulate the evolution of regulatory sequences involved in Drosophila melanogaster embryo segmentation during early development. Natural selection evaluates gene expression dynamics produced by a computational model of the developmental network. We observe a dramatic decrease in the total number of transcription factor binding sites through the course of evolution. Despite a decrease in average sequence binding energies through time, the regulatory sequences tend towards organisations containing increased high affinity transcription factor binding sites. Additionally, the binding energies of separate sequence segments demonstrate ubiquitous mutual correlations through time. Fewer than 10% of initial TFBSs are maintained throughout the entire simulation, deemed 'core' sites. These sites have increased functional importance as assessed under wild-type conditions and their binding energy distributions are highly conserved. Furthermore, TFBSs within close proximity of core sites exhibit increased longevity, reflecting functional regulatory interactions with core sites. CONCLUSION In response to elevated mutational pressure, evolution tends to sample regulatory sequence organisations with fewer, albeit on average, stronger functional transcription factor binding sites. These organisations are also shaped by the regulatory interactions among core binding sites with sites in their local vicinity.
Affiliation(s)
- Aleksandra A. Chertkova
- Systems Biology and Bioinformatics Laboratory, Peter the Great St. Petersburg Polytechnic University, Polytechnicheskaya, 29, St. Petersburg, 195251 Russia
| | - Joshua S. Schiffman
- Molecular and Computational Biology, University of Southern California, Los Angeles, 90089 CA USA
| | - Sergey V. Nuzhdin
- Systems Biology and Bioinformatics Laboratory, Peter the Great St. Petersburg Polytechnic University, Polytechnicheskaya, 29, St. Petersburg, 195251 Russia
- Molecular and Computational Biology, University of Southern California, Los Angeles, 90089 CA USA
| | - Konstantin N. Kozlov
- Systems Biology and Bioinformatics Laboratory, Peter the Great St. Petersburg Polytechnic University, Polytechnicheskaya, 29, St. Petersburg, 195251 Russia
| | - Maria G. Samsonova
- Systems Biology and Bioinformatics Laboratory, Peter the Great St. Petersburg Polytechnic University, Polytechnicheskaya, 29, St. Petersburg, 195251 Russia
| | - Vitaly V. Gursky
- Systems Biology and Bioinformatics Laboratory, Peter the Great St. Petersburg Polytechnic University, Polytechnicheskaya, 29, St. Petersburg, 195251 Russia
- Theoretical Department, Ioffe Institute, Polytechnicheskaya, 26, St. Petersburg, 194021 Russia
| |
|
19
|
Ma L, Zhang W, Ding Z, Wu SG, Jin Y, Jiang N, Du H, Cai D, Miao L, Chen X. Association of a common variant of SYNPO2 gene with increased risk of serous epithelial ovarian cancer. Tumour Biol 2017; 39:1010428317691185. [PMID: 28231729 DOI: 10.1177/1010428317691185] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 12/17/2022]
Abstract
In China, the majority of ovarian cancer patients (80%–90%) are women who are diagnosed with epithelial ovarian cancer. The SYNPO2 gene has recently been reported to be associated with epithelial ovarian cancer in Europeans. To investigate the association of common variants of the SYNPO2 gene with epithelial ovarian cancer in Han Chinese individuals, we designed a case–control study with 719 epithelial ovarian cancer patients and 1568 unrelated healthy controls of Han Chinese descent. A total of 49 tagging single-nucleotide polymorphisms were genotyped; single-single-nucleotide polymorphism association, imputation, and haplotypic association analyses were performed. The single-nucleotide polymorphism rs17329882 was found to be strongly associated with serous epithelial ovarian cancer and with ages ≤49 years, consistent with the pre-menopausal status of analyzed epithelial ovarian cancer cases. Odds ratios and 95% confidence intervals provided evidence of the risk effects of the C allele of the single-nucleotide polymorphism on epithelial ovarian cancer. Imputation analyses also confirmed the results with a similar pattern. Additionally, haplotype analyses indicated that the haplotype block that contained rs17329882 was significantly associated with epithelial ovarian cancer risk, specifically with the serous epithelial ovarian cancer subtype. In conclusion, our results show that the SYNPO2 gene plays an important role in the etiology of epithelial ovarian cancer, suggesting that this gene may be a potential genetic modifier for developing epithelial ovarian cancer.
Affiliation(s)
- Li Ma
- Department of Pathology, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
| | - Wei Zhang
- Department of Scientific Research, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
| | - Zhaoli Ding
- Department of Oncology, Zhengzhou Central Hospital Affiliated to Zhengzhou University, Zhengzhou, China
| | - Stephen G Wu
- Department of Energy, Environmental & Chemical Engineering, Washington University in St. Louis, Saint Louis, MO, USA
| | - Yaofeng Jin
- Department of Pathology, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
| | - Na Jiang
- Department of Pathology, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
| | - Hongyan Du
- Department of Pathology, Maternity and Children Hospital of Shaanxi Province, Xi’an, China
| | - Dongge Cai
- Department of Obstetrics and Gynecology, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
| | - Li Miao
- Department of Pathology, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
| | - Xiaoli Chen
- Department of Pathology, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
| |
|
20
|
Kim D, Thairu MW, Hansen AK. Novel Insights into Insect-Microbe Interactions-Role of Epigenomics and Small RNAs. FRONTIERS IN PLANT SCIENCE 2016; 7:1164. [PMID: 27540386 PMCID: PMC4972996 DOI: 10.3389/fpls.2016.01164] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 04/26/2016] [Accepted: 07/20/2016] [Indexed: 05/23/2023]
Abstract
It has become increasingly clear that microbes form close associations with the vast majority of animal species, especially insects. In fact, an array of diverse microbes is known to form shared metabolic pathways with their insect hosts. A growing area of research in insect-microbe interactions, notably for hemipteran insects and their mutualistic symbionts, is to elucidate the regulation of this inter-domain metabolism. This review examines two new emerging mechanisms of gene regulation and their importance in host-microbe interactions. Specifically, we highlight how the incipient areas of research on regulatory "dark matter" such as epigenomics and small RNAs, can play a pivotal role in the evolution of both insect and microbe gene regulation. We then propose specific models of how these dynamic forms of gene regulation can influence insect-symbiont-plant interactions. Future studies in this area of research will give us a systematic understanding of how these symbiotic microbes and animals reciprocally respond to and regulate their shared metabolic processes.
|
21
|
Laarits T, Bordalo P, Lemos B. Genes under weaker stabilizing selection increase network evolvability and rapid regulatory adaptation to an environmental shift. J Evol Biol 2016; 29:1602-16. [DOI: 10.1111/jeb.12897] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/22/2016] [Revised: 05/03/2016] [Accepted: 05/13/2016] [Indexed: 11/28/2022]
Affiliation(s)
| | - P. Bordalo
- Department of Systems Biology; Harvard Medical School; Boston MA USA
| | - B. Lemos
- Program in Molecular and Integrative Physiological Sciences; Department of Environmental Health; Harvard T. H. Chan School of Public Health; Boston MA USA
| |
|
22
|
Marxer M, Vollenweider V, Schmid-Hempel P. Insect antimicrobial peptides act synergistically to inhibit a trypanosome parasite. Philos Trans R Soc Lond B Biol Sci 2016; 371:20150302. [PMID: 27160603 PMCID: PMC4874398 DOI: 10.1098/rstb.2015.0302] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Accepted: 02/08/2016] [Indexed: 11/12/2022]
Abstract
The innate immune system provides protection from infection by producing essential effector molecules, such as antimicrobial peptides (AMPs) that possess broad-spectrum activity. This is also the case for bumblebees, Bombus terrestris, when infected by the trypanosome, Crithidia bombi. Furthermore, the expressed mixture of AMPs varies with host genetic background and infecting parasite strain (genotype). Here, we used the fact that clones of C. bombi can be cultivated and kept as strains in medium to test the effect of various combinations of AMPs on the growth rate of the parasite. In particular, we used pairwise combinations and a range of physiological concentrations of three AMPs, namely Abaecin, Defensin and Hymenoptaecin, synthetized from the respective genomic sequences. We found that these AMPs indeed suppress the growth of eight different strains of C. bombi, and that combinations of AMPs were typically more effective than the use of a single AMP alone. Furthermore, the most effective combinations were rarely those consisting of maximum concentrations. In addition, the AMP combination treatments revealed parasite strain specificity, such that strains varied in their sensitivity towards the same mixtures. Hence, variable expression of AMPs could be an alternative strategy to combat highly variable infections. This article is part of the themed issue 'Evolutionary ecology of arthropod antimicrobial peptides'.
Affiliation(s)
- Monika Marxer
- ETH Zurich, Institute of Integrative Biology (IBZ), Universitätsstrasse 16, 8092 Zürich, Switzerland
| | - Vera Vollenweider
- ETH Zurich, Institute of Integrative Biology (IBZ), Universitätsstrasse 16, 8092 Zürich, Switzerland
| | - Paul Schmid-Hempel
- ETH Zurich, Institute of Integrative Biology (IBZ), Universitätsstrasse 16, 8092 Zürich, Switzerland
| |
|
23
|
Polygenic evolution of a sugar specialization trade-off in yeast. Nature 2016; 530:336-9. [PMID: 26863195 DOI: 10.1038/nature16938] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Received: 03/27/2015] [Accepted: 12/18/2015] [Indexed: 11/08/2022]
Abstract
The evolution of novel traits can involve many mutations scattered throughout the genome. Detecting and validating such a suite of alleles, particularly if they arose long ago, remains a key challenge in evolutionary genetics. Here we dissect an evolutionary trade-off of unprecedented genetic complexity between long-diverged species. When cultured in 1% glucose medium supplemented with galactose, Saccharomyces cerevisiae, but not S. bayanus or other Saccharomyces species, delayed commitment to galactose metabolism until glucose was exhausted. Promoters of seven galactose (GAL) metabolic genes from S. cerevisiae, when introduced together into S. bayanus, largely recapitulated the delay phenotype in 1% glucose-galactose medium, and most had partial effects when tested in isolation. Variation in GAL coding regions also contributed to the delay when tested individually in 1% glucose-galactose medium. When combined, S. cerevisiae GAL coding regions gave rise to profound growth defects in the S. bayanus background. In medium containing 2.5% glucose supplemented with galactose, wild-type S. cerevisiae repressed GAL gene expression and had a robust growth advantage relative to S. bayanus; transgenesis of S. cerevisiae GAL promoter alleles or GAL coding regions was sufficient for partial reconstruction of these phenotypes. S. cerevisiae GAL genes thus encode a regulatory program of slow induction and avid repression, and a fitness detriment during the glucose-galactose transition but a benefit when glucose is in excess. Together, these results make clear that genetic mapping of complex phenotypes is within reach, even in deeply diverged species.
|
24
|
Rastegar S, Strähle U. The Zebrafish as Model for Deciphering the Regulatory Architecture of Vertebrate Genomes. GENETICS, GENOMICS AND FISH PHENOMICS 2016; 95:195-216. [DOI: 10.1016/bs.adgen.2016.04.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/12/2022]
|
25
|
Carvunis AR, Wang T, Skola D, Yu A, Chen J, Kreisberg JF, Ideker T. Evidence for a common evolutionary rate in metazoan transcriptional networks. eLife 2015; 4. [PMID: 26682651 PMCID: PMC4764585 DOI: 10.7554/elife.11615] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 09/15/2015] [Accepted: 12/17/2015] [Indexed: 12/13/2022]
Abstract
Genome sequences diverge more rapidly in mammals than in other animal lineages, such as birds or insects. However, the effect of this rapid divergence on transcriptional evolution remains unclear. Recent reports have indicated a faster divergence of transcription factor binding in mammals than in insects, but others found the reverse for mRNA expression. Here, we show that these conflicting interpretations resulted from differing methodologies. We performed an integrated analysis of transcriptional network evolution by examining mRNA expression, transcription factor binding and cis-regulatory motifs across >25 animal species, including mammals, birds and insects. Strikingly, we found that transcriptional networks evolve at a common rate across the three animal lineages. Furthermore, differences in rates of genome divergence were greatly reduced when restricting comparisons to chromatin-accessible sequences. The evolution of transcription is thus decoupled from the global rate of genome sequence evolution, suggesting that a small fraction of the genome regulates transcription. DOI:http://dx.doi.org/10.7554/eLife.11615.001
The genetic information that makes each individual unique is encoded in DNA molecules. Cells read this molecular instruction manual by a process called transcription, in which proteins called transcription factors bind to DNA in specific places and regulate which sections of the DNA will be expressed. These 'transcripts' are active molecules that determine the cell’s – and ultimately the individual’s – characteristics. However, it is not well understood how alterations in the DNA of different individuals or species can lead to changes in where the transcription factors bind, and in which transcripts are expressed. Carvunis, Wang, Skola et al. set out to determine if there is a relationship between how often DNA changes and how often transcription changes during the evolution of animals. The experiments examined the abundance of transcripts in the cells of a variety of animal species with close or distant evolutionary relationships. For example, the house mouse was compared to a close relative called the Algerian mouse, to another species of rodent (rat) and to humans. The experiments show that the changes in transcript abundances are happening at similar rates in mammals, birds and insects, even though DNA changes at very different rates in these groups of animals. This similarity was also observed for other aspects of transcription, such as in changes to where transcription factors bind to DNA. The next challenges are to find out what makes transcription evolve at such similar rates in these groups of animals, and whether these findings extend to other species and to other processes in cells. DOI:http://dx.doi.org/10.7554/eLife.11615.002
Affiliation(s)
| | - Tina Wang
- Department of Medicine, University of California, San Diego, La Jolla, United States
| | - Dylan Skola
- Department of Medicine, University of California, San Diego, La Jolla, United States
| | - Alice Yu
- Department of Medicine, University of California, San Diego, La Jolla, United States
| | - Jonathan Chen
- Department of Medicine, University of California, San Diego, La Jolla, United States
| | - Jason F Kreisberg
- Department of Medicine, University of California, San Diego, La Jolla, United States
| | - Trey Ideker
- Department of Medicine, University of California, San Diego, La Jolla, United States
| |
|
26
|
Kozlov K, Gursky VV, Kulakovskiy IV, Dymova A, Samsonova M. Analysis of functional importance of binding sites in the Drosophila gap gene network model. BMC Genomics 2015; 16 Suppl 13:S7. [PMID: 26694511 PMCID: PMC4686791 DOI: 10.1186/1471-2164-16-s13-s7] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 11/11/2022]
Abstract
BACKGROUND The statistical thermodynamics based approach provides a promising framework for construction of the genotype-phenotype map in many biological systems. Among important aspects of a good model connecting the DNA sequence information with that of a molecular phenotype (gene expression) is the selection of regulatory interactions and relevant transcription factor binding sites. As the model may predict different levels of the functional importance of specific binding sites in different genomic and regulatory contexts, it is essential to formulate and study such models under different modeling assumptions. RESULTS We elaborate a two-layer model for the Drosophila gap gene network and include in the model a combined set of transcription factor binding sites and concentration dependent regulatory interaction between gap genes hunchback and Kruppel. We show that the new variants of the model are more consistent in terms of gene expression predictions for various genetic constructs in comparison to previous work. We quantify the functional importance of binding sites by calculating their impact on gene expression in the model and calculate how these impacts correlate across all sites under different modeling assumptions. CONCLUSIONS The assumption about the dual interaction between hb and Kr leads to the most consistent modeling results, but, on the other hand, may obscure the existence of indirect interactions between binding sites in regulatory regions of distinct genes. The analysis confirms the previously formulated regulation concept of many weak binding sites working in concert. The model predicts a more or less uniform distribution of functionally important binding sites over the sets of experimentally characterized regulatory modules and other open chromatin domains.
Affiliation(s)
- Konstantin Kozlov
- Peter the Great St. Petersburg Polytechnic University, 29 Polytechnicheskaya, 195251 St.Petersburg, Russia
| | - Vitaly V Gursky
- Peter the Great St. Petersburg Polytechnic University, 29 Polytechnicheskaya, 195251 St.Petersburg, Russia
- Ioffe Institute, 26 Polytechnicheskaya, 194021 St.Petersburg, Russia
| | - Ivan V Kulakovskiy
- Engelhardt Institute of Molecular Biology, 32 Vavilova, 119991 Moscow, Russia
| | - Arina Dymova
- Peter the Great St. Petersburg Polytechnic University, 29 Polytechnicheskaya, 195251 St.Petersburg, Russia
| | - Maria Samsonova
- Peter the Great St. Petersburg Polytechnic University, 29 Polytechnicheskaya, 195251 St.Petersburg, Russia
| |
Collapse
|
27
|
Freeling M, Scanlon MJ, Fowler JE. Fractionation and subfunctionalization following genome duplications: mechanisms that drive gene content and their consequences. Curr Opin Genet Dev 2015; 35:110-8. [PMID: 26657818 DOI: 10.1016/j.gde.2015.11.002] [Citation(s) in RCA: 72] [Impact Index Per Article: 7.2] [Received: 06/18/2015] [Revised: 11/09/2015] [Accepted: 11/09/2015] [Indexed: 12/11/2022]
Abstract
A gene's duplication relaxes selection. Loss of duplicate, low-function DNA (fractionation) sometimes follows, mostly by deletion in plants, but mostly via the pseudogene pathway in fish and other clades with smaller population sizes. Subfunctionalization--the founding term of the Xfunctionalization lexicon--while not the general cause of differences in duplicate gene retention, becomes primary as the number of a gene's cis-regulatory sites increases. Balanced gene drive explains retention for the average gene. Both maintenance-of-balance and subfunctionalization drive gene content nonrandomly, and currently fall outside of our accepted Theory of Evolution. The 'typical' mutation encountered by a gene duplicate is not a neutral loss-of-function; dominant mutations (Muller's lexicon; these are not neutral) abound, and confound X functionalization terms like 'neofunctionalization'. Confusion of words may cause confusion of thought. As with many plants, fish tetraploidies provide a higher throughput surrogate-genetic method to infer function from human and other vertebrate ENCODE-like regulatory sites.
Affiliation(s)
- Michael Freeling
- Department of Plant and Microbial Biology, Univ. California, Berkeley, CA 94720, United States.
- Michael J Scanlon
- Section of Plant Biology, Cornell University, Ithaca, NY 14853, United States
- John E Fowler
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, United States
28
Davey NE, Cyert MS, Moses AM. Short linear motifs - ex nihilo evolution of protein regulation. Cell Commun Signal 2015; 13:43. [PMID: 26589632 PMCID: PMC4654906 DOI: 10.1186/s12964-015-0120-z] [Citation(s) in RCA: 162] [Impact Index Per Article: 16.2] [Received: 06/21/2015] [Accepted: 11/13/2015] [Indexed: 12/12/2022]
Abstract
Short sequence motifs are ubiquitous across the three major types of biomolecules: hundreds of classes and thousands of instances of DNA regulatory elements, RNA motifs and protein short linear motifs (SLiMs) have been characterised. The increase in complexity of transcriptional, post-transcriptional and post-translational regulation in higher Eukaryotes has coincided with a significant expansion of motif use. But how did the eukaryotic cell acquire such a vast repertoire of motifs? In this review, we curate the available literature on protein motif evolution and discuss the evidence that suggests SLiMs can be acquired by mutations, insertions and deletions in disordered regions. We propose a mechanism of ex nihilo SLiM evolution – the evolution of a novel SLiM from “nothing” – adding a functional module to a previously non-functional region of protein sequence. In our model, hundreds of motif-binding domains in higher eukaryotic proteins connect simple motif specificities with useful functions to create a large functional motif space. Accessible peptides that match the specificity of these motif-binding domains are continuously created and destroyed by mutations in rapidly evolving disordered regions, creating a dynamic supply of new interactions that may have advantageous phenotypic novelty. This provides a reservoir of diversity to modify existing interaction networks. Evolutionary pressures will act on these motifs to retain beneficial instances. However, most will be lost on an evolutionary timescale as negative selection and genetic drift act on deleterious and neutral motifs respectively. In light of the parallels between the presented model and the evolution of motifs in the regulatory segments of genes and (pre-)mRNAs, we suggest our understanding of regulatory networks would benefit from the creation of a shared model describing the evolution of transcriptional, post-transcriptional and post-translational regulation.
Affiliation(s)
- Norman E Davey
- Conway Institute of Biomolecular and Biomedical Sciences, University College Dublin, Dublin 4, Ireland.
- Martha S Cyert
- Department of Biology, Stanford University, Stanford, CA, 94305, USA.
- Alan M Moses
- Department of Cell & Systems Biology, University of Toronto, Toronto, Canada; Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Canada.
29
Tuğrul M, Paixão T, Barton NH, Tkačik G. Dynamics of Transcription Factor Binding Site Evolution. PLoS Genet 2015; 11:e1005639. [PMID: 26545200 PMCID: PMC4636380 DOI: 10.1371/journal.pgen.1005639] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Received: 06/17/2015] [Accepted: 10/09/2015] [Indexed: 11/19/2022]
Abstract
Evolution of gene regulation is crucial for our understanding of the phenotypic differences between species, populations and individuals. Sequence-specific binding of transcription factors to the regulatory regions on the DNA is a key regulatory mechanism that determines gene expression and hence heritable phenotypic variation. We use a biophysical model for directional selection on gene expression to estimate the rates of gain and loss of transcription factor binding sites (TFBS) in finite populations under both point and insertion/deletion mutations. Our results show that these rates are typically slow for a single TFBS in an isolated DNA region, unless the selection is extremely strong. These rates decrease drastically with increasing TFBS length or increasingly specific protein-DNA interactions, making the evolution of sites longer than ∼ 10 bp unlikely on typical eukaryotic speciation timescales. Similarly, evolution converges to the stationary distribution of binding sequences very slowly, making the equilibrium assumption questionable. The availability of longer regulatory sequences in which multiple binding sites can evolve simultaneously, the presence of “pre-sites” or partially decayed old sites in the initial sequence, and biophysical cooperativity between transcription factors, can all facilitate gain of TFBS and reconcile theoretical calculations with timescales inferred from comparative genomics. Evolution has produced a remarkable diversity of living forms that manifests in qualitative differences as well as quantitative traits. An essential factor that underlies this variability is transcription factor binding sites, short pieces of DNA that control gene expression levels. Nevertheless, we lack a thorough theoretical understanding of the evolutionary times required for the appearance and disappearance of these sites. 
By combining a biophysically realistic model for how cells read out information in transcription factor binding sites with a model for DNA sequence evolution, we explore these timescales and ask what factors crucially affect them. We find that the emergence of binding sites from a random sequence is generically slow under point and insertion/deletion mutational mechanisms. Strong selection, sufficient genomic sequence in which the sites can evolve, the existence of partially decayed old binding sites in the sequence, as well as certain biophysical mechanisms such as cooperativity, can accelerate the binding site gain times and make them consistent with the timescales suggested by comparative analyses of genomic data.
Affiliation(s)
- Murat Tuğrul
- Institute of Science and Technology Austria, Klosterneuburg, Austria
- Tiago Paixão
- Institute of Science and Technology Austria, Klosterneuburg, Austria
- Gašper Tkačik
- Institute of Science and Technology Austria, Klosterneuburg, Austria
30
Schaefke B, Wang TY, Wang CY, Li WH. Gains and Losses of Transcription Factor Binding Sites in Saccharomyces cerevisiae and Saccharomyces paradoxus. Genome Biol Evol 2015. [PMID: 26220934 PMCID: PMC4558856 DOI: 10.1093/gbe/evv138] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/22/2022]
Abstract
Gene expression evolution occurs through changes in cis- or trans-regulatory elements or both. Interactions between transcription factors (TFs) and their binding sites (TFBSs) constitute one of the most important points where these two regulatory components intersect. In this study, we investigated the evolution of TFBSs in the promoter regions of different Saccharomyces strains and species. We divided the promoter of a gene into the proximal region and the distal region, which are defined, respectively, as the 200-bp region upstream of the transcription starting site and as the 200-bp region upstream of the proximal region. We found that the predicted TFBSs in the proximal promoter regions tend to be evolutionarily more conserved than those in the distal promoter regions. Additionally, Saccharomyces cerevisiae strains used in the fermentation of alcoholic drinks have experienced more TFBS losses than gains compared with strains from other environments (wild strains, laboratory strains, and clinical strains). We also showed that differences in TFBSs correlate with the cis component of gene expression evolution between species (comparing S. cerevisiae and its sister species Saccharomyces paradoxus) and within species (comparing two closely related S. cerevisiae strains).
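The promoter partition described in this abstract (a 200-bp proximal region directly upstream of the transcription start site, and a 200-bp distal region directly upstream of that) can be sketched in a few lines. This is only an illustrative sketch: the function name `promoter_regions`, the `Region` type, the 0-based half-open coordinate convention, and the strand handling are assumptions made here, not details taken from the study.

```python
from typing import NamedTuple, Tuple


class Region(NamedTuple):
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def promoter_regions(tss: int, strand: str, width: int = 200) -> Tuple[Region, Region]:
    """Return (proximal, distal) promoter regions upstream of a TSS.

    Assumed convention (hypothetical): `tss` is the 0-based position of the
    first transcribed base; regions are half-open intervals on the
    forward-strand coordinate system.
    """
    if strand == "+":
        # Upstream means lower coordinates on the plus strand.
        proximal = Region(max(tss - width, 0), tss)
        distal = Region(max(tss - 2 * width, 0), proximal.start)
    elif strand == "-":
        # Upstream means higher coordinates on the minus strand.
        proximal = Region(tss + 1, tss + 1 + width)
        distal = Region(proximal.end, proximal.end + width)
    else:
        raise ValueError("strand must be '+' or '-'")
    return proximal, distal


proximal, distal = promoter_regions(tss=1000, strand="+")
print(proximal, distal)
```

Predicted binding sites could then be assigned to one region or the other by testing whether their coordinates fall inside the corresponding interval.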
Affiliation(s)
- Bernhard Schaefke
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan; National Yang-Ming University, Taipei, Taiwan; Bioinformatics Program, Institute of Information Science, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan
- Wen-Hsiung Li
- National Yang-Ming University, Taipei, Taiwan; China Medical University Hospital, Taichung, Taiwan; Department of Ecology and Evolution, University of Chicago
31
Gordon KL, Arthur RK, Ruvinsky I. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence. PLoS Genet 2015; 11:e1005268. [PMID: 26020930 PMCID: PMC4447282 DOI: 10.1371/journal.pgen.1005268] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 10/15/2014] [Accepted: 05/09/2015] [Indexed: 11/28/2022]
Abstract
Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.
To explore the phylogenetic limits of conservation of cis-regulatory elements, we used transgenesis to test the functions of enhancers of four genes from several species spanning the phylum Nematoda. While we found a striking degree of functional conservation among the examined cis elements, their DNA sequences lacked apparent conservation with the C. elegans orthologs. In fact, sequence similarity between C. elegans and the distantly related nematodes was no greater than would be expected by chance. Short motifs, similar to known regulatory sequences in C. elegans, can be detected in most of the cis elements. When tested, some of these sites appear to mediate regulatory function. However, they seem to have originated through motif turnover, rather than to have been preserved from a common ancestor. Our results suggest that gene regulatory networks are broadly conserved in the phylum Nematoda, but this conservation persists despite substantial reorganization of regulatory elements and could not be detected using naïve comparisons of sequence similarity.
Affiliation(s)
- Kacy L. Gordon
- Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
- Robert K. Arthur
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
- Ilya Ruvinsky
- Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
32
Duque T, Sinha S. What does it take to evolve an enhancer? A simulation-based study of factors influencing the emergence of combinatorial regulation. Genome Biol Evol 2015; 7:1415-31. [PMID: 25956793 PMCID: PMC4494070 DOI: 10.1093/gbe/evv080] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 11/13/2022]
Abstract
There is widespread interest today in understanding enhancers, which are regulatory elements typically harboring several transcription factor binding sites and mediating the combinatorial effect of transcription factors on gene expression. The evolution of enhancers poses interesting unanswered questions, for example, the evolutionary time taken for a typical enhancer to emerge or the factors shaping its evolution. Existing approaches to cis-regulatory evolution have often ignored the combinatorial nature and varied biochemical mechanisms of gene regulation encoded in enhancers. We report on our investigation of enhancer evolution through the use of PEBCRES, a framework for evolutionary simulation of enhancers that employs a mechanistic and well-supported sequence-to-expression model to assign fitness to the evolving enhancer genotype. We estimated the time necessary to evolve, from genomic background, enhancers capable of driving complex gene expression patterns similar to those involved in early development in Drosophila. We found the time-to-evolve to range between 0.5 and 10 Myr, and to vary greatly with the target expression pattern, complexity of the real enhancer known to encode that pattern, and the strength of input from specific transcription factors. To our knowledge, this is the first estimate of waiting times for realistic enhancers to evolve. The in silico evolved enhancers had, with a few interesting exceptions, site compositions similar to those seen in real enhancers for the same patterns. Our simulations also revealed that certain features of an enhancer might evolve not due to their biological function but as aids to the evolutionary process itself.
Affiliation(s)
- Thyago Duque
- Department of Computer Science, University of Illinois at Urbana-Champaign
- Saurabh Sinha
- Department of Computer Science, University of Illinois at Urbana-Champaign; Institute for Genomic Biology, University of Illinois at Urbana-Champaign
33
Naval-Sánchez M, Potier D, Hulselmans G, Christiaens V, Aerts S. Identification of Lineage-Specific Cis-Regulatory Modules Associated with Variation in Transcription Factor Binding and Chromatin Activity Using Ornstein-Uhlenbeck Models. Mol Biol Evol 2015; 32:2441-55. [PMID: 25944915 PMCID: PMC4540964 DOI: 10.1093/molbev/msv107] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 02/06/2023]
Abstract
Scoring the impact of noncoding variation on the function of cis-regulatory regions, on their chromatin state, and on the qualitative and quantitative expression levels of target genes is a fundamental problem in evolutionary genomics. A particular challenge is how to model the divergence of quantitative traits and to identify relationships between the changes across the different levels of the genome, the chromatin activity landscape, and the transcriptome. Here, we examine the use of the Ornstein-Uhlenbeck (OU) model to infer selection at the level of predicted cis-regulatory modules (CRMs), and link these with changes in transcription factor binding and chromatin activity. Using publicly available cross-species ChIP-Seq and STARR-Seq data we show how OU can be applied genome-wide to identify candidate transcription factors for which binding site and CRM turnover is correlated with changes in regulatory activity. Next, we profile open chromatin in the developing eye across three Drosophila species. We identify the recognition motifs of the chromatin remodelers, Trithorax-like and Grainyhead as mostly correlating with species-specific changes in open chromatin. In conclusion, we show in this study that CRM scores can be used as quantitative traits and that motif discovery approaches can be extended towards more complex models of divergence.
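For readers unfamiliar with the Ornstein-Uhlenbeck (OU) process named in this abstract, the following minimal sketch simulates a single trait trajectory under the standard OU equation dX = alpha (theta - X) dt + sigma dW, using an Euler-Maruyama step. It is an illustration of the generic process only: the function name `simulate_ou`, the parameter names, and the example values are assumptions made here, and the sketch does not reproduce the phylogenetic, genome-wide fitting procedure used in the study.

```python
import math
import random


def simulate_ou(x0, optimum, alpha, sigma, dt, n_steps, seed=0):
    """Simulate one Ornstein-Uhlenbeck trajectory with Euler-Maruyama steps.

    x0      : starting trait value (e.g. a hypothetical CRM score)
    optimum : value the trait is pulled toward (theta)
    alpha   : strength of the pull toward the optimum
    sigma   : magnitude of the random (drift-like) fluctuations
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        pull = alpha * (optimum - x) * dt                    # deterministic restoring term
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)  # Brownian increment
        x += pull + noise
        path.append(x)
    return path


# Hypothetical parameter values chosen only for illustration.
trajectory = simulate_ou(x0=0.0, optimum=2.0, alpha=1.0, sigma=0.3, dt=0.1, n_steps=200)
print(len(trajectory), trajectory[-1])
```

With `sigma` set to zero the trajectory relaxes deterministically toward `optimum`; a larger `alpha` keeps the trait closer to the optimum, which is the sense in which OU models are used to represent stabilizing selection on a quantitative trait.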
Affiliation(s)
- Marina Naval-Sánchez
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
- Delphine Potier
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
- Gert Hulselmans
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
- Valerie Christiaens
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
- Stein Aerts
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
34
Shi TY, Jiang Z, Jiang R, Yin S, Wang MY, Yu KD, Shao ZM, Sun MH, Zang R, Wei Q. Polymorphisms in the kinesin-like factor 1 B gene and risk of epithelial ovarian cancer in Eastern Chinese women. Tumour Biol 2015; 36:6919-27. [PMID: 25854172 DOI: 10.1007/s13277-015-3394-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/28/2014] [Accepted: 03/25/2015] [Indexed: 01/15/2023]
Abstract
The kinesin-like factor 1 B (KIF1B) gene plays an important role in the process of apoptosis and the transformation and progression of malignant cells. Genetic variations in KIF1B may contribute to risk of epithelial ovarian cancer (EOC). In this study of 1,324 EOC patients and 1,386 cancer-free female controls, we investigated associations between two potentially functional single nucleotide polymorphisms in KIF1B and EOC risk by conditional logistic regression analysis. A general linear regression model was used to evaluate the correlation between the number of variant alleles and KIF1B mRNA expression levels. We found that the rs17401966 variant AG/GG genotypes were significantly associated with a decreased risk of EOC (adjusted odds ratio (OR) = 0.81, 95% confidence interval (CI) = 0.68-0.97), compared with the AA genotype, but no associations were observed for rs1002076. Women who carried both rs17401966 AG/GG and rs1002076 AG/AA genotypes of KIF1B had a 0.82-fold decreased risk (adjusted 95% CI = 0.69-0.97), compared with others. Additionally, there was no evidence of possible interactions between the above-mentioned co-variants. Further genotype-phenotype correlation analysis indicated that the number of rs17401966 variant G alleles was significantly associated with KIF1B mRNA expression levels (P for GLM = 0.003 and 0.001 in all and Chinese subjects, respectively), with GG carriers having the lowest level of KIF1B mRNA expression. Taken together, the rs17401966 polymorphism likely regulates KIF1B mRNA expression and thus may be associated with EOC risk in Eastern Chinese women. Larger, independent studies are warranted to validate our findings.
Affiliation(s)
- Ting-Yan Shi
- Cancer Institute, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Department of Obstetrics and Gynecology, Zhongshan Hospital, Fudan University, Shanghai, China
- Zhi Jiang
- Department of Gynecologic Oncology, Jiangsu Cancer Hospital, Nanjing, Jiangsu, China
- Rong Jiang
- Department of Obstetrics and Gynecology, Zhongshan Hospital, Fudan University, Shanghai, China; Department of Gynecologic Oncology, Fudan University Shanghai Cancer Center, Shanghai, China
- Sheng Yin
- Department of Obstetrics and Gynecology, Zhongshan Hospital, Fudan University, Shanghai, China; Department of Gynecologic Oncology, Fudan University Shanghai Cancer Center, Shanghai, China
- Meng-Yun Wang
- Cancer Institute, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Ke-Da Yu
- Department of Breast Surgery, Fudan University Shanghai Cancer Center, Shanghai, China
- Zhi-Ming Shao
- Department of Breast Surgery, Fudan University Shanghai Cancer Center, Shanghai, China
- Meng-Hong Sun
- Department of Pathology, Fudan University Shanghai Cancer Center, Shanghai, China
- Rongyu Zang
- Department of Obstetrics and Gynecology, Zhongshan Hospital, Fudan University, Shanghai, China; Department of Gynecologic Oncology, Fudan University Shanghai Cancer Center, Shanghai, China.
- Qingyi Wei
- Cancer Institute, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Duke Cancer Institute, Duke University Medical Center, Durham, NC, USA.
35
Garcia-Cordero JL, Maerkl SJ. Mechanically Induced Trapping of Molecular Interactions and Its Applications. J Lab Autom 2015; 21:356-67. [PMID: 25805850 DOI: 10.1177/2211068215578586] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 11/09/2014] [Indexed: 12/21/2022]
Abstract
Measuring binding affinities and association/dissociation rates of molecular interactions is important for a quantitative understanding of cellular mechanisms. Many low-throughput methods have been developed throughout the years to obtain these parameters. Acquiring data with higher accuracy and throughput is, however, necessary to characterize complex biological networks. Here, we provide an overview of a high-throughput microfluidic method based on mechanically induced trapping of molecular interactions (MITOMI). MITOMI can be used to obtain affinity constants and kinetic rates of hundreds of protein-ligand interactions in parallel. It has been used in dozens of studies to measure binding affinities of transcription factors, map protein interaction networks, identify pharmacological inhibitors, and perform high-throughput, low-cost molecular diagnostics. This article covers the technological aspects of MITOMI and its applications.
Affiliation(s)
- Sebastian J Maerkl
- Institute of Bioengineering, School of Engineering, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
36
Evolutionary meandering of intermolecular interactions along the drift barrier. Proc Natl Acad Sci U S A 2014; 112:E30-8. [PMID: 25535374 DOI: 10.1073/pnas.1421641112] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.6] [Indexed: 11/18/2022]
Abstract
Many cellular functions depend on highly specific intermolecular interactions, for example transcription factors and their DNA binding sites, microRNAs and their RNA binding sites, the interfaces between heterodimeric protein molecules, the stems in RNA molecules, and kinases and their response regulators in signal-transduction systems. Despite the need for complementarity between interacting partners, such pairwise systems seem to be capable of high levels of evolutionary divergence, even when subject to strong selection. Such behavior is a consequence of the diminishing advantages of increasing binding affinity between partners, the multiplicity of evolutionary pathways between selectively equivalent alternatives, and the stochastic nature of evolutionary processes. Because mutation pressure toward reduced affinity conflicts with selective pressure for greater interaction, situations can arise in which the expected distribution of the degree of matching between interacting partners is bimodal, even in the face of constant selection. Although biomolecules with larger numbers of interacting partners are subject to increased levels of evolutionary conservation, their more numerous partners need not converge on a single sequence motif or be increasingly constrained in more complex systems. These results suggest that most phylogenetic differences in the sequences of binding interfaces are not the result of adaptive fine tuning but a simple consequence of random genetic drift.
37
Siepel A, Arbiza L. Cis-regulatory elements and human evolution. Curr Opin Genet Dev 2014; 29:81-9. [PMID: 25218861 PMCID: PMC4258466 DOI: 10.1016/j.gde.2014.08.011] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 05/28/2014] [Revised: 08/17/2014] [Accepted: 08/23/2014] [Indexed: 11/20/2022]
Abstract
Modification of gene regulation has long been considered an important force in human evolution, particularly through changes to cis-regulatory elements (CREs) that function in transcriptional regulation. For decades, however, the study of cis-regulatory evolution was severely limited by the available data. New data sets describing the locations of CREs and genetic variation within and between species have now made it possible to study CRE evolution much more directly on a genome-wide scale. Here, we review recent research on the evolution of CREs in humans based on large-scale genomic data sets. We consider inferences based on primate divergence, human polymorphism, and combinations of divergence and polymorphism. We then consider 'new frontiers' in this field stemming from recent research on transcriptional regulation.
Affiliation(s)
- Adam Siepel
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA.
- Leonardo Arbiza
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
38
Grigoriev D, Reinitz J, Vakulenko S, Weber A. Punctuated evolution and robustness in morphogenesis. Biosystems 2014; 123:106-13. [PMID: 24996115 PMCID: PMC4283494 DOI: 10.1016/j.biosystems.2014.06.013] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/08/2014] [Revised: 06/25/2014] [Accepted: 06/29/2014] [Indexed: 11/23/2022]
Abstract
This paper presents an analytic approach to the pattern stability and evolution problem in morphogenesis. The approach used here is based on ideas from gene and neural network theory. We assume that gene networks contain a number of small groups of genes (called hubs) controlling the morphogenesis process. Hub genes represent an important element of gene network architecture and their existence is empirically confirmed. We show that hubs can stabilize the morphogenetic pattern and accelerate morphogenesis. The hub activity exhibits an abrupt change depending on the mutation frequency. When the mutation frequency is small, these hubs suppress all mutations and gene product concentrations do not change; thus, the pattern is stable. When the environmental pressure increases and the population needs new genotypes, the genetic drift and other effects increase the mutation frequency. For frequencies that are larger than a critical value, the hubs turn off and, as a result, many mutations can affect the phenotype. This effect can serve as an engine for evolution. We show that this engine is very effective: the evolution acceleration is an exponential function of gene redundancy. Finally, we show that the Eldredge-Gould concept of punctuated evolution results from the network architecture, which provides fast evolution, control of evolvability, and pattern robustness. To describe analytically the effect of exponential acceleration, we use mathematical methods developed recently for hard combinatorial problems, in particular, for the so-called k-SAT problem, and numerical simulations.
Affiliation(s)
- D Grigoriev
- CNRS, Mathématiques, Université de Lille, Villeneuve d'Ascq 59655, France.
- J Reinitz
- Department of Statistics, University of Chicago, Chicago, IL 60637, United States; Department of Ecology and Evolution, University of Chicago, United States; Department of Molecular Genetics and Cell Biology, University of Chicago, United States; Institute for Genomics and Systems Biology, University of Chicago, United States.
- S Vakulenko
- Institute for Mechanical Engineering Problems, Bolshoy pr. V. O.61, Sankt Petersburg, Russia; ITMO University, Sankt Petersburg, Russia.
- A Weber
- Computer Science Department, University of Bonn, 53113 Bonn, Germany.
39
Haldane A, Manhart M, Morozov AV. Biophysical fitness landscapes for transcription factor binding sites. PLoS Comput Biol 2014; 10:e1003683. [PMID: 25010228 PMCID: PMC4091707 DOI: 10.1371/journal.pcbi.1003683] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 12/03/2013] [Accepted: 05/11/2014] [Indexed: 11/18/2022]
Abstract
Phenotypic states and evolutionary trajectories available to cell populations are ultimately dictated by complex interactions among DNA, RNA, proteins, and other molecular species. Here we study how evolution of gene regulation in a single-cell eukaryote S. cerevisiae is affected by interactions between transcription factors (TFs) and their cognate DNA sites. Our study is informed by a comprehensive collection of genomic binding sites and high-throughput in vitro measurements of TF-DNA binding interactions. Using an evolutionary model for monomorphic populations evolving on a fitness landscape, we infer fitness as a function of TF-DNA binding to show that the shape of the inferred fitness functions is in broad agreement with a simple functional form inspired by a thermodynamic model of two-state TF-DNA binding. However, the effective parameters of the model are not always consistent with physical values, indicating selection pressures beyond the biophysical constraints imposed by TF-DNA interactions. We find little statistical support for the fitness landscape in which each position in the binding site evolves independently, indicating that epistasis is common in the evolution of gene regulation. Finally, by correlating TF-DNA binding energies with biological properties of the sites or the genes they regulate, we are able to rule out several scenarios of site-specific selection, under which binding sites of the same TF would experience different selection pressures depending on their position in the genome. These findings support the existence of universal fitness landscapes which shape evolution of all sites for a given TF, and whose properties are determined in part by the physics of protein-DNA interactions.
Specialized proteins called transcription factors turn genes on and off by binding to short stretches of DNA in their regulatory regions. Precise gene regulation is essential for cellular survival and proliferation, and its evolution and maintenance under mutational pressure are central issues in biology. Here we discuss how evolution of gene regulation is shaped by the need to maintain favorable binding energies between transcription factors and their genomic binding sites. We show that, surprisingly, transcription factor binding is not affected by many biological properties, such as the essentiality of the gene it regulates. Rather, all sites for a given factor appear to evolve under a universal set of constraints, which can be rationalized in terms of a simple model inspired by transcription factor – DNA binding thermodynamics.
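For readers unfamiliar with the two-state picture mentioned in the abstract above, the following is a minimal sketch of one common textbook form of it, not the authors' fitted model. It assumes a site is either bound or unbound, that binding energies are in units of kT, and that the bound probability follows a logistic (Fermi-Dirac-like) function of energy. The energy matrix, the names `ENERGY_MATRIX`, `site_energy` and `occupancy`, and the parameters `mu` and `beta` are all hypothetical and chosen only for illustration. The additive matrix is itself a simplifying assumption, and the abstract reports little support for fully independent positions.

```python
import math

# Hypothetical additive energy matrix (units of kT): one dict per site position.
# These numbers are illustrative only and are not taken from the paper.
ENERGY_MATRIX = [
    {"A": 0.0, "C": 1.5, "G": 2.0, "T": 1.0},
    {"A": 2.0, "C": 0.0, "G": 1.0, "T": 1.5},
    {"A": 1.0, "C": 2.0, "G": 0.0, "T": 1.5},
]


def site_energy(site, matrix=ENERGY_MATRIX):
    """Sum per-position energy contributions (assumes independent positions)."""
    if len(site) != len(matrix):
        raise ValueError("site length must match the energy matrix")
    return sum(column[base] for column, base in zip(matrix, site))


def occupancy(energy, mu, beta=1.0):
    """Two-state probability that a site with this binding energy is bound.

    `mu` plays the role of a chemical potential set by TF concentration and
    `beta` is an inverse temperature; both are assumed free parameters here.
    """
    return 1.0 / (1.0 + math.exp(beta * (energy - mu)))
```

In this sketch, a site whose energy equals `mu` is bound half of the time, and lower-energy (stronger) sites approach full occupancy. A fitness function of this general shape is what the abstract describes as "inspired by" the thermodynamic model.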
Affiliation(s)
- Allan Haldane
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- Michael Manhart
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- Alexandre V. Morozov
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- BioMaPS Institute for Quantitative Biology, Rutgers University, Piscataway, New Jersey, United States of America
|
40
|
Naturally occurring deletions of hunchback binding sites in the even-skipped stripe 3+7 enhancer. PLoS One 2014; 9:e91924. [PMID: 24786295 PMCID: PMC4006794 DOI: 10.1371/journal.pone.0091924] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 09/19/2013] [Accepted: 02/18/2014] [Indexed: 11/23/2022]
Abstract
Changes in regulatory DNA contribute to phenotypic differences within and between taxa. Comparative studies show that many transcription factor binding sites (TFBS) are conserved between species, whereas functional studies reveal that some mutations segregating within species alter TFBS function. Consistently, in this analysis of 13 regulatory elements in Drosophila melanogaster populations, single base and insertion/deletion polymorphisms are rare in characterized regulatory elements. Experimentally defined TFBS are nearly devoid of segregating mutations and, as has been shown before, are quite conserved. For instance, 8 of 11 Hunchback binding sites in the stripe 3+7 enhancer of even-skipped are conserved between D. melanogaster and Drosophila virilis. Oddly, we found a 72 bp deletion that removes one of these binding sites (Hb8), segregating within D. melanogaster. Furthermore, a 45 bp deletion polymorphism in the spacer between the stripe 3+7 and stripe 2 enhancers removes another predicted Hunchback site. These two deletions are separated by ∼250 bp, sit on distinct haplotypes, and segregate at appreciable frequency. The Hb8Δ is at 5 to 35% frequency in the New World, but also shows cosmopolitan distribution. There is depletion of sequence variation on the Hb8Δ-carrying haplotype. Quantitative genetic tests indicate that Hb8Δ affects developmental time, but not viability of offspring. The Eve expression pattern differs between inbred lines, but the stripe 3 and 7 boundaries seem unaffected by Hb8Δ. The data reveal segregating variation in regulatory elements, which may reflect evolutionary turnover of characterized TFBS due to drift or co-evolution.
|
41
|
Villar D, Flicek P, Odom DT. Evolution of transcription factor binding in metazoans - mechanisms and functional implications. Nat Rev Genet 2014; 15:221-33. [PMID: 24590227 PMCID: PMC4175440 DOI: 10.1038/nrg3481] [Citation(s) in RCA: 157] [Impact Index Per Article: 14.3] [Indexed: 12/16/2022]
Abstract
Differences in transcription factor binding can contribute to organismal evolution by altering downstream gene expression programmes. Genome-wide studies in Drosophila melanogaster and mammals have revealed common quantitative and combinatorial properties of in vivo DNA binding, as well as marked differences in the rate and mechanisms of evolution of transcription factor binding in metazoans. Here, we review the recently discovered rapid 're-wiring' of in vivo transcription factor binding between related metazoan species and summarize general principles underlying the observed patterns of evolution. We then consider what might explain the differences in genome evolution between metazoan phyla and outline the conceptual and technological challenges facing this research field.
Affiliation(s)
- Diego Villar
- University of Cambridge, Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK
- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Duncan T Odom
- University of Cambridge, Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK
|
42
|
Amrine KCH, Swingley WD, Ardell DH. tRNA signatures reveal a polyphyletic origin of SAR11 strains among alphaproteobacteria. PLoS Comput Biol 2014; 10:e1003454. [PMID: 24586126 PMCID: PMC3937112 DOI: 10.1371/journal.pcbi.1003454] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 05/01/2013] [Accepted: 12/10/2013] [Indexed: 12/18/2022]
Abstract
Molecular phylogenetics and phylogenomics are subject to noise from horizontal gene transfer (HGT) and bias from convergence in macromolecular compositions. Extensive variation in size, structure and base composition of alphaproteobacterial genomes has complicated their phylogenomics, sparking controversy over the origins and closest relatives of the SAR11 strains. SAR11 are highly abundant, cosmopolitan aquatic Alphaproteobacteria with streamlined, A+T-biased genomes. A dominant view holds that SAR11 are monophyletic and related to both Rickettsiales and the ancestor of mitochondria. Other studies dispute this, finding evidence of a polyphyletic origin of SAR11 with most strains distantly related to Rickettsiales. Although careful evolutionary modeling can reduce bias and noise in phylogenomic inference, entirely different approaches may be useful to extract robust phylogenetic signals from genomes. Here we develop simple phyloclassifiers from bioinformatically derived tRNA Class-Informative Features (CIFs), features predicted to target tRNAs for specific interactions within the tRNA interaction network. Our tRNA CIF-based model robustly and accurately classifies alphaproteobacterial genomes into one of seven undisputed monophyletic orders or families, despite great variability in tRNA gene complement sizes and base compositions. Our model robustly rejects monophyly of SAR11, classifying all but one strain as Rhizobiales with strong statistical support. Yet remarkably, conventional phylogenetic analysis of tRNAs classifies all SAR11 strains identically as Rickettsiales. We attribute this discrepancy to convergence of SAR11 and Rickettsiales tRNA base compositions. Thus, tRNA CIFs appear more robust to compositional convergence than tRNA sequences generally. Our results suggest that tRNA-CIF-based phyloclassification is robust to HGT of components of the tRNA interaction network, such as aminoacyl-tRNA synthetases. 
We explain why tRNAs are especially advantageous for prediction of traits governing macromolecular interactions from genomic data, and why such traits may be advantageous in the search for robust signals to address difficult problems in classification and phylogeny.
If gene products work well in the networks of foreign cells, their genes may transfer horizontally between unrelated genomes. What factors dictate the ability to integrate into foreign networks? Different RNAs and proteins must interact specifically in order to function well as a system. For example, tRNA functions are determined by the interactions they have with other macromolecules. We have developed ways to predict, from genomic data alone, how tRNAs distinguish themselves to their specific interaction partners. Here, as proof of concept, we built a robust computational model from these bioinformatic predictions in seven lineages of Alphaproteobacteria. We validated our model by classifying hundreds of diverse alphaproteobacterial taxa and tested it on eight strains of SAR11, a phylogenetically controversial group that is highly abundant in the world's oceans. We found that different strains of SAR11 are more distantly related, both to each other and to mitochondria, than widely believed. We explain conflicting results about SAR11 as an artifact of bias created by the variability in base contents of alphaproteobacterial genomes. While this bias affects tRNAs too, our classifier appears unexpectedly robust to it. More broadly, our results suggest that traits governing macromolecular interactions may be more faithfully vertically inherited than the macromolecules themselves.
Affiliation(s)
- Katherine C. H. Amrine
- Program in Quantitative and Systems Biology, University of California, Merced, Merced, California, United States of America
- Wesley D. Swingley
- Program in Quantitative and Systems Biology, University of California, Merced, Merced, California, United States of America
- David H. Ardell
- Program in Quantitative and Systems Biology, University of California, Merced, Merced, California, United States of America
|
43
|
Martinez C, Rest JS, Kim AR, Ludwig M, Kreitman M, White K, Reinitz J. Ancestral resurrection of the Drosophila S2E enhancer reveals accessible evolutionary paths through compensatory change. Mol Biol Evol 2014; 31:903-16. [PMID: 24408913 DOI: 10.1093/molbev/msu042] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 11/12/2022]
Abstract
Upstream regulatory sequences that control gene expression evolve rapidly, yet the expression patterns and functions of most genes are typically conserved. To address this paradox, we have reconstructed computationally and resurrected in vivo the cis-regulatory regions of the ancestral Drosophila eve stripe 2 element and evaluated its evolution using a mathematical model of promoter function. Our feed-forward transcriptional model predicts gene expression patterns directly from enhancer sequence. We used this functional model along with phylogenetics to generate a set of possible ancestral eve stripe 2 sequences for the common ancestors of 1) D. simulans and D. sechellia; 2) D. melanogaster, D. simulans, and D. sechellia; and 3) D. erecta and D. yakuba. These ancestral sequences were synthesized and resurrected in vivo. Using a combination of quantitative and computational analysis, we find clear support for functional compensation between the binding sites for Bicoid, Giant, and Krüppel over the course of 40-60 My of Drosophila evolution. We show that this compensation is driven by a coupling interaction between Bicoid activation and repression at the anterior and posterior border necessary for proper placement of the anterior stripe 2 border. A multiplicity of mechanisms for binding site turnover exemplified by Bicoid, Giant, and Krüppel sites, explains how rapid sequence change may occur while maintaining the function of the cis-regulatory element.
Affiliation(s)
- Carlos Martinez
- Institute for Genomics and Systems Biology, University of Chicago
|
44
|
Cooperativity and rapid evolution of cobound transcription factors in closely related mammals. Cell 2013; 154:530-40. [PMID: 23911320 PMCID: PMC3732390 DOI: 10.1016/j.cell.2013.07.007] [Citation(s) in RCA: 117] [Impact Index Per Article: 9.8] [Received: 12/19/2012] [Revised: 05/22/2013] [Accepted: 07/08/2013] [Indexed: 12/04/2022]
Abstract
To mechanistically characterize the microevolutionary processes active in altering transcription factor (TF) binding among closely related mammals, we compared the genome-wide binding of three tissue-specific TFs that control liver gene expression in six rodents. Despite an overall fast turnover of TF binding locations between species, we identified thousands of TF regions of highly constrained TF binding intensity. Although individual mutations in bound sequence motifs can influence TF binding, most binding differences occur in the absence of nearby sequence variations. Instead, combinatorial binding was found to be significant for genetic and evolutionary stability; cobound TFs tend to disappear in concert and were sensitive to genetic knockout of partner TFs. The large, qualitative differences in genomic regions bound between closely related mammals, when contrasted with the smaller, quantitative TF binding differences among Drosophila species, illustrate how genome structure and population genetics together shape regulatory evolution.
- Earliest steps of regulatory evolution in mammals captured using five mouse species
- Interspecies differences in TF binding are rarely caused by DNA variation in motifs
- Cobound TFs change their genomic binding cooperatively in closely related mammals
- Genetic knockouts revealed the extent of cooperative stabilization in TF binding clusters
|
45
|
Schraiber JG, Mostovoy Y, Hsu TY, Brem RB. Inferring evolutionary histories of pathway regulation from transcriptional profiling data. PLoS Comput Biol 2013; 9:e1003255. [PMID: 24130471 PMCID: PMC3794907 DOI: 10.1371/journal.pcbi.1003255] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 04/19/2013] [Accepted: 08/20/2013] [Indexed: 01/09/2023]
Abstract
One of the outstanding challenges in comparative genomics is to interpret the evolutionary importance of regulatory variation between species. Rigorous molecular evolution-based methods to infer evidence for natural selection from expression data are at a premium in the field, and to date, phylogenetic approaches have not been well-suited to address the question in the small sets of taxa profiled in standard surveys of gene expression. We have developed a strategy to infer evolutionary histories from expression profiles by analyzing suites of genes of common function. In a manner conceptually similar to molecular evolution models in which the evolutionary rates of DNA sequence at multiple loci follow a gamma distribution, we modeled expression of the genes of an a priori-defined pathway with rates drawn from an inverse gamma distribution. We then developed a fitting strategy to infer the parameters of this distribution from expression measurements, and to identify gene groups whose expression patterns were consistent with evolutionary constraint or rapid evolution in particular species. Simulations confirmed the power and accuracy of our inference method. As an experimental testbed for our approach, we generated and analyzed transcriptional profiles of four Saccharomyces yeasts. The results revealed pathways with signatures of constrained and accelerated regulatory evolution in individual yeasts and across the phylogeny, highlighting the prevalence of pathway-level expression change during the divergence of yeast species. We anticipate that our pathway-based phylogenetic approach will be of broad utility in the search to understand the evolutionary relevance of regulatory change.
Comparative transcriptomic studies routinely identify thousands of genes differentially expressed between species. The central question in the field is whether and how such regulatory changes have been the product of natural selection. Can the signal of evolutionarily relevant expression divergence be detected amid the noise of changes resulting from genetic drift? Our work develops a theory of gene expression variation among a suite of genes that function together. We derive a formalism that relates empirical observations of expression of pathway genes in divergent species to the underlying strength of natural selection on expression output. We show that fitting this type of model to simulated data accurately recapitulates the parameters used to generate the simulation. We then make experimental measurements of gene expression in a panel of single-celled eukaryotic yeast species. To these data we apply our inference method, and identify pathways with striking evidence for accelerated or constrained regulatory evolution, in particular species and across the phylogeny. Our method provides a key advance over previous approaches in that it maximizes the power of rigorous molecular-evolution analysis of regulatory variation even when data are relatively sparse. As such, the theory and tools we have developed will likely find broad application in the field of comparative genomics.
Affiliation(s)
- Joshua G. Schraiber
- Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
- Yulia Mostovoy
- Department of Molecular and Cellular Biology, University of California, Berkeley, Berkeley, California, United States of America
- Tiffany Y. Hsu
- Department of Molecular and Cellular Biology, University of California, Berkeley, Berkeley, California, United States of America
- Rachel B. Brem
- Department of Molecular and Cellular Biology, University of California, Berkeley, Berkeley, California, United States of America
|
46
|
Duque T, Samee MAH, Kazemian M, Pham HN, Brodsky MH, Sinha S. Simulations of enhancer evolution provide mechanistic insights into gene regulation. Mol Biol Evol 2013; 31:184-200. [PMID: 24097306 PMCID: PMC3879441 DOI: 10.1093/molbev/mst170] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 01/14/2023]
Abstract
There is growing interest in models of regulatory sequence evolution. However, existing models specifically designed for regulatory sequences consider the independent evolution of individual transcription factor (TF)-binding sites, ignoring that the function and evolution of a binding site depends on its context, typically the cis-regulatory module (CRM) in which the site is located. Moreover, existing models do not account for the gene-specific roles of TF-binding sites, primarily because their roles often are not well understood. We introduce two models of regulatory sequence evolution that address some of the shortcomings of existing models and implement simulation frameworks based on them. One model simulates the evolution of an individual binding site in the context of a CRM, while the other evolves an entire CRM. Both models use a state-of-the-art sequence-to-expression model to predict the effects of mutations on the regulatory output of the CRM and determine the strength of selection. We use the new framework to simulate the evolution of TF-binding sites in 37 well-studied CRMs belonging to the anterior-posterior patterning system in Drosophila embryos. We show that these simulations provide accurate fits to evolutionary data from 12 Drosophila genomes, which includes statistics of binding site conservation on relatively short evolutionary scales and site loss across larger divergence times. The new framework allows us, for the first time, to test hypotheses regarding the underlying cis-regulatory code by directly comparing the evolutionary implications of the hypothesis with the observed evolutionary dynamics of binding sites. Using this capability, we find that explicitly modeling self-cooperative DNA binding by the TF Caudal (CAD) provides significantly better fits than an otherwise identical evolutionary simulation that lacks this mechanistic aspect.
This hypothesis is further supported by a statistical analysis of the distribution of intersite spacing between adjacent CAD sites. Experimental tests confirm direct homodimeric interaction between CAD molecules as well as self-cooperative DNA binding by CAD. We note that computational modeling of the D. melanogaster CRMs alone did not yield significant evidence to support CAD self-cooperativity. We thus demonstrate how specific mechanistic details encoded in CRMs can be revealed by modeling their evolution and fitting such models to multispecies data.
Affiliation(s)
- Thyago Duque
- Department of Computer Science, University of Illinois at Urbana-Champaign
|
47
|
Stewart AJ, Plotkin JB. The evolution of complex gene regulation by low-specificity binding sites. Proc Biol Sci 2013; 280:20131313. [PMID: 23945682 DOI: 10.1098/rspb.2013.1313] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 01/10/2023]
Abstract
Requirements for gene regulation vary widely both within and among species. Some genes are constitutively expressed, whereas other genes require complex regulatory control. Transcriptional regulation is often controlled by a module of multiple transcription factor binding sites that, in combination, mediate the expression of a target gene. Here, we study how such regulatory modules evolve in response to natural selection. Using a population-genetic model, we show that complex regulatory modules which contain a larger number of binding sites must employ binding motifs that are less specific, on average, compared with smaller regulatory modules. This effect is extremely general, and it holds regardless of the selected binding logic that a module experiences. We attribute this phenomenon to the inability of stabilizing selection to maintain highly specific sites in large regulatory modules. Our analysis helps to explain broad empirical trends in the Saccharomyces cerevisiae regulatory network: those genes with a greater number of distinct transcriptional regulators feature less-specific binding motifs, compared with genes with fewer regulators. Our results also help to explain empirical trends in module size and motif specificity across species, ranging from prokaryotes to single-cellular and multi-cellular eukaryotes.
Collapse
|
48
|
Connelly CF, Skelly DA, Dunham MJ, Akey JM. Population genomics and transcriptional consequences of regulatory motif variation in globally diverse Saccharomyces cerevisiae strains. Mol Biol Evol 2013; 30:1605-13. [PMID: 23619145 DOI: 10.1093/molbev/mst073] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/13/2022]
Abstract
Noncoding genetic variation is known to significantly influence gene expression levels in a growing number of specific cases; however, the patterns of genome-wide noncoding variation present within populations, the evolutionary forces acting on noncoding variants, and the relative effects of regulatory polymorphisms on transcript abundance are not well characterized. Here, we address these questions by analyzing patterns of regulatory variation in motifs for 177 DNA binding proteins in 37 strains of Saccharomyces cerevisiae. We found considerable polymorphism in regulatory motifs across strains (mean π = 0.005) as well as diversity in regulatory motifs (mean 0.91 motif differences per regulatory region). Population genetics analyses reveal that motifs are under purifying selection, and there is considerable heterogeneity in the magnitude of selection across different motifs. Finally, we obtained RNA-Seq data in 22 strains and identified 49 polymorphic DNA sequence motifs in 30 distinct genes that are significantly associated with transcriptional differences between strains. In 22 of these genes, there was a single polymorphic motif associated with expression in the upstream region. Our results provide comprehensive insights into the evolutionary trajectory of regulatory variation in yeast and the characteristics of a compendium of regulatory alleles.
|
49
|
Shi TY, Zhu ML, He J, Wang MY, Li QX, Zhou XY, Sun MH, Shao ZM, Yu KD, Cheng X, Wu X, Wei Q. Polymorphisms of the Interleukin 6 gene contribute to cervical cancer susceptibility in Eastern Chinese women. Hum Genet 2013; 132:301-312. [PMID: 23180271 DOI: 10.1007/s00439-012-1245-4] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 06/08/2012] [Accepted: 10/30/2012] [Indexed: 02/07/2023]
Abstract
Interleukin 6 (IL6) encodes a cytokine protein, which functions in inflammation, maintains immune homeostasis and plays important roles in cervical carcinogenesis. Single nucleotide polymorphisms (SNPs) in IL6 that cause variations in host immune response may contribute to cervical cancer risk. In this two-stage case-control study with a total of 1,584 cervical cancer cases and 1,768 cancer-free female controls, we investigated associations between two IL6 SNPs and cervical cancer risk in Eastern Chinese women. In both Study 1 and Study 2, we found a significant association of the IL6-rs2069837 SNP with an increased risk of cervical cancer as well as in their combined data (OR 1.27 and 1.19, 95% CI 1.08-1.49 and 1.04-1.36, P = 0.004 and 0.014 for dominant and additive genetic models, respectively). Furthermore, rs2069837 variant AG/GG carriers showed significantly higher levels of IL6 protein than did rs2069837 AA carriers in the target tissues. Using multifactor dimensionality reduction (MDR) and classification and regression tree (CART) analyses, we observed some evidence of interactions of the IL6 rs2069837 SNP with age at primiparity and menopausal status in cervical cancer risk. We concluded that the IL6-rs2069837 SNP may be a marker for susceptibility to cervical cancer in Eastern Chinese women by a possible mechanism of altering the IL6 protein expression. Although information on human papillomavirus (HPV) infection was lacking, our study also suggested possible interactions between IL6 genotypes and age at primiparity or menopausal status in cervical carcinogenesis. However, larger, independent studies with detailed HPV infection data are warranted to validate our findings.
Affiliation(s)
- Ting-Yan Shi
- Cancer Institute, Fudan University Shanghai Cancer Center, 270 Dong An Road, Shanghai 200032, China
|
50
|
Garfield D, Haygood R, Nielsen WJ, Wray GA. Population genetics of cis-regulatory sequences that operate during embryonic development in the sea urchin Strongylocentrotus purpuratus. Evol Dev 2013; 14:152-67. [PMID: 23017024 DOI: 10.1111/j.1525-142x.2012.00532.x] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 02/01/2023]
Abstract
Despite the fact that noncoding sequences comprise a substantial fraction of functional sites within all genomes, the evolutionary mechanisms that operate on genetic variation within regulatory elements remain poorly understood. In this study, we examine the population genetics of the core, upstream cis-regulatory regions of eight genes (AN, CyIIa, CyIIIa, Endo16, FoxB, HE, SM30 a, and SM50) that function during the early development of the purple sea urchin, Strongylocentrotus purpuratus. Quantitative and qualitative measures of segregating variation are not conspicuously different between cis-regulatory and closely linked "proxy neutral" noncoding regions containing no known functional sites. Length and compound mutations are common in noncoding sequences; conventional descriptive statistics ignore such mutations, under-representing true genetic variation by approximately 28% for these loci in this population. Patterns of variation in the cis-regulatory regions of six of the genes examined (CyIIa, CyIIIa, Endo16, FoxB, AN, and HE) are consistent with directional selection. Genetic variation within annotated transcription factor binding sites is comparable to, and frequently greater than, that of surrounding sequences. Comparisons of two paralog pairs (CyIIa/CyIIIa and AN/HE) suggest that distinct evolutionary processes have operated on their cis-regulatory regions following gene duplication. Together, these analyses provide a detailed view of the evolutionary mechanisms operating on noncoding sequences within a natural population, and underscore how little is known about how these processes operate on cis-regulatory sequences.
Affiliation(s)
- David Garfield
- Department of Biology and Institute for Genome Sciences & Policy, Duke University, Box 90338, Durham, NC 27708, USA
|