26
Kemena C, Dohmen E, Bornberg-Bauer E. DOGMA: a web server for proteome and transcriptome quality assessment. Nucleic Acids Res 2020; 47:W507-W510. [PMID: 31076763 PMCID: PMC6602495 DOI: 10.1093/nar/gkz366] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/01/2019] [Revised: 04/18/2019] [Accepted: 04/29/2019] [Indexed: 11/16/2022] Open
Abstract
Even in the era of next-generation sequencing, in which bioinformatics tools abound, annotating transcriptomes and proteomes remains a challenge. This can have major implications for the reliability of studies based on these datasets. Therefore, quality assessment represents a crucial step prior to downstream analyses on novel transcriptomes and proteomes. DOGMA allows such a quality assessment to be carried out. The data of interest are evaluated based on a comparison with a core set of conserved protein domains and domain arrangements. Depending on the studied species, DOGMA offers precomputed core sets for different phylogenetic clades. We have now developed a web server for the DOGMA software, offering a user-friendly, simple-to-use interface. Additionally, the server provides a graphical representation of the analysis results and their placement in comparison to publicly available data. The server is freely available at https://domainworld-services.uni-muenster.de/dogma/. For large-scale analyses, the software can be downloaded free of charge from https://domainworld.uni-muenster.de.
27
Heames B, Schmitz J, Bornberg-Bauer E. A Continuum of Evolving De Novo Genes Drives Protein-Coding Novelty in Drosophila. J Mol Evol 2020; 88:382-398. [PMID: 32253450 PMCID: PMC7162840 DOI: 10.1007/s00239-020-09939-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 08/14/2019] [Accepted: 03/13/2020] [Indexed: 12/13/2022]
Abstract
Orphan genes, lacking detectable homologs in outgroup species, typically represent 10-30% of eukaryotic genomes. Efforts to find the source of these young genes indicate that de novo emergence from non-coding DNA may in part explain their prevalence. Here, we investigate the roots of orphan gene emergence in the Drosophila genus. Across the annotated proteomes of twelve species, we find 6297 orphan genes within 4953 taxon-specific clusters of orthologs. By inferring the ancestral DNA as non-coding for between 550 and 2467 (8.7-39.2%) of these genes, we describe for the first time how de novo emergence contributes to the abundance of clade-specific Drosophila genes. In support of them having functional roles, we show that de novo genes have robust expression and translational support. However, the distinct nucleotide sequences of de novo genes, which have characteristics intermediate between intergenic regions and conserved genes, reflect their recent birth from non-coding DNA. We find that de novo genes encode more disordered proteins than both older genes and intergenic regions. Together, our results suggest that gene emergence from non-coding DNA provides an abundant source of material for the evolution of new proteins. Following gene birth, gradual evolution over large evolutionary timescales moulds sequence properties towards those of conserved genes, resulting in a continuum of properties whose starting points depend on the nucleotide sequences of an initial pool of novel genes.
28
Huang Y, Feulner PGD, Eizaguirre C, Lenz TL, Bornberg-Bauer E, Milinski M, Reusch TBH, Chain FJJ. Genome-Wide Genotype-Expression Relationships Reveal Both Copy Number and Single Nucleotide Differentiation Contribute to Differential Gene Expression between Stickleback Ecotypes. Genome Biol Evol 2020; 11:2344-2359. [PMID: 31298693 PMCID: PMC6735750 DOI: 10.1093/gbe/evz148] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Accepted: 07/10/2019] [Indexed: 12/11/2022] Open
Abstract
Repeated and independent emergence of trait divergence that matches habitat differences is a sign of parallel evolution by natural selection. Yet, the molecular underpinnings that are targeted by adaptive evolution often remain elusive. We investigate this question by combining genome-wide analyses of copy number variants (CNVs), single nucleotide polymorphisms (SNPs), and gene expression across four pairs of lake and river populations of the three-spined stickleback (Gasterosteus aculeatus). We tested whether CNVs that span entire genes and SNPs occurring in putative cis-regulatory regions contribute to gene expression differences between sticklebacks from lake and river origins. We found 135 gene CNVs that showed a significant positive association between gene copy number and gene expression, suggesting that CNVs result in dosage effects that can fuel phenotypic variation and serve as substrates for habitat-specific selection. Copy number differentiation between lake and river sticklebacks also contributed to expression differences of two immune-related genes in immune tissues, cathepsin A and GIMAP7. In addition, we identified SNPs in cis-regulatory regions (eSNPs) associated with the expression of 1,865 genes, including one eSNP upstream of a carboxypeptidase gene where both the SNP alleles differentiated and the gene was differentially expressed between lake and river populations. Our study highlights two types of mutations as important sources of genetic variation involved in the evolution of gene expression and in potentially facilitating repeated adaptation to novel environments.
29
Hartke J, Schell T, Jongepier E, Schmidt H, Sprenger PP, Paule J, Bornberg-Bauer E, Schmitt T, Menzel F, Pfenninger M, Feldmeyer B. Hybrid Genome Assembly of a Neotropical Mutualistic Ant. Genome Biol Evol 2020; 11:2306-2311. [PMID: 31329228 PMCID: PMC6735702 DOI: 10.1093/gbe/evz159] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Accepted: 07/16/2019] [Indexed: 01/13/2023] Open
Abstract
The success of social insects is largely intertwined with their highly advanced chemical communication system that facilitates recognition and discrimination of species and nest-mates, recruitment, and division of labor. Hydrocarbons, which cover the cuticle of insects, not only serve as waterproofing agents but also constitute a major component of this communication system. Two cryptic Crematogaster species, which share their nest with Camponotus ants, show striking diversity in their cuticular hydrocarbon (CHC) profile. This mutualistic system therefore offers a great opportunity to study the genetic basis of CHC divergence between sister species. As a basis for further genome-wide studies, high-quality genomes are needed. Here, we present the annotated draft genome for Crematogaster levior A. By combining the three most commonly used sequencing techniques (Illumina, PacBio, and Oxford Nanopore), we constructed a high-quality de novo ant genome. We show that even low coverage of long reads can add significantly to overall genome contiguity. Annotation of desaturase and elongase genes, which play a role in CHC biosynthesis, revealed one of the largest repertoires in ants and a higher number of desaturases in general than in other Hymenoptera. This may provide a mechanistic explanation for the high diversity observed in C. levior CHC profiles.
30
Kaur R, Stoldt M, Jongepier E, Feldmeyer B, Menzel F, Bornberg-Bauer E, Foitzik S. Ant behaviour and brain gene expression of defending hosts depend on the ecological success of the intruding social parasite. Philos Trans R Soc Lond B Biol Sci 2020; 374:20180192. [PMID: 30967075 DOI: 10.1098/rstb.2018.0192] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/12/2022] Open
Abstract
The geographical mosaic theory of coevolution predicts that species interactions vary between locales. Depending on who leads the coevolutionary arms race, the effectiveness of parasite attack or host defence strategies will explain parasite prevalence. Here, we compare behaviour and brain transcriptomes of Temnothorax longispinosus ant workers when defending their nest against an invading social parasite, the slavemaking ant Temnothorax americanus. A full-factorial design allowed us to test whether behaviour and gene expression are linked to parasite pressure on host populations or to the ecological success of parasite populations. Although host defences had previously been shown to covary with local parasite pressure, we found parasite success to be much more important. Our chemical and behavioural analyses revealed that parasites from high prevalence sites carry lower concentrations of recognition cues and are less often attacked by hosts. This link was further supported by gene expression analysis. Our study reveals that host-parasite interactions are strongly influenced by social parasite strategies, so that variation in parasite prevalence is determined by parasite traits rather than the efficacy of host defence. Gene functions associated with parasite success indicated strong neuronal responses in hosts, including long-term changes in gene regulation, indicating an enduring impact of parasites on host behaviour. This article is part of the theme issue 'The coevolutionary biology of brood parasitism: from mechanism to pattern'.
31
Dohmen E, Klasberg S, Bornberg-Bauer E, Perrey S, Kemena C. The modular nature of protein evolution: domain rearrangement rates across eukaryotic life. BMC Evol Biol 2020; 20:30. [PMID: 32059645 PMCID: PMC7023805 DOI: 10.1186/s12862-020-1591-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 04/12/2019] [Accepted: 01/31/2020] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND: Modularity is important for evolutionary innovation. The recombination of existing units to form larger complexes with new functionalities spares the need to create novel elements from scratch. In proteins, this principle can be observed at the level of protein domains, functional subunits which are regularly rearranged to acquire new functions. RESULTS: In this study we analyse the mechanisms leading to new domain arrangements in five major eukaryotic clades (vertebrates, insects, fungi, monocots and eudicots) at unprecedented depth and breadth. This allows us, for the first time, to directly compare rates of rearrangements between different clades and identify both lineage-specific and general patterns of evolution in the context of domain rearrangements. We analyse arrangement changes along phylogenetic trees by reconstructing ancestral domain content in combination with feasible single-step events, such as fusion or fission. Using this approach we explain up to 70% of all rearrangements by tracing them back to their precursors. We find that rates in general, and the ratio between these rates for a given clade in particular, are highly consistent across all clades. In agreement with previous studies, fusions are the most frequent event leading to new domain arrangements. A lineage-specific pattern in fungi reveals exceptionally high loss rates compared to other clades, supporting recent studies highlighting the importance of loss for evolutionary innovation. Furthermore, our methodology allows us to link domain emergences at specific nodes in the phylogenetic tree to important functional developments, such as the origin of hair in mammals. CONCLUSIONS: Our results demonstrate that domain rearrangements are based on a canonical set of mutational events with rates which lie within a relatively narrow and consistent range. In addition, gained knowledge about these rates provides a basis for advanced domain-based methodologies for phylogenetics and homology analysis which complement current sequence-based methods.
32
Thomas GWC, Dohmen E, Hughes DST, Murali SC, Poelchau M, Glastad K, Anstead CA, Ayoub NA, Batterham P, Bellair M, Binford GJ, Chao H, Chen YH, Childers C, Dinh H, Doddapaneni HV, Duan JJ, Dugan S, Esposito LA, Friedrich M, Garb J, Gasser RB, Goodisman MAD, Gundersen-Rindal DE, Han Y, Handler AM, Hatakeyama M, Hering L, Hunter WB, Ioannidis P, Jayaseelan JC, Kalra D, Khila A, Korhonen PK, Lee CE, Lee SL, Li Y, Lindsey ARI, Mayer G, McGregor AP, McKenna DD, Misof B, Munidasa M, Munoz-Torres M, Muzny DM, Niehuis O, Osuji-Lacy N, Palli SR, Panfilio KA, Pechmann M, Perry T, Peters RS, Poynton HC, Prpic NM, Qu J, Rotenberg D, Schal C, Schoville SD, Scully ED, Skinner E, Sloan DB, Stouthamer R, Strand MR, Szucsich NU, Wijeratne A, Young ND, Zattara EE, Benoit JB, Zdobnov EM, Pfrender ME, Hackett KJ, Werren JH, Worley KC, Gibbs RA, Chipman AD, Waterhouse RM, Bornberg-Bauer E, Hahn MW, Richards S. Gene content evolution in the arthropods. Genome Biol 2020; 21:15. [PMID: 31969194 PMCID: PMC6977273 DOI: 10.1186/s13059-019-1925-7] [Citation(s) in RCA: 105] [Impact Index Per Article: 26.3] [Received: 11/06/2019] [Accepted: 12/26/2019] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND: Arthropods comprise the largest and most diverse phylum on Earth and play vital roles in nearly every ecosystem. Their diversity stems in part from variations on a conserved body plan, resulting from and recorded in adaptive changes in the genome. Dissection of the genomic record of sequence change enables broad questions regarding genome evolution to be addressed, even across hyper-diverse taxa within arthropods. RESULTS: Using 76 whole genome sequences representing 21 orders spanning more than 500 million years of arthropod evolution, we document changes in gene and protein domain content and provide temporal and phylogenetic context for interpreting these innovations. We identify many novel gene families that arose early in the evolution of arthropods and during the diversification of insects into modern orders. We reveal unexpected variation in patterns of DNA methylation across arthropods and examples of gene family and protein domain evolution coincident with the appearance of notable phenotypic and physiological adaptations such as flight, metamorphosis, sociality, and chemoperception. CONCLUSIONS: These analyses demonstrate how large-scale comparative genomics can provide broad new insights into the genotype to phenotype map and generate testable hypotheses about the evolution of animal diversity.
33
van Loo B, Heberlein M, Mair P, Zinchenko A, Schüürmann J, Eenink BDG, Holstein JM, Dilkaute C, Jose J, Hollfelder F, Bornberg-Bauer E. High-Throughput, Lysis-Free Screening for Sulfatase Activity Using Escherichia coli Autodisplay in Microdroplets. ACS Synth Biol 2019; 8:2690-2700. [PMID: 31738524 DOI: 10.1021/acssynbio.9b00274] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 11/30/2022]
Abstract
Directed evolution of enzymes toward improved catalytic performance has become a powerful tool in protein engineering. To be effective, a directed evolution campaign requires the use of high-throughput screening. In this study we describe the development of an ultra high-throughput lysis-free procedure to screen for improved sulfatase activity by combining microdroplet-based single-variant activity sorting with E. coli autodisplay. For the first step in a 4-step screening procedure, we quantitatively screened >10⁵ variants of the homodimeric arylsulfatase from Silicibacter pomeroyi (SpAS1), displayed on the E. coli cell surface, for improved sulfatase activity using fluorescence activated droplet sorting. Compartmentalization of the fluorescent reaction product with living E. coli cells autodisplaying the sulfatase variants ensured the continuous linkage of genotype and phenotype during droplet sorting and allowed for direct recovery by simple regrowth of the sorted cells. The use of autodisplay on living cells simplified and reduced the degree of liquid handling during all steps in the screening procedure to the single event of simply mixing substrate and cells. The percentage of apparent improved variants was enriched >10-fold as a result of droplet sorting. We ultimately identified 25 SpAS1 variants with improved performance toward 4-nitrophenyl sulfate (up to 6.2-fold) and/or fluorescein disulfate (up to 30-fold). In SpAS1 variants with improved performance toward the bulky fluorescein disulfate, many of the beneficial mutations occur in residues that form hydrogen bonds between α-helices in the C-terminal oligomerization region, suggesting a previously unknown role for the dimer interface in shaping the substrate binding site of SpAS1.
34
Kleppe AS, Bornberg-Bauer E. Robustness by intrinsically disordered C-termini and translational readthrough. Nucleic Acids Res 2019; 47:11978-11980. [PMID: 31733061 PMCID: PMC7145639 DOI: 10.1093/nar/gkz1106] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open
35
Kleppe AS, Bornberg-Bauer E. Robustness by intrinsically disordered C-termini and translational readthrough. Nucleic Acids Res 2019; 46:10184-10194. [PMID: 30247639 PMCID: PMC6365619 DOI: 10.1093/nar/gky778] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 04/18/2018] [Accepted: 09/20/2018] [Indexed: 12/20/2022] Open
Abstract
During protein synthesis genetic instructions are passed from DNA via mRNA to the ribosome to assemble a protein chain. Occasionally, stop codons in the mRNA are bypassed and translation continues into the untranslated region (3′-UTR). This process, called translational readthrough (TR), yields a protein chain that becomes longer than would be predicted from the DNA sequence alone. Protein sequences vary in propensity for translational errors, which may yield evolutionary constraints by limiting evolutionary paths. Here we investigated TR in Saccharomyces cerevisiae by analysing ribosome profiling data. We clustered proteins as either prone or non-prone to TR, and conducted comparative analyses. We find that a relatively high frequency (5%) of genes undergo TR, including ribosomal subunit proteins. Our main finding is that proteins undergoing TR are highly expressed and have a higher proportion of intrinsically disordered C-termini. We suggest that highly expressed proteins may compensate for the deleterious effects of TR by having intrinsically disordered C-termini, which may provide conformational flexibility but without distorting native function. Moreover, we discuss whether minimizing deleterious effects of TR is also enabling exploration of the phenotypic landscape of protein isoforms.
36
Kurafeiski JD, Pinto P, Bornberg-Bauer E. Evolutionary Potential of Cis-Regulatory Mutations to Cause Rapid Changes in Transcription Factor Binding. Genome Biol Evol 2019; 11:406-414. [PMID: 30597011 PMCID: PMC6370388 DOI: 10.1093/gbe/evy269] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Accepted: 12/11/2018] [Indexed: 01/25/2023] Open
Abstract
Transcriptional regulation is crucial for all biological processes and well investigated at the molecular level for a wide range of organisms. However, it is quite unclear how innovations, such as the activity of a novel regulatory element, evolve. In the case of transcription factor (TF) binding, both a novel TF and a novel binding site would need to evolve concertedly. Since promiscuous functions have recently been identified as important intermediate steps in creating novel specific functions in many areas such as enzyme evolution and protein-protein interactions, we ask here how promiscuous binding of TFs to TF-binding sites (TFBSs) affects the robustness and evolvability of this tightly regulated system. Specifically, we investigate the binding behavior of several hundred TFs from different species at unprecedented breadth. Our results illustrate multiple aspects of TF-binding interactions, ranging from correlations between the strength of the interaction bond and specificity, to preferences regarding TFBS nucleotide composition in relation to both domains and binding specificity. We identified a subset of high A/T binding motifs. Motifs in this subset had many functionally neutral one-error mutants, and were bound by multiple different binding domains. Our results indicate that, especially for some TF-TFBS associations, low binding specificity confers high degrees of evolvability, that is, few mutations facilitate rapid changes in transcriptional regulation, in particular for large and old TF families. In this study we identify binding motifs exhibiting behavior indicating high evolutionary potential for innovations in transcriptional regulation.
37
Abstract
Protein domains are reusable segments of proteins and play an important role in protein evolution. By combining the elements from a relatively small set of domains into unique arrangements, a large number of distinct proteins can be generated. Since domains often have specific functions, changes in their arrangement usually affect the overall protein function. Furthermore, domains are well amenable to computational representations, e.g., by Hidden Markov Models (HMMs), and these HMMs are widely represented in various databases. Therefore, domains can be efficiently used for proteomic analyses. Here, we describe how domains are annotated using different domain databases and then how to assess the annotation quality of proteomes. We next show how functional annotations of domains in large-scale data such as whole genomes or transcriptomes can be used to analyze molecular differences between species. Furthermore, we describe methods to analyze the changes in domain content of proteins which significantly helps to characterize and reconstruct the modular evolution of proteins. Altogether, domain-based methods offer a computationally highly effective approach to analyze large amounts of proteomic data in an evolutionary setting.
38
Bornberg-Bauer E, Harrison MC, Jongepier E. The first cockroach genome and its significance for understanding development and the evolution of insect eusociality. J Exp Zool B Mol Dev Evol 2018; 330:251-253. [PMID: 30168666 DOI: 10.1002/jez.b.22826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2018] [Accepted: 07/30/2018] [Indexed: 06/08/2023]
39
Klasberg S, Bitard-Feildel T, Callebaut I, Bornberg-Bauer E. Origins and structural properties of novel and de novo protein domains during insect evolution. FEBS J 2018; 285:2605-2625. [PMID: 29802682 DOI: 10.1111/febs.14504] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 06/25/2017] [Revised: 04/12/2018] [Accepted: 05/11/2018] [Indexed: 12/11/2022]
Abstract
Over long time scales, protein evolution is characterized by modular rearrangements of protein domains. Such rearrangements are mainly caused by gene duplication, fusion and terminal losses. To better understand domain emergence mechanisms we investigated 32 insect genomes covering a speciation gradient ranging from ~2 to ~390 mya. We use established domain models and foldable domains delineated by hydrophobic cluster analysis (HCA), which does not require homologous sequences, to also identify domains which have likely arisen de novo, that is, from previously noncoding DNA. Our results indicate that most novel domains emerge terminally as they originate from ORF extensions while fewer arise in middle arrangements, resulting from exonization of intronic or intergenic regions. Many novel domains rapidly migrate between terminal or middle positions and single- and multidomain arrangements. Young domains, such as most HCA-defined domains, are under strong selection pressure as they show signals of purifying selection. De novo domains, linked to ancient domains or defined by HCA, have higher degrees of intrinsic disorder and disorder-to-order transition upon binding than ancient domains. However, the corresponding DNA sequences of the novel domains of de novo origins could only rarely be found in sister genomes. We conclude that novel domains are often recruited by other proteins and undergo important structural modifications shortly after their emergence, but evolve too fast to be characterized by cross-species comparisons alone.
40
Kremer LPM, Korb J, Bornberg-Bauer E. Reconstructed evolution of insulin receptors in insects reveals duplications in early insects and cockroaches. J Exp Zool B Mol Dev Evol 2018; 330:305-311. [DOI: 10.1002/jez.b.22809] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 01/25/2018] [Revised: 04/11/2018] [Accepted: 05/03/2018] [Indexed: 11/10/2022]
41
Jongepier E, Kemena C, Lopez-Ezquerra A, Belles X, Bornberg-Bauer E, Korb J. Remodeling of the juvenile hormone pathway through caste-biased gene expression and positive selection along a gradient of termite eusociality. J Exp Zool B Mol Dev Evol 2018; 330:296-304. [PMID: 29845724 DOI: 10.1002/jez.b.22805] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 01/25/2018] [Revised: 04/25/2018] [Accepted: 05/03/2018] [Indexed: 11/10/2022]
Abstract
The evolution of division of labor between sterile and fertile individuals represents one of the major transitions in biological complexity. A fascinating gradient in eusociality evolved among the ancient hemimetabolous insects, ranging from noneusocial cockroaches through the primitively social lower termites (where workers retain the ability to reproduce) to the higher termites, characterized by lifetime commitment to worker sterility. Juvenile hormone (JH) is a prime candidate for the regulation of reproductive division of labor in termites, as it plays a key role in insect postembryonic development and reproduction. We compared the expression of JH pathway genes between workers and queens in two lower termites (Zootermopsis nevadensis and Cryptotermes secundus) and a higher termite (Macrotermes natalensis) to that of analogous nymphs and adult females of the noneusocial cockroach Blattella germanica. JH biosynthesis and metabolism genes ranged from reproductive female-biased expression in the cockroach to predominantly worker-biased expression in the lower termites. Remarkably, the expression profile of JH pathway genes sets the higher termite apart from the two lower termites, as well as the cockroach, indicating that JH signaling has undergone major changes in this eusocial termite. These changes go beyond mere shifts in gene expression between the different castes, as we find evidence for positive selection in several termite JH pathway genes. Thus, remodeling of the JH pathway may have played a major role in termite social evolution, representing a striking case of convergent molecular evolution between the termites and the distantly related social Hymenoptera.
42
van Loo B, Schober M, Valkov E, Heberlein M, Bornberg-Bauer E, Faber K, Hyvönen M, Hollfelder F. Structural and Mechanistic Analysis of the Choline Sulfatase from Sinorhizobium melliloti: A Class I Sulfatase Specific for an Alkyl Sulfate Ester. J Mol Biol 2018; 430:1004-1023. [PMID: 29458126 PMCID: PMC5870055 DOI: 10.1016/j.jmb.2018.02.010] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 10/31/2017] [Revised: 02/09/2018] [Accepted: 02/13/2018] [Indexed: 12/23/2022]
Abstract
Hydrolysis of organic sulfate esters proceeds by two distinct mechanisms, water attacking at either sulfur (S-O bond cleavage) or carbon (C-O bond cleavage). In primary and secondary alkyl sulfates, attack at carbon is favored, whereas in aromatic sulfates and sulfated sugars, attack at sulfur is preferred. This mechanistic distinction is mirrored in the classification of enzymes that catalyze sulfate ester hydrolysis: arylsulfatases (ASs) catalyze S-O cleavage in sulfate sugars and arylsulfates, and alkyl sulfatases break the C-O bond of alkyl sulfates. Sinorhizobium meliloti choline sulfatase (SmCS) efficiently catalyzes the hydrolysis of alkyl sulfate choline-O-sulfate (kcat/KM = 4.8 × 10³ s⁻¹ M⁻¹) as well as arylsulfate 4-nitrophenyl sulfate (kcat/KM = 12 s⁻¹ M⁻¹). Its 2.8-Å resolution X-ray structure shows a buried, largely hydrophobic active site in which a conserved glutamate (Glu386) plays a role in recognition of the quaternary ammonium group of the choline substrate. SmCS structurally resembles members of the alkaline phosphatase superfamily, being most closely related to dimeric ASs and tetrameric phosphonate monoester hydrolases. Although >70% of the amino acids between protomers align structurally (RMSDs 1.79-1.99 Å), the oligomeric structures show distinctly different packing and protomer-protomer interfaces. The latter also play an important role in active site formation. Mutagenesis of the conserved active site residues typical for ASs, H₂¹⁸O-labeling studies and the observation of catalytically promiscuous behavior toward phosphoesters confirm the close relation to alkaline phosphatase superfamily members and suggest that SmCS is an AS that catalyzes S-O cleavage in alkyl sulfate esters with extreme catalytic proficiency.
43
Lopez-Ezquerra A, Mitschke A, Bornberg-Bauer E, Joop G. Tribolium castaneum gene expression changes after Paranosema whitei infection. J Invertebr Pathol 2018; 153:92-98. [DOI: 10.1016/j.jip.2018.02.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 10/20/2017] [Revised: 01/10/2018] [Accepted: 02/12/2018] [Indexed: 12/24/2022]
44
Gerke M, Bornberg-Bauer E, Jiang X, Fuellen G. Finding Common Protein Interaction Patterns Across Organisms. Evol Bioinform Online 2017. [DOI: 10.1177/117693430600200011] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022] Open
Abstract
Protein interactions are an important resource to obtain an understanding of cell function. Recently, researchers have compared networks of interactions in order to understand network evolution. While current methods first infer homologs and then compare topologies, we here present a method which first searches for interesting topologies and then looks for homologs. PINA (protein interaction network analysis) takes the protein interaction networks of two organisms, scans both networks for subnetworks deemed interesting, and then tries to find orthologs among the interesting subnetworks. The application is very fast because orthology investigations are restricted to subnetworks like hubs and clusters that fulfill certain criteria regarding neighborhood and connectivity. Finally, the hubs or clusters found to be related can be visualized and analyzed according to protein annotation.
|
45
|
Gubala AM, Schmitz JF, Kearns MJ, Vinh TT, Bornberg-Bauer E, Wolfner MF, Findlay GD. The Goddard and Saturn Genes Are Essential for Drosophila Male Fertility and May Have Arisen De Novo. Mol Biol Evol 2017; 34:1066-1082. [PMID: 28104747 DOI: 10.1093/molbev/msx057] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 12/15/2022] Open
Abstract
New genes arise through a variety of mechanisms, including the duplication of existing genes and the de novo birth of genes from noncoding DNA sequences. While there are numerous examples of duplicated genes with important functional roles, the functions of de novo genes remain largely unexplored. Many newly evolved genes are expressed in the male reproductive tract, suggesting that these evolutionary innovations may provide advantages to males experiencing sexual selection. Using testis-specific RNA interference, we screened 11 putative de novo genes in Drosophila melanogaster for effects on male fertility and identified two, goddard and saturn, that are essential for spermatogenesis and sperm function. Goddard knockdown (KD) males fail to produce mature sperm, while saturn KD males produce few sperm, and these function inefficiently once transferred to females. Consistent with a de novo origin, both genes are identifiable only in Drosophila and are predicted to encode proteins with no sequence similarity to any annotated protein. However, since high levels of divergence prevented the unambiguous identification of the noncoding sequences from which each gene arose, we consider goddard and saturn to be putative de novo genes. Within Drosophila, both genes have been lost in certain lineages, but show conserved, male-specific patterns of expression in the species in which they are found. Goddard is consistently found in single-copy and evolves under purifying selection. In contrast, saturn has diversified through gene duplication and positive selection. These data suggest that de novo genes can acquire essential roles in male reproduction.
|
46
|
Lopez-Ezquerra A, Harrison MC, Bornberg-Bauer E. Comparative analysis of lincRNA in insect species. BMC Evol Biol 2017; 17:155. [PMID: 28673235 PMCID: PMC5494802 DOI: 10.1186/s12862-017-0985-0] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 02/14/2017] [Accepted: 06/02/2017] [Indexed: 01/19/2023] Open
Abstract
BACKGROUND The ever increasing availability of genomes makes it possible to investigate and compare not only the genomic complements of genes and proteins, but also of RNAs. One class of RNAs, the long noncoding RNAs (lncRNAs) and, in particular, their subclass of long intergenic noncoding RNAs (lincRNAs) have recently gained much attention because of their roles in regulation of important biological processes such as immune response or cell differentiation and as possible evolutionary precursors for protein coding genes. lincRNAs seem to be poorly conserved at the sequence level but at least some lincRNAs have conserved structural elements and syntenic genomic positions. Previous studies showed that transposable elements are a main contribution to the evolution of lincRNAs in mammals. In contrast, plant lincRNA emergence and evolution has been linked with local duplication events. However, little is known about their evolutionary dynamics in general and in insect genomes in particular. RESULTS Here we compared lincRNAs between seven insect genomes and investigated possible evolutionary changes and functional roles. We find very low sequence conservation between different species and that similarities within a species are mostly due to their association with transposable elements (TE) and simple repeats. Furthermore, we find that TEs are less frequent in lincRNA exons than in their introns, indicating that TEs may have been removed by selection. When we analysed the predicted thermodynamic stabilities of lincRNAs we found that they are more stable than their randomized controls which might indicate some selection pressure to maintain certain structural elements. We list several of the most stable lincRNAs which could serve as prime candidates for future functional studies. We also discuss the possibility of de novo protein coding genes emerging from lincRNAs. This is because lincRNAs with high GC content and potentially with longer open reading frames (ORF) are candidate loci where de novo gene emergence might occur. CONCLUSION The processes responsible for the emergence and diversification of lincRNAs in insects remain unclear. Both duplication and transposable elements may be important for the creation of new lincRNAs in insects.
|
47
|
van Loo B, Bornberg-Bauer E. Enzyme sub-functionalization driven by regulation. EMBO Rep 2017; 18:1043-1045. [PMID: 28615289 DOI: 10.15252/embr.201744383] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/14/2022] Open
|
48
|
Schmitz JF, Bornberg-Bauer E. Fact or fiction: updates on how protein-coding genes might emerge de novo from previously non-coding DNA. F1000Res 2017; 6:57. [PMID: 28163910 PMCID: PMC5247788 DOI: 10.12688/f1000research.10079.1] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Accepted: 01/17/2017] [Indexed: 12/31/2022] Open
Abstract
Over the last few years, there has been an increasing amount of evidence for the de novo emergence of protein-coding genes, i.e. out of non-coding DNA. Here, we review the current literature and summarize the state of the field. We focus specifically on open questions and challenges in the study of de novo protein-coding genes such as the identification and verification of de novo-emerged genes. The greatest obstacle to date is the lack of high-quality genomic data with very short divergence times which could help precisely pin down the location of origin of a de novo gene. We conclude that, while there is plenty of evidence from a genetics perspective, there is a lack of functional studies of bona fide de novo genes and almost no knowledge about protein structures and how they come about during the emergence of de novo protein-coding genes. We suggest that future studies should concentrate on the functional and structural characterization of de novo protein-coding genes as well as the detailed study of the emergence of functional de novo protein-coding genes.
|
49
|
Abstract
Repeats are ubiquitous elements of proteins and they play important roles in cellular function and during evolution. Repeats are, however, also notoriously difficult to capture computationally, and large-scale studies have so far had difficulties in linking genetic causes, structural properties and evolutionary trajectories of protein repeats. Here we apply recently developed methods for repeat detection and analysis to a large dataset comprising over a hundred metazoan genomes. We find that repeats in larger protein families generally experience very few insertions or deletions (indels) of repeat units, but there is also a significant fraction of noteworthy volatile outliers with very high indel rates. Analysis of structural data indicates that repeats with an open structure and independently folding units are more volatile and more likely to be intrinsically disordered. Such disordered repeats are also significantly enriched in sites with a high functional potential such as linear motifs. Furthermore, the most volatile repeats have a high sequence similarity between their units. Since many volatile repeats also show signs of recombination, we conclude they are often shaped by concerted evolution. Intriguingly, many of these conserved yet volatile repeats are involved in host-pathogen interactions where they might foster fast but subtle adaptation in biological arms races. KEY WORDS: protein evolution, domain rearrangements, protein repeats, concerted evolution.
|
50
|
Jueterbock A, Franssen SU, Bergmann N, Gu J, Coyer JA, Reusch TBH, Bornberg-Bauer E, Olsen JL. Phylogeographic differentiation versus transcriptomic adaptation to warm temperatures in Zostera marina, a globally important seagrass. Mol Ecol 2016; 25:5396-5411. [DOI: 10.1111/mec.13829] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 12/07/2015] [Revised: 08/15/2016] [Accepted: 08/23/2016] [Indexed: 12/17/2022]
|