1
Käther KK, Remmel A, Lemke S, Stadler PF. Unbiased anchors for reliable genome-wide synteny detection. Algorithms Mol Biol 2025; 20:5. [PMID: 40188341 PMCID: PMC11972476 DOI: 10.1186/s13015-025-00275-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2024] [Accepted: 03/12/2025] [Indexed: 04/07/2025] Open
Abstract
Orthology inference lies at the foundation of comparative genomics research. The correct identification of loci which descended from a common ancestral sequence is not only complicated by sequence divergence but also duplication and other genome rearrangements. The conservation of gene order, i.e. synteny, is used in conjunction with sequence similarity as an additional factor for orthology determination. Current approaches, however, rely on genome annotations and are therefore limited. Here we present an annotation-free approach and compare it to synteny analysis with annotations. We find that our approach works better in closely related genomes whereas there is a better performance with annotations for more distantly related genomes. Overall, the presented algorithm offers a useful alternative to annotation-based methods and can outperform them in many cases.
Affiliation(s)
- Karl K Käther
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstrasse 16-18, D-04017, Leipzig, Germany
  - Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM, 87501, USA
- Andreas Remmel
  - Zoology Department, University of Hohenheim, 10587, Stuttgart, Germany
  - Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM, 87501, USA
- Steffen Lemke
  - Zoology Department, University of Hohenheim, 10587, Stuttgart, Germany
  - Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM, 87501, USA
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Härtelstrasse 16-18, D-04017, Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103, Leipzig, Germany
  - Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090, Wien, Austria
  - Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
  - Center for non-coding RNA in Technology and Health, University of Copenhagen, Ridebanevej 9, DK-1870, Frederiksberg, Denmark
  - Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM, 87501, USA
2
Bermudez-Santana CI, Gallego-Gómez JC. Toward a Categorization of Virus-ncRNA Interactions in the World of RNA to Disentangle the Tiny Secrets of Dengue Virus. Viruses 2024; 16:804. [PMID: 38793685 PMCID: PMC11125801 DOI: 10.3390/v16050804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Revised: 05/14/2024] [Accepted: 05/15/2024] [Indexed: 05/26/2024] Open
Abstract
In recent years, the function of noncoding RNAs (ncRNAs) as regulatory molecules of cell physiology has begun to be better understood. Advances in viral molecular biology have shown that host ncRNAs, cellular factors, and virus-derived ncRNAs and their interplay are strongly disturbed during viral infections. Nevertheless, the folding of RNA virus genomes has also been identified as a critical factor in regulating canonical and non-canonical functions. Due to the influence of host ncRNAs and the structure of RNA viral genomes, complex molecular and cellular processes in infections are modulated. We propose three main categories to organize the current information about RNA-RNA interactions in some well-known human viruses. The first category shows examples of host ncRNAs associated with the immune response triggered in viral infections. Even though miRNAs introduce a standpoint, they are briefly presented to keep researchers moving forward in uncovering other RNAs. The second category outlines interactions between virus-host ncRNAs, while the third describes how the structure of the RNA viral genome serves as a scaffold for processing virus-derived RNAs. Our grouping may provide a comprehensive framework to classify ncRNA-host-cell interactions for emerging viruses and diseases. In this sense, we introduced them to organize DENV-host-cell interactions.
Affiliation(s)
- Clara Isabel Bermudez-Santana
  - Computational and theoretical RNomics Group, Center of Excellence in Scientific Computing, Universidad Nacional de Colombia, Bogotá 111321, Colombia
- Juan Carlos Gallego-Gómez
  - Grupo de Medicina de Traslación, Facultad de Medicina, Universidad de Antioquia, Medellín 050010, Colombia
3
Gutierrez-Diaz A, Hoffmann S, Gallego-Gómez JC, Bermudez-Santana CI. Systematic computational hunting for small RNAs derived from ncRNAs during dengue virus infection in endothelial HMEC-1 cells. Front Bioinform 2024; 4:1293412. [PMID: 38357577 PMCID: PMC10864640 DOI: 10.3389/fbinf.2024.1293412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 01/08/2024] [Indexed: 02/16/2024] Open
Abstract
In recent years, a population of small RNA fragments derived from non-coding RNAs (sfd-RNAs) has gained significant interest due to its functional and structural resemblance to miRNAs, adding another level of complexity to our comprehension of small-RNA-mediated gene regulation. Despite this, scientists need more tools to test the differential expression of sfd-RNAs since the current methods to detect miRNAs may not be directly applied to them. The primary reasons are the lack of accurate small RNA and ncRNA annotation, the multi-mapping read (MMR) placement, and the multicopy nature of ncRNAs in the human genome. To solve these issues, a methodology that allows the detection of differentially expressed sfd-RNAs, including canonical miRNAs, by using an integrated copy-number-corrected ncRNA annotation was implemented. This approach was coupled with sixteen different computational strategies composed of combinations of four aligners and four normalization methods to provide a rank-order of prediction for each differentially expressed sfd-RNA. By systematically addressing the three main problems, we could detect differentially expressed miRNAs and sfd-RNAs in dengue virus-infected human dermal microvascular endothelial cells. Although more biological evaluations are required, two molecular targets of the hsa-mir-103a and hsa-mir-494 (CDK5 and PI3/AKT) appear relevant for dengue virus (DENV) infections. Here, we performed a comprehensive annotation and differential expression analysis, which can be applied in other studies addressing the role of small fragment RNA populations derived from ncRNAs in virus infection.
Affiliation(s)
- Aimer Gutierrez-Diaz
  - Grupo Rnomica Teórica y Computacional, Departamento de Biología, Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
- Steve Hoffmann
  - Faculty of Biosciences, Leibniz Institute on Aging—Fritz Lipmann Institute (FLI), Friedrich Schiller University Jena, Jena, Germany
- Juan Carlos Gallego-Gómez
  - Molecular and Translational Medicine Group, Medicine Faculty Universidad de Antioquia, Medellin, Colombia
- Clara Isabel Bermudez-Santana
  - Grupo Rnomica Teórica y Computacional, Departamento de Biología, Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
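
The abstract of entry 3 describes sixteen strategies built from four aligners and four normalization methods, used to rank each differentially expressed sfd-RNA. A minimal sketch of that idea follows. The aligner and normalizer names are hypothetical placeholders (the abstract does not name the tools), and ranking by the number of supporting strategies is an assumption made here for illustration, not necessarily the authors' scoring rule.

```python
from collections import Counter
from itertools import product

# Hypothetical placeholder names; the abstract does not name the actual tools.
ALIGNERS = ["aligner_1", "aligner_2", "aligner_3", "aligner_4"]
NORMALIZERS = ["norm_1", "norm_2", "norm_3", "norm_4"]

# Every (aligner, normalizer) pair: 4 x 4 = 16 strategies.
STRATEGIES = list(product(ALIGNERS, NORMALIZERS))


def rank_by_support(calls):
    """Rank features by how many strategies report them as differentially expressed.

    `calls` maps each (aligner, normalizer) strategy to the set of feature IDs
    it reports. Ranking by support count is an illustrative assumption.
    Ties are broken alphabetically so the order is deterministic.
    """
    support = Counter()
    for hits in calls.values():
        support.update(hits)
    return sorted(support.items(), key=lambda item: (-item[1], item[0]))
```

A feature reported by all sixteen strategies would rank first under this scheme, and one reported by a single strategy would rank last.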
4
Chekunova AI, Sorokina SY, Sivoplyas EA, Bakhtoyarov GN, Proshakov PA, Fokin AV, Melnikov AI, Kulikov AM. Episodes of Rapid Recovery of the Functional Activity of the ras85D Gene in the Evolutionary History of Phylogenetically Distant Drosophila Species. Front Genet 2022; 12:807234. [PMID: 35096018 PMCID: PMC8790561 DOI: 10.3389/fgene.2021.807234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2021] [Accepted: 12/21/2021] [Indexed: 11/13/2022] Open
Abstract
As assemblies of genomes of new species with varying degrees of relationship appear, it becomes obvious that structural rearrangements of the genome, such as inversions, translocations, and transposon movements, are an essential and often the main source of evolutionary variation. In this regard, the following questions arise. How conserved are the regulatory regions of genes? Do they have a common evolutionary origin? And how and at what rate is the functional activity of genes restored during structural changes in the promoter region? In this article, we analyze the evolutionary history of the formation of the regulatory region of the ras85D gene in different lineages of the genus Drosophila, as well as the participation of mobile elements in structural rearrangements and in the replacement of specific areas of the promoter region with those of independent evolutionary origin. In the process, we substantiate hypotheses about the selection of promoter elements from a number of frequently repeated motifs with different degrees of degeneracy in the ancestral sequence, as well as about the restoration of the minimum required set of regulatory sequences using a conversion mechanism or similar.
Affiliation(s)
- A I Chekunova
  - Evolutionary Genetics of Development, N.K. Koltzov Institute of Developmental Biology of the Russian Academy of Sciences, Moscow, Russia
- S Yu Sorokina
  - Evolutionary Genetics of Development, N.K. Koltzov Institute of Developmental Biology of the Russian Academy of Sciences, Moscow, Russia
- E A Sivoplyas
  - Department of Biochemistry, Molecular Biology and Genetics, Institute of Biology and Chemistry of Moscow Pedagogical State University (MPGU), Moscow, Russia
- G N Bakhtoyarov
  - Laboratory of Genetics of DNA Containing Viruses, Federal State Budgetary Scientific Institution «I. Mechnikov Research Institute of Vaccines and Sera», Moscow, Russia
- P A Proshakov
  - Evolutionary Genetics of Development, N.K. Koltzov Institute of Developmental Biology of the Russian Academy of Sciences, Moscow, Russia
- A V Fokin
  - Evolutionary Genetics of Development, N.K. Koltzov Institute of Developmental Biology of the Russian Academy of Sciences, Moscow, Russia
- A I Melnikov
  - Evolutionary Genetics of Development, N.K. Koltzov Institute of Developmental Biology of the Russian Academy of Sciences, Moscow, Russia
- A M Kulikov
  - Evolutionary Genetics of Development, N.K. Koltzov Institute of Developmental Biology of the Russian Academy of Sciences, Moscow, Russia
5
Wint R, Salamov A, Grigoriev IV. Kingdom-Wide Analysis of Fungal Transcriptomes and tRNAs Reveals Conserved Patterns of Adaptive Evolution. Mol Biol Evol 2022; 39:6513383. [PMID: 35060603 PMCID: PMC8826637 DOI: 10.1093/molbev/msab372] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/24/2022] Open
Abstract
Protein-coding genes evolved codon usage bias due to the combined but uneven effects of adaptive and nonadaptive influences. Studies in model fungi agree on codon usage bias as an adaptation for fine-tuning gene expression levels; however, such knowledge is lacking for most other fungi. Our comparative genomics analysis of over 450 species supports codon usage and transfer RNAs (tRNAs) as coadapted for translation speed and this is most likely a realization of convergent evolution. Rather than drift, phylogenetic reconstruction inferred adaptive radiation as the best explanation for the variation of interspecific codon usage bias. Although the phylogenetic signals for individual codon and tRNAs frequencies are lower than expected by genetic drift, we found remarkable conservation of highly expressed genes being codon optimized for translation by the most abundant tRNAs, especially by inosine-modified tRNAs. As an application, we present a sequence-to-expression neural network that uses codons to reliably predict highly expressed transcripts. The kingdom Fungi, with over a million species, includes many key players in various ecosystems and good targets for biotechnology. Collectively, our results have implications for better understanding the evolutionary success of fungi, as well as informing the biosynthetic manipulation of fungal genes.
Affiliation(s)
- Rhondene Wint
  - Molecular and Cell Biology Unit, Quantitative and Systems Biology Program, University of California Merced, Merced, CA, 95343, USA
  - US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Asaf Salamov
  - US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Igor V Grigoriev
  - US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
  - Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, 94720, USA
6
Evolution and Phylogeny of MicroRNAs - Protocols, Pitfalls, and Problems. Methods Mol Biol 2021. [PMID: 34432281 DOI: 10.1007/978-1-0716-1170-8_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/17/2023]
Abstract
MicroRNAs are important regulators in many eukaryotic lineages. Typical miRNAs have a length of about 22nt and are processed from precursors that form a characteristic hairpin structure. Once they appear in a genome, miRNAs are among the best-conserved elements in both animal and plant genomes. Functionally, they play an important role in particular in development. In contrast to protein-coding genes, miRNAs frequently emerge de novo. The genomes of animals and plants harbor hundreds of mutually unrelated families of homologous miRNAs that tend to be persistent throughout evolution. The evolution of their genomic miRNA complement closely correlates with important morphological innovation. In addition, miRNAs have been used as valuable characters in phylogenetic studies. An accurate and comprehensive annotation of miRNAs is required as a basis to understand their impact on phenotypic evolution. Since experimental data on miRNA expression are limited to relatively few species and are subject to unavoidable ascertainment biases, it is inevitable to complement miRNA sequencing by homology based annotation methods. This chapter reviews the state of the art workflows for homology based miRNA annotation, with an emphasis on their limitations and open problems.
7
Chakraborty M, Chang CH, Khost DE, Vedanayagam J, Adrion JR, Liao Y, Montooth KL, Meiklejohn CD, Larracuente AM, Emerson JJ. Evolution of genome structure in the Drosophila simulans species complex. Genome Res 2021; 31:380-396. [PMID: 33563718 PMCID: PMC7919458 DOI: 10.1101/gr.263442.120] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 03/12/2020] [Accepted: 12/28/2020] [Indexed: 12/25/2022]
Abstract
The rapid evolution of repetitive DNA sequences, including satellite DNA, tandem duplications, and transposable elements, underlies phenotypic evolution and contributes to hybrid incompatibilities between species. However, repetitive genomic regions are fragmented and misassembled in most contemporary genome assemblies. We generated highly contiguous de novo reference genomes for the Drosophila simulans species complex (D. simulans, D. mauritiana, and D. sechellia), which speciated ∼250,000 yr ago. Our assemblies are comparable in contiguity and accuracy to the current D. melanogaster genome, allowing us to directly compare repetitive sequences between these four species. We find that at least 15% of the D. simulans complex species genomes fail to align uniquely to D. melanogaster owing to structural divergence, twice the number of single-nucleotide substitutions. We also find rapid turnover of satellite DNA and extensive structural divergence in heterochromatic regions, whereas the euchromatic gene content is mostly conserved. Despite the overall preservation of gene synteny, euchromatin in each species has been shaped by clade- and species-specific inversions, transposable elements, expansions and contractions of satellite and tRNA tandem arrays, and gene duplications. We also find rapid divergence among Y-linked genes, including copy number variation and recent gene duplications from autosomes. Our assemblies provide a valuable resource for studying genome evolution and its consequences for phenotypic evolution in these genetic model species.
Affiliation(s)
- Mahul Chakraborty
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California 92697, USA
- Ching-Ho Chang
  - Department of Biology, University of Rochester, Rochester, New York 14627, USA
- Danielle E Khost
  - Department of Biology, University of Rochester, Rochester, New York 14627, USA
  - FAS Informatics and Scientific Applications, Harvard University, Cambridge, Massachusetts 02138, USA
- Jeffrey Vedanayagam
  - Department of Developmental Biology, Memorial Sloan-Kettering Cancer Center, New York, New York 10065, USA
- Jeffrey R Adrion
  - Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon 97403, USA
- Yi Liao
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California 92697, USA
- Kristi L Montooth
  - School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, Nebraska 68502, USA
- Colin D Meiklejohn
  - School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, Nebraska 68502, USA
- J J Emerson
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California 92697, USA
8
Velandia-Huerto CA, Fallmann J, Stadler PF. miRNAture-Computational Detection of microRNA Candidates. Genes (Basel) 2021; 12:348. [PMID: 33673400 PMCID: PMC7996739 DOI: 10.3390/genes12030348] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/01/2021] [Revised: 02/19/2021] [Accepted: 02/20/2021] [Indexed: 12/16/2022] Open
Abstract
Homology-based annotation of short RNAs, including microRNAs, is a difficult problem because their inherently small size limits the available information. Highly sensitive methods, including parameter-optimized blast, nhmmer, or cmsearch runs designed to increase sensitivity, inevitably lead to large numbers of false positives, which can be detected only by detailed analysis of specific features typical for an RNA family and/or the analysis of conservation patterns in structure-annotated multiple sequence alignments. The miRNAture pipeline implements a workflow specific to animal microRNAs that automates homology search and validation steps. The miRNAture pipeline yields very good results for a large number of "typical" miRBase families. However, it also highlights difficulties with atypical cases, in particular microRNAs deriving from repetitive elements and microRNAs with unusual, branched precursor structures and atypical locations of the mature product, which require specific curation by domain experts.
Affiliation(s)
- Cristian A. Velandia-Huerto
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, D-04107 Leipzig, Germany
- Jörg Fallmann
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, D-04107 Leipzig, Germany
- Peter F. Stadler
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, D-04107 Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany
  - Institute for Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria
  - Facultad de Ciencias, Universidad Nacional de Colombia, CO-111321 Bogotá, Colombia
  - Santa Fe Institute, Santa Fe, NM 87501, USA
9
Phillips JB, Ardell DH. Structural and Genetic Determinants of Convergence in the Drosophila tRNA Structure-Function Map. J Mol Evol 2021; 89:103-116. [PMID: 33528599 PMCID: PMC7884595 DOI: 10.1007/s00239-021-09995-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2020] [Accepted: 01/11/2021] [Indexed: 10/29/2022]
Abstract
The evolution of tRNA multigene families remains poorly understood, exhibiting unusual phenomena such as functional conversions of tRNA genes through anticodon shift substitutions. We improved FlyBase tRNA gene annotations from twelve Drosophila species, incorporating previously identified ortholog sets to compare substitution rates across tRNA bodies at single-site and base-pair resolution. All rapidly evolving sites fell within the same metal ion-binding pocket that lies at the interface of the two major stacked helical domains. We applied our tRNA Structure-Function Mapper (tSFM) method independently to each Drosophila species and one outgroup species Musca domestica and found that, although predicted tRNA structure-function maps are generally highly conserved in flies, one tRNA Class-Informative Feature (CIF) within the rapidly evolving ion-binding pocket, Cytosine 17 (C17), ancestrally informative for lysylation identity, independently gained asparaginylation identity and substituted in parallel across tRNA-Asn paralogs at least once, possibly multiple times, during evolution of the genus. In D. melanogaster, most tRNA-Lys and tRNA-Asn genes are co-arrayed in one large heterologous gene cluster, suggesting that heterologous gene conversion as well as structural similarities of tRNA-binding interfaces in the closely related asparaginyl-tRNA synthetase (AsnRS) and lysyl-tRNA synthetase (LysRS) proteins may have played a role in these changes. A previously identified Asn-to-Lys anticodon shift substitution in D. ananassae may have arisen to compensate for the convergent and parallel gains of C17 in tRNA-Asn paralogs in that lineage. Our results underscore the functional and evolutionary relevance of our tRNA structure-function map predictions and illuminate multiple genomic and structural factors contributing to rapid, parallel and compensatory evolution of tRNA multigene families.
Affiliation(s)
- Julie Baker Phillips
  - Quantitative and Systems Biology Program, University of California, Merced, CA, 95343, USA
  - Department of Biology, Cumberland University, 1 Cumberland Square, Lebanon, TN, 37087, USA
- David H Ardell
  - Quantitative and Systems Biology Program, University of California, Merced, CA, 95343, USA
  - Department of Molecular and Cell Biology, University of California, Merced, CA, 95343, USA
10
Duncan GA, Dunigan DD, Van Etten JL. Diversity of tRNA Clusters in the Chloroviruses. Viruses 2020; 12:v12101173. [PMID: 33081353 PMCID: PMC7589089 DOI: 10.3390/v12101173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2020] [Revised: 10/12/2020] [Accepted: 10/12/2020] [Indexed: 11/25/2022] Open
Abstract
Viruses rely on their host’s translation machinery for the synthesis of their own proteins. Problems belie viral translation when the host has a codon usage bias (CUB) that is different from an infecting virus due to differences in the GC content between the host and virus genomes. Here, we examine the hypothesis that chloroviruses adapted to host CUB by acquisition and selection of tRNAs that at least partially favor their own CUB. The genomes of 41 chloroviruses comprising three clades, each infecting a different algal host, have been sequenced, assembled and annotated. All 41 viruses not only encode tRNAs, but their tRNA genes are located in clusters. While differences were observed between clades and even within clades, seven tRNA genes were common to all three clades of chloroviruses, including the tRNA-Arg gene, which was found in all 41 chloroviruses. By comparing the codon usage of one chlorovirus algal host, in which the genome has been sequenced and annotated (67% GC content), to that of two of its viruses (40% GC content), we found that the viruses were able to at least partially overcome the host’s CUB by encoding tRNAs that recognize AU-rich codons. Evidence presented herein supports the hypothesis that a chlorovirus tRNA cluster was present in the most recent common ancestor (MRCA) prior to divergence into three clades. In addition, the MRCA encoded a putative isoleucine lysidine synthase (TilS) that remains in 39/41 chloroviruses examined herein, suggesting a strong evolutionary pressure to retain the gene. TilS alters the anticodon of tRNA-Met that normally recognizes AUG to then recognize AUA, a codon for isoleucine. This is advantageous to the chloroviruses because the AUA codon is 12–13 times more common in the chloroviruses than their host, further helping the chloroviruses to overcome CUB. Among large DNA viruses infecting eukaryotes, the presence of tRNA genes and tRNA clusters appears to be most common in the Phycodnaviridae and, to a lesser extent, in the Mimiviridae.
Affiliation(s)
- Garry A. Duncan
  - Nebraska Center for Virology, University of Nebraska-Lincoln, Lincoln, NE 68583-0900, USA
- David D. Dunigan
  - Nebraska Center for Virology, University of Nebraska-Lincoln, Lincoln, NE 68583-0900, USA
  - Department of Plant Pathology, University of Nebraska-Lincoln, Lincoln, NE 68583-0833, USA
- James L. Van Etten
  - Nebraska Center for Virology, University of Nebraska-Lincoln, Lincoln, NE 68583-0900, USA
  - Department of Plant Pathology, University of Nebraska-Lincoln, Lincoln, NE 68583-0833, USA
  - Correspondence: Tel.: +1-402-472-3168
11
Balogh G, Bernhart SH, Stadler PF, Schor J. A probabilistic version of Sankoff's maximum parsimony algorithm. J Bioinform Comput Biol 2020; 18:2050004. [PMID: 32336248 DOI: 10.1142/s0219720020500043] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/30/2022]
Abstract
The number of genes belonging to a multi-gene family usually varies substantially over their evolutionary history as a consequence of gene duplications and losses. A first step toward analyzing these histories in detail is the inference of the changes in copy number that take place along the individual edges of the underlying phylogenetic tree. The corresponding maximum parsimony minimizes the total number of changes along the edges of the species tree. Incorrectly determined numbers of family members, however, may influence the estimates drastically. We therefore augment the analysis by introducing a probabilistic model that also considers suboptimal assignments of changes. Technically, this amounts to a partition function variant of Sankoff's parsimony algorithm. As a showcase application, we reanalyze the gain and loss patterns of metazoan microRNA families. As expected, the differences between the probabilistic and the parsimony method are moderate; in the limit of T→0, i.e. very little tolerance for deviations from parsimony, the total number of reconstructed changes is the same. However, we find that the partition function approach systematically predicts fewer gains and more loss events, showing that the data admit co-optimal solutions among which the parsimony approach selects biased representatives.
Affiliation(s)
- Gábor Balogh
  - Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
- Stephan H Bernhart
  - Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Competence Center for Scalable Data Services and Solutions, Leipzig Research Center for Civilization Diseases (LIFE), University Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
  - Max-Planck-Institute for Mathematics in Sciences, Inselstraße 22, D-04109 Leipzig, Germany
  - Department of Theoretical Chemistry of the University of Vienna, Währingerstrasse 17, A-1090 Vienna, Austria
  - Facultad de Ciencias, Universidad Nacional de Colombia, Sede Bogotá, Ciudad Universitaria, COL-111321, Bogotá, D.C., Colombia
  - Santa Fe Institute, 1399 Hyde Park Road, Santa Fe NM 87501, USA
- Jana Schor
  - Young Investigators Group Bioinformatics and Transcriptomics, Department of Molecular Systems Biology, Helmholtz Centre for Environmental Research - UFZ, Permoserstraße 15, D-04318 Leipzig, Germany
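
The abstract of entry 11 contrasts Sankoff's parsimony recursion with a partition function variant. The sketch below illustrates the two recursions on copy numbers under stated assumptions: the cost of changing from `a` to `b` copies is taken as `|a - b|`, the tree is given as a plain dictionary, and the function name `sankoff` and its parameters are illustrative. It is not the authors' implementation.

```python
import math


def sankoff(tree, leaves, root, max_copies, temperature=None):
    """Bottom-up table of per-state scores at `root`.

    tree:   dict mapping each internal node to a list of its children
    leaves: dict mapping each leaf to its observed copy number
    With temperature=None the table holds minimum parsimony costs (min-plus
    recursion). With a positive temperature it holds partition function
    values (sum-product recursion with Boltzmann weights exp(-cost / T)).
    The cost |a - b| between copy numbers is an assumption of this sketch.
    """
    states = range(max_copies + 1)

    def cost(a, b):
        return abs(a - b)

    def visit(node):
        if node in leaves:
            if temperature is None:
                return [0.0 if s == leaves[node] else math.inf for s in states]
            return [1.0 if s == leaves[node] else 0.0 for s in states]
        child_tables = [visit(child) for child in tree[node]]
        table = []
        for s in states:
            if temperature is None:
                # Parsimony: best child state for each child, costs added up.
                table.append(sum(min(cost(s, t) + ct[t] for t in states)
                                 for ct in child_tables))
            else:
                # Partition function: sum over child states, product over children.
                value = 1.0
                for ct in child_tables:
                    value *= sum(math.exp(-cost(s, t) / temperature) * ct[t]
                                 for t in states)
                table.append(value)
        return table

    return visit(root)
```

For small temperatures, `-T * log(sum(Z))` over the root table approaches the parsimony score up to a term of order `T` that counts co-optimal solutions, which mirrors the T→0 limit mentioned in the abstract.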
12
Romanova EV, Bukin YS, Mikhailov KV, Logacheva MD, Aleoshin VV, Sherbakov DY. Hidden cases of tRNA gene duplication and remolding in mitochondrial genomes of amphipods. Mol Phylogenet Evol 2019; 144:106710. [PMID: 31846708 DOI: 10.1016/j.ympev.2019.106710] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 06/14/2019] [Revised: 12/13/2019] [Accepted: 12/13/2019] [Indexed: 12/30/2022]
Abstract
The evolution of tRNA genes in mitochondrial (mt) genomes is a complex process that includes duplications, degenerations, and transpositions, as well as a specific process of identity change through mutations in the anticodon (tRNA gene remolding or tRNA gene recruitment). Using amphipod-specific tRNA models for annotation, we show that tRNA duplications are more common in the mt genomes of amphipods than what was revealed by previous annotations. Seventeen cases of tRNA gene duplications were detected in the mt genomes of amphipods, and ten of them were tRNA genes that underwent remolding. The additional tRNA gene findings were verified using phylogenetic analysis and genetic distance analysis. The majority of remolded tRNA genes (seven out of ten cases) were found in the mt genomes of endemic amphipod species from Lake Baikal. All additional mt tRNA genes arose independently in the Baikalian amphipods, indicating the unusual plasticity of tRNA gene evolution in these species assemblages. The possible reasons for the unusual abundance of additional tRNA genes in the mt genomes of Baikalian amphipods are discussed. The amphipod-specific tRNA models developed for MiTFi refine existing predictions of tRNA genes in amphipods and reveal additional cases of duplicated tRNA genes overlooked by using less specific Metazoa-wide models. The application of these models for mt tRNA gene prediction will be useful for the correct annotation of mt genomes of amphipods and probably other crustaceans.
Affiliation(s)
- Elena V Romanova
  - Laboratory of Molecular Systematics, Limnological Institute, Irkutsk, Russian Federation
- Yurij S Bukin
  - Laboratory of Molecular Systematics, Limnological Institute, Irkutsk, Russian Federation; Faculty of Biology and Soil Studies, Irkutsk State University, Irkutsk, Russian Federation
- Kirill V Mikhailov
  - Belozersky Institute for Physicochemical Biology, Lomonosov Moscow State University, Moscow, Russian Federation; Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russian Federation
- Maria D Logacheva
  - Belozersky Institute for Physicochemical Biology, Lomonosov Moscow State University, Moscow, Russian Federation; Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russian Federation
- Vladimir V Aleoshin
  - Belozersky Institute for Physicochemical Biology, Lomonosov Moscow State University, Moscow, Russian Federation; Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russian Federation
- Dmitry Yu Sherbakov
  - Laboratory of Molecular Systematics, Limnological Institute, Irkutsk, Russian Federation; Faculty of Biology and Soil Studies, Irkutsk State University, Irkutsk, Russian Federation
|
13
|
Walter Costa MB, Höner zu Siederdissen C, Dunjić M, Stadler PF, Nowick K. SSS-test: a novel test for detecting positive selection on RNA secondary structure. BMC Bioinformatics 2019; 20:151. [PMID: 30898084 PMCID: PMC6429701 DOI: 10.1186/s12859-019-2711-y] [Received: 11/19/2018] [Accepted: 03/03/2019] [Indexed: 12/23/2022]
Abstract
BACKGROUND: Long non-coding RNAs (lncRNAs) play an important role in regulating gene expression and are thus important for determining phenotypes. Most attempts to measure selection in lncRNAs have focused on the primary sequence. The majority of small RNAs and at least some parts of lncRNAs must fold into specific structures to perform their biological function. Comprehensive assessments of selection acting on RNAs must therefore also encompass structure. Selection pressures acting on the structure of non-coding genes can be detected within multiple sequence alignments. Approaches of this type, however, have so far focused on negative selection. Thus, a computational method for identifying ncRNAs under positive selection is needed.
RESULTS: We introduce the SSS-test (test for Selection on Secondary Structure) to identify positive selection and thus adaptive evolution. Benchmarks with biological as well as synthetic controls yield coherent signals for both negative and positive selection, demonstrating the functionality of the test. A survey of a lncRNA collection comprising 15,443 families resulted in 110 candidates that appear to be under positive selection in human. In 26 lncRNAs that have been associated with psychiatric disorders we identified local structures that show signs of positive selection in the human lineage.
CONCLUSIONS: It is feasible to assay positive selection acting on RNA secondary structures on a genome-wide scale. The detection of human-specific positive selection in lncRNAs associated with cognitive disorder provides a set of candidate genes for further experimental testing and may provide insights into the evolution of cognitive abilities in humans.
AVAILABILITY: The SSS-test and related software are available at https://github.com/waltercostamb/SSS-test. The databases used in this work are available at http://www.bioinf.uni-leipzig.de/Software/SSS-test/.
Affiliation(s)
- Maria Beatriz Walter Costa
  - Embrapa Agroenergia, Parque Estação Biológica (PqEB), Asa Norte, Brasília, DF, 70770-901 Brazil
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, Leipzig, 04107 Germany
- Christian Höner zu Siederdissen
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, Leipzig, 04107 Germany
- Marko Dunjić
  - Human Biology Group, Institute for Biology, Department of Biology, Chemistry, Pharmacy, Freie Universitaet Berlin, Königin-Luise-Straße 1-3, Berlin, 14195 Germany
  - Center for Human Molecular Genetics, Faculty of Biology, University of Belgrade, Studentski trg 16, PO box 43, Belgrade, 11000 Serbia
- Peter F. Stadler
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, Leipzig, 04107 Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig & Competence Center for Scalable Data Services and Solutions Dresden-Leipzig & Leipzig Research Center for Civilization Diseases, University Leipzig, Leipzig, 04107 Germany
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, Leipzig, 04103 Germany
  - Department of Theoretical Chemistry, University of Vienna, Währinger Straße 17, Vienna, A-1090 Austria
  - Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg C, DK-1870 Denmark
  - Facultad de Ciencias, Universidad Nacional de Colombia, Sede Bogotá, Ciudad Universitaria, Bogotá, D.C., COL-111321 Colombia
  - Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
- Katja Nowick
  - Human Biology Group, Institute for Biology, Department of Biology, Chemistry, Pharmacy, Freie Universitaet Berlin, Königin-Luise-Straße 1-3, Berlin, 14195 Germany
  - TFome Research Group, Bioinformatics Group, Interdisciplinary Center of Bioinformatics, Department of Computer Science, University of Leipzig, Härtelstraße 16-18, Leipzig, 04107 Germany
  - Paul-Flechsig-Institute for Brain Research, University of Leipzig, Liebigstraße 19. Haus C, Leipzig, 04103 Germany
  - Bioinformatics, Faculty of Agricultural Sciences, Institute of Animal Science, University of Hohenheim, Garbenstraße 13, Stuttgart, 70593 Germany
|
14
|
Hoffmann A, Fallmann J, Vilardo E, Mörl M, Stadler PF, Amman F. Accurate mapping of tRNA reads. Bioinformatics 2019; 34:1116-1124. [PMID: 29228294 DOI: 10.1093/bioinformatics/btx756] [Received: 06/07/2017] [Accepted: 12/07/2017] [Indexed: 11/12/2022]
Abstract
Motivation: Many repetitive DNA elements are transcribed at appreciable expression levels. Mapping the corresponding RNA sequencing reads back to a reference genome is, however, a notoriously difficult and error-prone task. This is particularly true if chemical modifications introduce systematic mismatches while the genomic loci are only approximately identical, as in the case of tRNAs.
Results: We therefore developed a dedicated mapping strategy to handle RNA-seq reads that map to tRNAs, relying on a modified target genome in which known tRNA loci are masked and intronless tRNA precursor sequences are instead appended as artificial 'chromosomes'. In a first pass, reads that overlap the boundaries of mature tRNAs are extracted. In the second pass, the remaining reads are mapped to a tRNA-masked target that is augmented by representative mature tRNA sequences. Using both simulated and real-life data, we show that our best-practice workflow removes most of the mapping artefacts introduced by simpler mapping schemes and makes it possible to reliably identify many chemical tRNA modifications in generic small RNA-seq data. On simulated data, the FDR is only 2%. We find compelling evidence for tissue-specific differences in tRNA modification patterns.
Availability and implementation: The workflow is available both as a bash script and as a Galaxy workflow from https://github.com/AnneHoffmann/tRNA-read-mapping.
Contact: fabian@tbi.univie.ac.at.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Anne Hoffmann
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, D-04107 Leipzig, Germany
- Jörg Fallmann
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, D-04107 Leipzig, Germany
- Elisa Vilardo
  - Center for Anatomy and Cell Biology, Medical University of Vienna, Austria
- Mario Mörl
  - Institute for Biochemistry, Leipzig University, D-04103 Leipzig, Germany
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, D-04107 Leipzig, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Competence Center for Scalable Data Services and Solutions, and Leipzig Research Center for Civilization Diseases, Leipzig University, D-04107 Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany
  - Fraunhofer Institute for Cell Therapy and Immunology, D-04103 Leipzig, Germany
  - Center for RNA in Technology and Health, University of Copenhagen, Frederiksberg C, Denmark
  - Santa Fe Institute, Santa Fe, NM 87501, USA
  - Department of Theoretical Chemistry of the University of Vienna, A-1090 Vienna, Austria
- Fabian Amman
  - Department of Theoretical Chemistry of the University of Vienna, A-1090 Vienna, Austria
  - Department of Chromosome Biology of the University of Vienna, A-1030 Vienna, Austria
|
15
|
Branciamore S, Gogoshin G, Di Giulio M, Rodin AS. Intrinsic Properties of tRNA Molecules as Deciphered via Bayesian Network and Distribution Divergence Analysis. Life (Basel) 2018; 8:life8010005. [PMID: 29419741 PMCID: PMC5871937 DOI: 10.3390/life8010005] [Received: 12/09/2017] [Revised: 01/22/2018] [Accepted: 01/23/2018] [Indexed: 12/27/2022]
Abstract
The identity/recognition of tRNAs, in the context of aminoacyl tRNA synthetases (and other molecules), is a complex phenomenon that has major implications ranging from the origins and evolution of translation machinery and genetic code to the evolution and speciation of tRNAs themselves to human mitochondrial diseases to artificial genetic code engineering. Deciphering it via laboratory experiments, however, is difficult and necessarily time- and resource-consuming. In this study, we propose a mathematically rigorous two-pronged in silico approach to identifying and classifying tRNA positions important for tRNA identity/recognition, rooted in machine learning and information-theoretic methodology. We apply Bayesian Network modeling to elucidate the structure of intra-tRNA-molecule relationships, and distribution divergence analysis to identify meaningful inter-molecule differences between various tRNA subclasses. We illustrate the complementary application of these two approaches using tRNA examples across the three domains of life, and identify and discuss important (informative) positions therein. In summary, we deliver to the tRNA research community a novel, comprehensive methodology for identifying the specific elements of interest in various tRNA molecules, which can be followed up by the corresponding experimental work and/or high-resolution position-specific statistical analyses.
Affiliation(s)
- Sergio Branciamore
  - Department of Diabetes Complications and Metabolism, Diabetes and Metabolism Research Institute, City of Hope, Duarte, 91010 CA, USA
- Grigoriy Gogoshin
  - Department of Diabetes Complications and Metabolism, Diabetes and Metabolism Research Institute, City of Hope, Duarte, 91010 CA, USA
- Massimo Di Giulio
  - Early Evolution of Life Laboratory, Institute of Biosciences and Bioresources, CNR, 80131 Naples, Italy
- Andrei S Rodin
  - Department of Diabetes Complications and Metabolism, Diabetes and Metabolism Research Institute, City of Hope, Duarte, 91010 CA, USA
|
16
|
SMORE: Synteny Modulator of Repetitive Elements. Life (Basel) 2017; 7:life7040042. [PMID: 29088079 PMCID: PMC5745555 DOI: 10.3390/life7040042] [Received: 08/28/2017] [Revised: 10/27/2017] [Accepted: 10/28/2017] [Indexed: 12/19/2022]
Abstract
Several families of multicopy genes, such as transfer ribonucleic acids (tRNAs) and ribosomal RNAs (rRNAs), are subject to concerted evolution, an effect that keeps sequences of paralogous genes effectively identical. Under these circumstances, it is impossible to distinguish orthologs from paralogs on the basis of sequence similarity alone. Synteny, the preservation of relative genomic locations, however, remains informative for disambiguating evolutionary relationships in this situation. In this contribution, we describe an automatic pipeline for the evolutionary analysis of such cases that uses genome-wide alignments as a starting point to assign orthology relationships determined by synteny. The evolution of tRNAs in primates and the history of the Y RNA family in vertebrates and nematodes are used to showcase the method. The pipeline is freely available.
|