1
Umu SU, Paynter VM, Trondsen H, Buschmann T, Rounge TB, Peterson KJ, Fromm B. Accurate microRNA annotation of animal genomes using trained covariance models of curated microRNA complements in MirMachine. Cell Genomics 2023; 3:100348. [PMID: 37601971 PMCID: PMC10435380 DOI: 10.1016/j.xgen.2023.100348] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/06/2022] [Revised: 03/15/2023] [Accepted: 05/26/2023] [Indexed: 08/22/2023]
Abstract
The annotation of microRNAs depends on the availability of transcriptomics data and expert knowledge. This has led to a gap between the availability of novel genomes and high-quality microRNA complements. Using >16,000 microRNAs from the manually curated microRNA gene database MirGeneDB, we generated trained covariance models for all conserved microRNA families. These models are available in our tool MirMachine, which annotates conserved microRNAs within genomes. We successfully applied MirMachine to a range of animal species, including those with large genomes and genome duplications and extinct species, where small RNA sequencing is hard to achieve. We further describe a microRNA score of expected microRNAs that can be used to assess the completeness of genome assemblies. MirMachine closes a long-persisting gap in the microRNA field by facilitating automated genome annotation pipelines and deeper studies into the evolution of genome regulation, even in extinct organisms.
Affiliation(s)
- Sinan Uğur Umu
- Department of Pathology, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Vanessa M. Paynter
- The Arctic University Museum of Norway, UiT - The Arctic University of Norway, Tromsø, Norway
- Håvard Trondsen
- Department of Pathology, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Trine B. Rounge
- Department of Research, Cancer Registry of Norway, Oslo, Norway
- Centre for Bioinformatics, Department of Pharmacy, University of Oslo, Oslo, Norway
- Kevin J. Peterson
- Department of Biological Sciences, Dartmouth College, Hanover, NH, USA
- Bastian Fromm
- The Arctic University Museum of Norway, UiT - The Arctic University of Norway, Tromsø, Norway
2
Fromm B, Høye E, Domanska D, Zhong X, Aparicio-Puerta E, Ovchinnikov V, Umu SU, Chabot PJ, Kang W, Aslanzadeh M, Tarbier M, Mármol-Sánchez E, Urgese G, Johansen M, Hovig E, Hackenberg M, Friedländer MR, Peterson KJ. MirGeneDB 2.1: toward a complete sampling of all major animal phyla. Nucleic Acids Res 2021; 50:D204-D210. [PMID: 34850127 PMCID: PMC8728216 DOI: 10.1093/nar/gkab1101] [Citation(s) in RCA: 90] [Impact Index Per Article: 22.5] [Received: 09/15/2021] [Revised: 10/20/2021] [Accepted: 11/23/2021] [Indexed: 12/03/2022]
Abstract
We describe an update of MirGeneDB, the manually curated microRNA gene database. Adhering to uniform and consistent criteria for microRNA annotation and nomenclature, we substantially expanded MirGeneDB with 30 additional species representing previously missing metazoan phyla such as sponges, jellyfish, rotifers and flatworms. MirGeneDB 2.1 now consists of 75 species spanning ∼800 million years of animal evolution, and contains a total of 16 670 microRNAs from 1549 families. Over 6000 microRNAs were added in this update using ∼550 datasets with ∼7.5 billion sequencing reads. By adding new phylogenetically important species, especially those relevant for the study of whole genome duplication events, and through updating evolutionary nodes of origin for many families and genes, we were able to substantially refine our nomenclature system. All changes are traceable in the specifically developed MirGeneDB version tracker. The performance of read-pages is improved and microRNA expression matrices for all tissues and species are now also downloadable. Altogether, this update represents a significant step toward a complete sampling of all major metazoan phyla, and a widely needed foundation for comparative microRNA genomics and transcriptomics studies. MirGeneDB 2.1 is part of RNAcentral and Elixir Norway, publicly and freely available at http://www.mirgenedb.org/.
Affiliation(s)
- Bastian Fromm
- The Arctic University Museum of Norway, UiT - The Arctic University of Norway, Tromsø, Norway; Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Eirik Høye
- Department of Tumor Biology, Institute for Cancer Research, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway; Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Diana Domanska
- Center for Bioinformatics, Department of Informatics, University of Oslo, Oslo, Norway; Department of Pathology, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Xiangfu Zhong
- Department of Biosciences and Nutrition, Karolinska Institute, Huddinge, Sweden
- Ernesto Aparicio-Puerta
- Department of Genetics, Faculty of Sciences, MNAT Excellence Unit, University of Granada, Granada, Spain; Biotechnology Institute, CIBM, Granada, Spain; Biohealth Research Institute (ibs.GRANADA), University Hospitals of Granada, University of Granada, Granada, Spain
- Vladimir Ovchinnikov
- Computational and Molecular Evolutionary Biology Research Group, School of Life Sciences, Faculty of Medicine and Health Sciences, University of Nottingham, Nottingham, UK
- Sinan U Umu
- Department of Research, Cancer Registry of Norway, Oslo, Norway
- Peter J Chabot
- Department of Biological Sciences, Dartmouth College, Hanover, USA
- Wenjing Kang
- Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden; Science for Life Laboratory, Department of Medical Biochemistry and Biophysics, Karolinska Institute, Solna, Sweden
- Morteza Aslanzadeh
- Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Marcel Tarbier
- Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden; Science for Life Laboratory, Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, Solna, Sweden
- Emilio Mármol-Sánchez
- Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden; Centre for Palaeogenetics, Stockholm, Sweden
- Morten Johansen
- Center for Bioinformatics, Department of Informatics, University of Oslo, Oslo, Norway
- Eivind Hovig
- Department of Tumor Biology, Institute for Cancer Research, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway; Center for Bioinformatics, Department of Informatics, University of Oslo, Oslo, Norway
- Michael Hackenberg
- Department of Genetics, Faculty of Sciences, MNAT Excellence Unit, University of Granada, Granada, Spain; Biotechnology Institute, CIBM, Granada, Spain; Biohealth Research Institute (ibs.GRANADA), University Hospitals of Granada, University of Granada, Granada, Spain
- Marc R Friedländer
- Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Kevin J Peterson
- Department of Biological Sciences, Dartmouth College, Hanover, USA
3
Patel VD, Capra JA. Ancient human miRNAs are more likely to have broad functions and disease associations than young miRNAs. BMC Genomics 2017; 18:672. [PMID: 28859623 PMCID: PMC5579935 DOI: 10.1186/s12864-017-4073-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 02/01/2017] [Accepted: 08/16/2017] [Indexed: 12/16/2022]
Abstract
Background: microRNAs (miRNAs) are essential to the regulation of gene expression in eukaryotes, and improper expression of miRNAs contributes to hundreds of diseases. Despite the essential functions of miRNAs, the evolutionary dynamics of how they are integrated into existing gene regulatory and functional networks are not well understood. Knowledge of the origin and evolutionary history of a gene has proven informative about its functions and disease associations; we hypothesize that incorporating the evolutionary origins of miRNAs into analyses will help resolve differences in their functional dynamics and how they influence disease.
Results: We computed the phylogenetic age of miRNAs across 146 species and quantified the relationship between human miRNA age and several functional attributes. Older miRNAs are significantly more likely to be associated with disease than younger miRNAs, and the number of associated diseases increases with age. As has been observed for genes, the miRNAs associated with different diseases have different age profiles. For example, human miRNAs implicated in cancer are enriched for origins near the dawn of animal multicellularity. Consistent with the increasing contribution of miRNAs to disease with age, older miRNAs target more genes than younger miRNAs, and older miRNAs are expressed in significantly more tissues. Furthermore, miRNAs of all ages exhibit a strong preference to target older genes; 93% of validated miRNA gene targets were in existence at the origin of the targeting miRNA. Finally, we find that human miRNAs in evolutionarily related families are more similar in their targets and expression profiles than unrelated miRNAs.
Conclusions: Considering the evolutionary origin and history of a miRNA provides useful context for the analysis of its function. Consistent with recent work in Drosophila, our results support a model in which miRNAs increase their expression and functional regulatory interactions over evolutionary time, and thus older miRNAs have increased potential to cause disease. We anticipate that these patterns hold across mammalian species; however, comprehensively evaluating them will require refining miRNA annotations across species and collecting functional data in non-human systems.
Affiliation(s)
- Vir D Patel
- Department of Biology, Duke University, Durham, NC, 27708, USA; Department of Biology, Western Kentucky University, Bowling Green, KY, 42101, USA
- John A Capra
- Departments of Biological Sciences, Biomedical Informatics, and Computer Science, Vanderbilt Genetics Institute, Center for Structural Biology, Vanderbilt University, Nashville, TN, 37232, USA