1
Hu H, Dong B, Fan X, Wang M, Wang T, Liu Q. Mutational Bias and Natural Selection Driving the Synonymous Codon Usage of Single-Exon Genes in Rice (Oryza sativa L.). RICE (NEW YORK, N.Y.) 2023; 16:11. [PMID: 36849744 PMCID: PMC9971424 DOI: 10.1186/s12284-023-00627-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/18/2022] [Accepted: 02/16/2023] [Indexed: 06/18/2023]
Abstract
The relative abundance of single-exon genes (SEGs) in higher plants is perplexing. Uncovering the synonymous codon usage pattern of SEGs will help clarify the evolutionary mechanisms underlying them in plants. Using internal correspondence analysis (ICA), we reveal a significant difference in synonymous codon usage between SEGs and multiple-exon genes (MEGs) in rice, but the effect is weak, accounting for only 2.61% of the total codon usage variability. SEGs and MEGs differ markedly in base composition and are under clearly different selective constraints, with SEGs having higher GC content and evolving relatively faster. Within the SEGs, the variability in synonymous codon usage among genes is partially due to variation in GC content, gene function, and gene expression level, which account for 22.03%, 5.99%, and 3.32% of the total codon usage variability, respectively. Therefore, mutational bias and natural selection both likely act on the synonymous codon usage of SEGs in rice. These findings may deepen our knowledge of the mechanisms of origination, differentiation and regulation of SEGs in plants.
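As a rough illustration of the quantities this abstract refers to, the sketch below computes GC content and relative synonymous codon usage (RSCU) for a single coding sequence. RSCU is a common codon-usage summary and is swapped in here for illustration only; it is not the internal correspondence analysis used by the authors. The function names and the toy sequence are hypothetical, and the standard genetic code is assumed.

```python
from collections import Counter, defaultdict

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def gc_content(cds):
    """Fraction of G and C bases in a non-empty DNA sequence."""
    cds = cds.upper()
    return (cds.count("G") + cds.count("C")) / len(cds)


def rscu(cds):
    """RSCU per codon: observed count divided by the count expected if all
    synonymous codons of that amino acid were used equally often."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons into synonymous families by amino acid.
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "*":
            families[amino_acid].append(codon)

    values = {}
    for codons in families.values():
        if len(codons) < 2:
            continue  # Met and Trp have no synonymous alternatives
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


# Hypothetical toy sequence: Met, Ala (GCC), Ala (GCC), Ala (GCT), Ala (GCA), stop.
toy_cds = "ATGGCCGCCGCTGCATAA"
toy_rscu = rscu(toy_cds)
toy_gc = gc_content(toy_cds)
```

An RSCU above 1 marks a codon used more often than expected under equal usage, and a value below 1 marks one used less often. Comparing such per-gene profiles between gene groups is the kind of input a multivariate codon-usage analysis typically starts from.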
Affiliation(s)
- Huan Hu
  - The Key Laboratory for Quality Improvement of Agricultural Products of Zhejiang Province, College of Advanced Agricultural Sciences, Zhejiang A & F University, Lin'an, Hangzhou, 311300, People's Republic of China
- Boran Dong
  - The Key Laboratory for Quality Improvement of Agricultural Products of Zhejiang Province, College of Advanced Agricultural Sciences, Zhejiang A & F University, Lin'an, Hangzhou, 311300, People's Republic of China
- Xiaoji Fan
  - The Key Laboratory of Microbial Technology and Bioinformatics of Zhejiang Province, Hangzhou, 310012, People's Republic of China
- Meixia Wang
  - The Key Laboratory of Microbial Technology and Bioinformatics of Zhejiang Province, Hangzhou, 310012, People's Republic of China
- Tingzhang Wang
  - The Key Laboratory of Microbial Technology and Bioinformatics of Zhejiang Province, Hangzhou, 310012, People's Republic of China
- Qingpo Liu
  - The Key Laboratory for Quality Improvement of Agricultural Products of Zhejiang Province, College of Advanced Agricultural Sciences, Zhejiang A & F University, Lin'an, Hangzhou, 311300, People's Republic of China
2
Rahman MM, Hossain MT, Reza MS, Peng Y, Feng S, Wei Y. Identification of Potential Long Non-Coding RNA Candidates that Contribute to Triple-Negative Breast Cancer in Humans through Computational Approach. Int J Mol Sci 2021; 22:12359. [PMID: 34830241 PMCID: PMC8619140 DOI: 10.3390/ijms222212359] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/06/2021] [Revised: 10/26/2021] [Accepted: 10/28/2021] [Indexed: 12/31/2022]
Abstract
Breast cancer (BC) is the most frequent malignancy identified in adult females, resulting in enormous financial losses worldwide. Owing to the heterogeneity as well as various molecular subtypes, the molecular pathways underlying carcinogenesis in various forms of BC are distinct. Therefore, the advancement of alternative therapy is required to combat the ailment. Recent analyses propose that long non-coding RNAs (lncRNAs) perform an essential function in controlling immune response, and therefore, may provide essential information about the disorder. However, their function in patients with triple-negative BC (TNBC) has not been explored in detail. Here, we analyzed the changes in the genomic expression of messenger RNA (mRNA) and lncRNA in standard control in response to cancer metastasis using publicly available single-cell RNA-Seq data. We identified a total of 197 potentially novel lncRNAs in TNBC patients of which 86 were differentially upregulated and 111 were differentially downregulated. In addition, among the 909 candidate lncRNA transcripts, 19 were significantly differentially expressed (DE) of which three were upregulated and 16 were downregulated. On the other hand, 1901 mRNA transcripts were significantly DE of which 1110 were upregulated and 791 were downregulated by TNBCs subtypes. The Gene Ontology (GO) analyses showed that some of the host genes were enriched in various biological, molecular, and cellular functions. The Kyoto encyclopedia of genes and genomes (KEGG) pathway analysis showed that some of the genes were involved in only one pathway of prostate cancer. The lncRNA-miRNA-gene network analysis showed that the lncRNAs TCONS_00076394 and TCONS_00051377 interacted with breast cancer-related micro RNAs (miRNAs) and the host genes of these lncRNAs were also functionally related to breast cancer. Thus, this study provides novel lncRNAs as potential biomarkers for the therapeutic intervention of this cancer subtype.
MESH Headings
- Biomarkers, Tumor/genetics
- Biomarkers, Tumor/metabolism
- Computational Biology/methods
- Female
- Gene Expression Profiling
- Gene Expression Regulation, Neoplastic
- Gene Ontology
- Gene Regulatory Networks
- Humans
- Mammary Glands, Human/metabolism
- Mammary Glands, Human/pathology
- MicroRNAs/classification
- MicroRNAs/genetics
- MicroRNAs/metabolism
- Molecular Sequence Annotation
- RNA, Long Noncoding/classification
- RNA, Long Noncoding/genetics
- RNA, Long Noncoding/metabolism
- RNA, Messenger/classification
- RNA, Messenger/genetics
- RNA, Messenger/metabolism
- RNA, Neoplasm/classification
- RNA, Neoplasm/genetics
- RNA, Neoplasm/metabolism
- Triple Negative Breast Neoplasms/diagnosis
- Triple Negative Breast Neoplasms/genetics
- Triple Negative Breast Neoplasms/metabolism
- Triple Negative Breast Neoplasms/pathology
Affiliation(s)
- Md. Motiar Rahman
  - Department of Biochemistry and Molecular Biology, University of Rajshahi, Rajshahi 6205, Bangladesh
  - Department of Chemistry, Binghamton University, State University of New York, Vestal, New York, NY 13902, USA
- Md. Tofazzal Hossain
  - University of Chinese Academy of Sciences, No.19(A) Yuquan Road, Shijingshan District, Beijing 100049, China
  - Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
  - Department of Statistics, Bangabandhu Sheikh Mujibur Rahaman Science and Technology University, Gopalganj 8100, Bangladesh
- Md. Selim Reza
  - University of Chinese Academy of Sciences, No.19(A) Yuquan Road, Shijingshan District, Beijing 100049, China
  - Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Yin Peng
  - Department of Pathology, The Shenzhen University School of Medicine, Shenzhen 518060, China
- Shengzhong Feng
  - Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Yanjie Wei
  - Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
3
Aviña-Padilla K, Ramírez-Rafael JA, Herrera-Oropeza GE, Muley VY, Valdivia DI, Díaz-Valenzuela E, García-García A, Varela-Echavarría A, Hernández-Rosales M. Evolutionary Perspective and Expression Analysis of Intronless Genes Highlight the Conservation of Their Regulatory Role. Front Genet 2021; 12:654256. [PMID: 34306008 PMCID: PMC8302217 DOI: 10.3389/fgene.2021.654256] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/15/2021] [Accepted: 06/01/2021] [Indexed: 11/13/2022]
Abstract
The structure of eukaryotic genes is generally a combination of exons interrupted by intragenic non-coding DNA regions (introns) removed by RNA splicing to generate the mature mRNA. A fraction of genes, however, comprise a single coding exon with introns in their untranslated regions or are intronless genes (IGs), lacking introns entirely. The latter code for essential proteins involved in development, growth, and cell proliferation and their expression has been proposed to be highly specialized for neuro-specific functions and linked to cancer, neuropathies, and developmental disorders. The abundant presence of introns in eukaryotic genomes is pivotal for the precise control of gene expression. Notwithstanding, IGs exempting splicing events entail a higher transcriptional fidelity, making them even more valuable for regulatory roles. This work aimed to infer the functional role and evolutionary history of IGs centered on the mouse genome. IGs consist of a subgroup of genes with one exon including coding genes, non-coding genes, and pseudogenes, which conform approximately 6% of a total of 21,527 genes. To understand their prevalence, biological relevance, and evolution, we identified and studied 1,116 IG functional proteins validating their differential expression in transcriptomic data of embryonic mouse telencephalon. Our results showed that overall expression levels of IGs are lower than those of MEGs. However, strongly up-regulated IGs include transcription factors (TFs) such as the class 3 of POU (HMG Box), Neurog1, Olig1, and BHLHe22, BHLHe23, among other essential genes including the β-cluster of protocadherins. Most striking was the finding that IG-encoded BHLH TFs fit the criteria to be classified as microproteins. Finally, predicted protein orthologs in other six genomes confirmed high conservation of IGs associated with regulating neural processes and with chromatin organization and epigenetic regulation in Vertebrata. 
Moreover, this study highlights that IGs are essential modulators of regulatory processes, such as the Wnt signaling pathway and biological processes as pivotal as sensory organ developing at a transcriptional and post-translational level. Overall, our results suggest that IG proteins have specialized, prevalent, and unique biological roles and that functional divergence between IGs and MEGs is likely to be the result of specific evolutionary constraints.
Affiliation(s)
- Katia Aviña-Padilla
  - Instituto de Neurobiología, Universidad Nacional Autónoma de México, Querétaro, Mexico
  - Centro de Investigación y de Estudios Avanzados del IPN, Unidad Irapuato, Guanajuato, Mexico
- Gabriel Emilio Herrera-Oropeza
  - Instituto de Neurobiología, Universidad Nacional Autónoma de México, Querétaro, Mexico
  - Centre for Developmental Neurobiology, Institute of Psychiatry, Psychology, and Neuroscience, King’s College London, London, United Kingdom
- Dulce I. Valdivia
  - Centro de Investigación y de Estudios Avanzados del IPN, Unidad Irapuato, Guanajuato, Mexico
- Erik Díaz-Valenzuela
  - Centro de Investigación y de Estudios Avanzados del IPN, Unidad Irapuato, Guanajuato, Mexico
- Andrés García-García
  - Centro de Física Aplicada y Tecnología Avanzada, Universidad Nacional Autónoma de México, Querétaro, Mexico
4
Tine M, Kuhl H, Teske PR, Reinhardt R. Genome-wide analysis of European sea bass provides insights into the evolution and functions of single-exon genes. Ecol Evol 2021; 11:6546-6557. [PMID: 34141239 PMCID: PMC8207432 DOI: 10.1002/ece3.7507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2020] [Revised: 01/24/2021] [Accepted: 03/12/2021] [Indexed: 11/17/2022]
Abstract
Several studies have attempted to understand the origin and evolution of single-exon genes (SEGs) in eukaryotic organisms, including fishes, but few have examined the functional and evolutionary relationships between SEGs and multiple-exon gene (MEG) paralogs, in particular the conservation of promoter regions. Given that SEGs originate via the reverse transcription of mRNA from a "parental" MEG, such comparisons may help identify evolutionarily related SEG/MEG paralogs, which might fulfill equivalent physiological functions. Here, the relationship of SEG proportion with MEG count, gene density, intron count, and chromosome size was assessed for the genome of the European sea bass, Dicentrarchus labrax. Then, SEGs with an MEG parent were identified, and promoter sequences of SEG/MEG paralogs were compared to identify highly conserved functional motifs. The results revealed a total count of 1,585 SEGs (8.3% of total genes) in the European sea bass genome, which was correlated with MEG count but not with gene density. The significant correlation of SEG content with the number of MEGs suggests that SEGs were continuously and independently generated over evolutionary time following species divergence through retrotranscription events, followed by tandem duplications. Functional annotation showed that the majority of SEGs are functional, as is evident from their expression in RNA-seq data used to support homology-based genome annotation. Differences in 5'UTR and 3'UTR lengths between SEG/MEG paralogs observed in this study may contribute to gene expression divergence between them and therefore lead to the emergence of new SEG functions. The comparison of nonsynonymous to synonymous changes (Ka/Ks) between SEGs and their MEG parents showed that 74 of them are under positive selection (Ka/Ks > 1; p = .0447). An additional fifteen SEGs with an MEG parent have a common promoter, which implies that they are under the influence of common regulatory networks.
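The Ka/Ks criterion mentioned in this abstract can be sketched as a simple classification step. This is a minimal illustration, assuming Ka and Ks have already been estimated elsewhere from a codon alignment (the estimation itself is not shown). The function names, the `tolerance` parameter, and the example values are hypothetical, and no significance test is included.

```python
def ka_ks_ratio(ka, ks):
    """Return Ka/Ks, or None when Ks is zero or negative (ratio undefined)."""
    if ks <= 0:
        return None
    return ka / ks


def selection_regime(ka, ks, tolerance=0.0):
    """Label a gene pair by its Ka/Ks ratio.

    Ka/Ks > 1 is conventionally read as positive selection, Ka/Ks < 1 as
    purifying selection, and a ratio near 1 as neutral evolution. The
    tolerance (an assumption of this sketch) widens the "neutral" band.
    """
    ratio = ka_ks_ratio(ka, ks)
    if ratio is None:
        return "undefined"
    if ratio > 1 + tolerance:
        return "positive"
    if ratio < 1 - tolerance:
        return "purifying"
    return "neutral"


# Hypothetical (Ka, Ks) estimates for three SEG/MEG pairs.
example_pairs = {"pair_a": (0.30, 0.20), "pair_b": (0.05, 0.50), "pair_c": (0.10, 0.0)}
example_labels = {name: selection_regime(ka, ks) for name, (ka, ks) in example_pairs.items()}
```

In practice a ratio above 1 is only a starting point; a claim of positive selection usually also rests on a statistical test, such as the one behind the p-value reported above.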
Affiliation(s)
- Mbaye Tine
  - UFR des Sciences Agronomiques, de l'Aquaculture et des Technologies Alimentaires (S2ATA), Université Gaston Berger (UGB), Saint-Louis, Senegal
  - Genome Centre at the Max-Planck Institute for Plant Breeding Research, Köln, Germany
- Heiner Kuhl
  - Department of Ecophysiology and Aquaculture, Leibniz-Institute of Freshwater Ecology and Inland Fisheries (IGB), Berlin, Germany
- Peter R. Teske
  - Department of Zoology, Centre for Ecological Genomics and Wildlife Conservation, University of Johannesburg, Johannesburg, South Africa
- Richard Reinhardt
  - Genome Centre at the Max-Planck Institute for Plant Breeding Research, Köln, Germany
5
Jorquera R, González C, Clausen PTLC, Petersen B, Holmes DS. SinEx DB 2.0 update 2020: database for eukaryotic single-exon coding sequences. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2021; 2021:6122466. [PMID: 33507271 PMCID: PMC7904048 DOI: 10.1093/database/baab002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2020] [Revised: 12/01/2020] [Accepted: 01/05/2021] [Indexed: 11/27/2022]
Abstract
Single-exon coding sequences (CDSs), also known as ‘single-exon genes’ (SEGs), are defined as nuclear, protein-coding genes that lack introns in their CDSs. They have been studied not only to determine their origin and evolution but also because their expression has been linked to several types of human cancers and neurological/developmental disorders, and many exhibit tissue-specific transcription. We developed SinEx DB that houses DNA and protein sequence information of SEGs from 10 mammalian genomes including human. SinEx DB includes their functional predictions (KOG (euKaryotic Orthologous Groups)) and the relative distribution of these functions within species. Here, we report SinEx 2.0, a major update of SinEx DB that includes information of the occurrence, distribution and functional prediction of SEGs from 60 completely sequenced eukaryotic genomes, representing animals, fungi, protists and plants. The information is stored in a relational database built with MySQL Server 5.7, and the complete dataset of SEG sequences and their GO (Gene Ontology) functional assignations are available for downloading. SinEx DB 2.0 was built with a novel pipeline that helps disambiguate single-exon isoforms from SEGs. SinEx DB 2.0 is the largest available database for SEGs and provides a rich source of information for advancing our understanding of the evolution, function of SEGs and their associations with disorders including cancers and neurological and developmental diseases. Database URL:http://v2.sinex.cl/
Affiliation(s)
- R Jorquera
  - Center for Bioinformatics and Genome Biology, Fundacion Ciencia & Vida, Zañartu 1482, Ñuñoa Santiago 7780132, Chile
  - Laboratorio Medicina Traslacional, Fundación Arturo López Pérez, José Manuel Infante 805, Providencia, Santiago 7500691, Chile
- C González
  - Center for Bioinformatics and Genome Biology, Fundacion Ciencia & Vida, Zañartu 1482, Ñuñoa Santiago 7780132, Chile
  - Centro de Genómica y Bioinformática, Universidad Mayor, Camino la pirámide 5750, Huechuraba, Santiago 8580745, Chile
- P T L C Clausen
  - Department of Global Surveillance, Technical University of Denmark, Kemitorvet building 204, 2800 Kgs. Lyngby, Denmark
- B Petersen
  - Section for Evolutionary Genomics, The GLOBE Institute, University of Copenhagen, Hovedstaden, Øster Voldgade 5–7, Copenhagen 1350, Denmark
  - Centre of Excellence for Omics-Driven Computational Biodiscovery (COMBio), AIMST University, Batu 3 1/2, Jalan Bukit Air Nasi, 08100 Bedong, Kedah, Malaysia
- D S Holmes
  - *Corresponding author: Tel: +56 2 22398969
6
Seefelder M, Alva V, Huang B, Engler T, Baumeister W, Guo Q, Fernández-Busnadiego R, Lupas AN, Kochanek S. The evolution of the huntingtin-associated protein 40 (HAP40) in conjunction with huntingtin. BMC Evol Biol 2020; 20:162. [PMID: 33297953 PMCID: PMC7725122 DOI: 10.1186/s12862-020-01705-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/02/2020] [Accepted: 10/20/2020] [Indexed: 11/24/2022]
Abstract
BACKGROUND The huntingtin-associated protein 40 (HAP40) abundantly interacts with huntingtin (HTT), the protein that is altered in Huntington's disease (HD). Therefore, we analysed the evolution of HAP40 and its interaction with HTT. RESULTS We found that in amniotes HAP40 is encoded by a single-exon gene, whereas in all other organisms it is expressed from multi-exon genes. HAP40 co-occurs with HTT in unikonts, including filastereans such as Capsaspora owczarzaki and the amoebozoan Dictyostelium discoideum, but both proteins are absent from fungi. Outside unikonts, a few species, such as the free-living amoeboflagellate Naegleria gruberi, contain putative HTT and HAP40 orthologs. Biochemically we show that the interaction between HTT and HAP40 extends to fish, and bioinformatic analyses provide evidence for evolutionary conservation of this interaction. The closest homologue of HAP40 in current protein databases is the family of soluble N-ethylmaleimide-sensitive factor attachment proteins (SNAPs). CONCLUSION Our results indicate that the transition from a multi-exon to a single-exon gene appears to have taken place by retroposition during the divergence of amphibians and amniotes, followed by the loss of the parental multi-exon gene. Furthermore, it appears that the two proteins probably originated at the root of eukaryotes. Conservation of the interaction between HAP40 and HTT and their likely coevolution strongly indicate functional importance of this interaction.
Affiliation(s)
- Vikram Alva
  - Department of Protein Evolution, Max Planck Institute for Developmental Biology, Max-Planck-Ring 5, 72076, Tübingen, Germany
- Bin Huang
  - Department of Gene Therapy, Ulm University, 89081, Ulm, Germany
- Tatjana Engler
  - Department of Gene Therapy, Ulm University, 89081, Ulm, Germany
- Wolfgang Baumeister
  - Department of Molecular Structural Biology, Max Planck Institute of Biochemistry, 82152, Martinsried, Germany
- Qiang Guo
  - Department of Molecular Structural Biology, Max Planck Institute of Biochemistry, 82152, Martinsried, Germany
  - Peking-Tsinghua Joint Center for Life Sciences, School of Life Sciences, Peking University, Beijing, 100871, China
- Rubén Fernández-Busnadiego
  - Department of Molecular Structural Biology, Max Planck Institute of Biochemistry, 82152, Martinsried, Germany
  - Institute of Neuropathology, University Medical Center Göttingen, 37099, Göttingen, Germany
  - Cluster of Excellence "Multiscale Bioimaging: From Molecular Machines To Networks of Excitable Cells" (MBExC), University of Göttingen, Göttingen, Germany
- Andrei N Lupas
  - Department of Protein Evolution, Max Planck Institute for Developmental Biology, Max-Planck-Ring 5, 72076, Tübingen, Germany
- Stefan Kochanek
  - Department of Gene Therapy, Ulm University, 89081, Ulm, Germany
7
[Virus-host coevolution: Endogenous RNA viral elements as pseudogenes]. Uirusu 2020; 70:49-56. [PMID: 33967113 DOI: 10.2222/jsv.70.49] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
RNA viruses do not need a DNA stage; they complete their replication cycles with RNA alone. On the other hand, since the 1970s, it has been known that DNA fragments derived from RNA viruses can be detected in RNA virus-infected cells. Furthermore, in this decade, it has become clear that eukaryotic genomes contain genetic sequences derived from non-retroviral RNA viruses. The DNA sequences derived from these RNA viruses are thought to be generated through the transposition machinery of retrotransposons such as LINE-1. Many endogenous RNA viral sequences are formed by the same mechanism as processed pseudogenes in eukaryotic cells, but the significance of the production of RNA viral "pseudogenes" in infected cells has not been elucidated. We have discovered endogenous bornavirus-like elements (EBLs), which are derived from bornaviruses, negative-sense, single-stranded RNA viruses, and have studied the evolution and function of EBLs in host animals. The analysis of EBLs provides clues to the history of host-RNA virus coexistence. In this review, I give an overview of the functions of endogenous RNA virus sequences, especially EBLs in mammalian genomes, and discuss the significance of the endogenization of RNA viruses as viral pseudogenes in evolution.
8
Overcoming challenges and dogmas to understand the functions of pseudogenes. Nat Rev Genet 2019; 21:191-201. [DOI: 10.1038/s41576-019-0196-1] [Citation(s) in RCA: 92] [Impact Index Per Article: 15.3] [Accepted: 11/05/2019] [Indexed: 01/08/2023]
9
Genetic basis of functional variability in adhesion G protein-coupled receptors. Sci Rep 2019; 9:11036. [PMID: 31363148 PMCID: PMC6667449 DOI: 10.1038/s41598-019-46265-x] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 10/08/2018] [Accepted: 06/21/2019] [Indexed: 12/15/2022]
Abstract
The enormous sizes of adhesion G protein-coupled receptors (aGPCRs) go along with complex genomic exon-intron architectures giving rise to multiple mRNA variants. There is a need for a comprehensive catalog of aGPCR variants for proper evaluation of the complex functions of aGPCRs found in structural, in vitro and animal model studies. We used an established bioinformatics pipeline to extract, quantify and visualize mRNA variants of aGPCRs from deeply sequenced transcriptomes. Data analysis showed that aGPCRs have multiple transcription start sites even within introns and that tissue-specific splicing is frequent. On average, 19 significantly expressed transcript variants are derived from a given aGPCR gene. The domain architecture of the N terminus encoded by transcript variants often differs and N termini without or with an incomplete seven-helix transmembrane anchor as well as separate seven-helix transmembrane domains are frequently derived from aGPCR genes. Experimental analyses of selected aGPCR transcript variants revealed marked functional differences. Our analysis has an impact on a rational design of aGPCR constructs for structural analyses and gene-deficient mouse lines and provides new support for independent functions of both, the large N terminus and the transmembrane domain of aGPCRs.
10
Blommaert J, Riss S, Hecox-Lea B, Mark Welch DB, Stelzer CP. Small, but surprisingly repetitive genomes: transposon expansion and not polyploidy has driven a doubling in genome size in a metazoan species complex. BMC Genomics 2019; 20:466. [PMID: 31174483 PMCID: PMC6555955 DOI: 10.1186/s12864-019-5859-y] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 03/22/2019] [Accepted: 05/29/2019] [Indexed: 02/01/2023]
Abstract
BACKGROUND The causes and consequences of genome size variation across Eukaryotes, which spans five orders of magnitude, have been hotly debated since before the advent of genome sequencing. Previous studies have mostly examined variation among larger taxonomic units (e.g., orders, or genera), while comparisons among closely related species are rare. Rotifers of the Brachionus plicatilis species complex exhibit a seven-fold variation in genome size and thus represent a unique opportunity to study such changes on a relatively short evolutionary timescale. Here, we sequenced and analysed the genomes of four species of this complex with nuclear DNA contents spanning 110-422 Mbp. To establish the likely mechanisms of genome size change, we analysed both sequencing read libraries and assemblies for signatures of polyploidy and repetitive element content. We also compared these genomes to that of B. calyciflorus, the closest relative with a sequenced genome (293 Mbp nuclear DNA content). RESULTS Despite the very large differences in genome size, we saw no evidence of ploidy level changes across the B. plicatilis complex. However, repetitive element content explained a large portion of genome size variation (at least 54%). The species with the largest genome, B. asplanchnoidis, has a strikingly high 44% repetitive element content, while the smaller B. plicatilis genomes contain between 14 and 25% repetitive elements. According to our analyses, the B. calyciflorus genome contains 39% repetitive elements, which is substantially higher than previously reported (21%), and suggests that high repetitive element load could be widespread in monogonont rotifers. CONCLUSIONS Even though the genome sizes of these species are at the low end of the metazoan spectrum, their genomes contain substantial amounts of repetitive elements. 
Polyploidy does not appear to play a role in genome size variations in these species, and these variations can be mostly explained by changes in repetitive element content. This contradicts the naïve expectation that small genomes are streamlined, or less complex, and that large variations in nuclear DNA content between closely related species are due to polyploidy.
Affiliation(s)
- J. Blommaert
  - Research Department for Limnology, University of Innsbruck, Mondsee, Austria
- S. Riss
  - Research Department for Limnology, University of Innsbruck, Mondsee, Austria
- B. Hecox-Lea
  - Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA USA
- D. B. Mark Welch
  - Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA USA
- C. P. Stelzer
  - Research Department for Limnology, University of Innsbruck, Mondsee, Austria
11
Gupta P, Peter S, Jung M, Lewin A, Hemmrich-Stanisak G, Franke A, von Kleist M, Schütte C, Einspanier R, Sharbati S, Bruegge JZ. Analysis of long non-coding RNA and mRNA expression in bovine macrophages brings up novel aspects of Mycobacterium avium subspecies paratuberculosis infections. Sci Rep 2019; 9:1571. [PMID: 30733564 PMCID: PMC6367368 DOI: 10.1038/s41598-018-38141-x] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 05/11/2018] [Accepted: 12/18/2018] [Indexed: 12/12/2022]
Abstract
Paratuberculosis is a major disease in cattle that severely affects animal welfare and causes huge economic losses worldwide. Alternative diagnostic methods are urgently needed to control the disease. Recent studies suggest that long non-coding RNAs (lncRNAs) play a crucial role in regulating immune function and may provide valuable information about the disease. However, their role in paratuberculosis infection in cattle has not yet been investigated. Therefore, we investigated the alteration in genomic expression profiles of mRNA and lncRNA in bovine macrophages in response to paratuberculosis infection using RNA-Seq. We identified 397 potentially novel lncRNA candidates in macrophages, of which 38 were differentially regulated by the infection. A total of 820 coding genes were also significantly altered by the infection. Co-expression analysis of lncRNAs and their neighbouring coding genes suggests regulatory functions of lncRNAs in pathways related to immune response. For example, these included protein-coding genes such as TNIP3, TNFAIP3 and NF-κB2 that play a role in NF-κB2 signalling, a pathway associated with immune response. This study advances our understanding of lncRNA roles during paratuberculosis infection.
Affiliation(s)
- Pooja Gupta
- Department of Mathematics and Informatics, Freie Universität Berlin, Berlin, Germany. .,Department of Mathematics for Life and Materials Sciences, Zuse Institute Berlin, Berlin, Germany.
| | - Sarah Peter
- Institute for the Reproduction of Farm Animals Schönow Inc, Bernau, Germany
| | - Markus Jung
- Institute for the Reproduction of Farm Animals Schönow Inc, Bernau, Germany
| | - Astrid Lewin
- Robert Koch-Institute, Department Infectious Diseases, Berlin, Germany
| | | | - Andre Franke
- Institute of Clinical Molecular Biology, Christian-Albrechts-University Kiel, Kiel, Germany
| | - Max von Kleist
- Department of Mathematics and Informatics, Freie Universität Berlin, Berlin, Germany
| | - Christof Schütte
- Department of Mathematics and Informatics, Freie Universität Berlin, Berlin, Germany.,Department of Mathematics for Life and Materials Sciences, Zuse Institute Berlin, Berlin, Germany
| | - Ralf Einspanier
- Institute of Veterinary Biochemistry, Department of Veterinary Medicine, Freie Universität Berlin, Berlin, Germany
| | - Soroush Sharbati
- Institute of Veterinary Biochemistry, Department of Veterinary Medicine, Freie Universität Berlin, Berlin, Germany
| | - Jennifer Zur Bruegge
- Institute of Veterinary Biochemistry, Department of Veterinary Medicine, Freie Universität Berlin, Berlin, Germany
| |
|
12
|
Rahman F, Hassan M, Hanano A, Fitzpatrick DA, McCarthy CGP, Murphy DJ. Evolutionary, structural and functional analysis of the caleosin/peroxygenase gene family in the Fungi. BMC Genomics 2018; 19:976. [PMID: 30593269 PMCID: PMC6309107 DOI: 10.1186/s12864-018-5334-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/23/2018] [Accepted: 11/29/2018] [Indexed: 12/27/2022]
Abstract
BACKGROUND Caleosin/peroxygenases, CLO/PXG, (designated PF05042 in Pfam) are a group of genes/proteins with anomalous distributions in eukaryotic taxa. We have previously characterised CLO/PXGs in the Viridiplantae. The aim of this study was to investigate the evolution and functions of the CLO/PXGs in the Fungi and other non-plant clades and to elucidate the overall origin of this gene family. RESULTS CLO/PXG-like genes are distributed across the full range of fungal groups from the basal clades, Cryptomycota and Microsporidia, to the largest and most complex Dikarya species. However, the genes were only present in 243 out of 844 analysed fungal genomes. CLO/PXG-like genes have been retained in many pathogenic or parasitic fungi that have undergone considerable genomic and structural simplification, indicating that they have important functions in these species. Structural and functional analyses demonstrate that CLO/PXGs are multifunctional proteins closely related to similar proteins found in all major taxa of the Chlorophyte Division of the Viridiplantae. Transcriptome and physiological data show that fungal CLO/PXG-like genes have complex patterns of developmental and tissue-specific expression and are upregulated in response to a range of biotic and abiotic stresses as well as participating in key metabolic and developmental processes such as lipid metabolism, signalling, reproduction and pathogenesis. Biochemical data also reveal that the Aspergillus flavus CLO/PXG has specific functions in sporulation and aflatoxin production as well as playing roles in lipid droplet function. CONCLUSIONS In contrast to plants, CLO/PXGs only occur in about 30% of sequenced fungal genomes but are present in all major taxa. Fungal CLO/PXGs have similar but not identical roles to those in plants, including stress-related oxylipin signalling, lipid metabolism, reproduction and pathogenesis. 
While the presence of CLO/PXG orthologs in all plant genomes sequenced to date would suggest that they have core housekeeping functions in plants, the selective loss of CLO/PXGs in many fungal genomes suggests more restricted functions in fungi as accessory genes useful in particular environments or niches. We suggest an ancient origin of CLO/PXG-like genes in the 'last eukaryotic common ancestor' (LECA) and their subsequent loss in ancestors of the Metazoa, after the latter had diverged from the ancestral fungal lineage.
Affiliation(s)
- Farzana Rahman
- Genomics and Computational Biology Research Group, University of South Wales, Pontypridd, CF37 1DL UK
| | - Mehedi Hassan
- Genomics and Computational Biology Research Group, University of South Wales, Pontypridd, CF37 1DL UK
| | - Abdulsamie Hanano
- Department of Molecular Biology and Biotechnology, Atomic Energy Commission of Syria, P.O. Box 6091, Damascus, Syria
| | | | | | - Denis J. Murphy
- Genomics and Computational Biology Research Group, University of South Wales, Pontypridd, CF37 1DL UK
| |
|
13
|
Amigo JD, Opazo JC, Jorquera R, Wichmann IA, Garcia-Bloj BA, Alarcon MA, Owen GI, Corvalán AH. The Reprimo Gene Family: A Novel Gene Lineage in Gastric Cancer with Tumor Suppressive Properties. Int J Mol Sci 2018; 19:E1862. [PMID: 29941787 PMCID: PMC6073456 DOI: 10.3390/ijms19071862] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 03/22/2018] [Revised: 04/20/2018] [Accepted: 04/21/2018] [Indexed: 12/18/2022]
Abstract
The reprimo (RPRM) gene family is a group of single-exon genes present exclusively within the vertebrate lineage. Two out of three members of this family are present in humans: RPRM and RPRM-Like (RPRML). RPRM induces cell cycle arrest at G2/M in response to p53 expression. Loss-of-expression of RPRM is related to increased cell proliferation and growth in gastric cancer. This evidence suggests that RPRM has tumor suppressive properties. However, the molecular mechanisms and signaling partners by which RPRM exerts its functions remain unknown. Moreover, few studies have attempted to characterize RPRML, and its functionality is unclear. Herein, we highlight the role of the RPRM gene family in gastric carcinogenesis, as well as its potential applications in clinical settings. In addition, we summarize the current knowledge on the phylogeny and expression patterns of this family of genes in embryonic zebrafish and adult humans. Strikingly, in both species, RPRM is expressed primarily in the digestive tract, blood vessels and central nervous system, supporting the use of zebrafish for further functional characterization of RPRM. Finally, drawing on embryonic and adult expression patterns, we address the potential relevance of RPRM and RPRML in cancer. Active research in the coming years should contribute to novel translational applications of this poorly understood gene family, both as potential biomarkers and in the development of novel cancer therapies.
Affiliation(s)
- Julio D Amigo
- Departamento de Fisiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, 8330025 Santiago, Chile.
| | - Juan C Opazo
- Instituto de Ciencias Ambientales y Evolutivas, Facultad de Ciencias, Universidad Austral de Chile, 5090000 Valdivia, Chile.
| | - Roddy Jorquera
- CORE Biodata, Advanced Center for Chronic Diseases (ACCDiS), Pontificia Universidad Católica de Chile, 8330024 Santiago, Chile.
| | - Ignacio A Wichmann
- Laboratory of Oncology, Facultad de Medicina, Pontificia Universidad Católica de Chile, 8330034 Santiago, Chile.
- Departamento de Oncología y Hematología, Facultad de Medicina, Pontificia Universidad Católica de Chile, 8330034 Santiago, Chile.
- CORE Biodata, Advanced Center for Chronic Diseases (ACCDiS), Pontificia Universidad Católica de Chile, 8330024 Santiago, Chile.
| | - Benjamin A Garcia-Bloj
- Laboratory of Oncology, Facultad de Medicina, Pontificia Universidad Católica de Chile, 8330034 Santiago, Chile.
| | - Maria Alejandra Alarcon
- Laboratory of Oncology, Facultad de Medicina, Pontificia Universidad Católica de Chile, 8330034 Santiago, Chile.
- Departamento de Oncología y Hematología, Facultad de Medicina, Pontificia Universidad Católica de Chile, 8330034 Santiago, Chile.
| | - Gareth I Owen
- Departamento de Fisiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, 8330025 Santiago, Chile.
- Laboratory of Oncology, Facultad de Medicina, Pontificia Universidad Católica de Chile, 8330034 Santiago, Chile.
- Millennium Institute on Immunology and Immunotherapy, Pontificia Universidad Católica de Chile, 8331150 Santiago, Chile.
| | - Alejandro H Corvalán
- Laboratory of Oncology, Facultad de Medicina, Pontificia Universidad Católica de Chile, 8330034 Santiago, Chile.
- Departamento de Oncología y Hematología, Facultad de Medicina, Pontificia Universidad Católica de Chile, 8330034 Santiago, Chile.
- CORE Biodata, Advanced Center for Chronic Diseases (ACCDiS), Pontificia Universidad Católica de Chile, 8330024 Santiago, Chile.
| |
|
14
|
Bräuer KE, Brockers K, Moneer J, Feuchtinger A, Wollscheid-Lengeling E, Lengeling A, Wolf A. Phylogenetic and genomic analyses of the ribosomal oxygenases Riox1 (No66) and Riox2 (Mina53) provide new insights into their evolution. BMC Evol Biol 2018; 18:96. [PMID: 29914368 PMCID: PMC6006756 DOI: 10.1186/s12862-018-1215-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/09/2017] [Accepted: 06/07/2018] [Indexed: 12/12/2022]
Abstract
BACKGROUND Translation of specific mRNAs can be highly regulated in different cells, tissues or under pathological conditions. Ribosome heterogeneity can originate from variable expression or post-translational modifications of ribosomal proteins. The ribosomal oxygenases RIOX1 (NO66) and RIOX2 (MINA53) modify ribosomal proteins by histidine hydroxylation. A similar mechanism is present in prokaryotes. Thus, ribosome hydroxylation may be a well-conserved regulatory mechanism with implications in disease and development. However, little is known about the evolutionary history of Riox1 and Riox2 genes and their encoded proteins across eukaryotic taxa. RESULTS In this study, we have analysed Riox1 and Riox2 orthologous genes from 49 metazoan species and have constructed phylogenomic trees for both genes. Our genomic and phylogenetic analyses revealed that Arthropoda, Annelida, Nematoda and Mollusca lack the Riox2 gene, although in the early phylum Cnidaria both genes, Riox1 and Riox2, are present and expressed. Riox1 is an intronless single-exon gene in several species, including humans. In contrast to Riox2, Riox1 is ubiquitously present throughout the animal kingdom, suggesting that Riox1 is the phylogenetically older gene from which Riox2 has evolved. Both proteins have maintained a unique protein architecture with conservation of active sites within the JmjC domains, a dimerization domain, and a winged-helix domain. In addition, Riox1 proteins possess a unique N-terminal extension domain. Immunofluorescence analyses in HeLa cells and in Hydra vulgaris identified a nucleolar localisation signal within the extended N-terminal domain of human RIOX1 and an altered subnuclear localisation for the Hydra Riox2. CONCLUSIONS Conserved active site residues and uniform protein domain architecture suggest a consistent enzymatic activity within the Riox orthologs throughout evolution.
However, differences in genomic architecture, like single-exon genes and alterations in subnuclear localisation, as described for Hydra, point towards adaptation mechanisms that may correlate with taxa- or species-specific requirements. The diversification of Riox1/Riox2 gene structures throughout evolution suggests that functional requirements in expression of protein isoforms and/or subcellular localisation of proteins may have evolved by adaptation to lifestyle.
Affiliation(s)
- Katharina E Bräuer
- Institute of Molecular Toxicology and Pharmacology, Helmholtz Zentrum München-German Research Center for Environmental Health, Ingolstädter Landstrasse 1, 85764, Neuherberg, Germany
| | - Kevin Brockers
- Institute of Molecular Toxicology and Pharmacology, Helmholtz Zentrum München-German Research Center for Environmental Health, Ingolstädter Landstrasse 1, 85764, Neuherberg, Germany
| | - Jasmin Moneer
- Department of Biology II, Ludwig Maximilians University, Munich, Großhaderner Strasse 2, 82152 Planegg-Martinsried, Germany
| | - Annette Feuchtinger
- Research Unit Analytical Pathology, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Ingolstädter Landstr. 1, 85764, Neuherberg, Germany
| | - Evi Wollscheid-Lengeling
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK
| | - Andreas Lengeling
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK.,Present address: Max-Planck-Society, Administrative Headquarters, Hofgartenstr. 8, 80539, Munich, Germany
| | - Alexander Wolf
- Institute of Molecular Toxicology and Pharmacology, Helmholtz Zentrum München-German Research Center for Environmental Health, Ingolstädter Landstrasse 1, 85764, Neuherberg, Germany.
| |
|
15
|
Jorquera R, González C, Clausen P, Petersen B, Holmes DS. Improved ontology for eukaryotic single-exon coding sequences in biological databases. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2018; 2018:1-6. [PMID: 30239665 PMCID: PMC6146118 DOI: 10.1093/database/bay089] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 04/03/2018] [Accepted: 07/30/2018] [Indexed: 12/21/2022]
Abstract
Efficient extraction of knowledge from biological data requires the development of structured vocabularies to unambiguously define biological terms. This paper proposes descriptions and definitions to disambiguate the term 'single-exon gene'. Eukaryotic Single-Exon Genes (SEGs) have been defined as genes that do not have introns in their protein coding sequences. They have been studied not only to determine their origin and evolution but also because their expression has been linked to several types of human cancer and neurological/developmental disorders, and many exhibit tissue-specific transcription. Unfortunately, the term 'SEGs' is rife with ambiguity, leading to biological misinterpretations. In the classic definition, no distinction is made between SEGs that harbor introns in their untranslated regions (UTRs) and those that do not. This distinction is important because the presence of introns in UTRs affects transcriptional regulation and post-transcriptional processing of the mRNA. In addition, recent whole-transcriptome shotgun sequencing has led to the discovery of many examples of single-exon mRNAs that arise from alternative splicing of multi-exon genes; these single-exon isoforms are being confused with SEGs despite their clearly different origin. The increasing expansion of RNA-seq datasets makes it imperative to distinguish the different SEG types before annotation errors become indelibly propagated in biological databases. This paper develops a structured vocabulary for their disambiguation, allowing a major reassessment of their evolutionary trajectories, regulation, RNA processing and transport, and provides the opportunity to improve the detection of gene associations with disorders including cancers, neurological and developmental diseases.
Affiliation(s)
- Roddy Jorquera
- Center for Bioinformatics and Genome Biology, Fundacion Ciencia & Vida, Avenida Zañartu 1482, Ñuñoa, Santiago, Chile.,Facultad de Ciencias Biologicas, Universidad Andres Bello, Santiago, Chile
| | - Carolina González
- Center for Bioinformatics and Genome Biology, Fundacion Ciencia & Vida, Avenida Zañartu 1482, Ñuñoa, Santiago, Chile
| | - Philip Clausen
- Department of Bio and Health Informatics, Technical University of Denmark, Kgs. Lyngby, Denmark
| | - Bent Petersen
- Department of Bio and Health Informatics, Technical University of Denmark, Kgs. Lyngby, Denmark.,Centre of Excellence for Omics-Driven Computational Biodiscovery (COMBio), Faculty of Applied Sciences, AIMST University, Kedah, Malaysia
| | - David S Holmes
- Center for Bioinformatics and Genome Biology, Fundacion Ciencia & Vida, Avenida Zañartu 1482, Ñuñoa, Santiago, Chile.,Centro de Genómica y Bioinformática Facultad de Ciencias, Universidad Mayor, Santiago, Chile
| |
|
16
|
Protein-Coding Genes' Retrocopies and Their Functions. Viruses 2017; 9:v9040080. [PMID: 28406439 PMCID: PMC5408686 DOI: 10.3390/v9040080] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 02/25/2017] [Revised: 04/07/2017] [Accepted: 04/11/2017] [Indexed: 12/11/2022]
Abstract
Transposable elements, often considered unimportant for survival, significantly contribute to the evolution of transcriptomes, promoters, and proteomes. Reverse transcriptase, encoded by some transposable elements, can be used in trans to produce a DNA copy of any RNA molecule in the cell. The retrotransposition of protein-coding genes requires the presence of reverse transcriptase, which could be delivered by either non-long terminal repeat (non-LTR) or LTR transposons. The majority of these copies are in a state of “relaxed” selection and remain “dormant” because they lack regulatory regions; however, many become functional. In the course of evolution, they may undergo subfunctionalization or neofunctionalization, or replace their progenitors. Functional retrocopies (retrogenes) can encode proteins, novel or similar to those encoded by their progenitors, can be used as alternative exons or create chimeric transcripts, and can also be involved in transcriptional interference and participate in the epigenetic regulation of parental gene expression. They can also act in trans as natural antisense transcripts, microRNA (miRNA) sponges, or a source of various small RNAs. Moreover, many retrocopies of protein-coding genes are linked to human diseases, especially various types of cancer.
|