1
|
Song QA, Catlin NS, Brad Barbazuk W, Li S. Computational analysis of alternative splicing in plant genomes. Gene 2019; 685:186-195. [PMID: 30321657 DOI: 10.1016/j.gene.2018.10.026] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 11/10/2017] [Revised: 09/16/2018] [Accepted: 10/11/2018] [Indexed: 12/11/2022]
Abstract
Computational analyses play crucial roles in characterizing splicing isoforms in plant genomes. In this review, we survey the computational tools used in recently published, genome-scale splicing analyses in plants. We summarize the commonly used software and pipelines for read mapping, isoform reconstruction, isoform quantification, and differential expression analysis. We also discuss methods for analyzing long reads and strategies for combining long and short reads to identify splicing isoforms. We review several tools for characterizing local splicing events, splicing graphs, and coding potential, and for visualizing splicing isoforms. We further discuss procedures for identifying conserved splicing isoforms across plant species. Finally, we discuss the outlook for integrating other genomic data with splicing analyses to identify regulatory mechanisms of alternative splicing (AS) on a genome-wide scale.
Affiliation(s)
- Qi A Song
- Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, United States of America
- Nathan S Catlin
- Department of Biology, University of Florida, Gainesville, FL 32611, United States of America
- W Brad Barbazuk
- Department of Biology, University of Florida, Gainesville, FL 32611, United States of America; Genetics Institute, University of Florida, Gainesville, FL 32611, United States of America
- Song Li
- School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, United States of America
|
2
|
Klasberg S, Bitard-Feildel T, Mallet L. Computational Identification of Novel Genes: Current and Future Perspectives. Bioinform Biol Insights 2016; 10:121-31. [PMID: 27493475 PMCID: PMC4970615 DOI: 10.4137/bbi.s39950] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 04/15/2016] [Revised: 05/31/2016] [Accepted: 06/05/2016] [Indexed: 12/31/2022]
Abstract
While it has long been thought that all genomic novelties are derived from existing material, many genes lacking homology to known genes have been found in recent genome projects. Some of these novel genes were proposed to have evolved de novo, i.e., out of noncoding sequences, whereas others have been shown to follow a duplication and divergence process. Their discovery called for an extension of the historical hypotheses about gene origination. Beyond this theoretical breakthrough, evidence has accumulated that novel genes play important roles in evolutionary processes, including adaptation and speciation events. Different techniques are available to identify genes and classify them as novel. Classification as novel is usually based on similarity to known genes, or the lack thereof, detected by comparative genomics or by searches against databases. Computational approaches are also primary methods; they can be based on existing models or can leverage biological evidence from experiments. Identification of novel genes nevertheless remains a challenging task. With constant updates to software and technologies, no gold standard, and no available benchmark, the evaluation and characterization of genomic novelty is a vibrant field. In this review, the classical and state-of-the-art tools for gene prediction are introduced. The current methods for novel gene detection are presented, and the methodological strategies and their limits are discussed along with prospective approaches for further studies.
Affiliation(s)
- Steffen Klasberg
- Institute for Evolution and Biodiversity, Westfalian Wilhelms University Muenster, Huefferstrasse 1, Muenster, Germany
- Tristan Bitard-Feildel
- Institute for Evolution and Biodiversity, Westfalian Wilhelms University Muenster, Huefferstrasse 1, Muenster, Germany
- Ludovic Mallet
- Institute for Evolution and Biodiversity, Westfalian Wilhelms University Muenster, Huefferstrasse 1, Muenster, Germany
|
3
|
Deng F, Chen SY. dbHT-Trans: An Efficient Tool for Filtering the Protein-Encoding Transcripts Assembled by RNA-Seq According to Search for Homologous Proteins. J Comput Biol 2015; 23:1-9. [PMID: 26484655 DOI: 10.1089/cmb.2015.0137] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/13/2022]
Abstract
In RNA-Seq studies, reliably assembling transcripts remains challenging. Both genome-guided and de novo methods tend to produce too many false transcripts because of known and unknown factors. Postassembly quality filtering is therefore necessary before downstream analyses. Here, we present dbHT-Trans, an automatic and efficient tool for filtering the protein-encoding transcripts assembled from RNA-Seq data. For each candidate transcript, we first deduce all potential open reading frames and translate them into amino acid sequences. These sequences are searched against a reference protein database, and a transcript is predicted to be false when it has no homologous sequence. This method is expected to filter out falsely assembled transcripts of protein-encoding genes. Application of dbHT-Trans to the annotated mouse transcriptome showed a sensitivity of almost 90% for recalling protein-encoding transcripts. After this quality filtering, the numbers of assembled genes became more consistent between the Cufflinks and Trinity tools. To significantly reduce data storage, we transformed all intermediate data into descriptive metadata stored in a MySQL database, which downstream analyses can use in real time. The source code, example data, and manual of dbHT-Trans are freely available in the GitHub repository.
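The first step described in this abstract, deducing open reading frames and translating them, can be illustrated with a minimal sketch. This is not the dbHT-Trans implementation: the function names, the `min_aa` threshold, and the choice to scan all six frames for ATG-to-stop ORFs are assumptions made for illustration, and the subsequent homology search against a protein database is not shown.

```python
from typing import List

# Standard genetic code, indexed by the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def find_orfs(transcript: str, min_aa: int = 30) -> List[str]:
    """Collect peptides of ATG-to-stop ORFs from all six reading frames.

    min_aa is a hypothetical length cutoff, not a value from the paper.
    """
    seq = transcript.upper().replace("U", "T")
    peptides = []
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            protein = translate(strand[frame:])
            # Every piece except the last is terminated by a stop codon.
            for segment in protein.split("*")[:-1]:
                start = segment.find("M")
                if start != -1 and len(segment) - start >= min_aa:
                    peptides.append(segment[start:])
    return peptides
```

In a filtering workflow of the kind the abstract describes, each returned peptide would then be searched against a reference protein database, and a transcript with no homologous hit would be flagged as a likely false assembly.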
Affiliation(s)
- Feilong Deng
- Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, China
- Shi-Yi Chen
- Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, China
|
4
|
Grinev VV, Migas AA, Kirsanava AD, Mishkova OA, Siomava N, Ramanouskaya TV, Vaitsiankova AV, Ilyushonak IM, Nazarov PV, Vallar L, Aleinikova OV. Decoding of exon splicing patterns in the human RUNX1-RUNX1T1 fusion gene. Int J Biochem Cell Biol 2015; 68:48-58. [PMID: 26320575 DOI: 10.1016/j.biocel.2015.08.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 05/01/2015] [Revised: 08/12/2015] [Accepted: 08/24/2015] [Indexed: 11/25/2022]
Abstract
The t(8;21) translocation is the most widespread genetic defect found in human acute myeloid leukemia. This translocation results in the RUNX1-RUNX1T1 fusion gene, which produces a wide variety of alternative transcripts and influences the course of the disease. The rules governing the combinatorics and splicing of exons in the RUNX1-RUNX1T1 transcripts are not known. To address this issue, we developed an exon graph model of the fusion gene organization and evaluated its local exon combinatorics by the exon combinatorial index (ECI). Here we show that the local exon combinatorics of the RUNX1-RUNX1T1 gene follows a power-law behavior: (i) the vast majority of exons have a low ECI, (ii) only a small part is represented by splicing "exon hubs" with very high ECI values, and (iii) the system is scale-free and very sensitive to targeted skipping of these exon hubs. Stochasticity of the splicing machinery and preferred usage of exons in alternative splicing can explain this behavior. Stochasticity may explain up to 12% of the ECI variance and results in a number of non-coding and unproductive transcripts that can be considered noise. The half-life of these transcripts is increased due to the deregulation of some key genes of the nonsense-mediated decay system in leukemia cells. On the other hand, preferred usage of exons may explain up to 75% of the ECI variability. Our analysis revealed a set of splicing-related cis-regulatory motifs that can explain the "attractiveness" of exons in alternative splicing, but only when they are considered together. Cis-regulatory motifs are guides for splicing trans-factors, and we observed a leukemia-specific expression profile of the splicing genes in t(8;21)-positive blasts. Altogether, our results show that alternative splicing of the RUNX1-RUNX1T1 transcripts follows strict rules and that the power-law component of the fusion gene organization confers high flexibility on this process.
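To make the idea of an exon graph concrete, the sketch below builds a graph from transcript isoforms and counts, for each exon, how many distinct exons it is spliced to directly. The abstract does not give the formula for the ECI, so this partner count is only an illustrative stand-in for local exon combinatorics, not the authors' index; the function names and the toy exon labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Set


def build_exon_graph(transcripts: Iterable[Sequence[str]]) -> Dict[str, Set[str]]:
    """Directed exon graph: an edge u -> v means exon u is joined directly
    to exon v in at least one transcript."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for exons in transcripts:
        for exon in exons:
            graph[exon]  # register the exon even if it has no outgoing edge
        for upstream, downstream in zip(exons, exons[1:]):
            graph[upstream].add(downstream)
    return dict(graph)


def junction_partner_counts(graph: Dict[str, Set[str]]) -> Dict[str, int]:
    """Number of distinct exons each exon is directly spliced to,
    counting both upstream and downstream partners."""
    partners = {exon: set(targets) for exon, targets in graph.items()}
    for exon, targets in graph.items():
        for target in targets:
            partners[target].add(exon)
    return {exon: len(p) for exon, p in partners.items()}


# Toy isoforms, each written as an ordered list of exon labels.
isoforms = [["e1", "e2", "e3"], ["e1", "e3"], ["e1", "e2", "e4"]]
counts = junction_partner_counts(build_exon_graph(isoforms))
```

In this toy example, `e2` has the most splicing partners (`e1`, `e3`, and `e4`), which is the kind of exon the abstract would call a hub; a power-law analysis would examine how such counts are distributed across all exons.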
Affiliation(s)
- Vasily V Grinev
- Department of Genetics, Faculty of Biology, Belarusian State University, Minsk, Belarus
- Alexandr A Migas
- Laboratory of the Genetic Biotechnology, Department of Research, Belarusian Research Center for Pediatric Oncology, Hematology and Immunology, Minsk, Belarus
- Aksana D Kirsanava
- Department of Genetics, Faculty of Biology, Belarusian State University, Minsk, Belarus
- Olga A Mishkova
- Laboratory of the Genetic Biotechnology, Department of Research, Belarusian Research Center for Pediatric Oncology, Hematology and Immunology, Minsk, Belarus
- Natalia Siomava
- Department of Developmental Biology, University of Göttingen, Göttingen, Germany
- Alina V Vaitsiankova
- Department of Genetics, Faculty of Biology, Belarusian State University, Minsk, Belarus
- Ilia M Ilyushonak
- Department of Genetics, Faculty of Biology, Belarusian State University, Minsk, Belarus
- Petr V Nazarov
- Genomics Research Unit, Luxembourg Institute of Health, Luxembourg
- Laurent Vallar
- Genomics Research Unit, Luxembourg Institute of Health, Luxembourg
- Olga V Aleinikova
- Laboratory of the Genetic Biotechnology, Department of Research, Belarusian Research Center for Pediatric Oncology, Hematology and Immunology, Minsk, Belarus
|