1
Li Y, Fang Q, Cao Y, Yang M, Wang J, Wang M, Li N, Meng F. Identification and Functional Characterization of Soybean Microexon in Response to Saline-Alkali Stress. PLANT, CELL & ENVIRONMENT 2025. [PMID: 40302202 DOI: 10.1111/pce.15596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2024] [Revised: 04/18/2025] [Accepted: 04/22/2025] [Indexed: 05/01/2025]
Abstract
Salt-alkali stress is one of the most widespread and devastating abiotic stresses. Alternative splicing is a response pathway to such stress. However, the role of microexons in response to salt-alkali stress in soybean remains obscure. In this study, we identified microexons related to salt-alkali stress. We focused on analyzing the conserved sequence patterns of 27-30 bp microexons, and consistently observed conserved GT and AG sequences at the 5' and 3' ends of these microexons. Additionally, we found that the AP2 protein domain had the most abundant microexons. Interestingly, the majority of microexons in the AP2 transcription factor were 9 bp in length, encoding a conserved valine (V), tyrosine (Y), or leucine (L), suggesting their indispensable role. Furthermore, we cloned two transcripts of three AP2 genes with and without the salt-alkali stress-induced microexon and generated stable transgenic soybeans. Surprisingly, we discovered that the depletion of microexons in the AP2 gene enhances salt-alkali resistance. Collectively, this characterization of microexons suggests a new scenario explaining soybean salt-alkali stress resistance.
Affiliation(s)
- Yang Li
- State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China
- Qingxi Fang
- Department of Agriculture, Northeast Agricultural University, Harbin, China
- Yingxue Cao
- State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China
- Mingyu Yang
- State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China
- Jing Wang
- Department of Agriculture, Northeast Agricultural University, Harbin, China
- Meizi Wang
- Department of Agriculture, Northeast Agricultural University, Harbin, China
- Na Li
- State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China
- Fanli Meng
- State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China

2
Hegedüs B, Sahu N, Bálint B, Haridas S, Bense V, Merényi Z, Virágh M, Wu H, Liu XB, Riley R, Lipzen A, Koriabine M, Savage E, Guo J, Barry K, Ng V, Urbán P, Gyenesei A, Freitag M, Grigoriev IV, Nagy LG. Morphogenesis, starvation, and light responses in a mushroom-forming fungus revealed by long-read sequencing and extensive expression profiling. CELL GENOMICS 2025:100853. [PMID: 40262612 DOI: 10.1016/j.xgen.2025.100853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Revised: 12/19/2024] [Accepted: 03/24/2025] [Indexed: 04/24/2025]
Abstract
Mushroom-forming fungi (Agaricomycetes) are emerging as pivotal players in several fields of science and industry. Genomic data for Agaricomycetes are accumulating rapidly; however, this is not paralleled by improvements of gene annotations, which leave gene function notoriously poorly understood. We set out to improve our functional understanding of the model mushroom Coprinopsis cinerea by integrating a new, chromosome-level assembly, high-quality gene predictions, and functional information derived from broad gene-expression profiling data. The new annotation includes 5' and 3' untranslated regions (UTRs), polyadenylation sites (PASs), upstream open reading frames (uORFs), splicing isoforms, and microexons, as well as core gene sets corresponding to carbon starvation, light response, and hyphal differentiation. As a result, the genome of C. cinerea has now become the most comprehensively annotated genome among mushroom-forming fungi, which will contribute to multiple rapidly expanding fields, including research on their life history, light and stress responses, as well as multicellular development.
Affiliation(s)
- Botond Hegedüs
- Synthetic and Systems Biology Unit, Institute of Biochemistry, HUN-REN Biological Research Center, Temesvári krt. 62, 6726 Szeged, Hungary
- Neha Sahu
- Synthetic and Systems Biology Unit, Institute of Biochemistry, HUN-REN Biological Research Center, Temesvári krt. 62, 6726 Szeged, Hungary
- Balázs Bálint
- Synthetic and Systems Biology Unit, Institute of Biochemistry, HUN-REN Biological Research Center, Temesvári krt. 62, 6726 Szeged, Hungary
- Sajeet Haridas
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Viktória Bense
- Synthetic and Systems Biology Unit, Institute of Biochemistry, HUN-REN Biological Research Center, Temesvári krt. 62, 6726 Szeged, Hungary
- Zsolt Merényi
- Synthetic and Systems Biology Unit, Institute of Biochemistry, HUN-REN Biological Research Center, Temesvári krt. 62, 6726 Szeged, Hungary
- Máté Virágh
- Synthetic and Systems Biology Unit, Institute of Biochemistry, HUN-REN Biological Research Center, Temesvári krt. 62, 6726 Szeged, Hungary
- Hongli Wu
- Synthetic and Systems Biology Unit, Institute of Biochemistry, HUN-REN Biological Research Center, Temesvári krt. 62, 6726 Szeged, Hungary
- Xiao-Bin Liu
- Synthetic and Systems Biology Unit, Institute of Biochemistry, HUN-REN Biological Research Center, Temesvári krt. 62, 6726 Szeged, Hungary
- Robert Riley
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Anna Lipzen
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Maxim Koriabine
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Emily Savage
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Jie Guo
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Kerrie Barry
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Vivian Ng
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Péter Urbán
- János Szentágothai Research Center, University of Pécs, Ifjúság útja 20, 7624 Pécs, Hungary
- Attila Gyenesei
- János Szentágothai Research Center, University of Pécs, Ifjúság útja 20, 7624 Pécs, Hungary
- Michael Freitag
- Department of Biochemistry and Biophysics, Oregon State University, Corvallis, OR 97331, USA
- Igor V Grigoriev
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- László G Nagy
- Synthetic and Systems Biology Unit, Institute of Biochemistry, HUN-REN Biological Research Center, Temesvári krt. 62, 6726 Szeged, Hungary.

3
Madrigal G, Minhas BF, Catchen J. Klumpy: A tool to evaluate the integrity of long-read genome assemblies and illusive sequence motifs. Mol Ecol Resour 2025; 25:e13982. [PMID: 38800997 PMCID: PMC11646305 DOI: 10.1111/1755-0998.13982] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Accepted: 05/13/2024] [Indexed: 05/29/2024]
Abstract
The improvement and decreasing costs of third-generation sequencing technologies have widened the scope of biological questions researchers can address with de novo genome assemblies. With the increasing number of reference genomes, validating their integrity with minimal overhead is vital for establishing confident results in their applications. Here, we present Klumpy, a tool for detecting and visualizing both misassembled regions in a genome assembly and genetic elements (e.g. genes) of interest in a set of sequences. By leveraging the initial raw reads in combination with their respective genome assembly, we illustrate Klumpy's utility by investigating antifreeze glycoprotein (afgp) loci across two icefishes, by searching for a reported absent gene in the northern snakehead fish, and by scanning the reference genomes of a mudskipper and bumblebee for misassembled regions. In the two former cases, we were able to provide support for the noncanonical placement of an afgp locus in the icefishes and locate the missing snakehead gene. Furthermore, our genome scans were able to identify an unmappable locus in the mudskipper reference genome and identify a putative repetitive element shared among several species of bees.
Affiliation(s)
- Giovanni Madrigal
- Department of Evolution, Ecology, and Behavior, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Bushra Fazal Minhas
- Informatics Program, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Julian Catchen
- Department of Evolution, Ecology, and Behavior, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Informatics Program, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA

4
Wang J, Ma X, Hu Y, Feng G, Guo C, Zhang X, Ma H. Regulation of micro- and small-exon retention and other splicing processes by GRP20 for flower development. NATURE PLANTS 2024; 10:66-85. [PMID: 38195906 PMCID: PMC10808074 DOI: 10.1038/s41477-023-01605-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Accepted: 11/29/2023] [Indexed: 01/11/2024]
Abstract
Pre-mRNA splicing is crucial for gene expression and depends on the spliceosome and splicing factors. Plant exons have an average size of ~180 nucleotides and typically contain motifs for interactions with spliceosome and splicing factors. Micro exons (<51 nucleotides) are found widely in eukaryotes and in genes for plant development and environmental responses. However, little is known about transcript-specific regulation of splicing in plants and about the regulators for micro exon splicing. Here we report that glycine-rich protein 20 (GRP20) is an RNA-binding protein and required for splicing of ~2,100 genes including those functioning in flower development and/or environmental responses. Specifically, GRP20 is required for micro-exon retention in transcripts of floral homeotic genes; these micro exons are conserved across angiosperms. GRP20 is also important for small-exon (51-100 nucleotides) splicing. In addition, GRP20 is required for flower development. Furthermore, GRP20 binds to poly-purine motifs in micro and small exons and a spliceosome component; both RNA binding and spliceosome interaction are important for flower development and micro-exon retention. Our results provide new insights into the mechanisms of micro-exon retention in flower development.
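The size classes quoted in this abstract (micro exons below 51 nucleotides, small exons of 51-100 nucleotides) can be expressed as a small helper. This is only an illustrative sketch: the function name `classify_exon` and the label `"regular"` for longer exons are assumptions made here, not terms from the paper.

```python
def classify_exon(length_nt: int) -> str:
    """Classify an exon by length using the thresholds quoted in the abstract.

    The label "regular" for exons longer than 100 nt is a placeholder chosen
    for this sketch; the abstract does not name that class.
    """
    if length_nt <= 0:
        raise ValueError("exon length must be a positive number of nucleotides")
    if length_nt < 51:       # micro exons: fewer than 51 nucleotides
        return "micro"
    if length_nt <= 100:     # small exons: 51-100 nucleotides
        return "small"
    return "regular"


if __name__ == "__main__":
    # Hypothetical exon lengths, including the ~180 nt plant average cited above.
    for length in (9, 50, 51, 100, 180):
        print(length, classify_exon(length))
```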
Affiliation(s)
- Jun Wang
- Department of Biology, Eberly College of Science, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Xinwei Ma
- Department of Biology, Eberly College of Science, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Yi Hu
- Department of Biology, Eberly College of Science, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Guanhua Feng
- Department of Biology, Eberly College of Science, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Chunce Guo
- Jiangxi Provincial Key Laboratory for Bamboo Germplasm Resources and Utilization, Forestry College, Jiangxi Agricultural University, Nanchang, China
- Xin Zhang
- Department of Chemistry and Department of Biochemistry and Molecular Biology, Eberly College of Science, Pennsylvania State University, University Park, PA, USA
- Hong Ma
- Department of Biology, Eberly College of Science, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, USA.

5
Zhang H, Wafula EK, Eilers J, Harkess A, Ralph PE, Timilsena PR, dePamphilis CW, Waite JM, Honaas LA. Building a foundation for gene family analysis in Rosaceae genomes with a novel workflow: A case study in Pyrus architecture genes. FRONTIERS IN PLANT SCIENCE 2022; 13:975942. [PMID: 36452099 PMCID: PMC9702816 DOI: 10.3389/fpls.2022.975942] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/22/2022] [Accepted: 09/21/2022] [Indexed: 05/26/2023]
Abstract
The rapid development of sequencing technologies has led to a deeper understanding of plant genomes. However, direct experimental evidence connecting genes to important agronomic traits is still lacking in most non-model plants. For instance, the genetic mechanisms underlying plant architecture are poorly understood in pome fruit trees, creating a major hurdle in developing new cultivars with desirable architecture, such as dwarfing rootstocks in European pear (Pyrus communis). An efficient way to identify genetic factors for important traits in non-model organisms can be to transfer knowledge across genomes. However, major obstacles exist, including complex evolutionary histories and variable quality and content of publicly available plant genomes. As researchers aim to link genes to traits of interest, these challenges can impede the transfer of experimental evidence across plant species, namely in the curation of high-quality, high-confidence gene models in an evolutionary context. Here we present a workflow using a collection of bioinformatic tools for the curation of deeply conserved gene families of interest across plant genomes. To study gene families involved in tree architecture in European pear and other rosaceous species, we used our workflow, plus a draft genome assembly and high-quality annotation of a second P. communis cultivar, 'd'Anjou.' Our comparative gene family approach revealed significant issues with the most recent 'Bartlett' genome - primarily thousands of missing genes due to methodological bias. After correcting assembly errors on a global scale in the 'Bartlett' genome, we used our workflow for targeted improvement of our genes of interest in both P. communis genomes, thus laying the groundwork for future functional studies in pear tree architecture. Further, our global gene family classification of 15 genomes across 6 genera provides a valuable and previously unavailable resource for the Rosaceae research community. 
With it, orthologs and other gene family members can be easily identified across any of the classified genomes. Importantly, our workflow can be easily adopted for any other plant genomes and gene families of interest.
Affiliation(s)
- Huiting Zhang
- Tree Fruit Research Laboratory, Agricultural Research Service (ARS), United States Department of Agriculture (USDA), Wenatchee, WA, United States
- Department of Horticulture, Washington State University, Pullman, WA, United States
- Eric K. Wafula
- Department of Biology, The Pennsylvania State University, University Park, PA, United States
- Jon Eilers
- Tree Fruit Research Laboratory, Agricultural Research Service (ARS), United States Department of Agriculture (USDA), Wenatchee, WA, United States
- Alex E. Harkess
- College of Agriculture, Auburn University, Auburn, AL, United States
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, United States
- Paula E. Ralph
- Department of Biology, The Pennsylvania State University, University Park, PA, United States
- Prakash Raj Timilsena
- Department of Biology, The Pennsylvania State University, University Park, PA, United States
- Claude W. dePamphilis
- Department of Biology, The Pennsylvania State University, University Park, PA, United States
- Jessica M. Waite
- Tree Fruit Research Laboratory, Agricultural Research Service (ARS), United States Department of Agriculture (USDA), Wenatchee, WA, United States
- Loren A. Honaas
- Tree Fruit Research Laboratory, Agricultural Research Service (ARS), United States Department of Agriculture (USDA), Wenatchee, WA, United States

6
Pervasive misannotation of microexons that are evolutionarily conserved and crucial for gene function in plants. Nat Commun 2022; 13:820. [PMID: 35145097 PMCID: PMC8831610 DOI: 10.1038/s41467-022-28449-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/08/2021] [Accepted: 01/26/2022] [Indexed: 12/31/2022] Open
Abstract
It is challenging to identify the smallest microexons (≤15-nt) due to their small size. Consequently, these microexons are often misannotated or missed entirely during genome annotation. Here, we develop a pipeline to accurately identify 2,398 small microexons in 10 diverse plant species using 990 RNA-seq datasets, and most of them have not been annotated in the reference genomes. Analysis reveals that microexons tend to have increased detained flanking introns that require post-transcriptional splicing after polyadenylation. Examination of 45 conserved microexon clusters demonstrates that microexons and associated gene structures can be traced back to the origin of land plants. Based on these clusters, we develop an algorithm to genome-wide model coding microexons in 132 plants and find that microexons provide a strong phylogenetic signal for plant organismal relationships. Microexon modeling reveals diverse evolutionary trajectories, involving microexon gain and loss and alternative splicing. Our work provides a comprehensive view of microexons in plants. The small size (≤15-nt) of microexons poses difficulties for genome annotation and identification using standard RNA sequence mapping approaches. Here, the authors develop computational pipelines to discover and predict microexons in plants and reveal diverse evolutionary trajectories via genomewide microexon modeling.

7
Kalariya KA, Meena RP, Poojara L, Shahi D, Patel S. Characterization of squalene synthase gene from Gymnema sylvestre R. Br. BENI-SUEF UNIVERSITY JOURNAL OF BASIC AND APPLIED SCIENCES 2021. [DOI: 10.1186/s43088-020-00094-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
Abstract
Background
Squalene synthase (SQS) is a rate-limiting enzyme necessary to produce pentacyclic triterpenes in plants. It is an important enzyme producing squalene molecules required to run steroidal and triterpenoid biosynthesis pathways working in competitive inhibition mode. Reports are available on information pertaining to SQS gene in several plants, but detailed information on SQS gene in Gymnema sylvestre R. Br. is not available. G. sylvestre is a priceless rare vine of central eco-region known for its medicinally important triterpenoids. Our work aims to characterize the GS-SQS gene in this high-value medicinal plant.
Results
A coding DNA sequence (CDS) of 1245 bp representing the GS-SQS gene, predicted from transcriptome data in G. sylvestre, was used for further characterization. The SWISS protein structure modeled for the GS-SQS amino acid sequence data had a MolProbity score of 1.44 and a clash score of 3.86. The quality estimates and statistical score of the Ramachandran plot analysis indicated that the homology model was reliable. For full-length amplification of the gene, primers designed from the flanking regions of the CDS encoding GS-SQS were used to amplify genomic DNA as the template, which yielded a single-band product of approximately 6.2 kb. Sequencing this product by NGS generated 2.32 Gb of data and 3347 scaffolds with an N50 of 457 bp. These scaffolds were compared to identify similarity with other SQS genes as well as the GS-SQSs of the transcriptome. Scaffold_3347, representing the GS-SQS gene, harbored two introns of 101 and 164 bp. Both intronic regions were validated with primers designed from the regions adjoining the introns on the scaffold representing the GS-SQS gene. Amplification succeeded when the template was genomic DNA and failed when the template was cDNA, confirming the presence of two introns in the GS-SQS gene of Gymnema sylvestre R. Br.
Conclusion
This study shows that the GS-SQS gene is very closely related to the SQS genes of Coffea arabica and Gardenia jasminoides and harbors two introns of 101 and 164 bp.
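The Results above report an N50 for the assembled scaffolds. As a reminder of what that statistic measures, the following is a minimal sketch of the standard N50 calculation; the function name `n50` and the example lengths are hypothetical and are not taken from the study's data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Hypothetical scaffold lengths in bp, for illustration only.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Comparing `2 * running` against the total avoids floating-point division when the total is odd.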

8
Kurkowiak M, Grasso G, Faktor J, Scheiblecker L, Winniczuk M, Mayordomo MY, O'Neill JR, Oster B, Vojtesek B, Al-Saadi A, Marek-Trzonkowska N, Hupp TR. An integrated DNA and RNA variant detector identifies a highly conserved three base exon in the MAP4K5 kinase locus. RNA Biol 2021; 18:2556-2575. [PMID: 34190025 PMCID: PMC8632122 DOI: 10.1080/15476286.2021.1932345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022] Open
Abstract
RNA variants that emerge from editing and alternative splicing form important regulatory stages in protein signalling. In this report, we apply an integrated DNA and RNA variant detection workbench to define the range of RNA variants that deviate from the reference genome in a human melanoma cell model. The RNA variants can be grouped into (i) classic ADAR-like or APOBEC-like RNA editing events and (ii) multiple-nucleotide variants (MNVs) including three and six base pair in-frame non-canonical unmapped exons. We focus on validating representative genes of these classes. First, clustered non-synonymous RNA edits (A-I) in the CDK13 gene were validated by Sanger sequencing to confirm the integrity of the RNA variant detection workbench. Second, a highly conserved RNA variant in the MAP4K5 gene was detected that results most likely from the splicing of a non-canonical three-base exon. The two RNA variants produced from the MAP4K5 locus deviate from the genomic reference sequence and produce V569E or V569del isoform variants. Low doses of splicing inhibitors demonstrated that the MAP4K5-V569E variant emerges from an SF3B1-dependent splicing event. Mass spectrometry of recombinant SBP-tagged MAP4K5-V569E and MAP4K5-V569del protein pull-downs in transfected cell systems was used to identify the protein-protein interactions of these two MAP4K5 isoforms and propose possible functions. Together these data highlight the utility of this integrated DNA and RNA variant detection platform to detect RNA variants in cancer cells and support future analysis of RNA variant detection in cancer tissue.
Affiliation(s)
- Małgorzata Kurkowiak
- International Centre for Cancer Vaccine Science (ICCVS), University of Gdańsk, 80-822 Gdańsk, Poland
- Giuseppa Grasso
- University of Edinburgh, Institute of Genetics and Molecular Medicine, Edinburgh Cancer Research Centre, Edinburgh, Scotland, UK
- Jakub Faktor
- International Centre for Cancer Vaccine Science (ICCVS), University of Gdańsk, 80-822 Gdańsk, Poland; Research Centre for Applied Molecular Oncology, Masaryk Memorial Cancer Institute, Brno, Czech Republic
- Lisa Scheiblecker
- Institute of Pharmacology and Toxicology, University of Veterinary Medicine Vienna, 1210 Vienna, Austria
- Małgorzata Winniczuk
- International Centre for Cancer Vaccine Science (ICCVS), University of Gdańsk, 80-822 Gdańsk, Poland
- Marcos Yebenes Mayordomo
- International Centre for Cancer Vaccine Science (ICCVS), University of Gdańsk, 80-822 Gdańsk, Poland; University of Edinburgh, Institute of Genetics and Molecular Medicine, Edinburgh Cancer Research Centre, Edinburgh, Scotland, UK
- J Robert O'Neill
- Cambridge Oesophagogastric Centre, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Bodil Oster
- QIAGEN Aarhus, Silkeborgvej 2, 8000 Aarhus, Denmark
- Borek Vojtesek
- Research Centre for Applied Molecular Oncology, Masaryk Memorial Cancer Institute, Brno, Czech Republic
- Ali Al-Saadi
- University of Edinburgh, Institute of Genetics and Molecular Medicine, Edinburgh Cancer Research Centre, Edinburgh, Scotland, UK
- Natalia Marek-Trzonkowska
- International Centre for Cancer Vaccine Science (ICCVS), University of Gdańsk, 80-822 Gdańsk, Poland; Laboratory of Immunoregulation and Cellular Therapies, Department of Family Medicine, Medical University of Gdańsk, Gdańsk, Poland
- Ted R Hupp
- International Centre for Cancer Vaccine Science (ICCVS), University of Gdańsk, 80-822 Gdańsk, Poland; University of Edinburgh, Institute of Genetics and Molecular Medicine, Edinburgh Cancer Research Centre, Edinburgh, Scotland, UK

9
Mayers CG, Harrington TC, Wai A, Hausner G. Recent and Ongoing Horizontal Transfer of Mitochondrial Introns Between Two Fungal Tree Pathogens. Front Microbiol 2021; 12:656609. [PMID: 34149643 PMCID: PMC8208691 DOI: 10.3389/fmicb.2021.656609] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/21/2021] [Accepted: 04/09/2021] [Indexed: 11/23/2022] Open
Abstract
Two recently introduced fungal plant pathogens (Ceratocystis lukuohia and Ceratocystis huliohia) are responsible for Rapid ‘ōhi‘a Death (ROD) in Hawai‘i. Despite being sexually incompatible, the two pathogens often co-occur in diseased ‘ōhi‘a sapwood, where genetic interaction is possible. We sequenced and annotated 33 mitochondrial genomes of the two pathogens and related species, and investigated 35 total Ceratocystis mitogenomes. Ten mtDNA regions [one group I intron, seven group II introns, and two autonomous homing endonuclease (HE) genes] were heterogeneously present in C. lukuohia mitogenomes, which were otherwise identical. Molecular surveys with specific primers showed that the 10 regions had uneven geographic distribution amongst populations of C. lukuohia. Conversely, identical orthologs of each region were present in every studied isolate of C. huliohia regardless of geographical origin. Close relatives of C. lukuohia lacked or, rarely, had few and dissimilar orthologs of the 10 regions, whereas most relatives of C. huliohia had identical or nearly identical orthologs. Each region included or worked in tandem with HE genes or reverse transcriptase/maturases that could facilitate interspecific horizontal transfers from intron-minus to intron-plus alleles. These results suggest that the 10 regions originated in C. huliohia and are actively moving to populations of C. lukuohia, perhaps through transient cytoplasmic contact of hyphal tips (anastomosis) in the wound surface of ‘ōhi‘a trees. Such contact would allow for the transfer of mitochondria followed by mitochondrial fusion or cytoplasmic exchange of intron intermediaries, which suggests that further genomic interaction may also exist between the two pathogens.
Affiliation(s)
- Chase G Mayers
- Department of Plant Pathology and Microbiology, Iowa State University, Ames, IA, United States
- Thomas C Harrington
- Department of Plant Pathology and Microbiology, Iowa State University, Ames, IA, United States
- Alvan Wai
- Department of Microbiology, University of Manitoba, Winnipeg, MB, Canada
- Georg Hausner
- Department of Microbiology, University of Manitoba, Winnipeg, MB, Canada

10
Banerjee S, Bhandary P, Woodhouse M, Sen TZ, Wise RP, Andorf CM. FINDER: an automated software package to annotate eukaryotic genes from RNA-Seq data and associated protein sequences. BMC Bioinformatics 2021; 22:205. [PMID: 33879057 PMCID: PMC8056616 DOI: 10.1186/s12859-021-04120-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 02/05/2021] [Accepted: 04/07/2021] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Gene annotation in eukaryotes is a non-trivial task that requires meticulous analysis of accumulated transcript data. Challenges include transcriptionally active regions of the genome that contain overlapping genes, genes that produce numerous transcripts, transposable elements and numerous diverse sequence repeats. Currently available gene annotation software applications depend on pre-constructed full-length gene sequence assemblies which are not guaranteed to be error-free. The origins of these sequences are often uncertain, making it difficult to identify and rectify errors in them. This hinders the creation of an accurate and holistic representation of the transcriptomic landscape across multiple tissue types and experimental conditions. Therefore, to gauge the extent of diversity in gene structures, a comprehensive analysis of genome-wide expression data is imperative. RESULTS We present FINDER, a fully automated computational tool that optimizes the entire process of annotating genes and transcript structures. Unlike current state-of-the-art pipelines, FINDER automates the RNA-Seq pre-processing step by working directly with raw sequence reads and optimizes gene prediction from BRAKER2 by supplementing these reads with associated proteins. The FINDER pipeline (1) reports transcripts and recognizes genes that are expressed under specific conditions, (2) generates all possible alternatively spliced transcripts from expressed RNA-Seq data, (3) analyzes read coverage patterns to modify existing transcript models and create new ones, and (4) scores genes as high- or low-confidence based on the available evidence across multiple datasets. We demonstrate the ability of FINDER to automatically annotate a diverse pool of genomes from eight species. CONCLUSIONS FINDER takes a completely automated approach to annotate genes directly from raw expression data. 
It is capable of processing eukaryotic genomes of all sizes and requires no manual supervision, making it ideal for bench researchers with limited experience in handling computational tools.
Affiliation(s)
- Sagnik Banerjee
- Program in Bioinformatics and Computational Biology, Iowa State University, Ames, IA, 50011, USA
- Department of Statistics, Iowa State University, Ames, IA, 50011, USA
- Priyanka Bhandary
- Program in Bioinformatics and Computational Biology, Iowa State University, Ames, IA, 50011, USA
- Department of Genetics, Developmental and Cell Biology, Iowa State University, Ames, IA, 50011, USA
| | - Margaret Woodhouse
- Corn Insects and Crop Genetics Research Unit, USDA-Agricultural Research Service, Ames, IA, 50011, USA
| | - Taner Z Sen
- Crop Improvement and Genetics Research Unit, USDA-Agricultural Research Service, Albany, CA, 94710, USA
| | - Roger P Wise
- Corn Insects and Crop Genetics Research Unit, USDA-Agricultural Research Service, Ames, IA, 50011, USA
- Department of Plant Pathology and Microbiology, Iowa State University, Ames, IA, 50011, USA
| | - Carson M Andorf
- Corn Insects and Crop Genetics Research Unit, USDA-Agricultural Research Service, Ames, IA, 50011, USA.
- Department of Computer Science, Iowa State University, Ames, IA, 50011, USA.
|
11
|
Christian RW, Hewitt SL, Nelson G, Roalson EH, Dhingra A. Plastid transit peptides-where do they come from and where do they all belong? Multi-genome and pan-genomic assessment of chloroplast transit peptide evolution. PeerJ 2020; 8:e9772. [PMID: 32913678 PMCID: PMC7456531 DOI: 10.7717/peerj.9772] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 12/09/2019] [Accepted: 07/30/2020] [Indexed: 01/22/2023] Open
Abstract
Subcellular relocalization of proteins determines an organism's metabolic repertoire and thereby its survival in unique evolutionary niches. In plants, the plastid and its various morphotypes import a large and varied number of nuclear-encoded proteins to orchestrate vital biochemical reactions in a spatiotemporal context. Recent comparative genomics analysis and high-throughput shotgun proteomics data indicate that there are a large number of plastid-targeted proteins that are either semi-conserved or non-conserved across different lineages. This implies that homologs are differentially targeted across different species, which is feasible only if proteins have gained or lost plastid targeting peptides during evolution. In this study, a broad, multi-genome analysis of 15 phylogenetically diverse genera and in-depth analyses of pangenomes from Arabidopsis and Brachypodium were performed to address the question of how proteins acquire or lose plastid targeting peptides. The analysis revealed that random insertions or deletions were the dominant mechanism by which novel transit peptides are gained by proteins. While gene duplication was not a strict requirement for the acquisition of novel subcellular targeting, 40% of novel plastid-targeted genes were found to be most closely related to a sequence within the same genome, and of these, 30.5% resulted from alternative transcription or translation initiation sites. Interestingly, analysis of the distribution of amino acids in the transit peptides of known and predicted chloroplast-targeted proteins revealed monocot and eudicot-specific preferences in residue distribution.
Affiliation(s)
- Ryan W. Christian
- Molecular Plant Sciences, Washington State University, Pullman, WA, USA
| | - Seanna L. Hewitt
- Molecular Plant Sciences, Washington State University, Pullman, WA, USA
| | - Grant Nelson
- Molecular Plant Sciences, Washington State University, Pullman, WA, USA
| | - Eric H. Roalson
- Molecular Plant Sciences, Washington State University, Pullman, WA, USA
- School of Biological Sciences, Washington State University, Pullman, WA, USA
| | - Amit Dhingra
- Molecular Plant Sciences, Washington State University, Pullman, WA, USA
- Department of Horticulture, Washington State University, Pullman, WA, USA
|
12
|
Chang LW, Tseng IC, Wang LH, Sun YH. Isoform-specific functions of an evolutionarily conserved 3 bp micro-exon alternatively spliced from another exon in Drosophila homothorax gene. Sci Rep 2020; 10:12783. [PMID: 32732884 PMCID: PMC7392893 DOI: 10.1038/s41598-020-69644-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/18/2020] [Accepted: 07/14/2020] [Indexed: 12/03/2022] Open
Abstract
Micro-exons are exons of very small size (usually 3–30 nts). Some micro-exons are alternatively spliced. Their functions, regulation and evolution are largely unknown. Here, we present an example of an alternatively spliced 3 bp micro-exon (micro-Ex8) in the homothorax (hth) gene in Drosophila. Hth is involved in many developmental processes. It contains an MH domain and a TALE-class homeodomain (HD). It binds to another homeodomain protein, Exd, via its MH domain to promote the nuclear import of the Hth-Exd complex and serve as a cofactor for Hox proteins. The MH and HD domains in Hth as well as the Hth-Exd interaction are highly conserved in evolution. The alternatively spliced micro-exon lies between the exons encoding the MH and HD domains. We provide clear proof that the micro-Ex8 is produced by alternative splicing from a 48 bp full-length exon 8 (FL-Ex8) and that the micro-Ex8 is the first three nt of FL-Ex8. We found that the micro-Ex8 is the ancient form and the 3 + 48 organization of alternatively spliced overlapping exons only emerged in the Schizophora group of Diptera and is absolutely conserved in this group. We then used several strategies to test the in vivo function of the two types of isoforms and found that the micro-Ex8 and FL-Ex8 isoforms have largely overlapping functions but also have non-redundant functions that are tissue-specific, which supports their strong evolutionary conservation. Since the different combinations of protein interaction of Hth with Exd and/or Hox can have different DNA target specificity, our finding of alternatively spliced isoforms adds to the spectrum of structural and functional diversity under developmental regulation.
Affiliation(s)
- Ling-Wen Chang
- Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan, ROC
| | - I-Chieh Tseng
- Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan, ROC
- Institute of Genomic Sciences, National Yang-Ming University, Taipei, Taiwan, ROC
- Department of Life Science, Chinese Culture University, Taipei, Taiwan, ROC
| | - Lan-Hsin Wang
- Graduate Institute of Life Sciences, National Defense Medical Center, Taipei, Taiwan.
| | - Y Henry Sun
- Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan, ROC
- Institute of Genomic Sciences, National Yang-Ming University, Taipei, Taiwan, ROC
|
13
|
Comprehensive genomic analyses with 115 plastomes from algae to seed plants: structure, gene contents, GC contents, and introns. Genes Genomics 2020; 42:553-570. [PMID: 32200544 DOI: 10.1007/s13258-020-00923-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 02/28/2020] [Accepted: 03/09/2020] [Indexed: 02/08/2023]
Abstract
BACKGROUND Chloroplasts are a common feature of plants. The chloroplasts in each plant lineage have shaped their own genomes, plastomes, by structural changes and by transferring many genes to nuclear genomes during plant evolution. Some plastid genes have introns that are mostly group II introns. OBJECTIVE This study aimed to gain genomic and evolutionary insights into the plastomes from green algae to flowering plants. METHODS Plastomes of 115 species from green algae, bryophytes, pteridophytes (spore-bearing vascular plants), gymnosperms, and angiosperms were mined from the NCBI organelle genome database. Plastome structure, gene contents and GC contents were analyzed with in-house Python code. Intronic features including presence/absence, length, and intron phases were analyzed manually from the annotated information in NCBI. RESULTS The canonical quadripartite structures were retained in most plastomes except for a few plastomes that had lost an inverted repeat (IR). Expansion, reduction or deletion of IRs resulted in the length variation of the plastomes. The number of protein coding genes ranged from 40 to 92 with an average of 79.43 ± 5.84 per plastome, and gene losses were apparent in specific lineages. The number of trn genes ranged from 13 to 33 with an average of 21.19 ± 2.42 per plastome. Ribosomal RNA genes, rrn, were located in the IRs so that they were present in duplicate except in the species that had lost one of the IRs. GC contents varied from 24.9 to 51.0% with an average of 38.21 ± 3.27%, indicating a bias toward high AT contents. Plastid introns were present in 18 protein coding genes, six trn genes, and one rrn gene. Intron losses occurred among the orthologous genes in different plant lineages. The plastid introns were long compared with the nuclear introns, which might be related to the difference between spliceosomal nuclear introns and self-splicing group II plastid introns. The trnK-UUU intron contained the maturase-encoding matK gene except in the chlorophyte algae and monilophyte ferns, in which trnK-UUU was lost but matK was retained. There were many annotation artefacts in the intron positions in the NCBI database. In the analysis of intron phases, phase 0 introns were more frequent than those of phase 2 and 3 introns. Phase polymorphism was observed in the introns of clpP, which was derived from nucleotide insertion. Plastid trn introns were long compared to the archaeal or eukaryotic nuclear tRNA introns. Of the six plastid trn introns, one was at the D loop and the other five were at the anticodon loop. The insertion sites were conserved among the trn genes in archaea, eukaryotic nuclear and plastid tRNA genes. CONCLUSIONS The current study revisited the previous findings on structural variations, gene contents, and GC contents of the chloroplast genomes from green algae to flowering plants. The study also included some novel findings and discussions on the plastome introns, including their length variations and phase variation. We also presented and corrected some false annotations of the introns in protein coding and tRNA genes in the genome database, which might be confirmed by chloroplast transcriptome analysis in the future.
|
14
|
Qulsum U, Tsukahara T. Tissue-specific alternative splicing of pentatricopeptide repeat (PPR) family genes in Arabidopsis thaliana. Biosci Trends 2018; 12:569-579. [PMID: 30555111 DOI: 10.5582/bst.2018.01178] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/05/2022]
Abstract
Alternative splicing is a post- and co-transcriptional regulatory mechanism of gene expression. Pentatricopeptide repeat (PPR) family proteins were recently found to be involved in RNA editing in plants. The aim of this study was to investigate the tissue-specific expression and alternative splicing of PPR family genes and their effects on protein structure and functionality. Of the 27 PPR genes in Arabidopsis thaliana, we selected six PPR genes of the P subfamily that are likely alternatively spliced, which were confirmed by sequencing. Four of these genes show intron retention, and the two remaining genes have 3' alternative-splicing sites. Alternative-splicing events occurred in the coding regions of three genes and in the 3' UTRs of the three remaining genes. We also identified five previously unannotated alternatively spliced isoforms of these PPR genes, which were confirmed by PCR and sequencing. Among these, three contain 3' alternative-splicing sites, one contains a 5' alternative-splicing site, and the remaining gene contains a 3'-5' alternative-splicing site. The new isoforms of two genes affect protein structure, and three other alternative-splicing sites are located in 3' UTRs. These findings suggest that tissue-specific expression of different alternatively spliced transcripts occurs in Arabidopsis, even at different developmental stages.
Affiliation(s)
- Umme Qulsum
- School of Materials Science, Japan Advanced Institute of Science and Technology (JAIST)
| | - Toshifumi Tsukahara
- School of Materials Science, Japan Advanced Institute of Science and Technology (JAIST)
- Area of Bioscience and Biotechnology, School of Materials Science, Japan Advanced Institute of Science and Technology (JAIST)
- Division of Transdisciplinary Science, Japan Advanced Institute of Science and Technology (JAIST)
|
15
|
Ratcliffe CDH, Siddiqui N, Coelho PP, Laterreur N, Cookey TN, Sonenberg N, Park M. HGF-induced migration depends on the PI(3,4,5)P3-binding microexon-spliced variant of the Arf6 exchange factor cytohesin-1. J Cell Biol 2018; 218:285-298. [PMID: 30404949 PMCID: PMC6314551 DOI: 10.1083/jcb.201804106] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 04/17/2018] [Revised: 09/19/2018] [Accepted: 10/17/2018] [Indexed: 12/19/2022] Open
Abstract
Splice variants of the Arf6 guanine exchange factor cytohesin-1 display differential affinity for PI(4,5)P2 and PI(3,4,5)P3. Ratcliffe et al. show that the specific lipid binding of the diglycine variant of cytohesin-1 is needed for HGF-dependent cell migration and establishment of the leading edge, thereby regulating cancer cell migration following activation of the proto-oncogenic receptor tyrosine kinase Met. Differential inclusion or skipping of microexons is an increasingly recognized class of alternative splicing events. However, the functional significance of microexons and their contribution to signaling diversity is poorly understood. The Met receptor tyrosine kinase (RTK) modulates invasive growth and migration in development and cancer. Here, we show that microexon switching in the Arf6 guanine nucleotide exchange factor cytohesin-1 controls Met-dependent cell migration. Cytohesin-1 isoforms, differing by the inclusion of an evolutionarily conserved three-nucleotide microexon in the pleckstrin homology domain, display differential affinity for PI(4,5)P2 (triglycine) and PI(3,4,5)P3 (diglycine). We show that selective phosphoinositide recognition by cytohesin-1 isoforms promotes distinct subcellular localizations, whereby the triglycine isoform localizes to the plasma membrane and the diglycine to the leading edge. These data highlight microexon skipping as a mechanism to spatially restrict signaling and provide a mechanistic link between RTK-initiated phosphoinositide microdomains and Arf6 during signal transduction and cancer cell migration.
Affiliation(s)
- Colin D H Ratcliffe
- Rosalind and Morris Goodman Cancer Research Centre, McGill University, Montreal, Quebec, Canada
- Department of Biochemistry, McGill University, Montreal, Quebec, Canada
| | - Nadeem Siddiqui
- Rosalind and Morris Goodman Cancer Research Centre, McGill University, Montreal, Quebec, Canada
- Department of Biochemistry, McGill University, Montreal, Quebec, Canada
| | - Paula P Coelho
- Rosalind and Morris Goodman Cancer Research Centre, McGill University, Montreal, Quebec, Canada
- Department of Biochemistry, McGill University, Montreal, Quebec, Canada
| | - Nancy Laterreur
- Rosalind and Morris Goodman Cancer Research Centre, McGill University, Montreal, Quebec, Canada
| | - Tumini N Cookey
- Rosalind and Morris Goodman Cancer Research Centre, McGill University, Montreal, Quebec, Canada
| | - Nahum Sonenberg
- Rosalind and Morris Goodman Cancer Research Centre, McGill University, Montreal, Quebec, Canada
- Department of Biochemistry, McGill University, Montreal, Quebec, Canada
| | - Morag Park
- Rosalind and Morris Goodman Cancer Research Centre, McGill University, Montreal, Quebec, Canada
- Department of Biochemistry, McGill University, Montreal, Quebec, Canada
- Department of Medicine, McGill University, Montreal, Quebec, Canada
- Department of Oncology, McGill University, Montreal, Quebec, Canada
|
16
|
Cheng W, Zhou Y, Miao X, An C, Gao H. The Putative Smallest Introns in the Arabidopsis Genome. Genome Biol Evol 2018; 10:2551-2557. [PMID: 30184083 PMCID: PMC6161759 DOI: 10.1093/gbe/evy197] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Accepted: 08/30/2018] [Indexed: 12/15/2022] Open
Abstract
Most eukaryotic genes contain introns, which are noncoding sequences that are removed during pre-mRNA processing. Introns are usually preserved across evolutionary time. However, the sizes of introns vary greatly. In Arabidopsis, some introns are longer than 10 kilobase pairs (kb) and others are predicted to be shorter than 10 base pairs (bp). To identify the shortest intron in the genome, we analyzed the predicted introns in annotation version 10 of the Arabidopsis thaliana genome and found 103 predicted introns that are 30 bp or shorter, which make up only 0.08% of all introns in the genome. However, our own bioinformatics and experimental analyses found no evidence for the existence of these predicted introns. The predicted introns of 30–39 bp, 40–49 bp, and 50–59 bp in length are also rare and constitute only 0.07%, 0.2%, and 0.28% of all introns in the genome, respectively. An analysis of 30 predicted introns 31–59 bp long verified two in this range, both of which were 59 bp long. Thus, this study suggests that there is a limit to how small introns in A. thaliana can be, which is useful for understanding the evolution and processing of small introns in plants in general.
Affiliation(s)
- Wenzhen Cheng
- College of Biological Sciences and Technology, Beijing Forestry University, China
| | - Yunlin Zhou
- College of Biological Sciences and Technology, Beijing Forestry University, China
| | - Xin Miao
- College of Biological Sciences and Technology, Beijing Forestry University, China
| | - Chuanjing An
- College of Biological Sciences and Technology, Beijing Forestry University, China
| | - Hongbo Gao
- College of Biological Sciences and Technology, Beijing Forestry University, China
|
17
|
Nimmy SF, Kamal MS, Hossain MI, Dey N, Ashour AS, Shi F. Neural Skyline Filtering for Imbalance Features Classification. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS 2017. [DOI: 10.1142/s1469026817500195] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022]
Abstract
In the current digitalized era, large datasets play a vital role in feature extraction, information processing, knowledge mining and management. Sometimes, existing mining approaches are not sufficient to handle large volumes of data. Biological data processing also suffers from the same issue. In the present work, a classification process is carried out on a large volume of exons and introns from a set of raw data. The proposed work is designed in two parts: pre-processing and mapping-based classification. For pre-processing, three filtering techniques have been used. However, these traditional filtering techniques face difficulties with large datasets because of the long processing time and large memory they require. In this regard, a mapping-based neural skyline filtering approach is designed. A randomized algorithm performs the mapping for large volumes of data based on an objective function. The objective function determines the randomized size of the datasets according to their homogeneity. Around 200 million DNA base pairs have been used for experimental analysis. Experimental results show that mapping-centric filtering outperforms other filtering techniques during large data processing.
Affiliation(s)
- Sonia Farhana Nimmy
- Department of Computer Science and Engineering, Notre Dame University Bangladesh, Bangladesh
| | - Md. Sarwar Kamal
- Department of Computer Science and Engineering, East West University Bangladesh, Bangladesh
| | - Muhammad Iqbal Hossain
- Department of Computer Science and Engineering, BGC Trust University Bangladesh, Bangladesh
| | - Nilanjan Dey
- Department of Information Technology, Techno India College of Technology, India
| | - Amira S. Ashour
- Department of Electronics and Electrical, Communications Engineering Tanta University, Egypt
| | - Fuqian Shi
- College of Information and Engineering, Wenzhou Medical University, Wenzhou, P. R. China
|
18
|
Chang N, Sun Q, Hu J, An C, Gao AH. Large Introns of 5 to 10 Kilo Base Pairs Can Be Spliced out in Arabidopsis. Genes (Basel) 2017; 8:genes8080200. [PMID: 28800125 PMCID: PMC5575664 DOI: 10.3390/genes8080200] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 07/07/2017] [Revised: 08/04/2017] [Accepted: 08/07/2017] [Indexed: 11/22/2022] Open
Abstract
Most eukaryotic genes contain introns, which are removed from the pre-RNA during RNA processing. In contrast to the introns in animals, which are usually several kilo base pairs (kb) long, those in plants are generally very small, mostly ranging from dozens of base pairs (bp) to a few hundred bp. According to annotation version 10.0 of the genome of Arabidopsis thaliana, there are 127,854 introns in the nuclear genes; 99.23% of them are less than 1 kb, and only 16 introns are annotated to be larger than 5 kb, which are extremely large introns (ELI) in Arabidopsis. To learn whether these introns are true introns and how large introns can be in Arabidopsis, RT-PCR analysis of the genes containing these ELIs was carried out. The results indicated that some of these putative introns are indeed ELIs. These ELIs are mainly composed of transposons or transposable elements (TE), except for one, whose counterparts are also very long in diverse plant species. Thus, this study confirms the existence of introns larger than 5 kb or even 10 kb in Arabidopsis.
Affiliation(s)
- Ning Chang
- College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China.
| | - Qingqing Sun
- College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China.
| | - Jinglei Hu
- College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China.
| | - Chuanjing An
- College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China.
| | - Hongbo Gao
- College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China.
|
19
|
Osigus HJ, Eitel M, Schierwater B. Deep RNA sequencing reveals the smallest known mitochondrial micro exon in animals: The placozoan cox1 single base pair exon. PLoS One 2017; 12:e0177959. [PMID: 28542197 PMCID: PMC5436844 DOI: 10.1371/journal.pone.0177959] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 08/22/2016] [Accepted: 05/05/2017] [Indexed: 11/18/2022] Open
Abstract
The phylum Placozoa holds a key position for our understanding of the evolution of mitochondrial genomes in Metazoa. Placozoans possess large mitochondrial genomes which harbor several remarkable characteristics such as a fragmented cox1 gene and trans-splicing cox1 introns. A previous study also suggested the existence of cox1 mRNA editing in Trichoplax adhaerens, so far the only formally described species in the phylum Placozoa. We have analyzed RNA-seq data of the undescribed sister species, Placozoa sp. H2 ("Panama" clone), with special focus on the mitochondrial mRNA. While we did not find support for a previously postulated cox1 mRNA editing mechanism, we surprisingly found two independent transcripts representing intermediate cox1 mRNA splicing stages. Both transcripts consist of partial cox1 exon as well as overlapping intron fragments. The data suggest that the cox1 gene harbors a single base pair (cytosine) micro exon. Furthermore, conserved group I intron structures flank this unique micro exon also in other placozoans. We discuss the evolutionary origin of this micro exon in the context of a self-splicing intron gain in the cox1 gene of the last common ancestor of extant placozoans.
Affiliation(s)
- Hans-Jürgen Osigus
- ITZ, Ecology & Evolution, Stiftung Tierärztliche Hochschule Hannover, Hannover, Germany
| | - Michael Eitel
- ITZ, Ecology & Evolution, Stiftung Tierärztliche Hochschule Hannover, Hannover, Germany
| | - Bernd Schierwater
- ITZ, Ecology & Evolution, Stiftung Tierärztliche Hochschule Hannover, Hannover, Germany
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
- Sackler Institute for Comparative Genomics and Division of Invertebrate Zoology, American Museum of Natural History, New York, New York, United States of America
|
20
|
Guo L, Jiang L, Zhang Y, Lu XL, Xie Q, Weijers D, Liu CM. The anaphase-promoting complex initiates zygote division in Arabidopsis through degradation of cyclin B1. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2016; 86:161-74. [PMID: 26952278 DOI: 10.1111/tpj.13158] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 11/19/2015] [Revised: 02/27/2016] [Accepted: 03/01/2016] [Indexed: 05/03/2023]
Abstract
As the start of a new life cycle, activation of the first division of the zygote is a critical event in both plants and animals. Because the zygote in plants is difficult to access, our understanding of how this process is achieved remains poor. Here we report genetic and cell biological analyses of the zygote-arrest 1 (zyg1) mutant in Arabidopsis, which showed zygote lethality and over-accumulation of cyclin B1 D-box-GUS in ovules. Map-based cloning showed that ZYG1 encodes the anaphase-promoting complex/cyclosome (APC/C) subunit 11 (APC11). Live-cell imaging studies showed that APC11 is expressed in both egg and sperm cells, in zygotes and during early embryogenesis. Using a GFP-APC11 fusion construct that fully complements zyg1, we showed that GFP-APC11 expression persisted throughout the mitotic cell cycle, and localized to cell plates during cytokinesis. Expression of non-degradable cyclin B1 in the zygote, or mutations of either APC1 or APC4, also led to a zyg1-like phenotype. Biochemical studies showed that APC11 has self-ubiquitination activity and is able to ubiquitinate cyclin B1 and promote degradation of cyclin B1. These results together suggest that APC/C-mediated degradation of cyclin B1 in Arabidopsis is critical for initiating the first division of the zygote.
Affiliation(s)
- Lei Guo
- Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Li Jiang
- Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
| | - Ying Zhang
- Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
| | - Xiu-Li Lu
- Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
| | - Qi Xie
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
| | - Dolf Weijers
- Laboratory of Biochemistry, Wageningen University, Dreijenlaan 3, 6703 HA, Wageningen, The Netherlands
| | - Chun-Ming Liu
- Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
|