1
Abstract
Background: Identifying splice sites is a necessary step in analyzing the location and structure of genes. Two dinucleotides, GT and AG, are highly frequent at splice sites, and many other patterns with important biological functions also occur at splice sites. However, these dinucleotides also occur frequently in sequences without splice sites, which makes prediction prone to false positives. Most existing tools select all sequences containing the two dimers and then focus on distinguishing true splice sites from pseudo ones. Such an approach reduces false positives; however, it misses non-canonical splice sites. Results: We have designed SpliceFinder, based on a convolutional neural network (CNN), to predict splice sites. To achieve ab initio prediction, we used human genomic data to train our neural network. An iterative approach is adopted to reconstruct the dataset, which tackles the data imbalance problem and forces the model to learn more features of splice sites. The proposed CNN obtains a classification accuracy of 90.25%, which is 10% higher than the existing algorithms. The method outperforms other existing methods in terms of area under the receiver operating characteristic curve (AUC), recall, precision, and F1 score. Furthermore, SpliceFinder can find the exact position of splice sites on long genomic sequences with a sliding window. Compared with other state-of-the-art splice site prediction tools, SpliceFinder produces about half as many false positives while keeping recall higher than 0.8. SpliceFinder also captures non-canonical splice sites. In addition, SpliceFinder performs well on the genomic sequences of Drosophila melanogaster, Mus musculus, Rattus, and Danio rerio without retraining. Conclusion: Based on a CNN, we have proposed a new ab initio splice site prediction tool, SpliceFinder, which generates fewer false positives and can detect non-canonical splice sites. Additionally, SpliceFinder is transferable to other species without retraining. The source code and additional materials are available at https://gitlab.deepomics.org/wangruohan/SpliceFinder.
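The abstract above notes that GT and AG dinucleotides are common outside true splice sites and that a sliding window is used on long sequences. The following is a minimal sketch of those two ideas only, not SpliceFinder's implementation: the function names and the window width are hypothetical, and no classifier is included.

```python
from typing import Iterator, List, Tuple


def candidate_sites(seq: str) -> Tuple[List[int], List[int]]:
    """Return 0-based start positions of GT (candidate donor) and
    AG (candidate acceptor) dinucleotides in a DNA sequence."""
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return donors, acceptors


def sliding_windows(seq: str, width: int, step: int = 1) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for every full-width window.
    A real predictor would score each window; here we only enumerate them."""
    for start in range(0, len(seq) - width + 1, step):
        yield start, seq[start:start + width]


if __name__ == "__main__":
    toy = "AAGTCCAGGT"  # hypothetical toy sequence
    donors, acceptors = candidate_sites(toy)
    print("candidate donors:", donors)
    print("candidate acceptors:", acceptors)
    print("windows of width 4:", len(list(sliding_windows(toy, 4))))
```

Even on a ten-base toy sequence, several candidate positions appear, which illustrates why dinucleotide matching alone tends to produce many false positives.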
2
Abstract
Glucosinolate-myrosinase is a substrate-enzyme defense mechanism present in Brassica crops. This binary system provides the plant with an efficient defense against herbivores and pathogens. For humans, it is well known for its anti-carcinogenic, anti-inflammatory, immunomodulatory, anti-bacterial, cardio-protective, and central nervous system protective activities. Glucosinolate and myrosinase are located in different cells; upon tissue disruption they come together, resulting in the formation of a variety of hydrolysis products with diverse physicochemical and biological properties. The myrosinase-catalyzed reaction starts with cleavage of the thioglucosidic linkage, resulting in release of a D-glucose and an unstable thiohydroximate-O-sulfate. The fate of this thiohydroximate-O-sulfate has been shown to depend on the structure of the glucosinolate side chain, the presence of supplementary proteins known as specifier proteins, and/or the physicochemical conditions. Myrosinase was first reported in mustard seed in 1939 as a protein responsible for release of essential oil. To date, myrosinases have been characterized from more than 20 species of Brassica, cabbage aphid, and many bacteria residing in the human intestine. All the plant myrosinases are reported to be activated by ascorbic acid, while aphid and bacterial myrosinases are found to be either unaffected or inhibited. Myrosinase catalyzes hydrolysis of the S-glycosyl bond, O-β glycosyl bond, and O-glycosyl bond. This review summarizes information on myrosinase, an essential component of this binary system, including its structural and molecular properties, mechanism of action, and regulation, and will benefit ongoing research aimed at understanding and improving the glucosinolate-myrosinase system from an ecological and nutraceutical perspective.
3
Identification, Molecular Characterization, and In Silico Structural Analysis of Carboxypeptidase B2 of Anopheles stephensi. JOURNAL OF MEDICAL ENTOMOLOGY 2019; 56:72-85. [PMID: 30124910 DOI: 10.1093/jme/tjy127] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 06/14/2017] [Indexed: 06/08/2023]
Abstract
Malaria is a vector-borne infectious disease that is considered a priority of the World Health Organization due to its enormous impact on global health. Plasmodium spp. (Haemosporida: Plasmodiidae), Anopheles spp. (Diptera: Culicidae), and a suitable host are the key elements for malaria transmission. To disrupt the parasitic life cycle of malaria or prevent its transmission, these three key elements should be targeted by effective control strategies. Development of vaccines that interrupt malaria transmission is one of the solutions that has been recommended to the countries that aim to eliminate malaria. Given the important role of Anopheles stephensi in malaria transmission and the involvement of Anopheles carboxypeptidase B1 in sexual parasite development, we characterized the second member of the cpb gene family (cpbAs2) of An. stephensi to provide some basic information and to evaluate the significance of cpbAs2 in complementing the role of cpbAs1 in sexual Plasmodium development. The cpbAs2 mRNA sequence was characterized by 3' and 5' RACE, and the structural features of its encoded protein were studied by in silico modeling. The coding sequence and gene structure of cpbAs2 were determined empirically and compared with the in silico predictions from the An. stephensi genome sequencing project. Furthermore, homology modeling revealed that its structure is very similar to the structurally important domains of procarboxypeptidase B2 in humans. This study provides basic molecular and structural information about another member of the cpb gene family of An. stephensi. The reported results are informative and necessary for evaluation of the role of this gene in sexual parasite development by future studies.
4
Arabidopsis Myrosinase Genes AtTGG4 and AtTGG5 Are Root-Tip Specific and Contribute to Auxin Biosynthesis and Root-Growth Regulation. Int J Mol Sci 2016; 17:ijms17060892. [PMID: 27338341 PMCID: PMC4926426 DOI: 10.3390/ijms17060892] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 04/27/2016] [Revised: 05/31/2016] [Accepted: 06/02/2016] [Indexed: 11/17/2022]
Abstract
Plant myrosinases (β-thioglucoside glucohydrolases) are classified into two subclasses, Myr I and Myr II. The biological function of Myr I has been characterized as a major biochemical defense against insect pests and pathogens in cruciferous plants. However, the biological function of Myr II remains obscure. We studied the function of two Myr II member genes, AtTGG4 and AtTGG5, in Arabidopsis. RT-PCR showed that both genes were specifically expressed in roots. GUS assay revealed that both genes were expressed in the root-tip but with a difference: AtTGG4 was expressed in the elongation zone of the root-tip, while AtTGG5 was expressed in the whole root-tip. Moreover, myrosin cells that produce and store the Myr I myrosinases in aboveground organs were not observed in roots, and AtTGG4 and AtTGG5 were expressed in all cells of the specific region. A homozygous double mutant line tgg4tgg5 was obtained through cross-pollination between two T-DNA insertion lines, tgg4E8 and tgg5E12, by PCR-screening in the F2 and F3 generations. Analysis of myrosinase activity in roots of mutants revealed that AtTGG4 and AtTGG5 had additive effects and contributed 35% and 65% of myrosinase activity in roots of the wild type Col-0, respectively, and myrosinase activity in tgg4tgg5 was severely repressed. When grown in Murashige & Skoog (MS) medium or in soil with sufficient water, Col-0 had the shortest roots, and tgg4tgg5 had the longest roots, while tgg4E8 and tgg5E12 had intermediate root lengths. In contrast, when grown in soil with excessive water, Col-0 had the longest roots, and tgg4tgg5 had the shortest roots. These results suggested that AtTGG4 and AtTGG5 regulated root growth and had a role in flood tolerance. The auxin-indicator gene DR5::GUS was then introduced into tgg4tgg5 by cross-pollination. DR5::GUS expression patterns in seedlings of F1, F2, and F3 generations indicated that AtTGG4 and AtTGG5 contributed to auxin biosynthesis in roots. The proposed mechanism is that indolic glucosinolate is transported to the root-tip and converted to indole-3-acetonitrile (IAN) in the tryptophan-dependent pathways by AtTGG4 and AtTGG5, and IAN is finally converted to indole-3-acetic acid (IAA) by nitrilases in the root-tip. This mechanism guarantees the biosynthesis of IAA in the correct cells of the root-tip and, thus, a correct auxin gradient is formed for healthy development of roots.
5
Curtius-like Rearrangement of an Iron–Nitrenoid Complex and Application in Biomimetic Synthesis of Bisindolylmethanes. Org Lett 2016; 18:2228-31. [PMID: 27116426 DOI: 10.1021/acs.orglett.6b00864] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Indexed: 11/30/2022]
6
Identification and Evolution of Functional Alleles of the Previously Described Pollen Specific Myrosinase Pseudogene AtTGG6 in Arabidopsis thaliana. Int J Mol Sci 2016; 17:262. [PMID: 26907263 PMCID: PMC4783991 DOI: 10.3390/ijms17020262] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/08/2016] [Revised: 01/26/2016] [Accepted: 02/16/2016] [Indexed: 11/25/2022]
Abstract
Myrosinases are β-thioglucoside glucohydrolases and serve as defense mechanisms against insect pests and pathogens by producing toxic compounds. AtTGG6 in Arabidopsis thaliana was previously reported to be a myrosinase pseudogene but specifically expressed in pollen. However, we found that AlTGG6, an ortholog to AtTGG6 in A. lyrata (an outcrossing relative of A. thaliana) was functional, suggesting that functional AtTGG6 alleles may still exist in A. thaliana. AtTGG6 alleles in 29 A. thaliana ecotypes were cloned and sequenced. Results indicate that ten alleles were functional and encoded Myr II type myrosinase of 512 amino acids, and myrosinase activity was confirmed by overexpressing AtTGG6 in Pichia pastoris. However, the 19 other ecotypes had disabled alleles with highly polymorphic frame-shift mutations and diversified sequences. Thirteen frame-shift mutation types were identified, which occurred independently many times in the evolutionary history within a few thousand years. The functional allele was expressed specifically in pollen similar to the disabled alleles but at a higher expression level, suggesting its role in defense of pollen against insect pests such as pollen beetles. However, the defense function may have become less critical after A. thaliana evolved to self-fertilization, and thus resulted in loss of function in most ecotypes.
7
Alternative splicing and its impact as a cancer diagnostic marker. Genomics Inform 2012; 10:74-80. [PMID: 23105933 PMCID: PMC3480681 DOI: 10.5808/gi.2012.10.2.74] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 05/01/2012] [Revised: 05/18/2012] [Accepted: 05/21/2012] [Indexed: 01/13/2023]
Abstract
Most genes undergo alternative splicing during gene expression, which contributes to the complexity of the eukaryotic transcriptome. It allows a limited number of genes to encode various proteins with intricate functions. Alternative splicing is regulated by genetic mutations in cis-regulatory factors and epigenetic events. Furthermore, splicing events occur differently according to cell type, developmental stage, and various diseases, including cancer. Genome instability and flexible proteomes produced by alternative splicing could help cancer cells grow and survive, leading to metastasis. Cancer cells that are transformed by aberrant and uncontrolled mechanisms could use alternative splicing to maintain themselves and spread continuously. Splicing variants play crucial roles in tumorigenesis in various cancers. Taken together, the identification of alternatively spliced variants as biomarkers to distinguish between normal and cancer cells could cast light on tumorigenesis.
8
Characterization of a novel β-thioglucosidase CpTGG1 in Carica papaya and its substrate-dependent and ascorbic acid-independent O-β-glucosidase activity. JOURNAL OF INTEGRATIVE PLANT BIOLOGY 2010; 52:879-90. [PMID: 20883440 DOI: 10.1111/j.1744-7909.2010.00988.x] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 05/21/2023]
Abstract
Plant thioglucosidases are the only known S-glycosidases in the large superfamily of glycosidases. These enzymes evolved more recently and are distributed mainly in Brassicales. Thioglucosidase research has focused mainly on the cruciferous crops due to their economic importance and cancer preventive benefits. In this study, we cloned a novel myrosinase gene, CpTGG1, from Carica papaya Linnaeus and showed that it was expressed in the aboveground tissues in planta. The recombinant CpTGG1 expressed in Pichia pastoris catalyzed the hydrolysis of both sinigrin and glucotropaeolin (the only thioglucoside present in papaya), showing that CpTGG1 was indeed a functional myrosinase gene. Sequence alignment analysis indicated that CpTGG1 contained all the motifs conserved in functional myrosinases from crucifers, except for two aglycon-binding motifs, suggesting substrate priority variation of the non-cruciferous myrosinases. Using sinigrin as substrate, the apparent Km and Vmax values of recombinant CpTGG1 were 2.82 mM and 59.9 μmol min⁻¹ mg protein⁻¹, respectively. The kcat/Km value was 23 s⁻¹ mM⁻¹. O-β-glucosidase activity towards a variety of substrates was tested; CpTGG1 displayed substrate-dependent and ascorbic acid-independent O-β-glucosidase activity towards 2-nitrophenyl-β-D-glucopyranoside and 4-nitrophenyl-β-D-glucopyranoside, but was inactive towards glucovanillin and n-octyl-β-D-glucopyranoside. Phylogenetic analysis indicated that CpTGG1 belongs to the MYR II subfamily of myrosinases.
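To make the kinetic constants reported above concrete, the sketch below evaluates the standard Michaelis-Menten rate law with the abstract's Km and Vmax and derives kcat from the reported kcat/Km. It assumes simple Michaelis-Menten behaviour, which the abstract does not state explicitly; the function and constant names are illustrative.

```python
# Values taken from the abstract (sinigrin as substrate).
KM_MM = 2.82            # apparent Km, in mM
VMAX = 59.9             # apparent Vmax, in umol min^-1 mg protein^-1
KCAT_OVER_KM = 23.0     # catalytic efficiency, in s^-1 mM^-1


def mm_rate(substrate_mm: float, vmax: float = VMAX, km: float = KM_MM) -> float:
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S]).
    Assumes simple single-substrate kinetics (an assumption, not a claim
    made in the abstract)."""
    return vmax * substrate_mm / (km + substrate_mm)


# kcat follows from the two reported quantities: kcat = (kcat/Km) * Km.
KCAT_PER_S = KCAT_OVER_KM * KM_MM

if __name__ == "__main__":
    for s in (0.5, KM_MM, 10.0):
        print(f"[S] = {s:5.2f} mM -> v = {mm_rate(s):6.2f} umol/min/mg")
    print(f"kcat = {KCAT_PER_S:.2f} s^-1")
```

At a substrate concentration equal to Km the rate is half of Vmax, which is the defining property of Km under this model.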
9
Analysis of class III peroxidase genes expressed in roots of resistant and susceptible wheat lines infected by Heterodera avenae. MOLECULAR PLANT-MICROBE INTERACTIONS : MPMI 2009; 22:1081-92. [PMID: 19656043 DOI: 10.1094/mpmi-22-9-1081] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 05/11/2023]
Abstract
The response of resistant wheat-Aegilops ventricosa introgression line H-93-8 and its susceptible parent, Triticum aestivum H-10-15, to Ha71 Spanish population of Heterodera avenae was studied to determine the changes in peroxidase gene expression during incompatible and compatible wheat-nematode interactions. Twenty peroxidase genes were characterized from both 211 expressed sequence tags and 259 genomic DNA clones. Alignment of deduced amino acid sequences and phylogenetic clustering with peroxidases from other plant species showed that these enzymes fall into seven different groups (designated TaPrx108 to TaPrx114) which represent peroxidases secreted to the apoplast by a putative N-terminal peptide signal. TaPrx111, TaPrx112, and TaPrx113 were induced by nematode infection in both genotypes but with differing magnitude and timing. TaPrx112 and TaPrx113 groups increased more in resistant than in susceptible infected lines. In addition, in situ hybridization analyses of genes belonging to TaPrx111, TaPrx112, and TaPrx113 groups revealed a more intense signal in cells close to the vascular cylinder and parenchyma vascular cells of resistant than susceptible wheat when challenged by nematodes. These data seem to suggest that wheat apoplastic peroxidases, because of their different expression in quantity and timing, play different roles in the plant response to nematode infection.
10
Recruitment of alkaloid-specific homospermidine synthase (HSS) from ubiquitous deoxyhypusine synthase: Does Crotalaria possess a functional HSS that still has DHS activity? PHYTOCHEMISTRY 2005; 66:1346-57. [PMID: 15935411 DOI: 10.1016/j.phytochem.2005.04.013] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 01/31/2005] [Revised: 04/04/2005] [Accepted: 04/07/2005] [Indexed: 05/02/2023]
Abstract
Quinolizidine alkaloids are the most prominent group of alkaloids occurring in legumes, except for many members of the tribe Crotalarieae that accumulate pyrrolizidine alkaloids (PAs). To study the evolution of PA biosynthesis as a typical pathway of plant secondary metabolism in this tribe, we have searched for a cDNA coding for homospermidine synthase (HSS), the enzyme catalyzing the first specific step in this biosynthesis. HSS was shown to have been recruited from deoxyhypusine synthase (DHS) by independent gene duplication in several different angiosperm lineages during evolution. Except for a cDNA sequence coding for the DHS of Crotalaria retusa, no data is available concerning the origin of PA biosynthesis within this tribe of the Fabaceae. In addition to several pseudogenes, we have identified one functional DHS in C. scassellatii and two in C. juncea. Despite C. juncea plants under study being devoid of PAs, we have found that the two sequences of C. juncea are different with respect to their genomic organization, their tissue-specific expression, and their biochemical activities. Supported by the branching pattern of a maximum likelihood analysis of these sequences, they have been classified as "class 1" and "class 2" DHS. It remains open whether the duplicated DHS belonging to class 2 is involved in the biosynthesis of PAs.
11
Functional genomic analysis of Arabidopsis thaliana glycoside hydrolase family 1. PLANT MOLECULAR BIOLOGY 2004; 55:343-67. [PMID: 15604686 DOI: 10.1007/s11103-004-0790-1] [Citation(s) in RCA: 187] [Impact Index Per Article: 9.4] [Indexed: 05/20/2023]
Abstract
In plants, Glycoside Hydrolase (GH) Family 1 β-glycosidases are believed to play important roles in many diverse processes including chemical defense against herbivory, lignification, hydrolysis of cell wall-derived oligosaccharides during germination, and control of active phytohormone levels. Completion of the Arabidopsis thaliana genome sequencing project has enabled us, for the first time, to determine the total number of Family 1 members in a higher plant. Reiterative database searches revealed a multigene family of 48 members that includes eight probable pseudogenes. Manual reannotation and analysis of the entire family were undertaken to rectify existing misannotations and identify phylogenetic relationships among family members. Forty-seven members (designated BGLU1 through BGLU47) share a common evolutionary origin and were subdivided into approximately 10 subfamilies based on phylogenetic analysis and consideration of intron-exon organizations. The forty-eighth member of this family (At3g06510; sfr2) is a β-glucosidase-like gene that belongs to a distinct lineage. Information pertaining to expression patterns and potential functions of Arabidopsis GH Family 1 members is presented. To determine the biological function of all family members, we intend to investigate the substrate specificity of each mature hydrolase after its heterologous expression in the Pichia pastoris expression system. To test the validity of this approach, the BGLU44-encoded hydrolase was expressed in P. pastoris and purified to homogeneity. When tested against a wide range of natural and synthetic substrates, this enzyme showed a preference for β-mannosides including 1,4-β-D-mannooligosaccharides, suggesting that it may be involved in A. thaliana in degradation of mannans, galactomannans, or glucogalactomannans. Supporting this notion, BGLU44 shared high sequence identity and similar gene organization with tomato endosperm β-mannosidase and barley seed β-glucosidase/β-mannosidase BGQ60.
12
The SAC domain-containing protein gene family in Arabidopsis. PLANT PHYSIOLOGY 2003; 132:544-55. [PMID: 12805586 PMCID: PMC166996 DOI: 10.1104/pp.103.021444] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.5] [Received: 01/31/2003] [Revised: 03/04/2003] [Accepted: 03/26/2003] [Indexed: 05/18/2023]
Abstract
The SAC domain was first identified in the yeast (Saccharomyces cerevisiae) Sac1p phosphoinositide phosphatase protein and subsequently found in a number of proteins from yeast and animals. The SAC domain is approximately 400 amino acids in length and is characterized by seven conserved motifs. The SAC domains of several proteins have been recently demonstrated to possess phosphoinositide phosphatase activities. Sac1p has been shown to regulate the levels of various phosphoinositides in the phosphoinositide pool and affect diverse cellular functions such as actin cytoskeleton organization, Golgi function, and maintenance of vacuole morphology. The Arabidopsis genome contains a total of nine genes encoding SAC domain-containing proteins (AtSACs). The SAC domains of the AtSACs possess the conserved amino acid motifs that are believed to be important for the phosphoinositide phosphatase activities of yeast and animal SAC domain proteins. AtSACs can be divided into three subgroups based on their sequence similarities, hydropathy profiles, and phylogenetic relationship. Gene expression analysis demonstrated that the AtSAC genes exhibited differential expression patterns in different organs and, in particular, the AtSAC6 gene was predominantly expressed in flowers. Moreover, the expression of the AtSAC6 gene was highly induced by salinity. These results provide a foundation for future studies on the elucidation of the cellular functions of SAC domain-containing proteins in Arabidopsis.
13
Chapter four Localization of plant myrosinases and glucosinolates. ACTA ACUST UNITED AC 2003. [DOI: 10.1016/s0079-9920(03)80019-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2023]
14
Investigation of the microheterogeneity and aglycone specificity-conferring residues of black cherry prunasin hydrolases. PLANT PHYSIOLOGY 2002; 129:1252-64. [PMID: 12114579 PMCID: PMC166519 DOI: 10.1104/pp.010863] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.2] [Received: 09/21/2001] [Revised: 01/24/2002] [Accepted: 04/05/2002] [Indexed: 05/20/2023]
Abstract
In black cherry (Prunus serotina Ehrh.) seed homogenates, (R)-amygdalin is degraded to HCN, benzaldehyde, and glucose by the sequential action of amygdalin hydrolase (AH), prunasin hydrolase (PH), and mandelonitrile lyase. Leaves are also highly cyanogenic because they possess (R)-prunasin, PH, and mandelonitrile lyase. Taking both enzymological and molecular approaches, we demonstrate here that black cherry PH is encoded by a putative multigene family of at least five members. Their respective cDNAs (designated Ph1, Ph2, Ph3, Ph4, and Ph5) predict isoforms that share 49% to 92% amino acid identity with members of glycoside hydrolase family 1, including their catalytic asparagine-glutamate-proline and isoleucine-threonine-glutamate-asparagine-glycine motifs. Furthermore, consistent with the vacuolar/protein body location and glycoprotein character of these hydrolases, their open reading frames predict N-terminal signal sequences and multiple potential N-glycosylation sites. Genomic sequences corresponding to the open reading frames of these PHs and of the previously isolated AH1 isoform are interrupted at identical positions by 12 introns. Earlier studies established that native AH and PH display strict specificities toward their respective glucosidic substrates. Such behavior was also shown by recombinant AH1, PH2, and PH4 proteins after expression in Pichia pastoris. Three amino acid moieties that may play a role in conferring such aglycone specificities were predicted by structural modeling and comparative sequence analysis and tested by introducing single and multiple mutations into isoform AH1 by site-directed mutagenesis. The double mutant AH ID (Y200I and G394D) hydrolyzed prunasin at approximately 150% of the rate of amygdalin hydrolysis, whereas the other mutations failed to engender PH activity.
15
The third myrosinase gene TGG3 in Arabidopsis thaliana is a pseudogene specifically expressed in stamen and petal. PHYSIOLOGIA PLANTARUM 2002; 115:25-34. [PMID: 12010464 DOI: 10.1034/j.1399-3054.2002.1150103.x] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.6] [Indexed: 05/23/2023]
Abstract
Genomic clones and full-length cDNA for the myrosinase gene TGG3 from Arabidopsis thaliana ecotype Columbia were sequenced. The TGG3 gene was similar to the earlier described myrosinase genes and shared the conserved intron/exon splice sites but had an insertion of one nucleotide in exon 5, a deletion of two nucleotides in exon 6 and a deletion of approximately 210 nucleotides in exon 12. These mutations shifted the open reading frame in exon 5 and resulted in a truncated protein. Analysis of the TGG3 DNA sequence from five other Arabidopsis ecotypes showed polymorphisms, but in no case did a functional TGG3 gene appear to be present. Although TGG3 apparently is a pseudogene, it was expressed specifically in stamen and petal according to RT-PCR analysis, while TGG1 and TGG2 transcripts were present in most of the tested tissues. Western blot analysis showed only one myrosinase band of 68 kDa corresponding to TGG1 and TGG2 in flower samples, while no band corresponding to TGG3 was detected. Apparently only two functional myrosinases are present in this gene family in Arabidopsis.
16
Pyk10, a seedling and root specific gene and promoter from Arabidopsis thaliana. PLANT SCIENCE : AN INTERNATIONAL JOURNAL OF EXPERIMENTAL PLANT BIOLOGY 2001; 161:337-346. [PMID: 11448764 DOI: 10.1016/s0168-9452(01)00412-5] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.3] [Indexed: 05/20/2023]
Abstract
Pyk10 is a root and hypocotyl specific myrosinase from Arabidopsis thaliana. Northern analysis revealed the root specific expression of pyk10. In order to study the pyk10 promoter and the genomic structure of the gene, a genomic clone was isolated and sequenced. The clone contained the complete pyk10 gene and a promoter region of 3569 bp. The gene spans 2963 bp and consists of 12 exons and 11 introns, a structure that reflects the common gene organization of myrosinases. Within the promoter sequence, different development specific, organ specific, elicitor and plant hormone responsive regulatory elements could be identified, which also occur in other promoters. To determine the pattern of expression, four different 5'-promoter deletion fragments were linked to a β-glucuronidase (gus) reporter gene and transformed into A. thaliana. The results demonstrated that the pyk10 promoter mediates a developmental gene activity with a strong emphasis in the root. Cis-acting sequences regulating root specific expression were identified to reside in the two promoter fragments B and C.
17
Identification and characterization of soluble and insoluble myrosinase isoenzymes in different organs of Sinapis alba. PHYSIOLOGIA PLANTARUM 2001; 111:353-364. [PMID: 11240920 DOI: 10.1034/j.1399-3054.2001.1110313.x] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.8] [Indexed: 05/23/2023]
Abstract
Extraction of Sinapis alba seeds under native conditions solubilized three myrosinase isoforms, pool I, II and III, which could be separated by ion exchange chromatography. Sequencing of numerous peptides of the I and III isoforms showed that they belonged to the Myrosinase A (MA) family of myrosinases and that they were encoded by different genes. Western blot analysis of S. alba seed proteins, extracted with a sodium dodecyl sulphate-containing buffer, using an anti-myrosinase monoclonal antibody, showed the presence of two additional myrosinase isoforms with approximate molecular sizes of 62 and 59 kDa. These myrosinases, which could only be solubilized from seeds by inclusion of denaturing agents in the extraction buffer, were identified by sequence analysis as MB myrosinases. These isoenzymes or very similar forms were also present in seedling cotyledons. However, from this tissue, they could be extracted with non-denaturing buffers. In addition, cotyledons contained a 65-kDa MB myrosinase not found in seeds. In contrast, seedling cotyledons contained only minute amounts of pool I and no pool III MA myrosinases, emphasizing the tissue-specific expression of the corresponding gene families. Sequence analysis of myrosinase cDNAs, generated by reverse transcription-polymerase chain reaction using degenerate primers and mRNA isolated from seeds, cotyledons and leaves, confirmed that the MA isoforms were expressed only in seed tissue, while MB myrosinases were found in all tissues investigated. Furthermore, seed and leaf contained unique MB myrosinase transcripts, suggesting organ-specific expression of individual MB genes.
18
Abstract
A set of 43 337 splice junction pairs was extracted from mammalian GenBank annotated genes. Expressed sequence tag (EST) sequences support 22 489 of them. Of these, 98.71% contain canonical dinucleotides GT and AG for donor and acceptor sites, respectively; 0.56% hold non-canonical GC-AG splice site pairs; and the remaining 0.73% occurs in a lot of small groups (with a maximum size of 0.05%). Studying these groups we observe that many of them contain splicing dinucleotides shifted from the annotated splice junction by one position. After close examination of such cases we present a new classification consisting of only eight observed types of splice site pairs (out of 256 a priori possible combinations). EST alignments allow us to verify the exonic part of the splice sites, but many non-canonical cases may be due to intron sequencing errors. This idea is given substantial support when we compare the sequences of human genes having non-canonical splice sites deposited in GenBank by high throughput genome sequencing projects (HTG). A high proportion (156 out of 171) of the human non-canonical and EST-supported splice site sequences had a clear match in the human HTG. They can be classified after corrections as: 79 GC-AG pairs (of which one was an error that corrected to GC-AG), 61 errors that were corrected to GT-AG canonical pairs, six AT-AC pairs (of which two were errors that corrected to AT-AC), one case was produced from non-existent intron, seven cases were found in HTG that were deposited to GenBank and finally there were only two cases left of supported non-canonical splice sites. If we assume that approximately the same situation is true for the whole set of annotated mammalian non-canonical splice sites, then the 99.24% of splice site pairs should be GT-AG, 0.69% GC-AG, 0.05% AT-AC and finally only 0.02% could consist of other types of non-canonical splice sites. 
We analyze several characteristics of EST-verified splice sites and build weight matrices for the major groups, which can be incorporated into gene prediction programs. We also present a set of EST-verified canonical splice sites larger by two orders of magnitude than the current one (22 199 entries versus approximately 600) and finally, a set of 290 EST-supported non-canonical splice sites. Both sets should be significant for future investigations of the splicing mechanism.
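The counts quoted in this abstract can be checked with a short sketch. The figure of 256 follows from 16 possible donor dinucleotides times 16 possible acceptor dinucleotides, and the six corrected categories add up to the 156 HTG-matched cases. The function name `classify_pair`, the category labels and the dictionary keys are illustrative assumptions and are not taken from the cited work.

```python
from itertools import product

BASES = "ACGT"

# All 16 dinucleotides, then every (donor, acceptor) combination.
DINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=2)]
ALL_PAIRS = [(donor, acceptor) for donor in DINUCLEOTIDES for acceptor in DINUCLEOTIDES]


def classify_pair(donor: str, acceptor: str) -> str:
    """Label a splice site pair by its intron-terminal dinucleotides.

    Only the three major groups named in the abstract are distinguished;
    everything else is lumped together (hypothetical labels).
    """
    pair = (donor.upper(), acceptor.upper())
    if pair == ("GT", "AG"):
        return "canonical GT-AG"
    if pair == ("GC", "AG"):
        return "non-canonical GC-AG"
    if pair == ("AT", "AC"):
        return "non-canonical AT-AC"
    return "other non-canonical"


# Counts as stated in the abstract for the HTG-matched human cases.
HTG_MATCHED = {
    "GC-AG pairs": 79,
    "errors corrected to GT-AG": 61,
    "AT-AC pairs": 6,
    "non-existent intron": 1,
    "HTG cases deposited to GenBank": 7,
    "remaining supported non-canonical": 2,
}
total_matched = sum(HTG_MATCHED.values())

if __name__ == "__main__":
    print(len(ALL_PAIRS))               # number of a priori combinations
    print(total_matched)                # should equal the 156 matched cases
    print(classify_pair("gc", "ag"))
```

Representing a pair as a tuple of two dinucleotides keeps the classification independent of how the flanking sequence is stored; a real pipeline would extract the dinucleotides from annotated intron boundaries first.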
|
19
|
Abstract
Glucosinolates are a category of secondary products present primarily in species of the order Capparales. When tissue is damaged, for example by herbivory, glucosinolates are degraded in a reaction catalyzed by thioglucosidases, denoted myrosinases, also present in these species. Thereby, toxic compounds such as nitriles, isothiocyanates, epithionitriles and thiocyanates are released. The glucosinolate-myrosinase system is generally believed to be part of the plant's defense against insects, and possibly also against pathogens. In this review, the evolution of the system and its impact on the interaction between plants and insects are discussed. Further, data suggesting additional functions in the defense against pathogens and in sulfur metabolism are reviewed.
|
20
|
A splice site mutant of maize activates cryptic splice sites, elicits intron inclusion and exon exclusion, and permits branch point elucidation. Plant Physiology 1999; 121:411-8. [PMID: 10517832 PMCID: PMC59403 DOI: 10.1104/pp.121.2.411] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.9] [Received: 04/23/1999] [Accepted: 06/25/1999] [Indexed: 05/17/2023]
Abstract
DNA sequence analysis of the bt2-7503 mutant allele of the maize brittle-2 gene revealed a point mutation in the 5' terminal sequence of intron 3 changing GT to AT. This lesion completely abolishes use of this splice site, activates two cryptic splice sites, and alters the splicing pattern from extant splice sites. One activated donor site, located nine nt 5' to the normal splice donor site, begins with the dinucleotide GC. While non-consensus, this sequence still permits both trans-esterification reactions of pre-mRNA splicing. A second cryptic site located 23 nt 5' to the normal splice site and beginning with GA, undergoes the first trans-esterification reaction leading to lariat formation, but lacks the ability to participate in the second reaction. Accumulation of this splicing intermediate and use of an innovative reverse transcriptase-polymerase chain reaction technique (J. Vogel, R.H. Wolfgang, T. Borner [1997] Nucleic Acids Res 25: 2030-2031) led to the identification of 3' intron sequences needed for lariat formation. In most splicing reactions, neither cryptic site is recognized. Most mature transcripts include intron 3, while the second most frequent class lacks exon 3. Traditionally, the former class of transcripts is taken as evidence for the intron definition of splicing, while the latter class has given credence to the exon definition of splicing.
|
21
|
AT-AC pre-mRNA splicing mechanisms and conservation of minor introns in voltage-gated ion channel genes. Mol Cell Biol 1999; 19:3225-36. [PMID: 10207048 PMCID: PMC84117 DOI: 10.1128/mcb.19.5.3225] [Citation(s) in RCA: 115] [Impact Index Per Article: 4.6] [Indexed: 01/04/2023]
|
22
|
Molecular analysis of (R)-(+)-mandelonitrile lyase microheterogeneity in black cherry. Plant Physiology 1999; 119:1535-46. [PMID: 10198113 PMCID: PMC32039 DOI: 10.1104/pp.119.4.1535] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.8] [Received: 10/21/1998] [Accepted: 01/07/1999] [Indexed: 05/20/2023]
Abstract
The flavoprotein (R)-(+)-mandelonitrile lyase (MDL; EC 4.1.2.10), which plays a key role in cyanogenesis in rosaceous stone fruits, occurs in black cherry (Prunus serotina Ehrh.) homogenates as several closely related isoforms. Biochemical and molecular biological methods were used to investigate MDL microheterogeneity and function in this species. Three novel MDL cDNAs of high sequence identity (designated MDL2, MDL4, and MDL5) were isolated. Like MDL1 and MDL3 cDNAs (Z. Hu, J.E. Poulton [1997] Plant Physiol 115: 1359-1369), they had open reading frames that predicted a flavin adenine dinucleotide-binding site, multiple N-glycosylation sites, and an N-terminal signal sequence. The N terminus of an MDL isoform purified from seedlings matched the derived amino acid sequence of the MDL4 cDNA. Genomic sequences corresponding to the MDL1, MDL2, and MDL4 cDNAs were obtained by polymerase chain reaction amplification of genomic DNA. Like the previously reported mdl3 gene, these genes are interrupted at identical positions by three short, conserved introns. Given their overall similarity, we conclude that the genes mdl1, mdl2, mdl3, mdl4, and mdl5 are derived from a common ancestral gene and constitute members of a gene family. Genomic Southern-blot analysis showed that this family has approximately eight members. Northern-blot analysis using gene-specific probes revealed differential expression of the genes mdl1, mdl2, mdl3, mdl4, and mdl5.
|
23
|
Identification of a novel delta 6-acyl-group desaturase by targeted gene disruption in Physcomitrella patens. The Plant Journal: for Cell and Molecular Biology 1998; 15:39-48. [PMID: 9744093 DOI: 10.1046/j.1365-313x.1998.00178.x] [Citation(s) in RCA: 96] [Impact Index Per Article: 3.7] [Indexed: 05/22/2023]
Abstract
The moss Physcomitrella patens contains high levels of arachidonic acid. For its synthesis from linoleic acid by desaturation and elongation, novel delta 5- and delta 6-desaturases are required. To isolate one of these, PCR-based cloning was used, and resulted in the isolation of a full-length cDNA coding for a putatively new desaturase. The deduced amino acid sequence has three domains: an N-terminal segment of about 100 amino acids, with no similarity to any sequence in the data banks, followed by a cytochrome b5-related region and a C-terminal sequence with low similarity (27% identity) to acyl-lipid desaturases. To elucidate the function of this protein, we disrupted its gene by transforming P. patens with the corresponding linear genomic sequence, into which a positive selection marker had been inserted. The molecular analysis of five transformed lines showed that the selection cartridge had been inserted into the corresponding genomic locus of all five lines. The gene disruption resulted in a dramatic alteration of the fatty acid pattern in the knockout plants. The large increase in linoleic acid and the concomitant disappearance of gamma-linolenic and arachidonic acid in all knockout lines suggested that the new cDNA coded for a delta 6-desaturase. This was confirmed by expression of the cDNA in yeast and analysis of the resultant fatty acids by GC-MS. Only the transformed yeast cells were able to introduce a further double bond into the delta 6-position of unsaturated fatty acids. To our knowledge, this is the first report of a successful gene disruption in a multicellular plant resulting in a specific biochemical phenotype.
|
24
|
Splicing of precursors to mRNA in higher plants: mechanism, regulation and sub-nuclear organisation of the spliceosomal machinery. Plant Molecular Biology 1996; 32:1-41. [PMID: 8980472 DOI: 10.1007/bf00039375] [Citation(s) in RCA: 82] [Impact Index Per Article: 2.9] [Indexed: 05/07/2023]
Abstract
The removal of introns from pre-mRNA transcripts and the concomitant ligation of exons is known as pre-mRNA splicing. It is a fundamental aspect of constitutive eukaryotic gene expression and an important level at which gene expression is regulated. The process is governed by multiple cis-acting elements of limited sequence content and particular spatial constraints, and is executed by a dynamic ribonucleoprotein complex termed the spliceosome. The mechanism and regulation of pre-mRNA splicing, and the sub-nuclear organisation of the spliceosomal machinery in higher plants, are reviewed here. Heterologous introns are often not processed in higher plants, indicating that, although highly conserved, the process of pre-mRNA splicing in plants exhibits significant differences that distinguish it from splicing in yeast and mammals. A fundamental distinguishing feature is the presence of and requirement for AU- or U-rich intron sequences in higher-plant pre-mRNA splicing. In this review we document the properties of higher-plant introns and trans-acting spliceosomal components and discuss the means by which these elements combine to determine the accuracy and efficiency of pre-mRNA processing. We also detail examples of how introns can effect regulated gene expression by affecting the nature and abundance of mRNA in plants and list the effects of environmental stresses on splicing. Spliceosomal components exhibit a distinct pattern of organisation in higher-plant nuclei. Effective probes that reveal this pattern have only recently become available, but the domains in which spliceosomal components concentrate were identified in plant nuclei as enigmatic structures some sixty years ago. The organisation of spliceosomal components in plant nuclei is reviewed and these recent observations are unified with previous cytochemical and ultrastructural studies of plant ribonucleoprotein domains.
|
25
|
Abstract
Data-driven computational biology relies on the large quantities of genomic data stored in international sequence data banks. However, the possibilities are drastically impaired if the stored data are unreliable. During a project aiming to predict splice sites in the dicot Arabidopsis thaliana, we extracted a data set from the A. thaliana entries in GenBank. A number of simple 'sanity' checks, based on the nature of the data, revealed an alarmingly high error rate. More than 15% of the most important entries extracted contained erroneous information. In addition, a number of entries had directly conflicting assignments of exons and introns, not stemming from alternative splicing. In a few cases the errors are due to mere typographical misprints, which may be corrected by comparison to the original papers, but errors caused by wrong assignments of splice sites from experimental data are the most common. It is proposed that the level of error correction should be increased and that gene structure sanity checks should be incorporated, also at the submitter level, to avoid or reduce the problem in the future. A non-redundant and error-corrected subset of the data for A. thaliana is made available through anonymous FTP.
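The abstract does not list which sanity checks were applied, so the following is only a minimal sketch of what such a gene structure check might look like. It assumes 0-based, half-open exon coordinates on the sense strand and flags introns that do not begin with GT or GC or do not end with AG. The function name `check_gene_structure` and the specific rules are hypothetical and are not the authors' procedure.

```python
from typing import List, Tuple


def check_gene_structure(sequence: str, exons: List[Tuple[int, int]]) -> List[str]:
    """Return a list of human-readable problems for one annotated gene.

    `exons` holds (start, end) pairs, 0-based and half-open, in
    transcript order. These rules are illustrative assumptions only.
    """
    problems = []

    # Check 1: every exon must lie inside the sequence and be non-empty.
    for i, (start, end) in enumerate(exons):
        if not (0 <= start < end <= len(sequence)):
            problems.append(f"exon {i}: coordinates ({start}, {end}) out of range")

    # Check 2: consecutive exons must be ordered and separated by an intron
    # whose terminal dinucleotides look like a plausible splice site pair.
    for i in range(len(exons) - 1):
        intron_start = exons[i][1]
        intron_end = exons[i + 1][0]
        if intron_end <= intron_start:
            problems.append(f"exons {i} and {i + 1}: overlapping or out of order")
            continue
        intron = sequence[intron_start:intron_end].upper()
        if intron[:2] not in ("GT", "GC"):
            problems.append(f"intron {i}: donor site starts with {intron[:2]!r}")
        if intron[-2:] != "AG":
            problems.append(f"intron {i}: acceptor site ends with {intron[-2:]!r}")

    return problems


if __name__ == "__main__":
    toy_sequence = "ATGGTAAAGCC"  # exon ATG, intron GTAAAG, exon CC
    print(check_gene_structure(toy_sequence, [(0, 3), (9, 11)]))
    print(check_gene_structure(toy_sequence, [(0, 3), (8, 11)]))
```

A flagged entry is a candidate for manual review, not proof of an error: genuine non-canonical introns such as AT-AC would also be reported by these rules.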
|
26
|
The myrosinase gene family in Arabidopsis thaliana: gene organization, expression and evolution. Plant Molecular Biology 1995; 27:911-22. [PMID: 7766881 DOI: 10.1007/bf00037019] [Citation(s) in RCA: 26] [Impact Index Per Article: 0.9] [Indexed: 05/10/2023]
Abstract
In Brassicaceae species such as Brassica napus and Sinapis alba, myrosinase (thioglucoside glucohydrolase, EC 3.2.3.1) is encoded by two differentially expressed gene families, MA and MB, consisting of about 4 and 10 genes, respectively. Southern blot analysis showed that Arabidopsis thaliana contains three myrosinase genes. These genes were isolated from a genomic library and two of them, TGG1 and TGG2, were sequenced. They were found to be located in an inverted orientation with their 3' ends 4.4 kb apart. Their organization was highly conserved, with 12 exons and 11 short introns. Comparison of nucleotide sequences of TGG1 and TGG2 exons revealed an overall 75% similarity. In contrast, the overall nucleotide sequence similarity in introns was only 42%. In intron 1 the unusual 5' splice border GC was used. Phylogenetic analyses using both distance matrix and parsimony programs suggested that the Arabidopsis genes could not be grouped with either MA or MB genes. Consequently, these two gene families arose only after Arabidopsis had diverged from the other Brassicaceae species. In situ hybridization experiments showed that TGG1- and TGG2-expressing cells are present in leaf, sepal, petal, and gynoecium. In developing seeds, a few cells reacting with the TGG1 probe, but not with the TGG2 probe, were found, indicating partly different expression of these genes.
|