1
Lion M, Shahar Y. Implementation and evaluation of a multivariate abstraction-based, interval-based dynamic time-warping method as a similarity measure for longitudinal medical records. J Biomed Inform 2021; 123:103919. [PMID: 34628062 DOI: 10.1016/j.jbi.2021.103919] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/18/2021] [Revised: 08/25/2021] [Accepted: 09/27/2021] [Indexed: 11/30/2022]
Abstract
OBJECTIVES A common prerequisite for tasks such as classification, prediction, clustering and retrieval of longitudinal medical records is a clinically meaningful similarity measure that considers both [multiple] variable (concept) values and their time. Currently, most similarity measures focus on raw, time-stamped data as these are stored in a medical record. However, clinicians think in terms of clinically meaningful temporal abstractions, such as "decreasing renal functions", enabling them to ignore minor time and value variations and focus on similarities among the clinical trajectories of different patients. Our objective was to define an abstraction- and interval-based methodology for matching longitudinal, multivariate medical records, and rigorously assess its value, versus the option of using just the raw, time-stamped data. METHODS We have developed a new methodology for determination of the relative distance between a pair of longitudinal records, by extending the known dynamic time warping (DTW) method into an interval-based dynamic time warping (iDTW) methodology. The iDTW methodology includes (A) a three-step interval-based representation (iRep) method: [1] abstracting the raw, time-stamped data of the longitudinal records into clinically meaningful interval-based abstractions, using a domain-specific knowledge base, [2] scoping the period of comparison of the records, [3] creating from the intervals a symbolic time series, by partitioning them into a predetermined temporal granularity; and (B) an interval-based matching (iMatch) method to match each relevant pair of multivariate longitudinal records, each represented as multiple series of short symbolic intervals in the determined temporal granularity, using a modified DTW version.
EVALUATION Three classification or prediction tasks were defined: (1) classifying 161 records of oncology patients as having had autologous versus allogenic bone-marrow transplantation; (2) classifying the longitudinal records of 125 hepatitis patients as having B or C hepatitis; and (3) predicting micro- or macro-albuminuria in the second year, for 151 diabetes patients who were followed for five years. The raw, time-stamped, multivariate data within each medical record, for one, two, or three concepts out of four or five concepts judged as relevant in each medical domain, were abstracted into clinically meaningful intervals using the Knowledge-Based Temporal-Abstraction method, using previously acquired knowledge. We focused on two temporal-abstraction types: (1) State abstractions, which discretize a concept's raw value into a predetermined range (e.g., LOW or HIGH Hemoglobin); and (2) Gradient abstractions, which indicate the trend of the concept's value (e.g., INCREASING, DECREASING Hemoglobin value). We created all of the combinations of either uni-dimensional (State or Gradient) or multi-dimensional (State and Gradient) abstractions, of all of the concepts used. Classification of a record was determined by using a majority of the k-Nearest-Neighbors (KNN) of the given record, k ranging over the odd numbers (to break ties) from 1 to N, N being the size of the training set. We have experimented with all possible configurations of the parameters that our method uses. Overall, a total of 75,936 experiments were performed: 33,600 in the Oncology domain, 28,800 in the Hepatitis domain, and 13,536 in the Diabetes domain. Each experiment involved the performance of a 10-fold Cross Validation to compute the mean performance of a particular iDTW method-configuration set of settings, for a specific subset of one, two, or three concepts out of all of the domain-specific concepts relevant to the classification or prediction task on which the experiment focuses. 
We measured for each such experimental combination the Area Under the Curve (AUC) and the optimal Specificity/Sensitivity ratio using Youden's Index. We then aggregated the experiments by the types of unidimensional or multidimensional abstractions used in them (including the use of only raw concepts as a special case); for example, two state abstractions of different concepts, and one gradient abstraction of a third concept. We compared the mean AUC when using each such feature representation, or combination of abstractions, across all possible method-setting configurations, to the mean AUC when using as a feature representation, for the same task, only raw concepts, also across all possible method-setting configurations. Finally, we applied a paired t-test, to determine whether the mean difference between the accuracy of each temporal-abstraction representation, across all concept and configuration combinations, and the respective raw-concept combinations, across all concept subset and configuration combinations, is significant (P < 0.05). RESULTS The mean performance of the classification and prediction tasks when using, as a feature representation, the various temporal-abstraction combinations, was significantly higher than that performance when using only raw data. Furthermore, in each domain and task, there existed at least one representation using interval-based abstractions whose use led, on average (over all concept subset combinations and method configurations) to a significantly better performance than the use of only subsets of the raw time-stamped data. In seven of nine combinations of domain type (out of three) and number of concepts used (one, two, or three), the variance of the AUCs (for all representations and configurations) was considerably higher across all raw-concept subsets, compared to all abstract combinations. Increasing the number of features used by the matching task enhanced performance. 
Using multi-dimensional abstractions of the same concept further enhanced the performance. When using only raw data, increasing the number of neighbors monotonically increased the mean performance (over all concept combinations and method configurations) until reaching an optimal saddle-point aroundN; when using abstractions, however, optimal mean performance was often reached after matching only five nearest neighbors. CONCLUSIONS Using multivariate and multidimensional interval-based, abstraction-based similarity measures is feasible, and consistently and significantly improved the mean classification and prediction performance in time-oriented domains, using DTW-inspired methods, compared to the use of only raw, time-stamped data. It also made the KNN classification more effective. Nevertheless, although the mean performance for the abstract representations was higher than the mean performance when using only raw-data concepts, the actual optimal classification performance in each domain and task depends on the choice of the specific raw or abstract concepts used as features.
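As an illustration of the matching idea described in this abstract, the following is a minimal sketch of classic DTW applied to multivariate symbolic series, followed by a majority vote over the k nearest neighbours. It is not the authors' iDTW or iMatch implementation. The names `Step`, `local_cost`, `dtw_distance` and `knn_classify`, the mismatch-fraction cost, and the toy symbols are all assumptions made for this example.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# One time granule: one abstraction symbol per concept (hypothetical encoding),
# e.g. ("LOW", "DECREASING") for a State and a Gradient abstraction.
Step = Tuple[str, ...]


def local_cost(a: Step, b: Step) -> float:
    """Assumed local cost: fraction of concepts whose symbols differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def dtw_distance(s: Sequence[Step], t: Sequence[Step]) -> float:
    """Classic dynamic time warping over two symbolic series."""
    n, m = len(s), len(t)
    inf = float("inf")
    # d[i][j] = cheapest alignment cost of s[:i] with t[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = local_cost(s[i - 1], t[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def knn_classify(query: Sequence[Step],
                 training: List[Tuple[Sequence[Step], str]],
                 k: int) -> str:
    """Label the query by majority vote among its k nearest training records."""
    ranked = sorted(training, key=lambda rec: dtw_distance(query, rec[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy records (invented for illustration only).
rec_a = [("LOW", "DEC"), ("LOW", "DEC"), ("HIGH", "INC")]
rec_b = [("LOW", "DEC"), ("HIGH", "INC")]
rec_c = [("HIGH", "INC"), ("HIGH", "INC")]
train = [(rec_b, "A"), (rec_c, "B"), (rec_a, "A")]
```

In this toy data, `rec_a` and `rec_b` follow the same trajectory at different speeds, so warping aligns them at zero cost, whereas `rec_c` never passes through the LOW/DEC state. Using an odd `k` mirrors the tie-breaking choice mentioned in the abstract.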
Affiliation(s)
- Matan Lion
  - Medical Informatics Research Center, Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel.
- Yuval Shahar
  - Medical Informatics Research Center, Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel.
2
Estrada AD, Freese NH, Blakley IC, Loraine AE. Analysis of pollen-specific alternative splicing in Arabidopsis thaliana via semi-quantitative PCR. PeerJ 2015; 3:e919. [PMID: 25945312 PMCID: PMC4419537 DOI: 10.7717/peerj.919] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 11/06/2014] [Accepted: 04/08/2015] [Indexed: 12/12/2022] [Open Access]
Abstract
Alternative splicing enables a single gene to produce multiple mRNA isoforms by varying splice site selection. In animals, alternative splicing of mRNA isoforms between cell types is widespread and supports cellular differentiation. In plants, at least 20% of multi-exon genes are alternatively spliced, but the extent and significance of tissue-specific splicing is less well understood, partly because it is difficult to isolate cells of a single type. Pollen is a useful model system to study tissue-specific splicing in higher plants because pollen grains contain only two cell types and can be collected in large amounts without damaging cells. Previously, we identified pollen-specific splicing patterns by comparing RNA-Seq data from Arabidopsis pollen and leaves. Here, we used semi-quantitative PCR to validate pollen-specific splicing patterns among genes where RNA-Seq data analysis indicated splicing was most different between pollen and leaves. PCR testing confirmed eight of nine alternative splicing patterns, and results from the ninth were inconclusive. In four genes, alternative transcriptional start sites coincided with alternative splicing. This study highlights the value of the low-cost PCR assay as a method of validating RNA-Seq results.
Affiliation(s)
- April D Estrada
  - Department of Bioinformatics and Genomics, North Carolina Research Campus, University of North Carolina at Charlotte, Charlotte, NC, USA
- Nowlan H Freese
  - Department of Bioinformatics and Genomics, North Carolina Research Campus, University of North Carolina at Charlotte, Charlotte, NC, USA
- Ivory C Blakley
  - Department of Bioinformatics and Genomics, North Carolina Research Campus, University of North Carolina at Charlotte, Charlotte, NC, USA
- Ann E Loraine
  - Department of Bioinformatics and Genomics, North Carolina Research Campus, University of North Carolina at Charlotte, Charlotte, NC, USA
3
Iandolino A, Pearcy R, Williams L. Simulating three-dimensional grapevine canopies and modelling their light interception characteristics. Aust J Grape Wine Res 2013. [DOI: 10.1111/ajgw.12036] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 05/13/2023]
Affiliation(s)
- A.B. Iandolino
  - Department of Viticulture and Enology, University of California, One Shields Avenue, Davis, CA 95616, USA
- R.W. Pearcy
  - Department of Evolution and Ecology, University of California, One Shields Avenue, Davis, CA 95616, USA
4
Guo Y. Towards systems biological understanding of leaf senescence. Plant Mol Biol 2013; 82:519-28. [PMID: 23065109 DOI: 10.1007/s11103-012-9974-2] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Received: 05/29/2012] [Accepted: 09/20/2012] [Indexed: 05/22/2023]
Abstract
The application of systems biology approaches has greatly facilitated the process of deciphering the molecular mechanisms underlying leaf senescence. Analyses of the leaf senescence transcriptome have identified some of the major biochemical events during senescence, including protein degradation and nutrient remobilization. Proteomic studies have confirmed these findings and have suggested up-regulated energy metabolism during leaf senescence, which might be important for maintaining cell viability. As a critical part of systems biology, studies involving transcription regulation networking and senescence-inducing signaling have deepened our understanding of the molecular regulation of leaf senescence. The important next steps towards a systems biological understanding of leaf senescence will be discussed.
Affiliation(s)
- Yongfeng Guo
  - Tobacco Research Institute, Chinese Academy of Agricultural Sciences, Qingdao, 266101, China.
5
Shangguan L, Han J, Kayesh E, Sun X, Zhang C, Pervaiz T, Wen X, Fang J. Evaluation of genome sequencing quality in selected plant species using expressed sequence tags. PLoS One 2013; 8:e69890. [PMID: 23922843 PMCID: PMC3726750 DOI: 10.1371/journal.pone.0069890] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 01/10/2013] [Accepted: 06/14/2013] [Indexed: 02/02/2023] [Open Access]
Abstract
BACKGROUND With the completion of genome sequencing projects for more than 30 plant species, large volumes of genome sequences have been produced and stored in online databases. Advancements in sequencing technologies have reduced the cost and time of whole genome sequencing enabling more and more plants to be subjected to genome sequencing. Despite this, genome sequence qualities of multiple plants have not been evaluated. METHODOLOGY/PRINCIPAL FINDING Integrity and accuracy were calculated to evaluate the genome sequence quality of 32 plants. The integrity of a genome sequence is presented by the ratio of chromosome size and genome size (or between scaffold size and genome size), which ranged from 55.31% to nearly 100%. The accuracy of genome sequence was presented by the ratio between matched EST and selected ESTs where 52.93% ∼ 98.28% and 89.02% ∼ 98.85% of the randomly selected clean ESTs could be mapped to chromosome and scaffold sequences, respectively. According to the integrity, accuracy and other analysis of each plant species, thirteen plant species were divided into four levels. Arabidopsis thaliana, Oryza sativa and Zea mays had the highest quality, followed by Brachypodium distachyon, Populus trichocarpa, Vitis vinifera and Glycine max, Sorghum bicolor, Solanum lycopersicum and Fragaria vesca, and Lotus japonicus, Medicago truncatula and Malus × domestica in that order. Assembling the scaffold sequences into chromosome sequences should be the primary task for the remaining nineteen species. Low GC content and repeat DNA influences genome sequence assembly. CONCLUSION The quality of plant genome sequences was found to be lower than envisaged and thus the rapid development of genome sequencing projects as well as research on bioinformatics tools and the algorithms of genome sequence assembly should provide increased processing and correction of genome sequences that have already been published.
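The two quality measures defined in this abstract are simple ratios, and a small sketch may make them concrete. The function names `integrity_pct` and `accuracy_pct` and the input figures below are hypothetical, chosen only to show the arithmetic; they are not data from the study.

```python
def integrity_pct(assembled_bp: int, genome_bp: int) -> float:
    """Integrity: assembled chromosome (or scaffold) size as a percentage
    of the estimated genome size."""
    if genome_bp <= 0:
        raise ValueError("genome size must be positive")
    return 100 * assembled_bp / genome_bp


def accuracy_pct(matched_ests: int, selected_ests: int) -> float:
    """Accuracy: percentage of the randomly selected clean ESTs that
    could be mapped to the assembled sequence."""
    if selected_ests <= 0:
        raise ValueError("at least one EST must be selected")
    return 100 * matched_ests / selected_ests


# Hypothetical inputs: a 500 Mb genome with 400 Mb assembled into
# chromosomes, and 9,500 of 10,000 sampled ESTs mapped.
example_integrity = integrity_pct(400_000_000, 500_000_000)
example_accuracy = accuracy_pct(9_500, 10_000)
```

Under these invented inputs the assembly would score 80% integrity and 95% accuracy, which falls inside the ranges the abstract reports across species.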
Affiliation(s)
- Lingfei Shangguan
  - College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
- Jian Han
  - College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
- Emrul Kayesh
  - College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
- Xin Sun
  - College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
- Changqing Zhang
  - College of Horticulture, Jinling Institute of Technology, Nanjing City, Jiangsu Province, China
- Tariq Pervaiz
  - College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
- Xicheng Wen
  - College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
- Jinggui Fang
  - College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
6
Wu CT, Chiou CY, Chiu HC, Yang UC. Fine-tuning of microRNA-mediated repression of mRNA by splicing-regulated and highly repressive microRNA recognition element. BMC Genomics 2013; 14:438. [PMID: 23819653 PMCID: PMC3708814 DOI: 10.1186/1471-2164-14-438] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 11/12/2012] [Accepted: 06/11/2013] [Indexed: 11/22/2022] [Open Access]
Abstract
BACKGROUND MicroRNAs are very small non-coding RNAs that interact with microRNA recognition elements (MREs) on their target messenger RNAs. Varying the concentration of a given microRNA may influence the expression of many target proteins. Yet, the expression of a specific target protein can be fine-tuned by alternative cleavage and polyadenylation to the corresponding mRNA. RESULTS This study showed that alternative splicing of mRNA is a fine-tuning mechanism in the cellular regulatory network. The splicing-regulated MREs are often highly repressive MREs. This phenomenon was observed not only in the hsa-miR-148a-regulated DNMT3B gene, but also in many target genes regulated by hsa-miR-124, hsa-miR-1, and hsa-miR-181a. When a gene contains multiple MREs in transcripts, such as the VEGF gene, the splicing-regulated MREs are again the highly repressive MREs. Approximately one-third of the analysable human MREs in MiRTarBase and TarBase can potentially perform the splicing-regulated fine-tuning. Interestingly, the high (+30%) repression ratios observed in most of these splicing-regulated MREs indicate associations with functions. For example, the MRE-free transcripts of many oncogenes, such as N-RAS and others may escape microRNA-mediated suppression in cancer tissues. CONCLUSIONS This fine-tuning mechanism revealed associations with highly repressive MRE. Since high-repression MREs are involved in many important biological phenomena, the described association implies that splicing-regulated MREs are functional. A possible application of this observed association is in distinguishing functionally relevant MREs from predicted MREs.
Affiliation(s)
- Cheng-Tao Wu
  - Institute of Biomedical Informatics, National Yang-Ming University, No.155, Sec.2, Linong Street, Taipei 11221, Taiwan, ROC
  - Biomedical Technology and Device Research Labs (BDL), Industrial Technology Research Institute (ITRI), No.195, Sec. 4, Chung Hsing Rd., Chutung, Hsinchu 31040, Taiwan, ROC
- Chien-Ying Chiou
  - Center for Systems and Synthetic Biology, National Yang-Ming University, No.155, Sec.2, Linong Street, Taipei 11221, Taiwan, ROC
- Ho-Chen Chiu
  - Institute of Biomedical Informatics, National Yang-Ming University, No.155, Sec.2, Linong Street, Taipei 11221, Taiwan, ROC
- Ueng-Cheng Yang
  - Institute of Biomedical Informatics, National Yang-Ming University, No.155, Sec.2, Linong Street, Taipei 11221, Taiwan, ROC
  - Center for Systems and Synthetic Biology, National Yang-Ming University, No.155, Sec.2, Linong Street, Taipei 11221, Taiwan, ROC
  - Bioinformatics Consortium of Taiwan core facility, Taipei, Taiwan, ROC
7
Góngora-Castillo E, Buell CR. Bioinformatics challenges in de novo transcriptome assembly using short read sequences in the absence of a reference genome sequence. Nat Prod Rep 2013; 30:490-500. [PMID: 23377493 DOI: 10.1039/c3np20099j] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Indexed: 11/21/2022]
Abstract
Plant natural product research can be facilitated through genome and transcriptome sequencing approaches that generate informative sequence and expression datasets that enable characterization of biochemical pathways of interest. As the overwhelming majority of plant-derived natural products come from species with little, if any, sequence and/or genomic resources, the ability to perform whole genome shotgun sequencing and assembly has been and will continue to be transformative, as access to a genome sequence provides molecular resources and a context for discovery and characterization of biosynthetic pathways. Due to the reduced size and complexity of the transcriptome relative to the genome, transcriptome sequencing provides a rapid, inexpensive approach to access gene sequences, gene expression abundances, and gene expression patterns in any species, including those that lack a reference genome sequence. To date, successful applications of RNA sequencing in conjunction with de novo transcriptome assembly have enabled identification of new genes in an array of biochemical pathways in plants. While sequencing technologies are well developed, challenges remain in the handling and analysis of transcriptome sequences. In this Highlight article, we provide an overview of the bioinformatics challenges associated with transcriptome analyses using short read sequences and how to address these issues in plant species that lack a reference genome.
8
Chunmei G, Shuqing L, Ming-Zhong S. Sequence and Bioinformatic Characterization of Expressed Sequence Tags Originated From Gloydius shedaoensis shedaoensis Venom Gland. Anat Rec (Hoboken) 2013; 296:807-14. [DOI: 10.1002/ar.22670] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/18/2012] [Accepted: 01/08/2013] [Indexed: 12/18/2022]
Affiliation(s)
- Guo Chunmei
  - Department of Biotechnology, Dalian Medical University, Dalian 116044, China
- Liu Shuqing
  - Department of Biochemistry and Molecular Biology, Dalian Medical University, Dalian 116044, China
- Sun Ming-Zhong
  - Department of Biotechnology, Dalian Medical University, Dalian 116044, China
9
Wu TH, Chu LJ, Wang JC, Chen TW, Tien YJ, Lin WC, Ng WV. Meta-analytical biomarker search of EST expression data reveals three differentially expressed candidates. BMC Genomics 2012; 13 Suppl 7:S12. [PMID: 23282184 PMCID: PMC3521215 DOI: 10.1186/1471-2164-13-s7-s12] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022] [Open Access]
Abstract
BACKGROUND Research on the identification of differentially expressed genes (DEGs) by generating and mining cDNA expressed sequence tags (ESTs) has been conducted for more than a decade. Although the availability of public databases makes possible the comprehensive mining of DEGs among the ESTs from multiple tissue types, existing studies usually employed statistics suitable only for two categories. Multi-class tests have been developed to enable the finding of tissue-specific genes, but the subsequent search for cancer genes involves a separate two-category test only on the ESTs of the tissue of interest. This constricts the amount of data used. On the other hand, simple pooling of cancer and normal genes from multiple tissue types runs the risk of Simpson's paradox. Here we present a different approach, which searched for multi-cancer DEG candidates by analyzing all pertinent ESTs in all categories and narrowing down the cancer biomarker candidates via integrative analysis with microarray data and selection of secretory and membrane protein genes, as well as incorporation of network analysis. Finally, the differential expression patterns of three selected cancer biomarker candidates were confirmed by real-time qPCR analysis. RESULTS Seven hundred and twenty-three primary DEG candidates (p-value < 0.05 and lower bound of confidence interval of odds ratio ≥ 1.65) were selected from a curated EST database with the application of the Cochran-Mantel-Haenszel (CMH) statistic. GeneGO analysis results indicated this set as neoplasm enriched. Cross-examination with microarray data further narrowed the list down to 235 genes, among which 96 had membrane or secretory annotations. After examining the candidates in protein interaction networks, public tissue expression databases, and the literature, we selected three genes for further evaluation by real-time qPCR with eight major normal and cancer tissues.
The higher-than-normal tissue expression of COL3A1, DLG3, and RNF43 in some of the cancer tissues is in agreement with our in silico predictions. CONCLUSIONS Searching the digitized transcriptome using CMH enabled us to identify multi-cancer differentially expressed gene candidates. Our methodology demonstrated simultaneous analysis of cancer biomarkers across multiple tissue types using EST data. With the revived interest in digitizing transcriptomes by NGS, cancer biomarkers could be more precisely detected from the ESTs. The three candidates identified in this study, COL3A1, DLG3, and RNF43, are valuable targets for further evaluation with a larger sample size of normal and cancer tissue or serum samples.
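For readers unfamiliar with the Cochran-Mantel-Haenszel (CMH) approach named in this abstract, the sketch below computes the textbook CMH chi-square statistic and the Mantel-Haenszel pooled odds ratio over a set of stratified 2x2 tables. It assumes one stratum per tissue type, with each table laid out as (gene ESTs in cancer, other ESTs in cancer, gene ESTs in normal, other ESTs in normal). That layout, the function names, and the counts are assumptions for illustration; the study's exact procedure and its confidence-interval computation are not reproduced here.

```python
from typing import Iterable, Tuple

# One stratum (assumed: one tissue type) as a 2x2 table (a, b, c, d):
#   a = gene ESTs in cancer,  b = other ESTs in cancer,
#   c = gene ESTs in normal,  d = other ESTs in normal.
Table = Tuple[int, int, int, int]


def cmh_statistic(tables: Iterable[Table], continuity: bool = True) -> float:
    """Textbook CMH chi-square (1 degree of freedom) across strata."""
    diff = 0.0      # sum over strata of (observed a - expected a)
    variance = 0.0  # sum over strata of the hypergeometric variance of a
    for a, b, c, d in tables:
        n = a + b + c + d
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        diff += a - row1 * col1 / n
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    numerator = max(0.0, abs(diff) - (0.5 if continuity else 0.0))
    return numerator * numerator / variance


def mh_odds_ratio(tables: Iterable[Table]) -> float:
    """Mantel-Haenszel pooled odds ratio across strata."""
    num = 0.0
    den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den


# Two invented strata with the same direction of effect.
strata = [(10, 5, 5, 10), (10, 5, 5, 10)]
```

Because each stratum contributes its own expected count and variance, the test avoids the naive pooling across tissues that the abstract warns can produce Simpson's paradox.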
Affiliation(s)
- Timothy H Wu
  - Institute of Biomedical Informatics, National Yang Ming University, Taipei, Taiwan, R.O.C
- Lichieh J Chu
  - Molecular Medicine Research Center, Chang Gung University, Taoyuan, Taiwan, R.O.C
- Jian-Chiao Wang
  - Department of Biotechnology and Laboratory Science in Medicine and Institute of Biotechnology in Medicine, National Yang Ming University, Taipei, Taiwan, R.O.C
- Ting-Wen Chen
  - Molecular Medicine Research Center, Chang Gung University, Taoyuan, Taiwan, R.O.C
  - Bioinformatics Center, Chang Gung University, Taoyuan, Taiwan, R.O.C
- Yin-Jing Tien
  - Institute of Statistical Science, Academia Sinica, Taipei, 11529, Taiwan, R.O.C
- Wen-Chang Lin
  - Institute of Biomedical Informatics, National Yang Ming University, Taipei, Taiwan, R.O.C
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan, R.O.C
- Wailap V Ng
  - Institute of Biomedical Informatics, National Yang Ming University, Taipei, Taiwan, R.O.C
  - Department of Biotechnology and Laboratory Science in Medicine and Institute of Biotechnology in Medicine, National Yang Ming University, Taipei, Taiwan, R.O.C
  - Center for Systems and Synthetic Biology, National Yang Ming University, Taipei, Taiwan, R.O.C
10
Huang J, Yan L, Lei Y, Jiang H, Ren X, Liao B. Expressed sequence tags in cultivated peanut (Arachis hypogaea): discovery of genes in seed development and response to Ralstonia solanacearum challenge. J Plant Res 2012; 125:755-69. [PMID: 22648474 DOI: 10.1007/s10265-012-0491-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/08/2011] [Accepted: 03/25/2012] [Indexed: 05/07/2023]
Abstract
Although an important oil crop, peanut has only 162,030 expressed sequence tags (ESTs) publicly available, 86,943 of which are from cultivated plants. More ESTs from cultivated peanuts are needed for isolation of stress-resistant, tissue-specific and developmentally important genes. Here, we generated 63,234 ESTs from our 5 constructed peanut cDNA libraries of Ralstonia solanacearum challenged roots, R. solanacearum challenged leaves, and unchallenged cultured peanut roots, leaves and developing seeds. Among these ESTs, there were 14,547 unique sequences with 7,961 tentative consensus sequences and 6,586 singletons. Putative functions for 47.8 % of the sequences were identified, including transcription factors, tissue-specific genes, genes involved in fatty acid biosynthesis and oil formation regulation, and resistance gene analogue genes. Additionally, differentially expressed genes, including those involved in ethylene and jasmonic acid signal transduction pathways, from both peanut leaves and roots, were identified in R. solanacearum challenged samples. This large expression dataset from different peanut tissues will be a valuable source for marker development and gene expression analysis. It will also be helpful for finding candidate genes for fatty acid synthesis and oil formation regulation as well as for studying mechanisms of interactions between the peanut host and R. solanacearum pathogen.
Affiliation(s)
- Jiaquan Huang
  - Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture, Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Wuhan, People's Republic of China
11
Sha AH, Li C, Yan XH, Shan ZH, Zhou XA, Jiang ML, Mao H, Chen B, Wan X, Wei WH. Large-scale sequencing of normalized full-length cDNA library of soybean seed at different developmental stages and analysis of the gene expression profiles based on ESTs. Mol Biol Rep 2012; 39:2867-74. [PMID: 21667246 DOI: 10.1007/s11033-011-1046-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 11/29/2010] [Accepted: 06/04/2011] [Indexed: 12/15/2022]
Abstract
Although GenBank now holds over 1,400,000 expressed sequence tags (ESTs) from soybean, most publicly available ESTs have been derived from tissues or environmental conditions other than developing seeds. Annotating the molecular mechanisms of soybean seed development requires a complete analysis of the gene expression profiles of immature seed at various stages. Here we have constructed a full-length-enriched cDNA library comprising a total of 45,408 cDNA clones which cover various stages of soybean seed development. Furthermore, by sequencing from the 5' ends of these clones, we obtained 36,656 ESTs in the present study. These EST sequences could be categorized into 27,982 unigenes, including 22,867 contigs and 5,115 singletons, among which 27,931 could be mapped onto the sequences of the 20 soybean chromosomes. Comparative genomic analysis with other plants has revealed that these unigenes include many candidate genes specific to dicots, legumes and soybean. Approximately 1,789 of these unigenes currently show no homology to known soybean sequences, suggesting that many represent mRNAs specifically expressed in seeds. Novel abundant genes involved in oil synthesis were found in this study and may serve as a valuable resource for soybean seed improvement.
Affiliation(s)
- Ai-Hua Sha
  - Institute of Oil Crops, Key Laboratory of Oil Crop Biology of the Ministry of Agriculture, Chinese Academy of Agricultural Sciences, Wuhan 430062, China
12
Manickavelu A, Kawaura K, Oishi K, Shin-I T, Kohara Y, Yahiaoui N, Keller B, Abe R, Suzuki A, Nagayama T, Yano K, Ogihara Y. Comprehensive functional analyses of expressed sequence tags in common wheat (Triticum aestivum). DNA Res 2012; 19:165-77. [PMID: 22334568 PMCID: PMC3325080 DOI: 10.1093/dnares/dss001] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Indexed: 01/21/2023] [Open Access]
Abstract
About 1 million expressed sequence tag (EST) sequences comprising 125.3 Mb nucleotides were accumulated from 51 cDNA libraries constructed from a variety of tissues and organs under a range of conditions, including abiotic stresses and pathogen challenges in common wheat (Triticum aestivum). Expressed sequence tags were assembled with stringent parameters after processing with in-built scripts, resulting in 37,138 contigs and 215,199 singlets. In the assembled sequences, 10.6% presented no matches with existing sequences in public databases. Functional characterization of wheat unigenes by gene ontology annotation and by mining of transcription factors, full-length cDNA, and miRNA targeting sites was carried out. A bioinformatics strategy was developed to discover single-nucleotide polymorphisms (SNPs) within our large EST resource, and the SNPs between and within (homoeologous) cultivars are reported. Digital gene expression analysis was performed to find tissue-specific gene expression, and correspondence analysis was executed to identify common and specific gene expression by selecting four biotic stress-related libraries. The assembly and associated information provide a framework for future investigation in functional genomics.
Collapse
Affiliation(s)
- Alagu Manickavelu
- Kihara Institute for Biological Research, Yokohama City University, Yokohama, Japan
|
13
|
Cheung F. Global assembly of expressed sequence tags. Methods Mol Biol 2012; 883:193-199. [PMID: 22589135 DOI: 10.1007/978-1-61779-839-9_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
The method for the construction of Expressed Sequence Tag (EST) assemblies described here uses reads generated from 454 pyrosequencing and Sanger and Illumina (Solexa) sequencing technologies as input. It is consistent with and parallels many established EST assembly protocols, for example the TIGR Gene Indices. Reads that are used as input to the EST assembly process usually come from both internal and external sources. Thus, in addition to internally generated EST reads, expressed transcripts are collected from dbEST and also the NCBI GenBank nucleotide database (full-length and partial cDNAs). "Virtual" transcript sequences derived from whole genome annotation projects can be excluded, depending on the needs of the project. Currently, in most cases, 454-derived sequences can be treated similarly to Sanger-derived ESTs. In contrast, the shorter Solexa-derived sequences will have to undergo a round of either de novo assembly or an "align-then-assemble" approach against a reference genome, if available, before these transcripts can be used for the purpose of a global EST assembly that combines a mixture of Sanger and next-generation sequencing technologies.
Affiliation(s)
- Foo Cheung
- Center for Human Immunology, Autoimmunity, and Inflammation, National Institutes of Health, Bethesda, MD, USA.
|
14
|
Kim C, Robertson JS, Paterson AH. Inference of subgenomic origin of BACs in an interspecific hybrid sugarcane cultivar by overlapping oligonucleotide hybridizations. Genome 2011; 54:727-37. [PMID: 21883018 DOI: 10.1139/g11-038] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/24/2023]
Abstract
Sugarcane (Saccharum spp.) breeders in the early 20th century made remarkable progress in increasing yield and disease resistance by crossing Saccharum spontaneum L., a wild relative, to Saccharum officinarum L., a traditional cultivar. Modern sugarcane cultivars have approximately 71%-83% of their chromosomes originating from S. officinarum, approximately 10%-21% from S. spontaneum, and approximately 2%-13% recombinant or translocated chromosomes. In the present work, C(0)t-based cloning and sequencing (CBCS) was implemented to further explore highly repetitive DNA and to seek species-specific repeated DNA in both S. officinarum and S. spontaneum. For putatively species-specific sequences, overlapping oligonucleotide probes (overgos) were designed and hybridized to BAC filters from the interspecific hybrid sugarcane cultivar 'R570' to try to deduce parental origins of BAC clones. We inferred that 12 967 BACs putatively originated from S. officinarum and 5117 BACs from S. spontaneum. Another 1103 BACs were hybridized by both species-specific overgos, too many to account for by conventional recombination, thus suggesting ectopic recombination and (or) translocation of DNA elements. Constructing a low C(0)t library is useful to collect highly repeated DNA sequences and to search for potentially species-specific molecular markers, especially among recently diverged species. Even in the absence of repeat families that are species-specific in their entirety, the identification of localized variations within consensus sequences, coupled with the site specificity of short synthetic overgos, permits researchers to monitor species-specific or species-enriched variants.
Affiliation(s)
- Changsoo Kim
- Plant Genome Mapping Laboratory, University of Georgia, Athens, USA
|
15
|
Liu R, Wang B, Guo W, Wang L, Zhang T. Differential gene expression and associated QTL mapping for cotton yield based on a cDNA-AFLP transcriptome map in an immortalized F2. Theor Appl Genet 2011; 123:439-54. [PMID: 21512772 DOI: 10.1007/s00122-011-1597-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 05/30/2010] [Accepted: 04/05/2011] [Indexed: 05/12/2023]
Abstract
cDNA-AFLP techniques have found new applications in recent years. Currently, the methodology is used to establish differential gene expression and construct linkage maps. In the present study, a transcriptome map based on cDNA-AFLP techniques was constructed using an immortalized F2 (IF2) population of 171 lines. The lines were derived from intercrosses between 180 recombinant inbred lines (RILs) of the cotton hybrid Xiangzamian 2 (Gossypium hirsutum L.). A total of 302 transcriptome-derived fragments (TDFs) were mapped onto 26 linkage groups that covered 2,477.06 cM in length with an average distance of 8.23 cM between two markers. Seventy-one QTL for yield and yield component traits were detected by CIM procedures based on four environments, with 13 QTL identified in at least two environments. Some TDFs co-located with yield QTL were subsequently sequenced and analyzed by online homology searches. Potential candidate genes for yield and yield component traits were found to encode proteins involved in DNA replication, transcription, translation, and biosynthesis regulation. Furthermore, genes regulating metabolic processes, signal transduction, transport, and structural components of organelles were identified. Correlation analysis between expression patterns of TDFs and trait performance detected six TDFs positively correlated to both yield and yield heterosis: six TDFs positively correlated to yield, and seven TDFs to yield heterosis. These TDFs have potential for cloning the functional genes responsible for each corresponding trait and have future value in marker-assisted selection.
Affiliation(s)
- Renzhong Liu
- National Key Laboratory of Crop Genetics and Germplasm Enhancement, Cotton Research Institute, Nanjing Agricultural University, Nanjing, China
|
16
|
Fierro AC, Vandenbussche F, Engelen K, Van de Peer Y, Marchal K. Meta Analysis of Gene Expression Data within and Across Species. Curr Genomics 2008; 9:525-34. [PMID: 19516959 PMCID: PMC2694560 DOI: 10.2174/138920208786847935] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Received: 06/28/2008] [Revised: 07/07/2008] [Accepted: 07/18/2008] [Indexed: 01/15/2023] Open
Abstract
Since the second half of the 1990s, a large number of genome-wide analyses have been described that study gene expression at the transcript level. To this end, two major strategies have been adopted: the first relies on hybridization techniques such as microarrays, and the second is based on sequencing techniques such as serial analysis of gene expression (SAGE), cDNA-AFLP, and analysis based on expressed sequence tags (ESTs). Although both types of profiling experiments have become routine techniques in many research groups, their application remains costly and laborious. As a result, the number of conditions profiled in individual studies is still relatively small and usually varies from only two to a few hundred samples for the largest experiments. Increasingly, scientific journals require the deposit of these high-throughput experiments in public databases upon publication. Mining the information present in these databases offers molecular biologists the possibility to view their own small-scale analysis in the light of what is already available. However, so far, the richness of the public information remains largely unexploited. Several obstacles impede the successful mining of these data: the difficulty of correctly associating ESTs and microarray probes with the corresponding gene transcript, the incompleteness and inconsistency in the annotation of experimental conditions, and the lack of standardized experimental protocols to generate gene expression data. Here, we review the potential and difficulties of combining publicly available expression data from EST analyses and microarray experiments. With examples from the literature, we show how meta-analysis of expression profiling experiments can be used to study expression behavior in a single organism or between organisms, across a wide range of experimental conditions. We also provide an overview of the methods and tools that can aid molecular biologists in exploiting these public data.
Affiliation(s)
- Ana C Fierro
- Department of Microbial and Molecular Systems, Katholieke Universiteit Leuven, Kasteelpark Arenberg 20, 3001 Leuven, Belgium
|
17
|
Kim HJ, Baek KH, Lee BW, Choi D, Hur CG. In silico identification and characterization of microRNAs and their putative target genes in Solanaceae plants. Genome 2011; 54:91-8. [PMID: 21326365 DOI: 10.1139/g10-104] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Indexed: 12/24/2022]
Abstract
MicroRNAs (miRNAs) are a class of small, single-stranded, noncoding RNAs ranging from 19 to 25 nucleotides. They control various cellular functions by negatively regulating gene expression at the post-transcriptional level. Regulation of target genes by miRNAs has a central role in regulating plant growth and development; however, only a few reports have been published on the function of miRNAs in the family Solanaceae. We identified Solanaceae miRNAs and their target genes by analyzing expressed sequence tag (EST) data from five different Solanaceae species. A comprehensive bioinformatic analysis of EST data of Solanaceae species revealed the presence of at least 11 miRNAs and 54 target genes in pepper (Capsicum annuum L.), 22 miRNAs and 221 target genes in potato (Solanum tuberosum L.), 12 miRNAs and 417 target genes in tomato (Solanum lycopersicum L.), 46 miRNAs and 60 target genes in tobacco (Nicotiana tabacum L.), and 7 miRNAs and 28 target genes in Nicotiana benthamiana. The identified Solanaceae miRNAs and their target genes were deposited in the SolmiRNA database, which is freely available for academic research only at http://genepool.kribb.re.kr/SolmiRNA. Our data indicate that the Solanaceae family has both conserved and specific miRNAs and that their target genes may play important roles in growth and development of Solanaceae plants.
Affiliation(s)
- Hyun-Jin Kim
- Bioinformatics Research Center, Korea Research Institute of Bioscience and Biotechnology, 111 Gwahangno, Yuseong-gu, Daejeon 305-806, Korea
|
18
|
Wang Y, Meng Y, Zhang M, Tong X, Wang Q, Sun Y, Quan J, Govers F, Shan W. Infection of Arabidopsis thaliana by Phytophthora parasitica and identification of variation in host specificity. Mol Plant Pathol 2011; 12:187-201. [PMID: 21199568 PMCID: PMC6640465 DOI: 10.1111/j.1364-3703.2010.00659.x] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.9] [Indexed: 05/18/2023]
Abstract
Oomycete pathogens cause severe damage to a wide range of agriculturally important crops and natural ecosystems. They represent a unique group of plant pathogens that are evolutionarily distant from true fungi. In this study, we established a new plant-oomycete pathosystem in which the broad host range pathogen Phytophthora parasitica was demonstrated to be capable of interacting compatibly with the model plant Arabidopsis thaliana. Water-soaked lesions developed on leaves within 3 days and numerous sporangia formed within 5 days post-inoculation of P. parasitica zoospores. Cytological characterization showed that P. parasitica developed appressoria-like swellings and penetrated epidermal cells directly and preferably at the junction between anticlinal host cell walls. Multiple haustoria-like structures formed in both epidermal cells and mesophyll cells 1 day post-inoculation of zoospores. Pathogenicity assays of 25 A. thaliana ecotypes with six P. parasitica strains indicated the presence of a natural variation in host specificity between A. thaliana and P. parasitica. Most ecotypes were highly susceptible to P. parasitica strains Pp014, Pp016 and Pp025, but resistant to strains Pp008 and Pp009, with the frequent appearance of cell wall deposition and active defence response-based cell necrosis. Gene expression and comparative transcriptomic analysis further confirmed the compatible interaction by the identification of up-regulated genes in A. thaliana which were characteristic of biotic stress. The established A. thaliana-P. parasitica pathosystem expands the model systems investigating oomycete-plant interactions, and will facilitate a full understanding of Phytophthora biology and pathology.
Affiliation(s)
- Yan Wang
- College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
|
19
|
Bombarely A, Merchante C, Csukasi F, Cruz-Rus E, Caballero JL, Medina-Escobar N, Blanco-Portales R, Botella MA, Muñoz-Blanco J, Sánchez-Sevilla JF, Valpuesta V. Generation and analysis of ESTs from strawberry (Fragaria xananassa) fruits and evaluation of their utility in genetic and molecular studies. BMC Genomics 2010; 11:503. [PMID: 20849591 PMCID: PMC2996999 DOI: 10.1186/1471-2164-11-503] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.6] [Received: 02/12/2010] [Accepted: 09/17/2010] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Cultivated strawberry is a hybrid octoploid species (Fragaria xananassa Duchesne ex. Rozier) whose fruit is highly appreciated due to its organoleptic properties and health benefits. Despite recent studies on the control of its growth and ripening processes, information about the role played by different hormones in these processes remains elusive. Further advancement of this knowledge is hampered by the limited sequence information on genes from this species, despite the abundant information available on genes from the wild diploid relative Fragaria vesca. However, the diploid species, or one ancestor, only partially contributes to the genome of the cultivated octoploid. We have produced a collection of expressed sequence tags (ESTs) from different cDNA libraries prepared from different fruit parts and developmental stages. The collection has been analysed and the sequence information used to explore the involvement of different hormones in fruit developmental processes, and for the comparison of transcripts in the receptacle of ripe fruits of diploid and octoploid species. The study is particularly important since the commercial fruit is indeed an enlarged flower receptacle with the true fruits, the achenes, on the surface and connected through a network of vascular vessels to the central pith. RESULTS We have sequenced over 4,500 ESTs from Fragaria xananassa, thus doubling the number of ESTs available in GenBank for this species. We then assembled this information together with that available from F. xananassa, resulting in a total of 7,096 unigenes. The identification of SSRs and SNPs in many of the ESTs allowed their conversion into functional molecular markers. The availability of libraries prepared from green growing fruits has allowed the cloning of cDNAs encoding for genes of auxin, ethylene and brassinosteroid signalling processes, followed by expression studies in selected fruit parts and developmental stages.
In addition, the sequence information generated in the project, jointly with previous information on sequences from both F. xananassa and F. vesca, has allowed designing an oligo-based microarray that has been used to compare the transcriptome of the ripe receptacle of the diploid and octoploid species. Comparison of the transcriptomes, grouping the genes by biological processes, points to differences being quantitative rather than qualitative. CONCLUSIONS The present study generates essential knowledge and molecular tools that will be useful in improving investigations at the molecular level in cultivated strawberry (F. xananassa). This knowledge is likely to provide useful resources in the ongoing breeding programs. The sequence information has already allowed the development of molecular markers that have been applied to germplasm characterization and could be eventually used in QTL analysis. Massive transcription analysis can be of utility to target specific genes to be further studied, by their involvement in the different plant developmental processes.
Affiliation(s)
- Aureliano Bombarely
- Departamento de Biología Molecular y Bioquímica. Universidad de Málaga. Spain
- Catharina Merchante
- Departamento de Biología Molecular y Bioquímica. Universidad de Málaga. Spain
- Fabiana Csukasi
- Departamento de Biología Molecular y Bioquímica. Universidad de Málaga. Spain
- Eduardo Cruz-Rus
- Departamento de Biología Molecular y Bioquímica. Universidad de Málaga. Spain
- Miguel A Botella
- Departamento de Biología Molecular y Bioquímica. Universidad de Málaga. Spain
|
20
|
Guo S, Zheng Y, Joung JG, Liu S, Zhang Z, Crasta OR, Sobral BW, Xu Y, Huang S, Fei Z. Transcriptome sequencing and comparative analysis of cucumber flowers with different sex types. BMC Genomics 2010; 11:384. [PMID: 20565788 PMCID: PMC2897810 DOI: 10.1186/1471-2164-11-384] [Citation(s) in RCA: 146] [Impact Index Per Article: 9.7] [Received: 02/18/2010] [Accepted: 06/17/2010] [Indexed: 11/19/2022] Open
Abstract
Background Cucumber, Cucumis sativus L., is an economically and nutritionally important crop of the Cucurbitaceae family and has long served as a primary model system for sex determination studies. Recently, the sequencing of its whole genome has been completed. However, transcriptome information of this species is still scarce, with a total of around 8,000 Expressed Sequence Tag (EST) and mRNA sequences currently available in GenBank. In order to gain more insights into molecular mechanisms of plant sex determination and provide the community a functional genomics resource that will facilitate cucurbit research and breeding, we performed transcriptome sequencing of cucumber flower buds of two near-isogenic lines, WI1983G, a gynoecious plant which bears only pistillate flowers, and WI1983H, a hermaphroditic plant which bears only bisexual flowers. Result Using Roche-454 massive parallel pyrosequencing technology, we generated a total of 353,941 high quality EST sequences with an average length of 175bp, among which 188,255 were from gynoecious flowers and 165,686 from hermaphroditic flowers. These EST sequences, together with ~5,600 high quality cucumber EST and mRNA sequences available in GenBank, were clustered and assembled into 81,401 unigenes, of which 28,452 were contigs and 52,949 were singletons. The unigenes and ESTs were further mapped to the cucumber genome and more than 500 alternative splicing events were identified in 443 cucumber genes. The unigenes were further functionally annotated by comparing their sequences to different protein and functional domain databases and assigned with Gene Ontology (GO) terms. A biochemical pathway database containing 343 predicted pathways was also created based on the annotations of the unigenes. Digital expression analysis identified ~200 differentially expressed genes between flowers of WI1983G and WI1983H and provided novel insights into molecular mechanisms of plant sex determination process. 
Furthermore, a set of SSR motifs and high confidence SNPs between WI1983G and WI1983H were identified from the ESTs, which provided the material basis for future genetic linkage and QTL analysis. Conclusion A large set of EST sequences was generated from cucumber flower buds of two different sex types. Differentially expressed genes between these two different sex-type flowers, as well as putative SSR and SNP markers, were identified. These EST sequences provide valuable information to further understand the molecular mechanisms of the plant sex determination process and form a rich resource for future functional genomics analysis, marker development and cucumber breeding.
Affiliation(s)
- Shaogui Guo
- National Engineering Research Center for Vegetables, Beijing 100097, China
|
21
|
English AC, Patel KS, Loraine AE. Prevalence of alternative splicing choices in Arabidopsis thaliana. BMC Plant Biol 2010; 10:102. [PMID: 20525311 PMCID: PMC3017808 DOI: 10.1186/1471-2229-10-102] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Received: 10/20/2009] [Accepted: 06/04/2010] [Indexed: 05/04/2023]
Abstract
BACKGROUND Around 14% of Arabidopsis thaliana protein-coding genes from the TAIR9 genome release are annotated as producing multiple transcript variants through alternative splicing. However, for most alternatively spliced genes in Arabidopsis, the relative expression level of individual splicing variants is unknown. RESULTS We investigated the prevalence of alternative splicing (AS) events in Arabidopsis thaliana using ESTs. We found that for most AS events with ample EST coverage, the majority of overlapping ESTs strongly supported one major splicing choice, with less than 10% of ESTs supporting the minor form. Analysis of ESTs also revealed a small but noteworthy subset of genes for which alternative choices appeared with about equal prevalence, suggesting that for these genes the variant splicing forms co-occur in the same cell types. Of the AS events in which both forms were about equally prevalent, more than 80% affected untranslated regions or involved small changes to the encoded protein sequence. CONCLUSIONS Currently available evidence from ESTs indicates that alternative splicing in Arabidopsis occurs and affects many genes, but for most genes with documented alternative splicing, one AS choice predominates. To aid investigation of the role AS may play in modulating the function of Arabidopsis genes, we provide an on-line resource (ArabiTag) that supports searching AS events by gene, by EST library keyword search, and by relative prevalence of minor and major forms.
Affiliation(s)
- Adam C English
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, North Carolina Research Campus, 600 Laureate Way, Kannapolis, NC 28081, USA
- Ketan S Patel
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, North Carolina Research Campus, 600 Laureate Way, Kannapolis, NC 28081, USA
- Ann E Loraine
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, North Carolina Research Campus, 600 Laureate Way, Kannapolis, NC 28081, USA
|
22
|
Raju NL, Gnanesh BN, Lekha P, Jayashree B, Pande S, Hiremath PJ, Byregowda M, Singh NK, Varshney RK. The first set of EST resource for gene discovery and marker development in pigeonpea (Cajanus cajan L.). BMC Plant Biol 2010; 10:45. [PMID: 20222972 PMCID: PMC2923520 DOI: 10.1186/1471-2229-10-45] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.2] [Received: 08/21/2009] [Accepted: 03/11/2010] [Indexed: 05/23/2023]
Abstract
BACKGROUND Pigeonpea (Cajanus cajan (L.) Millsp) is one of the major grain legume crops of the tropics and subtropics, but biotic stresses [Fusarium wilt (FW), sterility mosaic disease (SMD), etc.] are serious challenges for sustainable crop production. Modern genomic tools such as molecular markers and candidate genes associated with resistance to these stresses offer the possibility of facilitating pigeonpea breeding for improving biotic stress resistance. The limited availability of genomic resources, however, is a serious bottleneck to undertaking molecular breeding in pigeonpea to develop superior genotypes with enhanced resistance to the above-mentioned biotic stresses. With the objective of enhancing genomic resources in pigeonpea, this study reports the generation and analysis of a comprehensive resource of FW- and SMD-responsive expressed sequence tags (ESTs). RESULTS A total of 16 cDNA libraries were constructed from four pigeonpea genotypes that are resistant and susceptible to FW ('ICPL 20102' and 'ICP 2376') and SMD ('ICP 7035' and 'TTB 7'), and a total of 9,888 (9,468 high quality) ESTs were generated and deposited in dbEST of GenBank under accession numbers GR463974 to GR473857 and GR958228 to GR958231. Clustering and assembly analyses of these ESTs resulted in 4,557 unique sequences (unigenes) including 697 contigs and 3,860 singletons. BLASTN analysis of the 4,557 unigenes showed significant identity with ESTs of different legumes (23.2-60.3%), rice (28.3%), Arabidopsis (33.7%) and poplar (35.4%). As expected, pigeonpea ESTs are more closely related to soybean (60.3%) and cowpea ESTs (43.6%) than other plant ESTs. Similarly, BLASTX similarity results showed that only 1,603 (35.1%) out of 4,557 total unigenes correspond to known proteins in the UniProt database (… ≥ 5 sequences detected 102 single nucleotide polymorphisms (SNPs) in 37 contigs.
As an example, a set of 10 contigs was used for confirming in silico predicted SNPs in a set of four genotypes using wet lab experiments. The occurrence of SNPs was confirmed for all 6 contigs for which scorable and sequenceable amplicons were generated. PCR amplicons were not obtained for 4 contigs. Recognition sites for restriction enzymes were identified for 102 SNPs in 37 contigs, indicating the possibility of assaying SNPs in 37 genes using the cleaved amplified polymorphic sequences (CAPS) assay. CONCLUSION The pigeonpea EST dataset generated here provides a transcriptomic resource for gene discovery and development of functional markers associated with biotic stress resistance. Sequence analyses of this dataset have shown conservation of a considerable number of pigeonpea transcripts across the legume and model plant species analysed, as well as some putative pigeonpea-specific genes. Validation of identified biotic stress responsive genes should provide candidate genes for allele mining as well as candidate markers for molecular breeding.
Affiliation(s)
- Nikku L Raju
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Greater Hyderabad 502 324, Andhra Pradesh, India
- Belaghihalli N Gnanesh
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Greater Hyderabad 502 324, Andhra Pradesh, India
- University of Agricultural Sciences, Gandhi Krishi Vignyan Kendra (GKVK), Bangalore, 560 065, Karnataka, India
- Pazhamala Lekha
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Greater Hyderabad 502 324, Andhra Pradesh, India
- Balaji Jayashree
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Greater Hyderabad 502 324, Andhra Pradesh, India
- Suresh Pande
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Greater Hyderabad 502 324, Andhra Pradesh, India
- Pavana J Hiremath
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Greater Hyderabad 502 324, Andhra Pradesh, India
- Munishamappa Byregowda
- University of Agricultural Sciences, Gandhi Krishi Vignyan Kendra (GKVK), Bangalore, 560 065, Karnataka, India
- Nagendra K Singh
- National Research Centre on Plant Biotechnology (NRCPB), Indian Agricultural Research Institute, New Delhi 110 012, India
- Rajeev K Varshney
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Greater Hyderabad 502 324, Andhra Pradesh, India
- Genomics towards Gene Discovery Sub Programme, Generation Challenge Programme (GCP) c/o CIMMYT, Int. Apartado Postal 6-641, 06600, Mexico, DF Mexico
|
23
|
Fukuoka H, Yamaguchi H, Nunome T, Negoro S, Miyatake K, Ohyama A. Accumulation, functional annotation, and comparative analysis of expressed sequence tags in eggplant (Solanum melongena L.), the third pole of the genus Solanum species after tomato and potato. Gene 2010; 450:76-84. [PMID: 19857557 DOI: 10.1016/j.gene.2009.10.006] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Received: 08/11/2009] [Revised: 10/16/2009] [Accepted: 10/16/2009] [Indexed: 10/20/2022]
Abstract
Eggplant (Solanum melongena L.) is a widely grown vegetable crop that belongs to the genus Solanum, which comprises more than 1000 species of wide genetic and phenotypic variation. Unlike tomato and potato, Solanum crops that belong to subgenus Potatoe and have been targets for comprehensive genomic studies, eggplant is endemic to the Old World and belongs to a different subgenus, Leptostemonum, and would therefore be a unique member for comparative molecular biology in Solanum. In this study, more than 60,000 eggplant cDNA clones from various tissues and treatments were sequenced from both the 5'- and 3'-ends, and a unigene set consisting of 16,245 unique sequences was constructed. Functional annotations based on sequence similarity to known plant reference datasets revealed a distribution of functional categories similar to that of tomato, while 1316 unigenes were suggested to be eggplant-specific. Sequence-based comparative analysis using putative orthologous gene groups set up by reciprocal sequence comparison among six solanaceous species suggested that eggplant and its wild ally Solanum torvum were clustered separately from subgenus Potatoe species, and then, all Solanum species were clustered separately from the genus Capsicum. Microsatellite motif distribution was different among species and likely to be coincident with the phylogenetic relationships. Furthermore, the eggplant unigene dataset exhibited its utility in transcriptome analysis by the SAGE strategy where a considerable number of short tag sequences of interest were successfully assigned to unigenes and their functional annotations. The eggplant ESTs and 16k unigene set developed in this study would be a useful resource not only for molecular genetics and breeding in eggplant itself, but also for expanding the scope of comparative biology in Solanum species.
Affiliation(s)
- Hiroyuki Fukuoka
- National Institute of Vegetable and Tea Science, NARO., Ano, Tsu, Mie 514-2392, Japan.
|
24
|
Gan LP, Zhang WY, Niu YS, Xu L, Xi J, Ji MM, Xu SQ. Construction and application of an electronic spatiotemporal expression profile and gene ontology analysis platform based on the EST database of the silkworm, Bombyx mori. J Insect Sci 2010; 10:114. [PMID: 20874595 PMCID: PMC3016962 DOI: 10.1673/031.010.11401] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 09/01/2009] [Accepted: 01/25/2010] [Indexed: 05/29/2023]
Abstract
An Expressed Sequence Tag (EST) is a short sub-sequence of a transcribed cDNA sequence. ESTs reflect gene expression and provide useful clues for gene expression analysis. Based on EST data obtained from NCBI, an EST analysis package (apEST) was developed. The tool was programmed for electronic expression profiling, protein annotation and Gene Ontology (GO) category analysis in Bombyx mori (L.) (Lepidoptera: Bombycidae). A total of 245,761 ESTs (as of 01 July 2009) were retrieved and downloaded in FASTA format; information on tissue type, developmental stage, sex and strain was extracted from them, classified and tallied by running apEST. Corresponding distribution profiles were then built after redundant entries had been removed. Gene expression profiles were obtained for one tissue across different developmental stages and for one developmental stage across different tissues. A housekeeping gene and tissue- and stage-specific genes were selected by running apEST and compared with two other online analysis approaches, the microarray-based gene expression profile on SilkDB (BmMDB) and the EST profile on NCBI. A spatio-temporal expression profile of catalase produced by apEST was then presented as a three-dimensional graph for intuitive visualization of patterns. A total of 37 genes confirmed by microarray data and RT-PCR experiments were selected as queries to test apEST. The results agreed closely among the three approaches. Nevertheless, there were minor differences between apEST and BmMDB because of the unique items investigated. Therefore, complementary analysis was proposed. Application of apEST also yielded the corresponding protein annotations for EST datasets and, eventually, their functions. The results were presented as statistical information on protein annotation and Gene Ontology (GO) category. Together, these results verified the reliability of apEST and the operability of the platform.
apEST can also be applied to other species by modifying some parameters, and it serves as a model for gene expression studies in Lepidoptera.
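The extract, classify and tally step described above can be sketched roughly as follows. This is not apEST itself: the `key=value` header layout, the field names and the sequences are assumptions made for illustration, and real NCBI EST records carry this metadata in a different form.

```python
import re
from collections import Counter

# Hypothetical FASTA records; the header layout is an assumption.
FASTA = """\
>EST001 tissue=silk_gland stage=fifth_instar
ATGGCTAGCT
>EST002 tissue=silk_gland stage=fifth_instar
ATGGCTAGGA
>EST003 tissue=midgut stage=fifth_instar
GGCTAACGTT
>EST004 tissue=silk_gland stage=pupa
TTGACCGTAA
"""

FIELD = re.compile(r"(\w+)=(\S+)")

def tally_by_tissue_and_stage(fasta_text: str) -> Counter:
    """Count ESTs per (tissue, stage) pair using header lines only."""
    counts: Counter = Counter()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = dict(FIELD.findall(line))
            key = (fields.get("tissue", "unknown"), fields.get("stage", "unknown"))
            counts[key] += 1
    return counts

profile = tally_by_tissue_and_stage(FASTA)
for (tissue, stage), n in sorted(profile.items()):
    print(tissue, stage, n)
```

Fixing the tissue and reading across stages, or fixing the stage and reading across tissues, gives the two kinds of profile the abstract describes.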
Affiliation(s)
- Li-Ping Gan
  - National Engineering Laboratory for Modern Silk, Department of Applied Biology, Medical College of Soochow University, Suzhou, 215153, P. R. China
  - Biology Department, Chongqing Three Gorges University, Chongqing, 404000, China
- Wen-Yu Zhang
  - Bioinformatics Department, Medical College, Soochow University, Suzhou, 215153, China
- Yan-Shan Niu
  - National Engineering Laboratory for Modern Silk, Department of Applied Biology, Medical College of Soochow University, Suzhou, 215153, P. R. China
- Li Xu
  - National Engineering Laboratory for Modern Silk, Department of Applied Biology, Medical College of Soochow University, Suzhou, 215153, P. R. China
- Jian Xi
  - National Engineering Laboratory for Modern Silk, Department of Applied Biology, Medical College of Soochow University, Suzhou, 215153, P. R. China
- Ming-Ming Ji
  - National Engineering Laboratory for Modern Silk, Department of Applied Biology, Medical College of Soochow University, Suzhou, 215153, P. R. China
- Shi-Qing Xu
  - National Engineering Laboratory for Modern Silk, Department of Applied Biology, Medical College of Soochow University, Suzhou, 215153, P. R. China
|
25
|
Jackson DJ, McDougall C, Woodcroft B, Moase P, Rose RA, Kube M, Reinhardt R, Rokhsar DS, Montagnani C, Joubert C, Piquemal D, Degnan BM. Parallel evolution of nacre building gene sets in molluscs. Mol Biol Evol 2009; 27:591-608. [PMID: 19915030 DOI: 10.1093/molbev/msp278] [Citation(s) in RCA: 170] [Impact Index Per Article: 10.6] [Indexed: 12/16/2022] Open
Abstract
The capacity to biomineralize is closely linked to the rapid expansion of animal life during the early Cambrian, with many skeletonized phyla first appearing in the fossil record at this time. The appearance of disparate molluscan forms during this period leaves open the possibility that shells evolved independently and in parallel in at least some groups. To test this proposition and gain insight into the evolution of structural genes that contribute to shell fabrication, we compared genes expressed in nacre (mother-of-pearl) forming cells in the mantle of the bivalve Pinctada maxima and the gastropod Haliotis asinina. Despite both species having highly lustrous nacre, we find extensive differences in these expressed gene sets. Following the removal of housekeeping genes, less than 10% of all gene clusters are shared between these molluscs, with some being conserved biomineralization genes that are also found in deuterostomes. These differences extend to secreted proteins that may localize to the organic shell matrix, with less than 15% of this secretome being shared. Despite these differences, H. asinina and P. maxima both secrete proteins with repetitive low-complexity domains (RLCDs). Pinctada maxima RLCD proteins-for example, the shematrins-are predominated by silk/fibroin-like domains, which are absent from the H. asinina data set. Comparisons of shematrin genes across three species of Pinctada indicate that this gene family has undergone extensive divergent evolution within pearl oysters. We also detect fundamental bivalve-gastropod differences in extracellular matrix proteins involved in mollusc-shell formation. Pinctada maxima expresses a chitin synthase at high levels and several chitin deacetylation genes, whereas only one protein involved in chitin interactions is present in the H. asinina data set, suggesting that the organic matrix on which calcification proceeds differs fundamentally between these species. 
Large-scale differences in genes expressed in nacre-forming cells of Pinctada and Haliotis are compatible with the hypothesis that gastropod and bivalve nacre is the result of convergent evolution. The expression of novel biomineralizing RLCD proteins in each of these two molluscs and, interestingly, sea urchins suggests that the evolution of such structural proteins has occurred independently multiple times in the Metazoa.
Affiliation(s)
- Daniel J Jackson
- School of Biological Sciences, University of Queensland, Brisbane, Australia
|
26
|
Tittarelli A, Santiago M, Morales A, Meisel LA, Silva H. Isolation and functional characterization of cold-regulated promoters, by digitally identifying peach fruit cold-induced genes from a large EST dataset. BMC PLANT BIOLOGY 2009; 9:121. [PMID: 19772651 PMCID: PMC2754992 DOI: 10.1186/1471-2229-9-121] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 02/09/2009] [Accepted: 09/22/2009] [Indexed: 05/21/2023]
Abstract
BACKGROUND Cold acclimation is the process by which plants adapt to the low, non-freezing temperatures that naturally occur during late autumn or early winter. This process enables the plants to resist the freezing temperatures of winter. Temperatures similar to those associated with cold acclimation are also used by the fruit industry to delay fruit ripening in peaches. However, peaches that are subjected to long periods of cold storage may develop chilling injury symptoms (woolliness and internal breakdown). In order to better understand the relationship between cold acclimation and chilling injury in peaches, we isolated and functionally characterized cold-regulated promoters from cold-inducible genes identified by digitally analyzing a large EST dataset. RESULTS Digital expression analyses of EST datasets revealed 164 cold-induced peach genes, several of which show similarities to genes associated with cold acclimation and cold stress responses. The promoters of three of these cold-inducible genes (Ppbec1, Ppxero2 and Pptha1) were fused to the GUS reporter gene and characterized for cold-inducibility using both transient transformation assays in peach fruits (in fruta) and stable transformation in Arabidopsis thaliana. These assays demonstrate that the Pptha1 promoter is not cold-inducible, whereas the Ppbec1 and Ppxero2 promoter constructs are cold-inducible. CONCLUSION This work demonstrates that during cold storage, peach fruits differentially express genes that are associated with cold acclimation. Functional characterization of these promoters in transient transformation assays in fruta, as well as by stable transformation in Arabidopsis, demonstrates that the isolated Ppbec1 and Ppxero2 promoters are cold-inducible, whereas the isolated Pptha1 promoter is not.
Additionally, the cold-inducible activity of the Ppbec1 and Ppxero2 promoters suggests that there is a conserved heterologous cold-inducible regulation of these promoters in peach and Arabidopsis. These results reveal that digital expression analyses may be used in non-model species to identify candidate genes whose promoters are differentially expressed in response to exogenous stimuli.
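Digital expression analysis, as used above, compares how often a gene's ESTs occur in libraries from different conditions. The abstract does not state which statistic the authors applied, so the following is only a minimal sketch of one simple option: frequencies scaled to library size and a fold change with a pseudocount. The counts and library sizes are hypothetical.

```python
def per_10k(est_count: int, library_size: int) -> float:
    """EST count for one gene, scaled to a library of 10,000 ESTs."""
    return 10_000 * est_count / library_size

def fold_change(treated: int, treated_size: int,
                control: int, control_size: int,
                pseudocount: int = 1) -> float:
    """Ratio of scaled frequencies; the pseudocount avoids division by zero."""
    treated_freq = per_10k(treated + pseudocount, treated_size)
    control_freq = per_10k(control + pseudocount, control_size)
    return treated_freq / control_freq

# Hypothetical gene: 9 ESTs in a 4,000-EST cold-storage library,
# 3 ESTs in an 8,000-EST control library.
ratio = fold_change(9, 4_000, 3, 8_000)
print(ratio)  # 5.0
```

Scaling by library size matters because the raw counts (9 versus 3) understate the difference when the control library is twice as large. A real analysis would normally add a significance test before calling a gene cold-induced.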
Affiliation(s)
- Andrés Tittarelli
  - Millennium Nucleus in Plant Cell Biotechnology (MN-PCB), Santiago, Chile
  - Plant Functional Genomics & Bioinformatics Lab, Universidad Andrés Bello, Santiago, Chile
- Margarita Santiago
  - Millennium Nucleus in Plant Cell Biotechnology (MN-PCB), Santiago, Chile
  - Plant Functional Genomics & Bioinformatics Lab, Universidad Andrés Bello, Santiago, Chile
- Andrea Morales
  - Millennium Nucleus in Plant Cell Biotechnology (MN-PCB), Santiago, Chile
  - Plant Functional Genomics & Bioinformatics Lab, Universidad Andrés Bello, Santiago, Chile
- Lee A Meisel
  - Millennium Nucleus in Plant Cell Biotechnology (MN-PCB), Santiago, Chile
  - Centro de Biotecnología Vegetal, Universidad Andrés Bello, Santiago, Chile
- Herman Silva
  - Millennium Nucleus in Plant Cell Biotechnology (MN-PCB), Santiago, Chile
  - Plant Functional Genomics & Bioinformatics Lab, Universidad Andrés Bello, Santiago, Chile
|
27
|
Ashraf N, Ghai D, Barman P, Basu S, Gangisetty N, Mandal MK, Chakraborty N, Datta A, Chakraborty S. Comparative analyses of genotype dependent expressed sequence tags and stress-responsive transcriptome of chickpea wilt illustrate predicted and unexpected genes and novel regulators of plant immunity. BMC Genomics 2009; 10:415. [PMID: 19732460 PMCID: PMC2755012 DOI: 10.1186/1471-2164-10-415] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.5] [Received: 02/11/2009] [Accepted: 09/05/2009] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND The ultimate phenome of any organism is modulated by regulated transcription of many genes. Characterization of genetic makeup is thus crucial for understanding the molecular basis of phenotypic diversity, evolution and response to intra- and extra-cellular stimuli. Chickpea is the world's third most important food legume, grown in over 40 countries representing all the continents. Despite its importance in plant evolution and its role in human nutrition and stress adaptation, very few ESTs and little differential transcriptome data are available, let alone genotype-specific gene signatures. The present study focuses on Fusarium wilt-responsive gene expression in chickpea. RESULTS We report 6272 gene sequences of the immune-response pathway that provide genotype-dependent spatial information on the presence and relative abundance of each gene. The sequence assembly led to the identification of a CaUnigene set of 2013 transcripts comprising 973 contigs and 1040 singletons, two-thirds of which represent new chickpea genes hitherto undiscovered. We identified 209 gene families and 262 genotype-specific SNPs. Further, several novel transcription regulators were identified, indicating their possible role in immune response. The transcriptomic analysis revealed 649 non-canonical genes besides many unexpected candidates with known biochemical functions, which have never been associated with the pathostress-responsive transcriptome. CONCLUSION Our study establishes a comprehensive catalogue of the immune-responsive root transcriptome with insight into the identity and function of its genes. The development and detailed analysis of CaEST datasets and global gene expression by microarray provide new insight into the commonality and diversity of organ-specific immune-responsive transcript signatures and their regulated expression shaping the species specificity at genotype level. This is the first report on the differential transcriptome of an unsequenced genome during vascular wilt.
Affiliation(s)
- Nasheeman Ashraf
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi-110067, India
- Deepali Ghai
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi-110067, India
- Pranjan Barman
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi-110067, India
- Swaraj Basu
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi-110067, India
- Nagaraju Gangisetty
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi-110067, India
- Mihir K Mandal
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi-110067, India
- Niranjan Chakraborty
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi-110067, India
- Asis Datta
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi-110067, India
- Subhra Chakraborty
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi-110067, India
|
28
|
Vega-Arreguín JC, Ibarra-Laclette E, Jiménez-Moraila B, Martínez O, Vielle-Calzada JP, Herrera-Estrella L, Herrera-Estrella A. Deep sampling of the Palomero maize transcriptome by a high throughput strategy of pyrosequencing. BMC Genomics 2009; 10:299. [PMID: 19580677 PMCID: PMC2714558 DOI: 10.1186/1471-2164-10-299] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.5] [Received: 12/02/2008] [Accepted: 07/06/2009] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND In-depth sequencing analysis has not been able to determine the overall complexity of transcriptional activity of a plant organ or tissue sample. In some cases, deep parallel sequencing of Expressed Sequence Tags (ESTs), although not yet optimized for the sequencing of cDNAs, has represented an efficient procedure for validating gene prediction and estimating overall gene coverage. This approach could be very valuable for complex plant genomes. In addition, little emphasis has been given to efforts aiming at an estimation of the overall transcriptional universe found in a multicellular organism at a specific developmental stage. RESULTS To explore, in depth, the transcriptional diversity in an ancient maize landrace, we developed a protocol to optimize the sequencing of cDNAs and performed 4 consecutive GS20-454 pyrosequencing runs of a cDNA library obtained from 2 week-old Palomero Toluqueño maize plants. The protocol reported here allowed obtaining over 90% of informative sequences. These GS20-454 runs generated over 1.5 Million reads, representing the largest amount of sequences reported from a single plant cDNA library. A collection of 367,391 quality-filtered reads (30.09 Mb) from a single run was sufficient to identify transcripts corresponding to 34% of public maize ESTs databases; total sequences generated after 4 filtered runs increased this coverage to 50%. Comparisons of all 1.5 Million reads to the Maize Assembled Genomic Islands (MAGIs) provided evidence for the transcriptional activity of 11% of MAGIs. We estimate that 5.67% (86,069 sequences) do not align with public ESTs or annotated genes, potentially representing new maize transcripts. Following the assembly of 74.4% of the reads in 65,493 contigs, real-time PCR of selected genes confirmed a predicted correlation between the abundance of GS20-454 sequences and corresponding levels of gene expression. 
CONCLUSION A protocol was developed that significantly increases the number, length and quality of cDNA reads using massive 454 parallel sequencing. We show that recurrent 454 pyrosequencing of a single cDNA sample is necessary to attain a thorough representation of the transcriptional universe present in maize, that can also be used to estimate transcript abundance of specific genes. This data suggests that the molecular and functional diversity contained in the vast native landraces remains to be explored, and that large-scale transcriptional sequencing of a presumed ancestor of the modern maize varieties represents a valuable approach to characterize the functional diversity of maize for future agricultural and evolutionary studies.
Affiliation(s)
- Julio C Vega-Arreguín
- Laboratorio Nacional de Genómica para la Biodiversidad, Cinvestav Campus Guanajuato, Carretera Irapuato-León, Irapuato, Gto, Mexico.
|
29
|
Govind G, Harshavardhan VT, ThammeGowda HV, Patricia JK, Kalaiarasi PJ, Dhanalakshmi R, Iyer DR, Senthil Kumar M, Muthappa SK, Sreenivasulu N, Nese S, Udayakumar M, Makarla UK. Identification and functional validation of a unique set of drought induced genes preferentially expressed in response to gradual water stress in peanut. Mol Genet Genomics 2009; 281:591-605. [PMID: 19224247 PMCID: PMC2757612 DOI: 10.1007/s00438-009-0432-z] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.6] [Received: 11/17/2007] [Accepted: 01/30/2009] [Indexed: 12/28/2022]
Abstract
Peanut, a relatively drought-tolerant crop, was chosen to characterize the genes expressed under gradual water-deficit stress. Nearly 700 genes were found to be enriched in a subtractive cDNA library prepared during gradual adaptation to drought stress. Further, the expression of drought-inducible genes related to various signaling components, and of gene sets involved in protecting cellular function, is described based on dot blot experiments. Fifty genes (25 regulators and 25 functional genes) selected from the dot blot experiments were tested for stress responsiveness by northern blot analysis, which confirmed their differential regulation under drought stress treatments at different field capacities. ESTs generated from this subtracted cDNA library offer a rich source of stress-related genes, including signaling components. The additional 50% of sequences that remain uncharacterized are noteworthy. Insights gained from this study provide a foundation for further studies of how peanut plants adapt to naturally occurring harsh drought conditions. At present, functional validation is not feasible in peanut; hence, as a proof of concept, seven orthologues of drought-induced peanut genes were silenced in the heterologous N. benthamiana system using virus-induced gene silencing. These results point to the functional importance of the HSP70 gene and of key regulators such as Jumonji in the drought stress response.
Affiliation(s)
- Geetha Govind
- Department of Crop Physiology, University of Agricultural Sciences, GKVK, Bangalore, 560 065, Karnataka, India.
|
30
|
Sjödin A, Street NR, Sandberg G, Gustafsson P, Jansson S. The Populus Genome Integrative Explorer (PopGenIE): a new resource for exploring the Populus genome. THE NEW PHYTOLOGIST 2009; 182:1013-1025. [PMID: 19383103 DOI: 10.1111/j.1469-8137.2009.02807.x] [Citation(s) in RCA: 108] [Impact Index Per Article: 6.8] [Indexed: 05/21/2023]
Abstract
Populus has become an important model plant system. However, utilization of the increasingly extensive collection of genetics and genomics data created by the community is currently hindered by the lack of a central resource, such as a model organism database (MOD). Such MODs offer a single entry point to the collection of resources available within a model system, typically including tools for exploring and querying those resources. As a starting point to overcoming the lack of such an MOD for Populus, we present the Populus Genome Integrative Explorer (PopGenIE), an integrated set of tools for exploring the Populus genome and transcriptome. The resource includes genome, synteny and quantitative trait locus (QTL) browsers for exploring genetic data. Expression tools include an electronic fluorescent pictograph (eFP) browser, expression profile plots, co-regulation within collated transcriptomics data sets, and identification of over-represented functional categories and genomic hotspot locations. A number of collated transcriptomics data sets are made available in the eFP browser to facilitate functional exploration of gene function. Additional homology and data extraction tools are provided. PopGenIE significantly increases accessibility to Populus genomics resources and allows exploration of transcriptomics data without the need to learn or understand complex statistical analysis methods. PopGenIE is available at www.popgenie.org or via www.populusgenome.info.
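Identifying over-represented functional categories, one of the expression tools listed above, is often done with a hypergeometric test. The abstract does not say which test PopGenIE uses, so the sketch below is a generic illustration of that approach with made-up numbers.

```python
from math import comb

def hypergeom_upper_tail(k: int, population: int, annotated: int, drawn: int) -> float:
    """P(X >= k) when `drawn` genes are sampled without replacement from a
    `population` in which `annotated` genes carry the category of interest."""
    total = comb(population, drawn)
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / total

# Hypothetical case: 20 genes in total, 5 annotated with a category;
# a co-expressed set of 5 genes contains 3 of the annotated ones.
p_value = hypergeom_upper_tail(3, population=20, annotated=5, drawn=5)
print(round(p_value, 4))  # 0.0726
```

A small p-value suggests the category appears in the gene set more often than chance would explain. When many categories are tested at once, a multiple-testing correction is usually applied as well.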
Affiliation(s)
- Andreas Sjödin
  - Umeå Plant Science Centre, Department of Plant Physiology, University of Umeå, SE-901-87 Umeå, Sweden
  - CBRN Security and Defence, Swedish Defence Research Agency, SE-90182 Umeå, Sweden
- Nathaniel Robert Street
  - Umeå Plant Science Centre, Department of Plant Physiology, University of Umeå, SE-901-87 Umeå, Sweden
- Göran Sandberg
  - Umeå Plant Science Centre, Department of Plant Physiology, University of Umeå, SE-901-87 Umeå, Sweden
- Petter Gustafsson
  - Umeå Plant Science Centre, Department of Plant Physiology, University of Umeå, SE-901-87 Umeå, Sweden
- Stefan Jansson
  - Umeå Plant Science Centre, Department of Plant Physiology, University of Umeå, SE-901-87 Umeå, Sweden
|
31
|
Jiang SM, Yin WB, Hu J, Shi R, Zhou RN, Chen YH, Zhou GH, Wang RRC, Song LY, Hu ZM. Isolation of expressed sequences from a specific chromosome of Thinopyrum intermedium infected by BYDV. Genome 2009; 52:68-76. [PMID: 19132073 DOI: 10.1139/g08-108] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/22/2022]
Abstract
To map important ESTs to specific chromosomes and (or) chromosomal regions is difficult in hexaploid wheat because of its large genome size and serious interference of homoeologous sequences. Large-scale EST sequencing and subsequent chromosome localization are both laborious and time-consuming. The wheat alien addition line TAi-27 contains a pair of chromosomes of Thinopyrum intermedium (Host) Barkworth & D.R. Dewey that carry the resistance gene against barley yellow dwarf virus. In this research, we developed a modified technique based on chromosome microdissection and hybridization-specific amplification to isolate expressed sequences from the alien chromosome of TAi-27 by hybridization between the DNA of the microdissected alien chromosome and cDNA of Th. intermedium infected by barley yellow dwarf virus. Twelve clones were selected, sequenced, and analyzed. Three of them were unknown genes without any hit in the GenBank database and the other nine were highly homologous with ESTs of wheat, barley, and (or) other plants in Gramineae induced by abiotic or biotic stress. The method used in this research to isolate expressed sequences from a specific chromosome has the following advantages: (i) the obtained expressed sequences are larger in size and have 3' end information and (ii) the operation is less complicated. It would be an efficient improved method for genomics and functional genomics research of polyploid plants, especially for EST development and mapping. The obtained expressed sequence data are also informative in understanding the resistance genes on the alien chromosome of TAi-27.
Affiliation(s)
- Shu-Mei Jiang
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Datun Road, Beijing 100101, PR China
|
32
|
Hei-Chia Wang, Tian-Hsiang Huang. Prediction of EST Functional Relationships via Literature Mining With User-Specified Parameters. IEEE Trans Biomed Eng 2009; 56:969-77. [DOI: 10.1109/tbme.2008.2009765] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]
|
33
|
Buell CR. Poaceae genomes: going from unattainable to becoming a model clade for comparative plant genomics. PLANT PHYSIOLOGY 2009; 149:111-6. [PMID: 19005087 PMCID: PMC2613712 DOI: 10.1104/pp.108.128926] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 09/01/2008] [Accepted: 11/05/2008] [Indexed: 05/21/2023]
Affiliation(s)
- C Robin Buell
- Department of Plant Biology, Michigan State University, East Lansing, Michigan 48824, USA.
|
34
|
Shi BJ, Wang GL. Comparative study of genes expressed from rice fungus-resistant and susceptible lines during interactions with Magnaporthe oryzae. Gene 2008; 427:80-5. [DOI: 10.1016/j.gene.2008.09.015] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 07/11/2008] [Revised: 09/11/2008] [Accepted: 09/16/2008] [Indexed: 01/17/2023]
|
35
|
Argout X, Fouet O, Wincker P, Gramacho K, Legavre T, Sabau X, Risterucci AM, Da Silva C, Cascardo J, Allegre M, Kuhn D, Verica J, Courtois B, Loor G, Babin R, Sounigo O, Ducamp M, Guiltinan MJ, Ruiz M, Alemanno L, Machado R, Phillips W, Schnell R, Gilmour M, Rosenquist E, Butler D, Maximova S, Lanaud C. Towards the understanding of the cocoa transcriptome: Production and analysis of an exhaustive dataset of ESTs of Theobroma cacao L. generated from various tissues and under various conditions. BMC Genomics 2008; 9:512. [PMID: 18973681 PMCID: PMC2642826 DOI: 10.1186/1471-2164-9-512] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.5] [Received: 06/04/2008] [Accepted: 10/30/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Theobroma cacao L. is a tree originating from the tropical rainforest of South America. It is one of the major cash crops for many tropical countries. T. cacao is mainly produced on smallholdings, providing resources for 14 million farmers. Disease resistance and T. cacao quality improvement are two important challenges for all actors in cocoa and chocolate production. T. cacao is seriously affected by pests and fungal diseases, which are responsible for yield losses of more than 40%; quality improvement, both nutritional and organoleptic, is also important for consumers. An international collaboration was formed to develop an EST genomic resource database for cacao. RESULTS Fifty-six cDNA libraries were constructed from different organs, different genotypes and different environmental conditions. A total of 149,650 valid EST sequences were generated, corresponding to 48,594 unigenes (12,692 contigs and 35,902 singletons). A total of 29,849 unigenes shared significant homology with public sequences from other species. Gene Ontology (GO) annotation was applied to distribute the ESTs among the main GO categories. A specific information system (ESTtik) was constructed to process, store and manage this EST collection, allowing the user to query a database. To check the representativeness of our EST collection, we looked for the genes known to be involved in two different metabolic pathways extensively studied in other plant species and important for T. cacao qualities: the flavonoid and the terpene pathways. Most of the enzymes described in other crops for these two metabolic pathways were found in our EST collection. A large collection of new genetic markers was provided by this EST collection. CONCLUSION This EST collection displays a good representation of the T. cacao transcriptome, suitable for analysis of biochemical pathways based on oligonucleotide microarrays derived from these ESTs.
It will provide numerous genetic markers that will allow the construction of a high-density gene map of T. cacao. This EST collection represents a unique and important molecular resource for T. cacao study and improvement, facilitating the discovery of candidate genes for important T. cacao trait variation.
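The assembly figures in this abstract can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in the abstract; the mean contig depth is a derived estimate that assumes each singleton is exactly one EST, which is the usual definition but is not stated in the text.

```python
# Figures quoted in the abstract.
ests = 149_650
contigs = 12_692
singletons = 35_902
annotated_unigenes = 29_849

# Unigenes are contigs plus singletons.
unigenes = contigs + singletons

# Assumption: each singleton is a single EST, so the rest sit in contigs.
ests_in_contigs = ests - singletons
mean_contig_depth = ests_in_contigs / contigs

# Share of unigenes with significant homology to public sequences.
annotated_fraction = annotated_unigenes / unigenes

print(unigenes)                      # 48594
print(round(mean_contig_depth, 2))   # 8.96
print(round(annotated_fraction, 3))  # 0.614
```

The sum matches the 48,594 unigenes reported, so the contig and singleton counts are a breakdown of the unigene set and not additional sequences.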
Affiliation(s)
- Xavier Argout
- Biological Systems Department, UMR DAP TA 40/03, CIRAD, Montpellier, France.
|
36
|
Kim HJ, Baek KH, Lee SW, Kim J, Lee BW, Cho HS, Kim WT, Choi D, Hur CG. Pepper EST database: comprehensive in silico tool for analyzing the chili pepper (Capsicum annuum) transcriptome. BMC PLANT BIOLOGY 2008; 8:101. [PMID: 18844979 PMCID: PMC2575210 DOI: 10.1186/1471-2229-8-101] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Received: 07/23/2008] [Accepted: 10/09/2008] [Indexed: 05/19/2023]
Abstract
BACKGROUND There is no dedicated database available for Expressed Sequence Tags (EST) of the chili pepper (Capsicum annuum), although the interest in a chili pepper EST database is increasing internationally due to the nutritional, economic, and pharmaceutical value of the plant. Recent advances in high-throughput sequencing of the ESTs of chili pepper cv. Bukang have produced hundreds of thousands of complementary DNA (cDNA) sequences. Therefore, a chili pepper EST database was designed and constructed to enable comprehensive analysis of chili pepper gene expression in response to biotic and abiotic stresses. RESULTS We built the Pepper EST database to mine the complexity of chili pepper ESTs. The database was built on 122,582 sequenced ESTs and 116,412 refined ESTs from 21 pepper EST libraries. The ESTs were clustered and assembled into virtual consensus cDNAs and the cDNAs were assigned to metabolic pathway, Gene Ontology (GO), and MIPS Functional Catalogue (FunCat). The Pepper EST database is designed to provide a workbench for (i) identifying unigenes in pepper plants, (ii) analyzing expression patterns in different developmental tissues and under conditions of stress, and (iii) comparing the ESTs with those of other members of the Solanaceae family. The Pepper EST database is freely available at http://genepool.kribb.re.kr/pepper/. CONCLUSION The Pepper EST database is expected to provide a high-quality resource, which will contribute to gaining a systemic understanding of plant diseases and facilitate genetics-based population studies. The database is also expected to contribute to analysis of gene synteny as part of the chili pepper sequencing project by mapping ESTs to the genome.
Affiliation(s)
- Hyun-Jin Kim
- Omics Integration Research Center, KRIBB, 111 Gwahangno, Yuseong-gu, Daejeon 305-806, Korea
| | - Kwang-Hyun Baek
- School of Biotechnology, Yeungnam University, Gyeongsan, Gyeongbuk 712-749, Korea
| | - Seung-Won Lee
- Omics Integration Research Center, KRIBB, 111 Gwahangno, Yuseong-gu, Daejeon 305-806, Korea
| | - JungEun Kim
- Omics Integration Research Center, KRIBB, 111 Gwahangno, Yuseong-gu, Daejeon 305-806, Korea
| | - Bong-Woo Lee
- Omics Integration Research Center, KRIBB, 111 Gwahangno, Yuseong-gu, Daejeon 305-806, Korea
| | - Hye-Sun Cho
- Plant Genome Research Center, KRIBB, 111 Gwahangno, Yuseong-gu, Daejeon 305-806, Korea
| | - Woo Taek Kim
- Department of Biology, College of Life Science and Biotechnology, Yonsei University, Seoul 120-749, Korea
| | - Doil Choi
- Department of Plant Science, Seoul National University, Seoul 151-921, Korea
| | - Cheol-Goo Hur
- Omics Integration Research Center, KRIBB, 111 Gwahangno, Yuseong-gu, Daejeon 305-806, Korea
|
37
|
Kim C, Jang CS, Kamps TL, Robertson JS, Feltus FA, Paterson AH. Transcriptome analysis of leaf tissue from Bermudagrass (Cynodon dactylon) using a normalised cDNA library. FUNCTIONAL PLANT BIOLOGY : FPB 2008; 35:585-594. [PMID: 32688814 DOI: 10.1071/fp08133] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 04/18/2008] [Accepted: 06/03/2008] [Indexed: 06/11/2023]
Abstract
A normalised cDNA library was constructed from Bermudagrass to gain insight into the transcriptome of Cynodon dactylon L. A total of 15 588 high-quality expressed sequence tags (ESTs) from the cDNA library were subjected to The Institute for Genomic Research Gene Indices clustering tools to produce a unigene set. A total of 9414 unigenes were obtained from the high-quality ESTs and only 39.6% of the high-quality ESTs were redundant, indicating that the normalisation procedure was effective. A large-scale comparative genomic analysis of the unigenes was carried out using publicly available tools, such as BLAST, InterProScan and Gene Ontology. The unigenes were also subjected to a search for EST-derived simple sequence repeats (EST-SSRs) and conserved-intron scanning primers (CISPs), which are useful as DNA markers. Although the candidate EST-SSRs and CISPs found in the present study need to be empirically tested, they are expected to be useful as DNA markers for many purposes, including comparative genomic studies of grass species, by virtue of their significant similarities to EST sequences from other grasses. Thus, knowledge of Cynodon ESTs will empower turfgrass research by providing homologues for genes that are thought to confer important functions in other plants.
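The redundancy figure quoted in this abstract can be reproduced from the two counts it reports. The sketch below assumes redundancy is defined as the share of high-quality ESTs that did not contribute a new unigene; the abstract does not state the formula, so the function name `redundancy_percent` and the definition are illustrative assumptions.

```python
def redundancy_percent(total_ests: int, unigenes: int) -> float:
    """Percentage of ESTs that did not yield a new unigene.

    Assumed definition (not stated in the abstract):
    redundancy = (total ESTs - unigenes) / total ESTs.
    """
    if total_ests <= 0 or not 0 <= unigenes <= total_ests:
        raise ValueError("expected 0 <= unigenes <= total_ests and total_ests > 0")
    return 100.0 * (total_ests - unigenes) / total_ests


# Counts taken from the abstract: 15 588 high-quality ESTs, 9414 unigenes.
print(f"{redundancy_percent(15588, 9414):.1f}%")  # prints "39.6%"
```

Under this assumed definition the result matches the 39.6% stated in the abstract.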
Affiliation(s)
- Changsoo Kim
- Center for Applied Genetic Technologies, University of Georgia, Athens, GA 30602, USA
| | - Cheol Seong Jang
- Center for Applied Genetic Technologies, University of Georgia, Athens, GA 30602, USA
| | - Terry L Kamps
- Center for Applied Genetic Technologies, University of Georgia, Athens, GA 30602, USA
| | - Jon S Robertson
- Center for Applied Genetic Technologies, University of Georgia, Athens, GA 30602, USA
| | - Frank A Feltus
- Center for Applied Genetic Technologies, University of Georgia, Athens, GA 30602, USA
| | - Andrew H Paterson
- Center for Applied Genetic Technologies, University of Georgia, Athens, GA 30602, USA
|
38
|
Bagnaresi P, Moschella A, Beretta O, Vitulli F, Ranalli P, Perata P. Heterologous microarray experiments allow the identification of the early events associated with potato tuber cold sweetening. BMC Genomics 2008; 9:176. [PMID: 18416834 PMCID: PMC2358903 DOI: 10.1186/1471-2164-9-176] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.5] [Received: 10/09/2007] [Accepted: 04/16/2008] [Indexed: 01/21/2023] Open
Abstract
Background Since its discovery more than 100 years ago, potato (Solanum tuberosum) tuber cold-induced sweetening (CIS) has been extensively investigated. Several carbohydrate-associated genes would seem to be involved in the process. However, many uncertainties still exist, as the relative contribution of each gene to the process is often unclear, possibly as the consequence of the heterogeneity of experimental systems. Some enzymes associated with CIS, such as β-amylases and invertases, have still to be identified at a sequence level. In addition, little is known about the early events that trigger CIS and on the involvement/association with CIS of genes different from carbohydrate-associated genes. Many of these uncertainties could be resolved by profiling experiments, but no GeneChip is available for the potato, and the production of the potato cDNA spotted array (TIGR) has recently been discontinued. In order to obtain an overall picture of early transcriptional events associated with CIS, we investigated whether the commercially-available tomato Affymetrix GeneChip could be used to identify which potato cold-responsive gene family members should be further studied in detail by Real-Time (RT)-PCR (qPCR). Results A tomato-potato Global Match File was generated for the interpretation of various aspects of the heterologous dataset, including the retrieval of best matching potato counterparts and annotation, and the establishment of a core set of highly homologous genes. Several cold-responsive genes were identified, and their expression pattern was studied in detail by qPCR over 26 days. We detected biphasic behaviour of mRNA accumulation for carbohydrate-associated genes and our combined GeneChip-qPCR data identified, at a sequence level, enzymatic activities such as β-amylases and invertases previously reported as being involved in CIS. 
The GeneChip data also unveiled important processes accompanying CIS, such as the induction of redox- and ethylene-associated genes. Conclusion Our Global Match File strategy proved critical for accurately interpreting heterologous datasets, and suggests that similar approaches may be fruitful for other species. Transcript profiling of early events associated with CIS revealed a complex network of events involving sugars, redox and hormone signalling which may be either linked serially or act in parallel. The identification, at a sequence level, of various enzymes long known as having a role in CIS provides molecular tools for further understanding the phenomenon.
Affiliation(s)
- Paolo Bagnaresi
- CRA-GPG, Genomic Research Center, Via S. Protaso 302, I-29017 Fiorenzuola d'Arda (PC), Italy.
|
39
|
Zhou RN, Shi R, Jiang SM, Yin WB, Wang HH, Chen YH, Hu J, Wang RRC, Zhang XQ, Hu ZM. Rapid EST isolation from chromosome 1R of rye. BMC PLANT BIOLOGY 2008; 8:28. [PMID: 18366673 PMCID: PMC2322994 DOI: 10.1186/1471-2229-8-28] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 07/01/2007] [Accepted: 03/18/2008] [Indexed: 05/26/2023]
Abstract
BACKGROUND Obtaining important expressed sequence tags (ESTs) located on specific chromosomes is currently difficult. Construction of a single-chromosome EST library could be an efficient strategy to isolate important ESTs located on specific chromosomes. In this research we developed a method to rapidly isolate ESTs from chromosome 1R of rye by combining the techniques of chromosome microdissection with hybrid specific amplification (HSA). RESULTS Chromosome 1R was isolated by a glass needle and digested with proteinase K (PK). The DNA of chromosome 1R was amplified by two rounds of PCR using a degenerate oligonucleotide 6-MW sequence with a Sau3AI digestion site as the primer. The PCR product was digested with Sau3AI and linked with adaptor HSA1, then hybridized with the Sau3AI digested cDNA with adaptor HSA2 of rye leaves with and without salicylic acid (SA) treatment, respectively. The hybridized DNA fragments were recovered by the HSA method and cloned into pMD18-T vector. The cloned inserts were released by PCR using the partial sequences in HSA1 and HSA2 as the primers and then sequenced. Of the 94 ESTs obtained and analyzed, 6 were known sequences located on rye chromosome 1R or on homologous group 1 chromosomes of wheat; all of them were highly homologous with ESTs of wheat, barley and/or other plants in Gramineae, some of which were induced by abiotic or biotic stresses. Isolated in this research were 22 ESTs with unknown functions, probably representing some new genes on rye chromosome 1R. CONCLUSION We developed a new method to rapidly clone chromosome-specific ESTs from chromosome 1R of rye. The information reported here should be useful for cloning and investigating the new genes found on chromosome 1R.
Affiliation(s)
- Ruo-Nan Zhou
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
- Graduate University of Chinese Academy of Sciences, Beijing 100049, P. R. China
| | - Rui Shi
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
- Forest Biotechnology Group, N.C. State University, Campus Box 7247, Raleigh, NC 27695-7247, USA
| | - Shu-Mei Jiang
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
- South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, P. R. China
| | - Wei-Bo Yin
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
| | - Huang-Huang Wang
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
| | - Yu-Hong Chen
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
| | - Jun Hu
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
| | - Richard RC Wang
- USDA-ARS, FRRL, Utah State University, Logan, UT 84322-6300, USA
| | - Xiang-Qi Zhang
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
| | - Zan-Min Hu
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
|
40
|
Klee EW. Data Mining for Biomarker Development: A Review of Tissue Specificity Analysis. Clin Lab Med 2008; 28:127-43, viii. [DOI: 10.1016/j.cll.2007.10.009] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/24/2022]
|
41
|
Guo B, Chen X, Dang P, Scully BT, Liang X, Holbrook CC, Yu J, Culbreath AK. Peanut gene expression profiling in developing seeds at different reproduction stages during Aspergillus parasiticus infection. BMC DEVELOPMENTAL BIOLOGY 2008; 8:12. [PMID: 18248674 PMCID: PMC2257936 DOI: 10.1186/1471-213x-8-12] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.6] [Received: 07/19/2007] [Accepted: 02/04/2008] [Indexed: 02/02/2023]
Abstract
Background Peanut (Arachis hypogaea L.) is an important crop economically and nutritionally, and is one of the host crops most susceptible to colonization by Aspergillus parasiticus and subsequent aflatoxin contamination. Knowledge from molecular genetic studies could help to devise strategies for alleviating this problem; however, few peanut DNA sequences are available in the public database. In order to understand the molecular basis of host resistance to aflatoxin contamination, a large-scale project was conducted to generate expressed sequence tags (ESTs) from developing seeds to identify resistance-related genes involved in defense response against Aspergillus infection and subsequent aflatoxin contamination. Results We constructed six different cDNA libraries derived from developing peanut seeds at three reproduction stages (R5, R6 and R7) from a resistant and a susceptible cultivated peanut genotype, 'Tifrunner' (susceptible to Aspergillus infection with higher aflatoxin contamination and resistant to TSWV) and 'GT-C20' (resistant to Aspergillus with reduced aflatoxin contamination and susceptible to TSWV). The developing peanut seed tissues were challenged by A. parasiticus and drought stress in the field. A total of 24,192 randomly selected cDNA clones from six libraries were sequenced. After removing vector sequences and quality trimming, 21,777 high-quality EST sequences were generated. Sequence clustering and assembling resulted in 8,689 unique EST sequences with 1,741 tentative consensus EST sequences (TCs) and 6,948 singleton ESTs. Functional classification was performed according to MIPS functional catalogue criteria. The unique EST sequences were divided into twenty-two categories. A similarity search against the non-redundant protein database available from NCBI indicated that 84.78% of total ESTs showed significant similarity to known proteins, of which 165 genes had been previously reported in peanuts. 
There were differences in overall expression patterns in different libraries and genotypes. A number of sequences were expressed throughout all of the libraries, representing constitutively expressed sequences. In order to identify resistance-related genes with significantly differential expression, a statistical analysis to estimate the relative abundance (R) was used to compare the relative abundance of each gene's transcripts in each cDNA library. Thirty-six and forty-seven unique EST sequences with a threshold of R > 4 from libraries of 'GT-C20' and 'Tifrunner', respectively, were selected for examination of temporal gene expression patterns according to EST frequencies. Nine and eight resistance-related genes with significant up-regulation were obtained in 'GT-C20' and 'Tifrunner' libraries, respectively. Among them, three genes were common to both genotypes. Furthermore, a comparison of our EST sequences with other plant sequences in the TIGR Gene Indices libraries showed that the percentage of peanut ESTs matched to Arabidopsis thaliana, maize (Zea mays), Medicago truncatula, rapeseed (Brassica napus), rice (Oryza sativa), soybean (Glycine max) and wheat (Triticum aestivum) ESTs ranged from 33.84% to 79.46% with sequence identity ≥ 80%. These results revealed that peanut ESTs are more closely related to legume species than to cereal crops, and more homologous to dicot than to monocot plant species. Conclusion The developed ESTs can be used to discover novel sequences or genes, to identify resistance-related genes and to detect the differences among alleles or markers between these resistant and susceptible peanut genotypes. Additionally, this large collection of cultivated peanut EST sequences will make it possible to construct microarrays for gene expression studies and for further characterization of host resistance mechanisms. It will be a valuable genomic resource for the peanut community. 
The 21,777 ESTs have been deposited to the NCBI GenBank database with accession numbers ES702769 to ES724546.
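The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the function name `assembly_summary` and the derived "pass rate" (high-quality ESTs per sequenced clone) are illustrative assumptions, not quantities the authors report.

```python
def assembly_summary(clones: int, high_quality: int,
                     consensus: int, singletons: int) -> dict:
    """Summarise an EST assembly from its reported counts.

    unique_sequences: tentative consensus sequences plus singletons.
    pass_rate_percent: high-quality ESTs as a share of sequenced clones
    (an assumed, illustrative metric).
    """
    if clones <= 0 or high_quality > clones:
        raise ValueError("expected 0 < high_quality <= clones")
    return {
        "unique_sequences": consensus + singletons,
        "pass_rate_percent": 100.0 * high_quality / clones,
    }


# Figures from the abstract: 24,192 clones, 21,777 high-quality ESTs,
# 1,741 tentative consensus sequences and 6,948 singletons.
summary = assembly_summary(24192, 21777, 1741, 6948)
print(summary["unique_sequences"])  # prints 8689
```

The sum of consensus sequences and singletons agrees with the 8,689 unique EST sequences stated in the abstract.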
Affiliation(s)
- Baozhu Guo
- USDA-ARS, Crop Protection and Management Research Unit, Tifton, Georgia 31793, USA.
|
42
|
Paiva JAP, Garcés M, Alves A, Garnier-Géré P, Rodrigues JC, Lalanne C, Porcon S, Le Provost G, Da Silva Perez D, Brach J, Frigerio JM, Claverol S, Barré A, Fevereiro P, Plomion C. Molecular and phenotypic profiling from the base to the crown in maritime pine wood-forming tissue. THE NEW PHYTOLOGIST 2008; 178:283-301. [PMID: 18298434 DOI: 10.1111/j.1469-8137.2008.02379.x] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 05/03/2023]
Abstract
Environmental, developmental and genetic factors affect variation in wood properties at the chemical, anatomical and physical levels. Here, the phenotypic variation observed along the tree stem was explored and the hypothesis tested that this variation could be the result of the differential expression of genes/proteins during wood formation. Differentiating xylem samples of maritime pine (Pinus pinaster) were collected from the top (crown wood, CW) to the bottom (base wood, BW) of adult trees. These samples were characterized by Fourier transform infrared spectroscopy (FTIR) and analytical pyrolysis. Two main groups of samples, corresponding to CW and BW, could be distinguished from cell wall chemical composition. A genomic approach, combining large-scale production of expressed sequence tags (ESTs), gene expression profiling and quantitative proteomics analysis, allowed identification of 262 unigenes (out of 3512) and 231 proteins (out of 1372 spots) that were differentially expressed along the stem. A good relationship was found between functional categories from transcriptomic and proteomic data. A good fit between the molecular mechanisms involved in CW-BW formation and these two types of wood phenotypic differences was also observed. This work provides a list of candidate genes for wood properties that will be tested in forward genetics.
Affiliation(s)
- Jorge A P Paiva
- INRA, UMR 1202, Biodiversity Genes and Communities, 69 route d'Arcachon, F-33610 Cestas, France
- Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Av. da República-EAN, 2780-157 Oeiras, Portugal
- Tropical Research Institute of Portugal (IICT), Forest and Forest Products Centre, Tapada da Ajuda, 1349-017 Lisboa, Portugal
| | - Marcelo Garcés
- INRA, UMR 1202, Biodiversity Genes and Communities, 69 route d'Arcachon, F-33610 Cestas, France
- Instituto de Biología Vegetal y Biotecnología. Universidad de Talca, Chile
| | - Ana Alves
- Tropical Research Institute of Portugal (IICT), Forest and Forest Products Centre, Tapada da Ajuda, 1349-017 Lisboa, Portugal
- Centro de Estudos Florestais, Departamento de Engenharia Florestal, Instituto Superior de Agronomia, ISA-DEF, Tapada Ajuda, 1349-017 Lisboa, Portugal
| | - Pauline Garnier-Géré
- INRA, UMR 1202, Biodiversity Genes and Communities, 69 route d'Arcachon, F-33610 Cestas, France
| | - José Carlos Rodrigues
- Tropical Research Institute of Portugal (IICT), Forest and Forest Products Centre, Tapada da Ajuda, 1349-017 Lisboa, Portugal
- Centro de Estudos Florestais, Departamento de Engenharia Florestal, Instituto Superior de Agronomia, ISA-DEF, Tapada Ajuda, 1349-017 Lisboa, Portugal
| | - Céline Lalanne
- INRA, UMR 1202, Biodiversity Genes and Communities, 69 route d'Arcachon, F-33610 Cestas, France
| | - Stéphane Porcon
- INRA, UMR 1202, Biodiversity Genes and Communities, 69 route d'Arcachon, F-33610 Cestas, France
| | - Grégoire Le Provost
- INRA, UMR 1202, Biodiversity Genes and Communities, 69 route d'Arcachon, F-33610 Cestas, France
| | - Denilson Da Silva Perez
- FCBA InTechFibres, Laboratoire Bois Process, Domaine Universitaire, BP 251, 38044 Grenoble, Cedex 9, France
| | - Jean Brach
- INRA, UMR 1202, Biodiversity Genes and Communities, 69 route d'Arcachon, F-33610 Cestas, France
| | - Jean-Marc Frigerio
- INRA, UMR 1202, Biodiversity Genes and Communities, 69 route d'Arcachon, F-33610 Cestas, France
| | | | - Aurélien Barré
- Centre de Bioinformatique Bordeaux, Université Victor Segalen Bordeaux 2, rue Léo Saignat, 33076 Bordeaux Cedex, France
| | - Pedro Fevereiro
- Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Av. da República-EAN, 2780-157 Oeiras, Portugal
- Departamento de Biologia Vegetal, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, 1700 Lisboa, Portugal
| | - Christophe Plomion
- INRA, UMR 1202, Biodiversity Genes and Communities, 69 route d'Arcachon, F-33610 Cestas, France
|
43
|
Gorodkin J, Cirera S, Hedegaard J, Gilchrist MJ, Panitz F, Jørgensen C, Scheibye-Knudsen K, Arvin T, Lumholdt S, Sawera M, Green T, Nielsen BJ, Havgaard JH, Rosenkilde C, Wang J, Li H, Li R, Liu B, Hu S, Dong W, Li W, Yu J, Wang J, Stærfeldt HH, Wernersson R, Madsen LB, Thomsen B, Hornshøj H, Bujie Z, Wang X, Wang X, Bolund L, Brunak S, Yang H, Bendixen C, Fredholm M. Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags. Genome Biol 2007; 8:R45. [PMID: 17407547 PMCID: PMC1895994 DOI: 10.1186/gb-2007-8-4-r45] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.4] [Received: 09/08/2006] [Revised: 01/18/2007] [Accepted: 04/02/2007] [Indexed: 12/05/2022] Open
Abstract
A resource consisting of one million porcine ESTs is described, providing an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies. Background Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages. Results Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. Conclusion This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies.
Affiliation(s)
- Jan Gorodkin
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Susanna Cirera
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Jakob Hedegaard
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Michael J Gilchrist
- The Wellcome Trust/Cancer Research UK Gurdon Institute, Cambridge, CB2 1QN, UK
| | - Frank Panitz
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Claus Jørgensen
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Karsten Scheibye-Knudsen
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Troels Arvin
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Steen Lumholdt
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Milena Sawera
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Trine Green
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Bente J Nielsen
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Jakob H Havgaard
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Carina Rosenkilde
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Jun Wang
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
- Institute of Human Genetics, University of Aarhus, Nordre Ringgade 1, DK-8000 Aarhus C, Denmark
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campus Vej 55, DK-5230 Odense M, Denmark
| | - Heng Li
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
- Institute of Human Genetics, University of Aarhus, Nordre Ringgade 1, DK-8000 Aarhus C, Denmark
| | - Ruiqiang Li
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campus Vej 55, DK-5230 Odense M, Denmark
| | - Bin Liu
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
| | - Songnian Hu
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
| | - Wei Dong
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
| | - Wei Li
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
| | - Jun Yu
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
| | - Jian Wang
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
| | - Hans-Henrik Stærfeldt
- Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, DK-2800 Lyngby, Denmark
| | - Rasmus Wernersson
- Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, DK-2800 Lyngby, Denmark
| | - Lone B Madsen
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Bo Thomsen
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Henrik Hornshøj
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Zhan Bujie
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Xuegang Wang
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Xuefei Wang
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Lars Bolund
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
- Institute of Human Genetics, University of Aarhus, Nordre Ringgade 1, DK-8000 Aarhus C, Denmark
| | - Søren Brunak
- Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, DK-2800 Lyngby, Denmark
| | - Huanming Yang
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
| | - Christian Bendixen
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Merete Fredholm
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
|
44
|
Kong L, Lv Z, Chen J, Nie Z, Wang D, Shen H, Wang X, Wu X, Zhang Y. Expression analysis and tissue distribution of two 14-3-3 proteins in silkworm (Bombyx mori). Biochim Biophys Acta Gen Subj 2007; 1770:1598-604. [PMID: 17949913 DOI: 10.1016/j.bbagen.2007.08.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 02/14/2007] [Revised: 08/08/2007] [Accepted: 08/10/2007] [Indexed: 11/27/2022]
Abstract
14-3-3 proteins, which have been identified in a wide variety of eukaryotes, are highly conserved acidic proteins. In this study, we identified two genes in silkworm that encode 14-3-3 proteins (Bm14-3-3zeta and Bm14-3-3epsilon). The category of each 14-3-3 protein was assigned by phylogenetic analysis. Bm14-3-3zeta shared 90% identity with its Drosophila counterpart, while Bm14-3-3epsilon shared 86% identity with its Drosophila counterpart. According to Western blot and real-time PCR analysis, Bm14-3-3zeta expression levels were higher than those of Bm14-3-3epsilon in the seven tissues and four silkworm developmental stages examined. Bm14-3-3zeta was expressed during every stage of silkworm development and in every tissue of the fifth instar larvae that was examined, but Bm14-3-3epsilon expression was not detected in eggs or heads of the fifth instar larvae. Both 14-3-3 proteins were highly expressed in silk glands. These results suggest that Bm14-3-3zeta expression is universal and continuous, while Bm14-3-3epsilon expression is tissue- and stage-specific. Based on tissue expression patterns and the known functions of 14-3-3 proteins, it may be that both 14-3-3 proteins are involved in the regulation of gene expression in silkworm silk glands.
Affiliation(s)
- Lingyin Kong
- College of Life Sciences, Zhejiang Sci-Tech University, Hangzhou 310018, China
|
45
|
Affiliation(s)
- Pablo D Rabinowicz
- J. C. Venter Institute, 9712 Medical Center Drive, Rockville, Maryland 20850, USA.
|
46
|
Venu RC, Jia Y, Gowda M, Jia MH, Jantasuriyarat C, Stahlberg E, Li H, Rhineheart A, Boddhireddy P, Singh P, Rutger N, Kudrna D, Wing R, Nelson JC, Wang GL. RL-SAGE and microarray analysis of the rice transcriptome after Rhizoctonia solani infection. Mol Genet Genomics 2007; 278:421-31. [PMID: 17579886 DOI: 10.1007/s00438-007-0260-y] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.7] [Received: 03/20/2007] [Accepted: 05/14/2007] [Indexed: 11/30/2022]
Abstract
Sheath blight caused by the fungal pathogen Rhizoctonia solani is an emerging problem in rice production worldwide. To elucidate the molecular basis of rice defense to the pathogen, RNA isolated from R. solani-infected leaves of Jasmine 85 was used for both RL-SAGE library construction and microarray hybridization. RL-SAGE sequence analysis identified 20,233 and 24,049 distinct tags from the control and inoculated libraries, respectively. Nearly half of the significant tags (≥2 copies) from both libraries matched TIGR annotated genes and KOME full-length cDNAs. Among them, 42% represented sense and 7% antisense transcripts, respectively. Interestingly, 60% of the library-specific (≥10 copies) and differentially expressed (>4.0-fold change) tags were novel transcripts matching genomic sequence but not annotated genes. About 70% of the genes identified in the SAGE libraries showed similar expression patterns (up or down-regulated) in the microarray data obtained from three biological replications. Some candidate RL-SAGE tags and microarray genes were located in known sheath blight QTL regions. The expression of ten differentially expressed RL-SAGE tags was confirmed with RT-PCR. The defense genes associated with resistance to R. solani identified in this study are useful genomic materials for further elucidation of the molecular basis of the defense response to R. solani and fine mapping of target sheath blight QTLs.
Affiliation(s)
- R C Venu
- Department of Plant Pathology, The Ohio State University, 201 Kottman Hall, 2021 Coffey Rd, Columbus, OH 43210, USA
|
47
|
Li L, Cheng H, Gai J, Yu D. Genome-wide identification and characterization of putative cytochrome P450 genes in the model legume Medicago truncatula. Planta 2007; 226:109-23. [PMID: 17273868 DOI: 10.1007/s00425-006-0473-z] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Received: 08/01/2006] [Accepted: 12/20/2006] [Indexed: 05/13/2023]
Abstract
In plants, cytochrome P450 is a group of monooxygenases that exists as a gene superfamily and plays important roles in metabolizing physiologically important compounds. However, to date only a limited number of P450s have been identified and characterized in legumes. In this study, data mining methods were used to identify 151 putative P450 genes in the model legume Medicago truncatula, including 135 novel sequences. These genes were classified into 9 clans and 44 families by sequence similarity; among these were 4 new clans and 21 new families not previously reported in legumes. Comparison of these genes with P450 genes in Arabidopsis and rice showed that most of the known P450 families in dicot species exist in M. truncatula. The representative protein sequences of putative P450s were aligned, and the secondary elements were assigned based on the known structure P450BM3. Putative substrate recognition sites (SRSs) and substrate binding sites were also identified in these sequences. In addition, the EST-derived expression profiles (digital Northern) of the putative P450 genes were analyzed and confirmed by semi-quantitative RT-PCR analyses of several selected P450 genes. These results will provide a base for catalogue information on P450 genes in M. truncatula and for further functional analysis of P450 superfamily genes in legumes.
Affiliation(s)
- Lingyong Li
- National Center for Soybean Improvement, National Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, 210095, People's Republic of China
|
48
|
Fierro AC, Thuret R, Coen L, Perron M, Demeneix BA, Wegnez M, Gyapay G, Weissenbach J, Wincker P, Mazabraud A, Pollet N. Exploring nervous system transcriptomes during embryogenesis and metamorphosis in Xenopus tropicalis using EST analysis. BMC Genomics 2007; 8:118. [PMID: 17506875 PMCID: PMC1890556 DOI: 10.1186/1471-2164-8-118] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 11/17/2006] [Accepted: 05/16/2007] [Indexed: 11/26/2022] Open
Abstract
Background The western African clawed frog Xenopus tropicalis is an anuran amphibian species now used as a model in vertebrate comparative genomics. It provides the same advantages as Xenopus laevis but is diploid and has a smaller genome of 1.7 Gbp. Therefore X. tropicalis is more amenable to systematic transcriptome surveys. We initiated a large-scale partial cDNA sequencing project to provide a functional genomics resource on genes expressed in the nervous system during early embryogenesis and metamorphosis in X. tropicalis. Results A gene index was defined and analysed after the collection of over 48,785 high-quality sequences. These partial cDNA sequences were obtained from an embryonic head and retina library (30,272 sequences) and from a metamorphic brain and spinal cord library (27,602 sequences). These ESTs are estimated to represent 9,693 transcripts derived from an estimated 6,000 genes. Comparison of these cDNA sequences with protein databases indicates that 46% contain their start codon. Further annotation included Gene Ontology functional classification, InterPro domain analysis, alternative splicing and non-coding RNA identification. Gene expression profiles were derived from EST counts and used to define transcripts specific to metamorphic stages of development. Moreover, these ESTs allowed identification of a set of 225 polymorphic microsatellites that can be used as genetic markers. Conclusion These cDNA sequences permit in silico cloning of numerous genes and will facilitate studies aimed at deciphering the roles of cognate genes expressed in the nervous system during neural development and metamorphosis. The genomic resources developed to study X. tropicalis biology will accelerate exploration of amphibian physiology and genetics. In particular, the model will facilitate analysis of key questions related to anuran embryogenesis and metamorphosis and its associated regulatory processes.
Affiliation(s)
- Ana C Fierro
- CNRS UMR 8080, F-91405 Orsay, France
- Univ Paris Sud, F-91405 Orsay, France
- Programme d'Épigénomique, Univ Evry, Tour Évry 2, 10è étage, 523 Terrasses de l'Agora, 91034 Evry cedex, France
- Raphaël Thuret
- CNRS UMR 8080, F-91405 Orsay, France
- Univ Paris Sud, F-91405 Orsay, France
- Laurent Coen
- CNRS UMR 5166, Evolution des Régulations Endocriniennes, USM 501, Département Régulations, Développement et Diversité Moléculaire, Muséum National d'Histoire Naturelle, 7 rue Cuvier, 75231 Paris Cedex 5, France
- Muriel Perron
- CNRS UMR 8080, F-91405 Orsay, France
- Univ Paris Sud, F-91405 Orsay, France
- Barbara A Demeneix
- CNRS UMR 5166, Evolution des Régulations Endocriniennes, USM 501, Département Régulations, Développement et Diversité Moléculaire, Muséum National d'Histoire Naturelle, 7 rue Cuvier, 75231 Paris Cedex 5, France
- Maurice Wegnez
- CNRS UMR 8080, F-91405 Orsay, France
- Univ Paris Sud, F-91405 Orsay, France
- Gabor Gyapay
- Genoscope and CNRS UMR 8030, 2 rue Gaston Crémieux CP5706, 91057 Evry, France
- Jean Weissenbach
- Genoscope and CNRS UMR 8030, 2 rue Gaston Crémieux CP5706, 91057 Evry, France
- Patrick Wincker
- Genoscope and CNRS UMR 8030, 2 rue Gaston Crémieux CP5706, 91057 Evry, France
- André Mazabraud
- CNRS UMR 8080, F-91405 Orsay, France
- Univ Paris Sud, F-91405 Orsay, France
- Nicolas Pollet
- CNRS UMR 8080, F-91405 Orsay, France
- Univ Paris Sud, F-91405 Orsay, France
- Programme d'Épigénomique, Univ Evry, Tour Évry 2, 10è étage, 523 Terrasses de l'Agora, 91034 Evry cedex, France
|
49
|
Malik MR, Wang F, Dirpaul JM, Zhou N, Polowick PL, Ferrie AMR, Krochko JE. Transcript profiling and identification of molecular markers for early microspore embryogenesis in Brassica napus. Plant Physiology 2007; 144:134-54. [PMID: 17384168 PMCID: PMC1913795 DOI: 10.1104/pp.106.092932] [Citation(s) in RCA: 92] [Impact Index Per Article: 5.1] [Received: 11/16/2006] [Accepted: 03/10/2007] [Indexed: 05/14/2023]
Abstract
Isolated microspores of Brassica napus are developmentally programmed to form gametes; however, microspores can be reprogrammed through stress treatments to undergo appropriate divisions and form embryos. We are interested in the identification and isolation of factors and genes associated with the induction and establishment of embryogenesis in isolated microspores. Standard and normalized cDNA libraries, as well as subtractive cDNA libraries, were constructed from freshly isolated microspores (0 h) and microspores cultured for 3, 5, or 7 d under embryogenesis-inducing conditions. Library comparison tools were used to identify shifts in metabolism across this time course. Detailed expressed sequence tag analyses of 3 and 5 d cultures indicate that most sequences are related to pollen-specific genes. However, semiquantitative and real-time reverse transcription-polymerase chain reaction analyses at the initial stages of embryo induction also reveal expression of embryogenesis-related genes such as BABYBOOM1, LEAFY COTYLEDON1 (LEC1), and LEC2 as early as 2 to 3 d of microspore culture. Sequencing results suggest that embryogenesis is clearly established in a subset of the microspores by 7 d of culture and that this time point is optimal for isolation of embryo-specific expressed sequence tags such as ABSCISIC ACID INSENSITIVE3, ATS1, LEC1, LEC2, and FUSCA3. Following extensive polymerase chain reaction-based expression profiling, 16 genes were identified as unequivocal molecular markers for microspore embryogenesis in B. napus. These molecular marker genes also show expression during zygotic embryogenesis, underscoring the common developmental pathways that function in zygotic and gametic embryogenesis. The quantitative expression values of several of these molecular marker genes are shown to be predictive of embryogenic potential in B. napus cultivars (e.g. 'Topas' DH4079, 'Allons,' 'Westar,' 'Garrison').
Affiliation(s)
- Meghna R Malik
- Plant Biotechnology Institute, National Research Council of Canada, Saskatoon, Saskatchewan, Canada
|
50
|
Neretti N, Remondini D, Tatar M, Sedivy JM, Pierini M, Mazzatti D, Powell J, Franceschi C, Castellani GC. Correlation analysis reveals the emergence of coherence in the gene expression dynamics following system perturbation. BMC Bioinformatics 2007; 8 Suppl 1:S16. [PMID: 17430560 PMCID: PMC1885845 DOI: 10.1186/1471-2105-8-s1-s16] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022] Open
Abstract
Time course gene expression experiments are a popular means to infer co-expression. Many methods have been proposed to cluster genes or to build networks based on similarity measures of their expression dynamics. In this paper we apply a correlation-based approach to network reconstruction to three datasets of time series gene expression following system perturbation: 1) conditional, tamoxifen-dependent activation of the cMyc proto-oncogene in rat fibroblasts; 2) genomic response to nutrition changes in D. melanogaster; 3) patterns of gene activity as a consequence of ageing occurring over a life-span time series (25y–90y) sampled from T-cells of human donors. We show that the three datasets undergo similar transitions from an "uncorrelated" regime to a positively or negatively correlated one that is symptomatic of a shift from a "ground" or "basal" state to a "polarized" state. In addition, we show that a similar transition is conserved at the pathway level, and that this information can be used for the construction of "meta-networks" where it is possible to assess new relations among functionally distant sets of molecular functions.
Affiliation(s)
- Nicola Neretti
- Institute for Brain and Neural Systems, Brown University, Providence RI, USA
- Centro Interdipartimentale "L. Galvani", Università di Bologna, Bologna, Italy
- Daniel Remondini
- Centro Interdipartimentale "L. Galvani", Università di Bologna, Bologna, Italy
- DIMORFIPA, Università di Bologna, Bologna, Italy
- Marc Tatar
- Department of Ecology and Evolutionary Biology, Brown University, Providence RI, USA
- John M Sedivy
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence RI, USA
- Michela Pierini
- Centro Interdipartimentale "L. Galvani", Università di Bologna, Bologna, Italy
- Claudio Franceschi
- Centro Interdipartimentale "L. Galvani", Università di Bologna, Bologna, Italy
- I.N.R.C.A., Department of Gerontological Sciences, via Birarelli 8, 60121 Ancona, Italy
- Gastone C Castellani
- Institute for Brain and Neural Systems, Brown University, Providence RI, USA
- Centro Interdipartimentale "L. Galvani", Università di Bologna, Bologna, Italy
- DIMORFIPA, Università di Bologna, Bologna, Italy
|