Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Haas BJ, Volfovsky N, Town CD, Troukhan M, Alexandrov N, Feldmann KA, Flavell RB, White O, Salzberg SL. Full-length messenger RNA sequences greatly improve genome annotation. Genome Biol 2002;3:RESEARCH0029. [PMID: 12093376 PMCID: PMC116726 DOI: 10.1186/gb-2002-3-6-research0029] [Citation(s) in RCA: 126] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/27/2001] [Revised: 03/14/2002] [Accepted: 04/19/2002] [Indexed: 11/10/2022] Open

Cited by Other Article(s)

Carbonnel S, Cornelis S, Hazak O. The CLE33 peptide represses phloem differentiation via autocrine and paracrine signaling in Arabidopsis. Commun Biol 2023;6:588. [PMID: 37280369 DOI: 10.1038/s42003-023-04972-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2023] [Accepted: 05/23/2023] [Indexed: 06/08/2023] Open

Flavell RB. Perspective: 50 years of plant chromosome biology. PLANT PHYSIOLOGY 2021;185:731-753. [PMID: 33604616 PMCID: PMC8133586 DOI: 10.1093/plphys/kiaa108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/15/2020] [Accepted: 12/04/2020] [Indexed: 06/12/2023]

Zebell SG. A broad view: Dick Flavell. PLANT PHYSIOLOGY 2021;185:727-730. [PMID: 33822223 PMCID: PMC8133605 DOI: 10.1093/plphys/kiaa111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]

In silico identification and structure function analysis of a putative coclaurine N-methyltransferase from Aristolochia fimbriata. Comput Biol Chem 2020;85:107201. [DOI: 10.1016/j.compbiolchem.2020.107201] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2019] [Revised: 12/31/2019] [Accepted: 01/08/2020] [Indexed: 11/22/2022]

Yi X, Yang Y, Wu P, Xu X, Li W. Alternative splicing events during adipogenesis from hMSCs. J Cell Physiol 2019;235:304-316. [PMID: 31206189 DOI: 10.1002/jcp.28970] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2019] [Revised: 05/28/2019] [Accepted: 05/29/2019] [Indexed: 12/22/2022]

Davies JP, Christensen CA. Developing Transgenic Agronomic Traits for Crops: Targets, Methods, and Challenges. Methods Mol Biol 2019;1864:343-365. [PMID: 30415346 DOI: 10.1007/978-1-4939-8778-8_22] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]

Phylogenetic analyses and in-seedling expression of ammonium and nitrate transporters in wheat. Sci Rep 2018;8:7082. [PMID: 29728590 PMCID: PMC5935732 DOI: 10.1038/s41598-018-25430-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2017] [Accepted: 04/18/2018] [Indexed: 02/03/2023] Open

Rawal HC, Kumar S, Mithra S V A, Solanke AU, Nigam D, Saxena S, Tyagi A, V S, Yadav NR, Kalia P, Singh NP, Singh NK, Sharma TR, Gaikwad K. High Quality Unigenes and Microsatellite Markers from Tissue Specific Transcriptome and Development of a Database in Clusterbean (Cyamopsis tetragonoloba, L. Taub). Genes (Basel) 2017;8:genes8110313. [PMID: 29120386 PMCID: PMC5704226 DOI: 10.3390/genes8110313] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2017] [Revised: 10/23/2017] [Accepted: 11/06/2017] [Indexed: 12/23/2022] Open

Chan KL, Rosli R, Tatarinova TV, Hogan M, Firdaus-Raih M, Low ETL. Seqping: gene prediction pipeline for plant genomes using self-training gene models and transcriptomic data. BMC Bioinformatics 2017;18:1426. [PMID: 28466793 PMCID: PMC5333190 DOI: 10.1186/s12859-016-1426-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Gene prediction is one of the most important steps in the genome annotation process. A large number of software tools and pipelines developed by various computing techniques are available for gene prediction. However, these systems have yet to accurately predict all or even most of the protein-coding regions. Furthermore, none of the currently available gene-finders has a universal Hidden Markov Model (HMM) that can perform gene prediction for all organisms equally well in an automatic fashion.

RESULTS

We present an automated gene prediction pipeline, Seqping, that uses self-training HMM models and transcriptomic data. The pipeline processes the genome and transcriptome sequences of the target species using the GlimmerHMM, SNAP, and AUGUSTUS pipelines, followed by the MAKER2 program to combine predictions from the three tools in association with the transcriptomic evidence. Seqping generates species-specific HMMs that are able to offer unbiased gene predictions. The pipeline was evaluated using the Oryza sativa and Arabidopsis thaliana genomes. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis showed that the pipeline was able to identify at least 95% of BUSCO's plantae dataset. Our evaluation shows that Seqping was able to generate better gene predictions compared to three HMM-based programs (MAKER2, GlimmerHMM and AUGUSTUS) using their respective available HMMs. Seqping had the highest accuracy in rice (0.5648 for CDS, 0.4468 for exon, and 0.6695 for nucleotide structure) and A. thaliana (0.5808 for CDS, 0.5955 for exon, and 0.8839 for nucleotide structure).

CONCLUSIONS

Seqping provides researchers a seamless pipeline to train species-specific HMMs and predict genes in newly sequenced or less-studied genomes. We conclude that the Seqping pipeline predictions are more accurate than gene predictions using the other three approaches with the default or available HMMs.

Genome-Wide Identification and Characterization of the LRR-RLK Gene Family in Two Vernicia Species. Int J Genomics 2015;2015:823427. [PMID: 26783513 PMCID: PMC4691485 DOI: 10.1155/2015/823427] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2015] [Accepted: 11/17/2015] [Indexed: 11/17/2022] Open

Zhang X, Feng H, Feng C, Xu H, Huang X, Wang Q, Duan X, Wang X, Wei G, Huang L, Kang Z. Isolation and characterisation of cDNA encoding a wheat heavy metal-associated isoprenylated protein involved in stress responses. PLANT BIOLOGY (STUTTGART, GERMANY) 2015;17:1176-86. [PMID: 25951496 DOI: 10.1111/plb.12344] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/23/2015] [Accepted: 05/01/2015] [Indexed: 05/03/2023]

Chauhan R, Jasrai Y, Pandya H. In Silico Analysis for Five Major Cereal Crops Phytocystatins. Interdiscip Sci 2015;7:233-41. [PMID: 26267706 DOI: 10.1007/s12539-015-0264-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2013] [Revised: 01/15/2014] [Accepted: 02/07/2014] [Indexed: 11/28/2022]

Warren RL, Keeling CI, Yuen MMS, Raymond A, Taylor GA, Vandervalk BP, Mohamadi H, Paulino D, Chiu R, Jackman SD, Robertson G, Yang C, Boyle B, Hoffmann M, Weigel D, Nelson DR, Ritland C, Isabel N, Jaquish B, Yanchuk A, Bousquet J, Jones SJM, MacKay J, Birol I, Bohlmann J. Improved white spruce (Picea glauca) genome assemblies and annotation of large gene families of conifer terpenoid and phenolic defense metabolism. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2015;83:189-212. [PMID: 26017574 DOI: 10.1111/tpj.12886] [Citation(s) in RCA: 122] [Impact Index Per Article: 13.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/24/2015] [Accepted: 05/15/2015] [Indexed: 05/21/2023]

Affiliation(s)

René L Warren Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Christopher I Keeling Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
Macaire Man Saint Yuen Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
Anthony Raymond Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Greg A Taylor Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Benjamin P Vandervalk Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Hamid Mohamadi Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Daniel Paulino Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Readman Chiu Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Shaun D Jackman Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Gordon Robertson Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Chen Yang Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Brian Boyle Department of Wood and Forest Sciences, Université Laval, Québec, QC, G1V 0A6, Canada
Margarete Hoffmann Max Planck Institute for Developmental Biology, Spemannstrasse 35, 72076, Tübingen, Germany
Detlef Weigel Max Planck Institute for Developmental Biology, Spemannstrasse 35, 72076, Tübingen, Germany
David R Nelson Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Science Center, Memphis, TN, 38163, USA
Carol Ritland Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
Nathalie Isabel Natural Resources Canada, Laurentian Forestry Centre, Québec, QC, G1V 4C7, Canada
Barry Jaquish British Columbia Ministry of Forests, Lands, and Natural Resource Operations, Victoria, BC, V8W 9C2, Canada
Alvin Yanchuk British Columbia Ministry of Forests, Lands, and Natural Resource Operations, Victoria, BC, V8W 9C2, Canada
Jean Bousquet Department of Wood and Forest Sciences, Université Laval, Québec, QC, G1V 0A6, Canada
Steven J M Jones Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada School of Computing Science, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
John MacKay Department of Wood and Forest Sciences, Université Laval, Québec, QC, G1V 0A6, Canada Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, OX1 3RB, UK
Inanc Birol Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada School of Computing Science, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
Joerg Bohlmann Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada Department of Botany, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada

Rodríguez-García MJ, Machado V, Galián J. Identification and characterisation of putative seminal fluid proteins from male reproductive tissue EST libraries in tiger beetles. BMC Genomics 2015;16:391. [PMID: 25981911 PMCID: PMC4434525 DOI: 10.1186/s12864-015-1619-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/25/2014] [Accepted: 05/05/2015] [Indexed: 11/10/2022] Open

Yao QY, Xia EH, Liu FH, Gao LZ. Genome-wide identification and comparative expression analysis reveal a rapid expansion and functional divergence of duplicated genes in the WRKY gene family of cabbage, Brassica oleracea var. capitata. Gene 2014;557:35-42. [PMID: 25481634 DOI: 10.1016/j.gene.2014.12.005] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2014] [Revised: 11/28/2014] [Accepted: 12/02/2014] [Indexed: 12/18/2022]

High-throughput sequencing and de novo assembly of Brassica oleracea var. Capitata L. for transcriptome analysis. PLoS One 2014;9:e92087. [PMID: 24682075 PMCID: PMC3969326 DOI: 10.1371/journal.pone.0092087] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2013] [Accepted: 02/18/2014] [Indexed: 12/28/2022] Open

Abstract

Background

The cabbage, Brassica oleracea var. capitata L., has a distinguishable phenotype within the genus Brassica. Despite the economic and genetic importance of cabbage, there is little genomic data for cabbage, and most studies of Brassica are focused on other species or other B. oleracea subspecies. The lack of genomic data for cabbage, a non-model organism, hinders research on its molecular biology. Hence, the construction of reliable transcriptomic data based on high-throughput sequencing technologies is needed to enhance our understanding of cabbage and provide genomic information for future work.

Methodology/Principal Findings

We constructed cDNAs from total RNA isolated from the roots, leaves, flowers, seedlings, and calcium-limited seedling tissues of two cabbage genotypes: 102043 and 107140. We sequenced a total of six different samples using the Illumina HiSeq platform, producing 40.5 Gbp of sequence data comprising 401,454,986 short reads. We assembled 205,046 transcripts (≥ 200 bp) using the Velvet and Oases assemblers and predicted 53,562 loci from the transcripts. We annotated 35,274 of the loci with 55,916 plant peptides in the Phytozome database. The average length of the annotated loci was 1,419 bp. We confirmed the reliability of the sequencing assembly using reverse-transcriptase PCR to identify tissue-specific gene candidates among the annotated loci.

Conclusion

Our study provides valuable transcriptome sequence data for B. oleracea var. capitata L., offering a new resource for studying B. oleracea and closely related species. Our transcriptomic sequences will enhance the quality of gene annotation and functional analysis of the cabbage genome and serve as a material basis for future genomic research on cabbage. The sequencing data from this study can be used to develop molecular markers and to identify the extreme differences among the phenotypes of different species in the genus Brassica.

The function and properties of the transcriptional regulator COS1 in Magnaporthe oryzae. Fungal Biol 2013;117:239-49. [DOI: 10.1016/j.funbio.2013.01.010] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2012] [Revised: 12/22/2012] [Accepted: 01/27/2013] [Indexed: 11/20/2022]

Gibson AK, Smith Z, Fuqua C, Clay K, Colbourne JK. Why so many unknown genes? Partitioning orphans from a representative transcriptome of the lone star tick Amblyomma americanum. BMC Genomics 2013;14:135. [PMID: 23445305 PMCID: PMC3616916 DOI: 10.1186/1471-2164-14-135] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2012] [Accepted: 02/21/2013] [Indexed: 11/10/2022] Open

Abstract

Background

Genomic resources within the phylum Arthropoda are largely limited to the true insects but are beginning to include unexplored subphyla, such as the Crustacea and Chelicerata. Investigations of these understudied taxa uncover high frequencies of orphan genes, which lack detectable sequence homology to genes in pre-existing databases. The ticks (Acari: Chelicerata) are one such understudied taxon for which genomic resources are urgently needed. Ticks are obligate blood-feeders that vector major diseases of humans, domesticated animals, and wildlife. In analyzing a transcriptome of the lone star tick Amblyomma americanum, one of the most abundant disease vectors in the United States, we find a high representation of unannotated sequences. We apply a general framework for quantifying the origin and true representation of unannotated sequences in a dataset and for evaluating the biological significance of orphan genes.

Results

Expressed sequence tags (ESTs) were derived from different life stages and populations of A. americanum and combined with ESTs available from GenBank to produce 14,310 ESTs, over twice the number previously available. The vast majority (71%) has no sequence homology to proteins archived in UniProtKB. We show that poor sequence or assembly quality is not a major contributor to this high representation by orphan genes. Moreover, most unannotated sequences are functional: a microarray experiment demonstrates that 59% of functional ESTs are unannotated. Lastly, we attempt to further annotate our EST dataset using genomic datasets from other members of the Acari, including Ixodes scapularis, four other tick species and the mite Tetranychus urticae. We find low homology with these species, consistent with significant divergence within this subclass.

Conclusions

We conclude that the abundance of orphan genes in A. americanum likely results from 1) taxonomic isolation stemming from divergence within the tick lineage and limited genomic resources for ticks and 2) lineage-specific genes needing functional genomic studies to evaluate their association with the unique biology of ticks. The EST sequences described here will contribute substantially to the development of tick genomics. Moreover, the framework provided for the evaluation of orphan genes can guide analyses of future transcriptome sequencing projects.

Ahmed NU, Park JI, Jung HJ, Seo MS, Kumar TS, Lee IH, Nou IS. Identification and characterization of stress resistance related genes of Brassica rapa. Biotechnol Lett 2012;34:979-87. [PMID: 22286206 DOI: 10.1007/s10529-012-0860-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2011] [Accepted: 01/03/2012] [Indexed: 11/29/2022]

Han B, Xu S, Xie YJ, Huang JJ, Wang LJ, Yang Z, Zhang CH, Sun Y, Shen WB, Xie GS. ZmHO-1, a maize haem oxygenase-1 gene, plays a role in determining lateral root development. PLANT SCIENCE : AN INTERNATIONAL JOURNAL OF EXPERIMENTAL PLANT BIOLOGY 2012;184:63-74. [PMID: 22284711 DOI: 10.1016/j.plantsci.2011.12.012] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/07/2011] [Revised: 12/08/2011] [Accepted: 12/15/2011] [Indexed: 05/04/2023]

Li Z, Zhang Z, Yan P, Huang S, Fei Z, Lin K. RNA-Seq improves annotation of protein-coding genes in the cucumber genome. BMC Genomics 2011;12:540. [PMID: 22047402 PMCID: PMC3219749 DOI: 10.1186/1471-2164-12-540] [Citation(s) in RCA: 124] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2011] [Accepted: 11/02/2011] [Indexed: 01/02/2023] Open

Haas BJ, Zeng Q, Pearson MD, Cuomo CA, Wortman JR. Approaches to Fungal Genome Annotation. Mycology 2011;2:118-141. [PMID: 22059117 PMCID: PMC3207268 DOI: 10.1080/21501203.2011.606851] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023] Open

Xia Z, Xu H, Zhai J, Li D, Luo H, He C, Huang X. RNA-Seq analysis and de novo transcriptome assembly of Hevea brasiliensis. PLANT MOLECULAR BIOLOGY 2011;77:299-308. [PMID: 21811850 DOI: 10.1007/s11103-011-9811-z] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/27/2010] [Accepted: 07/14/2011] [Indexed: 05/05/2023]

Rigault P, Boyle B, Lepage P, Cooke JEK, Bousquet J, MacKay JJ. A white spruce gene catalog for conifer genome analyses. PLANT PHYSIOLOGY 2011;157:14-28. [PMID: 21730200 PMCID: PMC3165865 DOI: 10.1104/pp.111.179663] [Citation(s) in RCA: 86] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/09/2011] [Accepted: 06/24/2011] [Indexed: 05/18/2023]

Abstract

Several angiosperm plant genomes, including Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), poplar (Populus trichocarpa), and grapevine (Vitis vinifera), have been sequenced, but the lack of reference genomes in gymnosperm phyla reduces our understanding of plant evolution and restricts the potential impacts of genomics research. A gene catalog was developed for the conifer tree Picea glauca (white spruce) through large-scale expressed sequence tag sequencing and full-length cDNA sequencing to facilitate genome characterizations, comparative genomics, and gene mapping. The resource incorporates new and publicly available sequences into 27,720 cDNA clusters, 23,589 of which are represented by full-length insert cDNAs. Expressed sequence tags, mate-pair cDNA clone analysis, and custom sequencing were integrated through an iterative process to improve the accuracy of clustering outcomes. The entire catalog spans 30 Mb of unique transcribed sequence. We estimated that the P. glauca nuclear genome contains up to 32,520 transcribed genes owing to incomplete, partially sequenced, and unsampled transcripts and that its transcriptome could span up to 47 Mb. These estimates are in the same range as the Arabidopsis and rice transcriptomes. Next-generation methods confirmed and enhanced the catalog by providing deeper coverage for rare transcripts, by extending many incomplete clusters, and by augmenting the overall transcriptome coverage to 38 Mb of unique sequence. Genomic sample sequencing at 8.5% of the 19.8-Gb P. glauca genome identified 1,495 clusters representing highly repeated sequences among the cDNA clusters. With a conifer transcriptome in full view, functional and protein domain annotations clearly highlighted the divergences between conifers and angiosperms, likely reflecting their respective evolutionary paths.

Schoof H. Towards Interoperability in Genome Databases: The MAtDB (MIPS Arabidopsis Thaliana Database) Experience. Comp Funct Genomics 2011;4:255-8. [PMID: 18629123 PMCID: PMC2447410 DOI: 10.1002/cfg.278] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2003] [Revised: 02/05/2003] [Accepted: 02/06/2003] [Indexed: 11/09/2022] Open

Doyle CE, Donaldson ME, Morrison EN, Saville BJ. Ustilago maydis transcript features identified through full-length cDNA analysis. Mol Genet Genomics 2011;286:143-59. [PMID: 21750919 DOI: 10.1007/s00438-011-0634-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2011] [Accepted: 06/28/2011] [Indexed: 12/13/2022]

Buell CR, Last RL. Twenty-first century plant biology: impacts of the Arabidopsis genome on plant biology and agriculture. PLANT PHYSIOLOGY 2010;154:497-500. [PMID: 20921172 PMCID: PMC2948998 DOI: 10.1104/pp.110.159541] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/19/2010] [Accepted: 06/15/2010] [Indexed: 05/28/2023]

Haas BJ, Zody MC. Advancing RNA-Seq analysis. Nat Biotechnol 2010. [PMID: 20458303 DOI: 10.1038/nbt0510-421] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Advancing RNA-Seq analysis. Nat Biotechnol 2010;28:421-3. [DOI: 10.1038/nbt0510-421] [Citation(s) in RCA: 157] [Impact Index Per Article: 11.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

The transcriptome of the early life history stages of the California Sea Hare Aplysia californica. COMPARATIVE BIOCHEMISTRY AND PHYSIOLOGY D-GENOMICS & PROTEOMICS 2010;5:165-70. [PMID: 20434970 DOI: 10.1016/j.cbd.2010.03.003] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/07/2009] [Revised: 03/25/2010] [Accepted: 03/27/2010] [Indexed: 11/24/2022]

Kim S, Park J, Park SY, Mitchell TK, Lee YH. Identification and analysis of in planta expressed genes of Magnaporthe oryzae. BMC Genomics 2010;11:104. [PMID: 20146797 PMCID: PMC2832786 DOI: 10.1186/1471-2164-11-104] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2009] [Accepted: 02/10/2010] [Indexed: 11/14/2022] Open

Gou X, He K, Yang H, Yuan T, Lin H, Clouse SD, Li J. Genome-wide cloning and sequence analysis of leucine-rich repeat receptor-like protein kinase genes in Arabidopsis thaliana. BMC Genomics 2010. [PMID: 20064227 DOI: 10.1186/1471-2164-11-19] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Gou X, He K, Yang H, Yuan T, Lin H, Clouse SD, Li J. Genome-wide cloning and sequence analysis of leucine-rich repeat receptor-like protein kinase genes in Arabidopsis thaliana. BMC Genomics 2010;11:19. [PMID: 20064227 PMCID: PMC2817689 DOI: 10.1186/1471-2164-11-19] [Citation(s) in RCA: 134] [Impact Index Per Article: 9.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2009] [Accepted: 01/11/2010] [Indexed: 11/19/2022] Open

Marques MC, Alonso-Cantabrana H, Forment J, Arribas R, Alamar S, Conejero V, Perez-Amador MA. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus. BMC Genomics 2009;10:428. [PMID: 19747386 PMCID: PMC2754500 DOI: 10.1186/1471-2164-10-428] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2009] [Accepted: 09/11/2009] [Indexed: 01/02/2023] Open

Abstract

Background

Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation.

Results

We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis.

Conclusion

The new EST collection denotes an important step towards the identification of all genes in the citrus genome. Furthermore, public availability of the cDNA clones generated in this study, and not only their sequence, enables testing of the biological function of the genes represented in the collection. Expression of the citrus SEP3 homologue, CitrSEP, in Arabidopsis results in early flowering, along with other phenotypes resembling the over-expression of the Arabidopsis SEPALLATA genes. Our findings suggest that the members of the SEP gene family play similar roles in these quite distant plant species.

Upadhyay SK, Shankar J, Singh Y, Basir SF, Madan T, Sarma PU. Expressed sequence tags of Aspergillus fumigatus: Extension of catalogue and their evaluation as putative drug targets and/or diagnostic markers. Indian J Clin Biochem 2009;24:131-6. [PMID: 23105821 DOI: 10.1007/s12291-009-0024-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2022]

Seki M, Shinozaki K. Functional genomics using RIKEN Arabidopsis thaliana full-length cDNAs. JOURNAL OF PLANT RESEARCH 2009;122:355-66. [PMID: 19412652 DOI: 10.1007/s10265-009-0239-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/03/2009] [Accepted: 04/08/2009] [Indexed: 05/24/2023]

Park K, Dirisala VR, Oh Y, Choi H, Lee KT, Kim JH, Lee HT, Seo KH, Park C. Reporting 678 putative cSNPs from full-length enriched cDNA sequences of the Korean native pig. J Anim Breed Genet 2009;126:127-33. [PMID: 19320769 DOI: 10.1111/j.1439-0388.2008.00765.x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2023]

Gu L, Guo R. Genome-wide detection and analysis of alternative splicing for nucleotide binding site-leucine-rich repeats sequences in rice. J Genet Genomics 2009;34:247-57. [PMID: 17498622 DOI: 10.1016/s1673-8527(07)60026-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/07/2006] [Accepted: 08/03/2006] [Indexed: 11/20/2022]

Abstract

Alternative splicing is a major contributor to genomic complexity and proteome diversity, yet alternative splicing of sequences containing the nucleotide binding site and leucine-rich repeat (NBS-LRR) domain has not been explored in rice (Oryza sativa L.). Hidden Markov model (HMM) searches were performed for the NBS-LRR domain, and 875 NBS-LRR-encoding sequences were obtained from The Institute for Genomic Research (TIGR). All of them were used as BLAST queries against the Knowledge-based Oryza Molecular Biological Encyclopaedia (KOME), the TIGR rice gene index (TGI), and the Universal Protein Resource (UniProt) to obtain homologous full-length cDNAs (FL-cDNAs), tentative consensus sequences, and protein sequences. Alternative splicing events were detected from genomic alignment of FL-cDNAs, tentative consensus sequences, and protein sequences, which provide valuable information on splice variants of genes. These sequences were aligned to the corresponding BAC sequences using the Spidey and Sim4 programs, and each of the proteins was aligned by tBLASTn. Of the 875 NBS-LRR sequences, 119 (13.6%) had alternative splicing where multiple FL-cDNAs, TGI sequences and proteins corresponded to the same gene. 71 intron retention events, 20 exon skipping events, 16 alternative termination events, 25 alternative initiation events, 12 alternative 5' splicing events, and 16 alternative 3' splicing events were identified. Most of these alternative splices were supported by two or more transcripts. The data sets are available at http://www.bioinfor.org. Furthermore, the bioinformatics analysis of splice boundaries showed that exon skipping and intron retention did not exhibit strong consensus. This implies a different regulation mechanism that guides the expression of splice isoforms. This article also presents the analysis of the effects of intron retention on proteins. The C-terminal regions of alternative proteins turned out to be more variable than the N-terminal regions.
Finally, tissue distribution and protein localization of alternative splicing were explored. The largest categories of tissue distributions for alternative splicing were shoot and callus. More than one-third of the protein localizations for splice forms were plasma membrane and cytoplasm. All the NBS-LRR proteins for splice forms may have important functions in disease resistance and activate downstream signaling pathways.


Paterson AH, Bowers JE, Bruggmann R, Dubchak I, Grimwood J, Gundlach H, Haberer G, Hellsten U, Mitros T, Poliakov A, Schmutz J, Spannagl M, Tang H, Wang X, Wicker T, Bharti AK, Chapman J, Feltus FA, Gowik U, Grigoriev IV, Lyons E, Maher CA, Martis M, Narechania A, Otillar RP, Penning BW, Salamov AA, Wang Y, Zhang L, Carpita NC, Freeling M, Gingle AR, Hash CT, Keller B, Klein P, Kresovich S, McCann MC, Ming R, Peterson DG, Mehboob-ur-Rahman, Ware D, Westhoff P, Mayer KFX, Messing J, Rokhsar DS. The Sorghum bicolor genome and the diversification of grasses. Nature 2009;457:551-6. [PMID: 19189423 DOI: 10.1038/nature07723] [Citation(s) in RCA: 1638] [Impact Index Per Article: 109.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]

Zeba N, Isbat M, Kwon NJ, Lee MO, Kim SR, Hong CB. Heat-inducible C3HC4 type RING zinc finger protein gene from Capsicum annuum enhances growth of transgenic tobacco. PLANTA 2009;229:861-71. [PMID: 19125289 DOI: 10.1007/s00425-008-0884-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/20/2008] [Accepted: 12/16/2008] [Indexed: 05/27/2023]

Grigsby IF, Rutledge EM, Morton CA, Finger FP. Functional redundancy of two C. elegans homologs of the histone chaperone Asf1 in germline DNA replication. Dev Biol 2009;329:64-79. [PMID: 19233156 DOI: 10.1016/j.ydbio.2009.02.015] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2008] [Revised: 01/30/2009] [Accepted: 02/11/2009] [Indexed: 11/20/2022]

Umezawa T, Sakurai T, Totoki Y, Toyoda A, Seki M, Ishiwata A, Akiyama K, Kurotani A, Yoshida T, Mochida K, Kasuga M, Todaka D, Maruyama K, Nakashima K, Enju A, Mizukado S, Ahmed S, Yoshiwara K, Harada K, Tsubokura Y, Hayashi M, Sato S, Anai T, Ishimoto M, Funatsuki H, Teraishi M, Osaki M, Shinano T, Akashi R, Sakaki Y, Yamaguchi-Shinozaki K, Shinozaki K. Sequencing and analysis of approximately 40,000 soybean cDNA clones from a full-length-enriched cDNA library. DNA Res 2008;15:333-46. [PMID: 18927222 PMCID: PMC2608845 DOI: 10.1093/dnares/dsn024] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2008] [Accepted: 09/10/2008] [Indexed: 11/14/2022] Open

Affiliation(s)

Taishi Umezawa Gene Discovery Research Team, RIKEN Plant Science Center, Koyadai 3-1-1, Tsukuba, Ibaraki 305-0074, Japan
Tetsuya Sakurai Integrated Genome Informatics Research Unit, RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
Yasushi Totoki Genome Annotation and Comparative Analysis Team, RIKEN Genomic Sciences Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
Atsushi Toyoda Sequence Technology Team, RIKEN Genomic Sciences Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
Motoaki Seki Plant Genomic Network Research Team, RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
Atsushi Ishiwata Integrated Genome Informatics Research Unit, RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
Kenji Akiyama Integrated Genome Informatics Research Unit, RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
Atsushi Kurotani Integrated Genome Informatics Research Unit, RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
Takuhiro Yoshida Integrated Genome Informatics Research Unit, RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
Keiichi Mochida Gene Discovery Research Group, RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
Mie Kasuga Biological Resources Division, Japan International Research Center for Agricultural Sciences (JIRCAS), 1-1 Ohwashi, Tsukuba, Ibaraki 305-8686, Japan
Daisuke Todaka Biological Resources Division, Japan International Research Center for Agricultural Sciences (JIRCAS), 1-1 Ohwashi, Tsukuba, Ibaraki 305-8686, Japan; Laboratory of Plant Molecular Physiology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1 Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan
Kyonoshin Maruyama Biological Resources Division, Japan International Research Center for Agricultural Sciences (JIRCAS), 1-1 Ohwashi, Tsukuba, Ibaraki 305-8686, Japan
Kazuo Nakashima Biological Resources Division, Japan International Research Center for Agricultural Sciences (JIRCAS), 1-1 Ohwashi, Tsukuba, Ibaraki 305-8686, Japan
Akiko Enju Plant Genomic Network Research Team, RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
Saho Mizukado Gene Discovery Research Team, RIKEN Plant Science Center, Koyadai 3-1-1, Tsukuba, Ibaraki 305-0074, Japan
Selina Ahmed Biological Resources Division, Japan International Research Center for Agricultural Sciences (JIRCAS), 1-1 Ohwashi, Tsukuba, Ibaraki 305-8686, Japan
Kyoko Yoshiwara Biological Resources Division, Japan International Research Center for Agricultural Sciences (JIRCAS), 1-1 Ohwashi, Tsukuba, Ibaraki 305-8686, Japan
Kyuya Harada National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
Yasutaka Tsubokura National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
Masaki Hayashi National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
Shusei Sato Kazusa DNA Research Institute, 2-6-7 Kazusa-kamatari, Kisarazu, Chiba 292-0818, Japan
Toyoaki Anai Department of Applied Biological Sciences, Faculty of Agriculture, Saga University, Honjo 840-8502, Saga, Japan
Masao Ishimoto National Agricultural Research Center for Hokkaido Region, 1 Hitsujigaoka, Sapporo, Hokkaido 062-8555, Japan
Hideyuki Funatsuki National Agricultural Research Center for Hokkaido Region, 1 Hitsujigaoka, Sapporo, Hokkaido 062-8555, Japan
Masayoshi Teraishi Experimental Farm, Kyoto University, Takatsuki, Osaka 569-0096, Japan
Mitsuru Osaki Graduate School of Agriculture, Hokkaido University, Sapporo, Hokkaido 060-8589, Japan
Takuro Shinano National Agricultural Research Center for Hokkaido Region, 1 Hitsujigaoka, Sapporo, Hokkaido 062-8555, Japan
Ryo Akashi Division of BioResource, Frontier Science Research Center, University of Miyazaki, Miyazaki 889-2192, Japan
Yoshiyuki Sakaki Genome Annotation and Comparative Analysis Team, RIKEN Genomic Sciences Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan; Sequence Technology Team, RIKEN Genomic Sciences Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
Kazuko Yamaguchi-Shinozaki Biological Resources Division, Japan International Research Center for Agricultural Sciences (JIRCAS), 1-1 Ohwashi, Tsukuba, Ibaraki 305-8686, Japan; Laboratory of Plant Molecular Physiology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1 Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan
Kazuo Shinozaki Gene Discovery Research Team, RIKEN Plant Science Center, Koyadai 3-1-1, Tsukuba, Ibaraki 305-0074, Japan


Kapustin Y, Souvorov A, Tatusova T, Lipman D. Splign: algorithms for computing spliced alignments with identification of paralogs. Biol Direct 2008;3:20. [PMID: 18495041 PMCID: PMC2440734 DOI: 10.1186/1745-6150-3-20] [Citation(s) in RCA: 244] [Impact Index Per Article: 15.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2008] [Accepted: 05/21/2008] [Indexed: 11/10/2022] Open

Grigsby IF, Finger FP. UNC-85, a C. elegans homolog of the histone chaperone Asf1, functions in post-embryonic neuroblast replication. Dev Biol 2008;319:100-9. [PMID: 18490010 DOI: 10.1016/j.ydbio.2008.04.013] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2007] [Revised: 04/08/2008] [Accepted: 04/08/2008] [Indexed: 11/28/2022]

Haas BJ, Salzberg SL, Zhu W, Pertea M, Allen JE, Orvis J, White O, Buell CR, Wortman JR. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol 2008;9:R7. [PMID: 18190707 PMCID: PMC2395244 DOI: 10.1186/gb-2008-9-1-r7] [Citation(s) in RCA: 1884] [Impact Index Per Article: 117.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2007] [Revised: 12/17/2007] [Accepted: 01/11/2008] [Indexed: 01/16/2023] Open

Liu Q, Mackey AJ, Roos DS, Pereira FCN. Evigan: a hidden variable model for integrating gene evidence for eukaryotic gene prediction. Bioinformatics 2008;24:597-605. [PMID: 18187439 DOI: 10.1093/bioinformatics/btn004] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Thibaud-Nissen F, Campbell M, Hamilton JP, Zhu W, Buell CR. EuCAP, a Eukaryotic Community Annotation Package, and its application to the rice genome. BMC Genomics 2007;8:388. [PMID: 17961238 PMCID: PMC2151081 DOI: 10.1186/1471-2164-8-388] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2007] [Accepted: 10/25/2007] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Despite the improvements of tools for automated annotation of genome sequences, manual curation at the structural and functional level can provide an increased level of refinement to genome annotation. The Institute for Genomic Research Rice Genome Annotation (hereafter named the Osa1 Genome Annotation) is the product of an automated pipeline and, for this reason, will benefit from the input of biologists with expertise in rice and/or particular gene families. Leveraging knowledge from a dispersed community of scientists is a demonstrated way of improving a genome annotation. This requires tools that facilitate 1) the submission of gene annotation to an annotation project, 2) the review of the submitted models by project annotators, and 3) the incorporation of the submitted models in the ongoing annotation effort.

RESULTS

We have developed the Eukaryotic Community Annotation Package (EuCAP), an annotation tool, and have applied it to the rice genome. The primary level of curation by community annotators (CA) has been the annotation of gene families. Annotation can be submitted by email or through the EuCAP Web Tool. The CA models are aligned to the rice pseudomolecules and the coordinates of these alignments, along with functional annotation, are stored in the MySQL EuCAP Gene Model database. Web pages displaying the alignments of the CA models to the Osa1 Genome models are automatically generated from the EuCAP Gene Model database. The alignments are reviewed by the project annotators (PAs) in the context of experimental evidence. Upon approval by the PAs, the CA models, along with the corresponding functional annotations, are integrated into the Osa1 Genome Annotation. The CA annotations, grouped by family, are displayed on the Community Annotation pages of the project website http://rice.tigr.org, as well as in the Community Annotation track of the Genome Browser.

CONCLUSION

We have applied EuCAP to rice. As of July 2007, the structural and/or functional annotations of 1,094 genes representing 57 families have been deposited and integrated into the current gene set. All of the EuCAP components are open-source, thereby allowing the implementation of EuCAP for the annotation of other genomes. EuCAP is available at http://sourceforge.net/projects/eucap/.


Pertea M, Mount SM, Salzberg SL. A computational survey of candidate exonic splicing enhancer motifs in the model plant Arabidopsis thaliana. BMC Bioinformatics 2007;8:159. [PMID: 17517127 PMCID: PMC1892810 DOI: 10.1186/1471-2105-8-159] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2006] [Accepted: 05/21/2007] [Indexed: 02/05/2023] Open

D'Agostino N, Traini A, Frusciante L, Chiusano ML. Gene models from ESTs (GeneModelEST): an application on the Solanum lycopersicum genome. BMC Bioinformatics 2007;8 Suppl 1:S9. [PMID: 17430576 PMCID: PMC1885861 DOI: 10.1186/1471-2105-8-s1-s9] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

The structure annotation of a genome is based either on ab initio methodologies or on similarity searches versus molecules that have already been annotated. Ab initio gene predictions in a genome are based on a priori knowledge of species-specific features of genes. The training of ab initio gene finders is based on the definition of a data-set of gene models. To accomplish this task, the common approach is to align species-specific full-length cDNA and EST sequences along the genomic sequences in order to define the exon/intron structure of mRNA coding genes.

RESULTS

GeneModelEST is the software proposed here for defining a data-set of candidate gene models using exclusively evidence derived from cDNA/EST sequences. GeneModelEST requires the genome coordinates of the spliced-alignments of ESTs and of contigs (tentative consensus sequences) generated by an EST clustering/assembling procedure to be formatted in a General Feature Format (GFF) standard file. Moreover, the alignments of the contigs versus a protein database are required as an NCBI BLAST formatted report file. The GeneModelEST analysis aims to i) evaluate each exon as defined from contig spliced alignments onto the genome sequence; ii) classify the contigs according to quality levels in order to select candidate gene models; iii) assign to the candidate gene models preliminary functional annotations. We discuss the application of the proposed methodology to build a data-set of gene models of Solanum lycopersicum, whose genome sequencing is an ongoing effort by the International Tomato Genome Sequencing Consortium.

CONCLUSION

The contig classification procedure used by GeneModelEST supports the detection of candidate gene models and the identification of potential alternative transcripts, and it is useful for filtering out ambiguous information. An automated procedure, such as the one proposed here, is fundamental to support large-scale analysis in order to provide species-specific gene models that could be useful as a training data-set for ab initio gene finders and/or as a reference gene list for a human-curated annotation.


Kumar S, Dutta A, Sinha AK, Sen J. Cloning, characterization and localization of a novel basic peroxidase gene from Catharanthus roseus. FEBS J 2007;274:1290-303. [PMID: 17298442 DOI: 10.1111/j.1742-4658.2007.05677.x] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]