1
Bacher R, Chu LF, Argus C, Bolin JM, Knight P, Thomson J, Stewart R, Kendziorski C. Enhancing biological signals and detection rates in single-cell RNA-seq experiments with cDNA library equalization. Nucleic Acids Res 2022; 50:e12. [PMID: 34850101 PMCID: PMC8789062 DOI: 10.1093/nar/gkab1071] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/02/2020] [Revised: 10/14/2021] [Accepted: 10/20/2021] [Indexed: 11/14/2022] Open
Abstract
Considerable effort has been devoted to refining experimental protocols to reduce levels of technical variability and artifacts in single-cell RNA-sequencing (scRNA-seq) data. Here we present evidence that equalizing the concentration of cDNA libraries prior to pooling, a step not consistently performed in single-cell experiments, improves gene detection rates, enhances biological signals, and reduces technical artifacts in scRNA-seq data. To evaluate the effect of equalization on various protocols, we developed Scaffold, a simulation framework that models each step of an scRNA-seq experiment. Numerical experiments demonstrate that equalization reduces variation in sequencing depth and gene-specific expression variability. We then performed a set of experiments in vitro with and without the equalization step and found that equalization increases the number of genes that are detected in every cell by 17-31%, improves discovery of biologically relevant genes, and reduces nuisance signals associated with cell cycle. Further support is provided in an analysis of publicly available data.
Affiliation(s)
- Rhonda Bacher
  - Department of Biostatistics, University of Florida, FL, USA
- Li-Fang Chu
  - Department of Comparative Biology and Experimental Medicine, University of Calgary, Calgary, AB, Canada
  - Morgridge Institute for Research, Madison, WI, USA
- Cara Argus
  - Morgridge Institute for Research, Madison, WI, USA
- Parker Knight
  - Department of Mathematics, University of Florida, FL, USA
- Ron Stewart
  - Morgridge Institute for Research, Madison, WI, USA
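The effect of equalization on sequencing depth reported in the abstract of entry 1 can be illustrated with a toy simulation. This is only a minimal sketch under simplifying assumptions and is not the Scaffold framework: the function names, the log-normal concentration model, the cell count, and the read budget are all hypothetical, and equalization is idealized as giving every library exactly the same pooled amount.

```python
import random
import statistics


def simulate_depths(concentrations, total_reads, equalize, rng):
    """Assign reads to cells in proportion to each library's pooled amount.

    Assumption: with equalization, every library is diluted to the same
    concentration, so each cell contributes an equal share of the pool.
    """
    amounts = [1.0] * len(concentrations) if equalize else list(concentrations)
    # Each read comes from a cell with probability proportional to its amount.
    drawn = rng.choices(range(len(amounts)), weights=amounts, k=total_reads)
    depths = [0] * len(amounts)
    for cell in drawn:
        depths[cell] += 1
    return depths


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


rng = random.Random(0)
# Hypothetical per-cell cDNA concentrations in arbitrary units.
concentrations = [rng.lognormvariate(0.0, 0.5) for _ in range(96)]

raw = simulate_depths(concentrations, 100_000, equalize=False, rng=rng)
eq = simulate_depths(concentrations, 100_000, equalize=True, rng=rng)

cv_raw = coefficient_of_variation(raw)
cv_eq = coefficient_of_variation(eq)
print(f"depth CV without equalization: {cv_raw:.3f}")
print(f"depth CV with equalization:    {cv_eq:.3f}")
```

In this idealized setting the only remaining depth variation after equalization is sampling noise, so the coefficient of variation drops. Real equalization is imperfect, and the size of the effect depends on the protocol.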
2
Zhang L, Zhang H, Yang S. Cytosolic TaGAPC2 Enhances Tolerance to Drought Stress in Transgenic Arabidopsis Plants. Int J Mol Sci 2020; 21:7499. [PMID: 33053684 PMCID: PMC7590034 DOI: 10.3390/ijms21207499] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/24/2020] [Revised: 10/02/2020] [Accepted: 10/04/2020] [Indexed: 11/16/2022] Open
Abstract
Drought is a major natural disaster that seriously affects agricultural production, especially for winter wheat in boreal China. The functions and mechanisms of cytoplasmic glyceraldehyde-3-phosphate dehydrogenases (GAPCs) remain little investigated in wheat subjected to adverse environmental conditions. In this study, we cloned and characterized a GAPC isoform, TaGAPC2, in wheat. Over-expression of TaGAPC2-6D in Arabidopsis led to enhanced root length, reduced reactive oxygen species (ROS) production, and elevated drought tolerance. In addition, dual-luciferase assays showed that TaWRKY28/33/40/47 could positively regulate the expression of TaGAPC2-6A and TaGAPC2-6D. Further results from the yeast two-hybrid system and the bimolecular fluorescence complementation (BiFC) assay demonstrated that TaPLDδ, an enzyme producing phosphatidic acid (PA), could interact with TaGAPC2-6D in plants. These results demonstrate that TaGAPC2, regulated by TaWRKY28/33/40/47, plays a crucial role in drought tolerance, which may influence the drought stress conditions via interaction with TaPLDδ. In conclusion, our results establish a new positive regulation mechanism of TaGAPC2 that helps wheat fine-tune its drought response.
3
Lin L, Cai W, Du Z, Zhang W, Xu Q, Sun W, Chen M. Establishing a System for Functional Characterization of Full-Length cDNAs of Camellia sinensis. Int J Mol Sci 2019; 20:5929. [PMID: 31775391 PMCID: PMC6929147 DOI: 10.3390/ijms20235929] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/05/2019] [Revised: 11/15/2019] [Accepted: 11/21/2019] [Indexed: 11/17/2022] Open
Abstract
Tea (Camellia sinensis) is enriched with bioactive secondary metabolites, and is one of the most popular nonalcoholic beverages globally. Two tea reference genomes have been reported; however, the functional analysis of tea genes has lagged, mainly due to tea’s recalcitrance to genetic transformation and the absence of alternative high-throughput heterologous expression systems. A full-length cDNA collection with a streamlined cloning system is needed in this economically important woody crop species. RNAs were isolated from nine different vegetative tea tissues, pooled, then used to construct a normalized full-length cDNA library. The titers of the unamplified and amplified cDNA libraries were 6.89 × 10⁶ and 1.8 × 10¹⁰ cfu/mL, respectively; the library recombinant rate was 87.2%. Preliminary characterization demonstrated that this collection can complement existing tea reference genomes and facilitate rare gene discovery. In addition, to streamline tea cDNA cloning and functional analysis, a binary vector (pBIG2113SF) was reengineered; seven tea cDNAs isolated from this library were successfully cloned into this vector, then transformed into Arabidopsis. One FL-cDNA, which encodes a putative P1B-type ATPase 5 (CsHMA5), was characterized further as a proof of concept. We demonstrated that overexpression of CsHMA5 in Arabidopsis resulted in copper hyposensitivity. Thus, our data demonstrated that this represents an efficient system for rare gene discovery and functional characterization of tea genes. The integration of a tea FL-cDNA collection with efficient cloning and a heterologous expression system would facilitate functional annotation and characterization of tea genes.
Affiliation(s)
- Lin Lin
  - Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Horticultural Plant Biology and Metabolomics Center, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Weiwei Cai
  - Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Horticultural Plant Biology and Metabolomics Center, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Zhenghua Du
  - Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Horticultural Plant Biology and Metabolomics Center, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Wenjing Zhang
  - Anxi College of Tea Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Quanming Xu
  - Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Horticultural Plant Biology and Metabolomics Center, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Weijiang Sun
  - Anxi College of Tea Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Correspondence: (W.S.); (M.C.); Tel.: +86-13705067139 (W.S.); +86-18860109236 (M.C.)
- Mingjie Chen
  - Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Horticultural Plant Biology and Metabolomics Center, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  - Correspondence: (W.S.); (M.C.); Tel.: +86-13705067139 (W.S.); +86-18860109236 (M.C.)
4
Hoang NV, Furtado A, Perlo V, Botha FC, Henry RJ. The Impact of cDNA Normalization on Long-Read Sequencing of a Complex Transcriptome. Front Genet 2019; 10:654. [PMID: 31396260 PMCID: PMC6664245 DOI: 10.3389/fgene.2019.00654] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/09/2018] [Accepted: 06/20/2019] [Indexed: 11/13/2022] Open
Abstract
Normalization of cDNA is widely used to improve the coverage of rare transcripts in analysis of transcriptomes employing next-generation sequencing. Recently, long-read technology has been emerging as a powerful tool for sequencing and construction of transcriptomes, especially for complex genomes containing highly similar transcripts and transcript-spliced isoforms. Here, we analyzed the transcriptome of sugarcane, a highly polyploid plant genome, by PacBio isoform sequencing (Iso-Seq) of two different cDNA library preparations, with and without a normalization step. The results demonstrated that, while the two libraries included many of the same transcripts, many longer transcripts were removed, and many new, generally shorter transcripts were detected by normalization. For the same input cDNA and data yield, the normalized library recovered more total transcript isoforms and a greater number of predicted gene families and orthologous groups, resulting in a higher representation of the sugarcane transcriptome, compared to the non-normalized library. The non-normalized library, on the other hand, included a wider transcript length range with more long transcripts above ∼1.25 kb and more transcript isoforms per gene family and gene ontology terms per transcript. A large proportion of the unique transcripts, comprising ∼52% of the normalized library, were expressed at a lower level than the unique transcripts from the non-normalized library, across the three tissue types tested (leaf, stalk, and root). About 83% of the total 5,348 predicted long noncoding transcripts were derived from the normalized library, of which ∼80% were derived from the lowly expressed fraction. Functional annotation of the unique transcripts suggested that each library enriched different functional transcript fractions. This demonstrated the complementation of the two approaches in obtaining a complete transcriptome of a complex genome at the sequencing depth used in this study.
Affiliation(s)
- Nam V. Hoang
  - College of Agriculture and Forestry, Hue University, Hue, Vietnam
- Agnelo Furtado
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, Australia
- Virginie Perlo
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, Australia
- Frederik C. Botha
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, Australia
  - Sugar Research Australia, Indooroopilly, QLD, Australia
- Robert J. Henry
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, Australia
5
Meireles B, Usié A, Barbosa P, Fortes AM, Folgado A, Chaves I, Carrasquinho I, Costa RL, Gonçalves S, Teixeira RT, Ramos AM, Nóbrega F. Characterization of the cork formation and production transcriptome in Quercus cerris × suber hybrids. Physiol Mol Biol Plants 2018; 24:535-549. [PMID: 30042611 PMCID: PMC6041232 DOI: 10.1007/s12298-018-0526-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/30/2017] [Revised: 03/13/2018] [Accepted: 03/20/2018] [Indexed: 05/31/2023]
Abstract
Cork oak is the main cork-producing species worldwide, and plays a significant economic, ecological and social role in the Mediterranean countries, in particular in Portugal and Spain. The ability to produce cork is limited to a few species, hence it must involve specific regulation mechanisms that are unique to these species. However, to date, these mechanisms remain largely understudied, especially with approaches involving the use of high-throughput sequencing technology. In this study, the transcriptome of cork-producing and non-cork-producing Quercus cerris × suber hybrids was analyzed in order to elucidate the differences between the two groups of trees displaying contrasting phenotypes for cork production. The results revealed the presence of a significant number of genes exclusively associated with cork production, in the trees that developed cork. Moreover, several gene ontology subcategories, such as cell wall biogenesis, lipid metabolic processes, metal ion binding and apoplast/cell wall, were only detected in the trees with cork production. These results indicate the existence, at the transcriptome level, of mechanisms that seem to be unique and necessary for cork production, which is an advancement in our knowledge regarding the genetic regulation behind cork formation and production.
Affiliation(s)
- Brígida Meireles
  - Centro de Biotecnologia Agrícola e Agro-Alimentar do Alentejo (CEBAL), Instituto Politécnico de Beja (IPBeja), Beja, Portugal
- Ana Usié
  - Centro de Biotecnologia Agrícola e Agro-Alimentar do Alentejo (CEBAL), Instituto Politécnico de Beja (IPBeja), Beja, Portugal
  - Instituto de Ciências Agrárias e Ambientais Mediterrânicas (ICAAM), Universidade de Évora, Évora, Portugal
- Pedro Barbosa
  - Centro de Biotecnologia Agrícola e Agro-Alimentar do Alentejo (CEBAL), Instituto Politécnico de Beja (IPBeja), Beja, Portugal
- Ana Margarida Fortes
  - Faculdade de Ciências de Lisboa, Biosystems and Integrative Sciences Institute (BIOISI), Universidade de Lisboa, Lisbon, Portugal
- André Folgado
  - Centro de Biotecnologia Agrícola e Agro-Alimentar do Alentejo (CEBAL), Instituto Politécnico de Beja (IPBeja), Beja, Portugal
- Inês Chaves
  - Centro de Biotecnologia Agrícola e Agro-Alimentar do Alentejo (CEBAL), Instituto Politécnico de Beja (IPBeja), Beja, Portugal
- Isabel Carrasquinho
  - Instituto Nacional de Investigação Agrária e Veterinária, I.P, Quinta do Marquês, 2780-159 Oeiras, Portugal
- Rita Lourenço Costa
  - Instituto Nacional de Investigação Agrária e Veterinária, I.P, Quinta do Marquês, 2780-159 Oeiras, Portugal
  - Centro de estudos Florestais, Instituto Superior de Agronomia, Universidade de Lisboa, Tapada da Ajuda, 1349-017 Lisbon, Portugal
- Sónia Gonçalves
  - Centro de Biotecnologia Agrícola e Agro-Alimentar do Alentejo (CEBAL), Instituto Politécnico de Beja (IPBeja), Beja, Portugal
  - Present Address: Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB101SA UK
- Rita Teresa Teixeira
  - Instituto Superior de Agronomia da Universidade de Lisboa (ISA), Tapada da Ajuda, 1349-017 Lisbon, Portugal
- António Marcos Ramos
  - Centro de Biotecnologia Agrícola e Agro-Alimentar do Alentejo (CEBAL), Instituto Politécnico de Beja (IPBeja), Beja, Portugal
  - Instituto de Ciências Agrárias e Ambientais Mediterrânicas (ICAAM), Universidade de Évora, Évora, Portugal
- Filomena Nóbrega
  - Instituto Nacional de Investigação Agrária e Veterinária, I.P, Quinta do Marquês, 2780-159 Oeiras, Portugal
6
Hoang NV, Furtado A, Mason PJ, Marquardt A, Kasirajan L, Thirugnanasambandam PP, Botha FC, Henry RJ. A survey of the complex transcriptome from the highly polyploid sugarcane genome using full-length isoform sequencing and de novo assembly from short read sequencing. BMC Genomics 2017; 18:395. [PMID: 28532419 PMCID: PMC5440902 DOI: 10.1186/s12864-017-3757-8] [Citation(s) in RCA: 116] [Impact Index Per Article: 14.5] [Received: 10/04/2016] [Accepted: 05/03/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Despite the economic importance of sugarcane in sugar and bioenergy production, there is not yet a reference genome available. Most of the sugarcane transcriptomic studies have been based on Saccharum officinarum gene indices (SoGI), expressed sequence tags (ESTs) and de novo assembled transcript contigs from short-reads; hence knowledge of the sugarcane transcriptome is limited in relation to transcript length and number of transcript isoforms. RESULTS The sugarcane transcriptome was sequenced using PacBio isoform sequencing (Iso-Seq) of a pooled RNA sample derived from leaf, internode and root tissues, of different developmental stages, from 22 varieties, to explore the potential for capturing full-length transcript isoforms. A total of 107,598 unique transcript isoforms were obtained, representing about 71% of the total number of predicted sugarcane genes. The majority of this dataset (92%) matched the plant protein database, while just over 2% was novel transcripts, and over 2% was putative long non-coding RNAs. About 56% and 23% of total sequences were annotated against the gene ontology and KEGG pathway databases, respectively. Comparison with de novo contigs from Illumina RNA-Sequencing (RNA-Seq) of the internode samples from the same experiment and public databases showed that the Iso-Seq method recovered more full-length transcript isoforms, had a higher N50 and average length of largest 1,000 proteins; whereas a greater representation of the gene content and RNA diversity was captured in RNA-Seq. Only 62% of PacBio transcript isoforms matched 67% of de novo contigs, while the non-matched proportions were attributed to the inclusion of leaf/root tissues and the normalization in PacBio, and the representation of more gene content and RNA classes in the de novo assembly, respectively. About 69% of PacBio transcript isoforms and 41% of de novo contigs aligned with the sorghum genome, indicating the high conservation of orthologs in the genic regions of the two genomes. CONCLUSIONS The transcriptome dataset should contribute to improved sugarcane gene models and sugarcane protein predictions; and will serve as a reference database for analysis of transcript expression in sugarcane.
Affiliation(s)
- Nam V Hoang
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia
  - College of Agriculture and Forestry, Hue University, Hue, Vietnam
- Agnelo Furtado
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia
- Patrick J Mason
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia
- Annelie Marquardt
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia
  - Sugar Research Australia, Indooroopilly, QLD, 4068, Australia
- Lakshmi Kasirajan
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia
  - ICAR - Sugarcane Breeding Institute, Coimbatore, Tamil Nadu, India
- Prathima P Thirugnanasambandam
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia
  - ICAR - Sugarcane Breeding Institute, Coimbatore, Tamil Nadu, India
- Frederik C Botha
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia
  - Sugar Research Australia, Indooroopilly, QLD, 4068, Australia
- Robert J Henry
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Room 2.245, Level 2, The John Hay Building, Queensland Biosciences Precinct [#80], 306 Carmody Road, St. Lucia, QLD, 4072, Australia
7
Al-Faifi SA, Migdadi HM, Algamdi SS, Khan MA, Al-Obeed RS, Ammar MH, Jakse J. Analysis of Expressed Sequence Tags (EST) in Date Palm. Methods Mol Biol 2017; 1638:283-313. [PMID: 28755231 DOI: 10.1007/978-1-4939-7159-6_23] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/02/2022]
Abstract
Expressed sequence tags (EST) were generated from a normalized cDNA library of the date palm Sukkari cv. to understand the high quality and better field performance of this well-known commercial cultivar. A total of 6943 high-quality ESTs were generated, of which 6671 were submitted to the GenBank dbEST (LIBEST_028537). The generated ESTs were assembled into 6362 unigenes, consisting of 494 (14.4%) contigs and 5868 (84.53%) singletons. The functional annotation shows that the majority of the ESTs are associated with binding (44%), catalytic (40%), transporter (5%), and structural molecular (5%) activities. The blastx results show that 73% of unigenes are significantly similar to known plant genes and 27% are novel. The latter could be of particular interest in date palm genetic studies. Further analysis shows that some ESTs are categorized as stress/defense- and fruit development-related genes. These newly generated ESTs could significantly enhance date palm EST databases in the public domain and are available to scientists and researchers across the globe. This knowledge will facilitate the discovery of candidate genes that govern important developmental and agronomical traits in date palm. It will provide important resources for developing genetic tools, comparative genomics, and genome evolution among date palm cultivars.
Affiliation(s)
- Sulieman A Al-Faifi
  - Plant Production Department, College of Food and Agricultural Sciences, King Saud University, P.O. Box 2460, Riyadh, 11451, Saudi Arabia
- Hussein M Migdadi
  - Plant Production Department, College of Food and Agricultural Sciences, King Saud University, P.O. Box 2460, Riyadh, 11451, Saudi Arabia
- Salem S Algamdi
  - Plant Production Department, College of Food and Agricultural Sciences, King Saud University, P.O. Box 2460, Riyadh, 11451, Saudi Arabia
- Mohammad Altaf Khan
  - Plant Production Department, College of Food and Agricultural Sciences, King Saud University, P.O. Box 2460, Riyadh, 11451, Saudi Arabia
- Rashid S Al-Obeed
  - Plant Production Department, College of Food and Agricultural Sciences, King Saud University, P.O. Box 2460, Riyadh, 11451, Saudi Arabia
- Megahed H Ammar
  - Plant Production Department, College of Food and Agricultural Sciences, King Saud University, P.O. Box 2460, Riyadh, 11451, Saudi Arabia
- Jerenj Jakse
  - Biotechnical Faculty, Agronomy Department, University of Ljubljana, Ljubljana, Slovenia
8
Zhang W, Zhang H, Qi F, Jian G. Generation of transcriptome profiling and gene functional analysis in Gossypium hirsutum upon Verticillium dahliae infection. Biochem Biophys Res Commun 2016; 473:879-885. [DOI: 10.1016/j.bbrc.2016.03.143] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 03/25/2016] [Accepted: 03/29/2016] [Indexed: 10/22/2022]
9
Collembolan Transcriptomes Highlight Molecular Evolution of Hexapods and Provide Clues on the Adaptation to Terrestrial Life. PLoS One 2015; 10:e0130600. [PMID: 26075903 PMCID: PMC4468109 DOI: 10.1371/journal.pone.0130600] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 02/18/2015] [Accepted: 05/21/2015] [Indexed: 11/19/2022] Open
Abstract
Background Collembola (springtails) represent a soil-living lineage of hexapods in between insects and crustaceans. Consequently, their genomes may hold key information on the early processes leading to evolution of Hexapoda from a crustacean ancestor. Method We assembled and annotated transcriptomes of the Collembola Folsomia candida and Orchesella cincta, and performed comparative analysis with protein-coding gene sequences of three crustaceans and three insects to identify adaptive signatures associated with the evolution of hexapods within the pancrustacean clade. Results Assembly of the springtail transcriptomes resulted in 37,730 transcripts with predicted open reading frames for F. candida and 32,154 for O. cincta, of which 34.2% were functionally annotated for F. candida and 38.4% for O. cincta. Subsequently, we predicted orthologous clusters among eight species and applied the branch-site test to detect episodic positive selection in the Hexapoda and Collembola lineages. A subset of 250 genes showed significant positive selection along the Hexapoda branch and 57 in the Collembola lineage. Gene Ontology categories enriched in these genes include metabolism, stress response (i.e. DNA repair, immune response), ion transport, ATP metabolism, regulation and development-related processes (i.e. eye development, neurological development). Conclusions We suggest that the identified gene families represent processes that have played a key role in the divergence of hexapods within the pancrustacean clade that eventually evolved into the most species-rich group of all animals, the hexapods. Furthermore, some adaptive signatures in collembolans may provide valuable clues to understand evolution of hexapods on land.
10
Duplex-specific nuclease-mediated bioanalysis. Trends Biotechnol 2015; 33:180-8. [DOI: 10.1016/j.tibtech.2014.12.008] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.1] [Received: 10/07/2014] [Revised: 12/22/2014] [Accepted: 12/30/2014] [Indexed: 12/21/2022]
11
Pereira-Leal JB, Abreu IA, Alabaça CS, Almeida MH, Almeida P, Almeida T, Amorim MI, Araújo S, Azevedo H, Badia A, Batista D, Bohn A, Capote T, Carrasquinho I, Chaves I, Coelho AC, Costa MMR, Costa R, Cravador A, Egas C, Faro C, Fortes AM, Fortunato AS, Gaspar MJ, Gonçalves S, Graça J, Horta M, Inácio V, Leitão JM, Lino-Neto T, Marum L, Matos J, Mendonça D, Miguel A, Miguel CM, Morais-Cecílio L, Neves I, Nóbrega F, Oliveira MM, Oliveira R, Pais MS, Paiva JA, Paulo OS, Pinheiro M, Raimundo JAP, Ramalho JC, Ribeiro AI, Ribeiro T, Rocheta M, Rodrigues AI, Rodrigues JC, Saibo NJM, Santo TE, Santos AM, Sá-Pereira P, Sebastiana M, Simões F, Sobral RS, Tavares R, Teixeira R, Varela C, Veloso MM, Ricardo CPP. A comprehensive assessment of the transcriptome of cork oak (Quercus suber) through EST sequencing. BMC Genomics 2014; 15:371. [PMID: 24885229 PMCID: PMC4070548 DOI: 10.1186/1471-2164-15-371] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 03/14/2013] [Accepted: 04/15/2014] [Indexed: 01/17/2023] Open
Abstract
Background Cork oak (Quercus suber) is one of the rare trees with the ability to produce cork, a material widely used to make wine bottle stoppers, flooring and insulation materials, among many other uses. The molecular mechanisms of cork formation are still poorly understood, in great part due to the difficulty in studying a species with a long life-cycle and for which there is scarce molecular/genomic information. Cork oak forests are of great ecological importance and represent a major economic and social resource in Southern Europe and Northern Africa. However, global warming is threatening the cork oak forests by imposing thermal, hydric and many types of novel biotic stresses. Despite the economic and social value of the Q. suber species, few genomic resources useful for biotechnological applications and improved forest management have been developed. Results We generated in excess of 7 million sequence reads, by pyrosequencing 21 normalized cDNA libraries derived from multiple Q. suber tissues and organs, developmental stages and physiological conditions. We deployed a stringent sequence processing and assembly pipeline that resulted in the identification of ~159,000 unigenes. These were annotated according to their similarity to known plant genes, to known Interpro domains, GO classes and E.C. numbers. The phylogenetic extent of this EST set was investigated, and we found that cork oak revealed a significant new gene space that is not covered by other model species or EST sequencing projects. The raw data, as well as the full annotated assembly, are now available to the community in a dedicated web portal at http://www.corkoakdb.org. Conclusions This genomic resource represents the first transcriptome study in a cork-producing species. It can be explored to develop new tools and approaches to understand stress responses and developmental processes in forest trees, as well as the molecular cascades underlying cork differentiation and disease response.
Affiliation(s)
- José B Pereira-Leal
  - Instituto Gulbenkian de Ciência, Rua da Quinta Grande 6, Oeiras 2780-156, Portugal
12
Figueiredo J, Simões MJ, Gomes P, Barroso C, Pinho D, Conceição L, Fonseca L, Abrantes I, Pinheiro M, Egas C. Assessment of the geographic origins of pinewood nematode isolates via single nucleotide polymorphism in effector genes. PLoS One 2013; 8:e83542. [PMID: 24391785 PMCID: PMC3877046 DOI: 10.1371/journal.pone.0083542] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 03/20/2013] [Accepted: 11/05/2013] [Indexed: 11/18/2022] Open
Abstract
The pinewood nematode, Bursaphelenchus xylophilus, is native to North America but it only causes damaging pine wilt disease in those regions of the world where it has been introduced. The accurate detection of the species and its dispersal routes are thus essential to define effective control measures. The main goals of this study were to analyse the genetic diversity among B. xylophilus isolates from different geographic locations and identify single nucleotide polymorphism (SNPs) markers for geographic origin, through a comparative transcriptomic approach. The transcriptomes of seven B. xylophilus isolates, from Continental Portugal (4), China (1), Japan (1) and USA (1), were sequenced in the next generation platform Roche 454. Analysis of effector gene transcripts revealed inter-isolate nucleotide diversity that was validated by Sanger sequencing in the genomic DNA of the seven isolates and eight additional isolates from different geographic locations: Madeira Island (2), China (1), USA (1), Japan (2) and South Korea (2). The analysis identified 136 polymorphic positions in 10 effector transcripts. Pairwise comparison of the 136 SNPs through Neighbor-Joining and the Maximum Likelihood methods and 5-mer frequency analysis with the alignment-independent bilinear multivariate modelling approach correlated the SNPs with the isolates geographic origin. Furthermore, the SNP analysis indicated a closer proximity of the Portuguese isolates to the Korean and Chinese isolates than to the Japanese or American isolates. Each geographic cluster carried exclusive alleles that can be used as SNP markers for B. xylophilus isolate identification.
Affiliation(s)
- Joana Figueiredo
- Department of Life Sciences, University of Coimbra, Coimbra, Portugal
- Maria José Simões
- Genoinseq, Next Generation Sequencing Unit, Biocant, Cantanhede, Portugal
- Paula Gomes
- Genoinseq, Next Generation Sequencing Unit, Biocant, Cantanhede, Portugal
- Cristina Barroso
- Genoinseq, Next Generation Sequencing Unit, Biocant, Cantanhede, Portugal
- Diogo Pinho
- Genoinseq, Next Generation Sequencing Unit, Biocant, Cantanhede, Portugal
- Luci Conceição
- IMAR-CMA, Department of Life Sciences, University of Coimbra, Coimbra, Portugal
- Luís Fonseca
- IMAR-CMA, Department of Life Sciences, University of Coimbra, Coimbra, Portugal
- Isabel Abrantes
- IMAR-CMA, Department of Life Sciences, University of Coimbra, Coimbra, Portugal
- Miguel Pinheiro
- Genoinseq, Next Generation Sequencing Unit, Biocant, Cantanhede, Portugal
- Conceição Egas
- Genoinseq, Next Generation Sequencing Unit, Biocant, Cantanhede, Portugal
|
13
|
The Antarctic krill Euphausia superba shows diurnal cycles of transcription under natural conditions. PLoS One 2013; 8:e68652. [PMID: 23874706 PMCID: PMC3714250 DOI: 10.1371/journal.pone.0068652] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 04/05/2013] [Accepted: 05/30/2013] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Polar environments are characterized by extreme seasonal changes in day length, light intensity and spectrum, the extent of sea ice during the winter, and food availability. A key species of the Southern Ocean ecosystem, the Antarctic krill (Euphausia superba) has evolved rhythmic physiological and behavioral mechanisms to adapt to daily and seasonal changes. The molecular organization of the clockwork underlying these biological rhythms is, nevertheless, still only partially understood. METHODOLOGY/PRINCIPAL FINDINGS The genome sequence of the Antarctic krill is not yet available. A normalized cDNA library was produced and pyrosequenced in an attempt to identify large numbers of transcripts. All available E. superba sequences were then assembled to create the most complete existing oligonucleotide microarray platform, with a total of 32,217 probes. Gene expression signatures of specimens collected in the Ross Sea at five different time points over a 24-hour cycle were defined, and 1,308 differentially expressed genes were identified. Of the corresponding transcripts, 609 showed a significant sinusoidal expression pattern; about 40% of these exhibited a 24-hour periodicity, while the other 60% were characterized by a shorter (about 12-hour) rhythm. We assigned the differentially expressed genes to functional categories and noticed that those concerning translation, proteolysis, energy and metabolic process, redox regulation, visual transduction and stress response, which are most likely related to daily environmental changes, were significantly enriched. Two transcripts of peroxiredoxin, thought to represent the ancestral timekeeping system that evolved about 2.5 billion years ago, were also identified, as were two isoforms of the EsRh1 opsin and two novel arrestin1 sequences involved in the visual transduction cascade.
CONCLUSIONS Our work represents the first characterization of the krill diurnal transcriptome under natural conditions and provides a first insight into the genetic regulation of physiological changes, which occur around the clock during an Antarctic summer day.
|
14
|
Howe GT, Yu J, Knaus B, Cronn R, Kolpak S, Dolan P, Lorenz WW, Dean JFD. A SNP resource for Douglas-fir: de novo transcriptome assembly and SNP detection and validation. BMC Genomics 2013; 14:137. [PMID: 23445355 PMCID: PMC3673906 DOI: 10.1186/1471-2164-14-137] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Received: 08/12/2012] [Accepted: 01/31/2013] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND Douglas-fir (Pseudotsuga menziesii), one of the most economically and ecologically important tree species in the world, also has one of the largest tree breeding programs. Although the coastal and interior varieties of Douglas-fir (vars. menziesii and glauca) are native to North America, the coastal variety is also widely planted for timber production in Europe, New Zealand, Australia, and Chile. Our main goal was to develop a SNP resource large enough to facilitate genomic selection in Douglas-fir breeding programs. To accomplish this, we developed a 454-based reference transcriptome for coastal Douglas-fir, annotated and evaluated the quality of the reference, identified putative SNPs, and then validated a sample of those SNPs using the Illumina Infinium genotyping platform. RESULTS We assembled a reference transcriptome consisting of 25,002 isogroups (unique gene models) and 102,623 singletons from 2.76 million 454 and Sanger cDNA sequences from coastal Douglas-fir. We identified 278,979 unique SNPs by mapping the 454 and Sanger sequences to the reference, and by mapping four datasets of Illumina cDNA sequences from multiple seed sources, genotypes, and tissues. The Illumina datasets represented coastal Douglas-fir (64.00 and 13.41 million reads), interior Douglas-fir (80.45 million reads), and a Yakima population similar to interior Douglas-fir (8.99 million reads). We assayed 8067 SNPs on 260 trees using an Illumina Infinium SNP genotyping array. Of these SNPs, 5847 (72.5%) were called successfully and were polymorphic. CONCLUSIONS Based on our validation efficiency, our SNP database may contain as many as ~200,000 true SNPs, and as many as ~69,000 SNPs that could be genotyped at ~20,000 gene loci using an Infinium II array, which is more SNPs than are needed to use genomic selection in tree breeding programs.
Ultimately, these genomic resources will enhance Douglas-fir breeding and allow us to better understand landscape-scale patterns of genetic variation and potential responses to climate change.
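The "~200,000 true SNPs" estimate in this abstract can be checked with a short calculation. The sketch below uses only the counts quoted in the abstract and assumes, as a simplification, that the SNPs sampled for the array are representative of the whole database; the variable names are illustrative and do not come from the paper.

```python
# Counts quoted in the abstract.
assayed = 8067        # SNPs assayed on the Infinium array
validated = 5847      # called successfully and polymorphic
putative = 278_979    # unique putative SNPs in the database

# Fraction of assayed SNPs that validated.
validation_rate = validated / assayed

# Assumption (not stated in the abstract): the assayed sample is
# representative, so the same rate applies to every putative SNP.
expected_true = putative * validation_rate

print(f"validation rate: {validation_rate:.1%}")
print(f"expected true SNPs: {expected_true:,.0f}")
```

Under that assumption the extrapolation lands a little above 200,000, in line with the abstract's rounded figure. The separate "~69,000 genotypable SNPs" estimate depends on filters not described in the abstract and is not reproduced here.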
Affiliation(s)
- Glenn T Howe
- Department of Forest Ecosystems and Society, Oregon State University, Corvallis, Oregon, 97331, USA
- Jianbin Yu
- Department of Forest Ecosystems and Society, Oregon State University, Corvallis, Oregon, 97331, USA
- Current address, DuPont Pioneer International, Willmar, Minnesota, 56201, USA
- Brian Knaus
- Pacific Northwest Research Station, USDA Forest Service, Corvallis, Oregon, 97331, USA
- Richard Cronn
- Pacific Northwest Research Station, USDA Forest Service, Corvallis, Oregon, 97331, USA
- Scott Kolpak
- Department of Forest Ecosystems and Society, Oregon State University, Corvallis, Oregon, 97331, USA
- Peter Dolan
- Department of Mathematics, University of Minnesota, Morris, MN, USA
- W Walter Lorenz
- Warnell School of Forestry and Natural Resources, University of Georgia, Athens, Georgia, 30602, USA
- Jeffrey FD Dean
- Warnell School of Forestry and Natural Resources, University of Georgia, Athens, Georgia, 30602, USA
|
15
|
Howe GT, Yu J, Knaus B, Cronn R, Kolpak S, Dolan P, Lorenz WW, Dean JFD. A SNP resource for Douglas-fir: de novo transcriptome assembly and SNP detection and validation. BMC Genomics 2013. [PMID: 23445355 DOI: 10.1186/1471-2164-14-137] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022] Open
Affiliation(s)
- Glenn T Howe
- Department of Forest Ecosystems and Society, Oregon State University, Corvallis, Oregon 97331, USA.
|
16
|
Schwochow D, Serieys LEK, Wayne RK, Thalmann O. Efficient recovery of whole blood RNA--a comparison of commercial RNA extraction protocols for high-throughput applications in wildlife species. BMC Biotechnol 2012; 12:33. [PMID: 22738215 PMCID: PMC3406948 DOI: 10.1186/1472-6750-12-33] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 01/25/2012] [Accepted: 06/27/2012] [Indexed: 01/08/2023] Open
Abstract
Background Since the emergence of next generation sequencing platforms, unprecedented opportunities have arisen in the study of natural vertebrate populations. In particular, insights into the genetic and epigenetic mechanisms of adaptation can be revealed through study of the expression profiles of genes. However, as a pre-requisite to expression profiling, care must be taken in RNA preparation as factors like DNA contamination, RNA integrity or transcript abundance can affect downstream applications. Here, we evaluated five commonly used RNA extraction methods using whole blood sampled under varying conditions from 20 wild carnivores. Results Despite the use of minute starting volumes, all methods produced quantifiable RNA extracts (1.4 – 18.4 μg) with varying integrity (RIN 4.6 - 7.7), the latter being significantly affected by the storage and extraction method used. We observed a significant overall effect of the extraction method on DNA contamination. One particular extraction method, the LeukoLOCK™ filter system, yielded high RNA integrity along with low DNA contamination and efficient depletion of hemoglobin transcripts highly abundant in whole blood. In a proof of concept sequencing experiment, we found globin RNA transcripts to occupy up to ¼ of all sequencing reads if libraries were not depleted of hemoglobin prior to sequencing. Conclusion By carefully choosing the appropriate RNA extraction method, whole blood can become a valuable source for high-throughput applications like expression arrays or transcriptome sequencing from natural populations. Additionally, candidate genes showing signs of selection could subsequently be genotyped in large population samples using whole blood as a source for RNA without harming individuals from rare or endangered species.
Affiliation(s)
- Doreen Schwochow
- Department of Ecology and Evolutionary Biology, University of California Los Angeles, Los Angeles, CA 90095, USA.
|
17
|
Bogdanova EA, Shagina IA, Ianushevich IG, Vagner LL, Luk'ianov SA, Shagin DA. [Preparation of prokaryotic cDNA for high-throughput transcriptome analysis]. RUSSIAN JOURNAL OF BIOORGANIC CHEMISTRY 2012; 37:854-7. [PMID: 22497085 DOI: 10.1134/s1068162011060045] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
Abstract
The high content of non-coding RNA in total bacterial RNA considerably complicates transcriptome analysis using standard approaches such as high-throughput sequencing, gene expression profiling, and subtractive hybridization. We suggest a procedure for preparing bacterial cDNA for transcriptomics that includes rRNA and tRNA depletion while preserving the relative abundance of coding sequences. The method is based on second-order hybridization kinetics and the unique properties of the Kamchatka crab duplex-specific nuclease. The efficacy of the method was demonstrated in model experiments.
|
18
|
Marques MC, Perez-Amador MA. Construction and analysis of full-length and normalized cDNA libraries from citrus. Methods Mol Biol 2012; 815:51-65. [PMID: 22130983 DOI: 10.1007/978-1-61779-424-7_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
We have developed an integrated method to generate a normalized cDNA collection enriched in full-length and rare transcripts from citrus, using different species and multiple tissues and developmental stages. Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. In this regard, the availability of full-length cDNA clones facilitates functional analysis of the corresponding genes enabling manipulation of their expression and the generation of a variety of tagged versions of the native protein. The development of full-length cDNA sequences has the power to improve the quality of genome annotation, as well as provide tools for functional characterization of genes.
Affiliation(s)
- M Carmen Marques
- Instituto de Biología Molecular y Celular de Plantas-IBMCP, Universidad Politécnica de Valencia-UPV and Consejo Superior de Investigaciones Científicas-CSIC, CPI 8E, Ingeniero Fausto Elio s/n, Valencia 46022, Spain
|
19
|
Abstract
A well-recognized obstacle to efficient high-throughput analysis of cDNA libraries is the differential abundance of various transcripts in any particular cell type. Decreasing the prevalence of clones representing abundant transcripts before sequencing, using cDNA normalization, may significantly increase the efficacy of random sequencing and is essential for rare gene discovery. Duplex-specific nuclease (DSN) normalization allows the generation of normalized full-length-enriched cDNA libraries to permit a high gene discovery rate. The method is based on the unique properties of DSN from the Kamchatka crab and involves denaturation-reassociation of cDNA, degradation of the ds-fraction formed by abundant transcripts by DSN, and PCR amplification of the remaining ss-DNA fraction. The method has been evaluated in various plant and animal models.
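The equalizing effect of the denaturation-reassociation step can be illustrated with a toy model. The sketch below assumes ideal second-order reassociation kinetics, in which a species with initial single-stranded concentration C0 has C0 / (1 + k·C0·t) still single-stranded after time t, and assumes DSN then removes everything that has become double-stranded. The rate constant, time, concentrations, and function name are arbitrary illustrative values, not parameters from the protocol.

```python
def single_stranded_after(c0: float, k: float, t: float) -> float:
    """Single-stranded concentration left after reassociation for time t.

    Ideal second-order kinetics: each species reanneals only with its own
    complement, so abundant species form duplexes faster than rare ones.
    """
    return c0 / (1.0 + k * c0 * t)


# Hypothetical starting abundances (arbitrary units) spanning 10,000-fold.
before = {"abundant": 1000.0, "medium": 10.0, "rare": 0.1}
k, t = 1.0, 1.0  # arbitrary rate constant and reassociation time

# Toy assumption: DSN degrades the duplex fraction completely, so only
# the single-stranded remainder survives to be PCR-amplified.
after = {name: single_stranded_after(c0, k, t) for name, c0 in before.items()}

ratio_before = before["abundant"] / before["rare"]
ratio_after = after["abundant"] / after["rare"]
print(f"abundant:rare ratio before = {ratio_before:.0f}, after = {ratio_after:.1f}")
```

In this idealized picture the surviving amount of any species is capped near 1/(k·t), which is why the abundance range collapses; real reactions involve cross-hybridization and incomplete digestion that the model ignores.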
|
20
|
Bettencourt R, Pinheiro M, Egas C, Gomes P, Afonso M, Shank T, Santos RS. High-throughput sequencing and analysis of the gill tissue transcriptome from the deep-sea hydrothermal vent mussel Bathymodiolus azoricus. BMC Genomics 2010; 11:559. [PMID: 20937131 PMCID: PMC3091708 DOI: 10.1186/1471-2164-11-559] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.0] [Received: 07/19/2010] [Accepted: 10/11/2010] [Indexed: 01/03/2023] Open
Abstract
Background Bathymodiolus azoricus is a deep-sea hydrothermal vent mussel found in association with large faunal communities living in chemosynthetic environments at the bottom of the sea floor near the Azores Islands. Investigation of the exceptional physiological reactions that vent mussels have adopted in their habitat, including responses to environmental microbes, remains a difficult challenge for deep-sea biologists. In an attempt to reveal genes potentially involved in the deep-sea mussel innate immunity, we carried out a high-throughput sequence analysis of the freshly collected B. azoricus transcriptome, using gill tissue as the primary source of immune transcripts given its strategic role in filtering potentially infectious waterborne microorganisms from the surrounding water. Additionally, a substantial EST data set was produced, from which a comprehensive collection of genes coding for putative proteins was organized in a dedicated database, "DeepSeaVent", the first deep-sea vent animal transcriptome database based on the 454 pyrosequencing technology. Results A normalized cDNA library from gill tissue was sequenced in a full 454 GS-FLX run, producing 778,996 sequencing reads. Assembly of the high quality reads resulted in 75,407 contigs of which 3,071 were singletons. A total of 39,425 transcripts were conceptually translated into amino acid sequences, of which 22,023 matched known proteins in the NCBI non-redundant protein database, 15,839 revealed conserved protein domains through InterPro functional classification and 9,584 were assigned Gene Ontology terms. Queries conducted within the database enabled the identification of genes putatively involved in immune and inflammatory reactions which had not been previously evidenced in the vent mussel. Their physical counterpart was confirmed by semi-quantitative Reverse-Transcription-Polymerase Chain Reactions (RT-PCR) and their RNA transcription level by quantitative PCR (qPCR) experiments.
Conclusions We have established the first tissue transcriptional analysis of a deep-sea hydrothermal vent animal and generated a searchable catalog of genes that provides a direct method of identifying and retrieving vast numbers of novel coding sequences which can be applied in gene expression profiling experiments from a non-conventional model organism. This provides the most comprehensive sequence resource for identifying novel genes currently available for a deep-sea vent organism, in particular, genes putatively involved in immune and inflammatory reactions in vent mussels. The characterization of the B. azoricus transcriptome will facilitate research into biological processes underlying physiological adaptations to hydrothermal vent environments and will provide a basis for expanding our understanding of genes putatively involved in adaptation processes during post-capture long term acclimatization experiments, at "sea-level" conditions, using B. azoricus as a model organism.
Affiliation(s)
- Raul Bettencourt
- Department of Oceanography and Fisheries, University of the Azores, 9901-861 Horta, Portugal.
|
21
|
Bogdanov EA, Shagina I, Barsova EV, Kelmanson I, Shagin DA, Lukyanov SA. Normalizing cDNA libraries. Curr Protoc Mol Biol 2010; Chapter 5:Unit 5.12.1-27. [PMID: 20373503 DOI: 10.1002/0471142727.mb0512s90] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/09/2022]
Abstract
The characterization of rare messages in cDNA libraries is complicated by the substantial variations that exist in the abundance levels of different transcripts in cells and tissues. The equalization (normalization) of cDNA is a helpful approach for decreasing the prevalence of abundant transcripts, thereby facilitating the assessment of rare transcripts. This unit provides a method for duplex-specific nuclease (DSN)-based normalization, which allows for the fast and reliable equalization of cDNA, thereby facilitating the generation of normalized, full-length-enriched cDNA libraries, and enabling efficient RNA analyses.
|
22
|
Marques MC, Alonso-Cantabrana H, Forment J, Arribas R, Alamar S, Conejero V, Perez-Amador MA. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus. BMC Genomics 2009; 10:428. [PMID: 19747386 PMCID: PMC2754500 DOI: 10.1186/1471-2164-10-428] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 05/05/2009] [Accepted: 09/11/2009] [Indexed: 01/02/2023] Open
Abstract
Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. 
Conclusion The new EST collection denotes an important step towards the identification of all genes in the citrus genome. Furthermore, public availability of the cDNA clones generated in this study, and not only their sequence, enables testing of the biological function of the genes represented in the collection. Expression of the citrus SEP3 homologue, CitrSEP, in Arabidopsis results in early flowering, along with other phenotypes resembling the over-expression of the Arabidopsis SEPALLATA genes. Our findings suggest that the members of the SEP gene family play similar roles in these quite distant plant species.
Affiliation(s)
- M Carmen Marques
- Instituto de Biología Molecular y Celular de Plantas, Universidad Politécnica de Valencia and Consejo Superior de Investigaciones Científicas, Avenida de los Naranjos s/n, Valencia 46022, Spain.
|
23
|
Hale MC, McCormick CR, Jackson JR, Dewoody JA. Next-generation pyrosequencing of gonad transcriptomes in the polyploid lake sturgeon (Acipenser fulvescens): the relative merits of normalization and rarefaction in gene discovery. BMC Genomics 2009; 10:203. [PMID: 19402907 PMCID: PMC2688523 DOI: 10.1186/1471-2164-10-203] [Citation(s) in RCA: 113] [Impact Index Per Article: 7.1] [Received: 12/23/2008] [Accepted: 04/29/2009] [Indexed: 11/25/2022] Open
Abstract
Background Next-generation sequencing technologies have been applied most often to model organisms or species closely related to a model. However, these methods have the potential to be valuable in many wild organisms, including those of conservation concern. We used Roche 454 pyrosequencing to characterize gene expression in polyploid lake sturgeon (Acipenser fulvescens) gonads. Results Titration runs on a Roche 454 GS-FLX produced more than 47,000 sequencing reads. These reads represented 20,741 unique sequences that passed quality control (mean length = 186 bp). These were assembled into 1,831 contigs (mean contig depth = 4.1 sequences). Over 4,000 sequencing reads (~19%) were assigned gene ontologies, mostly to protein, RNA, and ion binding. A total of 877 candidate SNPs were identified from > 50 different genes. We employed an analytical approach from theoretical ecology (rarefaction) to evaluate depth of sequencing coverage relative to gene discovery. We also considered the relative merits of normalized versus native cDNA libraries when using next-generation sequencing platforms. Not surprisingly, fewer genes from the normalized libraries were rRNA subunits. Rarefaction suggests that normalization has little influence on the efficiency of gene discovery, at least when working with thousands of reads from a single tissue type. Conclusion Our data indicate that titration runs on 454 sequencers can characterize thousands of expressed sequence tags which can be used to identify SNPs, gene ontologies, and levels of gene expression in species of conservation concern. We anticipate that rarefaction will be useful in evaluations of gene discovery and that next-generation sequencing technologies hold great potential for the study of other non-model organisms.
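Rarefaction, as used in ecology, estimates how many distinct categories (here, genes) would be expected in a random subsample of n reads. A minimal sketch of the standard analytical formula, E[S_n] = Σ_i [1 − C(N − N_i, n) / C(N, n)], follows. The read counts are invented for illustration and are not data from the study, and the function name is hypothetical.

```python
from math import comb


def rarefied_richness(counts: list[int], n: int) -> float:
    """Expected number of distinct genes in a random subsample of n reads.

    counts[i] is the number of reads assigned to gene i; sampling is
    without replacement from the pooled reads.
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total read count")
    denom = comb(total, n)
    # For each gene, 1 - P(no read of that gene is drawn).
    return sum(1 - comb(total - c, n) / denom for c in counts)


# Hypothetical read counts per gene (100 reads each, 7 genes).
native = [50, 30, 10, 5, 3, 1, 1]          # skewed, like a native library
normalized = [15, 15, 15, 15, 14, 13, 13]  # evened out, like a normalized one

for n in (5, 20, 50, 100):
    print(n, round(rarefied_richness(native, n), 2),
          round(rarefied_richness(normalized, n), 2))
```

Plotting these values against n gives the rarefaction curve; a curve that has flattened suggests additional reads would add few new genes. The invented counts only show how the curve is computed and make no claim about the effect of normalization in the study itself.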
Affiliation(s)
- Matthew C Hale
- Department of Forestry and Natural Resources, Purdue University, West Lafayette, IN 47907, USA.
|
24
|
DSN Depletion is a Simple Method to Remove Selected Transcripts from cDNA Populations. Mol Biotechnol 2009; 41:247-53. [DOI: 10.1007/s12033-008-9131-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 08/25/2008] [Accepted: 11/21/2008] [Indexed: 10/21/2022]
|
25
|
Isolation, characterization and molecular cloning of duplex-specific nuclease from the hepatopancreas of the Kamchatka crab. BMC BIOCHEMISTRY 2008; 9:14. [PMID: 18495036 PMCID: PMC2413221 DOI: 10.1186/1471-2091-9-14] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.9] [Received: 01/16/2008] [Accepted: 05/21/2008] [Indexed: 11/22/2022]
Abstract
Background Nucleases, which are key components of biologically diverse processes such as DNA replication, repair and recombination, antiviral defense, apoptosis and digestion, have revolutionized the field of molecular biology. Indeed many standard molecular strategies, including molecular cloning, studies of DNA-protein interactions, and analysis of nucleic acid structures, would be virtually impossible without these versatile enzymes. The discovery of nucleases with unique properties has often served as the basis for the development of modern molecular biology methods. Thus, the search for novel nucleases with potentially exploitable functions remains an important scientific undertaking. Results Using degenerate primers and the rapid amplification of cDNA ends (RACE) procedure, we cloned the Duplex-Specific Nuclease (DSN) gene from the hepatopancreas of the Kamchatka crab and determined its full primary structure. We also developed an effective method for purifying functional DSN from the crab hepatopancreas. The isolated enzyme was highly thermostable, exhibited a broad pH optimum (5.5 – 7.5) and required divalent cations for activity, with manganese and cobalt being especially effective. The enzyme was highly specific, cleaving double-stranded DNA or DNA in DNA-RNA hybrids, but not single-stranded DNA or single- or double-stranded RNA. Moreover, only DNA duplexes containing at least 9 base pairs were effectively cleaved by DSN; shorter DNA duplexes were left intact. Conclusion We describe a new DSN from Kamchatka crab hepatopancreas, determining its primary structure and developing a preparative method for its purification. We found that DSN had unique substrate specificity, cleaving only DNA duplexes longer than 8 base pairs, or DNA in DNA-RNA hybrids. Interestingly, the DSN primary structure is homologous to the structures of well-known Serratia-like non-specific nucleases, but the properties of DSN are distinct.
The unique substrate specificity of DSN should prove valuable in certain molecular biology applications.
|
26
|
Vera JC, Wheat CW, Fescemyer HW, Frilander MJ, Crawford DL, Hanski I, Marden JH. Rapid transcriptome characterization for a nonmodel organism using 454 pyrosequencing. Mol Ecol 2008; 17:1636-47. [PMID: 18266620 DOI: 10.1111/j.1365-294x.2008.03666.x] [Citation(s) in RCA: 501] [Impact Index Per Article: 29.5] [Indexed: 12/14/2022]
Abstract
We present a de novo assembly of a eukaryote transcriptome using 454 pyrosequencing data. The Glanville fritillary butterfly (Melitaea cinxia; Lepidoptera: Nymphalidae) is a prominent species in population biology but had no previous genomic data. Sequencing runs using two normalized complementary DNA collections from a genetically diverse pool of larvae, pupae, and adults yielded 608,053 expressed sequence tags (mean length = 110 nucleotides), which assembled into 48,354 contigs (sets of overlapping DNA segments) and 59,943 singletons. BLAST comparisons confirmed the accuracy of the sequencing and assembly, and indicated the presence of c. 9000 unique genes, along with > 6000 additional microarray-confirmed unannotated contigs. Average depth of coverage was 6.5-fold for the longest 4800 contigs (348-2849 bp in length), sufficient for detecting large numbers of single nucleotide polymorphisms. Oligonucleotide microarray probes designed from the assembled sequences showed highly repeatable hybridization intensity and revealed biological differences among individuals. We conclude that 454 sequencing, when performed to provide sufficient coverage depth, allows de novo transcriptome assembly and a fast, cost-effective, and reliable method for development of functional genomic tools for nonmodel species. This development narrows the gap between approaches based on model organisms with rich genetic resources vs. species that are most tractable for ecological and evolutionary studies.
Affiliation(s)
- J Cristobal Vera
- Department of Biology, 208 Mueller Laboratory, Pennsylvania State University, University Park, PA 16802, USA.
|
27
|
Bogdanova EA, Shagin DA, Lukyanov SA. Normalization of full-length enriched cDNA. MOLECULAR BIOSYSTEMS 2008; 4:205-12. [PMID: 18437263 DOI: 10.1039/b715110c] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.3] [Indexed: 11/21/2022]
Abstract
Analysis of rare messages in cDNA libraries is extremely difficult due to the substantial variations in the abundance of different transcripts in cells and tissues. Therefore, for rare transcript searches and analyses, the generation of equalized (normalized) cDNA is essential. Several cDNA normalization methods have been developed since 1990. A number of these methods have been optimized for the normalization of full-length enriched cDNA and used in various applications, including transcriptome analysis and functional screening of cDNA libraries. One such procedure (named DSN-normalization) is based on the unique properties of duplex-specific nuclease (DSN) from Kamchatka crab and allows the generation of normalized cDNA libraries with a high gene discovery rate.
Affiliation(s)
- Ekaterina A Bogdanova
- Shemiakin and Ovchinnikov Institute of Bioorganic Chemistry, 16/10 Miklukho-Maklaya, Moscow, Russia
|
28
|
Selection strategy and the design of hybrid oligonucleotide primers for RACE-PCR: cloning a family of toxin-like sequences from Agelena orientalis. BMC Mol Biol 2007; 8:32. [PMID: 17498297 PMCID: PMC1876241 DOI: 10.1186/1471-2199-8-32] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 10/30/2006] [Accepted: 05/11/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The use of specific but partially degenerate primers for nucleic acid hybridisations and PCR amplification of known or unknown gene families was first reported well over a decade ago, and the technique has been used widely since then. RESULTS: Here we report a novel and successful selection strategy for the design of hybrid partially degenerate primers for use with RT-PCR and RACE-PCR for the identification of unknown gene families. The technique (named PaBaLiS) has proven very effective, as it allowed us to identify and clone a large group of mRNAs encoding neurotoxin-like polypeptide pools from the venom of the Agelena orientalis species of spider. Our approach differs radically from the generally accepted CODEHOP principle first reported in 1998. Most importantly, our method has proven very efficient, performing better than an independently generated high-throughput EST cloning programme. Our method yielded nearly 130 non-identical sequences from Agelena orientalis, whilst the EST cloning technique yielded only 48 non-identical sequences from 2100 clones obtained from the same Agelena material. In addition to the primer design approach reported here, which is almost universally applicable to any PCR cloning application, our results also indicate that the venom of the Agelena orientalis spider contains a much larger family of related toxin-like sequences than previously thought. CONCLUSION: With upwards of 100,000 species of spider thought to exist, and a propensity for producing diverse peptide pools, many more peptides of pharmacological importance await discovery. We envisage that some of these peptides and their recombinant derivatives will provide a new range of tools for neuroscience research and could also facilitate the development of a new generation of analgesic drugs and insecticides.
|