1
Kim MH, Cho JS, Tran TNA, Nguyen TTT, Park EJ, Im JH, Han KH, Lee H, Ko JH. Comparative functional analysis of PdeNAC2 and AtVND6 in the tracheary element formation. Tree Physiology 2023:tpad042. [PMID: 37014763 DOI: 10.1093/treephys/tpad042] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/19/2023] [Revised: 03/17/2023] [Indexed: 06/19/2023]
Abstract
Tracheary elements (i.e., vessel elements and tracheids) are highly specialized, non-living cells present in the water-conducting xylem tissue. In angiosperms, proteins in the VASCULAR-RELATED NAC-DOMAIN (VND) subgroup of the NAC transcription factor family (e.g., AtVND6) are required for the differentiation of vessel elements through transcriptional regulation of genes responsible for secondary cell wall (SCW) formation and programmed cell death (PCD). Gymnosperms, however, produce only tracheids, the mechanism of which remains elusive. Here, we report functional characteristics of PdeNAC2, a VND homolog in Pinus densiflora, as a key regulator of tracheid formation. Interestingly, our molecular genetic analyses show that PdeNAC2 can induce the formation of vessel element-like cells in angiosperm plants, demonstrated by transgenic overexpression of either native or NAC domain-swapped synthetic genes of PdeNAC2 and AtVND6 in both Arabidopsis and hybrid poplar. Subsequently, genome-wide identification of direct target genes of PdeNAC2 and AtVND6 revealed 138 and 174 genes as putative direct targets, respectively, but only 17 genes were identified as common direct targets. Further analyses have found that PdeNAC2 does not control some AtVND6-dependent vessel differentiation genes in angiosperm plants, such as AtVRLK1, LBD15/30, and pit-forming ROP signaling genes. Collectively, our results suggest that different target gene repertoires of PdeNAC2 and AtVND6 may contribute to the evolution of tracheary elements.
Affiliation(s)
- Min-Ha Kim
  - Department of Plant & Environmental New Resources, Kyung Hee University, Yongin 17104, Republic of Korea
- Jin-Seong Cho
  - Department of Plant & Environmental New Resources, Kyung Hee University, Yongin 17104, Republic of Korea
- Thi Ngoc Anh Tran
  - Department of Plant & Environmental New Resources, Kyung Hee University, Yongin 17104, Republic of Korea
- Thi Thu Tram Nguyen
  - Department of Plant & Environmental New Resources, Kyung Hee University, Yongin 17104, Republic of Korea
- Eung-Jun Park
  - Forest Bioresources Department, National Institute of Forest Science, Suwon 16631, Republic of Korea
- Jong-Hee Im
  - Department of Horticulture, Michigan State University, East Lansing, MI 48824, USA
  - DOE Great Lakes Bioenergy Research Center, Michigan State University, East Lansing, MI 48824, USA
- Kyung-Hwan Han
  - Department of Horticulture, Michigan State University, East Lansing, MI 48824, USA
  - DOE Great Lakes Bioenergy Research Center, Michigan State University, East Lansing, MI 48824, USA
  - Department of Forestry, Michigan State University, East Lansing, MI 48824, USA
- Hyoshin Lee
  - Forest Bioresources Department, National Institute of Forest Science, Suwon 16631, Republic of Korea
- Jae-Heung Ko
  - Department of Plant & Environmental New Resources, Kyung Hee University, Yongin 17104, Republic of Korea
2
Müller M, Kües U, Budde KB, Gailing O. Applying molecular and genetic methods to trees and their fungal communities. Appl Microbiol Biotechnol 2023; 107:2783-2830. [PMID: 36988668 PMCID: PMC10106355 DOI: 10.1007/s00253-023-12480-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 03/05/2023] [Accepted: 03/07/2023] [Indexed: 03/30/2023]
Abstract
Forests provide invaluable economic, ecological, and social services. At the same time, they are exposed to several threats, such as fragmentation, changing climatic conditions, or increasingly destructive pests and pathogens. Trees, the inherent species of forests, cannot be viewed as isolated organisms. Manifold (micro)organisms are associated with trees playing a pivotal role in forest ecosystems. Of these organisms, fungi may have the greatest impact on the life of trees. A multitude of molecular and genetic methods are now available to investigate tree species and their associated organisms. Due to their smaller genome sizes compared to tree species, whole genomes of different fungi are routinely compared. Such studies have only recently started in forest tree species. Here, we summarize the application of molecular and genetic methods in forest conservation genetics, tree breeding, and association genetics as well as for the investigation of fungal communities and their interrelated ecological functions. These techniques provide valuable insights into the molecular basis of adaptive traits, the impacts of forest management, and changing environmental conditions on tree species and fungal communities and can enhance tree-breeding cycles due to reduced time for field testing. It becomes clear that there are multifaceted interactions among microbial species as well as between these organisms and trees. We demonstrate the versatility of the different approaches based on case studies on trees and fungi. KEY POINTS: • Current knowledge of genetic methods applied to forest trees and associated fungi. • Genomic methods are essential in conservation, breeding, management, and research. • Important role of phytobiomes for trees and their ecosystems.
Affiliation(s)
- Markus Müller
  - Forest Genetics and Forest Tree Breeding, Faculty for Forest Sciences and Forest Ecology, University of Goettingen, Büsgenweg 2, 37077, Göttingen, Germany
  - Center for Integrated Breeding Research (CiBreed), University of Goettingen, 37073, Göttingen, Germany
- Ursula Kües
  - Molecular Wood Biotechnology and Technical Mycology, Faculty for Forest Sciences and Forest Ecology, University of Goettingen, Büsgenweg 2, 37077, Göttingen, Germany
  - Center for Molecular Biosciences (GZMB), Georg-August-University Göttingen, 37077, Göttingen, Germany
  - Center of Sustainable Land Use (CBL), Georg-August-University Göttingen, 37077, Göttingen, Germany
- Katharina B Budde
  - Forest Genetics and Forest Tree Breeding, Faculty for Forest Sciences and Forest Ecology, University of Goettingen, Büsgenweg 2, 37077, Göttingen, Germany
  - Center of Sustainable Land Use (CBL), Georg-August-University Göttingen, 37077, Göttingen, Germany
- Oliver Gailing
  - Forest Genetics and Forest Tree Breeding, Faculty for Forest Sciences and Forest Ecology, University of Goettingen, Büsgenweg 2, 37077, Göttingen, Germany
  - Center for Integrated Breeding Research (CiBreed), University of Goettingen, 37073, Göttingen, Germany
  - Center of Sustainable Land Use (CBL), Georg-August-University Göttingen, 37077, Göttingen, Germany
3
Annotation of Siberian Larch (Larix sibirica Ledeb.) Nuclear Genome—One of the Most Cold-Resistant Tree Species in the Only Deciduous Genus in Pinaceae. Plants 2022; 11:plants11152062. [PMID: 35956540 PMCID: PMC9370799 DOI: 10.3390/plants11152062] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/24/2022] [Revised: 07/22/2022] [Accepted: 07/26/2022] [Indexed: 11/17/2022]
Abstract
The recent release of the nuclear, chloroplast and mitochondrial genome assemblies of Siberian larch (Larix sibirica Ledeb.), one of the most cold-resistant tree species in the only deciduous genus of Pinaceae, with seasonal senescence and a rot-resistant valuable timber widely used in construction, greatly contributed to the development of genomic resources for the larch genus. Here, we present an extensive repeatome analysis and the first annotation of the draft nuclear Siberian larch genome assembly. About 66% of the larch genome consists of highly repetitive elements (REs), with the likely wave of retrotransposons insertions into the larch genome estimated to occur 4–5 MYA. In total, 39,370 gene models were predicted, with 87% of them having homology to the Arabidopsis-annotated proteins and 78% having at least one GO term assignment. The current state of the genome annotations allows for the exploration of the gymnosperm and angiosperm species for relative gene abundance in different functional categories. Comparative analysis of functional gene categories across different angiosperm and gymnosperm species finds that the Siberian larch genome has an overabundance of genes associated with programmed cell death (PCD), autophagy, stress hormone biosynthesis and regulatory pathways; genes that may play important roles in seasonal senescence and stress response to extreme cold in larch. Despite being incomplete, the draft assemblies and annotations of the conifer genomes are at a point of development where they now represent a valuable source for further genomic, genetic and population studies.
4
Comparative Genomics Analysis of Repetitive Elements in Ten Gymnosperm Species: "Dark Repeatome" and Its Abundance in Conifer and Gnetum Species. Life (Basel) 2021; 11:life11111234. [PMID: 34833110 PMCID: PMC8620675 DOI: 10.3390/life11111234] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/24/2021] [Revised: 11/09/2021] [Accepted: 11/09/2021] [Indexed: 11/16/2022]
Abstract
Repetitive elements (RE) and transposons (TE) can comprise up to 80% of some plant genomes and may be essential for regulating their evolution and adaptation. The “repeatome” information is often unavailable in assembled genomes because genomic areas of repeats are challenging to assemble and are often missing from final assembly. However, raw genomic sequencing data contain rich information about RE/TEs. Here, raw genomic NGS reads of 10 gymnosperm species were studied for the content and abundance patterns of their “repeatome”. We utilized a combination of alignment on databases of repetitive elements and de novo assembly of highly repetitive sequences from genomic sequencing reads to characterize and calculate the abundance of known and putative repetitive elements in the genomes of 10 conifer plants: Pinus taeda, Pinus sylvestris, Pinus sibirica, Picea glauca, Picea abies, Abies sibirica, Larix sibirica, Juniperus communis, Taxus baccata, and Gnetum gnemon. We found that genome abundances of known and newly discovered putative repeats are specific to phylogenetically close groups of species and match biological taxa. The grouping of species based on abundances of known repeats closely matches the grouping based on abundances of newly discovered putative repeats (kChains) and matches the known taxonomic relations.
5
Finkers R, van Kaauwen M, Ament K, Burger-Meijer K, Egging R, Huits H, Kodde L, Kroon L, Shigyo M, Sato S, Vosman B, van Workum W, Scholten O. Insights from the first genome assembly of Onion (Allium cepa). G3 (Bethesda, Md.) 2021; 11. [PMID: 34544132 DOI: 10.1101/2021.03.05.434149] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/29/2021] [Accepted: 07/06/2021] [Indexed: 05/18/2023]
Abstract
Onion is an important vegetable crop with an estimated genome size of 16 Gb. We describe the de novo assembly and ab initio annotation of the genome of a doubled haploid onion line DHCU066619, which resulted in a final assembly of 14.9 Gb with an N50 of 464 Kb. Of this, 2.4 Gb was ordered into eight pseudomolecules using four genetic linkage maps. The remainder of the genome is available in 89.6 K scaffolds. Only 72.4% of the genome could be identified as repetitive sequences and consist, to a large extent, of (retro) transposons. In addition, an estimated 20% of the putative (retro) transposons had accumulated a large number of mutations, hampering their identification, but facilitating their assembly. These elements are probably already quite old. The ab initio gene prediction indicated 540,925 putative gene models, which is far more than expected, possibly due to the presence of pseudogenes. Of these models, 47,066 showed RNASeq support. No gene rich regions were found, genes are uniformly distributed over the genome. Analysis of synteny with Allium sativum (garlic) showed collinearity but also major rearrangements between both species. This assembly is the first high-quality genome sequence available for the study of onion and will be a valuable resource for further research.
Affiliation(s)
- Richard Finkers
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Martijn van Kaauwen
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Kai Ament
  - Bejo Zaden B.V., 1749 CZ Warmerhuizen, The Netherlands
- Karin Burger-Meijer
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Henk Huits
  - Bejo Zaden B.V., 1749 CZ Warmerhuizen, The Netherlands
- Linda Kodde
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Laurens Kroon
  - Bejo Zaden B.V., 1749 CZ Warmerhuizen, The Netherlands
- Masayoshi Shigyo
  - Laboratory of Vegetable Crop Science, College of Agriculture, Graduate School of Sciences and Technology for Innovation, Yamaguchi University, Yamaguchi City, Yamaguchi 753-8515, Japan
- Shusei Sato
  - Graduate School of Life Sciences, Tohoku University, Sendai 980-8577, Japan
- Ben Vosman
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Olga Scholten
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
6
Finkers R, van Kaauwen M, Ament K, Burger-Meijer K, Egging R, Huits H, Kodde L, Kroon L, Shigyo M, Sato S, Vosman B, van Workum W, Scholten O. Insights from the first genome assembly of Onion (Allium cepa). G3 (Bethesda, Md.) 2021; 11:jkab243. [PMID: 34544132 PMCID: PMC8496297 DOI: 10.1093/g3journal/jkab243] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 04/29/2021] [Accepted: 07/06/2021] [Indexed: 11/17/2022]
7
Heitkam T, Schulte L, Weber B, Liedtke S, Breitenbach S, Kögler A, Morgenstern K, Brückner M, Tröber U, Wolf H, Krabel D, Schmidt T. Comparative Repeat Profiling of Two Closely Related Conifers (Larix decidua and Larix kaempferi) Reveals High Genome Similarity With Only Few Fast-Evolving Satellite DNAs. Front Genet 2021; 12:683668. [PMID: 34322154 PMCID: PMC8312256 DOI: 10.3389/fgene.2021.683668] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/21/2021] [Accepted: 05/25/2021] [Indexed: 12/26/2022]
Abstract
In eukaryotic genomes, cycles of repeat expansion and removal lead to large-scale genomic changes and propel organisms forward in evolution. However, in conifers, active repeat removal is thought to be limited, leading to expansions of their genomes, mostly exceeding 10 giga base pairs. As a result, conifer genomes are largely littered with fragmented and decayed repeats. Here, we aim to investigate how the repeat landscapes of two related conifers have diverged, given the conifers' accumulative genome evolution mode. For this, we applied low-coverage sequencing and read clustering to the genomes of European and Japanese larch, Larix decidua (Lamb.) Carrière and Larix kaempferi (Mill.), that arose from a common ancestor, but are now geographically isolated. We found that both Larix species harbored largely similar repeat landscapes, especially regarding the transposable element content. To pin down possible genomic changes, we focused on the repeat class with the fastest sequence turnover: satellite DNAs (satDNAs). Using comparative bioinformatics, Southern, and fluorescent in situ hybridization, we reveal the satDNAs' organizational patterns, their abundances, and chromosomal locations. Four out of the five identified satDNAs are widespread in the Larix genus, with two even present in the more distantly related Pseudotsuga and Abies genera. Unexpectedly, the EulaSat3 family was restricted to L. decidua and absent from L. kaempferi, indicating its evolutionarily young age. Taken together, our results exemplify how the accumulative genome evolution of conifers may limit the overall divergence of repeats after speciation, producing only few repeat-induced genomic novelties.
Affiliation(s)
- Tony Heitkam
  - Institute of Botany, Technische Universität Dresden, Dresden, Germany
- Luise Schulte
  - Institute of Botany, Technische Universität Dresden, Dresden, Germany
  - Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Beatrice Weber
  - Institute of Botany, Technische Universität Dresden, Dresden, Germany
- Susan Liedtke
  - Institute of Botany, Technische Universität Dresden, Dresden, Germany
- Sarah Breitenbach
  - Institute of Botany, Technische Universität Dresden, Dresden, Germany
- Anja Kögler
  - Institute of Botany, Technische Universität Dresden, Dresden, Germany
- Kristin Morgenstern
  - Institute of Forest Botany and Forest Zoology, Technische Universität Dresden, Tharandt, Germany
- Ute Tröber
  - Staatsbetrieb Sachsenforst, Pirna, Germany
- Heino Wolf
  - Staatsbetrieb Sachsenforst, Pirna, Germany
- Doris Krabel
  - Institute of Forest Botany and Forest Zoology, Technische Universität Dresden, Tharandt, Germany
- Thomas Schmidt
  - Institute of Botany, Technische Universität Dresden, Dresden, Germany
8
Kim MH, Tran TNA, Cho JS, Park EJ, Lee H, Kim DG, Hwang S, Ko JH. Wood transcriptome analysis of Pinus densiflora identifies genes critical for secondary cell wall formation and NAC transcription factors involved in tracheid formation. Tree Physiology 2021; 41:1289-1305. [PMID: 33440425 DOI: 10.1093/treephys/tpab001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/05/2020] [Accepted: 01/04/2021] [Indexed: 05/27/2023]
Abstract
Although conifers have significant ecological and economic value, information on transcriptional regulation of wood formation in conifers is still limited. Here, to gain insight into secondary cell wall (SCW) biosynthesis and tracheid formation in conifers, we performed wood tissue-specific transcriptome analyses of Pinus densiflora (Korean red pine) using RNA sequencing. In addition, to obtain full-length transcriptome information, PacBio single molecule real-time iso-sequencing was carried out using RNAs from 28 tissues of P. densiflora. Subsequent comparative tissue-specific transcriptome analysis successfully pinpointed critical genes encoding key proteins involved in biosynthesis of the major secondary wall components (cellulose, galactoglucomannan, xylan and lignin). Furthermore, we predicted a total of 62 NAC (NAM, ATAF1/2 and CUC2) family transcription factor members and identified seven PdeNAC genes preferentially expressed in developing xylem tissues in P. densiflora. Protoplast-based transcriptional activation analysis found that four PdeNAC genes, homologous to VND, NST and SND/ANAC075, upregulated GUS activity driven by an SCW-specific cellulose synthase promoter. Consistently, transient overexpression of the four PdeNACs induced xylem vessel cell-like SCW deposition in both tobacco (Nicotiana benthamiana) and Arabidopsis leaves. Taken together, our data provide a foundation for further research to unravel transcriptional regulation of wood formation in conifers, especially SCW formation and tracheid differentiation.
Affiliation(s)
- Min-Ha Kim
  - Department of Plant & Environmental New Resources, Kyung Hee University, 1732 Deogyeong-daero, Yongin 17104, Republic of Korea
- Thi Ngoc Anh Tran
  - Department of Plant & Environmental New Resources, Kyung Hee University, 1732 Deogyeong-daero, Yongin 17104, Republic of Korea
- Jin-Seong Cho
  - Department of Plant & Environmental New Resources, Kyung Hee University, 1732 Deogyeong-daero, Yongin 17104, Republic of Korea
- Eung-Jun Park
  - Division of Forest Biotechnology, National Institute of Forest Science, 39 Onjeong-ro, Suwon 16631, Republic of Korea
- Hyoshin Lee
  - Division of Forest Biotechnology, National Institute of Forest Science, 39 Onjeong-ro, Suwon 16631, Republic of Korea
- Dong-Gwan Kim
  - Department of Bioindustry and Bioresource Engineering, Department of Molecular Biology and Plant Engineering Research Institute, Sejong University, 209 Neungdong-ro, Seoul 05006, Republic of Korea
- Seongbin Hwang
  - Department of Bioindustry and Bioresource Engineering, Department of Molecular Biology and Plant Engineering Research Institute, Sejong University, 209 Neungdong-ro, Seoul 05006, Republic of Korea
- Jae-Heung Ko
  - Department of Plant & Environmental New Resources, Kyung Hee University, 1732 Deogyeong-daero, Yongin 17104, Republic of Korea
9
Ferraro Petrillo U, Palini F, Cattaneo G, Giancarlo R. FASTA/Q data compressors for MapReduce-Hadoop genomics: space and time savings made easy. BMC Bioinformatics 2021; 22:144. [PMID: 33752596 PMCID: PMC7986029 DOI: 10.1186/s12859-021-04063-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/31/2020] [Accepted: 03/04/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Storage of genomic data is a major cost for the Life Sciences, effectively addressed via specialized data compression methods. For the same reasons of abundance in data production, the use of Big Data technologies is seen as the future for genomic data storage and processing, with MapReduce-Hadoop as leaders. Somewhat surprisingly, none of the specialized FASTA/Q compressors is available within Hadoop. Indeed, their deployment there is not exactly immediate. Such a State of the Art is problematic. RESULTS: We provide major advances in two different directions. Methodologically, we propose two general methods, with the corresponding software, that make it very easy to deploy a specialized FASTA/Q compressor within MapReduce-Hadoop for processing files stored on the distributed Hadoop File System, with very little knowledge of Hadoop. Practically, we provide evidence that the deployment of those specialized compressors within Hadoop, not available so far, results in better space savings, and even in better execution times over compressed data, with respect to the use of generic compressors available in Hadoop, in particular for FASTQ files. Finally, we observe that these results also hold for the Apache Spark framework, when used to process FASTA/Q files stored on the Hadoop File System. CONCLUSIONS: Our methods and the corresponding software substantially contribute to achieving space and time savings for the storage and processing of FASTA/Q files in Hadoop and Spark. Since our approach is general, it is very likely that it can also be applied to FASTA/Q compression methods that will appear in the future. AVAILABILITY: The software and the datasets are available at https://github.com/fpalini/fastdoopc.
Affiliation(s)
- Francesco Palini
  - Dipartimento di Scienze Statistiche, Università di Roma - La Sapienza, Rome, Italy
- Giuseppe Cattaneo
  - Dipartimento di Matematica ed Informatica, Università di Palermo, Palermo, Italy
10
Abstract
Transposable elements (TEs) are mobile DNA sequences that propagate within genomes. Through diverse invasion strategies, TEs have come to occupy a substantial fraction of nearly all eukaryotic genomes, and they represent a major source of genetic variation and novelty. Here we review the defining features of each major group of eukaryotic TEs and explore their evolutionary origins and relationships. We discuss how the unique biology of different TEs influences their propagation and distribution within and across genomes. Environmental and genetic factors acting at the level of the host species further modulate the activity, diversification, and fate of TEs, producing the dramatic variation in TE content observed across eukaryotes. We argue that cataloging TE diversity and dissecting the idiosyncratic behavior of individual elements are crucial to expanding our comprehension of their impact on the biology of genomes and the evolution of species.
Affiliation(s)
- Jonathan N Wells
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14850
- Cédric Feschotte
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14850
11
Putintseva YA, Bondar EI, Simonov EP, Sharov VV, Oreshkova NV, Kuzmin DA, Konstantinov YM, Shmakov VN, Belkov VI, Sadovsky MG, Keech O, Krutovsky KV. Siberian larch (Larix sibirica Ledeb.) mitochondrial genome assembled using both short and long nucleotide sequence reads is currently the largest known mitogenome. BMC Genomics 2020; 21:654. [PMID: 32972367 PMCID: PMC7517811 DOI: 10.1186/s12864-020-07061-4] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 07/15/2020] [Accepted: 09/10/2020] [Indexed: 01/01/2023]
Abstract
Background Plant mitochondrial genomes (mitogenomes) can be structurally complex while their size can vary from ~ 222 Kbp in Brassica napus to 11.3 Mbp in Silene conica. To date, in comparison with the number of plant species, only a few plant mitogenomes have been sequenced and released, particularly for conifers (the Pinaceae family). Conifers cover an ancient group of land plants that includes about 600 species, and which are of great ecological and economical value. Among them, Siberian larch (Larix sibirica Ledeb.) represents one of the keystone species in Siberian boreal forests. Yet, despite its importance for evolutionary and population studies, the mitogenome of Siberian larch has not yet been assembled and studied. Results Two sources of DNA sequences were used to search for mitochondrial DNA (mtDNA) sequences: mtDNA enriched samples and nucleotide reads generated in the de novo whole genome sequencing project, respectively. The assembly of the Siberian larch mitogenome contained nine contigs, with the shortest and the largest contigs being 24,767 bp and 4,008,762 bp, respectively. The total size of the genome was estimated at 11.7 Mbp. In total, 40 protein-coding, 34 tRNA, and 3 rRNA genes and numerous repetitive elements (REs) were annotated in this mitogenome. In total, 864 C-to-U RNA editing sites were found for 38 out of 40 protein-coding genes. The immense size of this genome, currently the largest reported, can be partly explained by variable numbers of mobile genetic elements, and introns, but unlikely by plasmid-related sequences. We found few plasmid-like insertions representing only 0.11% of the entire Siberian larch mitogenome. Conclusions Our study showed that the size of the Siberian larch mitogenome is much larger than in other so far studied Gymnosperms, and in the same range as for the annual flowering plant Silene conica (11.3 Mbp). 
Similar to other species, the Siberian larch mitogenome contains relatively few genes, and despite its huge size, the repeated and low complexity regions cover only 14.46% of the mitogenome sequence.
Affiliation(s)
- Yuliya A Putintseva
- Laboratory of Forest Genomics, Genome Research and Education Center, Institute of Fundamental Biology and Biotechnology, Siberian Federal University, Krasnoyarsk, 660036, Russia
| | - Eugeniya I Bondar
- Laboratory of Forest Genomics, Genome Research and Education Center, Institute of Fundamental Biology and Biotechnology, Siberian Federal University, Krasnoyarsk, 660036, Russia.,Laboratory of Genomic Research and Biotechnology, Federal Research Center "Krasnoyarsk Science Center", Siberian Branch, Russian Academy of Sciences, Krasnoyarsk, 660036, Russia
| | - Evgeniy P Simonov
- Institute of Environmental and Agricultural Biology (X-BIO), University of Tyumen, Tyumen, 625003, Russia
| | - Vadim V Sharov
- Laboratory of Forest Genomics, Genome Research and Education Center, Institute of Fundamental Biology and Biotechnology, Siberian Federal University, Krasnoyarsk, 660036, Russia.,Laboratory of Genomic Research and Biotechnology, Federal Research Center "Krasnoyarsk Science Center", Siberian Branch, Russian Academy of Sciences, Krasnoyarsk, 660036, Russia.,Department of High Performance Computing, Institute of Space and Information Technologies, Siberian Federal University, Krasnoyarsk, 660074, Russia
| | - Natalya V Oreshkova
- Laboratory of Forest Genomics, Genome Research and Education Center, Institute of Fundamental Biology and Biotechnology, Siberian Federal University, Krasnoyarsk, 660036, Russia
- Laboratory of Genomic Research and Biotechnology, Federal Research Center "Krasnoyarsk Science Center", Siberian Branch, Russian Academy of Sciences, Krasnoyarsk, 660036, Russia
- Laboratory of Forest Genetics and Selection, V. N. Sukachev Institute of Forest, Siberian Branch, Russian Academy of Sciences, Krasnoyarsk, 660036, Russia
| | - Dmitry A Kuzmin
- Laboratory of Forest Genomics, Genome Research and Education Center, Institute of Fundamental Biology and Biotechnology, Siberian Federal University, Krasnoyarsk, 660036, Russia
- Department of High Performance Computing, Institute of Space and Information Technologies, Siberian Federal University, Krasnoyarsk, 660074, Russia
| | - Yuri M Konstantinov
- Laboratory of Plant Genetic Engineering, Siberian Institute of Plant Physiology and Biochemistry, Siberian Branch, Russian Academy of Sciences, Irkutsk, 664033, Russia
| | - Vladimir N Shmakov
- Laboratory of Plant Genetic Engineering, Siberian Institute of Plant Physiology and Biochemistry, Siberian Branch, Russian Academy of Sciences, Irkutsk, 664033, Russia
| | - Vadim I Belkov
- Laboratory of Plant Genetic Engineering, Siberian Institute of Plant Physiology and Biochemistry, Siberian Branch, Russian Academy of Sciences, Irkutsk, 664033, Russia
| | - Michael G Sadovsky
- Institute of Computational Modeling, Siberian Branch, Russian Academy of Sciences, Krasnoyarsk, 660036, Russia
| | - Olivier Keech
- Department of Plant Physiology, UPSC, Umeå University, S-90187, Umeå, Sweden
| | - Konstantin V Krutovsky
- Laboratory of Forest Genomics, Genome Research and Education Center, Institute of Fundamental Biology and Biotechnology, Siberian Federal University, Krasnoyarsk, 660036, Russia
- Department of Forest Genetics and Forest Tree Breeding, Georg-August University of Göttingen, 37077, Göttingen, Germany
- Center for Integrated Breeding Research, Georg-August University of Göttingen, 37075, Göttingen, Germany
- Laboratory of Population Genetics, N.I. Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, 119333, Russia
- Department of Ecosystem Science and Management, Texas A&M University, College Station, TX, 77843-2138, USA
|
12
|
Akiyoshi N, Nakano Y, Sano R, Kunigita Y, Ohtani M, Demura T. Involvement of VNS NAC-domain transcription factors in tracheid formation in Pinus taeda. TREE PHYSIOLOGY 2020; 40:704-716. [PMID: 31821470 DOI: 10.1093/treephys/tpz106] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/04/2019] [Revised: 08/22/2019] [Accepted: 09/24/2019] [Indexed: 05/19/2023]
Abstract
Vascular plants have two types of water-conducting cells, xylem vessel cells (in angiosperms) and tracheid cells (in ferns and gymnosperms). These cells are commonly characterized by secondary cell wall (SCW) formation and programmed cell death (PCD), which increase the efficiency of water conduction. The differentiation of xylem vessel cells is regulated by a set of NAC (NAM, ATAF1/2 and CUC2) transcription factors, called the VASCULAR-RELATED NAC-DOMAIN (VND) family, in Arabidopsis thaliana Linne. The VNDs regulate the transcriptional induction of genes required for SCW formation and PCD. However, information on the transcriptional regulation of tracheid cell differentiation is still limited. Here, we performed functional analysis of loblolly pine (Pinus taeda Linne) VND homologs (PtaVNS, for VND, NST/SND, SMB-related protein). We identified five PtaVNS genes in the loblolly pine genome, and four of these PtaVNS genes were highly expressed in tissues with tracheid cells, such as shoot apices and developing xylem. Transient overexpression of PtaVNS genes induced xylem vessel cell-like patterning of SCW deposition in tobacco (Nicotiana benthamiana Domin) leaves, and up-regulated the promoter activities of loblolly pine genes homologous to SCW-related MYB transcription factor genes and cellulose synthase genes, as well as to cysteine protease genes for PCD. Collectively, our data indicated that PtaVNS proteins possess transcriptional activity to induce the molecular programs required for tracheid formation, i.e., SCW formation and PCD. Moreover, these findings suggest that the VNS-MYB-based transcriptional network regulating water-conducting cell differentiation in angiosperm and moss plants is conserved in gymnosperms.
Affiliation(s)
- Nobuhiro Akiyoshi
- Graduate School of Science and Technology, Division of Biological Science, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan
| | - Yoshimi Nakano
- Graduate School of Science and Technology, Division of Biological Science, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan
| | - Ryosuke Sano
- Graduate School of Science and Technology, Division of Biological Science, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan
| | - Yusuke Kunigita
- Graduate School of Science and Technology, Division of Biological Science, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan
| | - Misato Ohtani
- Graduate School of Science and Technology, Division of Biological Science, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, 277-8562, Japan
| | - Taku Demura
- Graduate School of Science and Technology, Division of Biological Science, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan
|
13
|
Liu B, Wang JP. Tracheid-associated transcription factors in loblolly pine. TREE PHYSIOLOGY 2020; 40:700-703. [PMID: 32050028 DOI: 10.1093/treephys/tpaa014] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/25/2019] [Revised: 01/02/2020] [Accepted: 01/31/2020] [Indexed: 06/10/2023]
Affiliation(s)
- Baoguang Liu
- Department of Forestry, Beihua University, 3999 East Binjiang Road, Fengman District, Jilin 132013, China
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, 51 Hexing Road, Harbin 150040, China
| | - Jack P Wang
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, 51 Hexing Road, Harbin 150040, China
- Forest Biotechnology Group, Department of Forestry and Environmental Resources, North Carolina State University, 840 Main Campus Drive, Raleigh, NC 27695, USA
|
14
|
Hernandez-Escribano L, Visser EA, Iturritxa E, Raposo R, Naidoo S. The transcriptome of Pinus pinaster under Fusarium circinatum challenge. BMC Genomics 2020; 21:28. [PMID: 31914917 PMCID: PMC6950806 DOI: 10.1186/s12864-019-6444-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2019] [Accepted: 12/30/2019] [Indexed: 01/19/2023] Open
Abstract
BACKGROUND Fusarium circinatum, the causal agent of pitch canker disease, poses a serious threat to several Pinus species, affecting plantations and nurseries. Although Pinus pinaster has shown moderate resistance to F. circinatum, the molecular mechanisms of defense in this host are still unknown. Phytohormones produced by the plant and by the pathogen are known to play a crucial role in determining the outcome of plant-pathogen interactions. Therefore, the aim of this study was to determine the role in F. circinatum virulence of phytohormones that compromise host resistance. RESULTS A high-quality P. pinaster de novo transcriptome assembly was generated, represented by 24,375 sequences, of which 17,593 were full-length genes, and used to determine the expression profiles of both organisms during the infection process at 3, 5 and 10 days post-inoculation using a dual RNA-sequencing approach. The moderate resistance shown by Pinus pinaster at the early time points may be explained by the expression profiles pertaining to early recognition of the pathogen, the induction of pathogenesis-related proteins and the activation of complex phytohormone signaling pathways that involve crosstalk between salicylic acid, jasmonic acid, ethylene and possibly auxins. Moreover, the expression of F. circinatum genes related to hormone biosynthesis suggests manipulation of the host phytohormone balance to its own benefit. CONCLUSIONS We hypothesize three key steps of host manipulation: perturbing ethylene homeostasis by fungal expression of genes related to ethylene biosynthesis, blocking jasmonic acid signaling by coronatine insensitive 1 (COI1) suppression, and preventing salicylic acid biosynthesis from the chorismate pathway by the synthesis of isochorismatase family hydrolase (ICSH) genes. These results warrant further testing in F. circinatum mutants to confirm the mechanism behind perturbing host phytohormone homeostasis.
Affiliation(s)
- Laura Hernandez-Escribano
- Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria, Centro de Investigación Forestal (INIA-CIFOR), Madrid, Spain
- Departamento de Biotecnología-Biología Vegetal, Escuela Técnica Superior de Ingeniería Agronómica, Alimentaria y de Biosistemas, Universidad Politécnica de Madrid, Madrid, Spain
| | - Erik A Visser
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Centre for Bioinformatics and Computational Biology, University of Pretoria, Pretoria, South Africa
| | - Eugenia Iturritxa
- NEIKER, Granja Modelo de Arkaute, Apdo 46, 01080, Vitoria-Gasteiz, Spain
| | - Rosa Raposo
- Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria, Centro de Investigación Forestal (INIA-CIFOR), Madrid, Spain
- Instituto de Gestión Forestal Sostenible (iuFOR), Universidad de Valladolid/INIA, Valladolid, Spain
| | - Sanushka Naidoo
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Centre for Bioinformatics and Computational Biology, University of Pretoria, Pretoria, South Africa.
|
15
|
A Reference Genome Sequence for the European Silver Fir (Abies alba Mill.): A Community-Generated Genomic Resource. G3-GENES GENOMES GENETICS 2019; 9:2039-2049. [PMID: 31217262 PMCID: PMC6643874 DOI: 10.1534/g3.119.400083] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]
Abstract
Silver fir (Abies alba Mill.) is a keystone conifer of European montane forest ecosystems that has experienced large fluctuations in population size during the Quaternary and, more recently, due to land-use change. To forecast the species’ future distribution and survival, it is important to investigate the genetic basis of adaptation to environmental change, notably to extreme events. For this purpose, we here provide a first draft genome assembly and annotation of the silver fir genome, established through a community-based initiative. DNA obtained from haploid megagametophyte and diploid needle tissue was used to construct and sequence Illumina paired-end and mate-pair libraries, respectively, to high depth. The assembled A. alba genome sequence accounted for over 37 million scaffolds corresponding to 18.16 Gb, with a scaffold N50 of 14,051 bp. Despite the fragmented nature of the assembly, a total of 50,757 full-length genes were functionally annotated in the nuclear genome. The chloroplast genome was also assembled into a single scaffold (120,908 bp) that shows a high collinearity with both the A. koreana and A. sibirica complete chloroplast genomes. This first genome assembly of silver fir is an important genomic resource that is now publicly available in support of a new generation of research. By genome-enabling this important conifer, this resource will open the gate for new research and more precise genetic monitoring of European silver fir forests.
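For readers unfamiliar with the scaffold N50 statistic quoted in the abstract above, the following is a minimal sketch of how such a value is commonly computed; it is not the authors' pipeline. The function name `n50` and the toy scaffold lengths are hypothetical and chosen only for illustration. N50 is taken here as the length of the scaffold at which the running total, counted from the longest scaffold downward, first reaches half of the total assembly length.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    Scaffolds are sorted from longest to shortest; N50 is the length of
    the scaffold at which the cumulative sum first reaches at least half
    of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (not data from the cited study).
toy_assembly = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_assembly))  # prints 70: 80 + 70 = 150 reaches half of 300
```

A small N50 relative to the total assembly size, as in the abstract (14,051 bp against 18.16 Gb), indicates that the sequence is spread over many short scaffolds.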
|
16
|
Liu Y, El-Kassaby YA. Novel Insights into Plant Genome Evolution and Adaptation as Revealed through Transposable Elements and Non-Coding RNAs in Conifers. Genes (Basel) 2019; 10:genes10030228. [PMID: 30889931 PMCID: PMC6470726 DOI: 10.3390/genes10030228] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2019] [Revised: 03/08/2019] [Accepted: 03/11/2019] [Indexed: 01/03/2023] Open
Abstract
Plant genomes are punctuated by repeated bouts of proliferation of transposable elements (TEs), and these mobile bursts are followed by silencing and decay of most of the newly inserted elements. As such, plant genomes reflect TE-related genome expansion and shrinkage. In general, these genome activities involve two mechanisms: small RNA-mediated epigenetic repression and long-term mutational decay and deletion, that is, genome-purging. Furthermore, the spatial relationships between TE insertions and genes are an important force in shaping gene regulatory networks, their downstream metabolic and physiological outputs, and thus their phenotypes. Such cascading regulations finally set up a fitness differential among individuals. This brief review demonstrates factual evidence that unifies most updated conceptual frameworks covering genome size, architecture, epigenetic reprogramming, and gene expression. It aims to give an overview of the impact that TEs may have on genome and adaptive evolution and to provide novel insights into addressing possible causes and consequences of intimidating genome sizes (20–30 Gb) in a taxonomic group, conifers.
Affiliation(s)
- Yang Liu
- Department of Forest and Conservation Sciences, The University of British Columbia, 2424 Main Mall, Vancouver, BC V6T 1Z4, Canada.
| | - Yousry A El-Kassaby
- Department of Forest and Conservation Sciences, The University of British Columbia, 2424 Main Mall, Vancouver, BC V6T 1Z4, Canada.
|
17
|
Perera D, Magbanua ZV, Thummasuwan S, Mukherjee D, Arick M, Chouvarine P, Nairn CJ, Schmutz J, Grimwood J, Dean JFD, Peterson DG. Exploring the loblolly pine (Pinus taeda L.) genome by BAC sequencing and Cot analysis. Gene 2018; 663:165-177. [PMID: 29655895 DOI: 10.1016/j.gene.2018.04.024] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2017] [Revised: 03/20/2018] [Accepted: 04/10/2018] [Indexed: 02/06/2023]
Abstract
Loblolly pine (LP; Pinus taeda L.) is an economically and ecologically important tree in the southeastern U.S. To advance understanding of the LP genome, we sequenced and analyzed 100 BAC clones and performed a Cot analysis. The Cot analysis indicates that the genome is composed of 57, 24, and 10% highly-repetitive, moderately-repetitive, and single/low-copy sequences, respectively (the remaining 9% of the genome is a combination of fold back and damaged DNA). Although single/low-copy DNA only accounts for 10% of the LP genome, the amount of single/low-copy DNA in LP is still 14 times the size of the Arabidopsis genome. Since gene numbers in LP are similar to those in Arabidopsis, much of the single/low-copy DNA of LP would appear to be composed of DNA that is both gene- and repeat-poor. Macroarrays prepared from a LP bacterial artificial chromosome (BAC) library were hybridized with probes designed from cell wall synthesis/wood development cDNAs, and 50 of the "targeted" clones were selected for further analysis. An additional 25 clones were selected because they contained few repeats, while 25 more clones were selected at random. The 100 BAC clones were Sanger sequenced and assembled. Of the targeted BACs, 80% contained all or part of the cDNA used to target them. One targeted BAC was found to contain fungal DNA and was eliminated from further analysis. Combinations of similarity-based and ab initio gene prediction approaches were utilized to identify and characterize potential coding regions in the 99 BACs containing LP DNA. From this analysis, we identified 154 gene models (GMs) representing both putative protein-coding genes and likely pseudogenes. Ten of the GMs (all of which were specifically targeted) had enough support to be classified as intact genes. 
Interestingly, the 154 GMs had statistically indistinguishable (α = 0.05) distributions in the targeted and random BAC clones (15.18 and 12.61 GM/Mb, respectively), whereas the low-repeat BACs contained significantly fewer GMs (7.08 GM/Mb). However, when GM length was considered, the targeted BACs had a significantly greater percentage of their length in GMs (3.26%) when compared to random (1.63%) and low-repeat (0.62%) BACs. The results of our study provide insight into LP evolution and inform ongoing efforts to produce a reference genome sequence for LP, while characterization of genes involved in cell wall production highlights carbon metabolism pathways that can be leveraged for increasing wood production.
Affiliation(s)
- Dinum Perera
- Institute for Genomics, Biocomputing & Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA
| | - Zenaida V Magbanua
- National Institute of Molecular Biology & Biotechnology, National Science Complex, College of Science, University of the Philippines, Diliman, Quezon City, Philippines
| | - Supaphan Thummasuwan
- Department of Agricultural Sciences, Naresuan University, Phitsanulok, Thailand.
| | - Dipaloke Mukherjee
- Department of Food Science, Nutrition, & Health Promotion, Mississippi State University, Mississippi State, MS 39762, USA.
| | - Mark Arick
- Institute for Genomics, Biocomputing & Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA.
| | - Philippe Chouvarine
- Texas Children's Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Campbell J Nairn
- Warnell School of Forest Resources, University of Georgia, Athens, GA 30602, USA.
| | - Jeremy Schmutz
- US Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA; HudsonAlpha Institute for Biotechnology, 601 Genome Way, Huntsville, AL 35801, USA.
| | - Jane Grimwood
- US Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA; HudsonAlpha Institute for Biotechnology, 601 Genome Way, Huntsville, AL 35801, USA.
| | - Jeffrey F D Dean
- Department of Biochemistry, Molecular Biology, Entomology & Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA.
| | - Daniel G Peterson
- Institute for Genomics, Biocomputing & Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA; Department of Plant & Soil Sciences, Mississippi State University, Mississippi State, MS 39762, USA.
|
18
|
Complete chloroplast genome sequence and comparative analysis of loblolly pine (Pinus taeda L.) with related species. PLoS One 2018; 13:e0192966. [PMID: 29596414 PMCID: PMC5875761 DOI: 10.1371/journal.pone.0192966] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2017] [Accepted: 02/01/2018] [Indexed: 12/14/2022] Open
Abstract
Pinaceae, the largest family of conifers, has a diversified organization of chloroplast (cp) genomes with two typical highly reduced inverted repeats (IRs). In the current study, we determined the complete sequence of the cp genome of an economically and ecologically important conifer tree, the loblolly pine (Pinus taeda L.), using Illumina paired-end sequencing and compared the sequence with those of other pine species. The results revealed a genome size of 121,531 base pairs (bp) containing a pair of 830-bp IR regions, distinguished by a small single copy (42,258 bp) and large single copy (77,614 bp) region. The chloroplast genome of P. taeda encodes 120 genes, comprising 81 protein-coding genes, four ribosomal RNA genes, and 35 tRNA genes, with 151 randomly distributed microsatellites. Approximately 6 palindromic, 34 forward, and 22 tandem repeats were found in the P. taeda cp genome. Whole cp genome comparison with those of other Pinus species exhibited an overall high degree of sequence similarity, with some divergence in intergenic spacers. Higher and lower numbers of indels and single-nucleotide polymorphism substitutions were observed relative to P. contorta and P. monophylla, respectively. Phylogenomic analyses based on the complete genome sequence revealed that 60 shared genes generated trees with the same topologies, and P. taeda was closely related to P. contorta in the subgenus Pinus. Thus, the complete P. taeda genome provided valuable resources for population and evolutionary studies of gymnosperms and can be used to identify related species.
|
19
|
Fox H, Doron-Faigenboim A, Kelly G, Bourstein R, Attia Z, Zhou J, Moshe Y, Moshelion M, David-Schwartz R. Transcriptome analysis of Pinus halepensis under drought stress and during recovery. TREE PHYSIOLOGY 2018; 38:423-441. [PMID: 29177514 PMCID: PMC5982726 DOI: 10.1093/treephys/tpx137] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/10/2017] [Revised: 08/24/2017] [Accepted: 10/12/2017] [Indexed: 05/09/2023]
Abstract
Forest trees use various strategies to cope with drought stress and these strategies involve complex molecular mechanisms. Pinus halepensis Miller (Aleppo pine) is found throughout the Mediterranean basin and is one of the most drought-tolerant pine species. In order to decipher the molecular mechanisms that P. halepensis uses to withstand drought, we performed large-scale physiological and transcriptome analyses. We selected a mature tree from a semi-arid area with suboptimal growth conditions for clonal propagation through cuttings. We then used a high-throughput experimental system to continuously monitor whole-plant transpiration rates, stomatal conductance and the vapor pressure deficit. The transcriptomes of plants were examined at six physiological stages: pre-stomatal response, partial stomatal closure, minimum transpiration, post-irrigation, partial recovery and full recovery. At each stage, data from plants exposed to the drought treatment were compared with data collected from well-irrigated control plants. A drought-stressed P. halepensis transcriptome was created using paired-end RNA-seq. In total, ~6000 differentially expressed, non-redundant transcripts were identified between drought-treated and control trees. Cluster analysis has revealed stress-induced down-regulation of transcripts related to photosynthesis, reactive oxygen species (ROS)-scavenging through the ascorbic acid (AsA)-glutathione cycle, fatty acid and cell wall biosynthesis, stomatal activity, and the biosynthesis of flavonoids and terpenoids. Up-regulated processes included chlorophyll degradation, ROS-scavenging through AsA-independent thiol-mediated pathways, abscisic acid response and accumulation of heat shock proteins, thaumatin and exordium. Recovery from drought induced strong transcription of retrotransposons, especially the retrovirus-related transposon Tnt1-94. 
The drought-related transcriptome illustrates this species' dynamic response to drought and recovery and unravels novel mechanisms.
Affiliation(s)
- Hagar Fox
- Institute of Plant Sciences, Volcani Center, ARO, Bet Dagan 50250, Israel
- Institute of Plant Sciences and Genetics in Agriculture, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot 76100, Israel
| | | | - Gilor Kelly
- Institute of Plant Sciences, Volcani Center, ARO, Bet Dagan 50250, Israel
| | - Ronny Bourstein
- Institute of Plant Sciences and Genetics in Agriculture, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot 76100, Israel
| | - Ziv Attia
- Institute of Plant Sciences and Genetics in Agriculture, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot 76100, Israel
| | - Jing Zhou
- Institute of Plant Sciences, Volcani Center, ARO, Bet Dagan 50250, Israel
| | - Yosef Moshe
- Institute of Plant Sciences, Volcani Center, ARO, Bet Dagan 50250, Israel
| | - Menachem Moshelion
- Institute of Plant Sciences and Genetics in Agriculture, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot 76100, Israel
|
20
|
Pellicer J, Hidalgo O, Dodsworth S, Leitch IJ. Genome Size Diversity and Its Impact on the Evolution of Land Plants. Genes (Basel) 2018; 9:E88. [PMID: 29443885 PMCID: PMC5852584 DOI: 10.3390/genes9020088] [Citation(s) in RCA: 150] [Impact Index Per Article: 25.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2018] [Revised: 02/02/2018] [Accepted: 02/05/2018] [Indexed: 01/09/2023] Open
Abstract
Genome size is a biodiversity trait that shows staggering diversity across eukaryotes, varying over 64,000-fold. Of all major taxonomic groups, land plants stand out due to their staggering genome size diversity, ranging ca. 2400-fold. As our understanding of the implications and significance of this remarkable genome size diversity in land plants grows, it is becoming increasingly evident that this trait plays not only an important role in shaping the evolution of plant genomes, but also in influencing plant community assemblages at the ecosystem level. Recent advances and improvements in novel sequencing technologies, as well as analytical tools, make it possible to gain critical insights into the genomic and epigenetic mechanisms underpinning genome size changes. In this review we provide an overview of our current understanding of genome size diversity across the different land plant groups, its implications on the biology of the genome and what future directions need to be addressed to fill key knowledge gaps.
Affiliation(s)
- Jaume Pellicer
- Department of Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew TW9 3DS, UK.
| | - Oriane Hidalgo
- Department of Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew TW9 3DS, UK.
| | - Steven Dodsworth
- Department of Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew TW9 3DS, UK.
| | - Ilia J Leitch
- Department of Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew TW9 3DS, UK.
|
21
|
Yi F, Ling J, Xiao Y, Zhang H, Ouyang F, Wang J. ConTEdb: a comprehensive database of transposable elements in conifers. Database (Oxford) 2018; 2018:5255192. [PMID: 30576494 PMCID: PMC6301336 DOI: 10.1093/database/bay131] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2018] [Revised: 10/24/2018] [Accepted: 11/26/2018] [Indexed: 11/14/2022]
Abstract
Conifers are the largest and most ubiquitous group of gymnosperms and are of great ecological and economic importance. However, their huge and complex genomes have hindered sequencing and mining efforts. In this study, we identified 413 423 transposable elements (TEs) from Picea abies, Picea glauca and Pinus taeda using a combination of multiple approaches and classified them into 11 133 families. A comprehensive web-based database, ConTEdb, was constructed and made available to researchers. ConTEdb enables users to browse, retrieve and download the TE sequences from the database. Several analysis tools are integrated into ConTEdb to help users mine the TE data easily and effectively. In summary, ConTEdb provides a platform to study TE biology and functional genomics in conifers.
Affiliation(s)
- Fei Yi
- State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, China
- College of Biological and Pharmaceutical Sciences, Three Gorges University, Yichang, China
| | - Juanjuan Ling
- State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, China
| | - Yao Xiao
- State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, China
| | - Hanguo Zhang
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin, China
| | - Fangqun Ouyang
- State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, China
| | - Junhui Wang
- State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, China
|
22
|
|
23
|
Guan R, Zhao Y, Zhang H, Fan G, Liu X, Zhou W, Shi C, Wang J, Liu W, Liang X, Fu Y, Ma K, Zhao L, Zhang F, Lu Z, Lee SMY, Xu X, Wang J, Yang H, Fu C, Ge S, Chen W. Draft genome of the living fossil Ginkgo biloba. Gigascience 2016. [PMID: 27871309 DOI: 10.1186/s13742-016-0154-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/08/2023] Open
Abstract
BACKGROUND Ginkgo biloba L. (Ginkgoaceae) is one of the most distinctive plants. It possesses a suite of fascinating characteristics including a large genome, outstanding resistance/tolerance to abiotic and biotic stresses, and dioecious reproduction, making it an ideal model species for biological studies. However, the lack of a high-quality genome sequence has been an impediment to our understanding of its biology and evolution. FINDINGS The 10.61 Gb genome sequence containing 41,840 annotated genes was assembled in the present study. Repetitive sequences account for 76.58% of the assembled sequence, and long terminal repeat retrotransposons (LTR-RTs) are particularly prevalent. The diversity and abundance of LTR-RTs is due to their gradual accumulation and a remarkable amplification between 16 and 24 million years ago, and they contribute to the long introns and large genome. Whole genome duplication (WGD) may have occurred twice, with an ancient WGD consistent with that shown to occur in other seed plants, and a more recent event specific to ginkgo. Abundant gene clusters from tandem duplication were also evident, and enrichment of expanded gene families indicates a remarkable array of chemical and antibacterial defense pathways. CONCLUSIONS The ginkgo genome consists mainly of LTR-RTs resulting from ancient gradual accumulation and two WGD events. The multiple defense mechanisms underlying the characteristic resilience of ginkgo are fostered by a remarkable enrichment in ancient duplicated and ginkgo-specific gene clusters. The present study sheds light on sequencing large genomes, and opens an avenue for further genetic and evolutionary research.
Affiliation(s)
- Rui Guan
  - BGI-Shenzhen, Shenzhen, 518083, China
  - BGI-Qingdao, Qingdao, 266555, China
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096, China
- Yunpeng Zhao
  - The Key Laboratory of Conservation Biology for Endangered Wildlife of the Ministry of Education, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
  - Laboratory of Systematic & Evolutionary Botany and Biodiversity, Institute of Ecology and Conservation Center for Gene Resources of Endangered Wildlife, Zhejiang University, Hangzhou, 310058, China
- He Zhang
  - BGI-Shenzhen, Shenzhen, 518083, China
  - BGI-Qingdao, Qingdao, 266555, China
  - Stanley Ho Centre for Emerging Infectious Diseases, Faculty of Medicine, The Chinese University of Hong Kong, Shatin, Hong Kong
- Guangyi Fan
  - BGI-Shenzhen, Shenzhen, 518083, China
  - BGI-Qingdao, Qingdao, 266555, China
  - State Key Laboratory of Quality Research in Chinese Medicine and Institute of Chinese Medical Sciences, Macao, China
- Xin Liu
  - BGI-Shenzhen, Shenzhen, 518083, China
- Wenbin Zhou
  - The Key Laboratory of Conservation Biology for Endangered Wildlife of the Ministry of Education, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
  - Laboratory of Systematic & Evolutionary Botany and Biodiversity, Institute of Ecology and Conservation Center for Gene Resources of Endangered Wildlife, Zhejiang University, Hangzhou, 310058, China
- Weiqing Liu
  - BGI-Wuhan, BGI-Shenzhen, Wuhan, 430074, China
- Yuanyuan Fu
  - BGI-Shenzhen, Shenzhen, 518083, China
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096, China
- Lijun Zhao
  - The Key Laboratory of Conservation Biology for Endangered Wildlife of the Ministry of Education, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
  - Laboratory of Systematic & Evolutionary Botany and Biodiversity, Institute of Ecology and Conservation Center for Gene Resources of Endangered Wildlife, Zhejiang University, Hangzhou, 310058, China
- Fumin Zhang
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- Zuhong Lu
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096, China
- Simon Ming-Yuen Lee
  - State Key Laboratory of Quality Research in Chinese Medicine and Institute of Chinese Medical Sciences, Macao, China
- Xun Xu
  - BGI-Shenzhen, Shenzhen, 518083, China
- Jian Wang
  - BGI-Shenzhen, Shenzhen, 518083, China
  - James D. Watson Institute of Genome Sciences, Hangzhou, 310058, China
- Huanming Yang
  - BGI-Shenzhen, Shenzhen, 518083, China
  - James D. Watson Institute of Genome Sciences, Hangzhou, 310058, China
- Chengxin Fu
  - The Key Laboratory of Conservation Biology for Endangered Wildlife of the Ministry of Education, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
  - Laboratory of Systematic & Evolutionary Botany and Biodiversity, Institute of Ecology and Conservation Center for Gene Resources of Endangered Wildlife, Zhejiang University, Hangzhou, 310058, China
- Song Ge
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- Wenbin Chen
  - BGI-Shenzhen, Shenzhen, 518083, China
  - BGI-Qingdao, Qingdao, 266555, China
|
24
|
Guan R, Zhao Y, Zhang H, Fan G, Liu X, Zhou W, Shi C, Wang J, Liu W, Liang X, Fu Y, Ma K, Zhao L, Zhang F, Lu Z, Lee SMY, Xu X, Wang J, Yang H, Fu C, Ge S, Chen W. Draft genome of the living fossil Ginkgo biloba. Gigascience 2016; 5:49. [PMID: 27871309 PMCID: PMC5118899 DOI: 10.1186/s13742-016-0154-1] [Citation(s) in RCA: 153] [Impact Index Per Article: 19.1] [Received: 08/17/2016] [Accepted: 11/01/2016] [Indexed: 11/10/2022] Open
|
25
|
Abstract
Until very recently, complete characterization of the megagenomes of conifers has remained elusive. The diploid genome of sugar pine (Pinus lambertiana Dougl.) has a highly repetitive, 31 billion bp genome. It is the largest genome sequenced and assembled to date, and the first from the subgenus Strobus, or white pines, a group that is notable for having the largest genomes among the pines. The genome represents a unique opportunity to investigate genome "obesity" in conifers and white pines. Comparative analysis of P. lambertiana and P. taeda L. reveals new insights on the conservation, age, and diversity of the highly abundant transposable elements, the primary factor determining genome size. Like most North American white pines, the principal pathogen of P. lambertiana is white pine blister rust (Cronartium ribicola J.C. Fischer ex Raben.). Identification of candidate genes for resistance to this pathogen is of great ecological importance. The genome sequence afforded us the opportunity to make substantial progress on locating the major dominant gene for simple resistance hypersensitive response, Cr1. We describe new markers and gene annotation that are both tightly linked to Cr1 in a mapping population, and associated with Cr1 in unrelated sugar pine individuals sampled throughout the species' range, creating a solid foundation for future mapping. This genomic variation and annotated candidate genes characterized in our study of the Cr1 region are resources for future marker-assisted breeding efforts as well as for investigations of fundamental mechanisms of invasive disease and evolutionary response.
|
26
|
Lin X, Faridi N, Casola C. An Ancient Transkingdom Horizontal Transfer of Penelope-Like Retroelements from Arthropods to Conifers. Genome Biol Evol 2016; 8:1252-66. [PMID: 27190138 PMCID: PMC4860704 DOI: 10.1093/gbe/evw076] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 12/26/2022] Open
Abstract
Comparative genomics analyses empowered by the wealth of sequenced genomes have revealed numerous instances of horizontal DNA transfers between distantly related species. In eukaryotes, repetitive DNA sequences known as transposable elements (TEs) are especially prone to move across species boundaries. Such horizontal transposon transfers, or HTTs, are relatively common within major eukaryotic kingdoms, including animals, plants, and fungi, while rarely occurring across these kingdoms. Here, we describe the first case of HTT from animals to plants, involving TEs known as Penelope-like elements, or PLEs, a group of retrotransposons closely related to eukaryotic telomerases. Using a combination of in situ hybridization on chromosomes, polymerase chain reaction experiments, and computational analyses we show that the predominant PLE lineage, EN(+)PLEs, is highly diversified in loblolly pine and other conifers, but appears to be absent in other gymnosperms. Phylogenetic analyses of both protein and DNA sequences reveal that conifers EN(+)PLEs, or Dryads, form a monophyletic group clustering within a clade of primarily arthropod elements. Additionally, no EN(+)PLEs were detected in 1,928 genome assemblies from 1,029 nonmetazoan and nonconifer genomes from 14 major eukaryotic lineages. These findings indicate that Dryads emerged following an ancient horizontal transfer of EN(+)PLEs from arthropods to a common ancestor of conifers approximately 340 Ma. This represents one of the oldest known interspecific transmissions of TEs, and the most conspicuous case of DNA transfer between animals and plants.
Affiliation(s)
- Xuan Lin
  - Department of Ecosystem Science and Management, Texas A&M University
- Nurul Faridi
  - Department of Ecosystem Science and Management, Texas A&M University
  - Southern Institute of Forest Genetics, USDA Forest Service Southern Research Station, Saucier, Mississippi
- Claudio Casola
  - Department of Ecosystem Science and Management, Texas A&M University
|
27
|
Seoane-Zonjic P, Cañas RA, Bautista R, Gómez-Maldonado J, Arrillaga I, Fernández-Pozo N, Claros MG, Cánovas FM, Ávila C. Establishing gene models from the Pinus pinaster genome using gene capture and BAC sequencing. BMC Genomics 2016; 17:148. [PMID: 26922242 PMCID: PMC4769843 DOI: 10.1186/s12864-016-2490-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 10/28/2015] [Accepted: 02/17/2016] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND: In the era of DNA throughput sequencing, assembling and understanding gymnosperm mega-genomes remains a challenge. Although drafts of three conifer genomes have recently been published, this number is too low to understand the full complexity of conifer genomes. Using techniques focused on specific genes, gene models can be established that can aid in the assembly of gene-rich regions, and this information can be used to compare genomes and understand functional evolution.
RESULTS: In this study, gene capture technology combined with BAC isolation and sequencing was used as an experimental approach to establish de novo gene structures without a reference genome. Probes were designed for 866 maritime pine transcripts to sequence genes captured from genomic DNA. The gene models were constructed using GeneAssembler, a new bioinformatic pipeline, which reconstructed over 82% of the gene structures, and a high proportion (85%) of the captured gene models contained sequences from the promoter regulatory region. In a parallel experiment, the P. pinaster BAC library was screened to isolate clones containing genes whose cDNA sequence were already available. BAC clones containing the asparagine synthetase, sucrose synthase and xyloglucan endotransglycosylase gene sequences were isolated and used in this study. The gene models derived from the gene capture approach were compared with the genomic sequences derived from the BAC clones. This combined approach is a particularly efficient way to capture the genomic structures of gene families with a small number of members.
CONCLUSIONS: The experimental approach used in this study is a valuable combined technique to study genomic gene structures in species for which a reference genome is unavailable. It can be used to establish exon/intron boundaries in unknown gene structures, to reconstruct incomplete genes and to obtain promoter sequences that can be used for transcriptional studies. A bioinformatics algorithm (GeneAssembler) is also provided as a Ruby gem for this class of analyses.
Affiliation(s)
- Pedro Seoane-Zonjic
  - Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Campus de Teatinos s/n, E-29071, Málaga, Spain.
- Rafael A Cañas
  - Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Campus de Teatinos s/n, E-29071, Málaga, Spain.
- Rocío Bautista
  - Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Campus de Teatinos s/n, E-29071, Málaga, Spain.
- Josefa Gómez-Maldonado
  - Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Campus de Teatinos s/n, E-29071, Málaga, Spain.
- Isabel Arrillaga
  - Departamento de Biología Vegetal, Facultad de Farmacia, ERI Biotecmed, Universidad de Valencia, Avda. Vicent Andrés Estellés s/n, 46100, Burjassot, Valencia, Spain.
- Noé Fernández-Pozo
  - Boyce Thompson Institute for Plant Research, Cornell University, Ithaca, NY, 14853, USA.
- M Gonzalo Claros
  - Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Campus de Teatinos s/n, E-29071, Málaga, Spain.
- Francisco M Cánovas
  - Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Campus de Teatinos s/n, E-29071, Málaga, Spain.
- Concepción Ávila
  - Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Campus de Teatinos s/n, E-29071, Málaga, Spain.
|
28
|
Warren RL, Keeling CI, Yuen MMS, Raymond A, Taylor GA, Vandervalk BP, Mohamadi H, Paulino D, Chiu R, Jackman SD, Robertson G, Yang C, Boyle B, Hoffmann M, Weigel D, Nelson DR, Ritland C, Isabel N, Jaquish B, Yanchuk A, Bousquet J, Jones SJM, MacKay J, Birol I, Bohlmann J. Improved white spruce (Picea glauca) genome assemblies and annotation of large gene families of conifer terpenoid and phenolic defense metabolism. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2015; 83:189-212. [PMID: 26017574 DOI: 10.1111/tpj.12886] [Citation(s) in RCA: 122] [Impact Index Per Article: 13.6] [Received: 02/24/2015] [Accepted: 05/15/2015] [Indexed: 05/21/2023]
Abstract
White spruce (Picea glauca), a gymnosperm tree, has been established as one of the models for conifer genomics. We describe the draft genome assemblies of two white spruce genotypes, PG29 and WS77111, innovative tools for the assembly of very large genomes, and the conifer genomics resources developed in this process. The two white spruce genotypes originate from distant geographic regions of western (PG29) and eastern (WS77111) North America, and represent elite trees in two Canadian tree-breeding programs. We present an update (V3 and V4) for a previously reported PG29 V2 draft genome assembly and introduce a second white spruce genome assembly for genotype WS77111. Assemblies of the PG29 and WS77111 genomes confirm the reconstructed white spruce genome size in the 20 Gbp range, and show broad synteny. Using the PG29 V3 assembly and additional white spruce genomics and transcriptomics resources, we performed MAKER-P annotation and meticulous expert annotation of very large gene families of conifer defense metabolism, the terpene synthases and cytochrome P450s. We also comprehensively annotated the white spruce mevalonate, methylerythritol phosphate and phenylpropanoid pathways. These analyses highlighted the large extent of gene and pseudogene duplications in a conifer genome, in particular for genes of secondary (i.e. specialized) metabolism, and the potential for gain and loss of function for defense and adaptation.
Affiliation(s)
- René L Warren
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
- Christopher I Keeling
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Macaire Man Saint Yuen
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Anthony Raymond
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
- Greg A Taylor
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
- Benjamin P Vandervalk
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
- Hamid Mohamadi
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
- Daniel Paulino
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
- Readman Chiu
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
- Shaun D Jackman
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
- Gordon Robertson
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
- Chen Yang
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
- Brian Boyle
  - Department of Wood and Forest Sciences, Université Laval, Québec, QC, G1V 0A6, Canada
- Margarete Hoffmann
  - Max Planck Institute for Developmental Biology, Spemannstrasse 35, 72076, Tübingen, Germany
- Detlef Weigel
  - Max Planck Institute for Developmental Biology, Spemannstrasse 35, 72076, Tübingen, Germany
- David R Nelson
  - Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Science Center, Memphis, TN, 38163, USA
- Carol Ritland
  - Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Nathalie Isabel
  - Natural Resources Canada, Laurentian Forestry Centre, Québec, QC, G1V 4C7, Canada
- Barry Jaquish
  - British Columbia Ministry of Forests, Lands, and Natural Resource Operations, Victoria, BC, V8W 9C2, Canada
- Alvin Yanchuk
  - British Columbia Ministry of Forests, Lands, and Natural Resource Operations, Victoria, BC, V8W 9C2, Canada
- Jean Bousquet
  - Department of Wood and Forest Sciences, Université Laval, Québec, QC, G1V 0A6, Canada
- Steven J M Jones
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
  - School of Computing Science, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
- John MacKay
  - Department of Wood and Forest Sciences, Université Laval, Québec, QC, G1V 0A6, Canada
  - Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, OX1 3RB, UK
- Inanc Birol
  - Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
  - School of Computing Science, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
- Joerg Bohlmann
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
  - Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
  - Department of Botany, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
|
29
|
Zuccolo A, Scofield DG, De Paoli E, Morgante M. The Ty1-copia LTR retroelement family PARTC is highly conserved in conifers over 200 MY of evolution. Gene 2015; 568:89-99. [PMID: 25982862 DOI: 10.1016/j.gene.2015.05.028] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 01/08/2015] [Revised: 04/06/2015] [Accepted: 05/11/2015] [Indexed: 11/26/2022]
Abstract
Long Terminal Repeat retroelements (LTR-RTs) are a major component of many plant genomes. Although well studied and described in angiosperms, their features and dynamics are poorly understood in gymnosperms. Representative complete copies of a Ty1-copia element isolate in Picea abies and named PARTC were identified in six other conifer species (Picea glauca, Pinus sylvestris, Pinus taeda, Abies sibirica, Taxus baccata and Juniperus communis) covering more than 200 million years of evolution. Here we characterized the structure of this element, assessed its abundance across conifers, studied the modes and timing of its amplification, and evaluated the degree of conservation of its extant copies at nucleotide level over distant species. We demonstrated that the element is ancient, abundant, widespread and its paralogous copies are present in the genera Picea, Pinus and Abies as an LTR-RT family. The amplification leading to the extant copies of PARTC occurred over long evolutionary times spanning 10s of MY and mostly took place after the speciation of the conifers analyzed. The level of conservation of PARTC is striking and may be explained by low substitution rates and limited removal mechanisms for LTR-RTs. These PARTC features and dynamics are representative of a more general scenario for LTR-RTs in gymnosperms quite different from that characterizing the vast majority of LTR-RT elements in angiosperms.
Affiliation(s)
- Andrea Zuccolo
  - Institute of Life Sciences, Scuola Superiore Sant'Anna, 56127 Pisa, Italy; Istituto di Genomica Applicata, Via J. Linussio 51, 33100 Udine, Italy
- Douglas G Scofield
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-75236 Uppsala, Sweden
- Emanuele De Paoli
  - Università degli Studi di Udine, Via delle Scienze 208, 33100 Udine, Italy
- Michele Morgante
  - Istituto di Genomica Applicata, Via J. Linussio 51, 33100 Udine, Italy; Università degli Studi di Udine, Via delle Scienze 208, 33100 Udine, Italy
|
30
|
De La Torre AR, Birol I, Bousquet J, Ingvarsson PK, Jansson S, Jones SJM, Keeling CI, MacKay J, Nilsson O, Ritland K, Street N, Yanchuk A, Zerbe P, Bohlmann J. Insights into conifer giga-genomes. PLANT PHYSIOLOGY 2014; 166:1724-32. [PMID: 25349325 PMCID: PMC4256843 DOI: 10.1104/pp.114.248708] [Citation(s) in RCA: 89] [Impact Index Per Article: 8.9] [Indexed: 05/04/2023]
Abstract
Insights from sequenced genomes of major land plant lineages have advanced research in almost every aspect of plant biology. Until recently, however, assembled genome sequences of gymnosperms have been missing from this picture. Conifers of the pine family (Pinaceae) are a group of gymnosperms that dominate large parts of the world's forests. Despite their ecological and economic importance, conifers seemed long out of reach for complete genome sequencing, due in part to their enormous genome size (20-30 Gb) and the highly repetitive nature of their genomes. Technological advances in genome sequencing and assembly enabled the recent publication of three conifer genomes: white spruce (Picea glauca), Norway spruce (Picea abies), and loblolly pine (Pinus taeda). These genome sequences revealed distinctive features compared with other plant genomes and may represent a window into the past of seed plant genomes. This Update highlights recent advances, remaining challenges, and opportunities in light of the publication of the first conifer and gymnosperm genomes.
Affiliation(s)
- Amanda R De La Torre
- Department of Ecology and Environmental Sciences (A.R.D.L.T., P.K.I.) and Umeå Plant Science Center, Department of Plant Physiology (P.K.I., S.J., O.N., N.S.), Umeå University, SE-901 87 Umea, Sweden;Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, Canada V5Z 4S6 (I.B., S.J.M.J.);Canada Research Chair in Forest and Environmental Genomics (J.Bou.) and Center for Forest Research and Institute for Systems and Integrative Biology (J.Bou., J.M.), Université Laval, Quebec, Quebec, Canada G1V 0A6;Michael Smith Laboratories (C.I.K., P.Z., J.Boh.) and Department of Forest and Conservation Sciences (K.R., J.Boh.), University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z4; andBritish Columbia Ministry of Forests, Lands, and Natural Resource Operations, Victoria, British Columbia, Canada V8W 9C2 (A.Y.)
| | - Inanc Birol
- Department of Ecology and Environmental Sciences (A.R.D.L.T., P.K.I.) and Umeå Plant Science Center, Department of Plant Physiology (P.K.I., S.J., O.N., N.S.), Umeå University, SE-901 87 Umea, Sweden;Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, Canada V5Z 4S6 (I.B., S.J.M.J.);Canada Research Chair in Forest and Environmental Genomics (J.Bou.) and Center for Forest Research and Institute for Systems and Integrative Biology (J.Bou., J.M.), Université Laval, Quebec, Quebec, Canada G1V 0A6;Michael Smith Laboratories (C.I.K., P.Z., J.Boh.) and Department of Forest and Conservation Sciences (K.R., J.Boh.), University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z4; andBritish Columbia Ministry of Forests, Lands, and Natural Resource Operations, Victoria, British Columbia, Canada V8W 9C2 (A.Y.)
| | - Jean Bousquet
- Department of Ecology and Environmental Sciences (A.R.D.L.T., P.K.I.) and Umeå Plant Science Center, Department of Plant Physiology (P.K.I., S.J., O.N., N.S.), Umeå University, SE-901 87 Umea, Sweden;Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, Canada V5Z 4S6 (I.B., S.J.M.J.);Canada Research Chair in Forest and Environmental Genomics (J.Bou.) and Center for Forest Research and Institute for Systems and Integrative Biology (J.Bou., J.M.), Université Laval, Quebec, Quebec, Canada G1V 0A6;Michael Smith Laboratories (C.I.K., P.Z., J.Boh.) and Department of Forest and Conservation Sciences (K.R., J.Boh.), University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z4; andBritish Columbia Ministry of Forests, Lands, and Natural Resource Operations, Victoria, British Columbia, Canada V8W 9C2 (A.Y.)
- Pär K Ingvarsson
- Stefan Jansson
- Steven J M Jones
- Christopher I Keeling
- John MacKay
- Ove Nilsson
- Kermit Ritland
- Nathaniel Street
- Alvin Yanchuk
- Philipp Zerbe
- Jörg Bohlmann
|
31
|
Abstract
Conifers are the predominant gymnosperms. The size and complexity of their genomes have presented formidable technical challenges for whole-genome shotgun sequencing and assembly. We employed novel strategies that allowed us to determine the loblolly pine (Pinus taeda) reference genome sequence, the largest genome assembled to date. Most of the sequence data were derived from whole-genome shotgun sequencing of a single megagametophyte, the haploid tissue of a single pine seed. Although that constrained the quantity of available DNA, the resulting haploid sequence data were well-suited for assembly. The haploid sequence was augmented with multiple linking long-fragment mate pair libraries from the parental diploid DNA. For the longest fragments, we used novel fosmid DiTag libraries. Sequences from the linking libraries that did not match the megagametophyte were identified and removed. Assembly of the sequence data was aided by condensing the enormous number of paired-end reads into a much smaller set of longer “super-reads,” rendering subsequent assembly with an overlap-based assembly algorithm computationally feasible. To further improve the contiguity and biological utility of the genome sequence, additional scaffolding methods utilizing independent genome and transcriptome assemblies were implemented. The combination of these strategies resulted in a draft genome sequence of 20.15 billion bases, with an N50 scaffold size of 66.9 kbp.
|
32
|
Heitkam T, Holtgräwe D, Dohm JC, Minoche AE, Himmelbauer H, Weisshaar B, Schmidt T. Profiling of extensively diversified plant LINEs reveals distinct plant-specific subclades. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2014; 79:385-97. [PMID: 24862340 DOI: 10.1111/tpj.12565] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 03/27/2014] [Revised: 05/12/2014] [Accepted: 05/15/2014] [Indexed: 05/03/2023]
Abstract
A large fraction of eukaryotic genomes is made up of long interspersed nuclear elements (LINEs). Due to their capability to create novel copies via error-prone reverse transcription, they generate multiple families and reach high copy numbers. Although mammalian LINEs have been well described, plant LINEs have been only poorly investigated. Here, we present a systematic cross-species survey of LINEs in higher plant genomes, shedding light on plant LINE evolution and diversity and facilitating their annotation in genome projects. Applying a Hidden Markov Model (HMM)-based analysis, 59,390 intact LINE reverse transcriptases (RTs) were extracted from 23 plant genomes. These fall into only two of the 28 LINE clades (L1 and RTE) known in eukaryotes. While plant RTE LINEs are highly homogeneous and mostly constitute only a single family per genome, plant L1 LINEs are extremely diverse and form numerous families. Despite their heterogeneity, all members across the 23 species fall into only seven L1 subclades, some of them defined here. Using the L1 LINEs of a basal reference plant genome (Beta vulgaris) as an example, we show that the subclade classification level not only reflects RT sequence similarity but also mirrors structural aspects of complete LINE retrotransposons, such as element size and the position and type of encoded enzymatic domains. Our comprehensive catalogue of plant LINE RTs serves the classification of highly diverse plant LINEs, while the provided subclade-specific HMMs facilitate their annotation.
Affiliation(s)
- Tony Heitkam
- Institute of Botany, Technische Universität Dresden, 01069, Dresden, Germany
|
33
|
Rondeau EB, Minkley DR, Leong JS, Messmer AM, Jantzen JR, von Schalburg KR, Lemon C, Bird NH, Koop BF. The genome and linkage map of the northern pike (Esox lucius): conserved synteny revealed between the salmonid sister group and the Neoteleostei. PLoS One 2014; 9:e102089. [PMID: 25069045 PMCID: PMC4113312 DOI: 10.1371/journal.pone.0102089] [Citation(s) in RCA: 106] [Impact Index Per Article: 10.6] [Received: 04/10/2014] [Accepted: 06/14/2014] [Indexed: 11/19/2022] Open
Abstract
The northern pike is the most frequently studied member of the Esociformes, the closest order to the diverse and economically important Salmoniformes. The ancestor of all salmonids purportedly experienced a whole-genome duplication (WGD) event, making salmonid species ideal for studying the early impacts of genome duplication while complicating their use in wider analyses of teleost evolution. Studies suggest that the Esociformes diverged from the salmonid lineage prior to the WGD, supporting the use of northern pike as a pre-duplication outgroup. Here we present the first genome assembly, reference transcriptome and linkage map for northern pike, and evaluate the suitability of this species to provide a representative pre-duplication genome for future studies of salmonid and teleost evolution. The northern pike genome sequence is composed of 94,267 contigs (N50 = 16,909 bp) contained in 5,688 scaffolds (N50 = 700,535 bp); the total scaffolded genome size is 878 million bases. Multiple lines of evidence suggest that over 96% of the protein-coding genome is present in the genome assembly. The reference transcriptome was constructed from 13 tissues and contains 38,696 transcripts, which are accompanied by normalized expression data in all tissues. Gene-prediction analysis produced a total of 19,601 northern pike-specific gene models. The first-generation linkage map identifies 25 linkage groups, in agreement with northern pike's diploid karyotype of 2N = 50, and facilitates the placement of 46% of assembled bases onto linkage groups. Analyses reveal a high degree of conserved synteny between northern pike and other model teleost genomes. While conservation of gene order is limited to smaller syntenic blocks, the wider conservation of genome organization implies the northern pike exhibits a suitable approximation of a non-duplicated Protacanthopterygiian genome. This dataset will facilitate future studies of esocid biology and empower ongoing examinations of the Atlantic salmon and rainbow trout genomes by facilitating their comparison with other major teleost groups.
Affiliation(s)
- Eric B. Rondeau
- Department of Biology, Centre for Biomedical Research, University of Victoria, Victoria, British Columbia, Canada
- David R. Minkley
- Department of Biology, Centre for Biomedical Research, University of Victoria, Victoria, British Columbia, Canada
- Jong S. Leong
- Department of Biology, Centre for Biomedical Research, University of Victoria, Victoria, British Columbia, Canada
- Amber M. Messmer
- Department of Biology, Centre for Biomedical Research, University of Victoria, Victoria, British Columbia, Canada
- Johanna R. Jantzen
- Department of Biology, Centre for Biomedical Research, University of Victoria, Victoria, British Columbia, Canada
- Kristian R. von Schalburg
- Department of Biology, Centre for Biomedical Research, University of Victoria, Victoria, British Columbia, Canada
- Craig Lemon
- The Charles O. Hayford Hackettstown State Fish Hatchery, Hackettstown, New Jersey, United States of America
- Nathan H. Bird
- Department of Biology, Centre for Biomedical Research, University of Victoria, Victoria, British Columbia, Canada
- Ben F. Koop
- Department of Biology, Centre for Biomedical Research, University of Victoria, Victoria, British Columbia, Canada
|
34
|
Neale DB, Wegrzyn JL, Stevens KA, Zimin AV, Puiu D, Crepeau MW, Cardeno C, Koriabine M, Holtz-Morris AE, Liechty JD, Martínez-García PJ, Vasquez-Gross HA, Lin BY, Zieve JJ, Dougherty WM, Fuentes-Soriano S, Wu LS, Gilbert D, Marçais G, Roberts M, Holt C, Yandell M, Davis JM, Smith KE, Dean JFD, Lorenz WW, Whetten RW, Sederoff R, Wheeler N, McGuire PE, Main D, Loopstra CA, Mockaitis K, deJong PJ, Yorke JA, Salzberg SL, Langley CH. Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies. Genome Biol 2014; 15:R59. [PMID: 24647006 PMCID: PMC4053751 DOI: 10.1186/gb-2014-15-3-r59] [Citation(s) in RCA: 274] [Impact Index Per Article: 27.4] [Received: 02/11/2014] [Accepted: 03/04/2014] [Indexed: 11/30/2022] Open
Abstract
Background: The size and complexity of conifer genomes have, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination. Results: We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next generation sequence generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly were used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome. Conclusions: In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrate a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied.
|
35
|
Wegrzyn JL, Liechty JD, Stevens KA, Wu LS, Loopstra CA, Vasquez-Gross HA, Dougherty WM, Lin BY, Zieve JJ, Martínez-García PJ, Holt C, Yandell M, Zimin AV, Yorke JA, Crepeau MW, Puiu D, Salzberg SL, de Jong PJ, Mockaitis K, Main D, Langley CH, Neale DB. Unique features of the loblolly pine (Pinus taeda L.) megagenome revealed through sequence annotation. Genetics 2014; 196:891-909. [PMID: 24653211 PMCID: PMC3948814 DOI: 10.1534/genetics.113.159996] [Citation(s) in RCA: 129] [Impact Index Per Article: 12.9] [Received: 11/22/2013] [Accepted: 12/13/2013] [Indexed: 01/08/2023] Open
Abstract
The largest genus in the conifer family Pinaceae is Pinus, with over 100 species. The size and complexity of their genomes (∼20-40 Gb, 2n = 24) have delayed the arrival of a well-annotated reference sequence. In this study, we present the annotation of the first whole-genome shotgun assembly of loblolly pine (Pinus taeda L.), which comprises 20.1 Gb of sequence. The MAKER-P annotation pipeline combined evidence-based alignments and ab initio predictions to generate 50,172 gene models, of which 15,653 are classified as high confidence. Clustering these gene models with 13 other plant species resulted in 20,646 gene families, of which 1,554 are predicted to be unique to conifers. Among the conifer gene families, 159 are composed exclusively of loblolly pine members. The gene models for loblolly pine have the highest median and mean intron lengths of 24 fully sequenced plant genomes. Conifer genomes are full of repetitive DNA, with the most significant contributions from long-terminal-repeat retrotransposons. In-depth analysis of the tandem and interspersed repetitive content yielded a combined estimate of 82%.
Affiliation(s)
- Jill L. Wegrzyn
- Department of Plant Sciences, University of California, Davis, California 95616
- John D. Liechty
- Department of Plant Sciences, University of California, Davis, California 95616
- Kristian A. Stevens
- Department of Evolution and Ecology, University of California, Davis, California 95616
- Le-Shin Wu
- National Center for Genome Analysis Support, Indiana University, Bloomington, Indiana 47405
- Carol A. Loopstra
- Department of Ecosystem Science and Management, Texas A&M University, College Station, Texas 77843
- William M. Dougherty
- Department of Evolution and Ecology, University of California, Davis, California 95616
- Brian Y. Lin
- Department of Plant Sciences, University of California, Davis, California 95616
- Jacob J. Zieve
- Department of Plant Sciences, University of California, Davis, California 95616
- Carson Holt
- Department of Human Genetics, University of Utah, Salt Lake City, Utah 84112
- Mark Yandell
- Department of Human Genetics, University of Utah, Salt Lake City, Utah 84112
- Aleksey V. Zimin
- Institute for Physical Sciences and Technology, University of Maryland, College Park, Maryland 20742
- James A. Yorke
- Institute for Physical Sciences and Technology, University of Maryland, College Park, Maryland 20742
- Departments of Mathematics and Physics, University of Maryland, College Park, Maryland 20742
- Marc W. Crepeau
- Department of Evolution and Ecology, University of California, Davis, California 95616
- Daniela Puiu
- Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, The Johns Hopkins University, Baltimore, Maryland 21205
- Steven L. Salzberg
- Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, The Johns Hopkins University, Baltimore, Maryland 21205
- Pieter J. de Jong
- Children’s Hospital Oakland Research Institute, Oakland, California 94609
- Doreen Main
- Department of Horticulture, Washington State University, Pullman, Washington 99163
- Charles H. Langley
- Department of Evolution and Ecology, University of California, Davis, California 95616
- David B. Neale
- Department of Plant Sciences, University of California, Davis, California 95616
|