1
|
Genome assembly of Erythrophleum Fordii, a special "ironwood" tree in China. BMC Genom Data 2023; 24:73. [PMID: 38017381 PMCID: PMC10685560 DOI: 10.1186/s12863-023-01176-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 11/23/2023] [Indexed: 11/30/2023]
Abstract
OBJECTIVES Erythrophleum is a genus in the Fabaceae family. The genus contains only about 10 species and is best known worldwide for its hardwood and medicinal properties. Erythrophleum fordii Oliv. is the only species of this genus distributed in China. It has superior wood and is used in folk medicine, which has led to its overexploitation in the wild. To support its effective conservation and to elucidate the distinctive genetic traits underlying wood formation and medicinal components, we present its first genome assembly. DATA DESCRIPTION This work generated ~160.8 Gb of raw Nanopore whole genome sequencing (WGS) long reads, ~126.0 Gb of raw MGI WGS short reads and ~29.0 Gb of raw RNA-seq reads using E. fordii leaf tissues. The de novo assembly of the E. fordii genome contained 864,825,911 bp in 59 contigs, with a contig N50 of 30,830,834 bp. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis revealed 98.7% completeness of the assembly. The assembly contained 471,006,885 bp (54.4%) of repetitive sequences and 28,761 genes that coded for 33,803 proteins. The protein sequences were functionally annotated against multiple databases, facilitating comparative genomic analysis.
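The contig N50 quoted above is the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that calculation follows; the function name `n50` and the toy contig lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total is the N50 contig.
        if running >= half_total:
            return length


# Toy lengths (hypothetical): total is 300, so half is 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches the halfway point, so this prints 70
```

A high N50 relative to the assembly size, as reported for this assembly, indicates that most of the sequence sits in a few long contigs.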
|
2
|
A first insight into the genomic background of Ilex pubescens (Aquifoliaceae) by flow cytometry and genome survey sequencing. BMC Genomics 2023; 24:270. [PMID: 37208610 DOI: 10.1186/s12864-023-09359-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2022] [Accepted: 05/05/2023] [Indexed: 05/21/2023]
Abstract
BACKGROUND Ilex pubescens is an important traditional Chinese medicinal plant with many naturally occurring compounds and multiple pharmacological effects. However, the lack of reference genomic information has slowed molecular biology research and breeding programs for this plant. RESULTS To obtain genomic information on I. pubescens, a genome survey was performed for the first time by next generation sequencing (NGS), together with genome size estimation using flow cytometry. The whole genome survey of I. pubescens generated 46.472 Gb of sequence data with approximately 82.2× coverage. K-mer analysis indicated that I. pubescens has a small genome of approximately 553 Mb with a 1.93% heterozygosity rate and a 39.1% repeat rate. Meanwhile, the genome size was estimated to be 722 Mb using flow cytometry, which was possibly more precise for assessment of genome size than k-mer analysis. A total of 45.842 Gb of clean reads were assembled into 808,938 scaffolds with a relatively short N50 of 760 bp. The average guanine and cytosine (GC) content was 37.52%. In total, 197,429 microsatellite motifs were detected at a frequency of one per 2.8 kb, among which mononucleotide motifs were the most abundant (up to 62.47% of the total microsatellite motifs), followed by dinucleotide and trinucleotide motifs. CONCLUSION In summary, the genome of I. pubescens is small but complex, with a high level of heterozygosity. Although the survey sequences could not be used to estimate genome size reliably because of the genome's complexity, they will help to design whole genome sequencing strategies and provide genetic information to support resource protection, genetic diversity analysis, genetic improvement and artificial breeding of I. pubescens.
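K-mer based size estimates like the one above are commonly obtained by dividing the total number of k-mers in the reads by the depth of the main peak of the k-mer frequency histogram. The sketch below illustrates that idea only. The abstract does not name the tool or parameters used, so the function name, the `min_depth` cut-off for discarding likely error k-mers, and the toy histogram are all assumptions.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size from a k-mer histogram {depth: number of distinct k-mers}.

    Assumption: k-mers seen fewer than `min_depth` times are treated as
    sequencing errors and ignored. Real tools handle errors, heterozygosity
    and repeats in more sophisticated ways.
    """
    kept = {depth: count for depth, count in histogram.items() if depth >= min_depth}
    if not kept:
        raise ValueError("no k-mers at or above min_depth")
    # Depth with the most distinct k-mers approximates the sequencing depth.
    peak_depth = max(kept, key=kept.get)
    # Total k-mer occurrences among the retained part of the histogram.
    total_kmers = sum(depth * count for depth, count in kept.items())
    return peak_depth, total_kmers / peak_depth


# Hypothetical toy histogram: a large error spike at depth 1 and a main peak at depth 20.
toy_histogram = {1: 1000, 18: 50, 19: 120, 20: 200, 21: 110, 22: 40}
peak, size = estimate_genome_size(toy_histogram)
print(peak, size)
```

In a highly heterozygous genome the histogram typically shows an additional peak at about half the main depth, which is one reason k-mer estimates and flow cytometry estimates can disagree, as they do here.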
|
3
|
Draft genome and transcriptome of Nepenthes mirabilis, a carnivorous plant in China. BMC Genom Data 2023; 24:21. [PMID: 37060047 PMCID: PMC10103442 DOI: 10.1186/s12863-023-01126-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/11/2022] [Accepted: 04/06/2023] [Indexed: 04/16/2023]
Abstract
OBJECTIVES Nepenthes belongs to the monotypic family Nepenthaceae, one of the largest carnivorous plant families. Nepenthes species show impressive adaptive radiation and suffer from overexploitation in nature. Nepenthes mirabilis is the most widely distributed species and the only Nepenthes species that is naturally distributed within China. Herein, we report the genome and transcriptome assemblies of N. mirabilis. The assemblies will be useful resources for comparative genomics and for understanding the adaptation and conservation of carnivorous species. DATA DESCRIPTION This work produced ~139.5 Gb of N. mirabilis whole genome sequencing reads using leaf tissues, and ~21.7 Gb and ~27.9 Gb of raw RNA-seq reads for its leaves and flowers, respectively. Transcriptome assembly yielded 339,802 transcripts, in which 79,758 open reading frames (ORFs) were identified. Functional analysis indicated that these ORFs were mainly associated with proteolysis and DNA integration. The assembled genome was 691,409,685 bp with 159,555 contigs/scaffolds and an N50 of 10,307 bp. The BUSCO assessment of the assembled genome and transcriptome indicated 91.1% and 93.7% completeness, respectively. A total of 42,961 genes were predicted in the genome, coding for 45,461 proteins. The predicted genes were annotated using multiple databases, facilitating future functional analyses. This is the first genome report on the Nepenthaceae family.
|
4
|
Intraspecific variation in genome size in Artemisia argyi determined using flow cytometry and a genome survey. 3 Biotech 2023; 13:57. [PMID: 36698769 PMCID: PMC9868218 DOI: 10.1007/s13205-022-03412-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Accepted: 11/26/2022] [Indexed: 01/23/2023]
Abstract
Different collections and accessions of Artemisia argyi (Chinese mugwort) harbour considerable diversity in morphology and bioactive compounds, but no mechanisms have been reported that explain these variations. We studied genome size in A. argyi accessions from different regions of China by flow cytometry. Genome size was significantly distinct among origins of these 42 Chinese mugwort accessions, ranging from 8.428 to 11.717 pg. There were no significant intraspecific differences among the 42 accessions from the five regions of China. The clustering analysis showed that these 42 A. argyi accessions could be divided into three groups, which had no significant relationship with geographical location. In a genome survey, the total genome size of A. argyi (A15) was estimated to be 7.852 Gb (or 8.029 pg) by K-mer analysis. This indicated that the results from the two independent methods are consistent, and that a genome survey can be used as an adjunct to flow cytometry to compensate for its deficiencies. In addition, a genome survey can provide information about the heterozygosity, repeat sequences, GC content and ploidy of the A. argyi genome. The nuclear DNA contents determined here provide a new reference for intraspecific variation in genome size in A. argyi, and may also be a potential resource for the study of genetic diversity and for breeding new cultivars.
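The paired figures of 7.852 Gb and 8.029 pg above are consistent with the widely used conversion of 1 pg of DNA to roughly 0.978 × 10^9 bp. The abstract does not state which factor was used, so the constant and function names in this short sketch are assumptions.

```python
# Assumption: the commonly cited conversion 1 pg of DNA ~ 0.978e9 base pairs.
BP_PER_PG = 0.978e9


def pg_to_gb(picograms):
    """Convert a DNA mass in picograms to gigabases."""
    return picograms * BP_PER_PG / 1e9


def gb_to_pg(gigabases):
    """Convert a genome size in gigabases to picograms of DNA."""
    return gigabases * 1e9 / BP_PER_PG


# The k-mer estimate from the abstract, expressed as mass.
print(round(gb_to_pg(7.852), 3))  # prints 8.029
```

The same conversion lets flow cytometry values in pg be compared directly with k-mer estimates reported in Gb.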
|
5
|
Genome survey sequencing of common vetch (Vicia sativa L.) and genetic diversity analysis of Chinese germplasm with genomic SSR markers. Mol Biol Rep 2021; 49:313-320. [PMID: 34741708 PMCID: PMC8748366 DOI: 10.1007/s11033-021-06875-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/20/2021] [Accepted: 10/22/2021] [Indexed: 12/04/2022]
Abstract
Background Common vetch (Vicia sativa L.) is an annual legume well suited to cold and dry regions. Despite its great applied potential, genomic information on common vetch remains unavailable. Methods and results In the present study, a whole genome survey of common vetch was performed using next-generation sequencing (NGS). A total of 79.84 Gbp of high quality sequence data were obtained and assembled into 3,754,145 scaffolds with an N50 length of 3556 bp. According to the K-mer analyses, the genome size, heterozygosity rate and GC content of the common vetch genome were estimated to be 1568 Mbp, 0.4345 and 35%, respectively. In addition, a total of 76,810 putative simple sequence repeats (SSRs) were identified. Among them, dinucleotide was the most abundant SSR type (44.94%), followed by tri- (35.82%), tetra- (13.22%), penta- (4.47%) and hexanucleotide (1.54%). Furthermore, a total of 58,175 SSR primer pairs were designed and ten of them were validated in Chinese common vetch. Further analysis showed that Chinese common vetch harbored high genetic diversity and could be clustered into two main subgroups. Conclusion This is the first report on the genome features of common vetch, and the information will help to design whole genome sequencing strategies. The newly identified SSRs in this study provide basic molecular markers for germplasm characterization, genetic diversity and QTL mapping studies in common vetch. Supplementary Information The online version contains supplementary material available at 10.1007/s11033-021-06875-z.
|
6
|
The karyotype, genome survey, and assembly of Mud artemisia (Artemisia selengensis). Mol Biol Rep 2021; 48:5897-5904. [PMID: 34297325 DOI: 10.1007/s11033-021-06584-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/20/2021] [Accepted: 07/20/2021] [Indexed: 11/30/2022]
Abstract
BACKGROUND Artemisia selengensis is a traditional Chinese medicine, and phytochemical analysis has indicated that it contains essential oils, fatty acids and phenolic acids. The lack of reference genomic information may slow molecular biology research on A. selengensis. METHOD AND RESULTS Karyotype analysis, a genome survey, and genome assembly were employed to acquire information on the genome structure of A. selengensis. The chromosome number is 2n = 2x = 36, the karyotype formula is 28m + 8sm, the karyotype asymmetry coefficient is 58.8%, and the karyotype belongs to Stebbins' type 2A. In addition, flow cytometry showed that the mean peak value of fluorescence intensity is 1,170,677, the 2C DNA content is 12 pg, and the genome size was estimated to be approximately 5.87 Gb. The genome survey generated 341,478,078 clean reads; however, K-mer analysis showed no significant peak, so the heterozygosity, repeat rate and genome size could not be estimated. It is speculated that this might be due to the complexity of the genome structure. A total of 37,266 contigs were preliminarily assembled from Oxford Nanopore Technology (ONT) sequencing, with a total length of 804,475,881 bp (804 Mb), a GC content of 34.08%, an N50 of 29,624 bp, and a largest contig of 239,792 bp. CONCLUSION This study provides preliminary information on the genome size of A. selengensis. These findings may support whole-genome sequencing and assembly and encourage progress in functional gene discovery, genetic improvement, evolutionary study, and structural studies of A. selengensis.
|
7
|
Genome survey of Zanthoxylum bungeanum and development of genomic-SSR markers in congeneric species. Biosci Rep 2021; 40:225368. [PMID: 32558907 PMCID: PMC7322109 DOI: 10.1042/bsr20201101] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/27/2020] [Revised: 06/11/2020] [Accepted: 06/18/2020] [Indexed: 01/13/2023]
Abstract
Zanthoxylum bungeanum, a spice and medicinal plant, is cultivated in many parts of China and some countries in Southeast Asia; however, data on its genome are lacking. In the present study, we performed a whole-genome survey and developed novel genomic-SSR markers of Z. bungeanum. Clean data (∼197.16 Gb) were obtained and assembled into 11185221 scaffolds with an N50 of 183 bp. K-mer analysis revealed that Z. bungeanum has an estimated genome size of 3971.92 Mb, and the GC content, heterozygous rate, and repeat sequence rate are 37.21%, 1.73%, and 86.04%, respectively. These results indicate that the genome of Z. bungeanum is complex. Furthermore, 27153 simple sequence repeat (SSR) loci were identified from 57288 scaffolds with a minimum length > 1 kb. Mononucleotide repeats (19706) were the most abundant type, followed by dinucleotide repeats (5154). The most common motifs were A/T, followed by AT/AT; these SSRs accounted for 71.42% and 11.84% of all repeats, respectively. A total of 21243 non-repeating primer pairs were designed, and 100 were randomly selected and validated by PCR analysis using DNA from 10 Z. bungeanum individuals and 5 Zanthoxylum armatum individuals. Finally, 36 polymorphic SSR markers were developed with polymorphism information content (PIC) values ranging from 0.16 to 0.75. Cluster analysis revealed that Z. bungeanum and Z. armatum could be divided into two major clusters, suggesting that these newly developed SSR markers are useful for genetic diversity and germplasm resource identification in Z. bungeanum and Z. armatum.
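SSR loci such as those counted above are tandem repeats of short motifs, so a simple regular-expression scan conveys the idea of how they are located in scaffold sequences. This is a minimal sketch, not the software used in the study: the minimum repeat counts in `MIN_REPEATS`, the function name `find_ssrs`, and the example sequence are all hypothetical.

```python
import re

# Hypothetical thresholds: minimum number of tandem copies per motif length (1 to 6 bp).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR found in `sequence`."""
    sequence = sequence.upper()
    hits = []
    for motif_len, copies in sorted(min_repeats.items()):
        # A motif of `motif_len` bases followed by at least `copies - 1` further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, copies - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit, e.g. "AA" or "ATAT".
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Toy sequence containing one mononucleotide and one dinucleotide SSR.
example = "GGC" + "A" * 12 + "CGC" + "AT" * 7 + "GCC"
print(find_ssrs(example))  # [(3, 'A', 12), (18, 'AT', 7)]
```

Flanking sequence around each hit is what primer design would then work from; that step is outside the scope of this sketch.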
|
8
|
Genome survey sequencing of the Caribbean spiny lobster Panulirus argus: Genome size, nuclear rRNA operon, repetitive elements, and microsatellite discovery. PeerJ 2020; 8:e10554. [PMID: 33362980 PMCID: PMC7750000 DOI: 10.7717/peerj.10554] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/29/2020] [Accepted: 11/22/2020] [Indexed: 01/09/2023]
Abstract
BACKGROUND Panulirus argus is an ecologically relevant species in shallow water hard-bottom environments and coral reefs and the target of the most lucrative fishery in the greater Caribbean region. METHODS This study reports, for the first time, the genome size and nuclear repetitive elements, including the 45S ribosomal DNA operon, 5S unit, and microsatellites, of P. argus. RESULTS Using a k-mer approach, the average haploid genome size estimated for P. argus was 2.17 Gbp. Repetitive elements comprised 69.02% of the nuclear genome. In turn, 30.98% of the genome represented low- or single-copy sequences. A considerable proportion of repetitive sequences could not be assigned to known repeat element families. Taking into account only annotated repetitive elements, the most frequent belonged to Class I-LINE, which were noticeably more abundant than Class I-LTR-Ty-3/Gypsy, Class I-LTR-Penelope, and Class I-LTR-Ty-3/Bel-Pao elements. Satellite DNA was also abundant. The ribosomal operon in P. argus comprises, in the following order, a 5' ETS (length = 707 bp), ssrDNA (1,875 bp), ITS1 (736 bp), 5.8S rDNA (162 bp), ITS2 (1,314 bp), lsrDNA (5,387 bp), and 3' ETS (287 bp). A total of 1,281 SSRs were identified.
|
9
|
Genome survey sequencing of Atractylodes lancea and identification of its SSR markers. Biosci Rep 2020; 40:226599. [PMID: 33026067 PMCID: PMC7593537 DOI: 10.1042/bsr20202709] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/14/2020] [Revised: 10/03/2020] [Accepted: 10/06/2020] [Indexed: 11/17/2022]
Abstract
Atractylodes lancea (Thunb.) DC. is a traditional Chinese medicine rich in sesquiterpenes that has been widely used in China and Japan for the treatment of viral infections. Despite its important pharmacological value, genomic information regarding A. lancea is currently unavailable. In the present study, the whole genome sequence of A. lancea was obtained using an Illumina sequencing platform. The results revealed an estimated genome size for A. lancea of 4,159.24 Mb, with 2.28% heterozygosity, and a repeat rate of 89.2%, all of which indicate a highly heterozygous genome. Based on the genomic data of A. lancea, 27,582 simple sequence repeat (SSR) markers were identified. The differences in representation among nucleotide repeat types were large, e.g., the mononucleotide repeat type was the most abundant (54.74%) while the pentanucleotide repeats were the least abundant (0.10%), and sequence motifs GA/TC (31.17%) and TTC/GAA (7.23%) were the most abundant among the dinucleotide and trinucleotide repeat motifs, respectively. A total of 93,434 genes matched known genes in common databases including 48,493 genes in the Gene Ontology (GO) database and 34,929 genes in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. This is the first report to sequence and characterize the whole genome of A. lancea and will provide a theoretical basis and reference for further genome-wide deep sequencing and SSR molecular marker development of A. lancea.
|
10
|
Genome survey and development of polymorphic microsatellite loci for Sillago sihama based on Illumina sequencing technology. Mol Biol Rep 2020; 47:3011-3017. [PMID: 32124169 DOI: 10.1007/s11033-020-05348-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/13/2019] [Accepted: 02/25/2020] [Indexed: 10/24/2022]
Abstract
In this study, we first conducted a genome survey of Sillago sihama using the Illumina sequencing platform, and then developed 15 polymorphic microsatellite loci in a wild population. A total of 129.46 Gb of raw data were obtained, of which 115.07 Gb were clean data, with a sequencing depth of 179.3-fold. The genome was estimated to be 522.6 Mb in size, with heterozygosity, repeat content and GC content of 0.63%, 21% and 44%, respectively. A total of 630,028 microsatellites were identified in the genome, of which dinucleotide repeats were the most abundant (56.80%), followed by mononucleotide repeats (30.23%). Furthermore, 60 pairs of primers were designed and synthesized based on microsatellite sequences, of which 15 were polymorphic in a wild population. A total of 91 alleles were found, with an average of 6.07 per locus. The number of alleles and the observed and expected heterozygosity per locus ranged from two to 13, from 0.250 to 0.862, and from 0.396 to 0.901, respectively. Twelve loci were highly informative (PIC > 0.5), and the others were moderately informative (0.25 < PIC < 0.5). Seven loci deviated from Hardy-Weinberg equilibrium after Bonferroni correction (P < 0.0033). No significant linkage disequilibrium was detected between locus pairs. This study provided a large number of genomic resources and 15 polymorphic microsatellite loci that should be helpful for further genetic studies in S. sihama.
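The PIC classes and the Bonferroni threshold mentioned above can be reproduced with a few lines of arithmetic. The sketch below uses the standard polymorphism information content formula, PIC = 1 − Σp_i² − Σ_{i<j} 2p_i²p_j², where p_i are allele frequencies at a locus. The function names, the example frequencies, the label for the lowest class, and the nominal significance level of 0.05 are assumptions; the abstract gives only the two PIC ranges, the 15 loci and the corrected threshold.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one locus from its allele frequencies."""
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term


def classify_pic(value):
    """Classes as used in the abstract; the lowest label is an assumed name."""
    if value > 0.5:
        return "highly informative"
    if value > 0.25:
        return "moderately informative"
    return "slightly informative"


# Hypothetical locus with two equally frequent alleles.
value = pic([0.5, 0.5])
print(value, classify_pic(value))  # 0.375 moderately informative

# Bonferroni threshold for 15 loci, assuming a nominal alpha of 0.05.
print(round(0.05 / 15, 4))  # 0.0033
```

Under that assumed alpha, dividing by the 15 loci tested gives the P < 0.0033 cut-off quoted in the abstract.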
|
11
|
Genome survey sequencing and genetic background characterization of yellow horn based on next-generation sequencing. Mol Biol Rep 2019; 46:4303-4312. [PMID: 31115837 DOI: 10.1007/s11033-019-04884-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/05/2019] [Accepted: 05/15/2019] [Indexed: 11/29/2022]
Abstract
Yellowhorn (Xanthoceras sorbifolium Bunge) is an important woody oil tree species with high ornamental and medicinal value. Nevertheless, genomic information on yellowhorn is currently unavailable. Here, for the first time, we conducted a genome survey of two yellowhorn cultivars, Zhongshi 4 and Zhongshi 9, which differ distinctly in phenotype and drought resistance, to obtain genomic information by next generation sequencing (NGS). Genome size was also estimated using flow cytometry. The whole genome survey of Zhongshi 4 and Zhongshi 9 generated 34.40 and 39.55 GB of sequence data, respectively. The estimated genome sizes of Zhongshi 4 and Zhongshi 9 were about 536.58 Mb and 569.52 Mb, which were close to the results of flow cytometry. The heterozygosity rates were calculated to be 0.75% and 0.89%, and the repeat rates were 60.08% and 62.00%. The reads were assembled into 1,024,373 and 885,404 contigs with N50 lengths of 1005 bp and 1219 bp, respectively, which were further assembled into 714,369 and 686,128 scaffolds with scaffold N50 lengths of ~1963 bp and ~1938 bp and total lengths of 386,915 Kb and 391,904 Kb. These results indicated that there was little difference in genome size and complexity between the cultivars. In addition, 63,137 and 65,271 high-quality genomic simple sequence repeat (SSR) markers were generated for Zhongshi 4 and Zhongshi 9, respectively. We suggest that a combination of Illumina and PacBio sequencing, assisted by Hi-C and matching assembly software, be used to sequence the genome of one of the two yellowhorn cultivars. The results will help to design whole genome sequencing strategies for yellowhorn and provide a large amount of gene resources for its further excavation and utilization.
|
12
|
Genome-wide identification of conserved and novel microRNAs in one bud and two tender leaves of tea plant (Camellia sinensis) by small RNA sequencing, microarray-based hybridization and genome survey scaffold sequences. BMC Plant Biology 2017; 17:212. [PMID: 29157210 PMCID: PMC5697157 DOI: 10.1186/s12870-017-1169-1] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 06/03/2017] [Accepted: 11/10/2017] [Indexed: 05/19/2023]
Abstract
BACKGROUND MicroRNAs (miRNAs) are important for plant growth and responses to environmental stresses via post-transcriptional regulation of gene expression. Tea, which is primarily produced from one bud and two tender leaves of the tea plant (Camellia sinensis), is one of the most popular non-alcoholic beverages worldwide owing to its abundance of secondary metabolites. A large number of miRNAs have been identified in various plants, including non-model species. However, due to the lack of reference genome sequences and/or information of tea plant genome survey scaffold sequences, discovery of miRNAs has been limited in C. sinensis. RESULTS Using small RNA sequencing, combined with our recently obtained genome survey data, we have identified and analyzed 175 conserved and 83 novel miRNAs mainly in one bud and two tender leaves of the tea plant. Among these, 93 conserved and 18 novel miRNAs were validated using miRNA microarray hybridization. In addition, the expression patterns of 11 conserved and 8 novel miRNAs were validated by stem-loop-qRT-PCR. A total of 716 potential target genes of identified miRNAs were predicted. Further, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that most of the target genes were primarily involved in stress response and enzymes related to phenylpropanoid biosynthesis. The predicted targets of 4 conserved miRNAs were further validated by 5'RLM-RACE. A negative correlation between the expression profiles of 3 out of 4 conserved miRNAs (csn-miR160a-5p, csn-miR164a, csn-miR828 and csn-miR858a) and their targets (ARF17, NAC100, WER and MYB12 transcription factor) was observed. CONCLUSION In summary, the present study is one of the few studies on miRNA detection and identification in the tea plant. The predicted target genes of the majority of miRNAs encoded enzymes, transcription factors, and functional proteins.
The miRNA-target transcription factor gene interactions may provide important clues about the regulatory mechanism of these miRNAs in the tea plant. The data reported in this study will make a huge contribution to knowledge on the potential miRNA regulators of the secondary metabolism pathway and other important biological processes in C. sinensis.
|
13
|
DNA shotgun sequencing analysis of Garcinia mangostana L. variety Mesta. Genomics Data 2017; 12:118-119. [PMID: 28516035 PMCID: PMC5426032 DOI: 10.1016/j.gdata.2017.05.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/07/2017] [Revised: 04/28/2017] [Accepted: 05/01/2017] [Indexed: 10/28/2022]
Abstract
Mangosteen (Garcinia mangostana Linn.) is an ultra-tropical tree characterized by its unique dark purple fruits with white flesh. The xanthone-rich purple pericarp tissue contains valuable compounds with medicinal properties. Following previously reported genome sequencing of a common variety of mangosteen [1], we performed another whole genome sequencing of a commercially popular variety of this fruit species (var. Mesta) for comparative analysis of its genome composition. Raw reads of the DNA sequencing project were deposited to SRA database with the accession number SRX2709728.
|
14
|
Genome survey of pistachio (Pistacia vera L.) by next generation sequencing: Development of novel SSR markers and genetic diversity in Pistacia species. BMC Genomics 2016; 17:998. [PMID: 27923352 PMCID: PMC5142174 DOI: 10.1186/s12864-016-3359-x] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Received: 05/16/2016] [Accepted: 11/28/2016] [Indexed: 11/10/2022]
Abstract
BACKGROUND Pistachio (Pistacia vera L.) is one of the most important nut crops in the world. There are about 11 wild species in the genus Pistacia, and they are important as rootstock seed sources for cultivated P. vera and as forest trees. Published information on the pistachio genome is limited. Therefore, a genome survey is necessary to obtain knowledge on the genome structure of pistachio by next generation sequencing. Simple sequence repeat (SSR) markers are useful tools for germplasm characterization, genetic diversity analysis, and genetic linkage mapping, and may help to elucidate genetic relationships among pistachio cultivars and species. RESULTS To explore the genome structure of pistachio, a genome survey was performed using the Illumina platform at approximately 40× coverage depth in P. vera cv. Siirt. The K-mer analysis indicated that pistachio has a genome that is about 600 Mb in size and is highly heterozygous. The assembly of 26.77 Gb of Illumina data produced 27,069 scaffolds with an N50 of 3.4 kb and a total length of 513.5 Mb. A total of 59,280 SSR motifs were detected at a frequency of one per 8.67 kb. A total of 206 SSRs were used to characterize 24 P. vera cultivars and 20 wild Pistacia genotypes (four genotypes from each of five wild Pistacia species: P. atlantica, P. integerrima, P. chinensis, P. terebinthus, and P. lentiscus). Overall, 135 SSR loci amplified in all 44 cultivars and genotypes, and 41 were polymorphic in the six Pistacia species. The novel SSR loci developed from cultivated pistachio were highly transferable to wild Pistacia species. CONCLUSIONS The results from a genome survey of pistachio suggest that the genome size of pistachio is about 600 Mb with a high heterozygosity rate. This information will help to design whole genome sequencing strategies for pistachio.
The novel polymorphic SSRs developed in this study may help germplasm characterization, genetic diversity, and genetic linkage mapping studies in the genus Pistacia.
|
15
|
Abstract
Mangosteen (Garcinia mangostana Linn.) is a tropical tree mainly found in South East Asia and considered as “the queen of fruits”. The asexually produced fruit is dark purple or reddish in color, with white flesh which is slightly acidic with sweet flavor and a pleasant aroma. The purple pericarp tissue is rich in xanthones which are useful for medical purposes. We performed the first genome sequencing of this commercially important fruit tree to study its genome composition and attempted draft genome assembly. Raw reads of the DNA sequencing project have been deposited to SRA database with the accession number SRX1426419.
|
16
|
A genome survey sequencing of the Java mouse deer (Tragulus javanicus) adds new aspects to the evolution of lineage specific retrotransposons in Ruminantia (Cetartiodactyla). Gene 2015; 571:271-8. [PMID: 26123917 DOI: 10.1016/j.gene.2015.06.064] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 03/13/2015] [Revised: 06/24/2015] [Accepted: 06/25/2015] [Indexed: 10/23/2022]
Abstract
Ruminantia, the ruminating, hoofed mammals (cow, deer, giraffe and allies) are an unranked artiodactylan clade. Around 50-60 million years ago the BovB retrotransposon entered the ancestral ruminantian genome through horizontal gene transfer. A survey genome screen using 454-pyrosequencing of the Java mouse deer (Tragulus javanicus) and the lesser kudu (Tragelaphus imberbis) was done to investigate and to compare the landscape of transposable elements within Ruminantia. The family Tragulidae (mouse deer) is the only representative of Tragulina and phylogenetically important, because it represents the earliest divergence in Ruminantia. The data analyses show that, relative to other ruminantian species, the lesser kudu genome has seen an expansion of BovB Long INterspersed Elements (LINEs) and BovB related Short INterspersed Elements (SINEs) like BOVA2. In comparison the genome of Java mouse deer has fewer BovB elements than other ruminants, especially Bovinae, and has in addition a novel CHR-3 SINE most likely propagated by LINE-1. By contrast the other ruminants have low amounts of CHR SINEs but high numbers of actively propagating BovB-derived and BovB-propagated SINEs. The survey sequencing data suggest that the transposable element landscape in mouse deer (Tragulina) is unique among Ruminantia, suggesting a lineage specific evolutionary trajectory that does not involve BovB mediated retrotransposition. This shows that the genomic landscape of mobile genetic elements can rapidly change in any lineage.
|