1
Xing D, Li S, Shang M, Wang W, Zhang Q, Wang J, Hasin T, Hettiarachchi D, Alston V, Bern L, Parrales AP, Lu C, Coogan M, Johnson A, Qin Z, Su B, Dunham R. A New Strategy for Increasing Knock-in Efficiency: Multiple Elongase and Desaturase Transgenes Knock-in by Targeting Long Repeated Sequences. ACS Synth Biol 2022; 11:4210-4219. [PMID: 36332126 DOI: 10.1021/acssynbio.2c00252] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/06/2022]
Abstract
CRISPR/Cas9-mediated knock-in (KI) has a wide application in gene therapy, gene function study, and transgenic breeding programs. Unlike gene therapy, which requires accurate KI to correct gene mutation, transgenic breeding programs can accept robust KI as long as integration does not interrupt normal gene functions and result in any negative pleiotropic effects. High KI efficiency is required to reduce the breeding cost and shorten the breeding period, especially in transferring multiple foreign genes to a single individual. To elevate the KI efficacy and achieve multiple gene KIs simultaneously, we introduced a new strategy that enables transgene integration into numerous sites of the genome by targeting long repeated sequences (LRSs). Using this simple strategy, for the first time we successfully generated transgenic fish carrying the masu salmon (Oncorhynchus masou) elovl2 gene and rabbitfish (Siganus canaliculatus) Δ4 fad and Δ6 fad genes, and achieved robust target KI of elovl2 and Δ6 fad genes at multiple sites of LRS1 and LRS3, respectively, in the initial generation. This demonstrated that donor plasmid homology arms, which were nearly identical but not completely the same as the genome sequence, still led to on-target KI. Although the target KI efficiencies at LRS1, LRS2, and LRS3 sites were still relatively low in the current study, it is very promising that 100% KI efficiency in the future could be realized and perfected by selection of better LRSs and optimization of sgRNAs.
Affiliation(s)
- De Xing
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Shangjia Li
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Mei Shang
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Wenwen Wang
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Qin Zhang
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Jinhai Wang
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Tasnuba Hasin
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Darshika Hettiarachchi
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Veronica Alston
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Logan Bern
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Abel Paladines Parrales
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Cuiyu Lu
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Michael Coogan
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Andrew Johnson
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Zhenkui Qin
  - Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China
- Baofeng Su
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
- Rex Dunham
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Auburn, Alabama 36849, United States
2
Prospective investigation of carbapenem-resistant Klebsiella pneumonia transmission among the staff, environment and patients in five major intensive care units, Beijing. J Hosp Infect 2019; 101:150-157. [DOI: 10.1016/j.jhin.2018.11.019] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 10/29/2018] [Accepted: 11/29/2018] [Indexed: 12/11/2022]
3
Gouveia JG, Wolf IR, Vilas-Boas LA, Heslop-Harrison JS, Schwarzacher T, Dias AL. Repetitive DNA in the Catfish Genome: rDNA, Microsatellites, and Tc1-Mariner Transposon Sequences in Imparfinis Species (Siluriformes, Heptapteridae). J Hered 2017; 108:650-657. [PMID: 28821184 DOI: 10.1093/jhered/esx065] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 06/06/2017] [Accepted: 07/15/2017] [Indexed: 11/13/2022]
Abstract
Physical mapping of repetitive DNA families in the karyotypes of fish is important to understand the organization and evolution of different orders, families, genera, or species. Fish in the genus Imparfinis show diverse karyotypes with various diploid numbers and ribosomal DNA (rDNA) locations. Here we isolated and characterized Tc1-mariner nucleotide sequences from Imparfinis schubarti, and mapped their locations together with 18S rDNA, 5S rDNA, and microsatellite probes in Imparfinis borodini and I. schubarti chromosomes. The physical mapping of Tc1/Mariner on chromosomes revealed dispersed signals in heterochromatin blocks with small accumulations in the terminal and interstitial regions of I. borodini and I. schubarti. Tc1/Mariner was coincident with rDNA chromosome sites in both species, suggesting that this transposable element may have participated in the dispersion and evolution of these sequences in the fish genome. Our analysis suggests that different transposons and microsatellites have accumulated in the I. borodini and I. schubarti genomes and that the distribution patterns of these elements may be related to karyotype evolution within Imparfinis.
Affiliation(s)
- Juceli Gonzalez Gouveia
  - Department of Biology, Biological Sciences, CCB, University Estadual de Londrina, P.O. Box 6001, Londrina, Paraná CEP 86051-970, Brazil; Department of Genetics, University of Leicester, Leicester LE1 7RH, UK
- Ivan Rodrigo Wolf
  - Department of Biology, Biological Sciences, CCB, University Estadual de Londrina, P.O. Box 6001, Londrina, Paraná CEP 86051-970, Brazil; Department of Genetics, University of Leicester, Leicester LE1 7RH, UK
- Laurival Antonio Vilas-Boas
  - Department of Biology, Biological Sciences, CCB, University Estadual de Londrina, P.O. Box 6001, Londrina, Paraná CEP 86051-970, Brazil; Department of Genetics, University of Leicester, Leicester LE1 7RH, UK
- John Seymour Heslop-Harrison
  - Department of Biology, Biological Sciences, CCB, University Estadual de Londrina, P.O. Box 6001, Londrina, Paraná CEP 86051-970, Brazil; Department of Genetics, University of Leicester, Leicester LE1 7RH, UK
- Trude Schwarzacher
  - Department of Biology, Biological Sciences, CCB, University Estadual de Londrina, P.O. Box 6001, Londrina, Paraná CEP 86051-970, Brazil; Department of Genetics, University of Leicester, Leicester LE1 7RH, UK
- Ana Lúcia Dias
  - Department of Biology, Biological Sciences, CCB, University Estadual de Londrina, P.O. Box 6001, Londrina, Paraná CEP 86051-970, Brazil; Department of Genetics, University of Leicester, Leicester LE1 7RH, UK
4
Shen X, Ngoh SY, Thevasagayam NM, Prakki SRS, Bhandare P, Tan AWK, Tan GQ, Singh S, Phua NCH, Vij S, Orbán L. BAC-pool sequencing and analysis confirms growth-associated QTLs in the Asian seabass genome. Sci Rep 2016; 6:36647. [PMID: 27821852 PMCID: PMC5099610 DOI: 10.1038/srep36647] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/23/2015] [Accepted: 10/19/2016] [Indexed: 12/31/2022]
Abstract
The Asian seabass is an important marine food fish that has been cultured for several decades in Asia Pacific. However, the lack of a high quality reference genome has hampered efforts to improve its selective breeding. A 3D BAC pool set generated in this study was screened using 22 SSR markers located on linkage group 2 which contains a growth-related QTL region. Seventy-two clones corresponding to 22 FPC contigs were sequenced by Illumina MiSeq technology. We co-assembled the MiSeq-derived scaffolds from each FPC contig with error-corrected PacBio reads, resulting in 187 sequences covering 9.7 Mb. Eleven genes annotated within this region were found to be potentially associated with growth and their tissue-specific expression was investigated. Correlation analysis demonstrated that SNPs in ctsb, skp1 and ppp2ca can be potentially used as markers for selecting fast-growing fingerlings. Conserved syntenies between seabass LG2 and five other teleosts were identified. This study i) provided a 10 Mb targeted genome assembly; ii) demonstrated NGS of BAC pools as a potential approach for mining candidates underlying QTLs of this species; iii) detected eleven genes potentially responsible for growth in the QTL region; and iv) identified useful SNP markers for selective breeding programs of Asian seabass.
Affiliation(s)
- Xueyan Shen
  - Reproductive Genomics Group, Temasek Life Sciences Laboratory, 117604 Singapore
- Si Yan Ngoh
  - Reproductive Genomics Group, Temasek Life Sciences Laboratory, 117604 Singapore; Nanyang Technological University, 639798 Singapore
- Pranjali Bhandare
  - Reproductive Genomics Group, Temasek Life Sciences Laboratory, 117604 Singapore
- Andy Wee Kiat Tan
  - Reproductive Genomics Group, Temasek Life Sciences Laboratory, 117604 Singapore
- Gui Quan Tan
  - Reproductive Genomics Group, Temasek Life Sciences Laboratory, 117604 Singapore
- Shubha Vij
  - Reproductive Genomics Group, Temasek Life Sciences Laboratory, 117604 Singapore
- László Orbán
  - Reproductive Genomics Group, Temasek Life Sciences Laboratory, 117604 Singapore; Department of Animal Sciences and Animal Husbandry, Georgikon Faculty, University of Pannonia, 8360 Keszthely, Hungary; Centre for Comparative Genomics, Murdoch University, Murdoch 6150, Australia
5
Boutte J, Ferreira de Carvalho J, Rousseau-Gueutin M, Poulain J, Da Silva C, Wincker P, Ainouche M, Salmon A. Reference Transcriptomes and Detection of Duplicated Copies in Hexaploid and Allododecaploid Spartina Species (Poaceae). Genome Biol Evol 2016; 8:3030-3044. [PMID: 27614235 PMCID: PMC5633685 DOI: 10.1093/gbe/evw209] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Accepted: 08/20/2016] [Indexed: 01/19/2023]
Abstract
In this study, we report the assembly and annotation of five reference transcriptomes for the European hexaploid Spartina species (S. maritima, S. alterniflora and their homoploid hybrids S. x townsendii and S. x neyrautii) and the allododecaploid invasive species S. anglica. These transcriptomes were constructed from various leaf and root cDNA libraries that were sequenced using both Roche-454 and Illumina technologies. Given the high ploidy levels of the Spartina genomes under study, the absence of a diploid reference genome, and the need for an appropriate analytical strategy, we developed generic bioinformatics tools to (1) detect different haplotypes of each gene within each species and (2) assign a parental origin to haplotypes detected in the hexaploid hybrids and the neo-allopolyploid. The approach described here allows the detection of putative homeologs from sets of short reads. Synonymous substitution rate (KS) comparisons between haplotypes from the hexaploid species revealed the presence of one KS peak (likely resulting from the tetraploid duplication event). The procedure developed in this study can be applied in future differential gene expression or genomics experiments to study the fate of duplicated genes in the invasive allododecaploid S. anglica.
Affiliation(s)
- Julien Boutte
  - UMR CNRS 6553 Ecobio, OSUR (Observatoire des Sciences de l'Univers de Rennes), University of Rennes 1, Rennes Cedex, France
- Julie Ferreira de Carvalho
  - UMR CNRS 6553 Ecobio, OSUR (Observatoire des Sciences de l'Univers de Rennes), University of Rennes 1, Rennes Cedex, France
- Mathieu Rousseau-Gueutin
  - UMR CNRS 6553 Ecobio, OSUR (Observatoire des Sciences de l'Univers de Rennes), University of Rennes 1, Rennes Cedex, France; UMR Institut de Génétique, Environnement et Protection des Plantes, Institut National de la Recherche Agronomique, Le Rheu Cedex, France
- Malika Ainouche
  - UMR CNRS 6553 Ecobio, OSUR (Observatoire des Sciences de l'Univers de Rennes), University of Rennes 1, Rennes Cedex, France
- Armel Salmon
  - UMR CNRS 6553 Ecobio, OSUR (Observatoire des Sciences de l'Univers de Rennes), University of Rennes 1, Rennes Cedex, France
6
Jenny MJ, Walton WC, Payton SL, Powers JM, Findlay RH, O'Shields B, Diggins K, Pinkerton M, Porter D, Crane DM, Tapley J, Cunningham C. Transcriptomic evaluation of the American oyster, Crassostrea virginica, deployed during the Deepwater Horizon oil spill: Evidence of an active hydrocarbon response pathway. Marine Environmental Research 2016; 120:166-181. [PMID: 27564836 DOI: 10.1016/j.marenvres.2016.08.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 02/17/2016] [Revised: 08/01/2016] [Accepted: 08/11/2016] [Indexed: 06/06/2023]
Abstract
Estuarine organisms were impacted by the Deepwater Horizon oil spill which released ∼5 million barrels of crude oil into the Gulf of Mexico in the spring and summer of 2010. Crassostrea virginica, the American oyster, is a keystone species in these coastal estuaries and is routinely used for environmental monitoring purposes. However, very little is known about their cellular and molecular responses to hydrocarbon exposure. In response to the spill, a monitoring program was initiated by deploying hatchery-reared oysters at three sites along the Alabama and Mississippi coast (Grand Bay, MS, Fort Morgan, AL, and Orange Beach, AL). Oysters were deployed for 2-month periods at five different time points from May 2010 to May 2011. Gill and digestive gland tissues were harvested for gene expression analysis and determination of aliphatic and polycyclic aromatic hydrocarbon (PAH) concentrations. To facilitate identification of stress response genes that may be involved in the hydrocarbon response, a nearly complete transcriptome was assembled using Roche 454 and Illumina high-throughput sequencing from RNA samples obtained from the gill and digestive gland tissues of deployed oysters. This effort resulted in the assembly and annotation of 27,227 transcripts comprised of a large assortment of stress response genes, including members of the aryl hydrocarbon receptor (AHR) pathway, Phase I and II biotransformation enzymes, antioxidant enzymes and xenobiotic transporters. From this assembly several potential biomarkers of hydrocarbon exposure were chosen for expression profiling, including the AHR, two cytochrome P450 1A genes (CYP1A-like 1 and CYP1A-like 2), Cu/Zn superoxide dismutase (CuZnSOD), glutathione S-transferase theta (GST theta) and multidrug resistance protein 3 (MRP3). Higher expression levels of GST theta and MRP3 were observed in gill tissues from all three sites during the summer to early fall 2010 deployments. 
Linear regression analysis indicated a statistically significant relationship between total PAH levels in digestive gland tissue samples and the induction of CYP1A-like 2, CuZnSOD, GST theta and MRP3. These observations provide evidence of a potentially conserved AHR pathway in invertebrates and yield new insight into the development of novel biomarkers for use in environmental monitoring activities.
Affiliation(s)
- Matthew J Jenny
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, AL 35487, USA
- William C Walton
  - School of Fisheries, Aquaculture and Aquatic Sciences, Auburn University, Dauphin Island, AL 36528, USA
- Samantha L Payton
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, AL 35487, USA
- John M Powers
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, AL 35487, USA
- Robert H Findlay
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, AL 35487, USA
- Britton O'Shields
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, AL 35487, USA
- Kirsten Diggins
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, AL 35487, USA
- Mark Pinkerton
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, AL 35487, USA
- Danielle Porter
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, AL 35487, USA
- Daniel M Crane
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, AL 35487, USA
- Jeffrey Tapley
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, AL 35487, USA
- Charles Cunningham
  - Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA
7
Burns FR, Cogburn AL, Ankley GT, Villeneuve DL, Waits E, Chang YJ, Llaca V, Deschamps SD, Jackson RE, Hoke RA. Sequencing and de novo draft assemblies of a fathead minnow (Pimephales promelas) reference genome. Environmental Toxicology and Chemistry 2016; 35:212-7. [PMID: 26513338 DOI: 10.1002/etc.3186] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 12/10/2014] [Revised: 02/11/2015] [Accepted: 07/27/2015] [Indexed: 05/20/2023]
Abstract
The present study was undertaken to provide the foundation for development of genome-scale resources for the fathead minnow (Pimephales promelas), an important model organism widely used in both aquatic toxicology research and regulatory testing. The authors report on the first sequencing and 2 draft assemblies for the reference genome of this species. Approximately 120× sequence coverage was achieved via Illumina sequencing of a combination of paired-end, mate-pair, and fosmid libraries. Evaluation and comparison of these assemblies demonstrate that they are of sufficient quality to be useful for genome-enabled studies, with 418 of 458 (91%) conserved eukaryotic genes mapping to at least 1 of the assemblies. In addition to its immediate utility, the present work provides a strong foundation on which to build further refinements of a reference genome for the fathead minnow.
Affiliation(s)
- Frank R Burns
  - Haskell Global Centers for Health and Environmental Sciences, E.I. du Pont de Nemours, Newark, Delaware, USA
- Amarin L Cogburn
  - Haskell Global Centers for Health and Environmental Sciences, E.I. du Pont de Nemours, Newark, Delaware, USA
- Gerald T Ankley
  - Mid-Continent Ecology Division, US Environmental Protection Agency, Duluth, Minnesota, USA
- Daniel L Villeneuve
  - Mid-Continent Ecology Division, US Environmental Protection Agency, Duluth, Minnesota, USA
- Eric Waits
  - US Environmental Protection Agency, Cincinnati, Ohio, USA
- Yun-Juan Chang
  - High-Performance Biological Computing, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Victor Llaca
  - Agricultural Biotechnology, E.I. du Pont de Nemours, Wilmington, Delaware, USA
- Raymond E Jackson
  - Central Research and Development Biotechnology, E.I. du Pont de Nemours, Wilmington, Delaware, USA
- Robert Alan Hoke
  - Haskell Global Centers for Health and Environmental Sciences, E.I. du Pont de Nemours, Newark, Delaware, USA
8
Abstract
Recent improvements in next-generation sequencing technology have made whole-genome sequencing possible even for non-model eukaryotic species with no available reference genome. However, de novo assembly of diploid genomes remains a major challenge because of allelic variation. The aim of this study was to determine the feasibility of using the genome of haploid fish larvae for de novo assembly of whole-genome sequences. We compared the efficiency of assembly using the haploid genome of yellowtail (Seriola quinqueradiata) with that using the diploid genome obtained from the dam. De novo assembly of the haploid and diploid sequence reads (100 million reads per dataset) generated by the Ion Proton sequencer (200 bp) was performed with two different assembly algorithms, namely overlap-layout-consensus (OLC) and de Bruijn graph (DBG). Assembly of the haploid genome significantly reduced the total number of contigs (by approximately 22% for OLC and 9% for DBG), with longer average and N50 contig lengths, compared with the diploid genome assembly. The haploid assembly also improved the quality of the scaffolds in OLC-based assemblies by reducing the number of regions with unassigned nucleotides (Ns) (total length of Ns: 45,331,916 bp for haploids and 67,724,360 bp for diploids). The haploid genome assembly appears to be better because allelic variation in the diploid genome disrupts the extension of contigs during the assembly process. Our results indicate that using the genome of haploid larvae leads to a significant improvement in the de novo assembly process, thus providing a novel strategy for the construction of reference genomes from non-model diploid organisms such as fish.
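The abstract above compares assemblies by contig count and N50 contig length. As a minimal sketch of how N50 is conventionally computed (the length L such that contigs of length at least L together cover at least half of the assembled bases), the following Python function may help; the function name `n50` and the contig lengths are hypothetical illustrations and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembled bases.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the bases are covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical lengths (not data from the study): the same 32 bases
# assembled into many short contigs versus fewer, longer contigs.
fragmented = [2, 2, 2, 2, 4, 4, 8, 8]
contiguous = [8, 8, 16]
print(n50(fragmented), n50(contiguous))
```

With the same total length, fewer and longer contigs give a larger N50, which is why a drop in contig count together with a rise in N50 is read as a more contiguous assembly.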
9
Thanh NM, Jung H, Lyons RE, Njaci I, Yoon BH, Chand V, Tuan NV, Thu VTM, Mather P. Optimizing de novo transcriptome assembly and extending genomic resources for striped catfish (Pangasianodon hypophthalmus). Mar Genomics 2015; 23:87-97. [PMID: 25979246 DOI: 10.1016/j.margen.2015.05.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 12/15/2014] [Revised: 05/03/2015] [Accepted: 05/03/2015] [Indexed: 12/17/2022]
Abstract
Striped catfish (Pangasianodon hypophthalmus) is a commercially important freshwater fish used in inland aquaculture in the Mekong Delta, Vietnam. The culture industry is facing a significant challenge however from saltwater intrusion into many low topographical coastal provinces across the Mekong Delta as a result of predicted climate change impacts. Developing genomic resources for this species can facilitate the production of improved culture lines that can withstand raised salinity conditions, and so we have applied high-throughput Ion Torrent sequencing of transcriptome libraries from six target osmoregulatory organs from striped catfish as a genomic resource for use in future selection strategies. We obtained 12,177,770 reads after trimming and processing with an average length of 97bp. De novo assemblies were generated using CLC Genomic Workbench, Trinity and Velvet/Oases with the best overall contig performance resulting from the CLC assembly. De novo assembly using CLC yielded 66,451 contigs with an average length of 478bp and N50 length of 506bp. A total of 37,969 contigs (57%) possessed significant similarity with proteins in the non-redundant database. Comparative analyses revealed that a significant number of contigs matched sequences reported in other teleost fishes, ranging in similarity from 45.2% with Atlantic cod to 52% with zebrafish. In addition, 28,879 simple sequence repeats (SSRs) and 55,721 single nucleotide polymorphisms (SNPs) were detected in the striped catfish transcriptome. The sequence collection generated in the current study represents the most comprehensive genomic resource for P. hypophthalmus available to date. Our results illustrate the utility of next-generation sequencing as an efficient tool for constructing a large genomic database for marker development in non-model species.
Affiliation(s)
- Nguyen Minh Thanh
  - International University - VNU HCMC, Quarter 6, Linh Trung Ward, Thu Duc District, Ho Chi Minh City, Viet Nam.
- Hyungtaek Jung
  - Centre for Tropical Crops and Biocommodities, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia; Science and Engineering Faculty, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia.
- Russell E Lyons
  - Animal Genetics Laboratory, School of Veterinary Science, University of Queensland, Gatton, QLD 4343, Australia.
- Isaac Njaci
  - Centre for Tropical Crops and Biocommodities, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia.
- Byoung-Ha Yoon
  - Medical Genomics Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon 305-806, Republic of Korea; Department of Functional Genomics, Korea University of Science and Technology, Daejeon 305-333, Republic of Korea.
- Vincent Chand
  - Science and Engineering Faculty, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia.
- Nguyen Viet Tuan
  - Science and Engineering Faculty, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia.
- Vo Thi Minh Thu
  - International University - VNU HCMC, Quarter 6, Linh Trung Ward, Thu Duc District, Ho Chi Minh City, Viet Nam.
- Peter Mather
  - Science and Engineering Faculty, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia.
10
Jiang Y, Xu P, Liu Z. Generation of physical map contig-specific sequences. Front Genet 2014; 5:243. [PMID: 25101119 PMCID: PMC4105628 DOI: 10.3389/fgene.2014.00243] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/17/2014] [Accepted: 07/07/2014] [Indexed: 12/13/2022]
Abstract
Rapid advances in next-generation sequencing technologies have allowed whole genome sequencing of many species. However, with current sequencing technologies, whole genome sequence assemblies often fall short in one of four quality measurements: accuracy, contiguity, connectivity, and completeness. In particular, small contigs and scaffolds limit the applicability of whole genome sequences for genetic analysis. To enhance the quality of whole genome sequence assemblies, particularly the scaffolding capabilities, additional genomic resources are required. Among these, sequences derived from known physical locations offer great power for scaffolding. In this mini-review, we describe the principles, procedures and applications of physical-map-derived sequences, with a focus on physical map contig-specific sequences.
Affiliation(s)
- Yanliang Jiang
  - Centre for Applied Aquatic Genomics, Chinese Academy of Fishery Sciences, Beijing, China
- Peng Xu
  - Centre for Applied Aquatic Genomics, Chinese Academy of Fishery Sciences, Beijing, China
- Zhanjiang Liu
  - Aquatic Genomics Unit, The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences, and Program of Cell and Molecular Biosciences, Auburn University, AL, USA
11
Thanh NM, Jung H, Lyons RE, Chand V, Tuan NV, Thu VTM, Mather P. A transcriptomic analysis of striped catfish (Pangasianodon hypophthalmus) in response to salinity adaptation: De novo assembly, gene annotation and marker discovery. Comparative Biochemistry and Physiology D-Genomics & Proteomics 2014; 10:52-63. [PMID: 24841517 DOI: 10.1016/j.cbd.2014.04.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 10/29/2013] [Revised: 04/16/2014] [Accepted: 04/28/2014] [Indexed: 01/25/2023]
Abstract
The striped catfish (Pangasianodon hypophthalmus) culture industry in the Mekong Delta in Vietnam has developed rapidly over the past decade. The industry now, however, faces some significant challenges, especially climate change impacts, notably predicted extensive saltwater intrusion into many low-lying coastal provinces across the Mekong Delta. This problem highlights a need to develop culture stocks that can tolerate more saline culture environments as saline water-intruded land expands. While a traditional artificial selection program can potentially address this need, understanding the genomic basis of salinity tolerance can assist development of more productive culture lines. The current study applied a transcriptomic approach using Ion PGM technology to generate expressed sequence tag (EST) resources from the intestine and swim bladder of striped catfish reared at a salinity level of 9 ppt, which showed the best growth performance. The total sequence data generated was 467.8 Mbp, consisting of 4,116,424 reads with an average length of 112 bp. De novo assembly generated 51,188 contigs and allowed identification of 16,116 putative genes based on the GenBank non-redundant database. GO annotation, KEGG pathway mapping, and functional annotation of the EST sequences recovered a wide diversity of biological functions and processes. In addition, more than 11,600 simple sequence repeats were detected. This is the first comprehensive analysis of a striped catfish transcriptome, and provides a valuable genomic resource for future selective breeding programs and functional or evolutionary studies of genes that influence salinity tolerance in this important culture species.
Affiliation(s)
- Nguyen Minh Thanh
  - International University, VNU HCMC, Quarter 6, Linh Trung Ward, Thu Duc District, Ho Chi Minh City, Viet Nam.
- Hyungtaek Jung
  - Institute for Future Environment, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia; Science and Engineering Faculty, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia.
- Russell E Lyons
  - CSIRO Livestock Industries, Queensland Biosciences Precinct, QLD 4057, Australia.
- Vincent Chand
  - Science and Engineering Faculty, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia.
- Nguyen Viet Tuan
  - Science and Engineering Faculty, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia.
- Vo Thi Minh Thu
  - International University, VNU HCMC, Quarter 6, Linh Trung Ward, Thu Duc District, Ho Chi Minh City, Viet Nam.
- Peter Mather
  - Science and Engineering Faculty, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia.
12
Jiang Y, Gao X, Liu S, Zhang Y, Liu H, Sun F, Bao L, Waldbieser G, Liu Z. Whole genome comparative analysis of channel catfish (Ictalurus punctatus) with four model fish species. BMC Genomics 2013; 14:780. [PMID: 24215161 PMCID: PMC3840565 DOI: 10.1186/1471-2164-14-780] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 05/19/2013] [Accepted: 10/28/2013] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Comparative mapping is a powerful tool to study the evolution of genomes. It allows transfer of genome information from well-studied model species to non-model species. Catfish is an economically important aquaculture species in the United States. A large number of genome resources have been developed from catfish, including genetic linkage maps, physical maps, BAC end sequences (BES), integrated linkage and physical maps using BES-derived markers, physical map contig-specific sequences, and draft genome sequences. Application of such genome resources should allow comparative analysis at the genome scale with several other model fish species. RESULTS: In this study, we conducted whole genome comparative analysis between channel catfish and four model fish species with fully sequenced genomes: zebrafish, medaka, stickleback and Tetraodon. A total of 517 Mb of draft genome sequences of catfish were anchored to its genetic linkage map, which accounted for 62% of the total draft genome sequences. Based on the location of homologous genes, homologous chromosomes were determined among catfish and the four model fish species. A large number of conserved syntenic blocks were identified. Analysis of the syntenic relationships between catfish and the four model fishes supported that the catfish genome is most similar to the genome of zebrafish. CONCLUSION: The organization of the catfish genome is similar to that of the four teleost species, zebrafish, medaka, stickleback, and Tetraodon, such that homologous chromosomes can be identified. Within each chromosome, extended syntenic blocks were evident, but the conserved syntenies at the chromosome level involve extensive inter-chromosomal and intra-chromosomal rearrangements. This whole genome comparative map should facilitate whole genome assembly and annotation in catfish, and will be useful for genomic studies of various other fish species.
Affiliation(s)
- Zhanjiang Liu
- The Fish Molecular Genetics and Biotechnology Laboratory, Department of Fisheries and Allied Aquacultures, Program of Cell and Molecular Biosciences, Aquatic Genomics Unit, 203 Swingle Hall, Auburn University, Auburn, AL 36849, USA.
|
13
|
Jiang Y, Ninwichian P, Liu S, Zhang J, Kucuktas H, Sun F, Kaltenboeck L, Sun L, Bao L, Liu Z. Generation of physical map contig-specific sequences useful for whole genome sequence scaffolding. PLoS One 2013; 8:e78872. [PMID: 24205335 PMCID: PMC3811975 DOI: 10.1371/journal.pone.0078872] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/25/2013] [Accepted: 09/16/2013] [Indexed: 11/29/2022] Open
Abstract
Along with the rapid advances of nextgen sequencing technologies, more and more species are being added to the list of organisms whose whole genomes are sequenced. However, the assembled draft genome of many organisms consists of numerous small contigs, due to the short length of the reads generated by nextgen sequencing platforms. In order to improve the assembly and bring the genome contigs together, more genome resources are needed. In this study, we developed a strategy to generate a valuable genome resource, physical map contig-specific sequences, which are randomly distributed genome sequences in each physical contig. A two-dimensional tagging method was used to create specific tags for 1,824 physical contigs, which dramatically reduced the cost. A total of 94,111,841 100-bp reads and 315,277 assembled contigs were identified as containing physical map contig-specific tags. The physical map contig-specific sequences, along with the currently available BAC end sequences, were then used to anchor the catfish draft genome contigs. A total of 156,457 genome contigs (~79% of the whole genome sequencing assembly) were anchored and grouped into 1,824 pools, in which 16,680 unique genes were annotated. The physical map contig-specific sequences are valuable resources for linking the physical map, genetic linkage map and draft whole genome sequences, and consequently can improve whole genome sequence assembly and scaffolding as well as genome-wide comparative analysis. The strategy developed in this study could also be adopted in other species whose whole genome assembly is still facing a challenge.
Affiliation(s)
- Yanliang Jiang
- The Fish Molecular Genetics and Biotechnology Laboratory, Aquatic Genomics Unit, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences, Auburn University, Auburn, Alabama, United States of America
- Parichart Ninwichian
- The Fish Molecular Genetics and Biotechnology Laboratory, Aquatic Genomics Unit, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences, Auburn University, Auburn, Alabama, United States of America
- Shikai Liu
- The Fish Molecular Genetics and Biotechnology Laboratory, Aquatic Genomics Unit, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences, Auburn University, Auburn, Alabama, United States of America
- Jiaren Zhang
- The Fish Molecular Genetics and Biotechnology Laboratory, Aquatic Genomics Unit, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences, Auburn University, Auburn, Alabama, United States of America
- Huseyin Kucuktas
- The Fish Molecular Genetics and Biotechnology Laboratory, Aquatic Genomics Unit, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences, Auburn University, Auburn, Alabama, United States of America
- Fanyue Sun
- The Fish Molecular Genetics and Biotechnology Laboratory, Aquatic Genomics Unit, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences, Auburn University, Auburn, Alabama, United States of America
- Ludmilla Kaltenboeck
- The Fish Molecular Genetics and Biotechnology Laboratory, Aquatic Genomics Unit, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences, Auburn University, Auburn, Alabama, United States of America
- Luyang Sun
- The Fish Molecular Genetics and Biotechnology Laboratory, Aquatic Genomics Unit, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences, Auburn University, Auburn, Alabama, United States of America
- Lisui Bao
- The Fish Molecular Genetics and Biotechnology Laboratory, Aquatic Genomics Unit, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences, Auburn University, Auburn, Alabama, United States of America
- Zhanjiang Liu
- The Fish Molecular Genetics and Biotechnology Laboratory, Aquatic Genomics Unit, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences, Auburn University, Auburn, Alabama, United States of America
|
14
|
Ghangal R, Chaudhary S, Jain M, Purty RS, Chand Sharma P. Optimization of de novo short read assembly of seabuckthorn (Hippophae rhamnoides L.) transcriptome. PLoS One 2013; 8:e72516. [PMID: 23991119 PMCID: PMC3749127 DOI: 10.1371/journal.pone.0072516] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 04/02/2013] [Accepted: 07/09/2013] [Indexed: 11/18/2022] Open
Abstract
Seabuckthorn (Hippophae rhamnoides L.) has been known for its medicinal, nutritional and environmental importance since ancient times. However, very limited efforts have been made to characterize the genome and transcriptome of this wonder plant. Here, we report the use of next generation massive parallel sequencing technology (Illumina platform) and de novo assembly to gain a comprehensive view of the seabuckthorn transcriptome. We assembled 86,253,874 high quality short reads using six assembly tools. In our hands, assembly of non-redundant short reads following a two-step procedure was found to be the best considering various assembly quality parameters. Initially, the ABySS tool was used following an additive k-mer approach. The assembled transcripts were subsequently subjected to the TGICL suite. Finally, de novo short read assembly yielded 88,297 transcripts (> 100 bp), representing about 53 Mb of the seabuckthorn transcriptome. The average length of transcripts was 610 bp, the N50 length was 1198 bp, and 91% of the short reads uniquely mapped back to the seabuckthorn transcriptome. A total of 41,340 (46.8%) transcripts showed significant similarity with sequences present in the nr protein databases of NCBI (E-value < 1E-06). We also screened the assembled transcripts for the presence of transcription factors and simple sequence repeats. Our strategy involving the use of a short read assembler (ABySS) followed by TGICL will be useful for researchers working with a non-model organism’s transcriptome in terms of saving time and reducing complexity in data management. The seabuckthorn transcriptome data generated here provide a valuable resource for gene discovery and development of functional molecular markers.
Affiliation(s)
- Rajesh Ghangal
- University School of Biotechnology, Guru Gobind Singh Indraprastha University, Dwarka, New Delhi, India
- Saurabh Chaudhary
- University School of Biotechnology, Guru Gobind Singh Indraprastha University, Dwarka, New Delhi, India
- Mukesh Jain
- National Institute of Plant Genome Research, New Delhi, India
- Ram Singh Purty
- University School of Biotechnology, Guru Gobind Singh Indraprastha University, Dwarka, New Delhi, India
- Prakash Chand Sharma
- University School of Biotechnology, Guru Gobind Singh Indraprastha University, Dwarka, New Delhi, India
|
15
|
Liu Q, Guo Y, Li J, Long J, Zhang B, Shyr Y. Steps to ensure accuracy in genotype and SNP calling from Illumina sequencing data. BMC Genomics 2012; 13 Suppl 8:S8. [PMID: 23281772 PMCID: PMC3535703 DOI: 10.1186/1471-2164-13-s8-s8] [Citation(s) in RCA: 82] [Impact Index Per Article: 6.8] [Indexed: 01/07/2023] Open
Abstract
Background: Accurate calling of SNPs and genotypes from next-generation sequencing data is an essential prerequisite for most human genetics studies. A number of computational steps are required or recommended when translating the raw sequencing data into the final calls. However, whether each step does contribute to the performance of variant calling and how it affects the accuracy still remain unclear, making it difficult to select and arrange appropriate steps to derive high quality variants from different sequencing data. In this study, we made a systematic assessment of the relative contribution of each step to the accuracy of variant calling from Illumina DNA sequencing data. Results: We found that the read preprocessing step did not improve the accuracy of variant calling, contrary to the general expectation. Although trimming off low-quality tails helped align more reads, it introduced lots of false positives. The ability of markup duplication, local realignment and recalibration to help eliminate false positive variants depended on the sequencing depth. Rearranging these steps did not affect the results. The relative performance of three popular multi-sample SNP callers, SAMtools, GATK, and GlfMultiples, also varied with the sequencing depth. Conclusions: Our findings clarify the necessity and effectiveness of computational steps for improving the accuracy of SNP and genotype calls from Illumina sequencing data and can serve as a general guideline for choosing SNP calling strategies for data with different coverage.
Affiliation(s)
- Qi Liu
- Center for Quantitative Sciences, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
|
16
|
Second-generation genetic linkage map of catfish and its integration with the BAC-based physical map. G3-GENES GENOMES GENETICS 2012; 2:1233-41. [PMID: 23050234 PMCID: PMC3464116 DOI: 10.1534/g3.112.003962] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 06/20/2012] [Accepted: 08/19/2012] [Indexed: 01/03/2023]
Abstract
Construction of high-density genetic linkage maps is crucially important for quantitative trait loci (QTL) studies, and they are more useful when integrated with physical maps. Such integrated maps are valuable genome resources for fine mapping of QTL, comparative genomics, and accurate and efficient whole-genome assembly. Previously, we established both linkage maps and a physical map for channel catfish, Ictalurus punctatus, the dominant aquaculture species in the United States. Here we added 2030 BAC end sequence (BES)-derived microsatellites from 1481 physical map contigs, as well as markers from singleton BES, ESTs, anonymous microsatellites, and SNPs, to construct a second-generation linkage map. Average marker density across the 29 linkage groups reached 1.4 cM/marker. The increased marker density highlighted variations in recombination rates within and among catfish chromosomes. This work effectively anchored 44.8% of the catfish BAC physical map contigs, covering ∼52.8% of the genome. The genome size was estimated to be 2546 cM on the linkage map, and the calculated physical distance per centimorgan was 393 Kb. This integrated map should enable comparative studies with teleost model species as well as provide a framework for ordering and assembling whole-genome scaffolds.
|
17
|
Analysis of genome survey sequences and SSR marker development for Siamese Mud Carp, Henicorhynchus siamensis, using 454 pyrosequencing. Int J Mol Sci 2012; 13:10807-10827. [PMID: 23109823 PMCID: PMC3472715 DOI: 10.3390/ijms130910807] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 07/05/2012] [Revised: 07/30/2012] [Accepted: 08/24/2012] [Indexed: 11/17/2022] Open
Abstract
Siamese mud carp (Henicorhynchus siamensis) is a freshwater teleost of high economic importance in the Mekong River Basin. However, genetic data relevant for delineating wild stocks for management purposes currently are limited for this species. Here, we used 454 pyrosequencing to generate a partial genome survey sequence (GSS) dataset to develop simple sequence repeat (SSR) markers from H. siamensis genomic DNA. Data generated included a total of 65,954 sequence reads with an average length of 264 nucleotides, of which 2.79% contain SSR motifs. Based on GSS-BLASTx results, 10.5% of contigs and 8.1% of singletons possessed significant similarity (E value < 10^-5), with the majority matching well to reported fish sequences. KEGG analysis identified several metabolic pathways that provide insights into specific potential roles and functions of sequences involved in molecular processes in H. siamensis. Top protein domains detected included reverse transcriptase, and the top putative functional transcript identified was an ORF2-encoded protein. One thousand eight hundred and thirty-seven sequences containing SSR motifs were identified, of which 422 qualified for primer design, and eight polymorphic loci have been tested, with average observed and expected heterozygosity estimated at 0.75 and 0.83, respectively. Regardless of their relative levels of polymorphism and heterozygosity, microsatellite loci developed here are suitable for further population genetic studies in H. siamensis and may also be applicable to other related taxa.
|
18
|
Mehinto AC, Martyniuk CJ, Spade DJ, Denslow ND. Applications for next-generation sequencing in fish ecotoxicogenomics. Front Genet 2012; 3:62. [PMID: 22539934 PMCID: PMC3336092 DOI: 10.3389/fgene.2012.00062] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Received: 01/30/2012] [Accepted: 03/02/2012] [Indexed: 01/23/2023] Open
Abstract
The new technologies for next-generation sequencing (NGS) and global gene expression analyses that are widely used in molecular medicine are increasingly applied to the field of fish biology. This has facilitated new directions to address research areas that could not be previously considered due to the lack of molecular information for ecologically relevant species. Over the past decade, the cost of NGS has decreased significantly, making it possible to use non-model fish species to investigate emerging environmental issues. NGS technologies have permitted researchers to obtain large amounts of raw data in short periods of time. There have also been significant improvements in bioinformatics to assemble the sequences and annotate the genes, thus facilitating the management of these large datasets. The combination of DNA sequencing and bioinformatics has improved our abilities to design custom microarrays and study the genome and transcriptome of a wide variety of organisms. Despite the promising results obtained using these techniques in fish studies, NGS technologies are currently underused in ecotoxicogenomics and few studies have employed these methods. These issues should be addressed in order to exploit the full potential of NGS in ecotoxicological studies and expand our understanding of the biology of non-model organisms.
Affiliation(s)
- Alvine C Mehinto
- Center for Environmental and Human Toxicology, Department of Physiological Sciences, University of Florida, Gainesville, FL, USA
|
19
|
Lim JS, Choi BS, Lee JS, Shin C, Yang TJ, Rhee JS, Lee JS, Choi IY. Survey of the Applications of NGS to Whole-Genome Sequencing and Expression Profiling. Genomics Inform 2012; 10:1-8. [PMID: 23105922 PMCID: PMC3475479 DOI: 10.5808/gi.2012.10.1.1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 02/03/2012] [Revised: 02/15/2012] [Accepted: 02/17/2012] [Indexed: 01/16/2023] Open
Abstract
Recently, the technologies of DNA sequence variation and gene expression profiling have been used widely as approaches in the fields of genome biology and genetics. The application to genome study has been particularly developed with the introduction of the next-generation DNA sequencer (NGS) Roche/454 and Illumina/Solexa systems, along with bioinformatics analysis technologies for whole-genome de novo assembly, expression profiling, DNA variation discovery, and genotyping. Both massive whole-genome shotgun paired-end sequencing and mate paired-end sequencing data are important for constructing a de novo assembly of novel genome sequencing data. For an effective de novo assembly, it is necessary to have DNA sequence information from a multiplatform NGS with at least 2× and 30× depth of genome coverage using Roche/454 and Illumina/Solexa, respectively. Massive short-read data from the Illumina/Solexa system is enough to discover DNA variation, reducing the cost of DNA sequencing. Whole-genome expression profile data are useful for approaching genome systems biology through quantification of expressed RNAs from a whole-genome transcriptome, depending on the tissue samples. The hybrid mRNA sequences from Roche/454 and Illumina/Solexa are more powerful for finding novel genes through de novo assembly in any whole-genome sequenced species. The 20× and 50× coverage of the estimated transcriptome sequences using Roche/454 and Illumina/Solexa, respectively, is effective for creating novel expressed reference sequences. However, only an average 30× coverage of a transcriptome with short read sequences of Illumina/Solexa is enough to check expression quantification, compared to the reference expressed sequence tag sequence.
Affiliation(s)
- Jong-Sung Lim
- National Instrumentation Center for Environmental Management, College of Agriculture and Life Sciences, Seoul National University, Seoul 151-921, Korea
|