1
|
Lo T, Coombe L, Gagalova KK, Marr A, Warren RL, Kirk H, Pandoh P, Zhao Y, Moore RA, Mungall AJ, Ritland C, Pavy N, Jones SJM, Bohlmann J, Bousquet J, Birol I, Thomson A. Assembly and annotation of the black spruce genome provide insights on spruce phylogeny and evolution of stress response. G3 (BETHESDA, MD.) 2023; 14:jkad247. [PMID: 37875130 PMCID: PMC10755193 DOI: 10.1093/g3journal/jkad247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 05/17/2023] [Accepted: 10/09/2023] [Indexed: 10/26/2023]
Abstract
Black spruce (Picea mariana [Mill.] B.S.P.) is a dominant conifer species in the North American boreal forest that plays important ecological and economic roles. Here, we present the first genome assembly of P. mariana with a reconstructed genome size of 18.3 Gbp and NG50 scaffold length of 36.0 kbp. A total of 66,332 protein-coding sequences were predicted in silico and annotated based on sequence homology. We analyzed the evolutionary relationships between P. mariana and 5 other spruces for which complete nuclear and organelle genome sequences were available. The phylogenetic tree estimated from mitochondrial genome sequences agrees with biogeography; specifically, P. mariana was strongly supported as a sister lineage to P. glauca and 3 other taxa found in western North America, followed by the European Picea abies. We obtained mixed topologies with weaker statistical support in phylogenetic trees estimated from nuclear and chloroplast genome sequences, indicative of ancient reticulate evolution affecting these 2 genomes. Clustering of protein-coding sequences from the 6 Picea taxa and 2 Pinus species resulted in 34,776 orthogroups, 560 of which appeared to be specific to P. mariana. Analysis of these specific orthogroups and dN/dS analysis of positive selection signatures for 497 single-copy orthogroups identified gene functions mostly related to plant development and stress response. The P. mariana genome assembly and annotation provides a valuable resource for forest genetics research and applications in this broadly distributed species, especially in relation to climate adaptation.
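The NG50 statistic quoted in this abstract can be illustrated with a minimal sketch. It assumes the usual definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the estimated genome size (N50 uses the assembly's own total length instead). The function names and the toy lengths are hypothetical and are not taken from the paper.

```python
def ng50(lengths, genome_size):
    """Length L such that scaffolds of length >= L cover at least half of genome_size.

    Returns 0 when the whole assembly covers less than half of genome_size.
    """
    half = genome_size / 2
    covered = 0
    # Walk scaffolds from longest to shortest, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        covered += length
        if covered >= half:
            return length
    return 0


def n50(lengths):
    """N50 is NG50 computed against the assembly's own total length."""
    return ng50(lengths, sum(lengths))


# Toy scaffold lengths (illustrative only).
toy_lengths = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_lengths)
toy_ng50 = ng50(toy_lengths, genome_size=400)
```

Because NG50 is normalized by the estimated genome size rather than the assembly size, it allows assemblies of different total lengths to be compared for the same genome.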
Affiliation(s)
- Theodora Lo
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
| | - Lauren Coombe
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
| | - Kristina K Gagalova
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
| | - Alex Marr
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
| | - René L Warren
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
| | - Heather Kirk
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
| | - Pawan Pandoh
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
| | - Yongjun Zhao
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
| | - Richard A Moore
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
| | - Andrew J Mungall
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
| | - Carol Ritland
- Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Nathalie Pavy
- Canada Research Chair in Forest Genomics, Laval University, Quebec City, QC G1V 0A6, Canada
| | - Steven J M Jones
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
| | - Joerg Bohlmann
- Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Department of Botany, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Jean Bousquet
- Canada Research Chair in Forest Genomics, Laval University, Quebec City, QC G1V 0A6, Canada
| | - Inanç Birol
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
| | - Ashley Thomson
- Faculty of Natural Resources Management, Lakehead University, Thunder Bay, ON P7B 5E1, Canada
| |
|
2
|
Assembly and Annotation of Red Spruce (Picea rubens) Chloroplast Genome, Identification of Simple Sequence Repeats, and Phylogenetic Analysis in Picea. Int J Mol Sci 2022; 23:ijms232315243. [PMID: 36499570 PMCID: PMC9739956 DOI: 10.3390/ijms232315243] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/06/2022] [Revised: 11/22/2022] [Accepted: 11/27/2022] [Indexed: 12/11/2022] Open
Abstract
We have sequenced the chloroplast genome of red spruce (Picea rubens) for the first time using single-end, short-read (44 bp) Illumina sequences, assembled and functionally annotated it, and identified simple sequence repeats (SSRs). The contigs were assembled using SOAPdenovo2 following the retrieval of chloroplast genome sequences using the black spruce (Picea mariana) chloroplast genome as the reference. The assembled genome length was 122,115 bp (gaps included). Comparatively, the P. rubens chloroplast genome reported here may be considered a near-complete draft. Global genome alignment and phylogenetic analysis based on the whole chloroplast genome sequences of Picea rubens and 10 other Picea species revealed high sequence synteny and conservation among the 11 Picea species and phylogenetic relationships consistent with their known classical interrelationships and published molecular phylogeny. The P. rubens chloroplast genome sequence showed the highest similarity with that of P. mariana and the lowest with that of P. sitchensis. We have annotated 107 genes, including 69 protein-coding genes, 28 tRNAs, 4 rRNAs, and a few pseudogenes, identified 42 SSRs, and successfully designed primers for 26 SSRs. Mononucleotide A/T repeats were the most common, followed by dinucleotide AT repeats. A similar pattern of microsatellite repeat occurrence was found in the chloroplast genomes of the 11 Picea species.
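The kind of SSR scan this abstract describes (mononucleotide and dinucleotide repeats) can be sketched with regular expressions. This is only an illustration: the function name `find_ssrs` and the minimum repeat thresholds (10 for mononucleotides, 5 for dinucleotides) are assumptions, since the abstract does not state the tool or criteria the authors used.

```python
import re


def find_ssrs(seq, min_mono=10, min_di=5):
    """Return sorted (start, motif, repeat_count) tuples for mono- and dinucleotide SSRs.

    min_mono and min_di are assumed thresholds, not values from the paper.
    """
    seq = seq.upper()
    hits = []
    # Mononucleotide runs: one base followed by at least (min_mono - 1) copies of itself.
    mono = re.compile(r"([ACGT])\1{%d,}" % (min_mono - 1))
    for m in mono.finditer(seq):
        hits.append((m.start(), m.group(1), len(m.group(0))))
    # Dinucleotide runs: two different bases repeated at least min_di times in total.
    di = re.compile(r"(([ACGT])(?!\2)[ACGT])\1{%d,}" % (min_di - 1))
    for m in di.finditer(seq):
        hits.append((m.start(), m.group(1), len(m.group(0)) // 2))
    return sorted(hits)


# Toy sequence: a 12-base poly-A run and six AT repeats (illustrative only).
toy_seq = "GC" + "A" * 12 + "CG" + "AT" * 6 + "C"
toy_hits = find_ssrs(toy_seq)
```

Start positions are 0-based. Dedicated SSR finders typically also handle longer motifs and compound repeats, which this sketch leaves out.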
|
3
|
Jackman SD, Coombe L, Warren RL, Kirk H, Trinh E, MacLeod T, Pleasance S, Pandoh P, Zhao Y, Coope RJ, Bousquet J, Bohlmann J, Jones SJM, Birol I. Complete Mitochondrial Genome of a Gymnosperm, Sitka Spruce (Picea sitchensis), Indicates a Complex Physical Structure. Genome Biol Evol 2021; 12:1174-1179. [PMID: 32449750 PMCID: PMC7486957 DOI: 10.1093/gbe/evaa108] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Accepted: 05/20/2020] [Indexed: 12/12/2022] Open
Abstract
Plant mitochondrial genomes vary widely in size. Although many plant mitochondrial genomes have been sequenced and assembled, the vast majority are of angiosperms, and few are of gymnosperms. Most plant mitochondrial genomes are smaller than a megabase, with a few notable exceptions. We have sequenced and assembled the complete 5.5-Mb mitochondrial genome of Sitka spruce (Picea sitchensis), to date, one of the largest mitochondrial genomes of a gymnosperm. We sequenced the whole genome using Oxford Nanopore MinION, and then identified contigs of mitochondrial origin assembled from these long reads based on sequence homology to the white spruce mitochondrial genome. The assembly graph shows a multipartite genome structure, composed of one smaller 168-kb circular segment of DNA, and a larger 5.4-Mb single component with a branching structure. The assembly graph gives insight into a putative complex physical genome structure, and its branching points may represent active sites of recombination.
Affiliation(s)
- Shaun D Jackman
- Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
| | - Lauren Coombe
- Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
| | - René L Warren
- Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
| | - Heather Kirk
- Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
| | - Eva Trinh
- Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
| | - Tina MacLeod
- Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
| | - Stephen Pleasance
- Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
| | - Pawan Pandoh
- Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
| | - Yongjun Zhao
- Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
| | - Robin J Coope
- Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
| | - Jean Bousquet
- Forest Genomics, Institute for Systems and Integrative Biology, Université Laval, Quebec, Quebec, Canada
| | - Joerg Bohlmann
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
| | - Steven J M Jones
- Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
| | - Inanc Birol
- Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
| |
|
4
|
Fouret J, Brunet FG, Binet M, Aurine N, Enchéry F, Croze S, Guinier M, Goumaidi A, Preininger D, Volff JN, Bailly-Bechet M, Lachuer J, Horvat B, Legras-Lachuer C. Sequencing the Genome of Indian Flying Fox, Natural Reservoir of Nipah Virus, Using Hybrid Assembly and Conservative Secondary Scaffolding. Front Microbiol 2020; 11:1807. [PMID: 32849415 PMCID: PMC7403528 DOI: 10.3389/fmicb.2020.01807] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/15/2020] [Accepted: 07/09/2020] [Indexed: 11/20/2022] Open
Abstract
The Indian flying fox (Pteropus medius), a fruit bat, has been identified as an asymptomatic natural host of the recently emerged Nipah virus, which is known to induce a severe infectious disease in humans. The absence of a P. medius genome sequence presents an important obstacle to further studies of virus–host interactions and to a better understanding of the mechanisms of zoonotic viral emergence. Generating a high-quality genome sequence often requires considerable effort and elevated costs. Although secondary scaffolding methods have reduced sequencing expenses, they require the development of new tools for integrating different data sources to achieve more reliable sequencing results. We initially sequenced the P. medius genome using a combination of Illumina paired-end and Nanopore sequencing, with depths of 57.4x and 6.1x, respectively. Then, we introduced the novel scaff2link software to integrate multiple sources of information for secondary scaffolding, allowing the removal of associations for which the two sources give discordant information. Different quality metrics were then produced to validate the benefits of secondary scaffolding. The P. medius genome, assembled by this method, has a length of 1,985 Mb and consists of 33,613 contigs and 16,113 scaffolds with an NG50 of 19 Mb. At least 22.5% of the assembled sequence is covered by interspersed repeats already described in other species, and 19,823 coding genes are annotated. Phylogenetic analysis demonstrated the clustering of the P. medius genome with those of two other Pteropus bat species, P. alecto and P. vampyrus, for which genome sequences are currently available. The SARS-CoV entry receptor ACE2 sequence of P. medius was 82.7% identical to the ACE2 of Rhinolophus sinicus bats, thought to be the natural host of SARS-CoV.
Altogether, our results confirm that a lower depth of sequencing is enough to obtain a valuable genome sequence, using secondary scaffolding approaches and demonstrate the benefits of the scaff2link application. The genome sequence is now available to the scientific community to (i) proceed with further genomic analysis of P. medius, (ii) to characterize the underlying mechanism allowing Nipah virus maintenance and perpetuation in its bat host, and (iii) to monitor their evolutionary pathways toward a better understanding of bats’ ability to control viral infections.
Affiliation(s)
- Julien Fouret
- CIRI, International Center for Infectiology Research, Team Immunobiology of Viral Infections, Univ Lyon, INSERM U1111, CNRS UMR 5308, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, Lyon, France.,Viroscan3D, Trévoux, France
| | - Frédéric G Brunet
- Institut de Génomique Fonctionnelle de Lyon, Université de Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, Lyon, France
| | - Martin Binet
- CIRI, International Center for Infectiology Research, Team Immunobiology of Viral Infections, Univ Lyon, INSERM U1111, CNRS UMR 5308, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, Lyon, France.,Viroscan3D, Trévoux, France
| | - Noémie Aurine
- CIRI, International Center for Infectiology Research, Team Immunobiology of Viral Infections, Univ Lyon, INSERM U1111, CNRS UMR 5308, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, Lyon, France
| | - Francois Enchéry
- CIRI, International Center for Infectiology Research, Team Immunobiology of Viral Infections, Univ Lyon, INSERM U1111, CNRS UMR 5308, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, Lyon, France
| | - Séverine Croze
- Plateforme Profilexpert, Université Claude Bernard Lyon 1, Lyon, France
| | | | | | | | - Jean-Nicolas Volff
- Institut de Génomique Fonctionnelle de Lyon, Université de Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, Lyon, France
| | | | - Joël Lachuer
- Cancer Research Center of Lyon, INSERM 1052/CNRS 5286, Université de Lyon, Lyon, France.,Plateforme Profilexpert, Université Claude Bernard Lyon 1, Lyon, France
| | - Branka Horvat
- CIRI, International Center for Infectiology Research, Team Immunobiology of Viral Infections, Univ Lyon, INSERM U1111, CNRS UMR 5308, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, Lyon, France
| | - Catherine Legras-Lachuer
- Viroscan3D, Trévoux, France.,Ecologie Microbienne, CNRS UMR 5557, LEM, INRA, VetAgro Sup, Université Claude Bernard Lyon 1, Villeurbanne, France
| |
|
5
|
Single-molecule analysis of nucleic acid biomarkers - A review. Anal Chim Acta 2020; 1115:61-85. [PMID: 32370870 DOI: 10.1016/j.aca.2020.03.001] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 10/16/2019] [Revised: 02/29/2020] [Accepted: 03/02/2020] [Indexed: 12/11/2022]
Abstract
Nucleic acids are important biomarkers for disease detection, monitoring, and treatment. Advances in technologies for nucleic acid analysis have enabled discovery and clinical implementation of nucleic acid biomarkers. However, challenges remain with technologies for nucleic acid analysis, thereby limiting the use of nucleic acid biomarkers in certain contexts. Here, we review single-molecule technologies for nucleic acid analysis that can be used to overcome these challenges. We first discuss the various types of nucleic acid biomarkers important for clinical applications and conventional technologies for nucleic acid analysis. We then discuss technologies for single-molecule in vitro and in situ analysis of nucleic acid biomarkers. Finally, we discuss other ultra-sensitive techniques for nucleic acid biomarker detection.
|
6
|
Wang C, Liu W, Shen Y, Chen J, Zhu H, Yang X, Jiang X, Wang Y, Zhou J. Cardiomyocyte dedifferentiation and remodeling in 3D scaffolds to generate the cellular diversity of engineering cardiac tissues. Biomater Sci 2019; 7:4636-4650. [PMID: 31455969 DOI: 10.1039/c9bm01003c] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/11/2022]
Abstract
The use of engineered cardiac tissues (ECTs) is a new strategy for the repair and replacement of cardiac tissues in patients with myocardial infarction, particularly at late stages. However, the mechanisms underlying the development of ECTs, including cell-scaffold interactions, are not fully understood, although they are closely related to their therapeutic effect. In the present study, we aimed to determine the cellular fate of cardiomyocytes in a 3D scaffold microenvironment, as well as their role in generating the cellular diversity of ECTs by single-cell sequencing analysis. Consistent with the observed plasticity of cardiomyocytes during cardiac regeneration, cardiomyocytes in 3D scaffolds appeared to dedifferentiate, showing an initial loss of normal cytoskeleton organization in the adaptive response to the new scaffold microenvironment. Cardiomyocytes undergoing this process regained their proliferation potential and gradually developed into myocardial cells at different developmental stages, generating heterogeneous regenerative ECTs. To better characterize the remodeled ECTs, high-throughput single-cell sequencing was performed. The ECTs contained a wide diversity of cells related to endogenous classes in the heart, including myocardial cells at different developmental stages and different kinds of interstitial cells. Non-cardiac cells seemed to play important roles in cardiac reconstruction, especially Cajal-like interstitial cells and macrophages. Altogether, our results showed for the first time that cells underwent adaptive dedifferentiation for survival in a 3D scaffold microenvironment to generate heterogeneous tissues. These findings provide an important basis for an improved understanding of the development and assembly of engineered tissues.
Affiliation(s)
- Changyong Wang
- Tissue Engineering Research Center, Academy of Military Medical Sciences and Department of Neural Engineering and Biological Interdisciplinary Studies, Institute of Military Cognition and Brain Sciences, Academy of Military Medical Sciences, 27 Taiping Rd, Beijing 100850, PR China
| | - Wei Liu
- Tissue Engineering Research Center, Academy of Military Medical Sciences and Department of Neural Engineering and Biological Interdisciplinary Studies, Institute of Military Cognition and Brain Sciences, Academy of Military Medical Sciences, 27 Taiping Rd, Beijing 100850, PR China
| | - Yuan Shen
- Tissue Engineering Research Center, Academy of Military Medical Sciences and Department of Neural Engineering and Biological Interdisciplinary Studies, Institute of Military Cognition and Brain Sciences, Academy of Military Medical Sciences, 27 Taiping Rd, Beijing 100850, PR China
| | - Jiayun Chen
- College of Life Science and Technology, Huazhong Agricultural university, No.1, shizishan street, Wuhan 430070, PR China
| | - Huimin Zhu
- Tissue Engineering Research Center, Academy of Military Medical Sciences and Department of Neural Engineering and Biological Interdisciplinary Studies, Institute of Military Cognition and Brain Sciences, Academy of Military Medical Sciences, 27 Taiping Rd, Beijing 100850, PR China
| | - Xiaoning Yang
- Tissue Engineering Research Center, Academy of Military Medical Sciences and Department of Neural Engineering and Biological Interdisciplinary Studies, Institute of Military Cognition and Brain Sciences, Academy of Military Medical Sciences, 27 Taiping Rd, Beijing 100850, PR China
| | - Xiaoxia Jiang
- Tissue Engineering Research Center, Academy of Military Medical Sciences and Department of Neural Engineering and Biological Interdisciplinary Studies, Institute of Military Cognition and Brain Sciences, Academy of Military Medical Sciences, 27 Taiping Rd, Beijing 100850, PR China
| | - Yan Wang
- Tissue Engineering Research Center, Academy of Military Medical Sciences and Department of Neural Engineering and Biological Interdisciplinary Studies, Institute of Military Cognition and Brain Sciences, Academy of Military Medical Sciences, 27 Taiping Rd, Beijing 100850, PR China
| | - Jin Zhou
- Tissue Engineering Research Center, Academy of Military Medical Sciences and Department of Neural Engineering and Biological Interdisciplinary Studies, Institute of Military Cognition and Brain Sciences, Academy of Military Medical Sciences, 27 Taiping Rd, Beijing 100850, PR China
| |
|
7
|
Vodiasova EA, Chelebieva ES, Kuleshova ON. The new technologies of high-throughput single-cell RNA sequencing. Vavilovskii Zhurnal Genet Selektsii 2019. [DOI: 10.18699/vj19.520] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/19/2022] Open
Abstract
The wealth of genome and transcriptome data obtained for whole organisms using next-generation sequencing (NGS) technologies has left many questions unanswered in oncology, immunology, physiology, neurobiology, zoology and other fields of science and medicine. Since the cell is the basic unit of life in all unicellular and multicellular organisms, biological processes need to be studied at the cellular level. This understanding gave impetus to a new direction: the creation of technologies for working with individual cells (single-cell technologies). The rapid development of both instruments and advanced protocols for working with single cells reflects the relevance of these studies in many fields of science and medicine. Single-cell technologies make it possible to study the features of various stages of ontogenesis, identify patterns of cell differentiation and subsequent tissue development, conduct genomic and transcriptome analyses in various areas of medicine (they are especially in demand in immunology and oncology), and identify cell types and states and patterns of biochemical and physiological processes, allowing comprehensive research to be conducted at a new level. The first technologies for RNA sequencing of individual cell transcriptomes (scRNA-seq) captured no more than one hundred cells at a time, which was insufficient given the high cell heterogeneity, the existence of minor cell types (not detectable by morphology) and complex regulatory pathways. Techniques for isolating, capturing and sequencing the transcripts of tens of thousands of cells at a time are now evolving. However, the new technologies differ both at the sample preparation stage and during the bioinformatics analysis.
In this paper we consider the most effective methods of massively parallel scRNA-seq, using 10x Genomics as an example, as well as the specifics of such an experiment, the subsequent bioinformatics analysis of the data, and the future outlook and applications of new high-throughput technologies.
Affiliation(s)
- E. A. Vodiasova
- A.O. Kovalevsky Institute of Biology of the Southern Seas, RAS
| | | | - O. N. Kuleshova
- A.O. Kovalevsky Institute of Biology of the Southern Seas, RAS
| |
|
8
|
A Reference Genome Sequence for the European Silver Fir (Abies alba Mill.): A Community-Generated Genomic Resource. G3-GENES GENOMES GENETICS 2019; 9:2039-2049. [PMID: 31217262 PMCID: PMC6643874 DOI: 10.1534/g3.119.400083] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Indexed: 02/08/2023]
Abstract
Silver fir (Abies alba Mill.) is a keystone conifer of European montane forest ecosystems that has experienced large fluctuations in population size during the Quaternary and, more recently, due to land-use change. To forecast the species’ future distribution and survival, it is important to investigate the genetic basis of adaptation to environmental change, notably to extreme events. For this purpose, we here provide a first draft genome assembly and annotation of the silver fir genome, established through a community-based initiative. DNA obtained from haploid megagametophyte and diploid needle tissue was used to construct and sequence Illumina paired-end and mate-pair libraries, respectively, to high depth. The assembled A. alba genome sequence accounted for over 37 million scaffolds corresponding to 18.16 Gb, with a scaffold N50 of 14,051 bp. Despite the fragmented nature of the assembly, a total of 50,757 full-length genes were functionally annotated in the nuclear genome. The chloroplast genome was also assembled into a single scaffold (120,908 bp) that shows a high collinearity with both the A. koreana and A. sibirica complete chloroplast genomes. This first genome assembly of silver fir is an important genomic resource that is now publicly available in support of a new generation of research. By genome-enabling this important conifer, this resource will open the gate for new research and more precise genetic monitoring of European silver fir forests.
|
9
|
Complete Chloroplast Genome Sequence of an Engelmann Spruce (Picea engelmannii, Genotype Se404-851) from Western Canada. Microbiol Resour Announc 2019; 8:8/24/e00382-19. [PMID: 31196920 PMCID: PMC6588038 DOI: 10.1128/mra.00382-19] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/20/2022] Open
Abstract
Engelmann spruce (Picea engelmannii) is a conifer found primarily on the west coast of North America. Here, we present the complete chloroplast genome sequence of Picea engelmannii genotype Se404-851. This chloroplast sequence will benefit future conifer genomic research and contribute resources to further species conservation efforts.
|
10
|
Complete Chloroplast Genome Sequence of a White Spruce (Picea glauca, Genotype WS77111) from Eastern Canada. Microbiol Resour Announc 2019; 8:8/23/e00381-19. [PMID: 31171622 PMCID: PMC6554609 DOI: 10.1128/mra.00381-19] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open
Abstract
Here, we present the complete chloroplast genome sequence of white spruce (Picea glauca, genotype WS77111), a coniferous tree widespread in the boreal forests of North America. This sequence contributes to genomic and phylogenetic analyses of the Picea genus that are part of ongoing research to understand their adaptation to environmental stress.
|
11
|
Zeeshan S, Xiong R, Liang BT, Ahmed Z. 100 Years of evolving gene-disease complexities and scientific debutants. Brief Bioinform 2019; 21:885-905. [PMID: 30972412 DOI: 10.1093/bib/bbz038] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 01/09/2019] [Revised: 03/06/2019] [Accepted: 03/08/2019] [Indexed: 12/22/2022] Open
Abstract
The word 'gene' has been in use for over 100 years, and the concept has progressively evolved in several scientific directions. Successive technological advances have revolutionized the field of genomics, for example triplet code development, gene number proposition, genetic mapping, data banks, gene-disease maps, catalogs of human genes and genetic disorders, CRISPR/Cas9, big data and next generation sequencing. In this manuscript, we present the progress of genomics from pea plant genetics to the human genome project and highlight the molecular, technical and computational developments. Studying the genome and epigenome has revealed the fundamentals of the development and progression of human diseases, which include chromosomal, monogenic, multifactorial and mitochondrial diseases. The World Health Organization has classified, standardized and maintained all human diseases, while many academic and commercial online systems share information about genes and link them to associated diseases. To efficiently fathom the wealth of this biological data, there is a crucial need to generate appropriate gene annotation repositories and resources. Our focus has been on how many gene-disease databases are available worldwide and which sources are authentic, regularly updated and recommended for research and clinical purposes. In this manuscript, we have discussed and compared 43 such databases and bioinformatics applications, which enable users to connect, explore and, if possible, download gene-disease data.
Affiliation(s)
- Saman Zeeshan
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
| | - Ruoyun Xiong
- Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA
| | - Bruce T Liang
- Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA.,Pat and Jim Calhoun Cardiology Center, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA
| | - Zeeshan Ahmed
- Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA
| |
|
12
|
Tuskan GA, Groover AT, Schmutz J, DiFazio SP, Myburg A, Grattapaglia D, Smart LB, Yin T, Aury JM, Kremer A, Leroy T, Le Provost G, Plomion C, Carlson JE, Randall J, Westbrook J, Grimwood J, Muchero W, Jacobson D, Michener JK. Hardwood Tree Genomics: Unlocking Woody Plant Biology. FRONTIERS IN PLANT SCIENCE 2018; 9:1799. [PMID: 30619389 PMCID: PMC6304363 DOI: 10.3389/fpls.2018.01799] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/15/2018] [Accepted: 11/19/2018] [Indexed: 05/07/2023]
Abstract
Woody perennial angiosperms (i.e., hardwood trees) are polyphyletic in origin and occur in most angiosperm orders. Despite their independent origins, hardwoods have shared physiological, anatomical, and life history traits distinct from their herbaceous relatives. New high-throughput DNA sequencing platforms have provided access to numerous woody plant genomes beyond the early reference genomes of Populus and Eucalyptus, references that now include willow and oak, with pecan and chestnut soon to follow. Genomic studies within these diverse and undomesticated species have successfully linked genes to ecological, physiological, and developmental traits directly. Moreover, comparative genomic approaches are providing insights into speciation events while large-scale DNA resequencing of native collections is identifying population-level genetic diversity responsible for variation in key woody plant biology across and within species. Current research is focused on developing genomic prediction models for breeding, defining speciation and local adaptation, detecting and characterizing somatic mutations, revealing the mechanisms of gender determination and flowering, and application of systems biology approaches to model complex regulatory networks underlying quantitative traits. Emerging technologies such as single-molecule, long-read sequencing are being employed as additional woody plant species, and genotypes within species, are sequenced, thus enabling a comparative ("evo-devo") approach to understanding the unique biology of large woody plants. Resource availability, current genomic and genetic applications, new discoveries and predicted future developments are illustrated and discussed for poplar, eucalyptus, willow, oak, chestnut, and pecan.
Affiliation(s)
- Gerald A. Tuskan
  - Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory (DOE), Oak Ridge, TN, United States
- Andrew T. Groover
  - Pacific Southwest Research Station, USDA Forest Service, Davis, CA, United States
- Jeremy Schmutz
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, United States
  - Joint Genome Institute, Walnut Creek, CA, United States
- Alexander Myburg
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute, University of Pretoria, Pretoria, South Africa
- Dario Grattapaglia
  - Embrapa Recursos Genéticos e Biotecnologia, Brasília, Brazil
  - Universidade Católica de Brasília, Brasília, Brazil
- Lawrence B. Smart
  - Horticulture Section, School of Integrative Plant Science, Cornell University, Geneva, NY, United States
- Tongming Yin
  - The Key Laboratory for Poplar Improvement of Jiangsu Province, Nanjing Forestry University, Nanjing, China
- Jean-Marc Aury
  - Commissariat à l’Energie Atomique, Genoscope, Institut de Biologie François-Jacob, Evry, France
- Thibault Leroy
  - BIOGECO, INRA, Université de Bordeaux, Cestas, France
  - ISEM, CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
- John E. Carlson
  - Schatz Center for Tree Molecular Genetics, Department of Ecosystem Science and Management, Pennsylvania State University, University Park, PA, United States
- Jennifer Randall
  - Department of Entomology, Plant Pathology and Weed Science, New Mexico State University, Las Cruces, NM, United States
- Jared Westbrook
  - The American Chestnut Foundation, Asheville, NC, United States
- Jane Grimwood
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, United States
- Wellington Muchero
  - Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory (DOE), Oak Ridge, TN, United States
- Daniel Jacobson
  - Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory (DOE), Oak Ridge, TN, United States
- Joshua K. Michener
  - Center for Bioenergy Innovation, Biosciences Division, Oak Ridge National Laboratory (DOE), Oak Ridge, TN, United States
|
13
|
Ott A, Schnable JC, Yeh CT, Wu L, Liu C, Hu HC, Dalgard CL, Sarkar S, Schnable PS. Linked read technology for assembling large complex and polyploid genomes. BMC Genomics 2018; 19:651. [PMID: 30180802 PMCID: PMC6122573 DOI: 10.1186/s12864-018-5040-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 12/22/2017] [Accepted: 08/27/2018] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: Short read DNA sequencing technologies have revolutionized genome assembly by providing high accuracy and throughput data at low cost. But it remains challenging to assemble short read data, particularly for large, complex and polyploid genomes. The linked read strategy has the potential to enhance the value of short reads for genome assembly because all reads originating from a single long molecule of DNA share a common barcode. However, the majority of studies to date that have employed linked reads were focused on human haplotype phasing and genome assembly. RESULTS: Here we describe a de novo maize B73 genome assembly generated via linked read technology which contains ~172,000 scaffolds with an N50 of 89 kb that cover 50% of the genome. Based on comparisons to the B73 reference genome, 91% of linked read contigs are accurately assembled. Because it was possible to identify errors with >76% accuracy using machine learning, it may be possible to identify and potentially correct systematic errors. Complex polyploids represent one of the last grand challenges in genome assembly. Linked read technology was able to successfully resolve the two subgenomes of the recent allopolyploid, proso millet (Panicum miliaceum). Our assembly covers ~83% of the 1 Gb genome and consists of 30,819 scaffolds with an N50 of 912 kb. CONCLUSIONS: Our analysis provides a framework for future de novo genome assemblies using linked reads, and we suggest computational strategies that if implemented have the potential to further improve linked read assemblies, particularly for repetitive genomes.
Affiliation(s)
- Alina Ott
  - Department of Agronomy, Iowa State University, Ames, IA 50011 USA
  - Present address: Roche Sequencing Solutions, 500 S Rosa Road, Madison, WI 53719 USA
- James C. Schnable
  - Department of Agriculture and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588 USA
  - Data2Bio LLC, 2079 Roy J Carver Co-Laboratory, 1111 WOI Rd, Ames, IA 50011 USA
  - Dryland Genetics LLC, 2073 Roy J Carver Co-Laboratory, 1111 WOI Rd, Ames, IA 50011 USA
- Cheng-Ting Yeh
  - Department of Agronomy, Iowa State University, Ames, IA 50011 USA
  - Data2Bio LLC, 2079 Roy J Carver Co-Laboratory, 1111 WOI Rd, Ames, IA 50011 USA
- Linjiang Wu
  - Department of Mechanical Engineering, Iowa State University, Ames, IA 50011 USA
- Chao Liu
  - Department of Mechanical Engineering, Iowa State University, Ames, IA 50011 USA
  - Present address: Department of Thermal Engineering, Tsinghua University, Beijing, 100084 China
- Heng-Cheng Hu
  - The American Genome Center, Uniformed Services University of the Health Sciences, Bethesda, MD 20814 USA
  - Collaborative Health Initiative Research Program (CHIRP), Uniformed Services University School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD 20814 USA
  - Present address: Qiagen Sciences Inc, 6951 Executive Way, Frederick, MD 21703 USA
- Clifton L. Dalgard
  - The American Genome Center, Uniformed Services University of the Health Sciences, Bethesda, MD 20814 USA
  - Collaborative Health Initiative Research Program (CHIRP), Uniformed Services University School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD 20814 USA
  - Department of Anatomy, Physiology and Genetics, Uniformed Services University School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD 20814 USA
- Soumik Sarkar
  - Department of Mechanical Engineering, Iowa State University, Ames, IA 50011 USA
- Patrick S. Schnable
  - Department of Agronomy, Iowa State University, Ames, IA 50011 USA
  - Data2Bio LLC, 2079 Roy J Carver Co-Laboratory, 1111 WOI Rd, Ames, IA 50011 USA
  - Dryland Genetics LLC, 2073 Roy J Carver Co-Laboratory, 1111 WOI Rd, Ames, IA 50011 USA
|
14
|
Liu Q, Chang S, Hartman GL, Domier LL. Assembly and annotation of a draft genome sequence for Glycine latifolia, a perennial wild relative of soybean. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2018; 95:71-85. [PMID: 29671916 DOI: 10.1111/tpj.13931] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 11/28/2017] [Revised: 03/12/2018] [Accepted: 03/22/2018] [Indexed: 05/14/2023]
Abstract
Glycine latifolia (Benth.) Newell & Hymowitz (2n = 40), one of the 27 wild perennial relatives of soybean, possesses genetic diversity and agronomically favorable traits that are lacking in soybean. Here, we report the 939-Mb draft genome assembly of G. latifolia (PI 559298) using exclusively linked-reads sequenced from a single Chromium library. We organized scaffolds into 20 chromosome-scale pseudomolecules utilizing two genetic maps and the Glycine max (L.) Merr. genome sequence. High copy numbers of putative 91-bp centromere-specific tandem repeats were observed in consecutive blocks within predicted pericentromeric regions on several pseudomolecules. No 92-bp putative centromeric repeats, which are abundant in G. max, were detected in G. latifolia or Glycine tomentella. Annotation of the assembled genome and subsequent filtering yielded a high confidence gene set of 54 475 protein-coding loci. In comparative analysis with five legume species, genes related to defense responses were significantly overrepresented in Glycine-specific orthologous gene families. A total of 304 putative nucleotide-binding site (NBS)-leucine-rich-repeat (LRR) genes were identified in this genome assembly. Different from other legume species, we observed a scarcity of TIR-NBS-LRR genes in G. latifolia. The G. latifolia genome was also predicted to contain genes encoding 367 LRR-receptor-like kinases, a family of proteins involved in basal defense responses and responses to abiotic stress. The genome sequence and annotation of G. latifolia provides a valuable source of alternative alleles and novel genes to facilitate soybean improvement. This study also highlights the efficacy and cost-effectiveness of the application of Chromium linked-reads in diploid plant genome de novo assembly.
Affiliation(s)
- Qiong Liu
  - Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- Sungyul Chang
  - Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- Glen L Hartman
  - Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
  - USDA ARS, Urbana, IL, 61801, USA
- Leslie L Domier
  - Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
  - USDA ARS, Urbana, IL, 61801, USA
|
15
|
Yuan Y, Bayer PE, Batley J, Edwards D. Improvements in Genomic Technologies: Application to Crop Genomics. Trends Biotechnol 2017; 35:547-558. [DOI: 10.1016/j.tibtech.2017.02.009] [Citation(s) in RCA: 54] [Impact Index Per Article: 7.7] [Received: 11/16/2016] [Revised: 02/10/2017] [Accepted: 02/14/2017] [Indexed: 12/13/2022]
|
16
|
Abstract
The human reference genome is part of the foundation of modern human biology and a monumental scientific achievement. However, because it excludes a great deal of common human variation, it introduces a pervasive reference bias into the field of human genomics. To reduce this bias, it makes sense to draw on representative collections of human genomes, brought together into reference cohorts. There are a number of techniques to represent and organize data gleaned from these cohorts, many using ideas implicitly or explicitly borrowed from graph-based models. Here, we survey various projects underway to build and apply these graph-based structures-which we collectively refer to as genome graphs-and discuss the improvements in read mapping, variant calling, and haplotype determination that genome graphs are expected to produce.
Affiliation(s)
- Benedict Paten
  - Genomics Institute, CBSE, 501C Engineering 2, University of California Santa Cruz, Santa Cruz, California 95064, USA
- Adam M Novak
  - Genomics Institute, CBSE, 501C Engineering 2, University of California Santa Cruz, Santa Cruz, California 95064, USA
- Jordan M Eizenga
  - Genomics Institute, CBSE, 501C Engineering 2, University of California Santa Cruz, Santa Cruz, California 95064, USA
- Erik Garrison
  - Wellcome Trust Sanger Institute, Cambridge CB10 1SA, United Kingdom
|
17
|
Mason CE, Afshinnekoo E, Tighe S, Wu S, Levy S. International Standards for Genomes, Transcriptomes, and Metagenomes. J Biomol Tech 2017; 28:8-18. [PMID: 28337071 PMCID: PMC5359768 DOI: 10.7171/jbt.17-2801-006] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 12/22/2022]
Abstract
Challenges and biases in preparing, characterizing, and sequencing DNA and RNA can have significant impacts on research in genomics across all kingdoms of life, including experiments in single-cells, RNA profiling, and metagenomics (across multiple genomes). Technical artifacts and contamination can arise at each point of sample manipulation, extraction, sequencing, and analysis. Thus, the measurement and benchmarking of these potential sources of error are of paramount importance as next-generation sequencing (NGS) projects become more global and ubiquitous. Fortunately, a variety of methods, standards, and technologies have recently emerged that improve measurements in genomics and sequencing, from the initial input material to the computational pipelines that process and annotate the data. Here we review current standards and their applications in genomics, including whole genomes, transcriptomes, mixed genomic samples (metagenomes), and the modified bases within each (epigenomes and epitranscriptomes). These standards, tools, and metrics are critical for quantifying the accuracy of NGS methods, which will be essential for robust approaches in clinical genomics and precision medicine.
Affiliation(s)
- Christopher E. Mason
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York 10065, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, New York 10065, USA
  - Feil Family Brain & Mind Research Institute, Weill Cornell Medicine, New York, New York 10065, USA
- Ebrahim Afshinnekoo
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York 10065, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, New York 10065, USA
  - School of Medicine, New York Medical College, Valhalla, New York 10595, USA
- Scott Tighe
  - Advanced Genomics Lab, University of Vermont Cancer Center, Burlington, Vermont 05405, USA
- Shixiu Wu
  - Hangzhou Cancer Institute in Hangzhou Cancer Hospital, Hangzhou, China
- Shawn Levy
  - HudsonAlpha Institute of Technology, Huntsville, Alabama 35806, USA
|
18
|
Wang J, Song Y. Single cell sequencing: a distinct new field. Clin Transl Med 2017; 6:10. [PMID: 28220395 PMCID: PMC5318355 DOI: 10.1186/s40169-017-0139-4] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 12/28/2016] [Accepted: 02/11/2017] [Indexed: 12/12/2022] Open
Abstract
Single cell sequencing (SCS) has become a new approach to study biological heterogeneity. The advancement in technologies for single cell isolation, amplification of genome/transcriptome and next-generation sequencing enables SCS to reveal the inherent properties of a single cell from the large scale of the genome, transcriptome or epigenome at high resolution. Recently, SCS has been widely applied in various clinical and research fields, such as cancer biology and oncology, immunology, microbiology, neurobiology and prenatal diagnosis. In this review, we will discuss the development of SCS methods and focus on the latest clinical and research applications of SCS.
Affiliation(s)
- Jian Wang
  - Department of Pulmonary Medicine, Zhongshan Hospital, Fudan University, Shanghai, 200030, China
- Yuanlin Song
  - Department of Pulmonary Medicine, Zhongshan Hospital, Fudan University, Shanghai, 200030, China
|