1
Puginier C, Libourel C, Otte J, Skaloud P, Haon M, Grisel S, Petersen M, Berrin JG, Delaux PM, Dal Grande F, Keller J. Phylogenomics reveals the evolutionary origins of lichenization in chlorophyte algae. Nat Commun 2024; 15:4452. [PMID: 38789482 PMCID: PMC11126685 DOI: 10.1038/s41467-024-48787-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Accepted: 05/10/2024] [Indexed: 05/26/2024] Open
Abstract
Mutualistic symbioses have contributed to major transitions in the evolution of life. Here, we investigate the evolutionary history and the molecular innovations at the origin of lichens, which are a symbiosis established between fungi and green algae or cyanobacteria. We de novo sequence the genomes or transcriptomes of 12 lichen algal symbionts (LAS) and closely related non-symbiotic algae (NSA) to improve the genomic coverage of chlorophyte algae. We then perform ancestral state reconstruction and comparative phylogenomics. We identify at least three independent gains of the ability to engage in the lichen symbiosis, one in Trebouxiophyceae and two in Ulvophyceae, confirming the convergent evolution of the lichen symbioses. A carbohydrate-active enzyme from the glycoside hydrolase 8 (GH8) family was identified as a top candidate for the molecular mechanism underlying lichen symbiosis in Trebouxiophyceae. This GH8 was acquired in lichenizing Trebouxiophyceae by horizontal gene transfer, concomitantly with the ability to associate with lichen fungal symbionts (LFS), and is able to degrade polysaccharides found in the cell wall of LFS. These findings indicate that a combination of gene family expansion and horizontal gene transfer provided the basis for lichenization to evolve in chlorophyte algae.
Affiliation(s)
- Camille Puginier
- Laboratoire de Recherche en Sciences Végétales (LRSV), Université de Toulouse, CNRS, UPS, INP, Toulouse, 31320, Castanet-Tolosan, France
- Cyril Libourel
- Laboratoire de Recherche en Sciences Végétales (LRSV), Université de Toulouse, CNRS, UPS, INP, Toulouse, 31320, Castanet-Tolosan, France
- Juergen Otte
- Senckenberg Biodiversity and Climate Research Centre (SBiK-F), Senckenberganlage 25, 60325, Frankfurt am Main, Germany
- Pavel Skaloud
- Department of Botany, Faculty of Science, Charles University, Benátská 2, CZ-12800, Praha 2, Czech Republic
- Mireille Haon
- INRAE, Aix Marseille Université, UMR1163 Biodiversité et Biotechnologie Fongiques (BBF), 13009, Marseille, France
- INRAE, Aix Marseille Université, 3PE Platform, 13009, Marseille, France
- Sacha Grisel
- INRAE, Aix Marseille Université, UMR1163 Biodiversité et Biotechnologie Fongiques (BBF), 13009, Marseille, France
- INRAE, Aix Marseille Université, 3PE Platform, 13009, Marseille, France
- Malte Petersen
- High Performance Computing & Analytics Lab, University of Bonn, Friedrich-Hirzebruch-Allee 8, 53115, Bonn, Germany
- Jean-Guy Berrin
- INRAE, Aix Marseille Université, UMR1163 Biodiversité et Biotechnologie Fongiques (BBF), 13009, Marseille, France
- INRAE, Aix Marseille Université, 3PE Platform, 13009, Marseille, France
- Pierre-Marc Delaux
- Laboratoire de Recherche en Sciences Végétales (LRSV), Université de Toulouse, CNRS, UPS, INP, Toulouse, 31320, Castanet-Tolosan, France.
- Francesco Dal Grande
- Senckenberg Biodiversity and Climate Research Centre (SBiK-F), Senckenberganlage 25, 60325, Frankfurt am Main, Germany.
- LOEWE Centre for Translational Biodiversity Genomics (TBG), Senckenberganlage 25, 60325, Frankfurt am Main, Germany.
- Department of Biology, University of Padova, Padua, Italy.
- Jean Keller
- Laboratoire de Recherche en Sciences Végétales (LRSV), Université de Toulouse, CNRS, UPS, INP, Toulouse, 31320, Castanet-Tolosan, France.
- Department of Insect Symbiosis, Max Planck Institute for Chemical Ecology, 07745, Jena, Germany.
2
Su C, Chandradoss KR, Malachowski T, Boya R, Ryu HS, Brennand KJ, Phillips-Cremins JE. MASTR-seq: Multiplexed Analysis of Short Tandem Repeats with sequencing. bioRxiv 2024:2024.04.29.591790. [PMID: 38746155 PMCID: PMC11092654 DOI: 10.1101/2024.04.29.591790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
More than 60 human disorders have been linked to unstable expansion of short tandem repeat (STR) tracts. STR length and the extent of DNA methylation are linked to disease pathology and can be mosaic in a cell type-specific manner in several repeat expansion disorders. Mosaic phenomena have been difficult to study to date due to technical bias intrinsic to repeat sequences and the need for multi-modal measurements at single-allele resolution. Nanopore long-read sequencing accurately measures STR length and DNA methylation in the same single molecule but is cost prohibitive for studies assessing a target locus across multiple experimental conditions or patient samples. Here, we describe MASTR-seq, Multiplexed Analysis of Short Tandem Repeats, for cost-effective, high-throughput, accurate, multi-modal measurements of DNA methylation and STR genotype at single-allele resolution. MASTR-seq couples long-read sequencing, Cas9-mediated target enrichment, and PCR-free multiplexed barcoding to achieve a >10-fold increase in on-target read mapping for 8-12 pooled samples in a single MinION flow cell. We provide a detailed experimental protocol and computational tools and present evidence that MASTR-seq quantifies tract length and DNA methylation status for CGG and CAG STR loci in normal-length and mutation-length human cell lines. The MASTR-seq protocol takes approximately eight days for experiments and one additional day for data processing and analyses.
Key points
- We provide a protocol for MASTR-seq: Multiplexed Analysis of Short Tandem Repeats using Cas9-mediated target enrichment and PCR-free, multiplexed nanopore sequencing.
- MASTR-seq achieves a >10-fold increase in on-target read proportion for highly repetitive, technically inaccessible regions of the genome relevant for human health and disease.
- MASTR-seq allows for high-throughput, efficient, accurate, and cost-effective measurement of STR length and DNA methylation in the same single allele for up to 8-12 samples in parallel in one Nanopore MinION flow cell.
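To make the idea of an STR tract length concrete, here is a minimal Python sketch that counts the longest uninterrupted run of a repeat unit in a single read. It is not the MASTR-seq computational tooling; the function name `longest_repeat_tract` and the example read are hypothetical, and the sketch assumes an error-free read, which real nanopore data would not be.

```python
import re


def longest_repeat_tract(read: str, unit: str = "CGG") -> int:
    """Return the number of units in the longest uninterrupted run of `unit`.

    Hypothetical helper for illustration only; it ignores sequencing errors
    and interruptions inside the tract.
    """
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    # Length in bases of every uninterrupted run of the repeat unit.
    run_lengths = [m.end() - m.start() for m in pattern.finditer(read.upper())]
    return max(run_lengths, default=0) // len(unit)


# Made-up read: flanking sequence around five CGG units.
example_read = "AAT" + "CGG" * 5 + "TTA"
print(longest_repeat_tract(example_read, "CGG"))
```

A practical pipeline would more likely work from alignments to the flanking sequence and tolerate mismatches inside the tract; exact matching is used here only to keep the sketch short.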
3
Hook PW, Timp W. Beyond assembly: the increasing flexibility of single-molecule sequencing technology. Nat Rev Genet 2023; 24:627-641. [PMID: 37161088 PMCID: PMC10169143 DOI: 10.1038/s41576-023-00600-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Accepted: 03/30/2023] [Indexed: 05/11/2023]
Abstract
The maturation of high-throughput short-read sequencing technology over the past two decades has shaped the way genomes are studied. Recently, single-molecule, long-read sequencing has emerged as an essential tool in deciphering genome structure and function, including filling gaps in the human reference genome, measuring the epigenome and characterizing splicing variants in the transcriptome. With recent technological developments, these single-molecule technologies have moved beyond genome assembly and are being used in a variety of ways, including to selectively sequence specific loci with long reads, measure chromatin state and protein-DNA binding in order to investigate the dynamics of gene regulation, and rapidly determine copy number variation. These increasingly flexible uses of single-molecule technologies highlight a young and fast-moving part of the field that is leading to a more accessible era of nucleic acid sequencing.
Affiliation(s)
- Paul W Hook
- Department of Biomedical Engineering, Molecular Biology and Genetics, and Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
- Winston Timp
- Department of Biomedical Engineering, Molecular Biology and Genetics, and Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA.
4
Scheunert A, Lautenschlager U, Ott T, Oberprieler C. Nano-Strainer: A workflow for the identification of single-copy nuclear loci for plant systematic studies, using target capture kits and Oxford Nanopore long reads. Ecol Evol 2023; 13:e10190. [PMID: 37475726 PMCID: PMC10354226 DOI: 10.1002/ece3.10190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Revised: 05/18/2023] [Accepted: 06/01/2023] [Indexed: 07/22/2023] Open
Abstract
In modern plant systematics, target enrichment enables simultaneous analysis of hundreds of genes. However, when dealing with reticulate or polyploidization histories, few markers may suffice, but these are often required to be single-copy, a condition that is not necessarily met with commercial capture kits. Also, large genome sizes can render target capture ineffective, so that amplicon sequencing would be preferable; however, knowledge about suitable loci is often missing. Here, we present a comprehensive workflow for the identification of putative single-copy nuclear markers in a genus of interest, by mining a small dataset from target capture using a few representative taxa. The proposed pipeline assesses sequence variability contained in the data from targeted loci and assigns reads to their respective genes via a combined BLAST/clustering procedure. Cluster consensus sequences are then examined based on four pre-defined criteria presumably indicative of the absence of paralogy. This is done by calculating four specialized indices; loci are ranked according to their performance in these indices, and top-scoring loci are considered putatively single- or low-copy. The approach can be applied to any probe set. As it relies on long reads, the present contribution also provides template workflows for processing Nanopore-based target capture data. Obtained markers are further tested and then entered into amplicon sequencing. For the detection of possibly remaining paralogy in these data, which might occur in groups with rampant paralogy, we also employ the long-read assembly tool canu. In diploid representatives of the young Compositae genus Leucanthemum, characterized by high levels of polyploidy, our approach resulted in successful amplification of 13 loci. Modifications to remove traces of paralogy were made in seven of these. A species tree from the markers correctly reproduced the main relationships in the genus, albeit at low resolution. The presented workflow has the potential to valuably support phylogenetic research, for example in polyploid plant groups.
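The ranking step described above can be illustrated with a small Python sketch. The abstract does not name the four indices or say how they are combined, so everything here is an assumption: the index names (`idx1` to `idx4`), the scores, the convention that a higher value is better, and the use of a simple rank sum as the aggregate.

```python
from typing import Dict, List


def rank_loci(scores: Dict[str, Dict[str, float]]) -> List[str]:
    """Order loci from best to worst by summed rank across all indices.

    Assumes every locus has a value for every index and that higher is better.
    """
    indices = sorted({name for per_locus in scores.values() for name in per_locus})
    rank_sum = {locus: 0 for locus in scores}
    for index in indices:
        # Rank 1 goes to the locus with the highest value for this index.
        ordered = sorted(scores, key=lambda locus: scores[locus][index], reverse=True)
        for rank, locus in enumerate(ordered, start=1):
            rank_sum[locus] += rank
    # Lowest rank sum first; the locus name breaks ties deterministically.
    return sorted(scores, key=lambda locus: (rank_sum[locus], locus))


# Hypothetical index values for three made-up loci.
example_scores = {
    "locusA": {"idx1": 0.9, "idx2": 0.8, "idx3": 0.7, "idx4": 0.9},
    "locusB": {"idx1": 0.5, "idx2": 0.6, "idx3": 0.9, "idx4": 0.4},
    "locusC": {"idx1": 0.2, "idx2": 0.1, "idx3": 0.3, "idx4": 0.2},
}
print(rank_loci(example_scores))
```

A rank sum is only one of several reasonable aggregation choices; the published workflow may weight or threshold its indices differently.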
Affiliation(s)
- Agnes Scheunert
- Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Regensburg, Germany
- Ulrich Lautenschlager
- Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Regensburg, Germany
- Tankred Ott
- Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Regensburg, Germany
- Christoph Oberprieler
- Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Regensburg, Germany
5
Pezzini FF, Ferrari G, Forrest LL, Hart ML, Nishii K, Kidner CA. Target capture and genome skimming for plant diversity studies. Applications in Plant Sciences 2023; 11:e11537. [PMID: 37601316 PMCID: PMC10439825 DOI: 10.1002/aps3.11537] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/17/2022] [Revised: 06/16/2023] [Accepted: 07/10/2023] [Indexed: 08/22/2023]
Abstract
Recent technological advances in long-read high-throughput sequencing and assembly methods have facilitated the generation of annotated chromosome-scale whole-genome sequence data for evolutionary studies; however, generating such data can still be difficult for many plant species. For example, obtaining high-molecular-weight DNA is typically impossible for samples in historical herbarium collections, which often have degraded DNA. The need to fast-freeze newly collected living samples to conserve high-quality DNA can be complicated when plants are only found in remote areas. Therefore, short-read reduced-genome representations, such as target capture and genome skimming, remain important for evolutionary studies. Here, we review the pros and cons of each technique for non-model plant taxa. We provide guidance related to logistics, budget, the genomic resources previously available for the target clade, and the nature of the study. Furthermore, we assess the available bioinformatic analyses, detailing best practices and pitfalls, and suggest pathways to combine newly generated data with legacy data. Finally, we explore the possible downstream analyses allowed by the type of data generated using each technique. We provide a practical guide to help researchers make the best-informed choice regarding reduced genome representation for evolutionary studies of non-model plants in cases where whole-genome sequencing remains impractical.
Affiliation(s)
- Giada Ferrari
- Royal Botanic Garden Edinburgh, Edinburgh, United Kingdom
- Kanae Nishii
- Royal Botanic Garden Edinburgh, Edinburgh, United Kingdom
- Catherine A Kidner
- Royal Botanic Garden Edinburgh, Edinburgh, United Kingdom
- School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
6
CRISPR/Cas9-Mediated Enrichment Coupled to Nanopore Sequencing Provides a Valuable Tool for the Precise Reconstruction of Large Genomic Target Regions. Int J Mol Sci 2023; 24:ijms24021076. [PMID: 36674592 PMCID: PMC9863143 DOI: 10.3390/ijms24021076] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/30/2022] [Revised: 12/23/2022] [Accepted: 12/24/2022] [Indexed: 01/09/2023] Open
Abstract
Complete and accurate identification of genetic variants associated with specific phenotypes can be challenging when there is a high level of genomic divergence between individuals in a study and the corresponding reference genome. We have applied the Cas9-mediated enrichment coupled to nanopore sequencing to perform a targeted de novo assembly and accurately reconstruct a genomic region of interest. This approach was used to reconstruct a 250-kbp target region on chromosome 5 of the common bean genome (Phaseolus vulgaris) associated with the shattering phenotype. Comparing a non-shattering cultivar (Midas) with the reference genome revealed many single-nucleotide variants and structural variants in this region. We cut five 50-kbp tiled sub-regions of Midas genomic DNA using Cas9, followed by sequencing on a MinION device and de novo assembly, generating a single contig spanning the whole 250-kbp region. This assembly increased the number of Illumina reads mapping to genes in the region, improving their genotypability for downstream analysis. The Cas9 tiling approach for target enrichment and sequencing is a valuable alternative to whole-genome sequencing for the assembly of ultra-long regions of interest, improving the accuracy of downstream genotype-phenotype association analysis.
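The tiling layout mentioned above (a 250-kbp region covered by five 50-kbp sub-regions) can be sketched in a few lines of Python. The coordinates are hypothetical and start at 0 for simplicity; the function name `tile_region` is illustrative, and in practice tile boundaries would be set by where usable Cas9 guide sites occur rather than by fixed steps.

```python
from typing import List, Tuple


def tile_region(start: int, end: int, tile_size: int) -> List[Tuple[int, int]]:
    """Split the half-open interval [start, end) into consecutive tiles."""
    tiles = []
    pos = start
    while pos < end:
        # The last tile is clipped so it never extends past the region end.
        tiles.append((pos, min(pos + tile_size, end)))
        pos += tile_size
    return tiles


# Hypothetical region of 250 kbp split into 50-kbp tiles.
tiles = tile_region(0, 250_000, 50_000)
for tile_start, tile_end in tiles:
    print(f"{tile_start:>7} - {tile_end:>7}")
```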
7
Steiert TA, Fuß J, Juzenas S, Wittig M, Hoeppner M, Vollstedt M, Varkalaite G, ElAbd H, Brockmann C, Görg S, Gassner C, Forster M, Franke A. High-throughput method for the hybridisation-based targeted enrichment of long genomic fragments for PacBio third-generation sequencing. NAR Genom Bioinform 2022; 4:lqac051. [PMID: 35855323 PMCID: PMC9278042 DOI: 10.1093/nargab/lqac051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Revised: 06/08/2022] [Accepted: 06/29/2022] [Indexed: 11/16/2022] Open
Abstract
Hybridisation-based targeted enrichment is a widely used and well-established technique in high-throughput second-generation short-read sequencing. Despite the high potential to genetically resolve highly repetitive and variable genomic sequences by, for example, PacBio third-generation sequencing, targeted enrichment for long fragments has not yet reached the same throughput because of complex existing workflows and technological dependencies. We here describe a scalable targeted enrichment protocol for fragment sizes of >7 kb. For demonstration purposes we developed a custom blood group panel of challenging loci. Test results achieved a >65% on-target rate, good coverage (142.7×) and sufficient coverage evenness for both non-paralogous and paralogous targets, and sufficient non-duplicate read counts (83.5%) per sample for a highly multiplexed enrichment pool of 16 samples. We genotyped the blood groups of nine patients employing highly accurate phased assemblies at an allelic resolution that match reference blood group allele calls determined by SNP array and NGS genotyping. Seven Genome-in-a-Bottle reference samples achieved high recall (96%) and precision (99%) rates. Mendelian error rates were 0.04% and 0.13% for the included Ashkenazim and Han Chinese trios, respectively. In summary, we provide a protocol and first example for accurate targeted long-read sequencing that can be used in a high-throughput fashion.
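For readers less familiar with the recall and precision figures quoted above, the following Python sketch shows the standard definitions. The variant counts are invented for illustration and are not taken from the study; they were chosen only so that the results land near the reported 96% and 99%.

```python
def recall(true_pos: int, false_neg: int) -> float:
    """Fraction of true variants that were called: TP / (TP + FN)."""
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of calls that are correct: TP / (TP + FP)."""
    total = true_pos + false_pos
    return true_pos / total if total else 0.0


# Hypothetical counts, not from the paper.
tp, fp, fn = 96, 1, 4
print(f"recall    = {recall(tp, fn):.2%}")
print(f"precision = {precision(tp, fp):.2%}")
```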
Affiliation(s)
- Tim Alexander Steiert
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel 24105, Germany
- Janina Fuß
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel 24105, Germany
- Simonas Juzenas
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel 24105, Germany
- Institute of Biotechnology, Life Science Centre, Vilnius University, Vilnius 02241, Lithuania
- Michael Wittig
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel 24105, Germany
- Marc Patrick Hoeppner
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel 24105, Germany
- Melanie Vollstedt
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel 24105, Germany
- Greta Varkalaite
- Institute for Digestive Research, Lithuanian University of Health Sciences, Kaunas 44307, Lithuania
- Hesham ElAbd
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel 24105, Germany
- Christian Brockmann
- Institute of Transfusion Medicine, University Hospital of Schleswig-Holstein, Kiel 24105, Germany
- Siegfried Görg
- Institute of Transfusion Medicine, University Hospital of Schleswig-Holstein, Kiel 24105, Germany
- Christoph Gassner
- Institute of Translational Medicine, Private University in the Principality of Liechtenstein, Triesen 9495, Liechtenstein
- Michael Forster
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel 24105, Germany
- Andre Franke
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel 24105, Germany
8
Buitrago Acosta MC, Montúfar R, Guyot R, Mariac C, Tranbarger TJ, Restrepo S, Couvreur TLP. Bactris gasipaes Kunth var. gasipaes complete plastome and phylogenetic analysis. Mitochondrial DNA B Resour 2022; 7:1540-1544. [PMID: 36046105 PMCID: PMC9423826 DOI: 10.1080/23802359.2022.2109437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Abstract
Bactris gasipaes var. gasipaes (Arecaceae, Palmae) is an economically and socially important plant species for populations across tropical South and Central America. It has been domesticated from its wild variety, B. gasipaes var. chichagui, since pre-Columbian times. In this study, we sequenced the plastome of the cultivated variety, B. gasipaes Kunth var. gasipaes, and compared it with the published plastome of the wild variety. The chloroplast sequence obtained was 156,580 bp long. The cultivated chloroplast sequence was conserved compared to the wild-type sequence, with 99.8% nucleotide identity. We did, however, identify multiple single-nucleotide variants (SNVs), insertions, microsatellites and a resolved region of missing nucleotides. An SNV in one of the core barcode markers (matK) was detected between the wild and cultivated accessions. Phylogenetic analysis was carried out across the Arecaceae family and compared to previous reports, resulting in an identical topology. This study is a step forward in understanding the genome evolution of this species.
Affiliation(s)
- Rommel Montúfar
- Facultad de Ciencias Exactas y Naturales, Pontificia Universidad Católica del Ecuador, Quito, Ecuador
- Romain Guyot
- DIADE, Univ Montpellier, CIRAD, IRD, Montpellier, France
- Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Colombia
- Cedric Mariac
- DIADE, Univ Montpellier, CIRAD, IRD, Montpellier, France
- Silvia Restrepo
- Laboratorio de Micología y Fitopatología, Universidad de los Andes, Bogotá, Colombia
- Thomas L. P. Couvreur
- Facultad de Ciencias Exactas y Naturales, Pontificia Universidad Católica del Ecuador, Quito, Ecuador
- DIADE, Univ Montpellier, CIRAD, IRD, Montpellier, France
9
Wang L, Jia M, Li Z, Liu X, Sun T, Pei J, Wei C, Lin Z, Li H. Wristwatch PCR: A Versatile and Efficient Genome Walking Strategy. Front Bioeng Biotechnol 2022; 10:792848. [PMID: 35497369 PMCID: PMC9039356 DOI: 10.3389/fbioe.2022.792848] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/11/2021] [Accepted: 03/08/2022] [Indexed: 11/16/2022] Open
Abstract
Genome walking is a method used to retrieve unknown flanking DNA. Here, we report wristwatch (WW) PCR, an efficient genome walking technique mediated by WW primers (WWPs). WWPs feature 5′- and 3′-overlaps and a heterologous interval, so a wristwatch-like structure can form between WWPs at relatively low temperatures. Each WW-PCR set is composed of three nested (primary, secondary, and tertiary) PCRs, each performed with one of three WWPs. The WWP anneals arbitrarily somewhere on the genome in the single low-stringency cycle of the primary PCR, or directionally to the previous WWP site in the single reduced-stringency cycle of the secondary/tertiary PCR, producing a pool of single-stranded DNAs (ssDNAs). A target ssDNA incorporates a sequence complementary to the gene-specific primer (GSP) at the 3′-end and the WWP at the 5′-end and thus can be exponentially amplified in the subsequent high-stringency cycles. A non-target ssDNA, in contrast, cannot be amplified because it lacks a perfect binding site for any primer. The practicability of WW-PCR was validated by successfully accessing unknown regions flanking the Lactobacillus brevis CD0817 glutamate decarboxylase gene and the hygromycin gene of rice. WW-PCR is an attractive alternative to the existing genome walking techniques.
Affiliation(s)
- Lingqin Wang
- State Key Laboratory of Food Science and Technology, Nanchang University, Nanchang, China
- Sino-German Joint Research Institute, Nanchang University, Nanchang, China
- Mengya Jia
- State Key Laboratory of Food Science and Technology, Nanchang University, Nanchang, China
- Sino-German Joint Research Institute, Nanchang University, Nanchang, China
- Zhaoqin Li
- Charles W. Davidson College of Engineering, San Jose State University, San Jose, CA, United States
- Xiaohua Liu
- State Key Laboratory of Food Science and Technology, Nanchang University, Nanchang, China
- Sino-German Joint Research Institute, Nanchang University, Nanchang, China
- Tianyi Sun
- State Key Laboratory of Food Science and Technology, Nanchang University, Nanchang, China
- Sino-German Joint Research Institute, Nanchang University, Nanchang, China
- Key Laboratory of Poyang Lake Environment and Resource Utilization, Ministry of Education, School of Environmental and Chemical Engineering, Nanchang University, Nanchang, China
- Jinfeng Pei
- State Key Laboratory of Food Science and Technology, Nanchang University, Nanchang, China
- Sino-German Joint Research Institute, Nanchang University, Nanchang, China
- Cheng Wei
- State Key Laboratory of Food Science and Technology, Nanchang University, Nanchang, China
- Sino-German Joint Research Institute, Nanchang University, Nanchang, China
- Zhiyu Lin
- State Key Laboratory of Food Science and Technology, Nanchang University, Nanchang, China
- Sino-German Joint Research Institute, Nanchang University, Nanchang, China
- Key Laboratory of Poyang Lake Environment and Resource Utilization, Ministry of Education, School of Environmental and Chemical Engineering, Nanchang University, Nanchang, China
- Haixing Li
- State Key Laboratory of Food Science and Technology, Nanchang University, Nanchang, China
- Sino-German Joint Research Institute, Nanchang University, Nanchang, China
- *Correspondence: Haixing Li,
10
Hale JM. Engaging the next generation of plant geneticists through sustained research: an overview of a post-16 project. Heredity (Edinb) 2020; 125:431-436. [PMID: 32943768 PMCID: PMC7495401 DOI: 10.1038/s41437-020-00370-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/19/2020] [Revised: 09/08/2020] [Accepted: 09/08/2020] [Indexed: 11/21/2022] Open
Abstract
Student career aspirations are directly linked to the careers that they are exposed to and the esteem that they are given in society. Where schools are located in areas with low visibility of scientific careers this will have an impact on student aspirations. This project is demonstrating that aspirations can be altered by engaging 16-18-year-old A level biologists in sustained research. A total of 20 students from schools across Jersey are attempting to sequence the chloroplast genomes from daffodils that they have collected from non-cultivated locations using Oxford Nanopore Technologies' MinION. Despite site closures due to COVID-19 control measures, the project has developed insight into different scientific careers through experience and ownership of the entire project pipeline. This project demonstrates an opportunity for schools and academics to collaborate to further science and potentially improve student outcomes.
Affiliation(s)
- Jon Michael Hale
- Beaulieu Convent School, Wellington Road, St. Helier, JE2 4RJ, Jersey.
11
Rodriguez OL, Gibson WS, Parks T, Emery M, Powell J, Strahl M, Deikus G, Auckland K, Eichler EE, Marasco WA, Sebra R, Sharp AJ, Smith ML, Bashir A, Watson CT. A Novel Framework for Characterizing Genomic Haplotype Diversity in the Human Immunoglobulin Heavy Chain Locus. Front Immunol 2020; 11:2136. [PMID: 33072076 PMCID: PMC7539625 DOI: 10.3389/fimmu.2020.02136] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 06/10/2020] [Accepted: 08/06/2020] [Indexed: 02/06/2023] Open
Abstract
An incomplete ascertainment of genetic variation within the highly polymorphic immunoglobulin heavy chain locus (IGH) has hindered our ability to define genetic factors that influence antibody-mediated processes. Due to locus complexity, standard high-throughput approaches have failed to accurately and comprehensively capture IGH polymorphism. As a result, the locus has only been fully characterized two times, severely limiting our knowledge of human IGH diversity. Here, we combine targeted long-read sequencing with a novel bioinformatics tool, IGenotyper, to fully characterize IGH variation in a haplotype-specific manner. We apply this approach to eight human samples, including a haploid cell line and two mother-father-child trios, and demonstrate the ability to generate high-quality assemblies (>98% complete and >99% accurate), genotypes, and gene annotations, identifying 2 novel structural variants and 15 novel IGH alleles. We show multiplexing allows for scaling of the approach without impacting data quality, and that our genotype call sets are more accurate than short-read (>35% increase in true positives and >97% decrease in false-positives) and array/imputation-based datasets. This framework establishes a desperately needed foundation for leveraging IG genomic data to study population-level variation in antibody-mediated immunity, critical for bettering our understanding of disease risk, and responses to vaccines and therapeutics.
Affiliation(s)
- Oscar L Rodriguez
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- William S Gibson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
- Tom Parks
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
- Matthew Emery
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- James Powell
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Maya Strahl
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Gintaras Deikus
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Kathryn Auckland
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, United States; Howard Hughes Medical Institute, University of Washington, Seattle, WA, United States
- Wayne A Marasco
- Department of Cancer Immunology and AIDS, Dana-Farber Cancer Institute, Department of Medicine, Harvard Medical School, Boston, MA, United States
- Robert Sebra
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States; Icahn Institute of Data Science and Genomic Technology, New York, NY, United States
- Andrew J Sharp
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Melissa L Smith
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States; Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States; Icahn Institute of Data Science and Genomic Technology, New York, NY, United States
- Ali Bashir
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Corey T Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
| |
|
12
|
López-Girona E, Davy MW, Albert NW, Hilario E, Smart MEM, Kirk C, Thomson SJ, Chagné D. CRISPR-Cas9 enrichment and long read sequencing for fine mapping in plants. Plant Methods 2020; 16:121. [PMID: 32884578 PMCID: PMC7465313 DOI: 10.1186/s13007-020-00661-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 06/06/2020] [Accepted: 08/18/2020] [Indexed: 05/03/2023]
Abstract
BACKGROUND Genomic methods for identifying causative variants for trait loci applicable to a wide range of germplasm are required for plant biologists and breeders to understand the genetic control of trait variation. RESULTS We implemented Cas9-targeted sequencing for fine-mapping in apple, a method combining CRISPR-Cas9 targeted cleavage of a region of interest, followed by enrichment and long-read sequencing using the Oxford Nanopore Technology (ONT). We demonstrated the capability of this methodology to specifically cleave and enrich a plant genomic locus spanning 8 kb. The repeated mini-satellite motif located upstream of the Malus × domestica (apple) MYB10 transcription factor gene, causing red fruit colouration when present in a heterozygous state, was our exemplar to demonstrate the efficiency of this method: it contains a genomic region with a long structural variant normally ignored by short-read sequencing technologies. Cleavage specificity of the guide RNAs was demonstrated using polymerase chain reaction products, before using them to specify cleavage of high molecular weight apple DNA. An enriched library was subsequently prepared and sequenced using an ONT MinION flow cell (R.9.4.1). Of the 7,056 ONT reads base-called using both Albacore2 (v2.3.4) and Guppy (v3.2.4), with a median length of 9.78 and 9.89 kb, respectively, 85.35 and 91.38%, aligned to the reference apple genome. Of the aligned reads, 2.98 and 3.04% were on-target with read depths of 180 × and 196 × for Albacore2 and Guppy, respectively, and only five genomic loci were off-target with read depth greater than 25 × , which demonstrated the efficiency of the enrichment method and specificity of the CRISPR-Cas9 cleavage. CONCLUSIONS We demonstrated that this method can isolate and resolve single-nucleotide and structural variants at the haplotype level in plant genomic regions.
The combination of CRISPR-Cas9 target enrichment and ONT sequencing provides a more efficient technology for fine-mapping loci than genome-walking approaches.
Affiliation(s)
- Elena López-Girona
- The New Zealand Institute for Plant and Food Research Limited (Plant & Food Research), Private Bag 11600, Palmerston North, 4442 New Zealand
| | | | - Nick W. Albert
- The New Zealand Institute for Plant and Food Research Limited (Plant & Food Research), Private Bag 11600, Palmerston North, 4442 New Zealand
| | | | - Maia E. M. Smart
- The New Zealand Institute for Plant and Food Research Limited (Plant & Food Research), Private Bag 11600, Palmerston North, 4442 New Zealand
| | - Chris Kirk
- The New Zealand Institute for Plant and Food Research Limited (Plant & Food Research), Private Bag 11600, Palmerston North, 4442 New Zealand
| | | | - David Chagné
- The New Zealand Institute for Plant and Food Research Limited (Plant & Food Research), Private Bag 11600, Palmerston North, 4442 New Zealand
| |
|
13
|
Barrett CF. Plastid genomes of the North American Rhus integrifolia-ovata complex and phylogenomic implications of inverted repeat structural evolution in Rhus L. PeerJ 2020; 8:e9315. [PMID: 32587799 PMCID: PMC7304433 DOI: 10.7717/peerj.9315] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/10/2020] [Accepted: 05/17/2020] [Indexed: 12/12/2022] Open
Abstract
Plastid genomes (plastomes) represent rich sources of information for phylogenomics, from higher-level studies to below the species level. The genus Rhus (sumac) has received a significant amount of study from phylogenetic and biogeographic perspectives, but genomic studies in this genus are lacking. Rhus integrifolia and R. ovata are two shrubby species of high ecological importance in the southwestern USA and Mexico, where they occupy coastal scrub and chaparral habitats. They hybridize frequently, representing a fascinating system in which to investigate the opposing effects of hybridization and divergent selection, yet are poorly characterized from a genomic perspective. In this study, complete plastid genomes were sequenced for one accession of R. integrifolia and one each of R. ovata from California and Arizona. Sequence variation among these three accessions was characterized, and PCR primers potentially useful in phylogeographic studies were designed. Phylogenomic analyses were conducted based on a robustly supported phylogenetic framework based on 52 complete plastomes across the order Sapindales. Repeat content, rather than the size of the inverted repeat, had a stronger relative association with total plastome length across Sapindales when analyzed with phylogenetic least squares regression. Variation at the inverted repeat boundary within Rhus was striking, resulting in major shifts and independent gene losses. Specifically, rps19 was lost independently in the R. integrifolia-ovata complex and in R. chinensis, with a further loss of rps22 and a major contraction of the inverted repeat in two accessions of the latter. Rhus represents a promising novel system to study plastome structural variation of photosynthetic angiosperms at and below the species level.
Affiliation(s)
- Craig F. Barrett
- Department of Biology, West Virginia University, Morgantown, WV, USA
| |
|
14
|
Hale H, Gardner EM, Viruel J, Pokorny L, Johnson MG. Strategies for reducing per-sample costs in target capture sequencing for phylogenomics and population genomics in plants. Applications in Plant Sciences 2020; 8:e11337. [PMID: 32351798 PMCID: PMC7186906 DOI: 10.1002/aps3.11337] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 10/08/2019] [Accepted: 12/20/2019] [Indexed: 05/19/2023]
Abstract
The reduced cost of high-throughput sequencing and the development of gene sets with wide phylogenetic applicability has led to the rise of sequence capture methods as a plausible platform for both phylogenomics and population genomics in plants. An important consideration in large targeted sequencing projects is the per-sample cost, which can be inflated when using off-the-shelf kits or reagents not purchased in bulk. Here, we discuss methods to reduce per-sample costs in high-throughput targeted sequencing projects. We review the minimal equipment and consumable requirements for targeted sequencing while comparing several alternatives to reduce bulk costs in DNA extraction, library preparation, target enrichment, and sequencing. We consider how each of the workflow alterations may be affected by DNA quality (e.g., fresh vs. herbarium tissue), genome size, and the phylogenetic scale of the project. We provide a cost calculator for researchers considering targeted sequencing to use when designing projects, and identify challenges for future development of low-cost sequencing in non-model plant systems.
Affiliation(s)
- Haley Hale
- Department of Biological Sciences, Texas Tech University, Lubbock, Texas 79409, USA
| | - Elliot M. Gardner
- The Morton Arboretum, Lisle, Illinois 60532, USA
- Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Singapore Botanic Gardens, National Parks Board, 1 Cluny Road, 259569 Singapore
| | - Juan Viruel
- Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3DS, United Kingdom
| | - Lisa Pokorny
- Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3DS, United Kingdom
- Present address: Centre for Plant Biotechnology and Genomics (CBGP) UPM-INIA, 28223 Pozuelo de Alarcón (Madrid), Spain
| | - Matthew G. Johnson
- Department of Biological Sciences, Texas Tech University, Lubbock, Texas 79409, USA
| |
|
15
|
Scheunert A, Dorfner M, Lingl T, Oberprieler C. Can we use it? On the utility of de novo and reference-based assembly of Nanopore data for plant plastome sequencing. PLoS One 2020; 15:e0226234. [PMID: 32208422 PMCID: PMC7092973 DOI: 10.1371/journal.pone.0226234] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 11/20/2019] [Accepted: 02/28/2020] [Indexed: 12/13/2022] Open
Abstract
The chloroplast genome harbors plenty of valuable information for phylogenetic research. Illumina short-read data is generally used for de novo assembly of whole plastomes. PacBio or Oxford Nanopore long reads are additionally employed in hybrid approaches to enable assembly across the highly similar inverted repeats of a chloroplast genome. Unlike for PacBio, plastome assemblies based solely on Nanopore reads are rarely found, due to their high error rate and non-random error profile. However, the actual quality decline connected to their use has rarely been quantified. Furthermore, no study has employed reference-based assembly using Nanopore reads, which is common with Illumina data. Using Leucanthemum Mill. as an example, we compared the sequence quality of seven chloroplast genome assemblies of the same species, using combinations of two sequencing platforms and three analysis pipelines. In addition, we assessed the factors which might influence Nanopore assembly quality during sequence generation and bioinformatic processing. The consensus sequence derived from de novo assembly of Nanopore data had a sequence identity of 99.59% compared to Illumina short-read de novo assembly. Most of the errors detected were indels (81.5%), and a large majority of them is part of homopolymer regions. The quality of reference-based assembly is heavily dependent upon the choice of a close-enough reference. When using a reference with 0.83% sequence divergence from the studied species, mapping of Nanopore reads results in a consensus comparable to that from Nanopore de novo assembly, and of only slightly inferior quality compared to a reference-based assembly with Illumina data. For optimal de novo assembly of Nanopore data, appropriate filtering of contaminants and chimeric sequences, as well as employing moderate read coverage, is essential. 
Based on these results, we conclude that Nanopore long reads are a suitable alternative to Illumina short reads in plastome phylogenomics. Few errors remain in the finalized assembly, which can be easily masked in phylogenetic analyses without loss in analytical accuracy. The easily applicable and cost-effective technology might warrant more attention by researchers dealing with plant chloroplast genomes.
Affiliation(s)
- Agnes Scheunert
- Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Regensburg, Germany
| | - Marco Dorfner
- Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Regensburg, Germany
| | - Thomas Lingl
- Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Regensburg, Germany
| | - Christoph Oberprieler
- Evolutionary and Systematic Botany Group, Institute of Plant Sciences, University of Regensburg, Regensburg, Germany
| |
|
16
|
Andermann T, Torres Jiménez MF, Matos-Maraví P, Batista R, Blanco-Pastor JL, Gustafsson ALS, Kistler L, Liberal IM, Oxelman B, Bacon CD, Antonelli A. A Guide to Carrying Out a Phylogenomic Target Sequence Capture Project. Front Genet 2020; 10:1407. [PMID: 32153629 PMCID: PMC7047930 DOI: 10.3389/fgene.2019.01407] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 10/02/2019] [Accepted: 12/24/2019] [Indexed: 12/17/2022] Open
Abstract
High-throughput DNA sequencing techniques enable time- and cost-effective sequencing of large portions of the genome. Instead of sequencing and annotating whole genomes, many phylogenetic studies focus sequencing effort on large sets of pre-selected loci, which further reduces costs and bioinformatic challenges while increasing coverage. One common approach that enriches loci before sequencing is often referred to as target sequence capture. This technique has been shown to be applicable to phylogenetic studies of greatly varying evolutionary depth. Moreover, it has proven to produce powerful, large multi-locus DNA sequence datasets suitable for phylogenetic analyses. However, target capture requires careful considerations, which may greatly affect the success of experiments. Here we provide a simple flowchart for designing phylogenomic target capture experiments. We discuss necessary decisions from the identification of target loci to the final bioinformatic processing of sequence data. We outline challenges and solutions related to the taxonomic scope, sample quality, and available genomic resources of target capture projects. We hope this review will serve as a useful roadmap for designing and carrying out successful phylogenetic target capture studies.
Affiliation(s)
- Tobias Andermann
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
| | - Maria Fernanda Torres Jiménez
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
| | - Pável Matos-Maraví
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
- Institute of Entomology, Biology Centre of the Czech Academy of Sciences, České Budějovice, Czechia
| | - Romina Batista
- Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
- Programa de Pós-Graduação em Genética, Conservação e Biologia Evolutiva, PPG GCBEv–Instituto Nacional de Pesquisas da Amazônia—INPA Campus II, Manaus, Brazil
- Coordenação de Zoologia, Museu Paraense Emílio Goeldi, Belém, Brazil
| | - José L. Blanco-Pastor
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- INRAE, Centre Nouvelle-Aquitaine-Poitiers, Lusignan, France
| | | | - Logan Kistler
- Department of Anthropology, National Museum of Natural History, Smithsonian Institution, Washington, DC, United States
| | - Isabel M. Liberal
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
| | - Bengt Oxelman
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
| | - Christine D. Bacon
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
| | - Alexandre Antonelli
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
- Royal Botanic Gardens, Kew, Richmond-Surrey, United Kingdom
| |
|