1
Helal AA, Saad BT, Saad MT, Mosaad GS, Aboshanab KM. Benchmarking long-read aligners and SV callers for structural variation detection in Oxford nanopore sequencing data. Sci Rep 2024; 14:6160. [PMID: 38486064 PMCID: PMC10940726 DOI: 10.1038/s41598-024-56604-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 03/08/2024] [Indexed: 03/18/2024] Open
Abstract
Structural variants (SVs) are one of the significant types of DNA mutations and are typically defined as genomic alterations larger than 50 bp, including insertions, deletions, duplications, inversions, and translocations. These modifications can profoundly impact phenotypic characteristics and contribute to disorders like cancer, response to treatment, and infections. Four long-read aligners and five SV callers were evaluated using three Oxford Nanopore NGS human genome datasets in terms of precision, recall, and F1-score, depth of coverage, and speed of analysis. The best SV caller regarding recall, precision, and F1-score when paired with different aligners at different coverage levels tends to vary depending on the dataset and the specific SV types being analyzed. However, based on our findings, Sniffles and CuteSV tend to perform well across different aligners and coverage levels, followed by SVIM and PBSV, with SVDSS in last place. The CuteSV caller has the highest average F1-score (82.51%) and recall (78.50%), and Sniffles has the highest average precision (94.33%). Minimap2 as an aligner and Sniffles as an SV caller form a strong base for an SV calling pipeline because of their high speed and reasonable performance. PBSV has a lower average F1-score, precision, and recall and may generate more false positives and overlook some actual SVs. Our results provide insight into the performance of several popular long-read aligners and SV callers and serve as a reference for researchers in selecting the most suitable tools for SV detection.
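The precision, recall, and F1-score named in this abstract follow their standard definitions. As a minimal sketch (the function names and the example counts are illustrative assumptions, not taken from the paper), the three metrics can be derived from the number of true-positive, false-positive, and false-negative SV calls:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from call counts.

    tp: calls matching a truth-set SV
    fp: calls with no match in the truth set
    fn: truth-set SVs that were not called
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_score(precision, recall)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p, r, f = metrics_from_counts(tp=80, fp=20, fn=20)
    print(f"precision={p:.2%} recall={r:.2%} F1={f:.2%}")
```

Note that an average F1-score over several runs is generally not the same as the F1-score computed from average precision and average recall, so the averages quoted in the abstract should not be recombined with this formula.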
Affiliation(s)
- Asmaa A Helal
- Department of Bioinformatics, HITS Solutions Co., Cairo, 11765, Egypt
| | - Bishoy T Saad
- Department of Bioinformatics, HITS Solutions Co., Cairo, 11765, Egypt.
| | - Mina T Saad
- Department of Bioinformatics, HITS Solutions Co., Cairo, 11765, Egypt
| | - Gamal S Mosaad
- Department of Bioinformatics, HITS Solutions Co., Cairo, 11765, Egypt
| | - Khaled M Aboshanab
- Department of Microbiology and Immunology, Faculty of Pharmacy, Ain Shams University, Organization of African Unity St., Abassi, Cairo, 11566, Egypt.
2
Wu Z, Miedzinska K, Krause JS, Pérez JH, Wingfield JC, Meddle SL, Smith J. A chromosome-level genome assembly of a free-living white-crowned sparrow (Zonotrichia leucophrys gambelii). Sci Data 2024; 11:86. [PMID: 38238322 PMCID: PMC10796373 DOI: 10.1038/s41597-024-02929-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 01/03/2024] [Indexed: 01/22/2024] Open
Abstract
The white-crowned sparrow, Zonotrichia leucophrys, is a passerine bird with a wide distribution and it is extensively adapted to environmental changes. It has historically acted as a model species in studies on avian ecology, physiology and behaviour. Here, we present a high-quality chromosome-level genome of Zonotrichia leucophrys using PacBio and OmniC sequencing data. Gene models were constructed by combining RNA-seq and Iso-seq data from liver, hypothalamus, and ovary. In total a 1,123,996,003 bp genome was generated, including 31 chromosomes assembled in complete scaffolds along with other, unplaced scaffolds. This high-quality genome assembly offers an important genomic resource for the research community using the white-crowned sparrow as a model for understanding avian genome biology and development, and provides a genomic basis for future studies, both fundamental and applied.
Affiliation(s)
- Zhou Wu
- The Roslin Institute and Royal (Dick) School of Veterinary Studies R(D)SVS, The University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK.
| | - Katarzyna Miedzinska
- The Roslin Institute and Royal (Dick) School of Veterinary Studies R(D)SVS, The University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
| | - Jesse S Krause
- Department of Neurobiology, Physiology, and Behavior, University of California, Davis, CA, 95616, USA
- Department of Biology, University of Nevada Reno, Reno, NV, 89557, USA
| | - Jonathan H Pérez
- Department of Biology, University of South Alabama, Mobile, AL, 36688, USA
| | - John C Wingfield
- Department of Neurobiology, Physiology, and Behavior, University of California, Davis, CA, 95616, USA
| | - Simone L Meddle
- The Roslin Institute and Royal (Dick) School of Veterinary Studies R(D)SVS, The University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
| | - Jacqueline Smith
- The Roslin Institute and Royal (Dick) School of Veterinary Studies R(D)SVS, The University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK.
3
Freeman JC, Scott JG. Genetics, genomics and mechanisms responsible for high levels of pyrethroid resistance in Musca domestica. Pestic Biochem Physiol 2024; 198:105752. [PMID: 38225095 DOI: 10.1016/j.pestbp.2023.105752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 12/12/2023] [Accepted: 12/18/2023] [Indexed: 01/17/2024]
Abstract
Insecticide resistance is both an economically important and an evolutionarily interesting phenomenon. Identification of the mutations responsible for resistance allows for highly sensitive resistance monitoring and provides tools to study the forces (population genetics, fitness costs, etc.) that shape the evolution of resistance. Genes coding for insecticide targets have many well-characterized mutations, but the mutations responsible for enhanced detoxification have proven difficult to identify. We employed multiple strategies to identify the mutations responsible for the extraordinarily high permethrin resistance in the KS17-R strain of house fly (Musca domestica): insecticide synergist assays, linkage analysis, bulk segregant analyses (BSA), transcriptomics and long-read DNA (Nanopore) sequencing. The >85,100-fold resistance in KS17-R was partially suppressed by the insecticide synergists piperonyl butoxide and S,S,S-tributylphosphorothionate, but not by diethyl maleate nor by injection. This suggests the involvement of target site insensitivity, CYP-mediated resistance, possibly hydrolase-mediated resistance and potentially other unknown factors. Linkage analysis identified chromosomes 1, 2, 3 and 5 as having a role in resistance. BSA mapped resistance loci on chromosomes 3 and 5. The locus on chromosome 3 was centered on the voltage-sensitive sodium channel. The locus on chromosome 5 was associated with a duplication of multiple detoxification genes. Transcriptomic analyses and long-read DNA sequencing revealed overexpressed CYPs and esterases and identified a complex set of structural variants at the chromosome 5 locus.
Affiliation(s)
- Jamie C Freeman
- Department of Entomology, Cornell University, Comstock Hall, Ithaca, New York, USA
| | - Jeffrey G Scott
- Department of Entomology, Cornell University, Comstock Hall, Ithaca, New York, USA.
4
Cuenca-Guardiola J, Morena-Barrio BDL, Navarro-Manzano E, Stevens J, Ouwehand WH, Gleadall NS, Corral J, Fernández-Breis JT. Detection and annotation of transposable element insertions and deletions on the human genome using nanopore sequencing. iScience 2023; 26:108214. [PMID: 37953943 PMCID: PMC10638045 DOI: 10.1016/j.isci.2023.108214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 07/28/2023] [Accepted: 10/11/2023] [Indexed: 11/14/2023] Open
Abstract
Repetitive sequences represent about 45% of the human genome. Some are transposable elements (TEs) with the ability to change their position in the genome, creating genetic variability as either insertions or deletions, with potential pathogenic consequences. We used long-read nanopore sequencing to identify TE variants in the genomes of 24 patients with antithrombin deficiency. We identified 7 344 TE insertions and 3 056 TE deletions, of which 2 926 were not previously described in publicly available databases. The insertions affected 3 955 genes, with 6 insertions located in exons, 3 929 in introns, and 147 in promoters. Potential functional impact was evaluated with gene annotation and enrichment analysis, which suggested a strong relationship with neuron-related functions and autism. We conclude that this study encourages the generation of a complete map of TEs in the human genome, which will be useful for identifying new TEs involved in genetic disorders.
Affiliation(s)
- Javier Cuenca-Guardiola
- Departamento de Informática y Sistemas, Universidad de Murcia, CEIR Campus Mare Nostrum, IMIB-Pascual Parrilla, Facultad de Informática, Campus de Espinardo, Murcia 30100, Spain
| | - Belén de la Morena-Barrio
- Servicio de Hematología, Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, Universidad de Murcia, IMIB-Pascual Parrilla, CIBERER-III, Ronda de Garay S/N, Murcia 30003, Spain
| | - Esther Navarro-Manzano
- Servicio de Hematología, Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, Universidad de Murcia, IMIB-Pascual Parrilla, CIBERER-III, Ronda de Garay S/N, Murcia 30003, Spain
| | - Jonathan Stevens
- Department of Haematology, University of Cambridge, CB2 0PT, Cambridge Biomedical Campus, Cambridge, Cambridge, England, UK
- Blood and Transplant, National Health Service (NHS), CB2 0QQ, Cambridge Biomedical Campus, Cambridge, England, UK
| | - Willem H Ouwehand
- Department of Haematology, University of Cambridge, CB2 0PT, Cambridge Biomedical Campus, Cambridge, Cambridge, England, UK
- Blood and Transplant, National Health Service (NHS), CB2 0QQ, Cambridge Biomedical Campus, Cambridge, England, UK
- British Heart Foundation Cambridge Centre of Excellence, Division of Cardiovascular Medicine, Cambridge Heart and Lung Research Institute, Cambridge Biomedical Campus, Cambridge, England CB2 0AY, UK
- University College London Hospitals, NHS Foundation Trust, London, England, UK
| | - Nicholas S Gleadall
- Department of Haematology, University of Cambridge, CB2 0PT, Cambridge Biomedical Campus, Cambridge, Cambridge, England, UK
- Blood and Transplant, National Health Service (NHS), CB2 0QQ, Cambridge Biomedical Campus, Cambridge, England, UK
| | - Javier Corral
- Servicio de Hematología, Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, Universidad de Murcia, IMIB-Pascual Parrilla, CIBERER-III, Ronda de Garay S/N, Murcia 30003, Spain
| | - Jesualdo Tomás Fernández-Breis
- Departamento de Informática y Sistemas, Universidad de Murcia, CEIR Campus Mare Nostrum, IMIB-Pascual Parrilla, Facultad de Informática, Campus de Espinardo, Murcia 30100, Spain
5
Romagnoli S, Bartalucci N, Vannucchi AM. Resolving complex structural variants via nanopore sequencing. Front Genet 2023; 14:1213917. [PMID: 37674481 PMCID: PMC10479017 DOI: 10.3389/fgene.2023.1213917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Accepted: 07/26/2023] [Indexed: 09/08/2023] Open
Abstract
The recent development of high-throughput sequencing platforms provided impressive insights into the field of human genetics and contributed to considering structural variants (SVs) as the hallmark of genome instability, leading to the establishment of several pathologic conditions, including neoplasia and neurodegenerative and cognitive disorders. While SV detection is addressed by next-generation sequencing (NGS) technologies, the introduction of more recent long-read sequencing technologies has already proven invaluable in overcoming the inaccuracy and limitations of NGS technologies, which stem from the short length (100-500 bp) of the sequencing reads, when applied to resolve wide and structurally complex SVs. Among the long-read sequencing technologies, Oxford Nanopore Technologies developed a sequencing platform based on a protein nanopore that allows the sequencing of "native" long DNA molecules of virtually unlimited length (typical range 1-100 Kb). In this review, we focus on the bioinformatics methods that improve the identification and genotyping of known and novel SVs to investigate human pathological conditions, discussing the possibility of introducing nanopore sequencing technology into routine diagnostics.
Affiliation(s)
| | | | - Alessandro Maria Vannucchi
- CRIMM, Center of Research and Innovation of Myeloproliferative Neoplasms, DENOTHE Excellence Center, Careggi University Hospital and Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
6
Spealman P, De T, Chuong JN, Gresham D. Best Practices in Microbial Experimental Evolution: Using Reporters and Long-Read Sequencing to Identify Copy Number Variation in Experimental Evolution. J Mol Evol 2023; 91:356-368. [PMID: 37012421 PMCID: PMC10275804 DOI: 10.1007/s00239-023-10102-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2022] [Accepted: 02/21/2023] [Indexed: 04/05/2023]
Abstract
Copy number variants (CNVs), comprising gene amplifications and deletions, are a pervasive class of heritable variation. CNVs play a key role in rapid adaptation in both natural, and experimental, evolution. However, despite the advent of new DNA sequencing technologies, detection and quantification of CNVs in heterogeneous populations has remained challenging. Here, we summarize recent advances in the use of CNV reporters that provide a facile means of quantifying de novo CNVs at a specific locus in the genome, and nanopore sequencing, for resolving the often complex structures of CNVs. We provide guidance for the engineering and analysis of CNV reporters and practical guidelines for single-cell analysis of CNVs using flow cytometry. We summarize recent advances in nanopore sequencing, discuss the utility of this technology, and provide guidance for the bioinformatic analysis of these data to define the molecular structure of CNVs. The combination of reporter systems for tracking and isolating CNV lineages and long-read DNA sequencing for characterizing CNV structures enables unprecedented resolution of the mechanisms by which CNVs are generated and their evolutionary dynamics.
Affiliation(s)
- Pieter Spealman
- Department of Biology, New York University, New York, NY, 10003, USA
- Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA
| | - Titir De
- Department of Biology, New York University, New York, NY, 10003, USA
- Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA
| | - Julie N Chuong
- Department of Biology, New York University, New York, NY, 10003, USA
- Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA
| | - David Gresham
- Department of Biology, New York University, New York, NY, 10003, USA.
- Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA.
7
Zhang Z, Xia T, Zhou S, Yang X, Lyu T, Wang L, Fang J, Wang Q, Dou H, Zhang H. High-Quality Chromosome-Level Genome Assembly of the Corsac Fox (Vulpes corsac) Reveals Adaptation to Semiarid and Harsh Environments. Int J Mol Sci 2023; 24:9599. [PMID: 37298549 DOI: 10.3390/ijms24119599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Revised: 05/24/2023] [Accepted: 05/29/2023] [Indexed: 06/12/2023] Open
Abstract
The Corsac fox (Vulpes corsac) is a species of fox distributed in the arid prairie regions of Central and Northern Asia, with distinct adaptations to dry environments. Here, we applied Oxford-Nanopore sequencing and a chromosome structure capture technique to assemble the first Corsac fox genome, which was then assembled into chromosome fragments. The genome assembly has a total length of 2.2 Gb with a contig N50 of 41.62 Mb and a scaffold N50 of 132.2 Mb over 18 pseudo-chromosomal scaffolds. The genome contained approximately 32.67% of repeat sequences. A total of 20,511 protein-coding genes were predicted, of which 88.9% were functionally annotated. Phylogenetic analyses indicated a close relation to the Red fox (Vulpes vulpes) with an estimated divergence time of ~3.7 million years ago (MYA). We performed separate enrichment analyses of species-unique genes, the expanded and contracted gene families, and positively selected genes. The results suggest an enrichment of pathways related to protein synthesis and response and an evolutionary mechanism by which cells respond to protein denaturation in response to heat stress. The enrichment of pathways related to lipid and glucose metabolism, potentially preventing stress from dehydration, and positive selection of genes related to vision, as well as stress responses in harsh environments, may reveal adaptive evolutionary mechanisms in the Corsac fox under harsh drought conditions. Additional detection of positive selection for genes associated with gustatory receptors may reveal a unique desert diet strategy for the species. This high-quality genome provides a valuable resource for studying mammalian drought adaptation and evolution in the genus Vulpes.
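The contig and scaffold N50 values quoted in this abstract use the usual assembly statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that calculation (the function name and the toy lengths are illustrative assumptions, not data from the paper):

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: cumulative sum always reaches the total")


if __name__ == "__main__":
    # Toy lengths for illustration only.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Comparing `running * 2` with `total` keeps the arithmetic in integers and avoids rounding issues with odd totals.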
Affiliation(s)
- Zhihao Zhang
- School of Life Science, Qufu Normal University, Qufu 273165, China
| | - Tian Xia
- School of Life Science, Qufu Normal University, Qufu 273165, China
| | - Shengyang Zhou
- School of Life Science, Qufu Normal University, Qufu 273165, China
| | - Xiufeng Yang
- School of Life Science, Qufu Normal University, Qufu 273165, China
| | - Tianshu Lyu
- School of Life Science, Qufu Normal University, Qufu 273165, China
| | - Lidong Wang
- School of Life Science, Qufu Normal University, Qufu 273165, China
| | - Jiaohui Fang
- School of Life Science, Qufu Normal University, Qufu 273165, China
| | - Qi Wang
- Hulunbuir Academy of Inland Lakes in Northern Cold & Arid Areas, Hulunbuir 021000, China
| | - Huashan Dou
- Hulunbuir Academy of Inland Lakes in Northern Cold & Arid Areas, Hulunbuir 021000, China
| | - Honghai Zhang
- School of Life Science, Qufu Normal University, Qufu 273165, China
8
Lecoquierre F, Quenez O, Fourneaux S, Coutant S, Vezain M, Rolain M, Drouot N, Boland A, Olaso R, Meyer V, Deleuze JF, Dabbagh D, Gilles I, Gayet C, Saugier-Veber P, Goldenberg A, Guerrot AM, Nicolas G. High diagnostic potential of short and long read genome sequencing with transcriptome analysis in exome-negative developmental disorders. Hum Genet 2023; 142:773-783. [PMID: 37076692 DOI: 10.1007/s00439-023-02553-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/21/2023] [Accepted: 04/05/2023] [Indexed: 04/21/2023]
Abstract
Exome sequencing (ES) has become the method of choice for diagnosing rare diseases, while the availability of short-read genome sequencing (SR-GS) in a medical setting is increasing. In addition, new sequencing technologies, such as long-read genome sequencing (LR-GS) and transcriptome sequencing, are being increasingly used. However, the contribution of these techniques compared to widely used ES is not well established, particularly in regards to the analysis of non-coding regions. In a pilot study of five probands affected by an undiagnosed neurodevelopmental disorder, we performed trio-based short-read GS and long-read GS as well as case-only peripheral blood transcriptome sequencing. We identified three new genetic diagnoses, none of which affected the coding regions. More specifically, LR-GS identified a balanced inversion in NSD1, highlighting a rare mechanism of Sotos syndrome. SR-GS identified a homozygous deep intronic variant of KLHL7 resulting in a neoexon inclusion, and a de novo mosaic intronic 22-bp deletion in KMT2D, leading to the diagnosis of Perching and Kabuki syndromes, respectively. All three variants had a significant effect on the transcriptome, which showed decreased gene expression, mono-allelic expression and splicing defects, respectively, further validating the effect of these variants. Overall, in undiagnosed patients, the combination of short and long read GS allowed the detection of cryptic variations not or barely detectable by ES, making it a highly sensitive method at the cost of more complex bioinformatics approaches. Transcriptome sequencing is a valuable complement for the functional validation of variations, particularly in the non-coding genome.
Affiliation(s)
- François Lecoquierre
- Univ Rouen Normandie, Inserm U12045 and CHU Rouen, Department of Genetics and Reference Center for Developmental Disorders, FHU-G4 Génomique, F-76000, Rouen, France.
| | - Olivier Quenez
- Univ Rouen Normandie, Inserm U12045 and CHU Rouen, Department of Genetics and Reference Center for Developmental Disorders, FHU-G4 Génomique, F-76000, Rouen, France
| | - Steeve Fourneaux
- Univ Rouen Normandie, Inserm U12045 and CHU Rouen, Department of Genetics and Reference Center for Developmental Disorders, FHU-G4 Génomique, F-76000, Rouen, France
| | - Sophie Coutant
- Univ Rouen Normandie, Inserm U12045 and CHU Rouen, Department of Genetics and Reference Center for Developmental Disorders, FHU-G4 Génomique, F-76000, Rouen, France
| | - Myriam Vezain
- Univ Rouen Normandie, Inserm U12045 and CHU Rouen, Department of Genetics and Reference Center for Developmental Disorders, FHU-G4 Génomique, F-76000, Rouen, France
| | - Marion Rolain
- Univ Rouen Normandie, Inserm U12045 and CHU Rouen, Department of Genetics and Reference Center for Developmental Disorders, FHU-G4 Génomique, F-76000, Rouen, France
| | - Nathalie Drouot
- Univ Rouen Normandie, Inserm U12045 and CHU Rouen, Department of Genetics and Reference Center for Developmental Disorders, FHU-G4 Génomique, F-76000, Rouen, France
| | - Anne Boland
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine (CNRGH), 91057, Evry, France
| | - Robert Olaso
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine (CNRGH), 91057, Evry, France
| | - Vincent Meyer
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine (CNRGH), 91057, Evry, France
| | - Jean-François Deleuze
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine (CNRGH), 91057, Evry, France
| | - Dana Dabbagh
- Department of Pediatrics, Elbeuf Hospital, Elbeuf, France
| | | | - Claire Gayet
- Department of Pediatrics, CHU Rouen, F-76000, Rouen, France
| | - Pascale Saugier-Veber
- Univ Rouen Normandie, Inserm U12045 and CHU Rouen, Department of Genetics and Reference Center for Developmental Disorders, FHU-G4 Génomique, F-76000, Rouen, France
| | - Alice Goldenberg
- Univ Rouen Normandie, Inserm U12045 and CHU Rouen, Department of Genetics and Reference Center for Developmental Disorders, FHU-G4 Génomique, F-76000, Rouen, France
| | - Anne-Marie Guerrot
- Univ Rouen Normandie, Inserm U12045 and CHU Rouen, Department of Genetics and Reference Center for Developmental Disorders, FHU-G4 Génomique, F-76000, Rouen, France
| | - Gaël Nicolas
- Univ Rouen Normandie, Inserm U12045 and CHU Rouen, Department of Genetics and Reference Center for Developmental Disorders, FHU-G4 Génomique, F-76000, Rouen, France.
9
Yildiz G, Zanini SF, Afsharyan NP, Obermeier C, Snowdon RJ, Golicz AA. Benchmarking Oxford Nanopore read alignment-based insertion and deletion detection in crop plant genomes. Plant Genome 2023:e20314. [PMID: 36988043 DOI: 10.1002/tpg2.20314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Accepted: 01/15/2023] [Indexed: 06/19/2023]
Abstract
Structural variations (SVs) are larger polymorphisms (> 50 bp in length), which consist of insertions, deletions, inversions, duplications, and translocations. They can have a strong impact on agronomical traits and play an important role in environmental adaptation. The development of long-read sequencing technologies, including Oxford Nanopore, allows for comprehensive SV discovery and characterization even in complex polyploid crop genomes. However, many of the SV discovery pipeline benchmarks do not include complex plant genome datasets. In this study, we benchmarked insertion and deletion detection by popular long-read alignment-based SV detection tools for crop plant genomes. We used real and simulated Oxford Nanopore reads for two crops, allotetraploid Brassica napus (oilseed rape) and diploid Solanum lycopersicum (tomato), and evaluated several read aligners and SV callers across 5×, 10×, and 20× coverages typically used in re-sequencing studies. We further validated our findings using maize and soybean datasets. Our benchmarks provide a useful guide for designing Oxford Nanopore re-sequencing projects and SV discovery pipelines for crop plants.
Affiliation(s)
- Gözde Yildiz
- Department of Plant Breeding, Justus Liebig University Giessen, Giessen, Germany
| | - Silvia F Zanini
- Department of Plant Breeding, Justus Liebig University Giessen, Giessen, Germany
| | - Nazanin P Afsharyan
- Department of Plant Breeding, Justus Liebig University Giessen, Giessen, Germany
| | - Christian Obermeier
- Department of Plant Breeding, Justus Liebig University Giessen, Giessen, Germany
| | - Rod J Snowdon
- Department of Plant Breeding, Justus Liebig University Giessen, Giessen, Germany
| | - Agnieszka A Golicz
- Department of Plant Breeding, Justus Liebig University Giessen, Giessen, Germany
10
Ikemoto K, Fujimoto H, Fujimoto A. Localized assembly for long reads enables genome-wide analysis of repetitive regions at single-base resolution in human genomes. Hum Genomics 2023; 17:21. [PMID: 36895025 PMCID: PMC9996862 DOI: 10.1186/s40246-023-00467-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Accepted: 03/01/2023] [Indexed: 03/11/2023] Open
Abstract
BACKGROUND: Long-read sequencing technologies have the potential to overcome the limitations of short reads and provide a comprehensive picture of the human genome. However, the characterization of repetitive sequences by reconstructing genomic structures at high resolution solely from long reads remains difficult. Here, we developed a localized assembly method (LoMA) that constructs highly accurate consensus sequences (CSs) from long reads.
METHODS: We developed LoMA by combining minimap2, MAFFT, and our algorithm, which classifies diploid haplotypes based on structural variants and CSs. Using this tool, we analyzed two human samples (NA18943 and NA19240) sequenced with the Oxford Nanopore sequencer. We defined target regions in each genome based on mapping patterns and then constructed a high-quality catalog of the human insertion solely from the long-read data.
RESULTS: The assessment of LoMA showed a high accuracy of CSs (error rate < 0.3%) compared with raw data (error rate > 8%) and superiority to a previous study. The genome-wide analysis of NA18943 and NA19240 identified 5516 and 6542 insertions (≥ 100 bp), respectively. Most insertions (~ 80%) were derived from tandem repeats and transposable elements. We also detected processed pseudogenes, insertions in transposable elements, and long insertions (> 10 kbp). Finally, our analysis suggested that short tandem duplications are associated with gene expression and transposons.
CONCLUSIONS: Our analysis showed that LoMA constructs high-quality sequences from long reads with substantial errors. This study revealed the true structures of the insertions with high accuracy and inferred the mechanisms for the insertions, thus contributing to future human genome studies. LoMA is available at our GitHub page: https://github.com/kolikem/loma.
Affiliation(s)
- Ko Ikemoto
- Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Hongo 7-3-1, Bunkyo, Tokyo, Japan
| | - Hinano Fujimoto
- Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Hongo 7-3-1, Bunkyo, Tokyo, Japan
| | - Akihiro Fujimoto
- Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Hongo 7-3-1, Bunkyo, Tokyo, Japan.
11
Mérot C, Stenløkk KSR, Venney C, Laporte M, Moser M, Normandeau E, Árnyasi M, Kent M, Rougeux C, Flynn JM, Lien S, Bernatchez L. Genome assembly, structural variants, and genetic differentiation between lake whitefish young species pairs (Coregonus sp.) with long and short reads. Mol Ecol 2023; 32:1458-1477. [PMID: 35416336 DOI: 10.1111/mec.16468] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 01/15/2022] [Revised: 03/24/2022] [Accepted: 04/01/2022] [Indexed: 11/26/2022]
Abstract
Nascent pairs of ecologically differentiated species offer an opportunity to get a better glimpse at the genetic architecture of speciation. Of particular interest is our recent ability to consider a wider range of genomic variants, not only single-nucleotide polymorphisms (SNPs), thanks to long-read sequencing technology. We can now identify structural variants (SVs) such as insertions, deletions and other rearrangements, allowing further insights into the genetic architecture of speciation and how different types of variants are involved in species differentiation. Here, we investigated genomic patterns of differentiation between sympatric species pairs (Dwarf and Normal) belonging to the lake whitefish (Coregonus clupeaformis) species complex. We assembled the first reference genomes for both C. clupeaformis sp. Normal and C. clupeaformis sp. Dwarf, annotated the transposable elements and analysed the genomes in the light of related coregonid species. Next, we used a combination of long- and short-read sequencing to characterize SVs and genotype them at the population scale using genome-graph approaches, showing that SVs cover five times more of the genome than SNPs. We then integrated both SNPs and SVs to investigate the genetic architecture of species differentiation in two different lakes and highlighted an excess of shared outliers of differentiation. In particular, a large fraction of SVs differentiating the two species correspond to insertions or deletions of transposable elements (TEs), suggesting that TE accumulation may represent a key component of genetic divergence between the Dwarf and Normal species. Together, our results suggest that SVs may play an important role in speciation and that, by combining second- and third-generation sequencing, we now have the ability to integrate SVs into speciation genomics.
Affiliation(s)
- Claire Mérot
  - Département de Biologie, Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Québec, Canada
  - UMR 6553 Ecobio, OSUR, CNRS, Université de Rennes, Rennes, France
- Kristina S R Stenløkk
  - Department of Animal and Aquacultural Sciences (IHA), Faculty of Life Sciences (BIOVIT), Centre for Integrative Genetics (CIGENE), Norwegian University of Life Sciences (NMBU), Ås, Norway
- Clare Venney
  - Département de Biologie, Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Québec, Canada
- Martin Laporte
  - Département de Biologie, Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Québec, Canada
  - Ministère des Forêts, de la Faune et des Parcs (MFFP) du Québec, Québec, Québec, Canada
- Michel Moser
  - Department of Animal and Aquacultural Sciences (IHA), Faculty of Life Sciences (BIOVIT), Centre for Integrative Genetics (CIGENE), Norwegian University of Life Sciences (NMBU), Ås, Norway
- Eric Normandeau
  - Département de Biologie, Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Québec, Canada
- Mariann Árnyasi
  - Department of Animal and Aquacultural Sciences (IHA), Faculty of Life Sciences (BIOVIT), Centre for Integrative Genetics (CIGENE), Norwegian University of Life Sciences (NMBU), Ås, Norway
- Matthew Kent
  - Department of Animal and Aquacultural Sciences (IHA), Faculty of Life Sciences (BIOVIT), Centre for Integrative Genetics (CIGENE), Norwegian University of Life Sciences (NMBU), Ås, Norway
- Clément Rougeux
  - Département de Biologie, Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Québec, Canada
- Jullien M Flynn
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA
- Sigbjørn Lien
  - Department of Animal and Aquacultural Sciences (IHA), Faculty of Life Sciences (BIOVIT), Centre for Integrative Genetics (CIGENE), Norwegian University of Life Sciences (NMBU), Ås, Norway
- Louis Bernatchez
  - Département de Biologie, Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Québec, Canada
|
12
|
Long-read sequencing identifies novel structural variations in colorectal cancer. PLoS Genet 2023; 19:e1010514. [PMID: 36812239 PMCID: PMC10013895 DOI: 10.1371/journal.pgen.1010514] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 12/12/2021] [Revised: 03/14/2023] [Accepted: 11/08/2022] [Indexed: 02/24/2023] Open
Abstract
Structural variations (SVs) are a key type of cancer genomic alteration, contributing to oncogenesis and progression of many cancers, including colorectal cancer (CRC). However, SVs in CRC remain difficult to detect reliably because of the limited SV-detection capacity of commonly used short-read sequencing. This study investigated the somatic SVs in 21 pairs of CRC samples by Nanopore whole-genome long-read sequencing. 5200 novel somatic SVs from 21 CRC patients (494 SVs / patient) were identified. A 4.9-Mbp long inversion that silences APC expression (confirmed by RNA-seq) and an 11.2-kbp inversion that structurally alters CFTR were identified. Two novel gene fusions that might functionally impact the oncogene RNF38 and the tumor-suppressor SMAD3 were detected. The RNF38 fusion possesses metastasis-promoting ability, confirmed by in vitro migration and invasion assays and in vivo metastasis experiments. This work highlighted the various applications of long-read sequencing in cancer genome analysis, and shed new light on how somatic SVs structurally alter critical genes in CRC. The investigation of somatic SVs via nanopore sequencing revealed the potential of this genomic approach in facilitating precise diagnosis and personalized treatment of CRC.
|
13
|
Zhao L, Li XD, Jiang T, Wang H, Dan Z, Xu SQ, Guan DL. The Chromosome-Level Genome of Hestina assimilis (Lepidoptera: Nymphalidae) Reveals the Evolution of Saprophagy-Related Genes in Brush-Footed Butterflies. Int J Mol Sci 2023; 24:ijms24032087. [PMID: 36768416 PMCID: PMC9917059 DOI: 10.3390/ijms24032087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 01/06/2023] [Accepted: 01/16/2023] [Indexed: 01/21/2023] Open
Abstract
Most butterflies feed on nectar, while some saprophagous butterflies forage on various non-nectar foods. To date, little is known about the genomic and molecular shifts associated with the evolution of the saprophagous feeding strategy. Here, we assembled the high-quality chromosome-level genome of Hestina assimilis to explore its saprophagous molecular and genetic mechanisms. This chromosome-level genome of H. assimilis is 412.82 Mb, with a scaffold N50 of 15.70 Mb. In total, 98.11% of contigs were anchored to 30 chromosomes. Compared with H. assimilis and other Nymphalidae butterflies, the genes of metabolism and detoxification experienced expansions. We annotated 80 cytochrome P450 (CYP) genes in the H. assimilis genome, among which genes belonging to the CYP4 subfamily were significantly expanded (p < 0.01). These P450 genes were unevenly distributed and mainly concentrated on chromosomes 6-9. We identified 33 olfactory receptor (OR), 20 odorant-binding protein (OBP), and six gustatory receptor (GR) genes in the H. assimilis genome, which were fewer than in the nectarivorous Danaus plexippus. A decreased number of OBP, OR, and GR genes implied that H. assimilis should resort less to olfaction and gustation than its nectarivorous counterparts, which need highly specialized olfactory and gustatory functions. Moreover, we found one site under positive selection at residue 996 (phenylalanine) of GR genes exclusive to H. assimilis, a residue that is conserved in most lineages. Our study provides support for the adaptive evolution of feeding habits in butterflies.
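The scaffold N50 quoted in this abstract is a standard contiguity statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal Python sketch of that definition follows; the function name `n50` and the toy lengths are illustrative assumptions and do not come from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (hypothetical, arbitrary units), not data from the study.
toy_scaffolds = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_scaffolds))
```

The same calculation applies to read lengths, which is how a read-length N50 is usually reported.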
Affiliation(s)
- Lu Zhao
  - College of Life Sciences, Shaanxi Normal University, Xi’an 710119, China
- Xiao-Dong Li
  - School of Chemistry and Bioengineering, Hechi University, Yizhou 546300, China
- Tao Jiang
  - College of Life Sciences, Shaanxi Normal University, Xi’an 710119, China
- Hang Wang
  - College of Life Sciences, Shaanxi Normal University, Xi’an 710119, China
- Zhicuo Dan
  - College of Life Sciences, Shaanxi Normal University, Xi’an 710119, China
- Sheng-Quan Xu
  - College of Life Sciences, Shaanxi Normal University, Xi’an 710119, China
  - Correspondence: (S.-Q.X.); (D.-L.G.)
- De-Long Guan
  - College of Life Sciences, Shaanxi Normal University, Xi’an 710119, China
  - School of Chemistry and Bioengineering, Hechi University, Yizhou 546300, China
  - Correspondence: (S.-Q.X.); (D.-L.G.)
|
14
|
Hu Y, Yang C, Zhang L, Zhou X. Haplotyping-Assisted Diploid Assembly and Variant Detection with Linked Reads. Methods Mol Biol 2023; 2590:161-182. [PMID: 36335499 DOI: 10.1007/978-1-0716-2819-5_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Phasing is essential for determining the origins of each set of alleles in the whole-genome sequencing data of individuals. As such, it provides essential information for the causes of hereditary diseases and the sources of individual variability. Recent technical breakthroughs in linked-read (referred to as co-barcoding in other chapters of the book) and long-read sequencing and downstream analysis have brought the goal of accurate and complete phasing within reach. Here we review recent progress related to the assembly and phasing of personal genomes based on linked-reads and related applications. Motivated by current limitations in generating high-quality diploid assemblies and detecting variants, a new suite of software tools, Aquila, was developed to fully take advantage of linked-read sequencing technology. The overarching goal of Aquila is to exploit the strengths of linked-read technology including long-range connectivity and inherent phasing of variants for reference-assisted local de novo assembly at the whole-genome scale. The diploid nature of the assemblies facilitates detection and phasing of genetic variation, including single nucleotide variations (SNVs), small insertions and deletions (indels), and structural variants (SVs). An extension of Aquila, Aquila_stLFR, focuses on another newly developed linked-reads sequencing technology, single-tube long-fragment read (stLFR). AquilaSV, a region-based diploid assembly approach, is used to characterize structural variants and can achieve diploid assembly in one target region at a time. Lastly, we introduce HAPDeNovo, a program that exploits phasing information from linked-read sequencing to improve detection of de novo mutations. Use of these tools is expected to harness the advantages of linked-reads technology, improve phasing, and advance variant discovery.
Affiliation(s)
- Yunfei Hu
  - Department of Computer Science, Vanderbilt University, Nashville, TN, USA
- Chao Yang
  - Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong
- Lu Zhang
  - Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong
- Xin Zhou
  - Department of Computer Science, Vanderbilt University, Nashville, TN, USA
  - Department of Biomedical Engineering, Vanderbilt University, Nashville, TN, USA
  - Data Science Institute, Nashville, TN, USA
|
15
|
Akbari V, Hanlon VC, O’Neill K, Lefebvre L, Schrader KA, Lansdorp PM, Jones SJ. Parent-of-origin detection and chromosome-scale haplotyping using long-read DNA methylation sequencing and Strand-seq. CELL GENOMICS 2022; 3:100233. [PMID: 36777186 PMCID: PMC9903809 DOI: 10.1016/j.xgen.2022.100233] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/20/2022] [Revised: 09/08/2022] [Accepted: 11/29/2022] [Indexed: 12/24/2022]
Abstract
Hundreds of loci in human genomes have alleles that are methylated differentially according to their parent of origin. These imprinted loci generally show little variation across tissues, individuals, and populations. We show that such loci can be used to distinguish the maternal and paternal homologs for all human autosomes without the need for the parental DNA. We integrate methylation-detecting nanopore sequencing with the long-range phase information in Strand-seq data to determine the parent of origin of chromosome-length haplotypes for both DNA sequence and DNA methylation in five trios with diverse genetic backgrounds. The parent of origin was correctly inferred for all autosomes with an average mismatch error rate of 0.31% for SNVs and 1.89% for insertions or deletions (indels). Because our method can determine whether an inherited disease allele originated from the mother or the father, we predict that it will improve the diagnosis and management of many genetic diseases.
Affiliation(s)
- Vahid Akbari
  - Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
  - Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
- Kieran O’Neill
  - Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Louis Lefebvre
  - Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
- Kasmintan A. Schrader
  - Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
  - Department of Molecular Oncology, BC Cancer, Vancouver, BC, Canada
- Peter M. Lansdorp
  - Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
  - Terry Fox Laboratory, BC Cancer, Vancouver, BC, Canada
  - Corresponding author
- Steven J.M. Jones
  - Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
  - Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
  - Corresponding author
|
16
|
Ferguson S, McLay T, Andrew RL, Bruhl JJ, Schwessinger B, Borevitz J, Jones A. Species-specific basecallers improve actual accuracy of nanopore sequencing in plants. PLANT METHODS 2022; 18:137. [PMID: 36517904 PMCID: PMC9749173 DOI: 10.1186/s13007-022-00971-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/02/2022] [Accepted: 12/09/2022] [Indexed: 06/17/2023]
Abstract
BACKGROUND Long-read sequencing platforms offered by Oxford Nanopore Technologies (ONT) allow native DNA containing epigenetic modifications to be directly sequenced, but can be limited by lower per-base accuracies. A key step post-sequencing is basecalling, the process of converting raw electrical signals produced by the sequencing device into nucleotide sequences. This is challenging as current basecallers are primarily based on mixtures of model species for training. Here we utilise both ONT PromethION and higher accuracy PacBio Sequel II HiFi sequencing on two plants, Phebalium stellatum and Xanthorrhoea johnsonii, to train species-specific basecaller models with the aim of improving per-base accuracy. We investigate sequencing accuracies achieved by ONT basecallers and assess accuracy gains by training single-species and species-specific basecaller models. We also evaluate accuracy gains from ONT's improved flowcells (R10.4, FLO-PRO112) and sequencing kits (SQK-LSK112). For the truth dataset for both model training and accuracy assessment, we developed highly accurate, contiguous diploid reference genomes with PacBio Sequel II HiFi reads. RESULTS Basecalling with ONT Guppy 5 and 6 super-accurate gave almost identical results, attaining read accuracies of 91.96% and 94.15%. Guppy's plant-specific model gave highly mixed results, attaining read accuracies of 91.47% and 96.18%. Species-specific basecalling models improved read accuracy, attaining 93.24% and 95.16% read accuracies. R10.4 sequencing kits also improve sequencing accuracy, attaining read accuracies of 95.46% (super-accurate) and 96.87% (species-specific). CONCLUSIONS The use of a single mixed-species basecaller model, such as ONT Guppy super-accurate, may be reducing the accuracy of nanopore sequencing, due to conflicting genome biology within the training dataset and study species. Training of single-species and genome-specific basecaller models improves read accuracy. 
Studies that aim to do large-scale long-read genotyping would primarily benefit from training their own basecalling models. Such studies could use sequencing accuracy gains and improving bioinformatics tools to improve study outcomes.
Affiliation(s)
- Scott Ferguson
  - Research School of Biology, Australian National University, Canberra, ACT, Australia
- Todd McLay
  - National Herbarium of Victoria, Royal Botanic Gardens Victoria, South Yarra, Victoria, 3004, Australia
  - School of Biosciences, The University of Melbourne, Parkville, VIC, 3010, Australia
- Rose L Andrew
  - Botany & N.C.W. Beadle Herbarium, School of Environmental and Rural Science, University of New England, Armidale, NSW, 2351, Australia
- Jeremy J Bruhl
  - Botany & N.C.W. Beadle Herbarium, School of Environmental and Rural Science, University of New England, Armidale, NSW, 2351, Australia
- Benjamin Schwessinger
  - Research School of Biology, Australian National University, Canberra, ACT, Australia
- Justin Borevitz
  - Research School of Biology, Australian National University, Canberra, ACT, Australia
- Ashley Jones
  - Research School of Biology, Australian National University, Canberra, ACT, Australia
|
17
|
Chen J, Cheng J, Chen X, Inoue M, Liu Y, Song CX. Whole-genome long-read TAPS deciphers DNA methylation patterns at base resolution using PacBio SMRT sequencing technology. Nucleic Acids Res 2022; 50:e104. [PMID: 35849350 PMCID: PMC9561279 DOI: 10.1093/nar/gkac612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Revised: 06/07/2022] [Accepted: 06/30/2022] [Indexed: 11/16/2022] Open
Abstract
Long-read sequencing provides valuable information on difficult-to-map genomic regions, which can complement short-read sequencing to improve genome assembly, yet limited methods are available to accurately detect DNA methylation over long distances at a whole-genome scale. By combining our recently developed TET-assisted pyridine borane sequencing (TAPS) method, which enables direct detection of 5-methylcytosine and 5-hydroxymethylcytosine, with PacBio single-molecule real-time sequencing, we present here whole-genome long-read TAPS (wglrTAPS). To evaluate the performance of wglrTAPS, we applied it to mouse embryonic stem cells as a proof of concept, and an N50 read length of 3.5 kb is achieved. By sequencing wglrTAPS to 8.2× depth, we discovered a significant proportion of CpG sites that were not covered in previous 27.5× short-read TAPS. Our results demonstrate that wglrTAPS facilitates methylation profiling on problematic genomic regions with repetitive elements or structural variations, and also in an allelic manner, all of which are extremely difficult for short-read sequencing methods to resolve. This method therefore enhances applications of third-generation sequencing technologies for DNA epigenetics.
Affiliation(s)
- Jinfeng Chen
  - Ludwig Institute for Cancer Research, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, UK
  - Target Discovery Institute, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, UK
- Jingfei Cheng
  - Ludwig Institute for Cancer Research, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, UK
  - Target Discovery Institute, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, UK
- Xiufei Chen
  - Ludwig Institute for Cancer Research, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, UK
  - Target Discovery Institute, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, UK
- Masato Inoue
  - Ludwig Institute for Cancer Research, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, UK
  - Target Discovery Institute, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, UK
- Yibin Liu
  - Ludwig Institute for Cancer Research, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, UK
  - Target Discovery Institute, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, UK
- Chun-Xiao Song
  - Ludwig Institute for Cancer Research, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, UK
  - Target Discovery Institute, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, UK
|
18
|
Hernandez-Moran BA, Papanastasiou AS, Parry D, Meynert A, Gautier P, Grimes G, Adams IR, Trejo-Reveles V, Bengani H, Keighren M, Jackson IJ, Adams DJ, FitzPatrick DR, Rainger J. Robust Genetic Analysis of the X-Linked Anophthalmic (Ie) Mouse. Genes (Basel) 2022; 13:1797. [PMID: 36292683 PMCID: PMC9601528 DOI: 10.3390/genes13101797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Revised: 09/25/2022] [Accepted: 09/29/2022] [Indexed: 11/24/2022] Open
Abstract
Anophthalmia (missing eye) describes a failure of early embryonic ocular development. Mutations in a relatively small set of genes account for 75% of bilateral anophthalmia cases, yet 25% of families currently are left without a molecular diagnosis. Here, we report our experimental work that aimed to uncover the developmental and genetic basis of the anophthalmia characterising the X-linked Ie (eye-ear reduction) X-ray-induced allele in mouse that was first identified in 1947. Histological analysis of the embryonic phenotype showed failure of normal eye development after the optic vesicle stage with particularly severe malformation of the ventral retina. Linkage analysis mapped this mutation to a ~6 Mb region on the X chromosome. Short- and long-read whole-genome sequencing (WGS) of affected and unaffected male littermates confirmed the Ie linkage but identified no plausible causative variants or structural rearrangements. These analyses did reduce the critical candidate interval and revealed evidence of multiple variants within the ancestral DNA, although none were found that altered coding sequences or that were unique to Ie. To investigate early embryonic events at a genetic level, we then generated mouse ES cells derived from male Ie embryos and wild type littermates. RNA-seq and accessible chromatin sequencing (ATAC-seq) data generated from cultured optic vesicle organoids did not reveal any large differences in gene expression or accessibility of putative cis-regulatory elements between Ie and wild type. However, an unbiased TF-footprinting analysis of accessible chromatin regions did provide evidence of a genome-wide reduction in binding of transcription factors associated with ventral eye development in Ie, and evidence of an increase in binding of the Zic-family of transcription factors, including Zic3, which is located within the Ie-refined critical interval. 
We conclude that the refined Ie critical region at chrX: 56,145,000-58,385,000 contains multiple genetic variants that may be linked to altered cis regulation but does not contain a convincing causative mutation. Changes in the binding of key transcription factors to chromatin causing altered gene expression during development, possibly through a subtle mis-regulation of Zic3, presents a plausible cause for the anophthalmia phenotype observed in Ie, but further work is required to determine the precise causative allele and its genetic mechanism.
Affiliation(s)
- Brianda A. Hernandez-Moran
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Crewe Rd South, Edinburgh EH4 2XU, UK
- Andrew S. Papanastasiou
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Crewe Rd South, Edinburgh EH4 2XU, UK
- David Parry
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Crewe Rd South, Edinburgh EH4 2XU, UK
- Alison Meynert
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Crewe Rd South, Edinburgh EH4 2XU, UK
- Philippe Gautier
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Crewe Rd South, Edinburgh EH4 2XU, UK
- Graeme Grimes
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Crewe Rd South, Edinburgh EH4 2XU, UK
- Ian R. Adams
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Crewe Rd South, Edinburgh EH4 2XU, UK
- Violeta Trejo-Reveles
  - The Division of Functional Genetics and Development, The Roslin Institute, Midlothian EH25 9RG, UK
- Hemant Bengani
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Crewe Rd South, Edinburgh EH4 2XU, UK
- Margaret Keighren
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Crewe Rd South, Edinburgh EH4 2XU, UK
- Ian J. Jackson
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Crewe Rd South, Edinburgh EH4 2XU, UK
- David J. Adams
  - Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK
- David R. FitzPatrick
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Crewe Rd South, Edinburgh EH4 2XU, UK
- Joe Rainger
  - The Division of Functional Genetics and Development, The Roslin Institute, Midlothian EH25 9RG, UK
|
19
|
Cuenca-Guardiola J, de la Morena-Barrio B, García JL, Sanchis-Juan A, Corral J, Fernández-Breis JT. Improvement of large copy number variant detection by whole genome nanopore sequencing. J Adv Res 2022:S2090-1232(22)00241-7. [PMID: 36323370 PMCID: PMC10403694 DOI: 10.1016/j.jare.2022.10.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2022] [Revised: 10/18/2022] [Accepted: 10/22/2022] [Indexed: 11/11/2022] Open
Abstract
INTRODUCTION Whole-genome sequencing using nanopore technologies can uncover structural variants, which are DNA rearrangements larger than 50 base pairs. Nanopore technologies can also characterize their boundaries with single-base accuracy, owing to the kilobase-long reads that encompass either full variants or their junctions. Other methods, such as next-generation short read sequencing or PCR assays, are limited in their capabilities to detect or characterize structural variants. However, the existing software for nanopore sequencing data analysis still reports incomplete variant sets, which also contain erroneous calls, a considerable obstacle for the molecular diagnosis or accurate genotyping of populations. METHODS We compared multiple factors affecting variant calling, such as reference genome version, aligner (minimap2, NGMLR, and lra) choice, and variant caller combinations (Sniffles, CuteSV, SVIM, and NanoVar), to find the optimal group of tools for calling large (>50 kb) deletions and duplications, using data from seven patients exhibiting gross gene defects on SERPINC1 and from a reference variant set as the control. The goal was to obtain the most complete, yet reasonably specific group of large variants using a single cell of PromethION sequencing, which yielded lower depth coverage than short-read sequencing. We also used a custom method for the statistical analysis of the coverage value to refine the resulting datasets. RESULTS We found that for large deletions and duplications (>50 kb), the existing software performed worse than for smaller ones, in terms of both sensitivity and specificity, and newer tools had not improved this. Our novel software, disCoverage, could polish variant callers' results, improving specificity by up to 62% and sensitivity by 15%, the latter requiring other data or samples. CONCLUSION We analyzed the current situation of >50-kb copy number variants with nanopore sequencing, which could be improved. 
The methods presented in this work could help to identify the known deletions and duplications in a set of patients, while also helping to filter out erroneous calls for these variants, which might aid the efforts to characterize a not-yet well-known fraction of genetic variability in the human genome.
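The idea of refining large deletion and duplication calls with read-depth statistics can be illustrated with a rough Python sketch. This is not the disCoverage algorithm, which the abstract does not describe; the function name `classify_cnv`, the binned depth input, and the ratio thresholds are all assumptions chosen for illustration. The sketch compares the median depth inside a candidate interval with the median depth of its flanks: a heterozygous deletion is expected near a ratio of 0.5 and a heterozygous duplication near 1.5.

```python
from statistics import median


def classify_cnv(depth, start, end, del_max=0.75, dup_min=1.25):
    """Check a candidate CNV call against binned read depth.

    depth   : list of per-bin read depths along one chromosome (hypothetical input)
    start   : index of the first bin inside the call
    end     : index one past the last bin inside the call
    del_max : ratio at or below which the call looks deletion-like (assumed cutoff)
    dup_min : ratio at or above which the call looks duplication-like (assumed cutoff)
    """
    if not 0 <= start < end <= len(depth):
        raise ValueError("call must lie inside the depth track")
    width = end - start
    inside = depth[start:end]
    # Flanks of the same width on each side, clipped at the track boundaries.
    flanks = depth[max(0, start - width):start] + depth[end:end + width]
    if not flanks or median(flanks) == 0:
        return "uninformative"
    ratio = median(inside) / median(flanks)
    if ratio <= del_max:
        return "deletion-like"
    if ratio >= dup_min:
        return "duplication-like"
    return "unsupported"


# Toy depth track: 30x background with a 5-bin region at half depth.
toy_depth = [30] * 10 + [15] * 5 + [30] * 10
print(classify_cnv(toy_depth, 10, 15))
```

A call labelled "unsupported" by a check of this kind would be a candidate false positive; real data would also need GC and mappability correction, which this sketch omits.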
|
20
|
Hybrid metagenome assemblies link carbohydrate structure with function in the human gut microbiome. Commun Biol 2022; 5:932. [PMID: 36076058 PMCID: PMC9458734 DOI: 10.1038/s42003-022-03865-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/06/2021] [Accepted: 08/22/2022] [Indexed: 11/30/2022] Open
Abstract
Complex carbohydrates that escape small intestinal digestion are broken down in the large intestine by enzymes encoded by the gut microbiome. This is a symbiotic relationship between microbes and host, resulting in metabolic products that influence host health and are exploited by other microbes. However, the role of carbohydrate structure in directing microbiota community composition and the succession of carbohydrate-degrading microbes is not fully understood. In this study we evaluate species-level compositional variation within a single microbiome in response to six structurally distinct carbohydrates in a controlled model gut using hybrid metagenome assemblies. We identified 509 high-quality metagenome-assembled genomes (MAGs) belonging to ten bacterial classes and 28 bacterial families. Bacterial species identified as carrying genes encoding starch binding modules increased in abundance in response to starches. The use of hybrid metagenomics has allowed identification of several uncultured species with the functional potential to degrade starch substrates for future study. Longitudinal hybrid metagenomic analyses of a human stool sample reveal compositional and functional variation in response to six structurally distinct carbohydrates, providing insight into how gut bacteria utilize various carbohydrate sources.
|
21
|
Comparing the significance of the utilization of next generation and third generation sequencing technologies in microbial metagenomics. Microbiol Res 2022; 264:127154. [DOI: 10.1016/j.micres.2022.127154] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/03/2022] [Revised: 07/05/2022] [Accepted: 07/29/2022] [Indexed: 01/07/2023]
|
22
|
Lee Y, Ha U, Moon S. Ongoing endeavors to detect mobilization of transposable elements. BMB Rep 2022. [PMID: 35725016 PMCID: PMC9340088 DOI: 10.5483/bmbrep.2022.55.7.088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Abstract
Transposable elements (TEs) are DNA sequences capable of mobilization from one location to another in the genome. Since the discovery of the ‘Dissociation (Ds) locus’ by Barbara McClintock in maize (1), mounting evidence in the era of genomics indicates that a significant fraction of most eukaryotic genomes is composed of TE sequences, which are involved in various biological processes such as development, physiology, disease and evolution. Although technical advances in genomics have uncovered numerous functional impacts of TEs across species, our understanding of TEs is still incomplete because of challenges resulting from the complexity and abundance of TEs in the genome. In this mini-review, we briefly summarize the biology of TEs and their impacts on the host genome, emphasizing the importance of understanding the TE landscape in the genome. Then, we introduce recent endeavors, especially in vivo retrotransposition assays and long-read sequencing technology for identifying de novo insertions/TE polymorphisms, which will broaden our knowledge of the extraordinary relationship between genomic cohabitants and their host.
Affiliation(s)
- Yujeong Lee
  - Department of Biological Sciences, Kangwon National University, Chuncheon 24341, Korea
- Una Ha
  - Department of Biological Sciences, Kangwon National University, Chuncheon 24341, Korea
- Sungjin Moon
  - Department of Biological Sciences, Kangwon National University, Chuncheon 24341, Korea
|
23
|
Akbari V, Garant JM, O'Neill K, Pandoh P, Moore R, Marra MA, Hirst M, Jones SJM. Genome-wide detection of imprinted differentially methylated regions using nanopore sequencing. eLife 2022; 11:77898. [PMID: 35787786 PMCID: PMC9255983 DOI: 10.7554/elife.77898] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 02/15/2022] [Accepted: 06/16/2022] [Indexed: 01/02/2023] Open
Abstract
Imprinting is a critical part of normal embryonic development in mammals, controlled by defined parent-of-origin (PofO) differentially methylated regions (DMRs) known as imprinting control regions. Direct nanopore sequencing of DNA provides a means to detect allelic methylation and to overcome the drawbacks of methylation array and short-read technologies. Here, we used publicly available nanopore sequencing data for 12 standard B-lymphocyte cell lines to acquire the genome-wide mapping of imprinted intervals in humans. Using the sequencing data, we were able to phase 95% of the human methylome and detect 94% of the previously well-characterized, imprinted DMRs. In addition, we found 42 novel imprinted DMRs (16 germline and 26 somatic), which were confirmed using whole-genome bisulfite sequencing (WGBS) data. Analysis of WGBS data in mouse (Mus musculus), rhesus monkey (Macaca mulatta), and chimpanzee (Pan troglodytes) suggested that 17 of these imprinted DMRs are conserved. Some of the novel imprinted intervals are within or close to imprinted genes without a known DMR. We also detected subtle parental methylation bias, spanning several kilobases at seven known imprinted clusters. At these blocks, hypermethylation occurs at the gene body of expressed allele(s) with mutually exclusive H3K36me3 and H3K27me3 allelic histone marks. These results expand upon our current knowledge of imprinting and the potential of nanopore sequencing to identify imprinting regions using only parent-offspring trios, as opposed to the large multi-generational pedigrees that have previously been required.
Affiliation(s)
- Vahid Akbari
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada.,Department of Medical Genetics, University of British Columbia, Vancouver, Canada
| | - Jean-Michel Garant
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada
| | - Kieran O'Neill
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada
| | - Pawan Pandoh
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada
| | - Richard Moore
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada
| | - Marco A Marra
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada.,Department of Medical Genetics, University of British Columbia, Vancouver, Canada
| | - Martin Hirst
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada.,Department of Microbiology and Immunology, Michael Smith Laboratories, University of British Columbia, Vancouver, Canada
| | - Steven J M Jones
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada.,Department of Medical Genetics, University of British Columbia, Vancouver, Canada
| |
|
24
|
de la Morena-Barrio B, Stephens J, de la Morena-Barrio ME, Stefanucci L, Padilla J, Miñano A, Gleadall N, García JL, López-Fernández MF, Morange PE, Puurunen M, Undas A, Vidal F, Raymond FL, Vicente V, Ouwehand WH, Corral J, Sanchis-Juan A. Long-Read Sequencing Identifies the First Retrotransposon Insertion and Resolves Structural Variants Causing Antithrombin Deficiency. Thromb Haemost 2022; 122:1369-1378. [PMID: 35764313 PMCID: PMC9393088 DOI: 10.1055/s-0042-1749345] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 02/02/2023]
Abstract
The identification of inherited antithrombin deficiency (ATD) is critical to prevent potentially life-threatening thrombotic events. Causal variants in SERPINC1 are identified in up to 70% of cases, the majority being single-nucleotide variants and indels. The detection and characterization of structural variants (SVs) in ATD remain challenging due to the high number of repetitive elements in SERPINC1. Here, we performed long-read whole-genome sequencing on 10 familial and 9 singleton cases with type I ATD proven by functional and antigen assays, who were selected from a cohort of 340 patients with this rare disorder because genetic analyses were either negative, ambiguous, or not fully characterized. We developed an analysis workflow to identify disease-associated SVs. This approach resolved, independently of their size or type, all eight SVs detected by multiplex ligation-dependent probe amplification, and identified for the first time a complex rearrangement previously misclassified as a deletion. Remarkably, we identified the mechanism explaining ATD in 2 out of 11 cases with a previously unknown defect: the insertion of a novel 2.4 kb SINE-VNTR-Alu retroelement, which was characterized by de novo assembly and verified by specific polymerase chain reaction amplification and sequencing in the probands and affected relatives. The nucleotide-level resolution achieved for all SVs allowed breakpoint analysis, which revealed repetitive elements and microhomologies supporting a common replication-based mechanism for all the SVs. Our study underscores the utility of long-read sequencing technology as a complementary method to identify, characterize, and unveil the molecular mechanism of disease-causing SVs involved in ATD, and enlarges the catalogue of genetic disorders caused by retrotransposon insertions.
Affiliation(s)
- Belén de la Morena-Barrio
- Servicio de Hematología y Oncología Médica, Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, Instituto Murciano de Investigación Biosanitaria (IMIB-Arrixaca), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Universidad de Murcia, Murcia, Spain
| | - Jonathan Stephens
- Department of Haematology, NHS Blood and Transplant Centre, University of Cambridge, Cambridge, United Kingdom,NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, United Kingdom
| | - María Eugenia de la Morena-Barrio
- Servicio de Hematología y Oncología Médica, Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, Instituto Murciano de Investigación Biosanitaria (IMIB-Arrixaca), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Universidad de Murcia, Murcia, Spain
| | - Luca Stefanucci
- Department of Haematology, NHS Blood and Transplant Centre, University of Cambridge, Cambridge, United Kingdom,National Health Service Blood and Transplant (NHSBT), Cambridge Biomedical Campus, Cambridge, United Kingdom,BHF Centre of Excellence, Division of Cardiovascular Medicine, Addenbrooke's Hospital, Cambridge Biomedical Campus, Cambridge, United Kingdom
| | - José Padilla
- Servicio de Hematología y Oncología Médica, Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, Instituto Murciano de Investigación Biosanitaria (IMIB-Arrixaca), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Universidad de Murcia, Murcia, Spain
| | - Antonia Miñano
- Servicio de Hematología y Oncología Médica, Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, Instituto Murciano de Investigación Biosanitaria (IMIB-Arrixaca), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Universidad de Murcia, Murcia, Spain
| | - Nicholas Gleadall
- Department of Haematology, NHS Blood and Transplant Centre, University of Cambridge, Cambridge, United Kingdom,NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, United Kingdom
| | - Juan Luis García
- Servicio de Hematología, Hospital Universitario de Salamanca, Salamanca, Spain
| | - Pierre-Emmanuel Morange
- Laboratory of Haematology, La Timone Hospital, Marseille, France,C2VN, INRAE, INSERM, Aix-Marseille Université, Marseille, France
| | - Marja Puurunen
- The Framingham Heart Study, National Heart, Lung and Blood Institute, Framingham, Massachusetts, United States
| | - Anetta Undas
- Department of Experimental Cardiac Surgery, Anesthesiology and Cardiology, Institute of Cardiology, Jagiellonian University Medical College and John Paul II Hospital, Kraków, Poland
| | - Francisco Vidal
- Banc de Sang i Teixits, Barcelona, Spain,Vall d'Hebron Research Institute, Universitat Autònoma de Barcelona (VHIR-UAB), Barcelona, Spain,CIBER de Enfermedades Cardiovasculares, Madrid, Spain
| | - Frances Lucy Raymond
- NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, United Kingdom,Department of Medical Genetics, University of Cambridge, Cambridge Biomedical Campus, Cambridge, United Kingdom
| | - Vicente Vicente
- Servicio de Hematología y Oncología Médica, Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, Instituto Murciano de Investigación Biosanitaria (IMIB-Arrixaca), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Universidad de Murcia, Murcia, Spain
| | - Willem H. Ouwehand
- Department of Haematology, NHS Blood and Transplant Centre, University of Cambridge, Cambridge, United Kingdom,NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, United Kingdom
| | - Javier Corral
- Servicio de Hematología y Oncología Médica, Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, Instituto Murciano de Investigación Biosanitaria (IMIB-Arrixaca), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Universidad de Murcia, Murcia, Spain; Javier Corral, University of Murcia, Centro Regional de Hemodonación, Calle Ronda de Garay s/n, Murcia 30003, Spain
| | - Alba Sanchis-Juan
- Department of Haematology, NHS Blood and Transplant Centre, University of Cambridge, Cambridge, United Kingdom,NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, United Kingdom; Address for correspondence: Alba Sanchis-Juan, University of Cambridge, Department of Haematology, NHS Blood and Transplant Centre, Cambridge, CB2 0PT, United Kingdom
| |
|
25
|
Xing L, Shen Y, Wei X, Luo Y, Yang Y, Liu H, Liu H. Long-read Oxford nanopore sequencing reveals a de novo case of complex chromosomal rearrangement involving chromosomes 2, 7, and 13. Mol Genet Genomic Med 2022; 10:e2011. [PMID: 35758276 PMCID: PMC9482406 DOI: 10.1002/mgg3.2011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/17/2022] [Revised: 06/13/2022] [Accepted: 06/15/2022] [Indexed: 12/21/2022]
Abstract
Background: Complex chromosomal rearrangements (CCRs) are associated with high reproductive risk, infertility, abnormalities in offspring, and recurrent miscarriage in women. It is essential to accurately characterize apparently balanced chromosome rearrangements in unaffected individuals. Methods: A young couple with a CCR in one partner, who had suffered two spontaneous abortions and undergone labor induction because of fetal chromosomal abnormalities, was studied using long‐read sequencing (LRS), single‐nucleotide polymorphism (SNP) array, G‐banding karyotype analysis (550‐band resolution), and Sanger sequencing. Results: SNP analysis of the amniotic fluid cells during the third pregnancy revealed a 9.9‐Mb duplication at 7q21.11q21.2 and a 24.8‐Mb heterozygous deletion at 13q21.1q31.1. The unaffected female partner was a carrier of a three‐way CCR [46,XX,? ins(7;13)(q21.1;q21.1q22)t(2;13)(p23;q22)]. Subsequent LRS analysis revealed the exact breakpoint locations on the derivative chromosomes and the specific mode of chromosome rearrangement, indicating that the carrier's CCR was a more complex structural rearrangement comprising five breakpoints. Furthermore, LRS detected an inserted fragment of chromosome 13 in chromosome 7. Conclusions: LRS is effective for analyzing the complex structural variations of the human genome and may be used to clarify specific CCRs for effective genetic counseling and appropriate intervention.
Affiliation(s)
- Lingling Xing
- Department of Obstetrics and Gynaecology, West China Second University Hospital, Sichuan University, Chengdu, China.,Key Laboratory of Birth Defects and Related Diseases of Women and Children, Ministry of Education, Sichuan University, Chengdu, China
| | - Ying Shen
- Department of Obstetrics and Gynaecology, West China Second University Hospital, Sichuan University, Chengdu, China.,Key Laboratory of Birth Defects and Related Diseases of Women and Children, Ministry of Education, Sichuan University, Chengdu, China
| | - Xiang Wei
- Department of Obstetrics and Gynaecology, West China Second University Hospital, Sichuan University, Chengdu, China.,Key Laboratory of Birth Defects and Related Diseases of Women and Children, Ministry of Education, Sichuan University, Chengdu, China
| | - Yuan Luo
- Department of Obstetrics and Gynaecology, West China Second University Hospital, Sichuan University, Chengdu, China.,Key Laboratory of Birth Defects and Related Diseases of Women and Children, Ministry of Education, Sichuan University, Chengdu, China
| | - Yan Yang
- Department of Obstetrics and Gynaecology, West China Second University Hospital, Sichuan University, Chengdu, China.,Key Laboratory of Birth Defects and Related Diseases of Women and Children, Ministry of Education, Sichuan University, Chengdu, China
| | - Haipeng Liu
- Department of Obstetrics and Gynaecology, West China Second University Hospital, Sichuan University, Chengdu, China.,Key Laboratory of Birth Defects and Related Diseases of Women and Children, Ministry of Education, Sichuan University, Chengdu, China
| | - Hongqian Liu
- Department of Obstetrics and Gynaecology, West China Second University Hospital, Sichuan University, Chengdu, China.,Key Laboratory of Birth Defects and Related Diseases of Women and Children, Ministry of Education, Sichuan University, Chengdu, China
| |
|
26
|
Whitford W, Hawkins V, Moodley KS, Grant MJ, Lehnert K, Snell RG, Jacobsen JC. Proof of concept for multiplex amplicon sequencing for mutation identification using the MinION nanopore sequencer. Sci Rep 2022; 12:8572. [PMID: 35595858 PMCID: PMC9122479 DOI: 10.1038/s41598-022-12613-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/13/2021] [Accepted: 05/04/2022] [Indexed: 12/22/2022]
Abstract
Rapid, cost-effective identification of genetic variants in small candidate genomic regions remains a challenge, particularly for less well-equipped or lower-throughput laboratories. The application of Oxford Nanopore Technologies’ MinION sequencer has the potential to fulfil this requirement. We demonstrate a proof of concept for a multiplexing assay that pools PCR amplicons for MinION sequencing to enable sequencing of multiple templates from multiple individuals, which could be applied to gene-targeted diagnostics. A combined strategy of barcoding and sample pooling was developed for simultaneous multiplex MinION sequencing of 100 PCR amplicons. The amplicons are family-specific, spanning a total of 30 loci in DNA isolated from 82 human neurodevelopmental cases and family members. The target regions were chosen for further interrogation because a potentially disease-causative variant had been identified in affected individuals following Illumina exome sequencing. The pooled MinION sequences were deconvoluted by aligning to custom references using the minimap2 aligner software. Our multiplexing approach produced an interpretable and expected sequence from 29 of the 30 targeted genetic loci. The sequence variant which was not correctly resolved in the MinION sequence was adjacent to a five-nucleotide homopolymer. It is already known that homopolymers present a resolution problem with the MinION approach. Interestingly, despite equimolar quantities of PCR amplicon being pooled for sequencing, significant variation in the depth of coverage (127×–19,626×; mean = 8321×, std err = 452.99) was observed. We observed independent relationships between depth of coverage and target length, and depth of coverage and GC content. These relationships demonstrate biases of the MinION sequencer for longer templates and those with lower GC content. We demonstrate an efficient approach for variant discovery or confirmation from short DNA templates using the MinION sequencing device.
With less than 130× depth of coverage required for accurate genotyping, the methodology described here allows for rapid, highly multiplexed targeted sequencing of large numbers of samples in a minimally equipped laboratory, with a potential cost as much as 200× less than that of Sanger sequencing.
Affiliation(s)
- Whitney Whitford
- School of Biological Sciences, The University of Auckland, Private Bag 92019, Auckland, 1142, New Zealand.,Centre for Brain Research, The University of Auckland, Auckland, New Zealand
| | - Victoria Hawkins
- School of Biological Sciences, The University of Auckland, Private Bag 92019, Auckland, 1142, New Zealand.,Centre for Brain Research, The University of Auckland, Auckland, New Zealand
| | - Kriebashne S Moodley
- School of Biological Sciences, The University of Auckland, Private Bag 92019, Auckland, 1142, New Zealand.,Centre for Brain Research, The University of Auckland, Auckland, New Zealand
| | - Matthew J Grant
- School of Biological Sciences, The University of Auckland, Private Bag 92019, Auckland, 1142, New Zealand.,Centre for Brain Research, The University of Auckland, Auckland, New Zealand
| | - Klaus Lehnert
- School of Biological Sciences, The University of Auckland, Private Bag 92019, Auckland, 1142, New Zealand.,Centre for Brain Research, The University of Auckland, Auckland, New Zealand
| | - Russell G Snell
- School of Biological Sciences, The University of Auckland, Private Bag 92019, Auckland, 1142, New Zealand.,Centre for Brain Research, The University of Auckland, Auckland, New Zealand
| | - Jessie C Jacobsen
- School of Biological Sciences, The University of Auckland, Private Bag 92019, Auckland, 1142, New Zealand.,Centre for Brain Research, The University of Auckland, Auckland, New Zealand
| |
|
27
|
Kroll F, Dimitriadis A, Campbell T, Darwent L, Collinge J, Mead S, Vire E. Prion protein gene mutation detection using long-read Nanopore sequencing. Sci Rep 2022; 12:8284. [PMID: 35585119 PMCID: PMC9117325 DOI: 10.1038/s41598-022-12130-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/17/2022] [Accepted: 05/05/2022] [Indexed: 01/04/2023]
Abstract
Prion diseases are fatal neurodegenerative conditions that affect humans and animals. Rapid and accurate sequencing of the prion gene PRNP is paramount to human prion disease diagnosis and for animal surveillance programmes. Current methods for PRNP genotyping involve sequencing of small fragments within the protein-coding region. The contribution of variants in the non-coding regions of PRNP including large structural changes is poorly understood. Here, we used long-range PCR and Nanopore sequencing to sequence the full length of PRNP, including its regulatory region, in 25 samples from blood and brain of individuals with inherited or sporadic prion diseases. Nanopore sequencing detected the same variants as identified by Sanger sequencing, including repeat expansions/deletions. Nanopore identified additional single-nucleotide variants in the non-coding regions of PRNP, but no novel structural variants were discovered. Finally, we explored somatic mosaicism of PRNP's octapeptide repeat region, which is a hypothetical cause of sporadic prion disease. While we found changes consistent with somatic mutations, we demonstrate that they may have been generated by the PCR. Our study illustrates the accuracy of Nanopore sequencing for rapid and field prion disease diagnosis and highlights the need for single-molecule sequencing methods for the detection of somatic mutations.
Affiliation(s)
- François Kroll
- MRC Prion Unit at University College London (UCL), UCL Institute of Prion Diseases, UCL, London, W1W 7FF, UK
| | - Athanasios Dimitriadis
- MRC Prion Unit at University College London (UCL), UCL Institute of Prion Diseases, UCL, London, W1W 7FF, UK
| | - Tracy Campbell
- MRC Prion Unit at University College London (UCL), UCL Institute of Prion Diseases, UCL, London, W1W 7FF, UK
| | - Lee Darwent
- MRC Prion Unit at University College London (UCL), UCL Institute of Prion Diseases, UCL, London, W1W 7FF, UK
| | - John Collinge
- MRC Prion Unit at University College London (UCL), UCL Institute of Prion Diseases, UCL, London, W1W 7FF, UK
| | - Simon Mead
- MRC Prion Unit at University College London (UCL), UCL Institute of Prion Diseases, UCL, London, W1W 7FF, UK.
| | - Emmanuelle Vire
- MRC Prion Unit at University College London (UCL), UCL Institute of Prion Diseases, UCL, London, W1W 7FF, UK
| |
|
28
|
Liu YH, Chou YT, Chang FP, Lee WJ, Guo YC, Chou CT, Huang HC, Mizuguchi T, Chou CC, Yu HY, Yu KW, Wu HM, Tsai PC, Matsumoto N, Lee YC, Liao YC. Neuronal intranuclear inclusion disease in patients with adult-onset non-vascular leukoencephalopathy. Brain 2022; 145:3010-3021. [PMID: 35411397 DOI: 10.1093/brain/awac135] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 02/18/2022] [Revised: 03/24/2022] [Accepted: 03/27/2022] [Indexed: 11/12/2022]
Abstract
Neuronal intranuclear inclusion disease (NIID), caused by an expansion of GGC repeats in the 5'-untranslated region of NOTCH2NLC, is an important but underdiagnosed cause of adult-onset leukoencephalopathies. The present study aimed to investigate the prevalence, clinical spectrum, and brain MRI characteristics of NIID in adult-onset nonvascular leukoencephalopathies and assess the diagnostic performance of neuroimaging features. One hundred and sixty-one unrelated Taiwanese patients with genetically undetermined nonvascular leukoencephalopathies were screened for the NOTCH2NLC GGC repeat expansions using fragment analysis, repeat-primed PCR, Southern blot analysis and/or nanopore sequencing with Cas9-mediated enrichment. Among them, 32 (19.9%) patients had an expanded NOTCH2NLC allele and were diagnosed with NIID. We enrolled two additional affected family members of one patient for further analysis. The size of the expanded NOTCH2NLC GGC repeats in the 34 patients ranged from 73 to 323 repeats. Skin biopsies from five patients all showed eosinophilic, p62-positive intranuclear inclusions in the sweat gland cells and dermal adipocytes. Among the 34 NIID patients presenting with nonvascular leukoencephalopathies, the median age at symptom onset was 61 years (range, 41-78 years) and the initial presentations included cognitive decline (44.1%; 15/34), acute encephalitis-like episodes (32.4%; 11/34), limb weakness (11.8%; 4/34), and parkinsonism (11.8%; 4/34). Cognitive decline (64.7%; 22/34) and acute encephalitis-like episodes (55.9%; 19/34) were also the most common overall manifestations. Two-thirds of the patients had either bladder dysfunction or visual disturbance. Comparing the brain MRI features between the NIID patients and individuals with other undetermined leukoencephalopathies, a curvilinear lesion at the corticomedullary junction on diffusion-weighted imaging (DWI) was the best biomarker for diagnosing NIID, with high specificity (98.4%) and sensitivity (88.2%).
However, this DWI abnormality was absent in 11.8% of the NIID patients. When only fluid-attenuated inversion recovery images were available, the presence of white matter hyperintensity lesions (WMH) in either the paravermis or the middle cerebellar peduncles also favored the diagnosis of NIID, with a specificity of 85.3% and a sensitivity of 76.5%. Among the ten patients whose MRI was performed within 5 days of the onset of acute encephalitis-like episodes, five showed cortical DWI hyperintense lesions and two revealed focal brain edema. In conclusion, NIID accounts for 19.9% (32/161) of patients with adult-onset genetically undiagnosed nonvascular leukoencephalopathies in Taiwan. Half of the NIID patients had at some point developed encephalitis-like episodes with restricted diffusion in the cortical regions on acute-stage DWI. Corticomedullary junction hyperintense lesions, WMH in the paravermis or middle cerebellar peduncles, bladder dysfunction and visual disturbance are useful hints for diagnosing NIID.
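The proportions quoted in this abstract can be re-derived from the counts it gives. The following is a minimal Python sketch of that arithmetic; the dictionary labels and the helper name `pct` are illustrative choices and not part of the study.

```python
# Counts and percentages as quoted in the abstract above.
# Labels and the helper name are illustrative, not from the study.
reported = {
    "NIID among screened patients": (32, 161, 19.9),
    "cognitive decline at onset": (15, 34, 44.1),
    "encephalitis-like episodes at onset": (11, 34, 32.4),
    "limb weakness at onset": (4, 34, 11.8),
    "parkinsonism at onset": (4, 34, 11.8),
    "cognitive decline overall": (22, 34, 64.7),
    "encephalitis-like episodes overall": (19, 34, 55.9),
}


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as in the abstract."""
    return round(100 * numerator / denominator, 1)


# Collect any entry whose recomputed percentage differs from the quoted one.
mismatches = {
    label: pct(n, d)
    for label, (n, d, quoted) in reported.items()
    if pct(n, d) != quoted
}

for label, (n, d, quoted) in reported.items():
    print(f"{label}: {n}/{d} = {pct(n, d)}% (quoted {quoted}%)")
```

Every quoted percentage matches its fraction at one decimal place, so `mismatches` stays empty.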
Affiliation(s)
- Yi-Hong Liu
- Department of Neurology, Taipei Veterans General Hospital, Taipei 11217, Taiwan
| | - Ying-Tsen Chou
- Department of Neurology, Taipei Veterans General Hospital, Taipei 11217, Taiwan
| | - Fu-Pang Chang
- Department of Pathology and Laboratory Medicine, Taipei Veterans General Hospital, Taipei 11217, Taiwan.,Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei 11221, Taiwan
| | - Wei-Ju Lee
- Neurological Institute, Taichung Veterans General Hospital, Taichung 40705, Taiwan.,Faculty of Medicine, School of Medicine, National Yang Ming Chiao Tung University, Taipei 11221, Taiwan.,College of Medicine, National Chung Hsing University, Taichung 40227, Taiwan
| | - Yuh-Cherng Guo
- Department of Neurology, China Medical University Hospital, Taichung 404332, Taiwan.,School of Medicine, College of Medicine, China Medical University, Taichung 404333, Taiwan
| | - Cheng-Ta Chou
- Neurological Institute, Taichung Veterans General Hospital, Taichung 40705, Taiwan.,Rong Hsing Research Center for Translational Medicine, National Chung Hsing University, Taichung 40227, Taiwan
| | - Hui-Chun Huang
- Department of Neurology, China Medical University Hospital, Taichung 404332, Taiwan.,School of Medicine, College of Medicine, China Medical University, Taichung 404333, Taiwan
| | - Takeshi Mizuguchi
- Yokohama City University Graduate School of Medicine, Yokohama 236-0004, Japan
| | - Chien-Chen Chou
- Department of Neurology, Taipei Veterans General Hospital, Taipei 11217, Taiwan.,Faculty of Medicine, School of Medicine, National Yang Ming Chiao Tung University, Taipei 11221, Taiwan.,Brain Research Center, National Yang Ming Chiao Tung University, Taipei 11221, Taiwan
| | - Hsiang-Yu Yu
- Department of Neurology, Taipei Veterans General Hospital, Taipei 11217, Taiwan.,Faculty of Medicine, School of Medicine, National Yang Ming Chiao Tung University, Taipei 11221, Taiwan.,Brain Research Center, National Yang Ming Chiao Tung University, Taipei 11221, Taiwan
| | - Kai-Wei Yu
- Faculty of Medicine, School of Medicine, National Yang Ming Chiao Tung University, Taipei 11221, Taiwan.,Brain Research Center, National Yang Ming Chiao Tung University, Taipei 11221, Taiwan.,Department of Radiology, Taipei Veterans General Hospital, Taipei 11217, Taiwan
| | - Hsiu-Mei Wu
- Faculty of Medicine, School of Medicine, National Yang Ming Chiao Tung University, Taipei 11221, Taiwan.,Brain Research Center, National Yang Ming Chiao Tung University, Taipei 11221, Taiwan.,Department of Radiology, Taipei Veterans General Hospital, Taipei 11217, Taiwan
| | - Pei-Chien Tsai
- Department of Life Sciences, National Chung Hsing University, Taichung 40227, Taiwan
| | - Naomichi Matsumoto
- Yokohama City University Graduate School of Medicine, Yokohama 236-0004, Japan
| | - Yi-Chung Lee
- Department of Neurology, Taipei Veterans General Hospital, Taipei 11217, Taiwan.,Faculty of Medicine, School of Medicine, National Yang Ming Chiao Tung University, Taipei 11221, Taiwan.,Brain Research Center, National Yang Ming Chiao Tung University, Taipei 11221, Taiwan
| | - Yi-Chu Liao
- Department of Neurology, Taipei Veterans General Hospital, Taipei 11217, Taiwan.,Faculty of Medicine, School of Medicine, National Yang Ming Chiao Tung University, Taipei 11221, Taiwan.,Brain Research Center, National Yang Ming Chiao Tung University, Taipei 11221, Taiwan
| |
|
29
|
Stefan CP, Hall AT, Graham AS, Minogue TD. Comparison of Illumina and Oxford Nanopore Sequencing Technologies for Pathogen Detection from Clinical Matrices Using Molecular Inversion Probes. J Mol Diagn 2022; 24:395-405. [PMID: 35085783 DOI: 10.1016/j.jmoldx.2021.12.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/24/2021] [Revised: 11/19/2021] [Accepted: 12/22/2021] [Indexed: 11/16/2022]
Abstract
Next-generation sequencing is rapidly finding footholds in numerous microbiological fields, including infectious disease diagnostics. Here, we describe a molecular inversion probe panel for the identification of bacterial, viral, and parasitic pathogens. We describe the ability of Illumina and Oxford Nanopore Technologies (ONT) to sequence small amplicons originating from this panel for the identification of pathogens in complex matrices. The panel correctly classified 31 bacterial pathogens directly from positive blood culture bottles with a genus-level concordance of 96.7% and 90.3% on the Illumina and ONT platforms, respectively. Both sequencing platforms detected 18 viral and parasitic organisms directly from mock clinical samples of plasma and whole blood at concentrations of 10^4 PFU/mL with few exceptions. In general, Illumina sequencing exhibited greater read counts with a lower percentage of mapped reads; however, this had no effect on limits of detection compared with ONT sequencing. Mock clinical evaluation of the probe panel on the Illumina and ONT platforms resulted in positive predictive values of 0.91 and 0.88, respectively, and negative predictive values of 1 on both platforms, from de-identified human chikungunya virus samples compared with gold standard quantitative RT-PCR. Overall, these data show that molecular inversion probes are an adaptable technology capable of pathogen detection from complex sample matrices on current next-generation sequencing platforms.
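For readers less familiar with the metrics quoted above, positive and negative predictive values follow directly from a confusion matrix built against the reference assay. A minimal Python sketch of the definitions; the function name and the counts are hypothetical and are not taken from the study.

```python
def predictive_values(tp: int, fp: int, tn: int, fn: int) -> tuple:
    """Return (PPV, NPV) from confusion-matrix counts against a reference test."""
    ppv = tp / (tp + fp)  # fraction of positive calls confirmed by the reference
    npv = tn / (tn + fn)  # fraction of negative calls confirmed by the reference
    return ppv, npv


# Hypothetical counts for illustration only (not data from the paper).
ppv, npv = predictive_values(tp=9, fp=1, tn=10, fn=0)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

A negative predictive value of 1, as reported for both platforms, corresponds to `fn == 0`: no sample called negative by the panel was positive by the reference method.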
Affiliation(s)
- Christopher P Stefan
- Diagnostic Systems Division, United States Army Medical Research Institute of Infectious Disease, Fort Detrick, Maryland
| | - Adrienne T Hall
- Diagnostic Systems Division, United States Army Medical Research Institute of Infectious Disease, Fort Detrick, Maryland
| | - Amanda S Graham
- Diagnostic Systems Division, United States Army Medical Research Institute of Infectious Disease, Fort Detrick, Maryland
| | - Timothy D Minogue
- Diagnostic Systems Division, United States Army Medical Research Institute of Infectious Disease, Fort Detrick, Maryland.
| |
|
30
|
Hamdan A, Ewing A. Unravelling the tumour genome: the evolutionary and clinical impacts of structural variants in tumourigenesis. J Pathol 2022; 257:479-493. [PMID: 35355264 PMCID: PMC9321913 DOI: 10.1002/path.5901] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/21/2022] [Revised: 03/16/2022] [Accepted: 03/28/2022] [Indexed: 11/15/2022]
Abstract
Structural variants (SVs) represent a major source of aberration in tumour genomes. Given the diversity in the size and type of SVs present in tumours, the accurate detection and interpretation of SVs in tumours is challenging. New classes of complex structural events in tumours are discovered frequently, and the definitions of the genomic consequences of complex events are constantly being refined. Detailed analyses of short‐read whole‐genome sequencing (WGS) data from large tumour cohorts facilitate the interrogation of SVs at orders of magnitude greater scale and depth. However, the inherent technical limitations of short‐read WGS prevent us from accurately detecting and investigating the impact of all the SVs present in tumours. The expanded use of long‐read WGS will be critical for improving the accuracy of SV detection, and in fully resolving complex SV events, both of which are crucial for determining the impact of SVs on tumour progression and clinical outcome. Despite the present limitations, we demonstrate that SVs play an important role in tumourigenesis. In particular, SVs contribute significantly to late‐stage tumour development and to intratumoural heterogeneity. The evolutionary trajectories of SVs represent a window into the clonal dynamics in tumours, a comprehensive understanding of which will be vital for influencing patient outcomes in the future. Recent findings have highlighted many clinical applications of SVs in cancer, from early detection to biomarkers for treatment response and prognosis. As the methods to detect and interpret SVs improve, elucidating the full breadth of the complex SV landscape and determining how these events modulate tumour evolution will improve our understanding of cancer biology and our ability to capitalise on the utility of SVs in the clinical management of cancer patients. © 2022 The Authors. 
The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
Affiliation(s)
- Alhafidz Hamdan
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK.,Cancer Research UK Edinburgh Centre, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Ailith Ewing
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK.,Cancer Research UK Edinburgh Centre, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
31
Czmil A, Wronski M, Czmil S, Sochacka-Pietal M, Cmil M, Gawor J, Wołkowicz T, Plewczynski D, Strzalka D, Pietal M. NanoForms: an integrated server for processing, analysis and assembly of raw sequencing data of microbial genomes, from Oxford Nanopore technology. PeerJ 2022; 10:e13056. [PMID: 35368340 PMCID: PMC8973472 DOI: 10.7717/peerj.13056] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/23/2021] [Accepted: 02/13/2022] [Indexed: 01/11/2023]
Abstract
Background: Next Generation Sequencing (NGS) techniques dominate today's landscape of genetics and genomics research. Though Illumina still dominates worldwide sequencing, Oxford Nanopore is one of the leading technologies currently being used by biologists, medics and geneticists across various applications. Oxford Nanopore is automated and relatively simple for conducting experiments, but generates gigabytes of raw data, to be processed by an often ambiguous set of alternative bioinformatics command-line tools and genomics frameworks, which require a knowledge of bioinformatics to run. Results: We established an inter-collegiate collaboration across experimentalists and bioinformaticians in order to provide a novel bioinformatics tool, free for academics. This tool allows people without extensive bioinformatics knowledge to simply process their raw genome sequencing data. Currently, due to ICT resources' maintenance reasons, our server is only capable of handling small genomes (up to 15 Mb). In this paper, we introduce our tool, NanoForms: an intuitive and integrated web server for the processing and analysis of raw prokaryotic genome data, coming from Oxford Nanopore. NanoForms is freely available for academics at the following locations: http://nanoforms.tech (webserver) and https://github.com/czmilanna/nanoforms (GitHub source repository).
Affiliation(s)
- Anna Czmil
- Department of Complex Systems, Rzeszow University of Technology, Rzeszow, Subcarpathian, Poland
- Michal Wronski
- Department of Complex Systems, Rzeszow University of Technology, Rzeszow, Subcarpathian, Poland
- Sylwester Czmil
- Department of Complex Systems, Rzeszow University of Technology, Rzeszow, Subcarpathian, Poland
- Marta Sochacka-Pietal
- Department of Biotechnology and Bioinformatics, Rzeszow University of Technology, Rzeszow, Subcarpathian, Poland
- Michal Cmil
- Department of Complex Systems, Rzeszow University of Technology, Rzeszow, Subcarpathian, Poland
- Jan Gawor
- DNA Sequencing and Oligonucleotide Synthesis Laboratory, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, Masovian, Poland
- Tomasz Wołkowicz
- Department of Bacteriology and Biocontamination Control, National Institute of Public Health-National Institute of Hygiene, Warsaw, Masovian, Poland
- Dariusz Plewczynski
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Warsaw, Masovian, Poland,Laboratory of Bioinformatics and Computational Genomics, Warsaw University of Technology, Warsaw, Masovian, Poland
- Dominik Strzalka
- Department of Complex Systems, Rzeszow University of Technology, Rzeszow, Subcarpathian, Poland
- Michal Pietal
- Department of Complex Systems, Rzeszow University of Technology, Rzeszow, Subcarpathian, Poland
32
Lemay MA, Sibbesen JA, Torkamaneh D, Hamel J, Levesque RC, Belzile F. Combined use of Oxford Nanopore and Illumina sequencing yields insights into soybean structural variation biology. BMC Biol 2022; 20:53. [PMID: 35197050 PMCID: PMC8867729 DOI: 10.1186/s12915-022-01255-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/25/2021] [Accepted: 02/16/2022] [Indexed: 12/31/2022]
Abstract
BACKGROUND Structural variants (SVs), including deletions, insertions, duplications, and inversions, are relatively long genomic variations implicated in a diverse range of processes from human disease to ecology and evolution. Given their complex signatures, tendency to occur in repeated regions, and large size, discovering SVs based on short reads is challenging compared to single-nucleotide variants. The increasing availability of long-read technologies has greatly facilitated SV discovery; however, these technologies remain too costly to apply routinely to population-level studies. Here, we combined short-read and long-read sequencing technologies to provide a comprehensive population-scale assessment of structural variation in a panel of Canadian soybean cultivars. RESULTS We used Oxford Nanopore long-read sequencing data (~12× mean coverage) for 17 samples to both benchmark SV calls made from Illumina short-read data and predict SVs that were subsequently genotyped in a population of 102 samples using Illumina data. Benchmarking results show that variants discovered using Oxford Nanopore can be accurately genotyped from the Illumina data. We first use the genotyped deletions and insertions for population genetics analyses and show that results are comparable to those based on single-nucleotide variants. We observe that the population frequency and distribution within the genome of deletions and insertions are constrained by the location of genes. Gene Ontology and PFAM domain enrichment analyses also confirm previous reports that genes harboring high-frequency deletions and insertions are enriched for functions in defense response. Finally, we discover polymorphic transposable elements from the deletions and insertions and report evidence of the recent activity of a Stowaway MITE. CONCLUSIONS We show that structural variants discovered using Oxford Nanopore data can be genotyped with high accuracy from Illumina data. 
Our results demonstrate that long-read and short-read sequencing technologies can be efficiently combined to enhance SV analysis in large populations, providing a reusable framework for their study in a wider range of samples and non-model species.
Affiliation(s)
- Marc-André Lemay
- Département de phytologie, Université Laval, Quebec, Canada
- Institut de biologie intégrative et des systèmes, Université Laval, Quebec, Canada
- Davoud Torkamaneh
- Département de phytologie, Université Laval, Quebec, Canada
- Institut de biologie intégrative et des systèmes, Université Laval, Quebec, Canada
- Jérémie Hamel
- Institut de biologie intégrative et des systèmes, Université Laval, Quebec, Canada
- Département de microbiologie-infectiologie et d’immunologie, Université Laval, Quebec, Canada
- Roger C. Levesque
- Institut de biologie intégrative et des systèmes, Université Laval, Quebec, Canada
- Département de microbiologie-infectiologie et d’immunologie, Université Laval, Quebec, Canada
- François Belzile
- Département de phytologie, Université Laval, Quebec, Canada
- Institut de biologie intégrative et des systèmes, Université Laval, Quebec, Canada
33
Lüth T, Laß J, Schaake S, Wohlers I, Pozojevic J, Jamora RDG, Rosales RL, Brüggemann N, Saranza G, Diesta CCE, Schlüter K, Tse R, Reyes CJ, Brand M, Busch H, Klein C, Westenberger A, Trinh J. Elucidating Hexanucleotide Repeat Number and Methylation within the X-Linked Dystonia-Parkinsonism (XDP)-Related SVA Retrotransposon in TAF1 with Nanopore Sequencing. Genes (Basel) 2022; 13:genes13010126. [PMID: 35052466 PMCID: PMC8775018 DOI: 10.3390/genes13010126] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/17/2021] [Revised: 01/05/2022] [Accepted: 01/07/2022] [Indexed: 12/13/2022]
Abstract
Background: X-linked dystonia-parkinsonism (XDP) is an adult-onset neurodegenerative disorder characterized by progressive dystonia and parkinsonism. It is caused by a SINE-VNTR-Alu (SVA) retrotransposon insertion in the TAF1 gene with a polymorphic (CCCTCT)n domain that acts as a genetic modifier of disease onset and expressivity. Methods: Herein, we used Nanopore sequencing to investigate SVA genetic variability and methylation. We used blood-derived DNA from 96 XDP patients for amplicon-based deep Nanopore sequencing and validated it with fragment analysis which was performed using fluorescence-based PCR. To detect methylation from blood- and brain-derived DNA, we used a Cas9-targeted approach. Results: High concordance was observed for hexanucleotide repeat numbers detected with Nanopore sequencing and fragment analysis. Within the SVA locus, there was no difference in genetic variability other than variations of the repeat motif between patients. We detected high CpG methylation frequency (MF) of the SVA and flanking regions (mean MF = 0.94, SD = ±0.12). Our preliminary results suggest only subtle differences between the XDP patient and the control in predicted enhancer sites directly flanking the SVA locus. Conclusions: Nanopore sequencing can reliably detect SVA hexanucleotide repeat numbers, methylation and, lastly, variation in the repeat motif.
Affiliation(s)
- Theresa Lüth
- Institute of Neurogenetics, University of Luebeck, 23538 Luebeck, Germany; (T.L.); (J.L.); (S.S.); (J.P.); (N.B.); (K.S.); (R.T.); (C.J.R.); (M.B.); (C.K.); (A.W.)
- Joshua Laß
- Institute of Neurogenetics, University of Luebeck, 23538 Luebeck, Germany; (T.L.); (J.L.); (S.S.); (J.P.); (N.B.); (K.S.); (R.T.); (C.J.R.); (M.B.); (C.K.); (A.W.)
- Susen Schaake
- Institute of Neurogenetics, University of Luebeck, 23538 Luebeck, Germany; (T.L.); (J.L.); (S.S.); (J.P.); (N.B.); (K.S.); (R.T.); (C.J.R.); (M.B.); (C.K.); (A.W.)
- Inken Wohlers
- Medical Systems Biology Division, Luebeck Institute of Experimental Dermatology, University of Luebeck, 23538 Luebeck, Germany; (I.W.); (H.B.)
- Institute for Cardiogenetics, University of Luebeck, 23538 Luebeck, Germany
- Jelena Pozojevic
- Institute of Neurogenetics, University of Luebeck, 23538 Luebeck, Germany; (T.L.); (J.L.); (S.S.); (J.P.); (N.B.); (K.S.); (R.T.); (C.J.R.); (M.B.); (C.K.); (A.W.)
- Roland Dominic G. Jamora
- Department of Neurosciences, College of Medicine, Philippine General Hospital, University of the Philippines Manila, Manila 1000, Philippines;
- Raymond L. Rosales
- Department of Neurology and Psychiatry, The Hospital Neuroscience Institute, University of Santo Tomas, Manila 1008, Philippines;
- Norbert Brüggemann
- Institute of Neurogenetics, University of Luebeck, 23538 Luebeck, Germany; (T.L.); (J.L.); (S.S.); (J.P.); (N.B.); (K.S.); (R.T.); (C.J.R.); (M.B.); (C.K.); (A.W.)
- Department of Neurology, University of Luebeck, 23538 Luebeck, Germany
- Gerard Saranza
- Section of Neurology, Department of Internal Medicine, Chong Hua Hospital, Cebu City 6000, Philippines;
- Cid Czarina E. Diesta
- Department of Neurosciences, Movement Disorders Clinic, Makati Medical Center, Makati 1229, Philippines;
- Kathleen Schlüter
- Institute of Neurogenetics, University of Luebeck, 23538 Luebeck, Germany; (T.L.); (J.L.); (S.S.); (J.P.); (N.B.); (K.S.); (R.T.); (C.J.R.); (M.B.); (C.K.); (A.W.)
- Ronnie Tse
- Institute of Neurogenetics, University of Luebeck, 23538 Luebeck, Germany; (T.L.); (J.L.); (S.S.); (J.P.); (N.B.); (K.S.); (R.T.); (C.J.R.); (M.B.); (C.K.); (A.W.)
- Charles Jourdan Reyes
- Institute of Neurogenetics, University of Luebeck, 23538 Luebeck, Germany; (T.L.); (J.L.); (S.S.); (J.P.); (N.B.); (K.S.); (R.T.); (C.J.R.); (M.B.); (C.K.); (A.W.)
- Max Brand
- Institute of Neurogenetics, University of Luebeck, 23538 Luebeck, Germany; (T.L.); (J.L.); (S.S.); (J.P.); (N.B.); (K.S.); (R.T.); (C.J.R.); (M.B.); (C.K.); (A.W.)
- Hauke Busch
- Medical Systems Biology Division, Luebeck Institute of Experimental Dermatology, University of Luebeck, 23538 Luebeck, Germany; (I.W.); (H.B.)
- Institute for Cardiogenetics, University of Luebeck, 23538 Luebeck, Germany
- Christine Klein
- Institute of Neurogenetics, University of Luebeck, 23538 Luebeck, Germany; (T.L.); (J.L.); (S.S.); (J.P.); (N.B.); (K.S.); (R.T.); (C.J.R.); (M.B.); (C.K.); (A.W.)
- Ana Westenberger
- Institute of Neurogenetics, University of Luebeck, 23538 Luebeck, Germany; (T.L.); (J.L.); (S.S.); (J.P.); (N.B.); (K.S.); (R.T.); (C.J.R.); (M.B.); (C.K.); (A.W.)
- Joanne Trinh
- Institute of Neurogenetics, University of Luebeck, 23538 Luebeck, Germany; (T.L.); (J.L.); (S.S.); (J.P.); (N.B.); (K.S.); (R.T.); (C.J.R.); (M.B.); (C.K.); (A.W.)
34
Lamb HJ, Hayes BJ, Randhawa IAS, Nguyen LT, Ross EM. Genomic prediction using low-coverage portable Nanopore sequencing. PLoS One 2021; 16:e0261274. [PMID: 34910782 PMCID: PMC8673642 DOI: 10.1371/journal.pone.0261274] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/03/2021] [Accepted: 11/26/2021] [Indexed: 11/18/2022]
Abstract
Most traits in livestock, crops and humans are polygenic, that is, a large number of loci contribute to genetic variation. Effects at these loci lie along a continuum ranging from common low-effect to rare high-effect variants that cumulatively contribute to the overall phenotype. Statistical methods to calculate the effect of these loci have been developed and can be used to predict phenotypes in new individuals. In agriculture, these methods are used to select superior individuals using genomic breeding values; in humans these methods are used to quantitatively measure an individual’s disease risk, termed polygenic risk scores. Both fields typically use SNP array genotypes for the analysis. Recently, genotyping-by-sequencing has become popular, due to lower cost and greater genome coverage (including structural variants). Oxford Nanopore Technologies’ (ONT) portable sequencers have the potential to combine the benefits of genotyping-by-sequencing with portability and decreased turn-around time. This introduces the potential for in-house clinical genetic disease risk screening in humans or calculating genomic breeding values on-farm in agriculture. Here we demonstrate the potential of the latter by calculating genomic breeding values for four traits in cattle using low-coverage ONT sequence data and comparing these breeding values to breeding values calculated from SNP arrays. At sequencing coverages between 2X and 4X the correlation between ONT breeding values and SNP array-based breeding values was > 0.92 when imputation was used and > 0.88 when no imputation was used. With an average sequencing coverage of 0.5X the correlation between the two methods was between 0.85 and 0.92 using imputation, depending on the trait. This suggests that ONT sequencing has potential for in-clinic or on-farm genomic prediction; however, further work to validate these findings in a larger population remains.
Affiliation(s)
- Harrison J. Lamb
- Centre for Animal Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Ben J. Hayes
- Centre for Animal Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Imtiaz A. S. Randhawa
- School of Veterinary Science, The University of Queensland, Brisbane, QLD, Australia
- Loan T. Nguyen
- Centre for Animal Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Elizabeth M. Ross
- Centre for Animal Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
35
Bartalucci N, Romagnoli S, Vannucchi AM. A blood drop through the pore: nanopore sequencing in hematology. Trends Genet 2021; 38:572-586. [PMID: 34906378 DOI: 10.1016/j.tig.2021.11.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2021] [Revised: 11/09/2021] [Accepted: 11/15/2021] [Indexed: 10/19/2022]
Abstract
The development of new sequencing platforms, technologies, and bioinformatics tools in the past decade fostered key discoveries in human genomics. Among the most recent sequencing technologies, nanopore sequencing (NS) has caught the interest of researchers for its intriguing potential and flexibility. This up-to-date review highlights the recent application of NS in the hematology field, focusing on progress and challenges of the technological approaches employed for the identification of pathologic alterations. The molecular and analytic pipelines developed for the analysis of the whole-genome, target regions, and transcriptomics provide a proof of evidence of the unparalleled amount of information that could be retrieved by an innovative approach based on long-read sequencing.
Affiliation(s)
- Niccolò Bartalucci
- CRIMM, Center of Research and Innovation of Myeloproliferative Neoplasms, Careggi University Hospital and Department of Experimental and Clinical Medicine, University of Florence, DENOTHE Excellence Center, Florence, Italy
- Simone Romagnoli
- CRIMM, Center of Research and Innovation of Myeloproliferative Neoplasms, Careggi University Hospital and Department of Experimental and Clinical Medicine, University of Florence, DENOTHE Excellence Center, Florence, Italy
- Alessandro Maria Vannucchi
- CRIMM, Center of Research and Innovation of Myeloproliferative Neoplasms, Careggi University Hospital and Department of Experimental and Clinical Medicine, University of Florence, DENOTHE Excellence Center, Florence, Italy.
36
Bolognini D, Magi A. Evaluation of Germline Structural Variant Calling Methods for Nanopore Sequencing Data. Front Genet 2021; 12:761791. [PMID: 34868242 PMCID: PMC8637281 DOI: 10.3389/fgene.2021.761791] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/20/2021] [Accepted: 10/11/2021] [Indexed: 01/27/2023]
Abstract
Structural variants (SVs) are genomic rearrangements that involve at least 50 nucleotides and are known to have a serious impact on human health. While prior short-read sequencing technologies have often proved inadequate for a comprehensive assessment of structural variation, more recent long reads from Oxford Nanopore Technologies have already been proven invaluable for the discovery of large SVs and hold the potential to facilitate the resolution of the full SV spectrum. With many long-read sequencing studies to follow, it is crucial to assess factors affecting current SV calling pipelines for nanopore sequencing data. In this brief research report, we evaluate and compare the performances of five long-read SV callers across four long-read aligners using both real and synthetic nanopore datasets. In particular, we focus on the effects of read alignment, sequencing coverage, and variant allele depth on the detection and genotyping of SVs of different types and size ranges and provide insights into precision and recall of SV callsets generated by integrating the various long-read aligners and SV callers. The computational pipeline we propose is publicly available at https://github.com/davidebolo1993/EViNCe and can be adjusted to further evaluate future nanopore sequencing datasets.
Affiliation(s)
- Davide Bolognini
- Unit of Medical Genetics, Meyer Children’s Hospital, Florence, Italy
- Alberto Magi
- Department of Information Engineering, University of Florence, Florence, Italy
37
Yu Y, Chen L, Miao X, Li SC. SpecHap: a diploid phasing algorithm based on spectral graph theory. Nucleic Acids Res 2021; 49:e114. [PMID: 34403470 PMCID: PMC8565328 DOI: 10.1093/nar/gkab709] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/16/2020] [Revised: 07/25/2021] [Accepted: 08/02/2021] [Indexed: 11/30/2022]
Abstract
Haplotype phasing plays an important role in understanding the genetic data of diploid eukaryotic organisms. Different sequencing technologies (such as next-generation sequencing or third-generation sequencing) produce various genetic data that require haplotype assembly. Although multiple diploid haplotype phasing algorithms exist, only a few will work equally well across all sequencing technologies. In this work, we propose SpecHap, a novel haplotype assembly tool that leverages spectral graph theory. On both in silico and whole-genome sequencing datasets, SpecHap consumed less memory and required less CPU time, yet achieved comparable accuracy with state-of-art methods across all the test instances, which comprises sequencing data from next-generation sequencing, linked-reads, high-throughput chromosome conformation capture, PacBio single-molecule real-time, and Oxford Nanopore long-reads. Furthermore, SpecHap successfully phased an individual Ambystoma mexicanum, a species with gigantic diploid genomes, within 6 CPU hours and 945MB peak memory usage, while other tools failed to yield results either due to memory overflow (40GB) or time limit exceeded (5 days). Our results demonstrated that SpecHap is scalable, efficient, and accurate for diploid phasing across many sequencing platforms.
Affiliation(s)
- Yonghan Yu
- Computer Science, City University of Hong Kong, Kowloon, Hong Kong 999077, China
- Lingxi Chen
- Computer Science, City University of Hong Kong, Kowloon, Hong Kong 999077, China
- Xinyao Miao
- Computer Science, City University of Hong Kong, Kowloon, Hong Kong 999077, China
- Shuai Cheng Li
- Computer Science, City University of Hong Kong, Kowloon, Hong Kong 999077, China
38
Nanopore sequencing technology, bioinformatics and applications. Nat Biotechnol 2021; 39:1348-1365. [PMID: 34750572 PMCID: PMC8988251 DOI: 10.1038/s41587-021-01108-x] [Citation(s) in RCA: 406] [Impact Index Per Article: 135.3] [Received: 12/09/2019] [Accepted: 09/22/2021] [Indexed: 12/13/2022]
Abstract
Rapid advances in nanopore technologies for sequencing single long DNA and RNA molecules have led to substantial improvements in accuracy, read length and throughput. These breakthroughs have required extensive development of experimental and bioinformatics methods to fully exploit nanopore long reads for investigations of genomes, transcriptomes, epigenomes and epitranscriptomes. Nanopore sequencing is being applied in genome assembly, full-length transcript detection and base modification detection and in more specialized areas, such as rapid clinical diagnoses and outbreak surveillance. Many opportunities remain for improving data quality and analytical approaches through the development of new nanopores, base-calling methods and experimental protocols tailored to particular applications.
39
Johnson LK, Sahasrabudhe R, Gill JA, Roach JL, Froenicke L, Brown CT, Whitehead A. Draft genome assemblies using sequencing reads from Oxford Nanopore Technology and Illumina platforms for four species of North American Fundulus killifish. Gigascience 2021; 9:5859380. [PMID: 32556169 PMCID: PMC7301629 DOI: 10.1093/gigascience/giaa067] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/04/2019] [Revised: 04/16/2020] [Accepted: 05/27/2020] [Indexed: 01/04/2023]
Abstract
BACKGROUND Whole-genome sequencing data from wild-caught individuals of closely related North American killifish species (Fundulus xenicus, Fundulus catenatus, Fundulus nottii, and Fundulus olivaceus) were obtained using long-read Oxford Nanopore Technology (ONT) PromethION and short-read Illumina platforms. FINDINGS Draft de novo reference genome assemblies were generated using a combination of long and short sequencing reads. For each species, the PromethION platform was used to generate 30-45× sequence coverage, and the Illumina platform was used to generate 50-160× sequence coverage. Illumina-only assemblies were fragmented with high numbers of contigs, while ONT-only assemblies were error prone with low BUSCO scores. The highest N50 values, ranging from 0.4 to 2.7 Mb, were from assemblies generated using a combination of short- and long-read data. BUSCO scores were consistently >90% complete using the Eukaryota database. CONCLUSIONS High-quality genomes can be obtained from a combination of using short-read Illumina data to polish assemblies generated with long-read ONT data. Draft assemblies and raw sequencing data are available for public use. We encourage use and reuse of these data for assembly benchmarking and other analyses.
Affiliation(s)
- Lisa K Johnson
- Department of Environmental Toxicology, University of California. 1 Shields Avenue, Davis, CA 95616, Davis, CA, USA
- Department of Population Health & Reproduction, School of Veterinary Medicine, University of California. 1 Shields Avenue, Davis, CA 95616, Davis, CA, USA
- Ruta Sahasrabudhe
- DNA Technologies Core, Genome Center, University of California, 1 Shields Avenue, Davis, CA 95616
- James Anthony Gill
- Department of Environmental Toxicology, University of California. 1 Shields Avenue, Davis, CA 95616, Davis, CA, USA
- Jennifer L Roach
- Department of Environmental Toxicology, University of California. 1 Shields Avenue, Davis, CA 95616, Davis, CA, USA
- Lutz Froenicke
- DNA Technologies Core, Genome Center, University of California, 1 Shields Avenue, Davis, CA 95616
- C Titus Brown
- Department of Population Health & Reproduction, School of Veterinary Medicine, University of California. 1 Shields Avenue, Davis, CA 95616, Davis, CA, USA
- Andrew Whitehead
- Correspondence address. Andrew Whitehead, Department of Environmental Toxicology, University of California. 1 Shields Avenue, Davis, CA 95616, USA, Davis, CA, USA.
40
Stelzer CP, Blommaert J, Waldvogel AM, Pichler M, Hecox-Lea B, Mark Welch DB. Comparative analysis reveals within-population genome size variation in a rotifer is driven by large genomic elements with highly abundant satellite DNA repeat elements. BMC Biol 2021; 19:206. [PMID: 34530817 PMCID: PMC8447722 DOI: 10.1186/s12915-021-01134-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/07/2021] [Accepted: 08/27/2021] [Indexed: 12/02/2022]
Abstract
BACKGROUND Eukaryotic genomes are known to display an enormous variation in size, but the evolutionary causes of this phenomenon are still poorly understood. To obtain mechanistic insights into such variation, previous studies have often employed comparative genomics approaches involving closely related species or geographically isolated populations within a species. Genome comparisons among individuals of the same population have so far remained understudied, despite their great potential in providing a microevolutionary perspective to genome size evolution. The rotifer Brachionus asplanchnoidis represents one of the most extreme cases of within-population genome size variation among eukaryotes, displaying almost twofold variation within a geographic population. RESULTS Here, we used a whole-genome sequencing approach to identify the underlying DNA sequence differences by assembling a high-quality reference genome draft for one individual of the population and aligning short reads of 15 individuals from the same geographic population including the reference individual. We identified several large, contiguous copy number variable regions (CNVs), up to megabases in size, which exhibited striking coverage differences among individuals, and whose coverage overall scaled with genome size. CNVs were of remarkably low complexity, being mainly composed of tandemly repeated satellite DNA with only a few interspersed genes or other sequences, and were characterized by a significantly elevated GC-content. CNV patterns in offspring of two parents with divergent genome size and CNV patterns in several individuals from an inbred line differing in genome size demonstrated inheritance and accumulation of CNVs across generations.
CONCLUSIONS By identifying the exact genomic elements that cause within-population genome size variation, our study paves the way for studying genome size evolution in contemporary populations rather than inferring patterns and processes a posteriori from species comparisons.
Affiliation(s)
- C P Stelzer
- Research Department for Limnology, University of Innsbruck, Mondsee, Austria.
- J Blommaert
- Research Department for Limnology, University of Innsbruck, Mondsee, Austria
- Department of Organismal Biology, Uppsala University, Uppsala, Sweden
- A M Waldvogel
- Institute of Zoology, University of Cologne, Cologne, Germany
- M Pichler
- Research Department for Limnology, University of Innsbruck, Mondsee, Austria
- B Hecox-Lea
- Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA, USA
- D B Mark Welch
- Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA, USA
41
Li G, Jiang T, Li J, Wang Y. PanSVR: Pan-Genome Augmented Short Read Realignment for Sensitive Detection of Structural Variations. Front Genet 2021; 12:731515. [PMID: 34490049 PMCID: PMC8417358 DOI: 10.3389/fgene.2021.731515] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/27/2021] [Accepted: 07/26/2021] [Indexed: 01/10/2023]
Abstract
The comprehensive discovery of structural variations (SVs) is fundamental to many genomics studies, and high-throughput sequencing has become a common approach to this task. However, due to the limited read length, it is still non-trivial for state-of-the-art tools to accurately align short reads and produce high-quality SV callsets. The pan-genome provides a novel and promising framework for short read-based SV calling, since it enables the comprehensive integration of known variants to reduce the incompleteness and bias of a single reference, break through the bottlenecks of short read alignment, and provide new evidence for the detection of SVs. However, developing effective computational approaches that take full advantage of pan-genomes remains an open problem. Herein, we propose Pan-genome augmented Structure Variation calling tool with read Re-alignment (PanSVR), a novel pan-genome-based SV calling approach. PanSVR uses several tailored methods to implement precise re-alignment of SV-spanning reads against a well-organized pan-genome reference containing many known SVs. PanSVR greatly improves the quality of short read alignments and produces clear and homogeneous SV signatures, which facilitate SV calling. Benchmark results on real sequencing data suggest that PanSVR substantially improves the sensitivity of SV calling relative to state-of-the-art SV callers, especially for SVs in repeat-rich regions and/or novel insertions, which are difficult for existing tools.
Affiliation(s)
- Gaoyang Li
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Tao Jiang
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Junyi Li
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China.,School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China
- Yadong Wang
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
|
42
|
Neveling K, Mantere T, Vermeulen S, Oorsprong M, van Beek R, Kater-Baats E, Pauper M, van der Zande G, Smeets D, Weghuis DO, Stevens-Kroef MJPL, Hoischen A. Next-generation cytogenetics: Comprehensive assessment of 52 hematological malignancy genomes by optical genome mapping. Am J Hum Genet 2021; 108:1423-1435. [PMID: 34237281 DOI: 10.1016/j.ajhg.2021.06.001] [Citation(s) in RCA: 67] [Impact Index Per Article: 22.3] [Received: 10/01/2020] [Accepted: 06/01/2021] [Indexed: 02/06/2023] Open
Abstract
Somatic structural variants (SVs) are important drivers of cancer development and progression. In a diagnostic set-up, especially for hematological malignancies, the comprehensive analysis of all SVs in a given sample still requires a combination of cytogenetic techniques, including karyotyping, FISH, and CNV microarrays. We hypothesize that the combination of these classical approaches could be replaced by optical genome mapping (OGM). Samples from 52 individuals with a clinical diagnosis of a hematological malignancy, divided into simple (<5 aberrations, n = 36) and complex (≥5 aberrations, n = 16) cases, were processed for OGM, reaching, on average, 283-fold genome coverage. OGM called a total of 918 high-confidence SVs per sample, of which, on average, 13 were rare and >100 kb. In addition, on average, 73 CNVs were called per sample, of which six were >5 Mb. For the 36 simple cases, all clinically reported aberrations were detected, including deletions, insertions, inversions, aneuploidies, and translocations. For the 16 complex cases, results were largely concordant between standard-of-care and OGM, but OGM often revealed higher complexity than previously recognized. Detailed technical comparison with standard-of-care tests showed high analytical validity of OGM, resulting in a sensitivity of 100% and a positive predictive value of >80%. Importantly, OGM resulted in a more complete assessment than any previous single test and most likely reported the most accurate underlying genomic architecture (e.g., for complex translocations, chromoanagenesis, and marker chromosomes). In conclusion, the excellent concordance of OGM with diagnostic standard assays demonstrates its potential to replace classical cytogenetic tests as well as to rapidly map novel leukemia drivers.
Affiliation(s)
- Kornelia Neveling
- Department of Human Genetics, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands; Radboud Institute of Health Sciences, Radboud University Medical Center, Nijmegen, the Netherlands
- Tuomo Mantere
- Department of Human Genetics, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands; Radboud Institute of Medical Life Sciences, Radboud University Medical Center, Nijmegen, the Netherlands; Laboratory of Cancer Genetics and Tumor Biology, Cancer and Translational Medicine Research Unit and Biocenter Oulu, University of Oulu, Oulu, Finland
- Susan Vermeulen
- Department of Human Genetics, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Michiel Oorsprong
- Department of Human Genetics, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Ronald van Beek
- Department of Human Genetics, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Ellen Kater-Baats
- Department of Human Genetics, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Marc Pauper
- Department of Human Genetics, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Guillaume van der Zande
- Department of Human Genetics, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Dominique Smeets
- Department of Human Genetics, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Daniel Olde Weghuis
- Department of Human Genetics, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands
- Alexander Hoischen
- Department of Human Genetics, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands; Radboud Institute of Medical Life Sciences, Radboud University Medical Center, Nijmegen, the Netherlands; Department of Internal Medicine and Radboud Center for Infectious Diseases, Radboud University Medical Center, 6532 GA Nijmegen, the Netherlands.
|
43
|
How Important Are Structural Variants for Speciation? Genes (Basel) 2021; 12:genes12071084. [PMID: 34356100 PMCID: PMC8305853 DOI: 10.3390/genes12071084] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 06/15/2021] [Revised: 07/04/2021] [Accepted: 07/14/2021] [Indexed: 12/11/2022] Open
Abstract
Understanding the genetic basis of reproductive isolation is a central issue in the study of speciation. Structural variants (SVs), that is, structural changes in DNA including inversions, translocations, insertions, deletions, and duplications, are common in a broad range of organisms and have been hypothesized to play a central role in speciation. Recent advances in molecular and statistical methods have identified structural variants, especially inversions, underlying ecologically important traits, suggesting that these mutations contribute to adaptation. However, the contribution of structural variants to reproductive isolation between species, and the underlying mechanism by which structural variants most often contribute to speciation, remain unclear. Here, we review (i) different mechanisms by which structural variants can generate or maintain reproductive isolation; (ii) patterns expected with these different mechanisms; and (iii) relevant empirical examples of each. We also summarize the available sequencing and bioinformatic methods to detect structural variants. Lastly, we suggest empirical approaches and new research directions to help obtain a more complete assessment of the role of structural variants in speciation.
|
44
|
John A, Muenzen K, Ausmees K. Evaluation of serverless computing for scalable execution of a joint variant calling workflow. PLoS One 2021; 16:e0254363. [PMID: 34242357 PMCID: PMC8270184 DOI: 10.1371/journal.pone.0254363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2020] [Accepted: 06/24/2021] [Indexed: 11/18/2022] Open
Abstract
Advances in whole-genome sequencing have greatly reduced the cost and time of obtaining raw genetic information, but the computational requirements of analysis remain a challenge. Serverless computing has emerged as an alternative to using dedicated compute resources, but its utility has not been widely evaluated for standardized genomic workflows. In this study, we define and execute a best-practice joint variant calling workflow using the SWEEP workflow management system. We present an analysis of performance and scalability, and discuss the utility of the serverless paradigm for executing workflows in the field of genomics research. The GATK best-practice short germline joint variant calling pipeline was implemented as a SWEEP workflow comprising 18 tasks. The workflow was executed on Illumina paired-end read samples from the European and African super populations of the 1000 Genomes project phase III. Cost and runtime increased linearly with increasing sample size, although runtime was driven primarily by a single task for larger problem sizes. Execution took a minimum of around 3 hours for 2 samples, up to nearly 13 hours for 62 samples, with costs ranging from $2 to $70.
Affiliation(s)
- Aji John
- Department of Biology, University of Washington, Seattle, Washington, United States of America
- Kathleen Muenzen
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, United States of America
- Kristiina Ausmees
- Department of Information Technology, Uppsala University, Uppsala, Sweden
|
45
|
Complete Genome Sequences of Priestia megaterium Type and Clinical Strains Feature Complex Plasmid Arrays. Microbiol Resour Announc 2021; 10:e0040321. [PMID: 34236233 PMCID: PMC8265218 DOI: 10.1128/mra.00403-21] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/26/2023] Open
Abstract
Here, we report the high-quality complete genome sequences and plasmid arrays of Priestia megaterium ATCC 14581T and of two clinical strains (2008724129 and 2008724142) isolated from human samples in the United States.
|
46
|
Konishi H, Yamaguchi R, Yamaguchi K, Furukawa Y, Imoto S. Halcyon: an accurate basecaller exploiting an encoder-decoder model with monotonic attention. Bioinformatics 2021; 37:1211-1217. [PMID: 33165508 PMCID: PMC8189681 DOI: 10.1093/bioinformatics/btaa953] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/24/2020] [Revised: 10/14/2020] [Accepted: 10/30/2020] [Indexed: 11/17/2022] Open
Abstract
Motivation: In recent years, nanopore sequencing technology has enabled inexpensive long-read sequencing, which promises reads longer than a few thousand bases. Such long-read sequences contribute to the precise detection of structural variations and accurate haplotype phasing. However, deciphering precise DNA sequences from noisy and complicated nanopore raw signals remains a crucial demand for downstream analyses based on higher-quality nanopore sequencing, although various basecallers have been introduced to date. Results: To address this need, we developed a novel basecaller, Halcyon, that incorporates neural-network techniques frequently used in the field of machine translation. Our model employs monotonic-attention mechanisms to learn semantic correspondences between nucleotides and signal levels without any pre-segmentation against input signals. We evaluated performance with a human whole-genome sequencing dataset and demonstrated that Halcyon outperformed existing third-party basecallers and achieved competitive performance against the latest Oxford Nanopore Technologies’ basecallers. Availability and implementation: The source code (halcyon) can be found at https://github.com/relastle/halcyon.
Affiliation(s)
- Kiyoshi Yamaguchi
- Advanced Clinical Research Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Yoichi Furukawa
- Advanced Clinical Research Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Seiya Imoto
- Health Intelligence Center.,Human Genome Center
|
47
|
Mladenova V, Mladenov E, Scholz M, Stuschke M, Iliakis G. Strong Shift to ATR-Dependent Regulation of the G2-Checkpoint after Exposure to High-LET Radiation. Life (Basel) 2021; 11:life11060560. [PMID: 34198619 PMCID: PMC8232161 DOI: 10.3390/life11060560] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/14/2021] [Accepted: 06/09/2021] [Indexed: 12/29/2022] Open
Abstract
The utilization of high linear-energy-transfer (LET) ionizing radiation (IR) modalities is rapidly growing worldwide, causing excitement but also raising concerns, because our understanding of their biological effects is incomplete. Charged particles such as protons and heavy ions have increasing potential in cancer therapy, due to their advantageous physical properties over X-rays (photons), but are also present in the space environment, adding to the health risks of space missions. Therapy improvements and the protection of humans during space travel will benefit from a better understanding of the mechanisms underpinning the biological effects of high-LET IR. There is evidence that high-LET IR induces DNA double-strand breaks (DSBs) of increasing complexity, causing enhanced cell killing, owing, at least partly, to the frequent engagement of a low-fidelity DSB-repair pathway: alternative end-joining (alt-EJ), which is known to frequently induce severe structural chromosomal abnormalities (SCAs). Here, we evaluate the radiosensitivity of A549 lung adenocarcinoma cells to X-rays, α-particles and 56Fe ions, as well as of HCT116 colorectal cancer cells to X-rays and α-particles. We observe the expected increase in cell killing following high-LET irradiation that correlates with the increased formation of SCAs as detected by mFISH. Furthermore, we report that cells exposed to low doses of α-particles and 56Fe ions show an enhanced G2-checkpoint response which is mainly regulated by ATR, rather than the coordinated ATM/ATR-dependent regulation observed after exposure to low doses of X-rays. These observations advance our understanding of the mechanisms underpinning high-LET IR effects, and suggest the potential utility for ATR inhibitors in high-LET radiation therapy.
Affiliation(s)
- Veronika Mladenova
- Department of Radiation Therapy, Division of Experimental Radiation Biology, University Hospital Essen, University of Duisburg-Essen, 45122 Essen, Germany; (V.M.); (E.M.); (M.S.)
- Institute of Medical Radiation Biology, University Hospital Essen, University of Duisburg-Essen, 45122 Essen, Germany
- Emil Mladenov
- Department of Radiation Therapy, Division of Experimental Radiation Biology, University Hospital Essen, University of Duisburg-Essen, 45122 Essen, Germany; (V.M.); (E.M.); (M.S.)
- Institute of Medical Radiation Biology, University Hospital Essen, University of Duisburg-Essen, 45122 Essen, Germany
- Michael Scholz
- Biophysics Division, GSI Helmholtzzentrum für Schwerionenforschung GmbH, 64291 Darmstadt, Germany;
- Martin Stuschke
- Department of Radiation Therapy, Division of Experimental Radiation Biology, University Hospital Essen, University of Duisburg-Essen, 45122 Essen, Germany; (V.M.); (E.M.); (M.S.)
- German Cancer Consortium (DKTK), Partner Site University Hospital Essen, 45122 Essen, Germany
- German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- George Iliakis
- Department of Radiation Therapy, Division of Experimental Radiation Biology, University Hospital Essen, University of Duisburg-Essen, 45122 Essen, Germany; (V.M.); (E.M.); (M.S.)
- Institute of Medical Radiation Biology, University Hospital Essen, University of Duisburg-Essen, 45122 Essen, Germany
|
48
|
Quan C, Li Y, Liu X, Wang Y, Ping J, Lu Y, Zhou G. Characterization of structural variation in Tibetans reveals new evidence of high-altitude adaptation and introgression. Genome Biol 2021; 22:159. [PMID: 34034800 PMCID: PMC8146648 DOI: 10.1186/s13059-021-02382-3] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 12/23/2020] [Accepted: 05/14/2021] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND: Structural variation (SV) acts as an essential mutational force shaping the evolution and function of the human genome. However, few studies have examined the role of SVs in high-altitude adaptation, and little is known so far about adaptive introgressed SVs in Tibetans. RESULTS: Here, we generate a comprehensive catalog of SVs in a Chinese Tibetan (n = 15) and Han (n = 10) population using nanopore sequencing technology. Among a total of 38,216 unique SVs in the catalog, 27% are sequence-resolved for the first time. We systematically assess the distribution of these SVs across repeat sequences and functional genomic regions. Through genotyping in an additional 276 genomes, we identify 69 Tibetan-Han stratified SVs and 80 candidate adaptive genes. We also discover a few adaptive introgressed SV candidates and provide evidence for a deletion of 335 base pairs at 1p36.32. CONCLUSIONS: Overall, our results highlight the important role of SVs in the evolutionary processes of Tibetans' adaptation to the Qinghai-Tibet Plateau and provide a valuable resource for future high-altitude adaptation studies.
Affiliation(s)
- Cheng Quan
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Yuanfeng Li
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Xinyi Liu
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Yahui Wang
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Jie Ping
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Yiming Lu
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Hebei University, Baoding, Hebei Province 071002 People’s Republic of China
- Gangqiao Zhou
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Hebei University, Baoding, Hebei Province 071002 People’s Republic of China
- Collaborative Innovation Center for Personalized Cancer Medicine, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu Province 211166 People’s Republic of China
- Medical College of Guizhou University, Guiyang, Guizhou Province 550025 People’s Republic of China
|
49
|
Valle-Inclan JE, Stangl C, de Jong AC, van Dessel LF, van Roosmalen MJ, Helmijr JCA, Renkens I, Janssen R, de Blank S, de Witte CJ, Martens JWM, Jansen MPHM, Lolkema MP, Kloosterman WP. Optimizing Nanopore sequencing-based detection of structural variants enables individualized circulating tumor DNA-based disease monitoring in cancer patients. Genome Med 2021; 13:86. [PMID: 34006333 PMCID: PMC8130429 DOI: 10.1186/s13073-021-00899-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/27/2020] [Accepted: 04/27/2021] [Indexed: 12/18/2022] Open
Abstract
Here, we describe a novel approach for rapid discovery of a set of tumor-specific genomic structural variants (SVs), based on a combination of low coverage cancer genome sequencing using Oxford Nanopore with an SV calling and filtering pipeline. We applied the method to tumor samples of high-grade ovarian and prostate cancer patients and validated on average ten somatic SVs per patient with breakpoint-spanning PCR mini-amplicons. These SVs could be quantified in ctDNA samples of patients with metastatic prostate cancer using a digital PCR assay. The results suggest that SV dynamics correlate with and may improve existing treatment-response biomarkers such as PSA. https://github.com/UMCUGenetics/SHARC .
Affiliation(s)
- Jose Espejo Valle-Inclan
- Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands.,Oncode Institute, Utrecht, The Netherlands
- Christina Stangl
- Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands.,Oncode Institute, Utrecht, The Netherlands.,Division of Molecular Oncology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Anouk C de Jong
- Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Lisanne F van Dessel
- Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Markus J van Roosmalen
- Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands.,Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Jean C A Helmijr
- Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Ivo Renkens
- Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
- Roel Janssen
- Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands.,Oncode Institute, Utrecht, The Netherlands
- Sam de Blank
- Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
- Chris J de Witte
- Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands.,Oncode Institute, Utrecht, The Netherlands
- John W M Martens
- Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Maurice P H M Jansen
- Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Martijn P Lolkema
- Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands.
- Wigard P Kloosterman
- Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands. .,Cyclomics, Utrecht, The Netherlands. .,Frame Cancer Therapeutics, Amsterdam, The Netherlands.
|
50
|
Long-read sequencing of 3,622 Icelanders provides insight into the role of structural variants in human diseases and other traits. Nat Genet 2021; 53:779-786. [PMID: 33972781 DOI: 10.1038/s41588-021-00865-4] [Citation(s) in RCA: 108] [Impact Index Per Article: 36.0] [Received: 11/18/2019] [Accepted: 04/05/2021] [Indexed: 01/05/2023]
Abstract
Long-read sequencing (LRS) promises to improve the characterization of structural variants (SVs). We generated LRS data from 3,622 Icelanders and identified a median of 22,636 SVs per individual (a median of 13,353 insertions and 9,474 deletions). We discovered a set of 133,886 reliably genotyped SV alleles and imputed them into 166,281 individuals to explore their effects on diseases and other traits. We discovered an association of a rare deletion in PCSK9 with lower low-density lipoprotein (LDL) cholesterol levels, compared to the population average. We also discovered an association of a multiallelic SV in ACAN with height; we found 11 alleles that differed in the number of a 57-bp-motif repeat and observed a linear relationship between the number of repeats carried and height. These results show that SVs can be accurately characterized at the population scale using LRS data in a genome-wide non-targeted approach and demonstrate how SVs impact phenotypes.
|