1
English AC, Dolzhenko E, Ziaei Jam H, McKenzie SK, Olson ND, De Coster W, Park J, Gu B, Wagner J, Eberle MA, Gymrek M, Chaisson MJP, Zook JM, Sedlazeck FJ. Analysis and benchmarking of small and large genomic variants across tandem repeats. Nat Biotechnol 2024:10.1038/s41587-024-02225-z. [PMID: 38671154 DOI: 10.1038/s41587-024-02225-z] [Received: 10/30/2023] [Accepted: 03/28/2024] [Indexed: 04/28/2024]
Abstract
Tandem repeats (TRs) are highly polymorphic in the human genome, have thousands of associated molecular traits and are linked to over 60 disease phenotypes. However, they are often excluded from at-scale studies because of challenges with variant calling and representation, as well as a lack of a genome-wide standard. Here, to promote the development of TR methods, we created a catalog of TR regions and explored TR properties across 86 haplotype-resolved long-read human assemblies. We curated variants from the Genome in a Bottle (GIAB) HG002 individual to create a TR dataset to benchmark existing and future TR analysis methods. We also present an improved variant comparison method that handles variants greater than 4 bp in length and varying allelic representation. The 8.1% of the genome covered by the TR catalog holds ~24.9% of variants per individual, including 124,728 small and 17,988 large variants for the GIAB HG002 'truth-set' TR benchmark. We demonstrate the utility of this pipeline across short-read and long-read technologies.
Affiliation(s)
- Adam C English
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Helyaneh Ziaei Jam
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA
- Nathan D Olson
  - Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Wouter De Coster
  - Applied and Translational Neurogenomics Group, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
  - Applied and Translational Neurogenomics Group, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Jonghun Park
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA
- Bida Gu
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Justin Wagner
  - Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Melissa Gymrek
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA
  - Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Mark J P Chaisson
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Justin M Zook
  - Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
  - Department of Computer Science, Rice University, Houston, TX, USA
2
Gustafson JA, Gibson SB, Damaraju N, Zalusky MPG, Hoekzema K, Twesigomwe D, Yang L, Snead AA, Richmond PA, De Coster W, Olson ND, Guarracino A, Li Q, Miller AL, Goffena J, Anderson Z, Storz SHR, Ward SA, Sinha M, Gonzaga-Jauregui C, Clarke WE, Basile AO, Corvelo A, Reeves C, Helland A, Musunuri RL, Revsine M, Patterson KE, Paschal CR, Zakarian C, Goodwin S, Jensen TD, Robb E, McCombie WR, Sedlazeck FJ, Zook JM, Montgomery SB, Garrison E, Kolmogorov M, Schatz MC, McLaughlin RN, Dashnow H, Zody MC, Loose M, Jain M, Eichler EE, Miller DE. Nanopore sequencing of 1000 Genomes Project samples to build a comprehensive catalog of human genetic variation. medRxiv 2024:2024.03.05.24303792. [PMID: 38496498 PMCID: PMC10942501 DOI: 10.1101/2024.03.05.24303792] [Indexed: 03/19/2024]
Abstract
Less than half of individuals with a suspected Mendelian condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control datasets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project ONT Sequencing Consortium aims to generate LRS data from at least 800 of the 1000 Genomes Project samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37x and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.
Affiliation(s)
- Jonas A. Gustafson
  - Division of Genetic Medicine, Department of Pediatrics, University of Washington, Seattle, WA, USA
  - Molecular and Cellular Biology Program, University of Washington, Seattle, WA, USA
- Sophia B. Gibson
  - Division of Genetic Medicine, Department of Pediatrics, University of Washington, Seattle, WA, USA
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Nikhita Damaraju
  - Division of Genetic Medicine, Department of Pediatrics, University of Washington, Seattle, WA, USA
  - Institute for Public Health Genetics, University of Washington, Seattle, WA, USA
- Miranda PG Zalusky
  - Division of Genetic Medicine, Department of Pediatrics, University of Washington, Seattle, WA, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- David Twesigomwe
  - Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- Lei Yang
  - Pacific Northwest Research Institute, Seattle, WA, USA
- Wouter De Coster
  - Applied and Translational Neurogenomics Group, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
  - Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Nathan D. Olson
  - Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Andrea Guarracino
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
  - Human Technopole, Milan, Italy
- Qiuhui Li
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Angela L. Miller
  - Division of Genetic Medicine, Department of Pediatrics, University of Washington, Seattle, WA, USA
- Joy Goffena
  - Division of Genetic Medicine, Department of Pediatrics, University of Washington, Seattle, WA, USA
- Zachery Anderson
  - Division of Genetic Medicine, Department of Pediatrics, University of Washington, Seattle, WA, USA
- Sophie HR Storz
  - Division of Genetic Medicine, Department of Pediatrics, University of Washington, Seattle, WA, USA
- Sydney A. Ward
  - Division of Genetic Medicine, Department of Pediatrics, University of Washington, Seattle, WA, USA
- Maisha Sinha
  - Division of Genetic Medicine, Department of Pediatrics, University of Washington, Seattle, WA, USA
- Claudia Gonzaga-Jauregui
  - International Laboratory for Human Genome Research, Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México
- Wayne E. Clarke
  - New York Genome Center, New York, NY, USA
  - Outlier Informatics Inc., Saskatoon, SK, Canada
- Mahler Revsine
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Cate R. Paschal
  - Department of Laboratories, Seattle Children’s Hospital, Seattle, WA, USA
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
- Christina Zakarian
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Sara Goodwin
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Esther Robb
  - Department of Computer Science, Stanford University, Stanford, CA, USA
- Fritz J. Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
  - Department of Computer Science, Rice University, Houston, TX, USA
- Justin M. Zook
  - Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Erik Garrison
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Mikhail Kolmogorov
  - Cancer Data Science Laboratory, National Cancer Institute, NIH, Bethesda, MD, USA
- Michael C. Schatz
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Richard N. McLaughlin
  - Molecular and Cellular Biology Program, University of Washington, Seattle, WA, USA
  - Pacific Northwest Research Institute, Seattle, WA, USA
- Harriet Dashnow
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, USA
- Matt Loose
  - Deep Seq, School of Life Sciences, University of Nottingham, Nottingham, England
- Miten Jain
  - Department of Bioengineering, Department of Physics, Khoury College of Computer Sciences, Northeastern University, Boston, MA
- Evan E. Eichler
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Danny E. Miller
  - Division of Genetic Medicine, Department of Pediatrics, University of Washington, Seattle, WA, USA
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA, USA
3
English A, Dolzhenko E, Jam HZ, Mckenzie S, Olson ND, De Coster W, Park J, Gu B, Wagner J, Eberle MA, Gymrek M, Chaisson MJP, Zook JM, Sedlazeck FJ. Benchmarking of small and large variants across tandem repeats. bioRxiv 2023:2023.10.29.564632. [PMID: 37961319 PMCID: PMC10634962 DOI: 10.1101/2023.10.29.564632] [Indexed: 11/15/2023]
Abstract
Tandem repeats (TRs) are highly polymorphic in the human genome, have thousands of associated molecular traits, and are linked to over 60 disease phenotypes. However, their complexity often excludes them from at-scale studies due to challenges with variant calling, representation, and lack of a genome-wide standard. To promote TR methods development, we create a comprehensive catalog of TR regions and explore its properties across 86 samples. We then curate variants from the GIAB HG002 individual to create a tandem repeat benchmark. We also present a variant comparison method that handles small and large alleles and varying allelic representation. The 8.1% of the genome covered by the TR catalog holds ∼24.9% of variants per individual, including 124,728 small and 17,988 large variants for the GIAB HG002 TR benchmark. We work with the GIAB community to demonstrate the utility of this benchmark across short and long read technologies.
4
Valcek A, Philippe C, Whiteway C, Robino E, Nesporova K, Bové M, Coenye T, De Pooter T, De Coster W, Strazisar M, Van der Henst C. Phenotypic Characterization and Heterogeneity among Modern Clinical Isolates of Acinetobacter baumannii. Microbiol Spectr 2023; 11:e0306122. [PMID: 36475894 PMCID: PMC9927488 DOI: 10.1128/spectrum.03061-22] [Received: 08/09/2022] [Accepted: 11/22/2022] [Indexed: 12/13/2022]
Abstract
Acinetobacter baumannii is an opportunistic pathogenic bacterium prioritized by WHO and CDC because of its increasing antibiotic resistance. Heterogeneity among strains represents the hallmark of A. baumannii bacteria. We wondered to what extent extensively used strains, so-called reference strains, reflect the dynamic nature and intrinsic heterogeneity of these bacteria. We analyzed multiple phenotypic traits of 43 nonredundant, modern, and multidrug-resistant, extensively drug-resistant, and pandrug-resistant clinical isolates and broadly used strains of A. baumannii. Comparison of these isolates at the genetic and phenotypic levels confirmed a high degree of heterogeneity. Importantly, we observed that a significant portion of modern clinical isolates strongly differs from several historically established strains in the light of colony morphology, cellular density, capsule production, natural transformability, and in vivo virulence. The significant differences between modern clinical isolates of A. baumannii and established strains could hamper the study of A. baumannii, especially concerning its virulence and resistance mechanisms. Hence, we propose a variable collection of modern clinical isolates that are characterized at the genetic and phenotypic levels, covering a wide range of the phenotypic spectrum, with six different macrocolony type groups, from avirulent to hypervirulent phenotypes, and with naturally noncapsulated to hypermucoid strains, with intermediate phenotypes as well. Strain-specific mechanistic observations remain interesting per se, and established "reference" strains have undoubtedly been shown to be very useful to study basic mechanisms of A. baumannii biology. However, any study based on a specific strain of A. baumannii should be compared to modern and clinically relevant isolates. 
IMPORTANCE Acinetobacter baumannii is a bacterium prioritized by the CDC and WHO because of its increasing antibiotic resistance, leading to treatment failures. The hallmark of this pathogen is the high heterogeneity observed among isolates, due to a very dynamic genome. In this context, we tested if a subset of broadly used isolates, considered "reference" strains, was reflecting the genetic and phenotypic diversity found among currently circulating clinical isolates. We observed that the so-called reference strains do not cover the whole diversity of the modern clinical isolates. While formerly established strains successfully generated a strong base of knowledge in the A. baumannii field and beyond, our study shows that a rational choice of strain, related to a specific biological question, should be taken into consideration. Any data obtained with historically established strains should also be compared to modern and clinically relevant isolates, especially concerning drug screening, resistance, and virulence contexts.
Affiliation(s)
- Adam Valcek
  - Microbial Resistance and Drug Discovery, VIB-VUB Center for Structural Biology, VIB, Flanders Institute for Biotechnology, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Chantal Philippe
  - Research Unit in the Biology of Microorganisms (URBM), NARILIS, University of Namur (UNamur), Namur, Belgium
- Clémence Whiteway
  - Microbial Resistance and Drug Discovery, VIB-VUB Center for Structural Biology, VIB, Flanders Institute for Biotechnology, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Etienne Robino
  - Microbial Resistance and Drug Discovery, VIB-VUB Center for Structural Biology, VIB, Flanders Institute for Biotechnology, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Kristina Nesporova
  - Microbial Resistance and Drug Discovery, VIB-VUB Center for Structural Biology, VIB, Flanders Institute for Biotechnology, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Mona Bové
  - Laboratory of Pharmaceutical Microbiology, Ghent University, Ghent, Belgium
- Tom Coenye
  - Laboratory of Pharmaceutical Microbiology, Ghent University, Ghent, Belgium
- Tim De Pooter
  - Neuromics Support Facility, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
  - Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Wouter De Coster
  - Applied and Translational Neurogenomics Group, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
  - Applied and Translational Neurogenomics Group, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Mojca Strazisar
  - Neuromics Support Facility, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
  - Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Charles Van der Henst
  - Microbial Resistance and Drug Discovery, VIB-VUB Center for Structural Biology, VIB, Flanders Institute for Biotechnology, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels, Belgium
5
Walker K, Kalra D, Lowdon R, Chen G, Molik D, Soto DC, Dabbaghie F, Khleifat AA, Mahmoud M, Paulin LF, Raza MS, Pfeifer SP, Agustinho DP, Aliyev E, Avdeyev P, Barrozo ER, Behera S, Billingsley K, Chong LC, Choubey D, De Coster W, Fu Y, Gener AR, Hefferon T, Henke DM, Höps W, Illarionova A, Jochum MD, Jose M, Kesharwani RK, Kolora SRR, Kubica J, Lakra P, Lattimer D, Liew CS, Lo BW, Lo C, Lötter A, Majidian S, Mendem SK, Mondal R, Ohmiya H, Parvin N, Peralta C, Poon CL, Prabhakaran R, Saitou M, Sammi A, Sanio P, Sapoval N, Syed N, Treangen T, Wang G, Xu T, Yang J, Zhang S, Zhou W, Sedlazeck FJ, Busby B. The third international hackathon for applying insights into large-scale genomic composition to use cases in a wide range of organisms. F1000Res 2022; 11:530. [PMID: 36262335 PMCID: PMC9557141 DOI: 10.12688/f1000research.110194.1] [Accepted: 05/04/2022] [Indexed: 01/25/2023]
Abstract
In October 2021, 59 scientists from 14 countries and 13 U.S. states collaborated virtually in the Third Annual Baylor College of Medicine & DNANexus Structural Variation hackathon. The goal of the hackathon was to advance research on structural variants (SVs) by prototyping and iterating on open-source software. This led to nine hackathon projects focused on diverse genomics research interests, including various SV discovery and genotyping methods, SV sequence reconstruction, and clinically relevant structural variation, including SARS-CoV-2 variants. Repositories for the projects that participated in the hackathon are available at https://github.com/collaborativebioinformatics.
Affiliation(s)
- Kimberly Walker
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Divya Kalra
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Guangyi Chen
  - Drug Bioinformatics, Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Saarbrücken, Germany
  - Center for Bioinformatics, Saarland University, Saarbrücken, Germany
- David Molik
  - Tropical Crop and Commodity Protection Research Unit, Pacific Basin Agricultural Research Center, Hilo, HI, 96720, USA
- Daniela C. Soto
  - Biochemistry & Molecular Medicine, Genome Center, MIND Institute, University of California, Davis, Davis, CA, 95616, USA
- Fawaz Dabbaghie
  - Drug Bioinformatics, Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Saarbrücken, Germany
  - Institute for Medical Biometry and Bioinformatics, University Hospital Düsseldorf, Düsseldorf, Germany
- Ahmad Al Khleifat
  - Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- Medhat Mahmoud
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Luis F Paulin
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Muhammad Sohail Raza
  - CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Beijing, China
- Susanne P. Pfeifer
  - Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
- Daniel Paiva Agustinho
  - Department of Molecular Microbiology, Washington University in St. Louis School of Medicine, St. Louis, MO, 63110, USA
- Elbay Aliyev
  - Research Department, Sidra Medicine, Doha, Qatar
- Pavel Avdeyev
  - Computational Biology Institute, The George Washington University, Washington, DC, 20052, USA
- Enrico R. Barrozo
  - Department of Obstetrics & Gynecology, Baylor College of Medicine, Houston, TX, 77030, USA
- Sairam Behera
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Kimberley Billingsley
  - Molecular Genetics Section, Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
- Li Chuin Chong
  - Beykoz Institute of Life Sciences and Biotechnology, Bezmialem Vakif University, Beykoz, Istanbul, Turkey
- Deepak Choubey
  - Department of Technology, Savitribai Phule Pune University, Pune, Maharashtra, India
- Wouter De Coster
  - Applied and Translational Neurogenomics Group, VIB Center for Molecular Neurology, Antwerp, Belgium
  - Applied and Translational Neurogenomics Group, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Yilei Fu
  - Department of Computer Science, Rice University, Houston, TX, USA
- Alejandro R. Gener
  - Association of Public Health Labs, Centers for Disease Control and Prevention, Downey, CA, USA
- Timothy Hefferon
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20892, USA
- David Morgan Henke
  - Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, 77030, USA
- Wolfram Höps
  - EMBL Heidelberg, Genome Biology Unit, Heidelberg, Germany
- Michael D. Jochum
  - Department of Obstetrics & Gynecology, Baylor College of Medicine, Houston, TX, 77030, USA
- Maria Jose
  - Centre for Bioinformatics, Pondicherry University, Pondicherry, India
- Rupesh K. Kesharwani
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Priya Lakra
  - Department of Zoology, University of Delhi, Delhi, India
- Damaris Lattimer
  - University of Applied Sciences Upper Austria - FH Hagenberg, Mühlkreis, Austria
- Chia-Sin Liew
  - Center for Biotechnology, University of Nebraska-Lincoln, Lincoln, Nebraska, 68588, USA
- Bai-Wei Lo
  - Department of Biology, University of Konstanz, Konstanz, Germany
- Chunhsuan Lo
  - Human Genetics Laboratory, National Institute of Genetics, Japan, Mishima City, Japan
- Anneri Lötter
  - Department of Biochemistry, University of Pretoria, Pretoria, South Africa
- Sina Majidian
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Rajarshi Mondal
  - Department of Biotechnology, The University of Burdwan, West Bengal, India
- Hiroko Ohmiya
  - Genetic Reagent Development Unit, Medical & Biological Laboratories Co., Ltd., Tokyo, Japan
- Nasrin Parvin
  - Department of Biotechnology, The University of Burdwan, West Bengal, India
- Marie Saitou
  - Center of Integrative Genetics (CIGENE), Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Aditi Sammi
  - School of Biochemical Engineering, Indian Institute of Technology (BHU), Varanasi, Uttar Pradesh, India
- Philippe Sanio
  - University of Applied Sciences Upper Austria - FH Hagenberg, Hagenberg im Mühlkreis, Austria
- Nicolae Sapoval
  - Department of Computer Science, Rice University, Houston, TX, USA
- Najeeb Syed
  - Research Department, Sidra Medicine, Doha, Qatar
- Todd Treangen
  - Department of Computer Science, Rice University, Houston, TX, USA
- Tiancheng Xu
  - Department of Computer Science, Rice University, Houston, TX, USA
- Jianzhi Yang
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Shangzhe Zhang
  - School of Biology, University of St Andrews, St Andrews, UK
- Weiyu Zhou
  - Department of Statistical Science, George Mason University, Fairfax, Virginia, USA
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
6
Abstract
Long-read sequencing technologies have now reached a level of accuracy and yield that allows their application to variant detection at a scale of tens to thousands of samples. Concomitant with the development of new computational tools, the first population-scale studies involving long-read sequencing have emerged over the past 2 years and, given the continuous advancement of the field, many more are likely to follow. In this Review, we survey recent developments in population-scale long-read sequencing, highlight potential challenges of a scaled-up approach and provide guidance regarding experimental design. We provide an overview of current long-read sequencing platforms, variant calling methodologies and approaches for de novo assemblies and reference-based mapping approaches. Furthermore, we summarize strategies for variant validation, genotyping and predicting functional impact and emphasize challenges remaining in achieving long-read sequencing at a population scale.
Affiliation(s)
- Wouter De Coster
  - Applied and Translational Neurogenomics Group, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
  - Applied and Translational Neurogenomics Group, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
7
Ascari G, Rendtorff ND, De Bruyne M, De Zaeytijd J, Van Lint M, Bauwens M, Van Heetvelde M, Arno G, Jacob J, Creytens D, Van Dorpe J, Van Laethem T, Rosseel T, De Pooter T, De Rijk P, De Coster W, Menten B, Rey AD, Strazisar M, Bertelsen M, Tranebjaerg L, De Baere E. Long-Read Sequencing to Unravel Complex Structural Variants of CEP78 Leading to Cone-Rod Dystrophy and Hearing Loss. Front Cell Dev Biol 2021; 9:664317. [PMID: 33968938 PMCID: PMC8097100 DOI: 10.3389/fcell.2021.664317] [Received: 02/05/2021] [Accepted: 03/08/2021] [Indexed: 11/13/2022]
Abstract
Inactivating variants as well as a missense variant in the centrosomal CEP78 gene have been identified in autosomal recessive cone-rod dystrophy with hearing loss (CRDHL), a rare syndromic inherited retinal disease distinct from Usher syndrome. Apart from this, a complex structural variant (SV) implicating CEP78 has been reported in CRDHL. Here we aimed to expand the genetic architecture of typical CRDHL by the identification of complex SVs of the CEP78 region and characterization of their underlying mechanisms. Approaches used for the identification of the SVs are shallow whole-genome sequencing (sWGS) combined with quantitative polymerase chain reaction (PCR) and long-range PCR, or ExomeDepth analysis on whole-exome sequencing (WES) data. Targeted or whole-genome nanopore long-read sequencing (LRS) was used to delineate breakpoint junctions at the nucleotide level. For all SVs cases, the effect of the SVs on CEP78 expression was assessed using quantitative PCR on patient-derived RNA. Apart from two novel canonical CEP78 splice variants and a frameshifting single-nucleotide variant (SNV), two SVs affecting CEP78 were identified in three unrelated individuals with CRDHL: a heterozygous total gene deletion of 235 kb and a partial gene deletion of 15 kb in a heterozygous and homozygous state, respectively. Assessment of the molecular consequences of the SVs on patient's materials displayed a loss-of-function effect. Delineation and characterization of the 15-kb deletion using targeted LRS revealed the previously described complex CEP78 SV, suggestive of a recurrent genomic rearrangement. A founder haplotype was demonstrated for the latter SV in cases of Belgian and British origin, respectively. The novel 235-kb deletion was delineated using whole-genome LRS. Breakpoint analysis showed microhomology and pointed to a replication-based underlying mechanism. 
Moreover, data mining of bulk and single-cell human and mouse transcriptional datasets, together with CEP78 immunostaining on human retina, linked the CEP78 expression domain with its phenotypic manifestations. Overall, this study supports that the CEP78 locus is prone to distinct SVs and that SV analysis should be considered in a genetic workup of CRDHL. Finally, it demonstrated the power of sWGS and both targeted and whole-genome LRS in identifying and characterizing complex SVs in patients with ocular diseases.
Affiliation(s)
- Giulia Ascari
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Nanna D Rendtorff
- The Kennedy Center, Department of Clinical Genetics, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
| | - Marieke De Bruyne
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Julie De Zaeytijd
- Department of Ophthalmology, Ghent University Hospital, Ghent, Belgium
| | - Michel Van Lint
- Department of Ophthalmology, Antwerp University Hospital, Antwerp, Belgium
| | - Miriam Bauwens
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Mattias Van Heetvelde
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Gavin Arno
- Great Ormond Street Hospital, London, United Kingdom
- Moorfields Eye Hospital, London, United Kingdom
- UCL Institute of Ophthalmology, London, United Kingdom
| | - Julie Jacob
- Department of Ophthalmology, University Hospitals Leuven, Leuven, Belgium
| | - David Creytens
- Department of Pathology, Ghent University Hospital, Ghent, Belgium
- Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
| | - Jo Van Dorpe
- Department of Pathology, Ghent University Hospital, Ghent, Belgium
- Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
| | - Thalia Van Laethem
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Toon Rosseel
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Tim De Pooter
- Neuromics Support Facility, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
- Neuromics Support Facility, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Peter De Rijk
- Neuromics Support Facility, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
- Neuromics Support Facility, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Wouter De Coster
- Applied and Translational Neurogenomics Group, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
- Applied and Translational Neurogenomics Group, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Björn Menten
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Alfredo Dueñas Rey
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Mojca Strazisar
- Neuromics Support Facility, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
- Neuromics Support Facility, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Mette Bertelsen
- The Kennedy Center, Department of Clinical Genetics, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- Department of Ophthalmology, Rigshospitalet-Glostrup, University of Copenhagen, Glostrup, Denmark
| | - Lisbeth Tranebjaerg
- The Kennedy Center, Department of Clinical Genetics, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- Institute of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Elfride De Baere
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| |
|
8
|
Mc Cartney AM, Mahmoud M, Jochum M, Agustinho DP, Zorman B, Al Khleifat A, Dabbaghie F, Kesharwani RK, Smolka M, Dawood M, Albin D, Aliyev E, Almabrazi H, Arslan A, Balaji A, Behera S, Billingsley K, Cameron DL, Daw J, Dawson ET, De Coster W, Du H, Dunn C, Esteban R, Jolly A, Kalra D, Liao C, Liu Y, Lu TY, Havrilla JM, Khayat MM, Marin M, Monlong J, Price S, Gener AR, Ren J, Sagayaradj S, Sapoval N, Sinner C, Soto DC, Soylev A, Subramaniyan A, Syed N, Tadimeti N, Tater P, Vats P, Vaughn J, Walker K, Wang G, Zeng Q, Zhang S, Zhao T, Kille B, Biederstedt E, Chaisson M, English A, Kronenberg Z, Treangen TJ, Hefferon T, Chin CS, Busby B, Sedlazeck FJ. An international virtual hackathon to build tools for the analysis of structural variants within species ranging from coronaviruses to vertebrates. F1000Res 2021; 10:246. [PMID: 34621504 PMCID: PMC8479851 DOI: 10.12688/f1000research.51477.2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 08/23/2021] [Indexed: 11/20/2022]
Abstract
In October 2020, 62 scientists from nine nations worked together remotely in the Second Baylor College of Medicine & DNAnexus hackathon, focusing on related topics in structural variation, pan-genomes, and SARS-CoV-2 research. The overarching aims were to assess the current status of the field, identify the remaining challenges, and determine how to combine the strengths of the different interests to drive research and method development forward. Over the four days, eight groups each designed and developed new open-source methods to improve the identification and analysis of variations among species, including humans and SARS-CoV-2. These included improvements in SV calling, genotyping, annotation and filtering, together with advancements in benchmarking existing methods. Furthermore, groups focused on the diversity of SARS-CoV-2. Daily discussion summaries and methods are publicly available at https://github.com/collaborativebioinformatics and provide valuable insights for both participants and the research community.
Affiliation(s)
| | | | | | | | | | | | - Fawaz Dabbaghie
- Institute for Medical Biometry and Bioinformatics, Düsseldorf, Germany
| | | | | | | | | | | | | | - Ahmed Arslan
- Stanford University School of Medicine, California, USA
| | | | | | | | - Daniel L Cameron
- Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
| | - Joyjit Daw
- NVIDIA Corporation, Santa Clara, California, USA
| | | | | | - Haowei Du
- Baylor College of Medicine, Houston, USA
| | | | | | | | | | | | | | | | | | | | | | - Jean Monlong
- UC Santa Cruz Genomics Institute, Santa Cruz, USA
| | | | | | | | | | | | | | | | - Arda Soylev
- Konya Food and Agriculture University, Konya, Turkey
| | | | | | | | | | - Pankaj Vats
- NVIDIA Corporation, Santa Clara, California, USA
| | | | | | | | - Qiandong Zeng
- Laboratory Corporation of America Holdings, Westborough, USA
| | | | | | | | | | | | | | | | | | | | | | | | | |
|
9
|
Mc Cartney AM, Mahmoud M, Jochum M, Agustinho DP, Zorman B, Al Khleifat A, Dabbaghie F, Kesharwani RK, Smolka M, Dawood M, Albin D, Aliyev E, Almabrazi H, Arslan A, Balaji A, Behera S, Billingsley K, Cameron DL, Daw J, Dawson ET, De Coster W, Du H, Dunn C, Esteban R, Jolly A, Kalra D, Liao C, Liu Y, Lu TY, Havrilla JM, Khayat MM, Marin M, Monlong J, Price S, Gener AR, Ren J, Sagayaradj S, Sapoval N, Sinner C, Soto DC, Soylev A, Subramaniyan A, Syed N, Tadimeti N, Tater P, Vats P, Vaughn J, Walker K, Wang G, Zeng Q, Zhang S, Zhao T, Kille B, Biederstedt E, Chaisson M, English A, Kronenberg Z, Treangen TJ, Hefferon T, Chin CS, Busby B, Sedlazeck FJ. An international virtual hackathon to build tools for the analysis of structural variants within species ranging from coronaviruses to vertebrates. F1000Res 2021; 10:246. [PMID: 34621504 PMCID: PMC8479851 DOI: 10.12688/f1000research.51477.1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 03/04/2021] [Indexed: 11/08/2023]
|
10
|
De Coster W, Stovner EB, Strazisar M. Methplotlib: analysis of modified nucleotides from nanopore sequencing. Bioinformatics 2020; 36:3236-3238. [PMID: 32053166 PMCID: PMC7214038 DOI: 10.1093/bioinformatics/btaa093] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 10/31/2019] [Revised: 01/03/2020] [Accepted: 02/05/2020] [Indexed: 02/06/2023]
Abstract
Summary: Modified nucleotides play a crucial role in gene expression regulation. Here, we describe methplotlib, a tool developed for the visualization of modified nucleotides detected from Oxford Nanopore Technologies sequencing platforms, together with additional scripts for statistical analysis of allele-specific modification within subjects and differential modification frequency across subjects. Availability and implementation: The methplotlib command-line tool is written in Python3, is compatible with Linux, Mac OS and the MS Windows 10 Subsystem for Linux, and is released under the MIT license. The source code can be found at https://github.com/wdecoster/methplotlib, and the tool can be installed from PyPI and bioconda. Our repository includes test data, and the tool is continuously tested at travis-ci.com. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
| | - Endre Bakken Stovner
- Department of Computer Science, Norwegian University of Science and Technology, Trondheim 7013, Norway
- Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim 7013, Norway
| | | |
|
11
|
De Roeck A, De Coster W, Bossaerts L, Cacace R, De Pooter T, Van Dongen J, D’Hert S, De Rijk P, Strazisar M, Van Broeckhoven C, Sleegers K. NanoSatellite: accurate characterization of expanded tandem repeat length and sequence through whole genome long-read sequencing on PromethION. Genome Biol 2019; 20:239. [PMID: 31727106 PMCID: PMC6857246 DOI: 10.1186/s13059-019-1856-3] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 04/18/2019] [Accepted: 10/10/2019] [Indexed: 12/13/2022]
Abstract
Technological limitations have hindered the large-scale genetic investigation of tandem repeats in disease. We show that long-read sequencing with a single Oxford Nanopore Technologies PromethION flow cell per individual achieves 30× human genome coverage and enables accurate assessment of tandem repeats including the 10,000-bp Alzheimer's disease-associated ABCA7 VNTR. The Guppy "flip-flop" base caller and tandem-genotypes tandem repeat caller are efficient for large-scale tandem repeat assessment, but base calling and alignment challenges persist. We present NanoSatellite, which analyzes tandem repeats directly on electric current data and improves calling of GC-rich tandem repeats, expanded alleles, and motif interruptions.
Affiliation(s)
- Arne De Roeck
- Neurodegenerative Brain Diseases Group, VIB Center for Molecular Neurology, University of Antwerp-CDE, Universiteitsplein 1, B-2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Wouter De Coster
- Neurodegenerative Brain Diseases Group, VIB Center for Molecular Neurology, University of Antwerp-CDE, Universiteitsplein 1, B-2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Liene Bossaerts
- Neurodegenerative Brain Diseases Group, VIB Center for Molecular Neurology, University of Antwerp-CDE, Universiteitsplein 1, B-2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Rita Cacace
- Neurodegenerative Brain Diseases Group, VIB Center for Molecular Neurology, University of Antwerp-CDE, Universiteitsplein 1, B-2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Tim De Pooter
- Neuromics Support Facility, Center for Molecular Neurology, VIB - University of Antwerp, Antwerp, Belgium
| | - Jasper Van Dongen
- Neurodegenerative Brain Diseases Group, VIB Center for Molecular Neurology, University of Antwerp-CDE, Universiteitsplein 1, B-2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Svenn D’Hert
- Neuromics Support Facility, Center for Molecular Neurology, VIB - University of Antwerp, Antwerp, Belgium
| | - Peter De Rijk
- Neuromics Support Facility, Center for Molecular Neurology, VIB - University of Antwerp, Antwerp, Belgium
| | - Mojca Strazisar
- Neuromics Support Facility, Center for Molecular Neurology, VIB - University of Antwerp, Antwerp, Belgium
| | - Christine Van Broeckhoven
- Neurodegenerative Brain Diseases Group, VIB Center for Molecular Neurology, University of Antwerp-CDE, Universiteitsplein 1, B-2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Kristel Sleegers
- Neurodegenerative Brain Diseases Group, VIB Center for Molecular Neurology, University of Antwerp-CDE, Universiteitsplein 1, B-2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| |
|
12
|
De Coster W, Van Broeckhoven C. Newest Methods for Detecting Structural Variations. Trends Biotechnol 2019; 37:973-982. [DOI: 10.1016/j.tibtech.2019.02.003] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 12/02/2018] [Revised: 02/08/2019] [Accepted: 02/11/2019] [Indexed: 01/28/2023]
|
13
|
De Coster W, De Rijk P, De Roeck A, De Pooter T, D'Hert S, Strazisar M, Sleegers K, Van Broeckhoven C. Structural variants identified by Oxford Nanopore PromethION sequencing of the human genome. Genome Res 2019; 29:1178-1187. [PMID: 31186302 PMCID: PMC6633254 DOI: 10.1101/gr.244939.118] [Citation(s) in RCA: 89] [Impact Index Per Article: 17.8] [Received: 10/22/2018] [Accepted: 06/06/2019] [Indexed: 01/17/2023]
Abstract
We sequenced the genome of the Yoruban reference individual NA19240 on the long-read sequencing platform Oxford Nanopore PromethION for evaluation and benchmarking of recently published aligners and germline structural variant calling tools, as well as a comparison with the performance of structural variant calling from short-read sequencing data. The structural variant caller Sniffles after NGMLR or minimap2 alignment provides the most accurate results, but additional confidence or sensitivity can be obtained by a combination of multiple variant callers. Sensitive and fast results can be obtained by minimap2 for alignment and a combination of Sniffles and SVIM for variant identification. We describe a scalable workflow for identification, annotation, and characterization of tens of thousands of structural variants from long-read genome sequencing of an individual or population. By discussing the results of this well-characterized reference individual, we provide an approximation of what can be expected in future long-read sequencing studies aiming for structural variant identification.
Affiliation(s)
- Wouter De Coster
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
| | - Peter De Rijk
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
- Neuromics Support Facility, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
| | - Arne De Roeck
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
| | - Tim De Pooter
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
- Neuromics Support Facility, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
| | - Svenn D'Hert
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
- Neuromics Support Facility, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
| | - Mojca Strazisar
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
- Neuromics Support Facility, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
| | - Kristel Sleegers
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
| | - Christine Van Broeckhoven
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, 2610 Antwerp, Belgium
- Biomedical Sciences, University of Antwerp, 2610 Antwerp, Belgium
| |
|
14
|
Cacace R, Heeman B, Van Mossevelde S, De Roeck A, Hoogmartens J, De Rijk P, Gossye H, De Vos K, De Coster W, Strazisar M, De Baets G, Schymkowitz J, Rousseau F, Geerts N, De Pooter T, Peeters K, Sieben A, Martin JJ, Engelborghs S, Salmon E, Santens P, Vandenberghe R, Cras P, De Deyn PP, van Swieten JC, van Duijn CM, van der Zee J, Sleegers K, Van Broeckhoven C. Loss of DPP6 in neurodegenerative dementia: a genetic player in the dysfunction of neuronal excitability. Acta Neuropathol 2019; 137:901-918. [PMID: 30874922 PMCID: PMC6531610 DOI: 10.1007/s00401-019-01976-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 12/03/2018] [Revised: 02/07/2019] [Accepted: 02/13/2019] [Indexed: 12/14/2022]
Abstract
Emerging evidence suggests a converging mechanism in neurodegenerative brain diseases (NBD) involving early neuronal network dysfunctions and alterations in the homeostasis of neuronal firing as culprits of neurodegeneration. In this study, we used paired-end short-read and direct long-read whole genome sequencing to investigate an unresolved autosomal dominant dementia family significantly linked to 7q36. We identified and validated a chromosomal inversion of ca. 4 Mb, segregating on the disease haplotype and disrupting the coding sequence of the dipeptidyl-peptidase 6 gene (DPP6). DPP6 resequencing identified significantly more rare variants (nonsense, frameshift, and missense) in early-onset Alzheimer's disease (EOAD, p = 0.03, OR = 2.21, 95% CI 1.05-4.82) and frontotemporal dementia (FTD, p = 0.006, OR = 2.59, 95% CI 1.28-5.49) patient cohorts. DPP6 is a type II transmembrane protein with a highly structured extracellular domain and is mainly expressed in brain, where it binds to the potassium channel Kv4.2, enhancing its expression, regulating its gating properties and controlling the dendritic excitability of hippocampal neurons. Using in vitro modeling, we showed that the missense variants found in patients destabilize DPP6 and reduce its membrane expression (p < 0.001 and p < 0.0001), leading to a loss of protein. Reduced DPP6 and/or Kv4.2 expression was also detected in brain tissue of missense variant carriers. Loss of DPP6 is known to cause neuronal hyperexcitability and behavioral alterations in Dpp6-KO mice. Taken together, the results of our genomic, genetic, expression and modeling analyses provide direct evidence supporting the involvement of DPP6 loss in dementia. We propose that loss-of-function variants have a higher penetrance and disease impact, whereas the missense variants have a variable risk contribution to disease that can vary from high to low penetrance.
Our findings on DPP6 as a novel gene in dementia strengthen the involvement of neuronal hyperexcitability and alteration in the homeostasis of neuronal firing as a disease mechanism that warrants further investigation.
Affiliation(s)
- Rita Cacace
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
| | - Bavo Heeman
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
| | - Sara Van Mossevelde
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
- Department of Neurology, Antwerp University Hospital, Edegem, Belgium
- Department of Neurology and Memory Clinic, Hospital Network Antwerp (ZNA), Middelheim and Hoge Beuken, Antwerp, Belgium
| | - Arne De Roeck
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
| | - Julie Hoogmartens
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
| | - Peter De Rijk
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
| | - Helena Gossye
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
- Department of Neurology, Antwerp University Hospital, Edegem, Belgium
- Department of Neurology and Memory Clinic, Hospital Network Antwerp (ZNA), Middelheim and Hoge Beuken, Antwerp, Belgium
| | - Kristof De Vos
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
| | - Wouter De Coster
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
| | - Mojca Strazisar
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
| | - Greet De Baets
- Switch Laboratory, VIB-KU Leuven Centre for Brain and Disease Research, Louvain, Belgium
- Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Louvain, Belgium
| | - Joost Schymkowitz
- Switch Laboratory, VIB-KU Leuven Centre for Brain and Disease Research, Louvain, Belgium
- Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Louvain, Belgium
| | - Frederic Rousseau
- Switch Laboratory, VIB-KU Leuven Centre for Brain and Disease Research, Louvain, Belgium
- Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Louvain, Belgium
| | - Nathalie Geerts
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
| | - Tim De Pooter
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
| | - Karin Peeters
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
| | - Anne Sieben
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- Institute Born-Bunge, Antwerp, Belgium
- Department of Neurology, University Hospital Ghent and University of Ghent, Ghent, Belgium
| | | | - Sebastiaan Engelborghs
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
- Department of Neurology and Memory Clinic, Hospital Network Antwerp (ZNA), Middelheim and Hoge Beuken, Antwerp, Belgium
| | - Eric Salmon
- Department of Neurology, Centre Hospitalier Universitaire de Liège and University of Liège, Liège, Belgium
| | - Patrick Santens
- Department of Neurology, University Hospital Ghent and University of Ghent, Ghent, Belgium
| | - Rik Vandenberghe
- Department of Neurosciences, Faculty of Medicine, KU Leuven, Louvain, Belgium
- Laboratory of Cognitive Neurology, Department of Neurology, University Hospitals Leuven, Louvain, Belgium
| | - Patrick Cras
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
- Department of Neurology, Antwerp University Hospital, Edegem, Belgium
| | - Peter P. De Deyn
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
- Department of Neurology and Memory Clinic, Hospital Network Antwerp (ZNA), Middelheim and Hoge Beuken, Antwerp, Belgium
| | - John C. van Swieten
- Department of Neurology, Erasmus University Medical Centre, Rotterdam, The Netherlands
| | - Cornelia M. van Duijn
- Department of Epidemiology, Erasmus University Medical Centre, Rotterdam, The Netherlands
| | - Julie van der Zee
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
| | - Kristel Sleegers
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
| | - Christine Van Broeckhoven
- Center for Molecular Neurology, VIB, Antwerp, Belgium
- Institute Born-Bunge, Antwerp, Belgium
- University of Antwerp, Antwerp, Belgium
- Neurodegenerative Brain Diseases Group, VIB Center for Molecular Neurology, University of Antwerp, CDE, Universiteitsplein 1, 2610 Antwerp, Belgium
| |
|
15
|
De Coster W, D’Hert S, Schultz DT, Cruts M, Van Broeckhoven C. NanoPack: visualizing and processing long-read sequencing data. Bioinformatics 2018; 34:2666-2669. [PMID: 29547981 PMCID: PMC6061794 DOI: 10.1093/bioinformatics/bty149] [Citation(s) in RCA: 1269] [Impact Index Per Article: 211.5] [Received: 01/02/2018] [Revised: 02/19/2018] [Accepted: 03/12/2018] [Indexed: 01/23/2023]
Abstract
Summary: Here we describe NanoPack, a set of tools developed for visualization and processing of long-read sequencing data from Oxford Nanopore Technologies and Pacific Biosciences. Availability and implementation: The NanoPack tools are written in Python3 and released under the GNU GPL3.0 License. The source code can be found at https://github.com/wdecoster/nanopack, together with links to separate scripts and their documentation. The scripts are compatible with Linux, Mac OS and the MS Windows 10 subsystem for Linux and are available as a graphical user interface, a web service at http://nanoplot.bioinf.be and command line tools. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wouter De Coster
- Neurodegenerative Brain Diseases Group, VIB & University of Antwerp, Antwerp, Belgium
| | - Svenn D’Hert
- Bioinformatics, Neuromics Support Facility, Center for Molecular Neurology, VIB & University of Antwerp, Antwerp, Belgium
| | - Darrin T Schultz
- Department of Biomolecular Engineering and Bioinformatics, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Marc Cruts
- Neurodegenerative Brain Diseases Group, VIB & University of Antwerp, Antwerp, Belgium
| | | |
|
16
|
De Roeck A, Van den Bossche T, van der Zee J, Verheijen J, De Coster W, Van Dongen J, Dillen L, Baradaran-Heravi Y, Heeman B, Sanchez-Valle R, Lladó A, Nacmias B, Sorbi S, Gelpi E, Grau-Rivera O, Gómez-Tortosa E, Pastor P, Ortega-Cubero S, Pastor MA, Graff C, Thonberg H, Benussi L, Ghidoni R, Binetti G, de Mendonça A, Martins M, Borroni B, Padovani A, Almeida MR, Santana I, Diehl-Schmid J, Alexopoulos P, Clarimon J, Lleó A, Fortea J, Tsolaki M, Koutroumani M, Matěj R, Rohan Z, De Deyn P, Engelborghs S, Cras P, Van Broeckhoven C, Sleegers K. Deleterious ABCA7 mutations and transcript rescue mechanisms in early onset Alzheimer's disease. Acta Neuropathol 2017; 134:475-487. [PMID: 28447221 PMCID: PMC5563332 DOI: 10.1007/s00401-017-1714-x] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 01/13/2017] [Revised: 04/18/2017] [Accepted: 04/19/2017] [Indexed: 12/12/2022]
Abstract
Premature termination codon (PTC) mutations in the ATP-Binding Cassette, Sub-Family A, Member 7 gene (ABCA7) have recently been identified as an intermediate-to-high penetrant risk factor for late-onset Alzheimer’s disease (LOAD). High variability, however, is observed in downstream ABCA7 mRNA and protein expression, disease penetrance, and onset age, indicative of unknown modifying factors. Here, we investigated the prevalence and disease penetrance of ABCA7 PTC mutations in a large early-onset AD (EOAD) patient-control cohort, and examined the effect on transcript level with comprehensive third-generation long-read sequencing. We characterized the ABCA7 coding sequence with next-generation sequencing in 928 EOAD patients and 980 matched control individuals. With MetaSKAT rare variant association analysis, we observed a fivefold enrichment (p = 0.0004) of PTC mutations in EOAD patients (3%) versus controls (0.6%). Ten novel PTC mutations were observed only in patients, and PTC mutation carriers in general had an increased familial AD load. In addition, we observed nominal risk-reducing trends for three common coding variants. Seven PTC mutations were further analyzed using targeted long-read cDNA sequencing on an Oxford Nanopore MinION platform. PTC-containing transcripts for each investigated PTC mutation were observed at varying proportions (5–41% of the total read count), implying incomplete nonsense-mediated mRNA decay (NMD). Furthermore, we distinguished and phased several previously unknown alternative splicing events (up to 30% of transcripts). In conjunction with PTC mutations, several of these novel ABCA7 isoforms have the potential to rescue deleterious PTC effects. In conclusion, ABCA7 PTC mutations play a substantial role in EOAD, warranting genetic screening of ABCA7 in genetically unexplained patients.
Long-read cDNA sequencing revealed both varying degrees of NMD and transcript-modifying events, which may influence ABCA7 dosage and disease severity and may create opportunities for therapeutic interventions in AD.
|
17
|
De Roeck A, Van den Bossche T, Verheijen J, De Coster W, Van Dongen J, Dillen L, Baradaran-Heravi Y, Engelborghs S, Cras P, Zee J, Van Broeckhoven C, Sleegers K. [O2-13-05]: DELETERIOUS ABCA7 MUTATIONS CONTRIBUTE TO EARLY-ONSET ALZHEIMER'S DISEASE AND ARE SUBJECT TO TRANSCRIPT RESCUE MECHANISMS. Alzheimers Dement 2017. [DOI: 10.1016/j.jalz.2017.07.220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Arne De Roeck
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, Antwerp, Belgium
- Institute Born-Bunge, University of Antwerp, Antwerp, Belgium
- Department of Neurology and Memory Clinic, Hospital Network Antwerp (ZNA) Middelheim and Hoge Beuken, Antwerp, Belgium
- Hospital Network Antwerp (ZNA), Antwerp, Belgium
- Department of Neurology, Antwerp University Hospital, Edegem, Belgium
| | - Tobi Van den Bossche
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, Antwerp, Belgium
| | - Jan Verheijen
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, Antwerp, Belgium
| | - Wouter De Coster
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, Antwerp, Belgium
| | - Jasper Van Dongen
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, Antwerp, Belgium
| | - Lubina Dillen
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, Antwerp, Belgium
| | | | | | - Patrick Cras
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, Antwerp, Belgium
| | - Julie Zee
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, Antwerp, Belgium
| | | | - Kristel Sleegers
- Neurodegenerative Brain Diseases Group, Center for Molecular Neurology, VIB, Antwerp, Belgium
| | | |
|