1
Celus CS, Ahmad SF, Gangwar M, Kumar S, Kumar A. Deciphering new insights into copy number variations as drivers of genomic diversity and adaptation in farm animal species. Gene 2025; 939:149159. [PMID: 39672215 DOI: 10.1016/j.gene.2024.149159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2024] [Revised: 11/15/2024] [Accepted: 12/09/2024] [Indexed: 12/15/2024]
Abstract
The basis of all improvement in (re)production performance of animals and plants lies in the genetic variation. The underlying genetic variation can be further explored through investigations using molecular markers including single nucleotide polymorphism (SNP) and microsatellite, and more recently structural variants like copy number variations (CNVs). Unlike SNPs, CNVs affect a larger proportion of the genome, making them more impactful vis-à-vis variation at the phenotype level. They significantly contribute to genetic variation and provide raw material for natural and artificial selection for improved performance. CNVs are characterized as unbalanced structural variations that arise from four major mechanisms viz., non-homologous end joining (NHEJ), non-allelic homologous recombination (NAHR), fork stalling and template switching (FoSTeS), and retrotransposition. Various detection methods have been developed to identify CNVs, including molecular techniques and massively parallel sequencing. Next-generation sequencing (NGS)/high-throughput sequencing offers higher resolution and sensitivity, but challenges remain in delineating CNVs in regions with repetitive sequences or high GC content. High-throughput sequencing technologies utilize different methods based on read-pair, split-read, read depth, and assembly approaches (or their combination) to detect CNVs. Read-pair based methods work by mapping discordant reads, while the read-depth approach works on detecting the correlation between read depth and copy number of genetic segments or a gene. Split-read methods involve mapping segments of reads to different locations on the genome, while assembly methods involve comparing contigs to a reference or de novo sequencing. Similar to other marker-trait association studies, CNV-association studies are not uncommon in humans and farm animals. 
Soon, extensive studies will be needed to deduce the unique evolutionary trajectories and underlying molecular mechanisms for targeted genetic improvements in different farm animal species. The present review delineates the importance of CNVs in genetic studies, their generation along with programs and principles to efficiently identify them, and finally throws light on the existing literature on CNV studies in farm animal species.
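As a rough illustration of the read-depth principle mentioned in the abstract (read depth correlates with copy number), the following minimal sketch bins a genome into windows and flags runs of windows whose depth deviates from the sample baseline. The function name `call_cnv_segments`, the median baseline, the diploid default, and the toy depth values are all assumptions made for illustration; this is not the method of any tool discussed in the review, and real callers also correct for GC content and mappability.

```python
from statistics import median


def call_cnv_segments(window_depths, ploidy=2):
    """Return (first_window, last_window, copy_number) runs that differ from ploidy.

    Hypothetical helper: the copy number of each window is estimated as
    ploidy * depth / median depth, rounded to the nearest integer.
    """
    baseline = median(window_depths)
    if baseline <= 0:
        raise ValueError("median depth must be positive")
    copy_numbers = [round(ploidy * d / baseline) for d in window_depths]

    segments = []
    start = 0
    # Walk the windows and close a run whenever the copy number changes.
    for i in range(1, len(copy_numbers) + 1):
        if i == len(copy_numbers) or copy_numbers[i] != copy_numbers[start]:
            if copy_numbers[start] != ploidy:
                segments.append((start, i - 1, copy_numbers[start]))
            start = i
    return segments


# Toy per-window depths: a putative deletion (windows 3-5) and duplication (windows 7-8).
depths = [30, 31, 29, 15, 14, 16, 30, 45, 46, 30]
print(call_cnv_segments(depths))  # [(3, 5, 1), (7, 8, 3)]
```

Using the median as the baseline keeps a few aberrant windows from shifting the reference depth, which is why it is chosen here over the mean.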
Affiliation(s)
- C S Celus
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, Uttar Pradesh 243122, India
- Sheikh Firdous Ahmad
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, Uttar Pradesh 243122, India; Livestock Production and Management Section, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, Uttar Pradesh 243122, India.
- Munish Gangwar
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, Uttar Pradesh 243122, India
- Subodh Kumar
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, Uttar Pradesh 243122, India
- Amit Kumar
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, Uttar Pradesh 243122, India
2
Tsai CY, Hsu JSJ, Chen PL, Wu CC. Implementing next-generation sequencing for diagnosis and management of hereditary hearing impairment: a comprehensive review. Expert Rev Mol Diagn 2024; 24:753-765. [PMID: 39194060 DOI: 10.1080/14737159.2024.2396866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2024] [Accepted: 08/22/2024] [Indexed: 08/29/2024]
Abstract
INTRODUCTION Sensorineural hearing impairment (SNHI), a common childhood disorder with heterogeneous genetic causes, can lead to delayed language development and psychosocial problems. Next-generation sequencing (NGS) offers high-throughput screening and high-sensitivity detection of genetic etiologies of SNHI, enabling clinicians to make informed medical decisions, provide tailored treatments, and improve prognostic outcomes. AREAS COVERED This review covers the diverse etiologies of hereditary hearing impairment (HHI) and the utility of different NGS modalities (targeted sequencing and whole exome/genome sequencing), and includes HHI-related studies on newborn screening, genetic counseling, prognostic prediction, and personalized treatment. Challenges such as the trade-off between cost and diagnostic yield, detection of structural variants, and exploration of the non-coding genome are also highlighted. EXPERT OPINION In the current landscape of NGS-based diagnostics for HHI, there are both challenges (e.g. detection of structural variants and non-coding genome variants) and opportunities (e.g. the emergence of medical artificial intelligence tools). The authors advocate the use of technological advances such as long-read sequencing for structural variant detection, multi-omics analysis for non-coding variant exploration, and medical artificial intelligence for pathogenicity assessment and outcome prediction. By integrating these innovations into clinical practice, precision medicine in the diagnosis and management of HHI can be further improved.
Affiliation(s)
- Cheng-Yu Tsai
- Graduate Institute of Medical Genomics and Proteomics, National Taiwan University College of Medicine, Taipei, Taiwan
- Department of Otolaryngology, National Taiwan University Hospital, Taipei, Taiwan
- Jacob Shu-Jui Hsu
- Graduate Institute of Medical Genomics and Proteomics, National Taiwan University College of Medicine, Taipei, Taiwan
- Pei-Lung Chen
- Graduate Institute of Medical Genomics and Proteomics, National Taiwan University College of Medicine, Taipei, Taiwan
- Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan
- Institute of Molecular Medicine, National Taiwan University College of Medicine, Taipei, Taiwan
- Department of Medical Genetics, National Taiwan University Hospital, Taipei, Taiwan
- Chen-Chi Wu
- Department of Otolaryngology, National Taiwan University Hospital, Taipei, Taiwan
- Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan
- Department of Medical Research, National Taiwan University Hospital Hsin-Chu Branch, Hsinchu, Taiwan
- Department of Otolaryngology, National Taiwan University Hospital Hsin-Chu Branch, Hsinchu, Taiwan
3
Munoz JO, Rutter EM, Banuelos M, Sindi SS, Marcia RF. Sparse Negative Binomial Signal Recovery for Genomic Variant Prediction in Diploid Species. Annu Int Conf IEEE Eng Med Biol Soc 2024; 2024:1-5. [PMID: 40039433 DOI: 10.1109/embc53108.2024.10782090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2025]
Abstract
Structural variants (SVs) - such as insertions, deletions, and duplications of an individual's genome - are associated with genetic diseases and promotion of genetic diversity. Detecting SVs of an unknown genome is a mathematically challenging problem since SVs are rare and prone to low-coverage noise. Common approaches to detect SVs in an unknown genome require sequencing fragments of the genome, comparing them to a high-quality reference genome, and predicting SVs based on identified discordant fragments. We developed a computational method which seeks to improve existing SV detection methods in three ways: First, we implement an optimization approach using a negative binomial log-likelihood objective function. Second, we use a block-coordinate descent approach to simultaneously predict if an SV is homozygous or heterozygous given genomic data of related individuals. Third, we model a biologically realistic scenario where variants in the child are either inherited or novel. We validate our framework with simulated data and demonstrate improvements in predicting SVs and detecting false positives.
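To make the idea of a negative binomial log-likelihood objective concrete, the sketch below evaluates such a log-likelihood for observed fragment counts given candidate mean counts. The mean/dispersion parameterization, the function name `nb_log_likelihood`, and the toy numbers are assumptions for illustration only; the abstract does not specify the authors' exact objective, sparsity penalty, or block-coordinate descent scheme, and none of those are reproduced here.

```python
import math


def nb_log_likelihood(counts, means, r):
    """Negative binomial log-likelihood with mean mu and dispersion r (assumed form).

    P(y | mu, r) = Gamma(y + r) / (Gamma(r) * y!) * (r / (r + mu))**r * (mu / (r + mu))**y
    Requires mu > 0 and r > 0.
    """
    total = 0.0
    for y, mu in zip(counts, means):
        total += (
            math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu))
            + y * math.log(mu / (r + mu))
        )
    return total


# Toy comparison: observed counts near 5 are better explained by a mean of 5 than of 20.
observed = [4, 6, 5]
ll_low = nb_log_likelihood(observed, [5.0, 5.0, 5.0], r=10.0)
ll_high = nb_log_likelihood(observed, [20.0, 20.0, 20.0], r=10.0)
print(ll_low > ll_high)  # True
```

An optimizer would search for the candidate means (and hence the variant signal) that maximize this quantity; the negative binomial is typically preferred over the Poisson when count variance exceeds the mean.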
4
Lee J, Huh S, Park K, Kang N, Yu HS, Park HG, Kim YS, Kang UG, Won S, Kim SH. Behavioral and transcriptional effects of repeated electroconvulsive seizures in the neonatal MK-801-treated rat model of schizophrenia. Psychopharmacology (Berl) 2024; 241:817-832. [PMID: 38081977 DOI: 10.1007/s00213-023-06511-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Accepted: 11/23/2023] [Indexed: 03/13/2024]
Abstract
RATIONALE Electroconvulsive therapy (ECT) is an effective treatment modality for schizophrenia. However, its antipsychotic-like mechanism remains unclear. OBJECTIVES To gain insight into the antipsychotic-like actions of ECT, this study investigated how repeated treatments of electroconvulsive seizure (ECS), an animal model for ECT, affect the behavioral and transcriptomic profile of a neurodevelopmental animal model of schizophrenia. METHODS Two injections of MK-801 or saline were administered to rats on postnatal day 7 (PN7), and either repeated ECS treatments (E10X) or sham shock was conducted daily from PN50 to PN59. Ultimately, the rats were divided into vehicle/sham (V/S), MK-801/sham (M/S), vehicle/ECS (V/E), and MK-801/ECS (M/E) groups. On PN59, prepulse inhibition and locomotor activity were tested. Prefrontal cortex transcriptomes were analyzed with mRNA sequencing and network and pathway analyses, and quantitative real-time polymerase chain reaction (qPCR) analyses were subsequently conducted. RESULTS Prepulse inhibition deficit was induced by MK-801 and normalized by E10X. In the M/S vs. M/E model, Egr1, Mmp9, and S100a6 were identified as center genes, and interleukin-17 (IL-17), nuclear factor kappa B (NF-κB), and tumor necrosis factor (TNF) signaling pathways were identified as the three most relevant pathways. In the V/E vs. V/S model, mitophagy, NF-κB, and receptor for advanced glycation end products (RAGE) pathways were identified. qPCR analyses demonstrated that Igfbp6, Btf3, Cox6a2, and H2az1 were downregulated in M/S and upregulated in M/E. CONCLUSIONS E10X reverses the behavioral changes induced by MK-801 and produces transcriptional changes in inflammatory, insulin, and mitophagy pathways, which provide mechanistic insight into the antipsychotic-like mechanism of ECT.
Affiliation(s)
- Jeonghoon Lee
- Department of Psychiatry, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea
- Seonghoo Huh
- Biomedical Research Institute, Seoul National University Hospital, Seoul, Republic of Korea
- Kyungtaek Park
- Institute of Health and Environment, Seoul National University, Seoul, Republic of Korea
- Nuree Kang
- Department of Psychiatry, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea
- Hyun Sook Yu
- Biomedical Research Institute, Seoul National University Hospital, Seoul, Republic of Korea
- Hong Geun Park
- Biomedical Research Institute, Seoul National University Hospital, Seoul, Republic of Korea
- Yong Sik Kim
- Department of Psychiatry, Nowon Eulji Medical Center, Eulji University, Seoul, Republic of Korea
- Ung Gu Kang
- Department of Psychiatry, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea
- Institute of Human Behavioral Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
- Sungho Won
- Institute of Health and Environment, Seoul National University, Seoul, Republic of Korea
- Interdisciplinary Program of Bioinformatics, College of Natural Sciences, Seoul National University, Seoul, Republic of Korea
- Department of Public Health Sciences, Graduate School of Public Health, Seoul National University, Seoul, Republic of Korea
- RexSoft Inc., Seoul, Republic of Korea
- Se Hyun Kim
- Department of Psychiatry, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea.
5
Louw N, Carstens N, Lombard Z. Incorporating CNV analysis improves the yield of exome sequencing for rare monogenic disorders-an important consideration for resource-constrained settings. Front Genet 2023; 14:1277784. [PMID: 38155715 PMCID: PMC10753787 DOI: 10.3389/fgene.2023.1277784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 11/22/2023] [Indexed: 12/30/2023] [Open Access]
Abstract
Exome sequencing (ES) is a recommended first-tier diagnostic test for many rare monogenic diseases. It allows for the detection of both single-nucleotide variants (SNVs) and copy number variants (CNVs) in coding exonic regions of the genome in a single test, and this dual analysis is a valuable approach, especially in limited resource settings. Single-nucleotide variants are well studied; however, the incorporation of copy number variant analysis tools into variant calling pipelines has not yet been implemented as a routine diagnostic test, and chromosomal microarray is still more widely used to detect copy number variants. Research shows that combined single and copy number variant analysis can lead to a diagnostic yield of up to 58%, increasing the yield by as much as 18% over the single-nucleotide-variant-only pipeline. Importantly, this is achieved with the consideration of computational costs only, without incurring any additional sequencing costs. This mini review provides an overview of copy number variant analysis from exome data and what the current recommendations are for this type of analysis. We also present an overview of rare monogenic disease research standard practices in resource-limited settings. We present evidence that integrating copy number variant detection tools into a standard exome sequencing analysis pipeline improves diagnostic yield and should be considered a significantly beneficial addition, with relatively low-cost implications. Routine implementation in underrepresented populations and limited resource settings will promote generation and sharing of CNV datasets and provide momentum to build core centers for this niche within genomic medicine.
Affiliation(s)
- Nadja Louw
- Division of Human Genetics, National Health Laboratory Service and School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- Nadia Carstens
- Division of Human Genetics, National Health Laboratory Service and School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- Genomics Platform, South African Medical Research Council, Cape Town, South Africa
- Zané Lombard
- Division of Human Genetics, National Health Laboratory Service and School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
6
Steensma MJ, Lee YL, Bouwman AC, Pita Barros C, Derks MFL, Bink MCAM, Harlizius B, Huisman AE, Crooijmans RPMA, Groenen MAM, Mulder HA, Rochus CM. Identification and characterisation of de novo germline structural variants in two commercial pig lines using trio-based whole genome sequencing. BMC Genomics 2023; 24:208. [PMID: 37072725 PMCID: PMC10114323 DOI: 10.1186/s12864-023-09296-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Accepted: 04/04/2023] [Indexed: 04/20/2023] [Open Access]
Abstract
BACKGROUND De novo mutations arising in the germline are a source of genetic variation and their discovery broadens our understanding of genetic disorders and evolutionary patterns. Although the number of de novo single nucleotide variants (dnSNVs) has been studied in a number of species, relatively little is known about the occurrence of de novo structural variants (dnSVs). In this study, we investigated 37 deeply sequenced pig trios from two commercial lines to identify dnSVs present in the offspring. The identified dnSVs were characterised by identifying their parent of origin, their functional annotations and characterising sequence homology at the breakpoints. RESULTS We identified four swine germline dnSVs, all located in intronic regions of protein-coding genes. Our conservative, first estimate of the swine germline dnSV rate is 0.108 (95% CI 0.038-0.255) per generation (one dnSV per nine offspring), detected using short-read sequencing. Two detected dnSVs are clusters of mutations. Mutation cluster 1 contains a de novo duplication, a dnSNV and a de novo deletion. Mutation cluster 2 contains a de novo deletion and three de novo duplications, of which one is inverted. Mutation cluster 2 is 25 kb in size, whereas mutation cluster 1 (197 bp) and the other two individual dnSVs (64 and 573 bp) are smaller. Only mutation cluster 2 could be phased and is located on the paternal haplotype. Mutation cluster 2 originates from both micro-homology and non-homology mutation mechanisms, whereas mutation cluster 1 and the other two dnSVs are caused by mutation mechanisms lacking sequence homology. The 64 bp deletion and mutation cluster 1 were validated through PCR. Lastly, the 64 bp deletion and the 573 bp duplication were validated in sequenced offspring of probands with three generations of sequence data.
CONCLUSIONS Our estimate of 0.108 dnSVs per generation in the swine germline is conservative, due to our small sample size and restricted possibilities of dnSV detection from short-read sequencing. The current study highlights the complexity of dnSVs and shows the potential of breeding programs for pigs and livestock species in general, to provide a suitable population structure for identification and characterisation of dnSVs.
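The point estimate quoted in the abstract follows directly from the reported counts: four dnSVs across 37 trios. The short sketch below reproduces that arithmetic; the helper name `per_generation_rate` is hypothetical, and the confidence interval is deliberately not recomputed because the abstract does not state which interval method was used.

```python
def per_generation_rate(n_events, n_offspring):
    """Events per offspring (one offspring per trio), as a simple proportion."""
    if n_offspring <= 0:
        raise ValueError("n_offspring must be positive")
    return n_events / n_offspring


n_dnsv, n_trios = 4, 37  # counts taken from the abstract
rate = per_generation_rate(n_dnsv, n_trios)
print(round(rate, 3))    # 0.108
print(n_trios / n_dnsv)  # 9.25, i.e. roughly one dnSV per nine offspring
```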
Affiliation(s)
- Marije J Steensma
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands.
- Y L Lee
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
- A C Bouwman
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
- C Pita Barros
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
- M F L Derks
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
- Topigs Norsvin Research Center, Schoenaker 6, Beuningen, 6641 SZ, the Netherlands
- M C A M Bink
- Hendrix Genetics, P.O. Box 114, Boxmeer, 5830 AC, the Netherlands
- B Harlizius
- Topigs Norsvin Research Center, Schoenaker 6, Beuningen, 6641 SZ, the Netherlands
- A E Huisman
- Hendrix Genetics, P.O. Box 114, Boxmeer, 5830 AC, the Netherlands
- R P M A Crooijmans
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
- M A M Groenen
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
- H A Mulder
- Wageningen University & Research Animal Breeding and Genomics, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
- C M Rochus
- University of Guelph, Centre for Genetic Improvement of Livestock, 50 Stone Rd E, Guelph, ON, N1G 2W1, Canada
7
Roberts MB, Schultz DT, Gatins R, Escalona M, Bernardi G. Chromosome-level genome of the three-spot damselfish, Dascyllus trimaculatus. G3 (Bethesda) 2023; 13:jkac339. [PMID: 36905099 PMCID: PMC10085752 DOI: 10.1093/g3journal/jkac339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Accepted: 09/14/2022] [Indexed: 04/12/2023]
Abstract
Damselfishes (Family: Pomacentridae) are a group of ecologically important, primarily coral reef fishes that include over 400 species. Damselfishes have been used as model organisms to study recruitment (anemonefishes), the effects of ocean acidification (spiny damselfish), population structure, and speciation (Dascyllus). The genus Dascyllus includes a group of small-bodied species, and a complex of relatively larger bodied species, the Dascyllus trimaculatus species complex, which comprises several species including D. trimaculatus itself. The three-spot damselfish, D. trimaculatus, is a widespread and common coral reef fish species found across the tropical Indo-Pacific. Here, we present the first genome assembly of this species. This assembly contains 910 Mb, 90% of the bases are in 24 chromosome-scale scaffolds, and the Benchmarking Universal Single-Copy Orthologs score of the assembly is 97.9%. Our findings confirm previous reports of a karyotype of 2n = 47 in D. trimaculatus in which one parent contributes 24 chromosomes and the other 23. We find evidence that this karyotype is the result of a heterozygous Robertsonian fusion. We also find that the D. trimaculatus chromosomes are each homologous with single chromosomes of the closely related clownfish species, Amphiprion percula. This assembly will be a valuable resource in the population genomics and conservation of damselfishes, and continued studies of the karyotypic diversity in this clade.
Affiliation(s)
- May B Roberts
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
- Darrin T Schultz
- Department of Molecular Evolution and Development, University of Vienna, Vienna 1010, Austria
- Monterey Bay Aquarium Research Institute, Moss Landing, CA 95039, USA
- Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
- Remy Gatins
- Department of Marine Sciences, Northeastern University, Boston, MA 02115, USA
- Merly Escalona
- Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
- Giacomo Bernardi
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
8
Fischer A, Lersch R, de Andrade Krätzig N, Strong A, Friedrich MJ, Weber J, Engleitner T, Öllinger R, Yen HY, Kohlhofer U, Gonzalez-Menendez I, Sailer D, Kogan L, Lahnalampi M, Laukkanen S, Kaltenbacher T, Klement C, Rezaei M, Ammon T, Montero JJ, Schneider G, Mayerle J, Heikenwälder M, Schmidt-Supprian M, Quintanilla-Martinez L, Steiger K, Liu P, Cadiñanos J, Vassiliou GS, Saur D, Lohi O, Heinäniemi M, Conte N, Bradley A, Rad L, Rad R. In vivo interrogation of regulatory genomes reveals extensive quasi-insufficiency in cancer evolution. Cell Genom 2023; 3:100276. [PMID: 36950387 PMCID: PMC10025556 DOI: 10.1016/j.xgen.2023.100276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 09/05/2022] [Accepted: 02/08/2023] [Indexed: 03/10/2023]
Abstract
In contrast to mono- or biallelic loss of tumor-suppressor function, effects of discrete gene dysregulations, as caused by non-coding (epi)genome alterations, are poorly understood. Here, by perturbing the regulatory genome in mice, we uncover pervasive roles of subtle gene expression variation in cancer evolution. Genome-wide screens characterizing 1,450 tumors revealed that such quasi-insufficiency is extensive across entities and displays diverse context dependencies, such as distinct cell-of-origin associations in T-ALL subtypes. We compile catalogs of non-coding regions linked to quasi-insufficiency, show their enrichment with human cancer risk variants, and provide functional insights by engineering regulatory alterations in mice. As such, kilo-/megabase deletions in a Bcl11b-linked non-coding region triggered aggressive malignancies, with allele-specific tumor spectra reflecting gradual gene dysregulations through modular and cell-type-specific enhancer activities. Our study constitutes a first survey toward a systems-level understanding of quasi-insufficiency in cancer and gives multifaceted insights into tumor evolution and the tissue-specific effects of non-coding mutations.
Affiliation(s)
- Anja Fischer
- Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Robert Lersch
- Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Niklas de Andrade Krätzig
- Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Alexander Strong
- The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, UK
- Mathias J. Friedrich
- Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Department of Medicine II, Klinikum rechts der Isar, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Julia Weber
- Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Thomas Engleitner
- Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Rupert Öllinger
- Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Hsi-Yu Yen
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Comparative Experimental Pathology, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Ursula Kohlhofer
- Institute of Pathology and Comprehensive Cancer Center, Eberhard Karls Universität Tübingen, 72076 Tübingen, Germany
- Irene Gonzalez-Menendez
- Institute of Pathology and Comprehensive Cancer Center, Eberhard Karls Universität Tübingen, 72076 Tübingen, Germany
- David Sailer
- Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Liz Kogan
- Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Mari Lahnalampi
- Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio, Finland
- Saara Laukkanen
- Faculty of Medicine and Health Technology, Tampere Center for Child, Adolescent and Maternal Health Research and Tays Cancer Center, Tampere University, Tampere, Finland
- Thorsten Kaltenbacher
- Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Christine Klement
- Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Majdaddin Rezaei
- Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Tim Ammon
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Institute of Experimental Hematology, TUM School of Medicine, Technical University of Munich, 81675 Munich, Germany
- Juan J. Montero
- Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Günter Schneider
- Department of Medicine II, Klinikum rechts der Isar, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Department of General, Visceral and Pediatric Surgery, University Medical Center Göttingen, 37075 Göttingen, Germany
- Julia Mayerle
- Medical Department II, University Hospital, LMU Munich, Munich, Germany
- Mathias Heikenwälder
- German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Division of Chronic Inflammation and Cancer, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Marc Schmidt-Supprian
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Institute of Experimental Hematology, TUM School of Medicine, Technical University of Munich, 81675 Munich, Germany
- Leticia Quintanilla-Martinez
- Institute of Pathology and Comprehensive Cancer Center, Eberhard Karls Universität Tübingen, 72076 Tübingen, Germany
- Katja Steiger
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Comparative Experimental Pathology, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Pentao Liu
- The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, UK
- Li Ka Shing Faculty of Medicine, Stem Cell and Regenerative Medicine Consortium, School of Biomedical Sciences, University of Hong Kong, Hong Kong, China
| | - Juan Cadiñanos
- Instituto de Medicina Oncológica y Molecular de Asturias (IMOMA), 33193 Oviedo, Spain
| | - George S. Vassiliou
- The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, UK
- Wellcome Trust-MRC Stem Cell Institute, Cambridge Biomedical Campus, University of Cambridge, Cambridge CB2 0XY, UK
- Department of Haematology, Cambridge University Hospitals NHS Trust, Cambridge CB2 0PT, UK
| | - Dieter Saur
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Department of Medicine II, Klinikum rechts der Isar, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Institute for Experimental Cancer Therapy, School of Medicine, Technische Universität München, 81675 Munich, Germany
| | - Olli Lohi
- Faculty of Medicine and Health Technology, Tampere Center for Child, Adolescent and Maternal Health Research and Tays Cancer Center, Tampere University, Tampere, Finland
| | - Merja Heinäniemi
- Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio, Finland
| | - Nathalie Conte
- The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, UK
| | - Allan Bradley
- The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, UK
- Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), University of Cambridge, Puddicombe Way, Cambridge CB2 0AW, UK
| | - Lena Rad
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- Institute for Experimental Cancer Therapy, School of Medicine, Technische Universität München, 81675 Munich, Germany
| | - Roland Rad
- Institute of Molecular Oncology and Functional Genomics, School of Medicine, Technische Universität München, 81675 Munich, Germany
- Center for Translational Cancer Research (TranslaTUM), School of Medicine, Technische Universität München, 81675 Munich, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Department of Medicine II, Klinikum rechts der Isar, School of Medicine, Technische Universität München, 81675 Munich, Germany
|
9
|
The oxidative phosphorylation inhibitor IM156 suppresses B-cell activation by regulating mitochondrial membrane potential and contributes to the mitigation of systemic lupus erythematosus. Kidney Int 2023; 103:343-356. [PMID: 36332729 DOI: 10.1016/j.kint.2022.09.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2021] [Revised: 09/09/2022] [Accepted: 09/22/2022] [Indexed: 11/15/2022]
Abstract
Current treatment strategies for autoimmune diseases may not sufficiently control aberrant metabolism in B-cells. To address this concern, we investigated a biguanide derivative, IM156, as a potential regulator for B-cell metabolism in vitro and in vivo on overactive B-cells stimulated by the pro-inflammatory receptor TLR-9 agonist CpG oligodeoxynucleotide, a mimic of viral/bacterial DNA. Using RNA sequencing, we analyzed the B-cell transcriptome expression, identifying the major molecular pathways affected by IM156 in vivo. We also evaluated the anti-inflammatory effects of IM156 in lupus-prone NZB/W F1 mice. CD19+B-cells exhibited higher mitochondrial mass and mitochondrial membrane potential compared to T-cells and were more susceptible to IM156-mediated oxidative phosphorylation inhibition. In vivo, IM156 inhibited mitochondrial oxidative phosphorylation, cell cycle progression, plasmablast differentiation, and activation marker levels in CpG oligodeoxynucleotide-stimulated mouse spleen B-cells. Interestingly, IM156 treatment significantly increased overall survival, reduced glomerulonephritis and inhibited B-cell activation in the NZB/W F1 mice. Thus, our data indicated that IM156 suppressed the mitochondrial membrane potentials of activated B-cells in mice, contributing to the mitigation of lupus activity. Hence, IM156 may represent a therapeutic alternative for autoimmune disease mediated by B-cell hyperactivity.
|
10
|
Gobet N, Jan M, Franken P, Xenarios I. Towards mouse genetic-specific RNA-sequencing read mapping. PLoS Comput Biol 2022; 18:e1010552. [PMID: 36155976 PMCID: PMC9536569 DOI: 10.1371/journal.pcbi.1010552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Revised: 10/06/2022] [Accepted: 09/07/2022] [Indexed: 11/18/2022] Open
Abstract
Genetic variations affect behavior and cause disease but understanding how these variants drive complex traits is still an open question. A common approach is to link the genetic variants to intermediate molecular phenotypes such as the transcriptome using RNA-sequencing (RNA-seq). Paradoxically, these variants between the samples are usually ignored at the beginning of RNA-seq analyses of many model organisms. This can skew the transcriptome estimates that are used later for downstream analyses, such as expression quantitative trait locus (eQTL) detection. Here, we assessed the impact of reference-based analysis on the transcriptome and eQTLs in a widely-used mouse genetic population: the BXD panel of recombinant inbred lines. We highlight existing reference bias in the transcriptome data analysis and propose practical solutions which combine available genetic variants, genotypes, and genome reference sequence. The use of custom BXD line references improved downstream analysis compared to classical genome reference. These insights would likely benefit genetic studies with a transcriptomic component and demonstrate that genome references need to be reassessed and improved.
To understand how genetic variations affect behavior and cause disease it is common to quantify expression of transcripts by sequencing. Transcripts are extracted, fragmented, and the sequence of the fragments read. An important step for their quantification is to virtually assign the different fragments to the transcript they originate from using a reference genome. Reference genomes are costly to build, so usually only one high-quality reference per animal model species is available. When comparing genetically different individuals, using a single reference may introduce a bias because it might be more similar to some individuals than to others. Paradoxically, the variations at the core of genetic studies are thus ignored at the start of the analysis. We built customized references with known genetic variants for each of the mouse lines we had and quantified the impact of the reference at different levels of the bioinformatic analysis. We found that using customized references reduced the bias compared to using a single reference. Our study uses publicly available data and tools, so others can easily implement this improvement in their analyses.
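The reference-bias idea in the abstract above can be illustrated with a toy sketch. This is not the authors' pipeline: the sequences, the variant positions, and the helper names `apply_snvs` and `mismatches` are hypothetical, and only single-nucleotide substitutions are handled (no indels, no real alignment). It shows only that a read from a variant-carrying line mismatches the shared reference but matches a reference customized with that line's known variants.

```python
from typing import Dict, Tuple


def apply_snvs(reference: str, snvs: Dict[int, Tuple[str, str]]) -> str:
    """Return a copy of `reference` with each SNV's alternate base substituted.

    `snvs` maps a 0-based position to (expected reference base, alternate base).
    """
    seq = list(reference)
    for pos, (ref_base, alt_base) in snvs.items():
        if seq[pos] != ref_base:
            raise ValueError(f"reference base at {pos} is {seq[pos]}, not {ref_base}")
        seq[pos] = alt_base
    return "".join(seq)


def mismatches(read: str, reference: str, start: int) -> int:
    """Count mismatching bases when `read` is placed at `start` on `reference`."""
    window = reference[start:start + len(read)]
    return sum(1 for a, b in zip(read, window) if a != b)


# Hypothetical shared reference and the known variants of one inbred line.
reference = "ACGTACGTAC"
line_snvs = {3: ("T", "G"), 7: ("T", "A")}
custom = apply_snvs(reference, line_snvs)

# A read sequenced from that line carries both variants.
read = "ACGGACGA"
print(mismatches(read, reference, 0))  # mismatches against the shared reference
print(mismatches(read, custom, 0))     # mismatches against the customized reference
```

In practice the variants would come from a variant call file and the reads would be aligned with a dedicated mapper; the sketch only makes the source of the bias concrete.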
Affiliation(s)
- Nastassia Gobet
- Centre for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Vital-IT, Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Maxime Jan
- Centre for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Bioinformatics Competence Center, University of Lausanne, Lausanne, Switzerland
| | - Paul Franken
- Centre for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
| | - Ioannis Xenarios
- Ludwig Cancer Research/CHUV-UNIL, Lausanne, Switzerland
- Health 2030 Genome Center, Geneva, Switzerland
- * E-mail:
|
11
|
Le CT, Price EP, Sarovich DS, Nguyen TTA, Powell D, Vu-Khac H, Kurtböke Dİ, Knibb W, Chen SC, Katouli M. Comparative genomics of Nocardia seriolae reveals recent importation and subsequent widespread dissemination in mariculture farms in the South Central Coast region, Vietnam. Microb Genom 2022; 8. [PMID: 35786440 PMCID: PMC9455698 DOI: 10.1099/mgen.0.000845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
Abstract
Between 2010 and 2015, nocardiosis outbreaks caused by Nocardia seriolae affected many permit farms throughout Vietnam, causing mass fish mortalities. To understand the biology, origin and epidemiology of these outbreaks, 20 N. seriolae strains collected from farms in four provinces in the South Central Coast region of Vietnam, along with two Taiwanese strains, were analysed using genetics and genomics. PFGE identified a single cluster amongst all Vietnamese strains that was distinct from the Taiwanese strains. Like the PFGE findings, phylogenomic and SNP genotyping analyses revealed that all Vietnamese N. seriolae strains belonged to a single, unique clade. Strains fell into two subclades that differed by 103 SNPs, with almost no diversity within clades (0–5 SNPs). There was no association between geographical origin and subclade placement, suggesting frequent N. seriolae transmission between Vietnamese mariculture facilities during the outbreaks. The Vietnamese strains shared a common ancestor with strains from Japan and China, with the closest strain, UTF1 from Japan, differing by just 220 SNPs from the Vietnamese ancestral node. Draft Vietnamese genomes range from 7.55 to 7.96 Mbp in size, have an average G+C content of 68.2% and encode 7602–7958 predicted genes. Several putative virulence factors were identified, including genes associated with host cell adhesion, invasion, intracellular survival, antibiotic and toxic compound resistance, and haemolysin biosynthesis. Our findings provide important new insights into the epidemiology and pathogenicity of N. seriolae and will aid future vaccine development and disease management strategies, with the ultimate goal of nocardiosis-free aquaculture.
Affiliation(s)
- Cuong T. Le
- Centre for Bioinnovation, University of the Sunshine Coast, Sippy Downs, Queensland, Australia
- Institute for Aquaculture, Nha Trang University, Nha Trang, Vietnam
| | - Erin P. Price
- Centre for Bioinnovation, University of the Sunshine Coast, Sippy Downs, Queensland, Australia
- Sunshine Coast Health Institute, Birtinya, Queensland, Australia
| | - Derek S. Sarovich
- Centre for Bioinnovation, University of the Sunshine Coast, Sippy Downs, Queensland, Australia
- Sunshine Coast Health Institute, Birtinya, Queensland, Australia
| | - Thu T. A. Nguyen
- Institute for Biotechnology and Environment, Nha Trang University, Nha Trang, Vietnam
| | - Daniel Powell
- Centre for Bioinnovation, University of the Sunshine Coast, Sippy Downs, Queensland, Australia
| | - Hung Vu-Khac
- Central Vietnam Veterinary Institute, Nha Trang, Vietnam
| | - D. İpek Kurtböke
- Centre for Bioinnovation, University of the Sunshine Coast, Sippy Downs, Queensland, Australia
| | - Wayne Knibb
- Centre for Bioinnovation, University of the Sunshine Coast, Sippy Downs, Queensland, Australia
| | - Shih-Chu Chen
- Department of Veterinary Medicine, College of Veterinary Medicine, National Pingtung University of Science and Technology, Pingtung, Taiwan, ROC
| | - Mohammad Katouli
- Centre for Bioinnovation, University of the Sunshine Coast, Sippy Downs, Queensland, Australia
- School of Science, Technology and Engineering, University of the Sunshine Coast, Sippy Downs, Queensland, Australia
- *Correspondence: Mohammad Katouli,
|
12
|
Tak YE, Boulay G, Lee L, Iyer S, Perry NT, Schultz HT, Garcia SP, Broye L, Horng JE, Rengarajan S, Naigles B, Volorio A, Sander JD, Gong J, Riggi N, Joung JK, Rivera MN. Genome-wide functional perturbation of human microsatellite repeats using engineered zinc finger transcription factors. CELL GENOMICS 2022; 2. [PMID: 35967079 PMCID: PMC9374162 DOI: 10.1016/j.xgen.2022.100119] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/05/2023]
Affiliation(s)
- Y. Esther Tak
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
| | - Gaylor Boulay
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Lukuo Lee
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
| | - Sowmya Iyer
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
| | - Nicholas T. Perry
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, MA, USA
| | - Hayley T. Schultz
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, MA, USA
| | - Sara P. Garcia
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
| | - Liliane Broye
- Institute of Pathology, Department of Experimental Pathology, Centre Hospitalier Universitaire Vaudois, University of Lausanne, 1011 Lausanne, Switzerland
| | - Joy E. Horng
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, MA, USA
| | - Shruthi Rengarajan
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
| | - Beverly Naigles
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
| | - Angela Volorio
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
- Institute of Pathology, Department of Experimental Pathology, Centre Hospitalier Universitaire Vaudois, University of Lausanne, 1011 Lausanne, Switzerland
| | - Jeffry D. Sander
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
| | - Jingyi Gong
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, MA, USA
| | - Nicolò Riggi
- Institute of Pathology, Department of Experimental Pathology, Centre Hospitalier Universitaire Vaudois, University of Lausanne, 1011 Lausanne, Switzerland
- Corresponding author
| | - J. Keith Joung
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
- Corresponding author
| | - Miguel N. Rivera
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Corresponding author
|
13
|
Mortazavi M, Ren Y, Saini S, Antaki D, St. Pierre CL, Williams A, Sohni A, Wilkinson MF, Gymrek M, Sebat J, Palmer AA. SNPs, short tandem repeats, and structural variants are responsible for differential gene expression across C57BL/6 and C57BL/10 substrains. CELL GENOMICS 2022; 2:100102. [PMID: 35720252 PMCID: PMC9205302 DOI: 10.1016/j.xgen.2022.100102] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/19/2021] [Revised: 11/22/2021] [Accepted: 02/02/2022] [Indexed: 12/13/2022]
Abstract
Mouse substrains are an invaluable model for understanding disease. We compared C57BL/6J, which is the most commonly used inbred mouse strain, with eight C57BL/6 and five C57BL/10 closely related inbred substrains. Whole-genome sequencing and RNA-sequencing analysis yielded 352,631 SNPs, 109,096 indels, 150,344 short tandem repeats (STRs), 3,425 structural variants (SVs), and 2,826 differentially expressed genes (DE genes) among these 14 strains; 312,981 SNPs (89%) distinguished the B6 and B10 lineages. These SNPs were clustered into 28 short segments that are likely due to introgressed haplotypes rather than new mutations. Outside of these introgressed regions, we identified 53 SVs, protein-truncating SNPs, and frameshifting indels that were associated with DE genes. Our results can be used for both forward and reverse genetic approaches and illustrate how introgression and mutational processes give rise to differences among these widely used inbred substrains.
Affiliation(s)
- Milad Mortazavi
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
| | - Yangsu Ren
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
| | - Shubham Saini
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
| | - Danny Antaki
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA, USA
| | - Celine L. St. Pierre
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
| | - April Williams
- Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Abhishek Sohni
- Department of Obstetrics, Gynecology and Reproductive Sciences, University of California San Diego, La Jolla, CA, USA
| | - Miles F. Wilkinson
- Department of Obstetrics, Gynecology and Reproductive Sciences, University of California San Diego, La Jolla, CA, USA
- Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA
| | - Melissa Gymrek
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Jonathan Sebat
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA, USA
- Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA
| | - Abraham A. Palmer
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA
|
14
|
Gordeeva V, Sharova E, Arapidi G. Progress in Methods for Copy Number Variation Profiling. Int J Mol Sci 2022; 23:ijms23042143. [PMID: 35216262 PMCID: PMC8879278 DOI: 10.3390/ijms23042143] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/17/2022] [Revised: 02/09/2022] [Accepted: 02/11/2022] [Indexed: 02/04/2023] Open
Abstract
Copy number variations (CNVs) are the predominant class of structural genomic variations involved in the processes of evolutionary adaptation, genomic disorders, and disease progression. Compared with single-nucleotide variants, there have been challenges associated with the detection of CNVs owing to their diverse sizes. However, the field has seen significant progress in the past 20–30 years. This has been made possible due to the rapid development of molecular diagnostic methods which ensure a more detailed view of the genome structure, further complemented by recent advances in computational methods. Here, we review the major approaches that have been used to routinely detect CNVs, ranging from cytogenetics to the latest sequencing technologies, and then cover their specific features.
Affiliation(s)
- Veronika Gordeeva
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russia
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russia; (E.S.); (G.A.)
- Moscow Institute of Physics and Technology, National Research University, Moscow Oblast, 141701 Moscow, Russia
- Correspondence:
| | - Elena Sharova
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russia; (E.S.); (G.A.)
| | - Georgij Arapidi
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russia; (E.S.); (G.A.)
- Moscow Institute of Physics and Technology, National Research University, Moscow Oblast, 141701 Moscow, Russia
- Shemyakin–Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia
|
15
|
Baker LJ, Reich HG, Kitchen SA, Grace Klinges J, Koch HR, Baums IB, Muller EM, Thurber RV. The coral symbiont Candidatus Aquarickettsia is variably abundant in threatened Caribbean acroporids and transmitted horizontally. THE ISME JOURNAL 2022; 16:400-411. [PMID: 34363004 PMCID: PMC8776821 DOI: 10.1038/s41396-021-01077-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 01/27/2021] [Revised: 06/28/2021] [Accepted: 07/22/2021] [Indexed: 02/07/2023]
Abstract
The symbiont "Candidatus Aquarickettsia rohweri" infects a diversity of aquatic hosts. In the threatened Caribbean coral, Acropora cervicornis, Aquarickettsia proliferates in response to increased nutrient exposure, resulting in suppressed growth and increased disease susceptibility and mortality of coral. This study evaluated the extent, as well as the ecology and evolution of Aquarickettsia infecting threatened corals, Ac. cervicornis, and Ac. palmata and their hybrid ("Ac. prolifera"). Aquarickettsia was found in all acroporids, with coral host and geographic location impacting the infection magnitude. Phylogenomic and genome-wide single-nucleotide variant analysis of Aquarickettsia found phylogenetic clustering by geographic region, not by coral taxon. Analysis of Aquarickettsia fixation indices suggests multiple sequential infections of the same coral colony are unlikely. Furthermore, relative to other Rickettsiales species, Aquarickettsia is undergoing positive selection, with Florida populations experiencing greater positive selection relative to other Caribbean locations. This may be due in part to Aquarickettsia proliferating in response to greater nutrient stress in Florida, as indicated by greater in situ replication rates in these corals. Aquarickettsia was not found to significantly codiversify with either the coral animal or the coral's algal symbiont (Symbiodinium "fitti"). Quantitative PCR analysis showed that gametes, larvae, recruits, and juveniles from susceptible, captive-reared coral genets were not infected with Aquarickettsia. Thus, horizontal transmission of Aquarickettsia via coral mucocytes or an unidentified host is more likely. The prevalence of Aquarickettsia in Ac. cervicornis and its high abundance in the Florida coral population suggests that coral disease mitigation efforts focus on preventing early infection via horizontal transmission.
Affiliation(s)
- Lydia J Baker
- Department of Microbiology, Oregon State University, Corvallis, OR, USA
| | - Hannah G Reich
- Department of Biology, The Pennsylvania State University, University Park, PA, USA
| | - Sheila A Kitchen
- Division of Biology and Biological Engineering, California Institute of Science and Technology, Pasadena, CA, USA
| | - J Grace Klinges
- Department of Microbiology, Oregon State University, Corvallis, OR, USA
| | - Hanna R Koch
- Coral Restoration Program, Mote Marine Laboratory, Summerland Key, FL, USA
| | - Iliana B Baums
- Department of Biology, The Pennsylvania State University, University Park, PA, USA
| | - Erinn M Muller
- Coral Restoration Program, Mote Marine Laboratory, Summerland Key, FL, USA
|
16
|
Tong L, Wu H, Wang MD, Wang G. Introduction of medical genomics and clinical informatics integration for p-Health care. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2022; 190:1-37. [DOI: 10.1016/bs.pmbts.2022.05.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
17
|
Symer DE, Akagi K, Geiger HM, Song Y, Li G, Emde AK, Xiao W, Jiang B, Corvelo A, Toussaint NC, Li J, Agrawal A, Ozer E, El-Naggar AK, Du Z, Shewale JB, Stache-Crain B, Zucker M, Robine N, Coombes KR, Gillison ML. Diverse tumorigenic consequences of human papillomavirus integration in primary oropharyngeal cancers. Genome Res 2021; 32:55-70. [PMID: 34903527 PMCID: PMC8744672 DOI: 10.1101/gr.275911.121] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 06/20/2021] [Accepted: 11/10/2021] [Indexed: 11/25/2022]
Abstract
Human papillomavirus (HPV) causes 5% of all cancers and frequently integrates into host chromosomes. The HPV oncoproteins E6 and E7 are necessary but insufficient for cancer formation, indicating that additional secondary genetic events are required. Here, we investigate potential oncogenic impacts of virus integration. Analysis of 105 HPV-positive oropharyngeal cancers by whole-genome sequencing detects virus integration in 77%, revealing five statistically significant sites of recurrent integration near genes that regulate epithelial stem cell maintenance (i.e., SOX2, TP63, FGFR, MYC) and immune evasion (i.e., CD274). Genomic copy number hyperamplification is enriched 16-fold near HPV integrants, and the extent of focal host genomic instability increases with their local density. The frequency of genes expressed at extreme outlier levels is increased 86-fold within ±150 kb of integrants. Across 95% of tumors with integration, host gene transcription is disrupted via intragenic integrants, chimeric transcription, outlier expression, gene breaking, and/or de novo expression of noncoding or imprinted genes. We conclude that virus integration can contribute to carcinogenesis in a large majority of HPV-positive oropharyngeal cancers by inducing extensive disruption of host genome structure and gene expression.
Affiliation(s)
- David E Symer
- Department of Lymphoma and Myeloma, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | - Keiko Akagi
- Department of Thoracic/Head and Neck Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | | | - Yang Song
- Department of Thoracic/Head and Neck Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | - Gaiyun Li
- Department of Thoracic/Head and Neck Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | | | - Weihong Xiao
- Department of Thoracic/Head and Neck Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | - Bo Jiang
- Department of Thoracic/Head and Neck Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | - André Corvelo
- New York Genome Center, New York, New York 10013, USA
| | | | - Jingfeng Li
- Division of Medical Oncology, Department of Internal Medicine, Ohio State University, Columbus, Ohio 43210, USA
| | - Amit Agrawal
- Department of Otolaryngology - Head and Neck Surgery, Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, USA
| | - Enver Ozer
- Department of Otolaryngology - Head and Neck Surgery, Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, USA
| | - Adel K El-Naggar
- Division of Pathology, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | - Zoe Du
- Department of Lymphoma and Myeloma, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | - Jitesh B Shewale
- Department of Thoracic/Head and Neck Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | | | - Mark Zucker
- Department of Biomedical Informatics, Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, USA
| | | | - Kevin R Coombes
- Department of Biomedical Informatics, Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, USA
| | - Maura L Gillison
- Department of Thoracic/Head and Neck Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
|
18
|
Ho PW, Piampongsant S, Gallone B, Del Cortona A, Peeters PJ, Reijbroek F, Verbaet J, Herrera B, Cortebeeck J, Nolmans R, Saels V, Steensels J, Jarosz DF, Verstrepen KJ. Massive QTL analysis identifies pleiotropic genetic determinants for stress resistance, aroma formation, and ethanol, glycerol and isobutanol production in Saccharomyces cerevisiae. BIOTECHNOLOGY FOR BIOFUELS 2021; 14:211. [PMID: 34727964 PMCID: PMC8564995 DOI: 10.1186/s13068-021-02059-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/15/2021] [Accepted: 10/16/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND: The brewer's yeast Saccharomyces cerevisiae is exploited in several industrial processes, ranging from food and beverage fermentation to the production of biofuels, pharmaceuticals and complex chemicals. The large genetic and phenotypic diversity within this species offers a formidable natural resource to obtain superior strains, hybrids, and variants. However, most industrially relevant traits in S. cerevisiae strains are controlled by multiple genetic loci. Over the past years, several studies have identified some of these QTLs. However, because these studies only focus on a limited set of traits and often use different techniques and starting strains, a global view of industrially relevant QTLs is still missing.
RESULTS: Here, we combined the power of 1125 fully sequenced inbred segregants with high-throughput phenotyping methods to identify as many as 678 QTLs across 18 different traits relevant to industrial fermentation processes, including production of ethanol, glycerol, isobutanol, acetic acid, sulfur dioxide, flavor-active esters, as well as resistance to ethanol, acetic acid, sulfite and high osmolarity. We identified and confirmed several variants that are associated with multiple different traits, indicating that many QTLs are pleiotropic. Moreover, we show that both rare and common variants, as well as variants located in coding and non-coding regions all contribute to the phenotypic variation.
CONCLUSIONS: Our findings represent an important step in our understanding of the genetic underpinnings of industrially relevant yeast traits and open new routes to study complex genetics and genetic interactions as well as to engineer novel, superior industrial yeasts. Moreover, the major role of rare variants suggests that there is a plethora of different combinations of mutations that can be explored in genome editing.
Affiliation(s)
- Ping-Wei Ho
- VIB–KU Leuven Center for Microbiology, Leuven, Belgium
- CMPG Laboratory of Genetics and Genomics, Department M2S, KU Leuven, Leuven, Belgium
- Leuven Institute for Beer Research, Leuven, Belgium
- Supinya Piampongsant
- VIB–KU Leuven Center for Microbiology, Leuven, Belgium
- CMPG Laboratory of Genetics and Genomics, Department M2S, KU Leuven, Leuven, Belgium
- Leuven Institute for Beer Research, Leuven, Belgium
- Brigida Gallone
- VIB–KU Leuven Center for Microbiology, Leuven, Belgium
- CMPG Laboratory of Genetics and Genomics, Department M2S, KU Leuven, Leuven, Belgium
- Leuven Institute for Beer Research, Leuven, Belgium
- Andrea Del Cortona
- VIB–KU Leuven Center for Microbiology, Leuven, Belgium
- CMPG Laboratory of Genetics and Genomics, Department M2S, KU Leuven, Leuven, Belgium
- Leuven Institute for Beer Research, Leuven, Belgium
- Pieter-Jan Peeters
- VIB–KU Leuven Center for Microbiology, Leuven, Belgium
- CMPG Laboratory of Genetics and Genomics, Department M2S, KU Leuven, Leuven, Belgium
- Leuven Institute for Beer Research, Leuven, Belgium
- Frank Reijbroek
- VIB–KU Leuven Center for Microbiology, Leuven, Belgium
- CMPG Laboratory of Genetics and Genomics, Department M2S, KU Leuven, Leuven, Belgium
- Leuven Institute for Beer Research, Leuven, Belgium
- Jules Verbaet
- VIB–KU Leuven Center for Microbiology, Leuven, Belgium
- CMPG Laboratory of Genetics and Genomics, Department M2S, KU Leuven, Leuven, Belgium
- Leuven Institute for Beer Research, Leuven, Belgium
- Beatriz Herrera
- VIB–KU Leuven Center for Microbiology, Leuven, Belgium
- CMPG Laboratory of Genetics and Genomics, Department M2S, KU Leuven, Leuven, Belgium
- Leuven Institute for Beer Research, Leuven, Belgium
- Jeroen Cortebeeck
- VIB–KU Leuven Center for Microbiology, Leuven, Belgium
- CMPG Laboratory of Genetics and Genomics, Department M2S, KU Leuven, Leuven, Belgium
- Leuven Institute for Beer Research, Leuven, Belgium
- Robbe Nolmans
- VIB–KU Leuven Center for Microbiology, Leuven, Belgium
- CMPG Laboratory of Genetics and Genomics, Department M2S, KU Leuven, Leuven, Belgium
- Leuven Institute for Beer Research, Leuven, Belgium
- Veerle Saels
- VIB–KU Leuven Center for Microbiology, Leuven, Belgium
- CMPG Laboratory of Genetics and Genomics, Department M2S, KU Leuven, Leuven, Belgium
- Leuven Institute for Beer Research, Leuven, Belgium
- Jan Steensels
- VIB–KU Leuven Center for Microbiology, Leuven, Belgium
- CMPG Laboratory of Genetics and Genomics, Department M2S, KU Leuven, Leuven, Belgium
- Leuven Institute for Beer Research, Leuven, Belgium
- Daniel F. Jarosz
- Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, CA 94305 USA
- Department of Developmental Biology, Stanford University School of Medicine, Stanford, CA 94305 USA
- Kevin J. Verstrepen
- VIB–KU Leuven Center for Microbiology, Leuven, Belgium
- CMPG Laboratory of Genetics and Genomics, Department M2S, KU Leuven, Leuven, Belgium
- Leuven Institute for Beer Research, Leuven, Belgium
- Labo VIB-CMPG, Bio-Incubator, Gaston Geenslaan 1, 3001 Leuven, Heverlee Belgium
19
|
Young GR, Ferron AKW, Panova V, Eksmond U, Oliver PL, Kassiotis G, Stoye JP. Gv1, a Zinc Finger Gene Controlling Endogenous MLV Expression. Mol Biol Evol 2021; 38:2468-2474. [PMID: 33560369 PMCID: PMC8136514 DOI: 10.1093/molbev/msab039] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/24/2022]
Abstract
The genomes of inbred mice harbor around 50 endogenous murine leukemia virus (MLV) loci, although the specific complement varies greatly between strains. The Gv1 locus is known to control the transcription of endogenous MLVs and to be the dominant determinant of cell-surface presentation of MLV envelope, the GIX antigen. Here, we identify a single Krüppel-associated box zinc finger protein (ZFP) gene, Zfp998, as Gv1 and show it to be necessary and sufficient to determine the GIX+ phenotype. By long-read sequencing of bacterial artificial chromosome clones from 129 mice, the prototypic GIX+ strain, we reveal the source of sufficiency and deficiency as splice-acceptor variations and highlight the varying origins of the chromosomal region encompassing Gv1. Zfp998 becomes the second identified ZFP gene responsible for epigenetic suppression of endogenous MLVs in mice and further highlights the prominent role of this gene family in control of endogenous retroviruses.
Affiliation(s)
- George R Young
- Retrovirus-host Interactions Laboratory, The Francis Crick Institute, London, UK
- Aaron K W Ferron
- Retrovirus-host Interactions Laboratory, The Francis Crick Institute, London, UK
- Veera Panova
- Retroviral Immunology, The Francis Crick Institute, London, UK
- Urszula Eksmond
- Retroviral Immunology, The Francis Crick Institute, London, UK
- George Kassiotis
- Retroviral Immunology, The Francis Crick Institute, London, UK; Department of Infectious Disease, Imperial College London, London, UK
- Jonathan P Stoye
- Retrovirus-host Interactions Laboratory, The Francis Crick Institute, London, UK; Department of Infectious Disease, Imperial College London, London, UK
20
|
Belyeu JR, Brand H, Wang H, Zhao X, Pedersen BS, Feusier J, Gupta M, Nicholas TJ, Brown J, Baird L, Devlin B, Sanders SJ, Jorde LB, Talkowski ME, Quinlan AR. De novo structural mutation rates and gamete-of-origin biases revealed through genome sequencing of 2,396 families. Am J Hum Genet 2021; 108:597-607. [PMID: 33675682 PMCID: PMC8059337 DOI: 10.1016/j.ajhg.2021.02.012] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Received: 11/04/2020] [Accepted: 02/12/2021] [Indexed: 01/05/2023]
Abstract
Each human genome includes de novo mutations that arose during gametogenesis. While these germline mutations represent a fundamental source of new genetic diversity, they can also create deleterious alleles that impact fitness. Whereas the rate and patterns of point mutations in the human germline are now well understood, far less is known about the frequency and features that impact de novo structural variants (dnSVs). We report a family-based study of germline mutations among 9,599 human genomes from 33 multigenerational CEPH-Utah families and 2,384 families from the Simons Foundation Autism Research Initiative. We find that de novo structural mutations detected by alignment-based, short-read WGS occur at an overall rate of at least 0.160 events per genome in unaffected individuals, and we observe a significantly higher rate (0.206 per genome) in ASD-affected individuals. In both probands and unaffected samples, nearly 73% of de novo structural mutations arose in paternal gametes, and we predict most de novo structural mutations to be caused by mutational mechanisms that do not require sequence homology. After multiple testing correction, we did not observe a statistically significant correlation between parental age and the rate of de novo structural variation in offspring. These results highlight that a spectrum of mutational mechanisms contribute to germline structural mutations and that these mechanisms most likely have markedly different rates and selective pressures than those leading to point mutations.
Affiliation(s)
- Jonathan R Belyeu
- Department of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
- Harrison Brand
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02114, USA
- Harold Wang
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02114, USA
- Xuefang Zhao
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02114, USA
- Brent S Pedersen
- Department of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
- Julie Feusier
- Huntsman Cancer Institute, University of Utah, Salt Lake City, UT 84112, USA
- Meenal Gupta
- Department of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
- Thomas J Nicholas
- Department of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
- Joseph Brown
- Department of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
- Lisa Baird
- Department of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
- Bernie Devlin
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
- Stephan J Sanders
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94143, USA
- Lynn B Jorde
- Department of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA; Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT 84112, USA
- Michael E Talkowski
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02114, USA.
- Aaron R Quinlan
- Department of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA; Department of Biomedical Informatics, University of Utah, Salt Lake City, UT 84112, USA; Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT 84112, USA.
21
|
Fujiwara K. Novel Genetic Rearrangements in Hepatitis B Virus: Complex Structural Variations and Structural Variation Polymorphisms. Viruses 2021; 13:473. [PMID: 33809245 PMCID: PMC8000817 DOI: 10.3390/v13030473] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/31/2021] [Revised: 03/06/2021] [Accepted: 03/11/2021] [Indexed: 12/11/2022]
Abstract
Chronic hepatitis B virus (HBV) causes serious clinical problems, such as liver cirrhosis and hepatocellular carcinoma. Current antiviral treatments suppress HBV; however, the clinical cure rate remains low. Basic research on HBV is indispensable to eradicate and cure HBV. Genetic alterations are defined by nucleotide substitutions and canonical forms of structural variations (SVs), such as insertion, deletion and duplication. Additionally, genetic changes inconsistent with the canonical forms have been reported, and these have been termed complex SVs. Detailed analyses of HBV using bioinformatical applications have detected complex SVs in HBV genomes. Sequence gaps and low sequence similarity have been observed in the region containing complex SVs. Additionally, insertional motif sequences have been observed in HBV strains with complex SVs. Following the analyses of complex SVs in the HBV genome, the role of SVs in the genetic diversity of orthohepadnavirus has been investigated. SV polymorphisms have been detected in comparisons of several species of orthohepadnaviruses. As mentioned, complex SVs are composed of multiple SVs. On the contrary, SV polymorphisms are observed as insertions of different SVs. Up to a certain point, nucleotide substitutions cause genetic differences. However, at some point, the nucleotide sequences are split into several particular patterns. These SVs have been observed as polymorphic changes. Different species of orthohepadnaviruses possess SVs which are unique and specific to a certain host of the virus. Studies have shown that SVs play an important role in the HBV genome. Further studies are required to elucidate their virologic and clinical roles.
Affiliation(s)
- Kei Fujiwara
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya 467-8601, Japan
22
|
Detecting Causal Variants in Mendelian Disorders Using Whole-Genome Sequencing. Methods Mol Biol 2021; 2243:1-25. [PMID: 33606250 DOI: 10.1007/978-1-0716-1103-6_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
Increasingly affordable sequencing technologies are revolutionizing the field of genomic medicine. It is now feasible to interrogate all major classes of variation in an individual across the entire genome for less than $1000 USD. While the generation of patient sequence information using these technologies has become routine, the analysis and interpretation of these data remain the greatest obstacle to widespread clinical implementation. This chapter summarizes the steps to identify, annotate, and prioritize variant information required for clinical report generation. We discuss methods to detect each variant class and describe strategies to increase the likelihood of detecting causal variant(s) in Mendelian disease. Lastly, we describe a sample workflow for synthesizing large amounts of genetic information into concise clinical reports.
23
|
Imai R, Tsuda Y, Ebihara A, Matsumoto S, Tezuka A, Nagano AJ, Ootsuki R, Watano Y. Mating system evolution and genetic structure of diploid sexual populations of Cyrtomium falcatum in Japan. Sci Rep 2021; 11:3124. [PMID: 33542454 PMCID: PMC7862634 DOI: 10.1038/s41598-021-82731-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2020] [Accepted: 01/12/2021] [Indexed: 11/09/2022]
Abstract
Evolution of mating systems has become one of the most important research areas in evolutionary biology. Cyrtomium falcatum is a homosporous fern species native to eastern Asia. Two subspecies belonging to a sexual diploid race of C. falcatum are recognized: subsp. littorale and subsp. australe. Subspecies littorale shows intermediate selfing rates, while subsp. australe is an obligate outcrosser. We aimed to evaluate the process of mating system evolution and divergence for the two subspecies using restriction site associated DNA sequencing (RAD-seq). The results showed that subsp. littorale had lower genetic diversity and stronger genetic drift than subsp. australe. Fluctuations in the effective population size over time were evaluated by extended Bayesian skyline plot and Stairway plot analyses, both of which revealed a severe population bottleneck about 20,000 years ago in subsp. littorale. This bottleneck and the subsequent range expansion after the Last Glacial Maximum (LGM) appear to have played an important role in the divergence of the two subspecies and the evolution of selfing in subsp. littorale. These results shed new light on the relationship between mating system evolution and past demographic change in fern species.
Affiliation(s)
- Ryosuke Imai
- Sugadaira Research Station, Mountain Science Center, University of Tsukuba, Sugadaira, Ueda, Nagano, 386-2204, Japan
- Yoshiaki Tsuda
- Sugadaira Research Station, Mountain Science Center, University of Tsukuba, Sugadaira, Ueda, Nagano, 386-2204, Japan
- Atsushi Ebihara
- Department of Botany, National Museum of Nature and Science, Tsukuba, Ibaraki, 305-0005, Japan
- Sadamu Matsumoto
- Department of Botany, National Museum of Nature and Science, Tsukuba, Ibaraki, 305-0005, Japan
- Ayumi Tezuka
- Faculty of Agriculture, Ryukoku University, Otsu, Shiga, 520-2194, Japan
- Atsushi J Nagano
- Faculty of Agriculture, Ryukoku University, Otsu, Shiga, 520-2194, Japan
- Ryo Ootsuki
- Department of Natural Sciences, Faculty of Arts and Sciences, Komazawa University, 1-23-1 Komazawa, Setagaya-ku, Tokyo, 154-8525, Japan
- Yasuyuki Watano
- Department of Biology, Graduate School of Science, Chiba University, Inage, Chiba, Chiba, 263-8522, Japan
24
|
Chen X, Li D. Sequencing facility and DNA source associated patterns of virus-mappable reads in whole-genome sequencing data. Genomics 2021; 113:1189-1198. [PMID: 33301893 PMCID: PMC7856238 DOI: 10.1016/j.ygeno.2020.12.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/25/2020] [Revised: 11/25/2020] [Accepted: 12/04/2020] [Indexed: 12/12/2022]
Abstract
Numerous viral sequences have been reported in the whole-genome sequencing (WGS) data of human blood. However, it is not clear to what degree the virus-mappable reads represent true viral sequences rather than random-mapping or noise originating from sample preparation, sequencing processes, or other sources. Identification of patterns of virus-mappable reads may generate novel indicators for evaluating the origins of these viral sequences. We characterized paired-end unmapped reads and reads aligned to viral references in human WGS datasets, then compared patterns of the virus-mappable reads among DNA sources and sequencing facilities which produced these datasets. We then examined potential origins of the source- and facility-associated viral reads. The proportions of clean unmapped reads among the seven sequencing facilities were significantly different (P < 2 × 10⁻¹⁶). We identified 260,339 reads that were mappable to a total of 99 viral references in 2535 samples. The majority (86.7%) of these virus-mappable reads (corresponding to 47 viral references), which can be classified into four groups based on their distinct patterns, were strongly associated with sequencing facility or DNA source (adjusted P value <0.01). Possible origins of these reads include artificial sequences in library preparation, recombinant vectors in cell culture, and phages co-contaminated with their host bacteria. The sequencing facility-associated virus-mappable reads and patterns were repeatedly observed in other datasets produced in the same facilities. We have constructed an analytic framework and profiled the unmapped reads mappable to viral references. The results provide a new understanding of sequencing facility- and DNA source-associated batch effects in deep sequencing data and may facilitate improved bioinformatics filtering of reads.
Affiliation(s)
- Xun Chen
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, VT 05405, USA
- Dawei Li
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, VT 05405, USA; Department of Computer Science, University of Vermont, Burlington, VT 05405, USA; Neuroscience, Behavior, Health Initiative, University of Vermont, Burlington, VT 05405, USA.
25
|
Rao J, Peng L, Liang X, Jiang H, Geng C, Zhao X, Liu X, Fan G, Chen F, Mu F. Performance of copy number variants detection based on whole-genome sequencing by DNBSEQ platforms. BMC Bioinformatics 2020; 21:518. [PMID: 33176676 PMCID: PMC7659224 DOI: 10.1186/s12859-020-03859-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/25/2020] [Accepted: 11/03/2020] [Indexed: 12/30/2022]
Abstract
BACKGROUND DNBSEQ™ platforms are new massively parallel sequencing (MPS) platforms that use DNA nanoball technology. Use of data generated from DNBSEQ™ platforms to detect single nucleotide variants (SNVs) and small insertions and deletions (indels) has proven to be quite effective, while the feasibility of copy number variant (CNV) detection is unclear. RESULTS Here, we first benchmarked different CNV detection tools based on Illumina whole-genome sequencing (WGS) data of NA12878 and then assessed these tools in CNV detection based on DNBSEQ™ sequencing data from the same sample. When the same tool was used, the CNVs detected based on DNBSEQ™ and Illumina data were similar in quantity, length and distribution, while great differences existed between results from different tools, even when based on data from a single platform. We further estimated the CNV detection power based on available CNV benchmarks of NA12878 and found similar precision and sensitivity between the DNBSEQ™ and Illumina platforms. We also found higher precision for CNVs shorter than 1 kbp based on DNBSEQ™ platforms than for those based on Illumina platforms by using Pindel, DELLY and LUMPY. We carefully compared these two available benchmarks and found a large proportion of specific CNVs between them. Thus, we constructed a more complete CNV benchmark of NA12878 containing 3512 CNV regions. CONCLUSIONS We assessed and benchmarked CNV detection based on WGS with DNBSEQ™ platforms and provide guidelines for future studies.
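As a rough illustration of how precision and sensitivity against a CNV benchmark can be computed, the sketch below matches calls to benchmark regions by reciprocal overlap. The 50% threshold, the interval layout, and the function names are assumptions made for illustration; the abstract does not state which matching criterion the authors used.

```python
from typing import List, Tuple

# Assumed layout: (chromosome, start, end) with half-open coordinates.
Interval = Tuple[str, int, int]


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Return the smaller of the two overlap fractions (0.0 if no overlap)."""
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def precision_sensitivity(
    calls: List[Interval], truth: List[Interval], min_ro: float = 0.5
) -> Tuple[float, float]:
    """Precision = matched calls / all calls; sensitivity = matched truth / all truth.

    min_ro is a hypothetical threshold chosen for this sketch.
    """
    matched_calls = sum(
        any(reciprocal_overlap(c, t) >= min_ro for t in truth) for c in calls
    )
    matched_truth = sum(
        any(reciprocal_overlap(t, c) >= min_ro for c in calls) for t in truth
    )
    precision = matched_calls / len(calls) if calls else 0.0
    sensitivity = matched_truth / len(truth) if truth else 0.0
    return precision, sensitivity


# Toy data, invented for the example.
calls = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 300)]
truth = [("chr1", 110, 210), ("chr2", 1000, 2000)]
precision, sensitivity = precision_sensitivity(calls, truth)
print(precision, sensitivity)
```

In this toy case only the first call matches a benchmark region, so one of three calls and one of two benchmark regions are counted as matched.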
Affiliation(s)
- Junhua Rao
- MGI, BGI-Shenzhen, Shenzhen, 518083, China
- Hui Jiang
- MGI, BGI-Shenzhen, Shenzhen, 518083, China
- Xia Zhao
- MGI, BGI-Shenzhen, Shenzhen, 518083, China
- Xin Liu
- BGI-Shenzhen, Shenzhen, 518083, China; BGI-Qingdao, BGI-Shenzhen, Qingdao, 266555, Shandong, China; IGDB-BGI Joint Center for Omics, BGI-Shenzhen, Shenzhen, 518083, China; State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, 518083, China
- Guangyi Fan
- BGI-Qingdao, BGI-Shenzhen, Qingdao, 266555, Shandong, China; State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, 518083, China
- Fang Chen
- MGI, BGI-Shenzhen, Shenzhen, 518083, China; BGI-Shenzhen, Shenzhen, 518083, China; China National GeneBank, BGI-Shenzhen, Shenzhen, 518120, China
- Feng Mu
- MGI, BGI-Shenzhen, Shenzhen, 518083, China; MGI-Wuhan, BGI-Shenzhen, Wuhan, 430074, China
26
|
Aganezov S, Raphael BJ. Reconstruction of clone- and haplotype-specific cancer genome karyotypes from bulk tumor samples. Genome Res 2020; 30:1274-1290. [PMID: 32887685 PMCID: PMC7545144 DOI: 10.1101/gr.256701.119] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 09/03/2019] [Accepted: 08/07/2020] [Indexed: 12/25/2022]
Abstract
Many cancer genomes are extensively rearranged with aberrant chromosomal karyotypes. Deriving these karyotypes from high-throughput DNA sequencing of bulk tumor samples is complicated because most tumors are a heterogeneous mixture of normal cells and subpopulations of cancer cells, or clones, that harbor distinct somatic mutations. We introduce a new algorithm, Reconstructing Cancer Karyotypes (RCK), to reconstruct haplotype-specific karyotypes of one or more rearranged cancer genomes from DNA sequencing data from a bulk tumor sample. RCK leverages evolutionary constraints on the somatic mutational process in cancer to reduce ambiguity in the deconvolution of admixed sequencing data into multiple haplotype-specific cancer karyotypes. RCK models mixtures containing an arbitrary number of derived genomes and allows the incorporation of information both from short-read and long-read DNA sequencing technologies. We compare RCK to existing approaches on 17 primary and metastatic prostate cancer samples. We find that RCK infers cancer karyotypes that better explain the DNA sequencing data and conform to a reasonable evolutionary model. RCK's reconstructions of clone- and haplotype-specific karyotypes will aid further studies of the role of intra-tumor heterogeneity in cancer development and response to treatment. RCK is freely available as open source software.
Affiliation(s)
- Sergey Aganezov
- Department of Computer Science, Princeton University, Princeton, New Jersey 08540, USA
- Benjamin J Raphael
- Department of Computer Science, Princeton University, Princeton, New Jersey 08540, USA
27
|
Abel HJ, Larson DE, Regier AA, Chiang C, Das I, Kanchi KL, Layer RM, Neale BM, Salerno WJ, Reeves C, Buyske S, Matise TC, Muzny DM, Zody MC, Lander ES, Dutcher SK, Stitziel NO, Hall IM. Mapping and characterization of structural variation in 17,795 human genomes. Nature 2020; 583:83-89. [PMID: 32460305 PMCID: PMC7547914 DOI: 10.1038/s41586-020-2371-0] [Citation(s) in RCA: 176] [Impact Index Per Article: 35.2] [Received: 12/29/2018] [Accepted: 05/18/2020] [Indexed: 12/18/2022]
Abstract
A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.
Affiliation(s)
- Haley J Abel
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
- David E Larson
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
- Allison A Regier
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Medicine, Washington University School of Medicine, St Louis, MO, USA
- Colby Chiang
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Indraniel Das
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Krishna L Kanchi
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Ryan M Layer
- BioFrontiers Institute, University of Colorado, Boulder, CO, USA
- Department of Computer Science, University of Colorado, Boulder, CO, USA
- Benjamin M Neale
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- William J Salerno
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Steven Buyske
- Department of Statistics, Rutgers University, Piscataway, NJ, USA
- Tara C Matise
- Department of Genetics, Rutgers University, Piscataway, NJ, USA
- Donna M Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Eric S Lander
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Susan K Dutcher
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
- Nathan O Stitziel
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
- Department of Medicine, Washington University School of Medicine, St Louis, MO, USA
- Ira M Hall
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA.
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA.
- Department of Medicine, Washington University School of Medicine, St Louis, MO, USA.
28
|
Chen X, Li D. ERVcaller: identifying polymorphic endogenous retrovirus and other transposable element insertions using whole-genome sequencing data. Bioinformatics 2020; 35:3913-3922. [PMID: 30895294 DOI: 10.1093/bioinformatics/btz205] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 07/11/2018] [Revised: 02/28/2019] [Accepted: 03/19/2019] [Indexed: 12/12/2022]
Abstract
MOTIVATION Approximately 8% of the human genome is derived from endogenous retroviruses (ERVs). In recent years, an increasing number of human diseases have been found to be associated with ERVs. However, it remains challenging to accurately detect the full spectrum of polymorphic (unfixed) ERVs using whole-genome sequencing (WGS) data. RESULTS We designed a new tool, ERVcaller, to detect and genotype transposable element (TE) insertions, including ERVs, in the human genome. We evaluated ERVcaller using both simulated and real benchmark WGS datasets. Compared to existing tools, ERVcaller consistently obtained both the highest sensitivity and precision for detecting simulated ERV and other TE insertions derived from real polymorphic TE sequences. For the WGS data from the 1000 Genomes Project, ERVcaller detected the largest number of TE insertions per sample based on consensus TE loci. By analyzing the experimentally verified TE insertions, ERVcaller had 94.0% TE detection sensitivity and 96.6% genotyping accuracy. Polymerase chain reaction and Sanger sequencing in a small sample set verified 86.7% of examined insertion statuses and 100% of examined genotypes. In conclusion, ERVcaller is capable of detecting and genotyping TE insertions using WGS data with both high sensitivity and precision. This tool can be applied broadly to other species. AVAILABILITY AND IMPLEMENTATION http://www.uvm.edu/genomics/software/ERVcaller.html. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xun Chen: Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, VT, USA
- Dawei Li: Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, VT, USA; Neuroscience, Behavior, and Health Initiative, University of Vermont, Burlington, VT, USA; Department of Computer Science, University of Vermont, Burlington, VT, USA
|
29
|
Wei YC, Huang GH. CONY: A Bayesian procedure for detecting copy number variations from sequencing read depths. Sci Rep 2020; 10:10493. [PMID: 32591545 PMCID: PMC7319969 DOI: 10.1038/s41598-020-64353-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/11/2019] [Accepted: 04/15/2020] [Indexed: 12/26/2022]
Abstract
Copy number variations (CNVs) are genomic structural mutations consisting of abnormal numbers of fragment copies. Next-generation sequencing of read-depth signals mirrors these variants. Some tools used to predict CNVs by depth have been published, but most of these tools can be applied to only a specific data type due to modeling limitations. We develop a tool for copy number variation detection by a Bayesian procedure, i.e., CONY, that adopts a Bayesian hierarchical model and an efficient reversible-jump Markov chain Monte Carlo inference algorithm for whole genome sequencing of read-depth data. CONY can be applied not only to individual samples for estimating the absolute number of copies but also to case-control pairs for detecting patient-specific variations. We evaluate the performance of CONY and compare CONY with competing approaches through simulations and by using experimental data from the 1000 Genomes Project. CONY outperforms the other methods in terms of accuracy in both single-sample and paired-samples analyses. In addition, CONY performs well regardless of whether the data coverage is high or low. CONY is useful for detecting both absolute and relative CNVs from read-depth data sequences. The package is available at https://github.com/weiyuchung/CONY.
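The general read-depth idea behind tools of this kind can be illustrated with a deliberately naive sketch. This is not CONY's Bayesian hierarchical model or its reversible-jump MCMC inference; it only scales each window's depth against the median depth, assuming most windows are at the normal ploidy. The function name `naive_copy_number` and its parameters are hypothetical.

```python
from statistics import median
from typing import List


def naive_copy_number(window_depths: List[float], ploidy: int = 2) -> List[int]:
    """Estimate an integer copy number per window from mean read depth.

    Assumption (not from the source): the median window depth corresponds
    to the normal copy number `ploidy`, so depth scales linearly with copies.
    """
    baseline = median(window_depths)
    if baseline <= 0:
        raise ValueError("median depth must be positive")
    # Scale each window against the baseline and round to the nearest integer.
    return [round(ploidy * depth / baseline) for depth in window_depths]


if __name__ == "__main__":
    depths = [30, 31, 29, 60, 15, 30]
    print(naive_copy_number(depths))
```

A real caller would also correct for GC content and mappability and would segment adjacent windows jointly; none of that is attempted here.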
Affiliation(s)
- Yu-Chung Wei: Graduate Institute of Statistics and Information Science, National Changhua University of Education, No.1 Jinde Road, Changhua City, Changhua County, 50007, Taiwan
- Guan-Hua Huang: Institute of Statistics, National Chiao Tung University, 1001 University Road, Hsinchu, 30010, Taiwan
|
30
|
Abstract
A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.
|
31
|
Gogos JA, Crabtree G, Diamantopoulou A. The abiding relevance of mouse models of rare mutations to psychiatric neuroscience and therapeutics. Schizophr Res 2020; 217:37-51. [PMID: 30987923 PMCID: PMC6790166 DOI: 10.1016/j.schres.2019.03.018] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/29/2019] [Revised: 03/19/2019] [Accepted: 03/22/2019] [Indexed: 01/08/2023]
Abstract
Studies using powerful family-based designs aided by large scale case-control studies, have been instrumental in cracking the genetic complexity of the disease, identifying rare and highly penetrant risk mutations and providing a handle on experimentally tractable model systems. Mouse models of rare mutations, paired with analysis of homologous cognitive and sensory processing deficits and state-of-the-art neuroscience methods to manipulate and record neuronal activity have started providing unprecedented insights into pathogenic mechanisms and building the foundation of a new biological framework for understanding mental illness. A number of important principles are emerging, namely that degradation of the computational mechanisms underlying the ordered activity and plasticity of both local and long-range neuronal assemblies, the building blocks necessary for stable cognition and perception, might be the inevitable consequence and the common point of convergence of the vastly heterogeneous genetic liability, manifesting as defective internally- or stimulus-driven neuronal activation patterns and triggering the constellation of schizophrenia symptoms. Animal models of rare mutations have the unique potential to help us move from "which" (gene) to "how", "where" and "when" computational regimes of neural ensembles are affected. Linking these variables should improve our understanding of how symptoms emerge and how diagnostic boundaries are established at a circuit level. Eventually, a better understanding of pathophysiological trajectories at the level of neural circuitry in mice, aided by basic human experimental biology, should guide the development of new therapeutics targeting either altered circuitry itself or the underlying biological pathways.
Affiliation(s)
- Joseph A. Gogos: Mortimer B. Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, NY 10027, USA; Department of Physiology and Cellular Biophysics, College of Physicians and Surgeons, Columbia University, New York, NY 10032, USA; Department of Neuroscience, Columbia University, New York, NY 10032, USA
- Gregg Crabtree: Mortimer B. Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, NY 10027, USA; Department of Physiology and Cellular Biophysics, College of Physicians and Surgeons, Columbia University, New York, NY 10032, USA
- Anastasia Diamantopoulou: Mortimer B. Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, NY 10027, USA; Department of Physiology and Cellular Biophysics, College of Physicians and Surgeons, Columbia University, New York, NY 10032, USA
|
32
|
Spence M, Banuelos M, Marcia RF, Sindi S. Detecting inherited and novel structural variants in low-coverage parent-child sequencing data. Methods 2020; 173:61-68. [PMID: 31271880 DOI: 10.1016/j.ymeth.2019.06.025] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/02/2019] [Revised: 06/12/2019] [Accepted: 06/24/2019] [Indexed: 11/25/2022]
Abstract
Structural variants (SVs) are a class of genomic variation shared by members of the same species. Though relatively rare, they represent an increasingly important class of variation, as SVs have been associated with diseases and susceptibility to some types of cancer. Common approaches to SV detection require the sequencing and mapping of fragments from a test genome to a high-quality reference genome. Candidate SVs correspond to fragments with discordant mapped configurations. However, because errors in the sequencing and mapping will also create discordant arrangements, many of these predictions will be spurious. When sequencing coverage is low, distinguishing true SVs from errors is even more challenging. In recent work, we have developed SV detection methods that exploit genome information of closely related individuals - parents and children. Our previous approaches were based on the assumption that any SV present in a child's genome must have come from one of their parents. However, using this strict restriction may have resulted in failing to predict any rare but novel variants present only in the child. In this work, we generalize our previous approaches to allow the child to carry novel variants. We consider a constrained optimization approach where variants in the child are of two types either inherited - and therefore must be present in a parent - or novel. For simplicity, we consider only a single parent and single child each of which have a haploid genome. However, even in this restricted case, our approach has the power to improve variant prediction. We present results on both simulated candidate variant regions, parent-child trios from the 1000 Genomes Project, and a subset of the 17 Platinum Genomes.
Affiliation(s)
- Melissa Spence: Department of Applied Mathematics, University of California, Merced, Merced, CA 95343, USA
- Mario Banuelos: Department of Mathematics, California State University, Fresno, Fresno, CA 93740, USA
- Roummel F Marcia: Department of Applied Mathematics, University of California, Merced, Merced, CA 95343, USA
- Suzanne Sindi: Department of Applied Mathematics, University of California, Merced, Merced, CA 95343, USA
|
33
|
Ectopic expression of the Stabilin2 gene triggered by an intracisternal A particle (IAP) element in DBA/2J strain of mice. Mamm Genome 2020; 31:2-16. [PMID: 31912264 PMCID: PMC7060167 DOI: 10.1007/s00335-019-09824-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 11/01/2019] [Accepted: 12/29/2019] [Indexed: 12/21/2022]
Abstract
Stabilin2 (Stab2) encodes a large transmembrane protein which is predominantly expressed in the liver sinusoidal endothelial cells (LSECs) and functions as a scavenger receptor for various macromolecules including hyaluronans (HA). In DBA/2J mice, plasma HA concentration is ten times higher than in 129S6 or C57BL/6J mice, and this phenotype is genetically linked to the Stab2 locus. Stab2 mRNA in the LSECs was significantly lower in DBA/2J than in 129S6, leading to reduced STAB2 proteins in the DBA/2J LSECs. We found a retrovirus-derived transposable element, intracisternal A particle (IAP), in the promoter region of Stab2DBA which likely interferes with normal expression in the LSECs. In contrast, in other tissues of DBA/2J mice, the IAP drives high ectopic Stab2DBA transcription starting within the 5′ long terminal repeat of IAP in a reverse orientation and continuing through the downstream Stab2DBA. Ectopic transcription requires the Stab2-IAP element but is dominantly suppressed by the presence of loci on 59.7–73.0 Mb of chromosome (Chr) 13 from C57BL/6J, while the same region in 129S6 requires additional loci for complete suppression. Chr13:59.9–73 Mb contains a large number of genes encoding Krüppel-associated box-domain zinc-finger proteins that target transposable elements-derived sequences and repress their expression. Despite the high amount of ectopic Stab2DBA transcript in tissues other than liver, STAB2 protein was undetectable and unlikely to contribute to the plasma HA levels of DBA/2J mice. Nevertheless, the IAP insertion and its effects on the transcription of the downstream Stab2DBA exemplify that stochastic evolutional events could significantly influence susceptibility to complex but common diseases.
|
34
|
Lee Y, Park K, Koh I. Analysis of unmapped regions associated with long deletions in Korean whole genome sequences based on short read data. Genomics Inform 2020; 17:e40. [PMID: 31896240 PMCID: PMC6944045 DOI: 10.5808/gi.2019.17.4.e40] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/10/2019] [Accepted: 11/13/2019] [Indexed: 11/20/2022]
Abstract
While studies aimed at detecting and analyzing indels or single nucleotide polymorphisms within human genomic sequences have been actively conducted, studies on detecting long insertions/deletions are not easy to orchestrate. For the last 10 years, the availability of long read data of human genomes from PacBio or Nanopore platforms has increased, which makes it easier to detect long insertions/deletions. However, because long read data have a critical disadvantage due to their relatively high cost, many next generation sequencing data are produced mainly by short read sequencing machines. Here, we constructed programs to detect so-called unmapped regions (UMRs, where no reads are mapped on the reference genome), scanned 40 Korean genomes to select UMR long deletion candidates, and compared the candidates with the long deletion break points within the genomes available from the 1000 Genomes Project (1KGP). An average of about 36,000 UMRs were found in the 40 Korean genomes tested, 284 UMRs were common across the 40 genomes, and a total of 37,943 UMRs were found. Compared with the 74,045 break points provided by the 1KGP, 30,698 UMRs overlapped. As the number of compared samples increased from 1 to 40, the number of UMRs that overlapped with the break points also increased. This eventually reached a peak of 80.9% of the total UMRs found in this study. As the total number of overlapped UMRs could probably grow to encompass 74,045 break points with the inclusion of more Korean genomes, this approach could be practically useful for studies on long deletions utilizing short read data.
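The notion of an unmapped region, a stretch of the reference where no reads are mapped, can be illustrated with a minimal sketch. It assumes per-base read depth for one contig is already available as a list of integers; the function name `find_unmapped_regions` and the `min_len` threshold are hypothetical and are not taken from the authors' programs.

```python
from typing import List, Tuple


def find_unmapped_regions(depth: List[int], min_len: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals where read depth is zero.

    Hypothetical helper: `depth` holds per-base read depth for one contig,
    and only zero-depth runs of at least `min_len` bases are reported.
    """
    regions = []
    run_start = None  # start of the current zero-depth run, if any
    for pos, d in enumerate(depth):
        if d == 0:
            if run_start is None:
                run_start = pos
        elif run_start is not None:
            # A covered base ends the current run; keep it if long enough.
            if pos - run_start >= min_len:
                regions.append((run_start, pos))
            run_start = None
    # Close a run that extends to the end of the contig.
    if run_start is not None and len(depth) - run_start >= min_len:
        regions.append((run_start, len(depth)))
    return regions


if __name__ == "__main__":
    example = [5, 4, 0, 0, 0, 3, 0, 2, 0, 0]
    print(find_unmapped_regions(example, min_len=2))  # [(2, 5), (8, 10)]
```

Candidate intervals of this kind could then be intersected with known deletion break points, which is the comparison the abstract describes against the 1KGP data.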
Affiliation(s)
- Yuna Lee: Department of Biomedical Informatics, Hanyang University, Seoul 04763, Korea
- Kiejung Park: Cheonan Industry-Academic Collaboration Foundation, Sangmyung University, Cheonan 31066, Korea; KIOST School, University of Science and Technology, Daejeon 34113, Korea
- Insong Koh: Department of Biomedical Informatics, Hanyang University, Seoul 04763, Korea
|
35
|
Daino K, Ishikawa A, Suga T, Amasaki Y, Kodama Y, Shang Y, Hirano-Sakairi S, Nishimura M, Nakata A, Yoshida M, Imai T, Shimada Y, Kakinuma S. Mutational landscape of T-cell lymphoma in mice lacking the DNA mismatch repair gene Mlh1: no synergism with ionizing radiation. Carcinogenesis 2019; 40:216-224. [PMID: 30721949 DOI: 10.1093/carcin/bgz013] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 09/28/2018] [Revised: 12/06/2018] [Accepted: 02/01/2019] [Indexed: 12/29/2022]
Abstract
Biallelic germline mutations in the DNA mismatch repair gene MLH1 lead to constitutional mismatch repair-deficiency syndrome and an increased risk for childhood hematopoietic malignancies, including lymphoma and leukemia. To examine how Mlh1 dysfunction promotes lymphoma as well as the influence of ionizing radiation (IR) exposure, we used an Mlh1-/- mouse model and whole-exome sequencing to assess genomic alterations in 23 T-cell lymphomas, including 8 spontaneous and 15 IR-associated lymphomas. Exposure to IR accelerated T-cell lymphoma induction in the Mlh1-/- mice, and whole-exome sequencing revealed that IR exposure neither increased the number of mutations nor altered the mutation spectrum of the lymphomas. Frequent mutations were evident in genes encoding transcription factors (e.g. Ikzf1, Trp53, Bcl11b), epigenetic regulators (e.g. Suv420h1, Ep300, Kmt2d), transporters (e.g. Rangap1, Kcnj16), extracellular matrix (e.g. Megf6, Lrig1), cell motility (e.g. Argef19, Dnah17), protein kinase cascade (e.g. Ptpro, Marcks) and in genes involved in NOTCH (e.g. Notch1), and PI3K/AKT (e.g. Pten, Akt2) signaling pathways in both spontaneous and IR-associated lymphomas. Frameshift mutations in mononucleotide repeat sequences within the genes Trp53, Ep300, Kmt2d, Notch1, Pten and Marcks were newly identified in the lymphomas. The lymphomas also exhibited a few chromosomal abnormalities. The results establish a landscape of genomic alterations in spontaneous and IR-associated lymphomas that occur in the context of mismatch repair dysfunction and suggest potential targets for cancer treatment.
Affiliation(s)
- Kazuhiro Daino: Department of Radiation Effects Research, National Institute of Radiological Sciences (NIRS), National Institutes for Quantum and Radiological Science and Technology (QST), Chiba, Japan
- Atsuko Ishikawa: Department of Radiation Effects Research, National Institute of Radiological Sciences (NIRS), National Institutes for Quantum and Radiological Science and Technology (QST), Chiba, Japan
- Tomo Suga: Department of Basic Medical Sciences for Radiation Damages, National Institute of Radiological Sciences (NIRS), National Institutes for Quantum and Radiological Science and Technology (QST), Chiba, Japan
- Yoshiko Amasaki: Department of Radiation Effects Research, National Institute of Radiological Sciences (NIRS), National Institutes for Quantum and Radiological Science and Technology (QST), Chiba, Japan
- Yotaro Kodama: Department of Radiation Effects Research, National Institute of Radiological Sciences (NIRS), National Institutes for Quantum and Radiological Science and Technology (QST), Chiba, Japan
- Yi Shang: Department of Radiation Effects Research, National Institute of Radiological Sciences (NIRS), National Institutes for Quantum and Radiological Science and Technology (QST), Chiba, Japan
- Shinobu Hirano-Sakairi: Department of Radiation Effects Research, National Institute of Radiological Sciences (NIRS), National Institutes for Quantum and Radiological Science and Technology (QST), Chiba, Japan
- Mayumi Nishimura: Department of Radiation Effects Research, National Institute of Radiological Sciences (NIRS), National Institutes for Quantum and Radiological Science and Technology (QST), Chiba, Japan
- Akifumi Nakata: Faculty of Pharmaceutical Sciences, Hokkaido University of Science, Sapporo, Japan
- Mitsuaki Yoshida: Department of Radiation Biology, Institute of Radiation Emergency Medicine, Hirosaki University, Hirosaki, Japan
- Takashi Imai: Medical Databank Section, Hospital, National Institute of Radiological Sciences (NIRS), National Institutes for Quantum and Radiological Science and Technology (QST), Chiba, Japan
- Shizuko Kakinuma: Department of Radiation Effects Research, National Institute of Radiological Sciences (NIRS), National Institutes for Quantum and Radiological Science and Technology (QST), Chiba, Japan
|
36
|
Catanach A, Crowhurst R, Deng C, David C, Bernatchez L, Wellenreuther M. The genomic pool of standing structural variation outnumbers single nucleotide polymorphism by threefold in the marine teleost Chrysophrys auratus. Mol Ecol 2019; 28:1210-1223. [PMID: 30770610 DOI: 10.1111/mec.15051] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 08/07/2018] [Revised: 01/31/2019] [Accepted: 02/01/2019] [Indexed: 12/22/2022]
Abstract
Recent studies have highlighted an important role of structural variation (SV) in ecological and evolutionary processes, but few have studied nonmodel species in the wild. As part of our long-term research programme on the nonmodel teleost fish Australasian snapper (Chrysophrys auratus), we aim to build one of the first catalogues of genomic variants (SNPs and indels, and deletions, duplications and inversions) in fishes and evaluate overlap of genomic variants with regions under putative selection (Tajima's D and π), and coding sequences (genes). For this, we analysed six males and six females from three locations in New Zealand and generated a high-resolution genomic variation catalogue. We characterized 20,385 SVs and found they intersected with almost a third of all annotated genes. Together with small indels, SVs account for three times more variation in the genome in terms of bases affected compared to SNPs. We found that a sizeable portion of detected SVs was in the upper and lower genomic regions of Tajima's D and π, indicating that some of these have an effect on the phenotype. Together, these results shed light on the often neglected genomic variation that is produced by SVs and highlights the need to go beyond the mere measure of SNPs when investigating evolutionary processes, such as species diversification and adaptation.
Affiliation(s)
- Andrew Catanach: The New Zealand Institute for Plant & Food Research Ltd, Lincoln, New Zealand
- Ross Crowhurst: The New Zealand Institute for Plant & Food Research Ltd, Auckland, New Zealand
- Cecilia Deng: The New Zealand Institute for Plant & Food Research Ltd, Auckland, New Zealand
- Charles David: The New Zealand Institute for Plant & Food Research Ltd, Lincoln, New Zealand
- Louis Bernatchez: Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec City, Quebec, Canada
- Maren Wellenreuther: The New Zealand Institute for Plant & Food Research Ltd, Nelson, New Zealand; School of Biological Sciences, University of Auckland, Auckland, New Zealand
|
37
|
Chattopadhyay B, Garg KM, Ray R, Rheindt FE. Fluctuating fortunes: genomes and habitat reconstructions reveal global climate-mediated changes in bats' genetic diversity. Proc Biol Sci 2019; 286:20190304. [PMID: 31530139 PMCID: PMC6784725 DOI: 10.1098/rspb.2019.0304] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 02/05/2019] [Accepted: 08/23/2019] [Indexed: 12/21/2022]
Abstract
Over the last approximately 2.6 Myr, Earth's climate has been dominated by cyclical ice ages that have profoundly affected species' population sizes, but the impact of impending anthropogenic climate change on species' extinction potential remains a worrying problem. We investigated 11 bat species from different taxonomic, ecological and geographical backgrounds using combined information from palaeoclimatic habitat reconstructions and genomes to analyse biotic impacts of historic climate change. We discover tightly correlated fluctuations between species' historic distribution and effective population size, identify frugivores as particularly susceptible to global warming, pinpoint large insectivores as having overall low effective population size and flag the onset of the Holocene (approx. 10-12 000 years ago) as the period with the generally lowest effective population sizes across the last approximately 1 Myr. Our study shows that combining genomic and palaeoclimatological approaches reveals effects of climatic shifts on genetic diversity and may help predict impacts of future climate change.
Affiliation(s)
- Kritika M. Garg: Department of Biological Sciences, National University of Singapore, Singapore
- Rajasri Ray: Center for Ecological Sciences, Indian Institute of Science, Bangalore, 560012 Karnataka, India; Centre for Studies in Ethnobiology, Biodiversity and Sustainability (CEiBa), BG Road, Mokdumpur, Malda-732103 West Bengal, India
- Frank E. Rheindt: Department of Biological Sciences, National University of Singapore, Singapore
|
38
|
Fujiwara K, Matsuura K, Matsunami K, Iio E, Nagura Y, Nojiri S, Kataoka H. Novel Genetic Rearrangements Termed "Structural Variation Polymorphisms" Contribute to the Genetic Diversity of Orthohepadnaviruses. Viruses 2019; 11:871. [PMID: 31533314 PMCID: PMC6783994 DOI: 10.3390/v11090871] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/22/2019] [Revised: 09/08/2019] [Accepted: 09/17/2019] [Indexed: 12/27/2022]
Abstract
The genetic diversity of orthohepadnaviruses is not yet fully understood. This study was conducted to investigate the role of structural variations (SVs) in their diversity. Genetic sequences of orthohepadnaviruses were retrieved from databases. The positions of sequence gaps were investigated, since they were found to be related to SVs, and they were further used to search for SVs. Then, a combination of pair-wise and multiple alignment analyses was performed to analyze the genomic structure. Unique patterns of SVs were observed; genetic sequences at certain genomic positions could be separated into multiple patterns, such as no SV, SV pattern 1, SV pattern 2, and SV pattern 3, which were observed as polymorphic changes. We provisionally referred to these genetic changes as SV polymorphisms. Our data showed that higher frequency of sequence gaps and lower genetic identity were observed in the pre-S1-S2 region of various types of HBVs. Detailed examination of the genetic structure in the pre-S region by a combination of pair-wise and multiple alignment analyses showed that the genetic diversity of orthohepadnaviruses in the pre-S1 region could have been also induced by SV polymorphisms. Our data showed that novel genetic rearrangements provisionally termed SV polymorphisms were observed in various orthohepadnaviruses.
Affiliation(s)
- Kei Fujiwara: Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Aichi 467-8601, Japan
- Kentaro Matsuura: Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Aichi 467-8601, Japan
- Kayoko Matsunami: Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Aichi 467-8601, Japan
- Etsuko Iio: Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Aichi 467-8601, Japan
- Yoshihito Nagura: Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Aichi 467-8601, Japan
- Shunsuke Nojiri: Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Aichi 467-8601, Japan
- Hiromi Kataoka: Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, Nagoya, Aichi 467-8601, Japan
|
39
|
Variant calling and quality control of large-scale human genome sequencing data. Emerg Top Life Sci 2019; 3:399-409. [DOI: 10.1042/etls20190007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/24/2019] [Revised: 06/28/2019] [Accepted: 07/16/2019] [Indexed: 12/12/2022]
Abstract
Next-generation sequencing has allowed genetic studies to collect genome sequencing data from a large number of individuals. However, raw sequencing data are not usually interpretable due to fragmentation of the genome and technical biases; therefore, analysis of these data requires many computational approaches. First, for each sequenced individual, sequencing data are aligned and further processed to account for technical biases. Then, variant calling is performed to obtain information on the positions of genetic variants and their corresponding genotypes. Quality control (QC) is applied to identify individuals and genetic variants with sequencing errors. These procedures are necessary to generate accurate variant calls from sequencing data, and many computational approaches have been developed for these tasks. This review will focus on current widely used approaches for variant calling and QC.
|
40
|
Comprehensive evaluation and characterisation of short read general-purpose structural variant calling software. Nat Commun 2019; 10:3240. [PMID: 31324872 PMCID: PMC6642177 DOI: 10.1038/s41467-019-11146-4] [Citation(s) in RCA: 166] [Impact Index Per Article: 27.7] [Received: 09/14/2018] [Accepted: 06/26/2019] [Indexed: 01/12/2023]
Abstract
In recent years, many software packages for identifying structural variants (SVs) using whole-genome sequencing data have been released. When published, a new method is commonly compared with those already available, but this tends to be selective and incomplete. The lack of comprehensive benchmarking of methods presents challenges for users in selecting methods and for developers in understanding algorithm behaviours and limitations. Here we report the comprehensive evaluation of 10 SV callers, selected following a rigorous process and spanning the breadth of detection approaches, using high-quality reference cell lines, as well as simulations. Due to the nature of available truth sets, our focus is on general-purpose rather than somatic callers. We characterise the impact on performance of event size and type, sequencing characteristics, and genomic context, and analyse the efficacy of ensemble calling and calibration of variant quality scores. Finally, we provide recommendations for both users and methods developers. A number of computational methods have been developed for calling structural variants (SVs) using short read sequencing data. Here, the authors perform a comprehensive benchmarking analysis comparing 10 general-purpose callers and provide recommendations for both users and methods developers.
|
41
|
Puurand T, Kukuškina V, Pajuste FD, Remm M. AluMine: alignment-free method for the discovery of polymorphic Alu element insertions. Mob DNA 2019; 10:31. [PMID: 31360240 PMCID: PMC6639938 DOI: 10.1186/s13100-019-0174-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 03/21/2019] [Accepted: 07/12/2019] [Indexed: 01/09/2023]
Abstract
Background: Recently, alignment-free sequence analysis methods have gained popularity in the field of personal genomics. These methods are based on counting frequencies of short k-mer sequences, thus allowing faster and more robust analysis compared to traditional alignment-based methods. Results: We have created a fast alignment-free method, AluMine, to analyze polymorphic insertions of Alu elements in the human genome. We tested the method on 2,241 individuals from the Estonian Genome Project and identified 28,962 potential polymorphic Alu element insertions. Each tested individual had on average 1,574 Alu element insertions that were different from those in the reference genome. In addition, we propose an alignment-free genotyping method that uses the frequency of insertion/deletion-specific 32-mer pairs to call the genotype directly from raw sequencing reads. Using this method, the concordance between the predicted and experimentally observed genotypes was 98.7%. The running time of the discovery pipeline is approximately 2 h per individual. The genotyping of potential polymorphic insertions takes between 0.4 and 4 h per individual, depending on the hardware configuration. Conclusions: AluMine provides tools that allow discovery of novel Alu element insertions and/or genotyping of known Alu element insertions from personal genomes within few hours.
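Genotyping from allele-specific k-mer pairs can be sketched roughly as follows. This is an illustration only and not AluMine's implementation: it counts exact forward-strand matches of one insertion-specific and one reference-specific k-mer in raw reads and applies a hypothetical `min_count` threshold. The function names, the threshold, and the short 4-mers in the usage example (standing in for 32-mers) are all assumptions.

```python
from typing import List


def count_kmer(reads: List[str], kmer: str) -> int:
    """Count exact forward-strand occurrences of `kmer` across all reads."""
    k = len(kmer)
    total = 0
    for read in reads:
        for i in range(len(read) - k + 1):
            if read[i:i + k] == kmer:
                total += 1
    return total


def call_genotype(reads: List[str], ins_kmer: str, ref_kmer: str,
                  min_count: int = 2) -> str:
    """Call a genotype from counts of an allele-specific k-mer pair.

    Hypothetical rule: an allele is considered present when its k-mer is
    seen at least `min_count` times in the reads.
    """
    has_ins = count_kmer(reads, ins_kmer) >= min_count
    has_ref = count_kmer(reads, ref_kmer) >= min_count
    if has_ins and has_ref:
        return "ins/ref"   # heterozygous
    if has_ins:
        return "ins/ins"   # homozygous for the insertion
    if has_ref:
        return "ref/ref"   # homozygous for the reference allele
    return "no-call"       # too little evidence for either allele


if __name__ == "__main__":
    reads = ["AAACGTTT", "GGACGTCC", "TTTGCAAA", "CCTGCAGG"]
    print(call_genotype(reads, ins_kmer="ACGT", ref_kmer="TGCA"))  # ins/ref
```

A production tool would also handle reverse complements and sequencing errors and would use a k-mer index rather than scanning every read.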
Affiliation(s)
- Tarmo Puurand
  - Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
- Viktoria Kukuškina
  - Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
- Maido Remm
  - Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
|
42
|
Zhu S, Emrich SJ, Chen DZ. Predicting Local Inversions Using Rectangle Clustering and Representative Rectangle Prediction. IEEE Trans Nanobioscience 2019; 18:316-323. [PMID: 31180865 PMCID: PMC6606370 DOI: 10.1109/tnb.2019.2915060] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/15/2023]
Abstract
As a specific type of structural variation, inversions are attracting particular attention because of their established role in evolution. Interest in using third-generation sequencing technology to predict inversions is growing, but many such methods focus on improving sensitivity, giving rise to either too many false positives or very long running times. In this paper, we propose a new framework for inversion detection based on a combination of two novel theoretical models: rectangle clustering and representative rectangle prediction. This combination can automatically filter out false positive inversion predictions while retaining correct ones, leading to a method that has both high sensitivity and high positive predictive value (PPV). Further, this new framework can run very fast on available data. Our software can be freely obtained at https://github.com/UTbioinf/RigInv.
|
43
|
The Genome of C57BL/6J "Eve", the Mother of the Laboratory Mouse Genome Reference Strain. G3-GENES GENOMES GENETICS 2019; 9:1795-1805. [PMID: 30996023 PMCID: PMC6553538 DOI: 10.1534/g3.119.400071] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Indexed: 12/30/2022]
Abstract
Isogenic laboratory mouse strains enhance reproducibility because individual animals are genetically identical. For the most widely used isogenic strain, C57BL/6, there exists a wealth of genetic, phenotypic, and genomic data, including a high-quality reference genome (GRCm38.p6). Now 20 years after the first release of the mouse reference genome, C57BL/6J mice are at least 26 inbreeding generations removed from GRCm38 and the strain is now maintained with periodic reintroduction of cryorecovered mice derived from a single breeder pair, aptly named Adam and Eve. To provide an update to the mouse reference genome that more accurately represents the genome of today's C57BL/6J mice, we took advantage of long read, short read, and optical mapping technologies to generate a de novo assembly of the C57BL/6J Eve genome (B6Eve). Using these data, we have addressed recurring variants observed in previous mouse genomic studies. We have also identified structural variations, closed gaps in the mouse reference assembly, and revealed previously unannotated coding sequences. This B6Eve assembly explains discrepant observations that have been associated with GRCm38-based analyses, and will inform a reference genome that is more representative of the C57BL/6J mice that are in use today.
|
44
|
Fuentes RR, Chebotarov D, Duitama J, Smith S, De la Hoz JF, Mohiyuddin M, Wing RA, McNally KL, Tatarinova T, Grigoriev A, Mauleon R, Alexandrov N. Structural variants in 3000 rice genomes. Genome Res 2019; 29:870-880. [PMID: 30992303 PMCID: PMC6499320 DOI: 10.1101/gr.241240.118] [Citation(s) in RCA: 96] [Impact Index Per Article: 16.0] [Received: 06/29/2018] [Accepted: 03/11/2019] [Indexed: 12/24/2022]
Abstract
Investigation of large structural variants (SVs) is a challenging yet important task in understanding trait differences in highly repetitive genomes. Combining different bioinformatic approaches for SV detection, we analyzed whole-genome sequencing data from 3000 rice genomes and identified 63 million individual SV calls that grouped into 1.5 million allelic variants. We found enrichment of long SVs in promoters and an excess of shorter variants in 5′ UTRs. Across the rice genomes, we identified regions of high SV frequency enriched in stress response genes. We demonstrated how SVs may help in finding causative variants in genome-wide association analysis. These new insights into rice genome biology are valuable for understanding the effects SVs have on gene function, with the prospect of identifying novel agronomically important alleles that can be utilized to improve cultivated rice.
Affiliation(s)
- Roven Rommel Fuentes
  - International Rice Research Institute, Laguna 4031, Philippines
  - Bioinformatics Group, Wageningen University and Research, 6708 PB Wageningen, the Netherlands
- Jorge Duitama
  - Systems and Computing Engineering Department, Universidad de Los Andes, Bogotá 111711, Colombia
  - Agrobiodiversity Research Area, International Center for Tropical Agriculture (CIAT), Cali 6713, Colombia
- Sean Smith
  - Biology Department, Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey 08102, USA
- Juan Fernando De la Hoz
  - Agrobiodiversity Research Area, International Center for Tropical Agriculture (CIAT), Cali 6713, Colombia
- Rod A Wing
  - International Rice Research Institute, Laguna 4031, Philippines
  - Arizona Genomics Institute, University of Arizona, Tucson, Arizona 85721, USA
  - King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia
- Tatiana Tatarinova
  - Department of Biology, University of La Verne, La Verne, California 91750, USA
  - Vavilov Institute of General Genetics, Moscow 119333, Russia
  - A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127051, Russia
  - Laboratory of Forest Genomics, Siberian Federal University, Krasnoyarsk 660041, Russia
- Andrey Grigoriev
  - Biology Department, Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey 08102, USA
- Ramil Mauleon
  - International Rice Research Institute, Laguna 4031, Philippines
|
45
|
Leandro J, Violante S, Argmann CA, Hagen J, Dodatko T, Bender A, Zhang W, Williams EG, Bachmann AM, Auwerx J, Yu C, Houten SM. Mild inborn errors of metabolism in commonly used inbred mouse strains. Mol Genet Metab 2019; 126:388-396. [PMID: 30709776 PMCID: PMC6535113 DOI: 10.1016/j.ymgme.2019.01.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/21/2018] [Revised: 01/23/2019] [Accepted: 01/23/2019] [Indexed: 10/27/2022]
Abstract
Inbred mouse strains are a cornerstone of translational research but paradoxically many strains carry mild inborn errors of metabolism. For example, α-aminoadipic acidemia and branched-chain ketoacid dehydrogenase deficiency are known in C57BL/6J mice. Using RNA sequencing, we now reveal the causal variants in Dhtkd1 and Bckdhb, and the molecular mechanism underlying these metabolic defects. C57BL/6J mice have decreased Dhtkd1 mRNA expression due to a solitary long terminal repeat (LTR) in intron 4 of Dhtkd1. This LTR harbors an alternate splice donor site leading to a partial splicing defect and as a consequence decreased total and functional Dhtkd1 mRNA, decreased DHTKD1 protein and α-aminoadipic acidemia. Similarly, C57BL/6J mice have decreased Bckdhb mRNA expression due to an LTR retrotransposon in intron 1 of Bckdhb. This transposable element encodes an alternative exon 1 causing aberrant splicing, decreased total and functional Bckdhb mRNA and decreased BCKDHB protein. Using a targeted metabolomics screen, we also reveal elevated plasma C5-carnitine in 129 substrains. This biochemical phenotype resembles isovaleric acidemia and is caused by an exonic splice mutation in Ivd leading to partial skipping of exon 10 and IVD protein deficiency. In summary, this study identifies three causal variants underlying mild inborn errors of metabolism in commonly used inbred mouse strains.
Affiliation(s)
- João Leandro
  - Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, Box 1498, New York, NY 10029, USA
- Sara Violante
  - Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, Box 1498, New York, NY 10029, USA
  - Mount Sinai Genomics, Inc, One Gustave L Levy Place #1497, New York, NY 10029, USA
- Carmen A Argmann
  - Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, Box 1498, New York, NY 10029, USA
- Jacob Hagen
  - Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, Box 1498, New York, NY 10029, USA
- Tetyana Dodatko
  - Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, Box 1498, New York, NY 10029, USA
- Aaron Bender
  - Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, Box 1498, New York, NY 10029, USA
- Wei Zhang
  - Mount Sinai Genomics, Inc, One Gustave L Levy Place #1497, New York, NY 10029, USA
- Evan G Williams
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich CH-8093, Switzerland
- Alexis M Bachmann
  - Laboratory of Integrative and Systems Physiology, École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland
- Johan Auwerx
  - Laboratory of Integrative and Systems Physiology, École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland
- Chunli Yu
  - Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, Box 1498, New York, NY 10029, USA
  - Mount Sinai Genomics, Inc, One Gustave L Levy Place #1497, New York, NY 10029, USA
- Sander M Houten
  - Department of Genetics and Genomic Sciences, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, Box 1498, New York, NY 10029, USA
|
46
|
Weber J, de la Rosa J, Grove CS, Schick M, Rad L, Baranov O, Strong A, Pfaus A, Friedrich MJ, Engleitner T, Lersch R, Öllinger R, Grau M, Menendez IG, Martella M, Kohlhofer U, Banerjee R, Turchaninova MA, Scherger A, Hoffman GJ, Hess J, Kuhn LB, Ammon T, Kim J, Schneider G, Unger K, Zimber-Strobl U, Heikenwälder M, Schmidt-Supprian M, Yang F, Saur D, Liu P, Steiger K, Chudakov DM, Lenz G, Quintanilla-Martinez L, Keller U, Vassiliou GS, Cadiñanos J, Bradley A, Rad R. PiggyBac transposon tools for recessive screening identify B-cell lymphoma drivers in mice. Nat Commun 2019; 10:1415. [PMID: 30926791 PMCID: PMC6440946 DOI: 10.1038/s41467-019-09180-3] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 06/07/2018] [Accepted: 02/18/2019] [Indexed: 01/03/2023] Open
Abstract
B-cell lymphoma (BCL) is the most common hematologic malignancy. While sequencing studies gave insights into BCL genetics, identification of non-mutated cancer genes remains challenging. Here, we describe PiggyBac transposon tools and mouse models for recessive screening and show their application to study clonal B-cell lymphomagenesis. In a genome-wide screen, we discover BCL genes related to diverse molecular processes, including signaling, transcriptional regulation, chromatin regulation, or RNA metabolism. Cross-species analyses show the efficiency of the screen to pinpoint human cancer drivers altered by non-genetic mechanisms, including clinically relevant genes dysregulated epigenetically, transcriptionally, or post-transcriptionally in human BCL. We also describe a CRISPR/Cas9-based in vivo platform for BCL functional genomics, and validate discovered genes, such as Rfx7, a transcription factor, and Phip, a chromatin regulator, which suppress lymphomagenesis in mice. Our study gives comprehensive insights into the molecular landscapes of BCL and underlines the power of genome-scale screening to inform biology.
Affiliation(s)
- Julia Weber
  - Institute of Molecular Oncology and Functional Genomics, TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
  - Center for Translational Cancer Research (TranslaTUM), TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
- Jorge de la Rosa
  - The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Carolyn S Grove
  - The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
  - School of Medicine, University of Western Australia, Crawley, 6009, Australia
  - Department of Haematology, PathWest and Sir Charles Gairdner Hospital, Queen Elizabeth II Medical Centre, Nedlands, 6009, Australia
- Markus Schick
  - Department of Medicine III, Klinikum rechts der Isar, Technische Universität München, Munich, 81675, Germany
- Lena Rad
  - The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Olga Baranov
  - Institute of Molecular Oncology and Functional Genomics, TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
  - Center for Translational Cancer Research (TranslaTUM), TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
- Alexander Strong
  - The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Anja Pfaus
  - Institute of Molecular Oncology and Functional Genomics, TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
  - Center for Translational Cancer Research (TranslaTUM), TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
- Mathias J Friedrich
  - Institute of Molecular Oncology and Functional Genomics, TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
  - Center for Translational Cancer Research (TranslaTUM), TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
  - The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
  - Department of Medicine II, Klinikum rechts der Isar, Technische Universität München, Munich, 81675, Germany
- Thomas Engleitner
  - Institute of Molecular Oncology and Functional Genomics, TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
  - Center for Translational Cancer Research (TranslaTUM), TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
- Robert Lersch
  - Institute of Molecular Oncology and Functional Genomics, TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
  - Center for Translational Cancer Research (TranslaTUM), TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
- Rupert Öllinger
  - Institute of Molecular Oncology and Functional Genomics, TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
  - Center for Translational Cancer Research (TranslaTUM), TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
- Michael Grau
  - Department of Medicine A, University Hospital Münster, Münster, 48149, Germany
  - Cluster of Excellence EXC 1003, Cells in Motion, Münster, 48149, Germany
- Irene Gonzalez Menendez
  - Institute of Pathology and Comprehensive Cancer Center, Eberhard Karls Universität Tübingen, Tübingen, 72076, Germany
- Manuela Martella
  - Institute of Pathology and Comprehensive Cancer Center, Eberhard Karls Universität Tübingen, Tübingen, 72076, Germany
- Ursula Kohlhofer
  - Institute of Pathology and Comprehensive Cancer Center, Eberhard Karls Universität Tübingen, Tübingen, 72076, Germany
- Ruby Banerjee
  - The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Maria A Turchaninova
  - Laboratory of Genomics of Antitumor Adaptive Immunity, Privolzhsky Research Medical University, Nizhny Novgorod, 603005, Russia
  - Genomics of Adaptive Immunity Department, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Science, Moscow, 117997, Russia
  - Pirogov Russian National Research Medical University, Moscow, 117997, Russia
- Anna Scherger
  - Department of Medicine III, Klinikum rechts der Isar, Technische Universität München, Munich, 81675, Germany
- Gary J Hoffman
  - The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
  - School of Medicine, University of Western Australia, Crawley, 6009, Australia
- Julia Hess
  - Helmholtz Zentrum München, Research Unit Radiation Cytogenetics, Neuherberg, 85764, Germany
- Laura B Kuhn
  - Helmholtz Zentrum München, Research Unit Gene Vectors, Munich, 81377, Germany
- Tim Ammon
  - Center for Translational Cancer Research (TranslaTUM), TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
  - Department of Medicine III, Klinikum rechts der Isar, Technische Universität München, Munich, 81675, Germany
- Johnny Kim
  - Department of Cardiac Development and Remodeling, Max-Planck-Institute for Heart and Lung Research, Bad Nauheim, 61231, Germany
  - German Center for Cardiovascular Research (DZHK), Rhine Main, Germany
- Günter Schneider
  - Department of Medicine II, Klinikum rechts der Isar, Technische Universität München, Munich, 81675, Germany
- Kristian Unger
  - Helmholtz Zentrum München, Research Unit Radiation Cytogenetics, Neuherberg, 85764, Germany
- Mathias Heikenwälder
  - Division of Chronic Inflammation and Cancer, German Cancer Research Center (DKFZ), Heidelberg, 69120, Germany
- Marc Schmidt-Supprian
  - Center for Translational Cancer Research (TranslaTUM), TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
  - Department of Medicine III, Klinikum rechts der Isar, Technische Universität München, Munich, 81675, Germany
- Fengtang Yang
  - The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Dieter Saur
  - Center for Translational Cancer Research (TranslaTUM), TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
  - Department of Medicine II, Klinikum rechts der Isar, Technische Universität München, Munich, 81675, Germany
  - German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, 69120, Germany
- Pentao Liu
  - The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
  - Li Ka Shing Faculty of Medicine, Stem Cell and Regenerative Medicine Consortium, School of Biomedical Sciences, University of Hong Kong, Hong Kong, China
- Katja Steiger
  - Comparative Experimental Pathology, Technische Universität München, Munich, 81675, Germany
- Dmitriy M Chudakov
  - Laboratory of Genomics of Antitumor Adaptive Immunity, Privolzhsky Research Medical University, Nizhny Novgorod, 603005, Russia
  - Genomics of Adaptive Immunity Department, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Science, Moscow, 117997, Russia
  - Pirogov Russian National Research Medical University, Moscow, 117997, Russia
  - Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
  - Center of Molecular Medicine, CEITEC, Masaryk University, Brno, 601 77, Czech Republic
- Georg Lenz
  - Department of Medicine A, University Hospital Münster, Münster, 48149, Germany
  - Cluster of Excellence EXC 1003, Cells in Motion, Münster, 48149, Germany
- Leticia Quintanilla-Martinez
  - Institute of Pathology and Comprehensive Cancer Center, Eberhard Karls Universität Tübingen, Tübingen, 72076, Germany
- Ulrich Keller
  - Department of Medicine III, Klinikum rechts der Isar, Technische Universität München, Munich, 81675, Germany
  - Hematology and Oncology-Campus Benjamin Franklin (CBF), Charité-Universitätsmedizin Berlin, Berlin, 12203, Germany
- George S Vassiliou
  - The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
  - Wellcome Trust-MRC Stem Cell Institute, Cambridge Biomedical Campus, University of Cambridge, CB2 0XY, Cambridge, UK
  - Department of Haematology, Cambridge University Hospitals NHS Trust, Cambridge, CB2 0PT, UK
- Juan Cadiñanos
  - Instituto de Medicina Oncológica y Molecular de Asturias (IMOMA), Oviedo, 33193, Spain
  - Departamento de Bioquímica y Biología Molecular, Facultad de Medicina, Instituto Universitario de Oncología (IUOPA), Universidad de Oviedo, Oviedo, 33006, Spain
- Allan Bradley
  - The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Roland Rad
  - Institute of Molecular Oncology and Functional Genomics, TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
  - Center for Translational Cancer Research (TranslaTUM), TUM School of Medicine, Technische Universität München, Munich, 81675, Germany
  - Department of Medicine II, Klinikum rechts der Isar, Technische Universität München, Munich, 81675, Germany
  - German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, 69120, Germany
|
47
|
Chen X, Kost J, Sulovari A, Wong N, Liang WS, Cao J, Li D. A virome-wide clonal integration analysis platform for discovering cancer viral etiology. Genome Res 2019; 29:819-830. [PMID: 30872350 PMCID: PMC6499315 DOI: 10.1101/gr.242529.118] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 07/31/2018] [Accepted: 03/11/2019] [Indexed: 12/31/2022]
Abstract
Oncoviral infection is responsible for 12%–15% of cancer in humans. Convergent evidence from epidemiology, pathology, and oncology suggests that new viral etiologies for cancers remain to be discovered. Oncoviral profiles can be obtained from cancer genome sequencing data; however, widespread viral sequence contamination and noncausal viruses complicate the process of identifying genuine oncoviruses. Here, we propose a novel strategy to address these challenges by performing virome-wide screening of early-stage clonal viral integrations. To implement this strategy, we developed VIcaller, a novel platform for identifying viral integrations that are derived from any characterized viruses and shared by a large proportion of tumor cells using whole-genome sequencing (WGS) data. The sensitivity and precision were confirmed with simulated and benchmark cancer data sets. By applying this platform to cancer WGS data sets with proven or speculated viral etiology, we newly identified or confirmed clonal integrations of hepatitis B virus (HBV), human papillomavirus (HPV), Epstein-Barr virus (EBV), and BK Virus (BKV), suggesting the involvement of these viruses in early stages of tumorigenesis in affected tumors, such as HBV in TERT and KMT2B (also known as MLL4) gene loci in liver cancer, HPV and BKV in bladder cancer, and EBV in non-Hodgkin's lymphoma. We also showed the capacity of VIcaller to identify integrations from some uncharacterized viruses. This is the first study to systematically investigate the strategy and method of virome-wide screening of clonal integrations to identify oncoviruses. Searching clonal viral integrations with our platform has the capacity to identify virus-caused cancers and discover cancer viral etiologies.
Affiliation(s)
- Xun Chen
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Jason Kost
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Arvis Sulovari
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
- Nathalie Wong
  - Department of Anatomical and Cellular Pathology, Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, NT, Hong Kong 999077, P.R. China
- Winnie S Liang
  - Translational Genomics Research Institute, Phoenix, Arizona 85004, USA
- Jian Cao
  - Division of Medical Oncology, Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08903, USA
  - Department of Medicine, Rutgers Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08903, USA
- Dawei Li
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, USA
  - Neuroscience, Behavior, and Health Initiative, University of Vermont, Burlington, Vermont 05405, USA
  - Department of Computer Science, University of Vermont, Burlington, Vermont 05405, USA
|
48
|
Maxwell CS, Mattox K, Turissini DA, Teixeira MM, Barker BM, Matute DR. Gene exchange between two divergent species of the fungal human pathogen, Coccidioides. Evolution 2019; 73:42-58. [PMID: 30414183 PMCID: PMC6430640 DOI: 10.1111/evo.13643] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 05/03/2018] [Revised: 10/15/2018] [Accepted: 10/18/2018] [Indexed: 12/12/2022]
Abstract
The fungal genus Coccidioides is composed of two species, Coccidioides immitis and Coccidioides posadasii. These two species are the causal agents of coccidioidomycosis, a pulmonary disease also known as valley fever. The two species are thought to have shared genetic material due to gene exchange in spite of their long divergence. To quantify the magnitude of shared ancestry between them, we analyzed the genomes of a population sample from each species. Next, we inferred the expected size of shared haplotypes that might be inherited from the last common ancestor of the two species and derived a cutoff to determine which haplotypes have conclusively been exchanged between species. Finally, we precisely identified the breakpoints of the haplotypes that have crossed the species boundary and measured the allele frequency of each introgression in this sample. We find that introgressions are not uniformly distributed across the genome. Most, but not all, of the introgressions segregate at low frequency. Our results show that divergent species can share alleles, that species boundaries can be porous, and highlight the need for a systematic exploration of gene exchange in fungal species.
Affiliation(s)
- Colin S Maxwell
  - Biology Department, University of North Carolina, Chapel Hill, North Carolina
- Kathleen Mattox
  - Biology Department, University of North Carolina, Chapel Hill, North Carolina
- David A Turissini
  - Biology Department, University of North Carolina, Chapel Hill, North Carolina
- Marcus M Teixeira
  - Núcleo de Medicina Tropical, Faculdade de Medicina, University of Brasília, Brasília, Brazil
- Bridget M Barker
  - Pathogen and Microbiome Institute, Northern Arizona University, Flagstaff, Arizona
- Daniel R Matute
  - Biology Department, University of North Carolina, Chapel Hill, North Carolina
|
49
|
Bae J, Lee KW, Islam MN, Yim HS, Park H, Rho M. iMGEins: detecting novel mobile genetic elements inserted in individual genomes. BMC Genomics 2018; 19:944. [PMID: 30563451 PMCID: PMC6299635 DOI: 10.1186/s12864-018-5290-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 10/03/2018] [Accepted: 11/20/2018] [Indexed: 11/10/2022] Open
Abstract
Background: Recent advances in sequencing technology have allowed us to investigate personal genomes to find structural variations, which have been studied extensively to identify their association with the physiology of diseases such as cancer. In particular, mobile genetic elements (MGEs) are one of the major constituents of the human genome, and cause genome instability by insertion, mutation, and rearrangement. Results: We have developed a new program, iMGEins, to identify such novel MGEs by using sequencing reads of individual genomes, and to explore the breakpoints with the supporting reads and MGEs detected. iMGEins is the first MGE detection program that integrates three algorithmic components: discordant read-pair mapping, split-read mapping, and insertion sequence assembly. Our evaluation results showed its outstanding performance in detecting novel MGEs from simulated genomes, as well as real personal genomes. In detail, the average recall and precision rates of iMGEins are 96.67% and 100%, respectively, which are the highest among the programs compared. In the testing with real human genomes of the NA12878 sample, iMGEins shows the highest accuracy in detecting MGEs within 20 bp proximity of the breakpoints annotated. Conclusion: In order to study the dynamics of MGEs in individual genomes, iMGEins was developed to accurately detect breakpoints and report inserted MGEs. Compared with other programs, iMGEins has valuable features of identifying novel MGEs and assembling the MGEs inserted.
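The evaluation described in this abstract (recall and precision for breakpoints predicted within 20 bp of annotated ones) can be sketched as follows. This is an illustrative reading of the metric, not the authors' evaluation script: the function name, the one-to-one greedy matching rule, and the input format of `(chromosome, position)` tuples are assumptions.

```python
def breakpoint_precision_recall(predicted, truth, tolerance=20):
    """Score predicted breakpoints against annotated ones.

    predicted, truth: lists of (chromosome, position) tuples.
    A prediction is a true positive if an unmatched annotated breakpoint
    lies on the same chromosome within `tolerance` bp. Each annotated
    breakpoint can be matched at most once (greedy, in input order).
    """
    unmatched = list(truth)
    true_positives = 0
    for chrom, pos in predicted:
        hit = next(
            (t for t in unmatched
             if t[0] == chrom and abs(t[1] - pos) <= tolerance),
            None,
        )
        if hit is not None:
            unmatched.remove(hit)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall
```

Greedy matching is the simplest choice here; a benchmark with densely clustered breakpoints might prefer an optimal assignment instead.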
Affiliation(s)
- Junwoo Bae
  - Department of Electronics and Computer Engineering, Hanyang University, Seoul, Korea
- Kyeong Won Lee
  - Marine Biotechnology Research Center, Korea Institute of Ocean Science and Technology, Ansan, Korea
- Mohammad Nazrul Islam
  - Marine Biotechnology Research Center, Korea Institute of Ocean Science and Technology, Ansan, Korea
  - Department of Marine Biotechnology, Korea University of Science and Technology, Daejeon, Korea
  - Department of Biotechnology, Sher-e-Bangla Agricultural University, Dhaka, 1207, Bangladesh
- Hyung-Soon Yim
  - Marine Biotechnology Research Center, Korea Institute of Ocean Science and Technology, Ansan, Korea
  - Department of Marine Biotechnology, Korea University of Science and Technology, Daejeon, Korea
- Heejin Park
  - Department of Computer Science and Engineering, Hanyang University, Seoul, Korea
  - Department of Biomedical Informatics, Hanyang University, Seoul, Korea
- Mina Rho
  - Department of Computer Science and Engineering, Hanyang University, Seoul, Korea
  - Department of Biomedical Informatics, Hanyang University, Seoul, Korea
|
50
|
Fujiwara K, Matsuura K, Matsunami K, Iio E, Nojiri S. Characterization of hepatitis B virus with complex structural variations. BMC Microbiol 2018; 18:202. [PMID: 30509169 PMCID: PMC6276219 DOI: 10.1186/s12866-018-1350-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 05/30/2018] [Accepted: 11/20/2018] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Hepatitis B virus (HBV) infection is one of the most serious public health issues. Recent HBV genetic research has revealed novel genetic rearrangements termed complex structural variations (SVs), which are composed of combinations of SVs such as insertions, deletions, and duplications. An extensive search was made for complex SVs of HBV, and their characteristics were analyzed. RESULTS Fifty-five HBV strains with complex SVs were identified by analyzing genetic sequences of HBV with bioinformatics tools. Together with 15 HBV strains with complex SVs from a previous report, a total of 70 HBV strains harboring complex SVs were analyzed. Complex SVs in the HBV genome were frequently located between nt 1500 and 2000. Insertions were observed in 65/70 (92.9%) of HBV strains with complex SVs. The insertional motif sequences observed included a hepatocyte nuclear factor 1 binding site, a sequence complementary to part of box α in enhancer II, and insertions of unknown origin. The complex SVs were classified into six groups, and the combination of insertion and deletion was observed more frequently than other patterns. CONCLUSION Through an extensive search of HBV sequences, new strains with complex SVs were identified in this study. Characteristics of HBV with complex SVs were clarified by the analysis of 70 HBV strains harboring complex SVs. Further investigation is required to elucidate their role in the pathogenesis of HBV-related liver disease.
Affiliation(s)
- Kei Fujiwara
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Nagoya, Aichi 467-8601 Japan
| | - Kentaro Matsuura
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Nagoya, Aichi 467-8601 Japan
| | - Kayoko Matsunami
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Nagoya, Aichi 467-8601 Japan
| | - Etsuko Iio
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Nagoya, Aichi 467-8601 Japan
| | - Shunsuke Nojiri
- Department of Gastroenterology and Metabolism, Nagoya City University Graduate School of Medical Sciences, 1 Kawasumi, Mizuho, Nagoya, Aichi 467-8601 Japan
| |
|