51
|
Yépez VA, Gusic M, Kopajtich R, Mertes C, Smith NH, Alston CL, Ban R, Beblo S, Berutti R, Blessing H, Ciara E, Distelmaier F, Freisinger P, Häberle J, Hayflick SJ, Hempel M, Itkis YS, Kishita Y, Klopstock T, Krylova TD, Lamperti C, Lenz D, Makowski C, Mosegaard S, Müller MF, Muñoz-Pujol G, Nadel A, Ohtake A, Okazaki Y, Procopio E, Schwarzmayr T, Smet J, Staufner C, Stenton SL, Strom TM, Terrile C, Tort F, Van Coster R, Vanlander A, Wagner M, Xu M, Fang F, Ghezzi D, Mayr JA, Piekutowska-Abramczuk D, Ribes A, Rötig A, Taylor RW, Wortmann SB, Murayama K, Meitinger T, Gagneur J, Prokisch H. Clinical implementation of RNA sequencing for Mendelian disease diagnostics. Genome Med 2022; 14:38. [PMID: 35379322 PMCID: PMC8981716 DOI: 10.1186/s13073-022-01019-9] [Citation(s) in RCA: 101] [Impact Index Per Article: 33.7] [Received: 05/17/2021] [Accepted: 02/03/2022] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Lack of functional evidence hampers variant interpretation, leaving a large proportion of individuals with a suspected Mendelian disorder without genetic diagnosis after whole genome or whole exome sequencing (WES). Research studies advocate to further sequence transcriptomes to directly and systematically probe gene expression defects. However, collection of additional biopsies and establishment of lab workflows, analytical pipelines, and defined concepts in clinical interpretation of aberrant gene expression are still needed for adopting RNA sequencing (RNA-seq) in routine diagnostics.
METHODS We implemented an automated RNA-seq protocol and a computational workflow with which we analyzed skin fibroblasts of 303 individuals with a suspected mitochondrial disease that previously underwent WES. We also assessed through simulations how aberrant expression and mono-allelic expression tests depend on RNA-seq coverage.
RESULTS We detected on average 12,500 genes per sample, including around 60% of all disease genes, a coverage substantially higher than with whole blood, supporting the use of skin biopsies. We prioritized genes demonstrating aberrant expression, aberrant splicing, or mono-allelic expression. The pipeline required less than 1 week from sample preparation to result reporting and provided a median of eight disease-associated genes per patient for inspection. A genetic diagnosis was established for 16% of the 205 WES-inconclusive cases. Detection of aberrant expression was a major contributor to diagnosis including instances of 50% reduction, which, together with mono-allelic expression, allowed for the diagnosis of dominant disorders caused by haploinsufficiency. Moreover, calling aberrant splicing and variants from RNA-seq data enabled detecting and validating splice-disrupting variants, of which the majority fell outside WES-covered regions.
CONCLUSION Together, these results show that streamlined experimental and computational processes can accelerate the implementation of RNA-seq in routine diagnostics.
Affiliation(s)
- Vicente A. Yépez
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
- Department of Informatics, Technical University of Munich, Garching, Germany
- Quantitative Biosciences Munich, Department of Biochemistry, Ludwig-Maximilians-Universität, Munich, Germany
| | - Mirjana Gusic
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
- Institute of Neurogenomics, Helmholtz Zentrum München, Neuherberg, Germany
- DZHK (German Centre for Cardiovascular Research), partner site Munich Heart Alliance, Munich, Germany
| | - Robert Kopajtich
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
- Institute of Neurogenomics, Helmholtz Zentrum München, Neuherberg, Germany
| | - Christian Mertes
- Department of Informatics, Technical University of Munich, Garching, Germany
| | - Nicholas H. Smith
- Department of Informatics, Technical University of Munich, Garching, Germany
| | - Charlotte L. Alston
- Wellcome Centre for Mitochondrial Research, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, NE2 4HH UK
- NHS Highly Specialised Services for Rare Mitochondrial Disorders, Royal Victoria Infirmary, Newcastle upon Tyne Hospitals NHS Foundation Trust, Queen Victoria Road, Newcastle upon Tyne, NE1 4LP UK
| | - Rui Ban
- Institute of Neurogenomics, Helmholtz Zentrum München, Neuherberg, Germany
- Department of Pediatric Neurology, Beijing Children’s Hospital, Capital Medical University, National Center for Children’s Health, Beijing, China
| | - Skadi Beblo
- Department of Women and Child Health, Hospital for Children and Adolescents, Center for Pediatric Research Leipzig (CPL), Center for Rare Diseases, University Hospitals, University of Leipzig, Leipzig, Germany
| | - Riccardo Berutti
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
- Institute of Neurogenomics, Helmholtz Zentrum München, Neuherberg, Germany
| | - Holger Blessing
- Department for Inborn Metabolic Diseases, Children’s and Adolescents’ Hospital, University of Erlangen-Nürnberg, Erlangen, Germany
| | - Elżbieta Ciara
- Department of Medical Genetics, Children’s Memorial Health Institute, Warsaw, Poland
| | - Felix Distelmaier
- Department of General Pediatrics, Neonatology and Pediatric Cardiology, Heinrich-Heine-University, Düsseldorf, Germany
| | - Peter Freisinger
- Department of Pediatrics, Klinikum Reutlingen, Reutlingen, Germany
| | - Johannes Häberle
- University Children’s Hospital Zurich and Children’s Research Centre, Zürich, Switzerland
| | - Susan J. Hayflick
- Department of Molecular and Medical Genetics, Oregon Health & Science University, Portland, USA
| | - Maja Hempel
- Institute of Human Genetics, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | | | - Yoshihito Kishita
- Diagnostics and Therapeutics of Intractable Diseases, Intractable Disease Research Center, Juntendo University, Graduate School of Medicine, Tokyo, Japan
- Department of Life Science, Faculty of Science and Engineering, Kindai University, Osaka, Japan
| | - Thomas Klopstock
- Department of Neurology, Friedrich-Baur-Institute, University Hospital, Ludwig-Maximilians-Universität, Munich, Germany
- German Center for Neurodegenerative Diseases (DZNE), Munich, Germany
- Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
| | | | - Costanza Lamperti
- Unit of Medical Genetics and Neurogenetics, Fondazione IRCCS (Istituto di Ricovero e Cura a Carattere Scientifico) Istituto Neurologico Carlo Besta, Milan, Italy
| | - Dominic Lenz
- Division of Neuropediatrics and Pediatric Metabolic Medicine, Center for Pediatric and Adolescent Medicine, University Hospital Heidelberg, Heidelberg, Germany
| | - Christine Makowski
- Department of Pediatrics, Technical University of Munich, Munich, Germany
| | - Signe Mosegaard
- Research Unit for Molecular Medicine, Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
| | - Michaela F. Müller
- Department of Informatics, Technical University of Munich, Garching, Germany
| | - Gerard Muñoz-Pujol
- Section of Inborn Errors of Metabolism-IBC, Department of Biochemistry and Molecular Genetics, Hospital Clínic, IDIBAPS, CIBERER, Barcelona, Spain
| | - Agnieszka Nadel
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
- Institute of Neurogenomics, Helmholtz Zentrum München, Neuherberg, Germany
| | - Akira Ohtake
- Department of Pediatrics & Clinical Genomics, Faculty of Medicine, Saitama Medical University, Saitama, Japan
- Center for Intractable Diseases, Saitama Medical University Hospital, Saitama, Japan
| | - Yasushi Okazaki
- Diagnostics and Therapeutics of Intractable Diseases, Intractable Disease Research Center, Juntendo University, Graduate School of Medicine, Tokyo, Japan
| | - Elena Procopio
- Inborn Metabolic and Muscular Disorders Unit, Anna Meyer Children Hospital, Florence, Italy
| | - Thomas Schwarzmayr
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
- Institute of Neurogenomics, Helmholtz Zentrum München, Neuherberg, Germany
| | - Joél Smet
- Department of Pediatric Neurology and Metabolism, Ghent University Hospital, Ghent, Belgium
| | - Christian Staufner
- Division of Neuropediatrics and Pediatric Metabolic Medicine, Center for Pediatric and Adolescent Medicine, University Hospital Heidelberg, Heidelberg, Germany
| | - Sarah L. Stenton
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
- Institute of Neurogenomics, Helmholtz Zentrum München, Neuherberg, Germany
| | - Tim M. Strom
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
- Institute of Neurogenomics, Helmholtz Zentrum München, Neuherberg, Germany
| | - Caterina Terrile
- Institute of Neurogenomics, Helmholtz Zentrum München, Neuherberg, Germany
| | - Frederic Tort
- Section of Inborn Errors of Metabolism-IBC, Department of Biochemistry and Molecular Genetics, Hospital Clínic, IDIBAPS, CIBERER, Barcelona, Spain
| | - Rudy Van Coster
- Department of Pediatric Neurology and Metabolism, Ghent University Hospital, Ghent, Belgium
| | - Arnaud Vanlander
- Department of Pediatric Neurology and Metabolism, Ghent University Hospital, Ghent, Belgium
| | - Matias Wagner
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
- Institute of Neurogenomics, Helmholtz Zentrum München, Neuherberg, Germany
| | - Manting Xu
- Institute of Neurogenomics, Helmholtz Zentrum München, Neuherberg, Germany
- Department of Pediatric Neurology, Beijing Children’s Hospital, Capital Medical University, National Center for Children’s Health, Beijing, China
| | - Fang Fang
- Department of Pediatric Neurology, Beijing Children’s Hospital, Capital Medical University, National Center for Children’s Health, Beijing, China
| | - Daniele Ghezzi
- Unit of Medical Genetics and Neurogenetics, Fondazione IRCCS (Istituto di Ricovero e Cura a Carattere Scientifico) Istituto Neurologico Carlo Besta, Milan, Italy
- Department of Pathophysiology and Transplantation, University of Milan, Milan, Italy
| | - Johannes A. Mayr
- University Children’s Hospital, Paracelsus Medical University Salzburg, Salzburg, Austria
| | | | - Antonia Ribes
- Section of Inborn Errors of Metabolism-IBC, Department of Biochemistry and Molecular Genetics, Hospital Clínic, IDIBAPS, CIBERER, Barcelona, Spain
| | - Agnès Rötig
- Université de Paris, Institut Imagine, INSERM UMR 1163, Paris, France
| | - Robert W. Taylor
- Wellcome Centre for Mitochondrial Research, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, NE2 4HH UK
- NHS Highly Specialised Services for Rare Mitochondrial Disorders, Royal Victoria Infirmary, Newcastle upon Tyne Hospitals NHS Foundation Trust, Queen Victoria Road, Newcastle upon Tyne, NE1 4LP UK
| | - Saskia B. Wortmann
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
- University Children’s Hospital, Paracelsus Medical University Salzburg, Salzburg, Austria
- Amalia Children’s Hospital, Radboudumc Nijmegen, Nijmegen, The Netherlands
| | - Kei Murayama
- Department of Metabolism, Chiba Children’s Hospital, Chiba, Japan
| | - Thomas Meitinger
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
| | - Julien Gagneur
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
- Department of Informatics, Technical University of Munich, Garching, Germany
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
| | - Holger Prokisch
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
- Institute of Neurogenomics, Helmholtz Zentrum München, Neuherberg, Germany
- Department of Pediatric Neurology, Beijing Children’s Hospital, Capital Medical University, National Center for Children’s Health, Beijing, China
| |
|
52
|
Zeng Z, Aptekmann AA, Bromberg Y. Decoding the effects of synonymous variants. Nucleic Acids Res 2021; 49:12673-12691. [PMID: 34850938 PMCID: PMC8682775 DOI: 10.1093/nar/gkab1159] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 06/23/2021] [Revised: 11/02/2021] [Accepted: 11/08/2021] [Indexed: 12/12/2022] Open
Abstract
Synonymous single nucleotide variants (sSNVs) are common in the human genome but are often overlooked. However, sSNVs can have significant biological impact and may lead to disease. Existing computational methods for evaluating the effect of sSNVs suffer from the lack of gold-standard training/evaluation data and exhibit over-reliance on sequence conservation signals. We developed synVep (synonymous Variant effect predictor), a machine learning-based method that overcomes both of these limitations. Our training data was a combination of variants reported by gnomAD (observed) and those unreported, but possible in the human genome (generated). We used positive-unlabeled learning to purify the generated variant set of any likely unobservable variants. We then trained two sequential extreme gradient boosting models to identify subsets of the remaining variants putatively enriched and depleted in effect. Our method attained 90% precision/recall on a previously unseen set of variants. Furthermore, although synVep does not explicitly use conservation, its scores correlated with evolutionary distances between orthologs in cross-species variation analysis. synVep was also able to differentiate pathogenic vs. benign variants, as well as splice-site disrupting variants (SDV) vs. non-SDVs. Thus, synVep provides an important improvement in annotation of sSNVs, allowing users to focus on variants that most likely harbor effects.
Affiliation(s)
- Zishuo Zeng
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
| | - Ariel A Aptekmann
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
| | - Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
- Department of Genetics, Rutgers University, Piscataway, NJ 08854, USA
| |
|
53
|
Javanmardi K, Chou CW, Terrace CI, Annapareddy A, Kaoud TS, Guo Q, Lutgens J, Zorkic H, Horton AP, Gardner EC, Nguyen G, Boutz DR, Goike J, Voss WN, Kuo HC, Dalby KN, Gollihar JD, Finkelstein IJ. Rapid characterization of spike variants via mammalian cell surface display. Mol Cell 2021; 81:5099-5111.e8. [PMID: 34919820 PMCID: PMC8675084 DOI: 10.1016/j.molcel.2021.11.024] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 04/30/2021] [Revised: 09/16/2021] [Accepted: 11/19/2021] [Indexed: 12/21/2022]
Abstract
The SARS-CoV-2 spike protein is a critical component of vaccines and a target for neutralizing monoclonal antibodies (nAbs). Spike is also undergoing immunogenic selection with variants that increase infectivity and partially escape convalescent plasma. Here, we describe Spike Display, a high-throughput platform to rapidly characterize glycosylated spike ectodomains across multiple coronavirus-family proteins. We assayed ∼200 variant SARS-CoV-2 spikes for their expression, ACE2 binding, and recognition by 13 nAbs. An alanine scan of all five N-terminal domain (NTD) loops highlights a public epitope in the N1, N3, and N5 loops recognized by most NTD-binding nAbs. NTD mutations in variants of concern B.1.1.7 (alpha), B.1.351 (beta), B.1.1.28 (gamma), B.1.427/B.1.429 (epsilon), and B.1.617.2 (delta) impact spike expression and escape most NTD-targeting nAbs. Finally, B.1.351 and B.1.1.28 completely escape a potent ACE2 mimic. We anticipate that Spike Display will accelerate antigen design, deep scanning mutagenesis, and antibody epitope mapping for SARS-CoV-2 and other emerging viral threats.
Affiliation(s)
- Kamyab Javanmardi
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - Chia-Wei Chou
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
| | | | - Ankur Annapareddy
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - Tamer S Kaoud
- Division of Chemical Biology and Medicinal Chemistry, College of Pharmacy, The University of Texas at Austin, Austin, TX, USA
| | - Qingqing Guo
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - Josh Lutgens
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - Hayley Zorkic
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - Andrew P Horton
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - Elizabeth C Gardner
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - Giaochau Nguyen
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
| | | | - Jule Goike
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - William N Voss
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - Hung-Che Kuo
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - Kevin N Dalby
- Division of Chemical Biology and Medicinal Chemistry, College of Pharmacy, The University of Texas at Austin, Austin, TX, USA
| | - Jimmy D Gollihar
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA; CCDC Army Research Laboratory-South, Austin, TX, USA; Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute, Houston Methodist Hospital, Houston, TX, USA; Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, TX, USA
| | - Ilya J Finkelstein
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, TX, USA
| |
|
54
|
Find and cut-and-transfer (FiCAT) mammalian genome engineering. Nat Commun 2021; 12:7071. [PMID: 34862378 PMCID: PMC8642419 DOI: 10.1038/s41467-021-27183-x] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 10/30/2020] [Accepted: 11/09/2021] [Indexed: 12/26/2022] Open
Abstract
While multiple technologies for small allele genome editing exist, robust technologies for targeted integration of large DNA fragments in mammalian genomes are still missing. Here we develop a gene delivery tool (FiCAT) combining the precision of CRISPR-Cas9 (find module) and the payload transfer efficiency of an engineered piggyBac transposase (cut-and-transfer module). FiCAT combines the functionality of Cas9 DNA scanning and targeting DNA, with piggyBac donor DNA processing and transfer capacity. PiggyBac functional domains are engineered to provide increased on-target integration while reducing off-target events. We demonstrate efficient delivery and programmable insertion of small and large payloads in cellulo (human (Hek293T, K-562) and mouse (C2C12)) and in vivo in mouse liver. Finally, we evolve more efficient versions of FiCAT by generating a targeted diversity of 394,000 variants and undergoing 4 rounds of evolution. In this work, we develop a precise and efficient targeted insertion of multi kilobase DNA fragments in mammalian genomes.
Mammalian genome engineering has advanced tremendously over the last decade; however, there is still a need for robust gene writing with size scaling capacity. Here the authors present Find Cut-and-Transfer (FiCAT) technology to deliver large targeted payload insertions in cell lines and in vivo in mouse models.
|
55
|
Findlay GM. Linking genome variants to disease: scalable approaches to test the functional impact of human mutations. Hum Mol Genet 2021; 30:R187-R197. [PMID: 34338757 PMCID: PMC8490018 DOI: 10.1093/hmg/ddab219] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 07/19/2021] [Revised: 07/19/2021] [Accepted: 07/19/2021] [Indexed: 11/13/2022] Open
Abstract
The application of genomics to medicine has accelerated the discovery of mutations underlying disease and has enhanced our knowledge of the molecular underpinnings of diverse pathologies. As the amount of human genetic material queried via sequencing has grown exponentially in recent years, so too has the number of rare variants observed. Despite progress, our ability to distinguish which rare variants have clinical significance remains limited. Over the last decade, however, powerful experimental approaches have emerged to characterize variant effects orders of magnitude faster than before. Fueled by improved DNA synthesis and sequencing and, more recently, by CRISPR/Cas9 genome editing, multiplex functional assays provide a means of generating variant effect data in wide-ranging experimental systems. Here, I review recent applications of multiplex assays that link human variants to disease phenotypes and I describe emerging strategies that will enhance their clinical utility in coming years.
Affiliation(s)
- Gregory M Findlay
- The Francis Crick Institute, The Genome Function Laboratory, London NW1 1AT, UK
| |
|
56
|
A broad analysis of splicing regulation in yeast using a large library of synthetic introns. PLoS Genet 2021; 17:e1009805. [PMID: 34570750 PMCID: PMC8496845 DOI: 10.1371/journal.pgen.1009805] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 06/02/2021] [Revised: 10/07/2021] [Accepted: 09/03/2021] [Indexed: 11/19/2022] Open
Abstract
RNA splicing is a key process in eukaryotic gene expression, in which an intron is spliced out of a pre-mRNA molecule to eventually produce a mature mRNA. Most intron-containing genes are constitutively spliced, hence efficient splicing of an intron is crucial for efficient regulation of gene expression. Here we use a large synthetic oligo library of ~20,000 variants to explore how different intronic sequence features affect splicing efficiency and mRNA expression levels in S. cerevisiae. Introns are defined by three functional sites, the 5’ donor site, the branch site, and the 3’ acceptor site. Using a combinatorial design of synthetic introns, we demonstrate how non-consensus splice site sequences in each of these sites affect splicing efficiency. We then show that S. cerevisiae splicing machinery tends to select alternative 3’ splice sites downstream of the original site, and we suggest that this tendency created a selective pressure, leading to the avoidance of cryptic splice site motifs near introns’ 3’ ends. We further use natural intronic sequences from other yeast species, whose splicing machineries have diverged to various extents, to show how intron architectures in the various species have been adapted to the organism’s splicing machinery. We suggest that the observed tendency for cryptic splicing is a result of a loss of a specific splicing factor, U2AF1. Lastly, we show that synthetic sequences containing two introns give rise to alternative RNA isoforms in S. cerevisiae, demonstrating that merely a synthetic fusion of two introns might suffice to facilitate alternative splicing in yeast. Our study reveals novel mechanisms by which introns are shaped in evolution to allow cells to regulate their transcriptome. In addition, it provides a valuable resource to study the regulation of constitutive and alternative splicing in a model organism.
RNA splicing is a process in which parts of a new pre-mRNA are spliced out of the mRNA molecule to eventually produce a mature mRNA. Those RNA segments that are spliced out are termed introns, and they are found in most genes in eukaryotic organisms. Hence regulation of this process has a major role in the control of gene expression. The budding yeast S. cerevisiae is a popular model organism for eukaryotic cell biology, but in terms of splicing it differs, as it has only a few intron-containing genes. Nevertheless, this species has been used to study basic principles of splicing regulation based on its ~300 introns. Here we used the technology of a large synthetic genetic library to introduce many new intron-containing genes to the yeast genome, to explore splicing regulation at a wider scope than has been possible so far. Reassuringly, our results confirm known regulatory mechanisms, and further expand our understanding of splicing regulation, specifically how the yeast splicing machinery interacts with the end of introns, and how through evolution introns have evolved to avoid unwanted misidentifications of this end. We further demonstrate the potential of the yeast splicing machinery to alternatively splice a two-intron gene, which is common in other eukaryotes but rare in yeast. Our work presents a first-of-its-kind resource for the systematic study of splicing in live cells.
|
57
|
Geck RC, Boyle G, Amorosi CJ, Fowler DM, Dunham MJ. Measuring Pharmacogene Variant Function at Scale Using Multiplexed Assays. Annu Rev Pharmacol Toxicol 2021; 62:531-550. [PMID: 34516287 DOI: 10.1146/annurev-pharmtox-032221-085807] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
As costs of next-generation sequencing decrease, identification of genetic variants has far outpaced our ability to understand their functional consequences. This lack of understanding is a central challenge to a key promise of pharmacogenomics: using genetic information to guide drug selection and dosing. Recently developed multiplexed assays of variant effect enable experimental measurement of the function of thousands of variants simultaneously. Here, we describe multiplexed assays that have been performed on nearly 25,000 variants in eight key pharmacogenes (ADRB2, CYP2C9, CYP2C19, NUDT15, SLCO1B1, TPMT, VKORC1, and the LDLR promoter), discuss advances in experimental design, and explore key challenges that must be overcome to maximize the utility of multiplexed functional data. Expected final online publication date for the Annual Review of Pharmacology and Toxicology, Volume 62 is January 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Renee C Geck
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Gabriel Boyle
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Clara J Amorosi
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA; Department of Bioengineering, University of Washington, Seattle, Washington 98195, USA
| | - Maitreya J Dunham
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| |
|
58
|
Sinyakov AN, Ryabinin VA, Kostina EV. Application of Array-Based Oligonucleotides for Synthesis of Genetic Designs. Mol Biol 2021. [DOI: 10.1134/s0026893321030109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
59
|
Gergics P, Smith C, Bando H, Jorge AAL, Rockstroh-Lippold D, Vishnopolska SA, Castinetti F, Maksutova M, Carvalho LRS, Hoppmann J, Martínez Mayer J, Albarel F, Braslavsky D, Keselman A, Bergadá I, Martí MA, Saveanu A, Barlier A, Abou Jamra R, Guo MH, Dauber A, Nakaguma M, Mendonca BB, Jayakody SN, Ozel AB, Fang Q, Ma Q, Li JZ, Brue T, Pérez Millán MI, Arnhold IJP, Pfaeffle R, Kitzman JO, Camper SA. High-throughput splicing assays identify missense and silent splice-disruptive POU1F1 variants underlying pituitary hormone deficiency. Am J Hum Genet 2021; 108:1526-1539. [PMID: 34270938 DOI: 10.1016/j.ajhg.2021.06.013] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 01/08/2021] [Accepted: 06/18/2021] [Indexed: 12/13/2022] Open
Abstract
Pituitary hormone deficiency occurs in ∼1:4,000 live births. Approximately 3% of the cases are due to mutations in the alpha isoform of POU1F1, a pituitary-specific transcriptional activator. We found four separate heterozygous missense variants in unrelated individuals with hypopituitarism that were predicted to affect a minor isoform, POU1F1 beta, which can act as a transcriptional repressor. These variants retain repressor activity, but they shift splicing to favor the expression of the beta isoform, resulting in dominant-negative loss of function. Using a high-throughput splicing reporter assay, we tested 1,070 single-nucleotide variants in POU1F1. We identified 96 splice-disruptive variants, including 14 synonymous variants. In separate cohorts, we found two additional synonymous variants nominated by this screen that co-segregate with hypopituitarism. This study underlines the importance of evaluating the impact of variants on splicing and provides a catalog for interpretation of variants of unknown significance in POU1F1.
Affiliation(s)
- Peter Gergics
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109-5618, USA
- Cathy Smith
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109-5618, USA; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218, USA
- Hironori Bando
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109-5618, USA
- Alexander A L Jorge
- Genetic Endocrinology Unit (LIM25), Division of Endocrinology, Hospital das Clinicas da Faculdade de Medicina da Universidade de São Paulo, São Paulo 01246-903, Brazil
- Denise Rockstroh-Lippold
- Department of Women's and Child Health, Division of Pediatric Endocrinology, University Hospital Leipzig, Leipzig 04103, Germany
- Sebastian A Vishnopolska
- Instituto de Biociencias, Biotecnología y Biología Traslacional, Departamento de Fisiología, Biología Molecular y Celular, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad de Buenos Aires, CABA CE1428EHA, Argentina
- Frederic Castinetti
- Aix Marseille University, AP-HM, INSERM, Marseille Medical Genetics, Marmara Institute, La Conception Hospital, Department of Endocrinology, Marseille 13005, France
- Mariam Maksutova
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109-5618, USA
- Luciani Renata Silveira Carvalho
- Developmental Endocrinology Unit, Laboratory of Hormones and Molecular Genetics LIM/42, Division of Endocrinology, Hospital das Clinicas da Faculdade de Medicina da Universidade de São Paulo, São Paulo 05403-900, Brazil
- Julia Hoppmann
- Department of Women's and Child Health, Division of Pediatric Endocrinology, University Hospital Leipzig, Leipzig 04103, Germany
- Julián Martínez Mayer
- Instituto de Biociencias, Biotecnología y Biología Traslacional, Departamento de Fisiología, Biología Molecular y Celular, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad de Buenos Aires, CABA CE1428EHA, Argentina
- Frédérique Albarel
- Aix Marseille University, AP-HM, INSERM, Marseille Medical Genetics, Marmara Institute, La Conception Hospital, Department of Endocrinology, Marseille 13005, France
- Debora Braslavsky
- Centro de Investigaciones Endocrinológicas "Dr. César Bergadá," FEI - CONICET - División de Endocrinología, Hospital de Niños Ricardo Gutiérrez, Ciudad de Buenos Aires, CABA CE1428EHA, Argentina
- Ana Keselman
- Centro de Investigaciones Endocrinológicas "Dr. César Bergadá," FEI - CONICET - División de Endocrinología, Hospital de Niños Ricardo Gutiérrez, Ciudad de Buenos Aires, CABA CE1428EHA, Argentina
- Ignacio Bergadá
- Centro de Investigaciones Endocrinológicas "Dr. César Bergadá," FEI - CONICET - División de Endocrinología, Hospital de Niños Ricardo Gutiérrez, Ciudad de Buenos Aires, CABA CE1428EHA, Argentina
- Marcelo A Martí
- Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires e Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales CONICET, Pabellòn 2 de Ciudad Universitaria, Ciudad de Buenos Aires, CABA C1428EHA, Argentina
- Alexandru Saveanu
- Aix Marseille University, AP-HM, INSERM, Marseille Medical Genetics, Marmara Institute, La Conception Hospital, Laboratory of Molecular Biology, Marseille 13385, France
- Anne Barlier
- Aix Marseille University, AP-HM, INSERM, Marseille Medical Genetics, Marmara Institute, La Conception Hospital, Laboratory of Molecular Biology, Marseille 13385, France
- Rami Abou Jamra
- Institute of Human Genetics, University of Leipzig Medical Center, Leipzig 04103, Germany
- Michael H Guo
- Division of Endocrinology, Boston Children's Hospital and Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Andrew Dauber
- Cincinnati Center for Growth Disorders, Division of Endocrinology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
- Marilena Nakaguma
- Developmental Endocrinology Unit, Laboratory of Hormones and Molecular Genetics LIM/42, Division of Endocrinology, Hospital das Clinicas da Faculdade de Medicina da Universidade de São Paulo, São Paulo 05403-900, Brazil
- Berenice B Mendonca
- Developmental Endocrinology Unit, Laboratory of Hormones and Molecular Genetics LIM/42, Division of Endocrinology, Hospital das Clinicas da Faculdade de Medicina da Universidade de São Paulo, São Paulo 05403-900, Brazil
- Sajini N Jayakody
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109-5618, USA
- A Bilge Ozel
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109-5618, USA
- Qing Fang
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109-5618, USA
- Qianyi Ma
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109-5618, USA
- Jun Z Li
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109-5618, USA
- Thierry Brue
- Aix Marseille University, AP-HM, INSERM, Marseille Medical Genetics, Marmara Institute, La Conception Hospital, Department of Endocrinology, Marseille 13005, France
- María Ines Pérez Millán
- Instituto de Biociencias, Biotecnología y Biología Traslacional, Departamento de Fisiología, Biología Molecular y Celular, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad de Buenos Aires, CABA CE1428EHA, Argentina
- Ivo J P Arnhold
- Developmental Endocrinology Unit, Laboratory of Hormones and Molecular Genetics LIM/42, Division of Endocrinology, Hospital das Clinicas da Faculdade de Medicina da Universidade de São Paulo, São Paulo 05403-900, Brazil
- Roland Pfaeffle
- Department of Women's and Child Health, Division of Pediatric Endocrinology, University Hospital Leipzig, Leipzig 04103, Germany; Institute of Human Genetics, University of Leipzig Medical Center, Leipzig 04103, Germany
- Jacob O Kitzman
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109-5618, USA; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218, USA
- Sally A Camper
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109-5618, USA
60. Lord J, Baralle D. Splicing in the Diagnosis of Rare Disease: Advances and Challenges. Front Genet 2021; 12:689892. [PMID: 34276790 PMCID: PMC8280750 DOI: 10.3389/fgene.2021.689892] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 04/01/2021] [Accepted: 06/07/2021] [Indexed: 12/13/2022] Open
Abstract
Mutations which affect splicing are significant contributors to rare disease, but are frequently overlooked by diagnostic sequencing pipelines. Greater ascertainment of pathogenic splicing variants will increase diagnostic yields, ending the diagnostic odyssey for patients and families affected by rare disorders, and improving treatment and care strategies. Advances in sequencing technologies, predictive modeling, and understanding of the mechanisms of splicing in recent years pave the way for improved detection and interpretation of splice affecting variants, yet several limitations still prohibit their routine ascertainment in diagnostic testing. This review explores some of these advances in the context of clinical application and discusses challenges to be overcome before these variants are comprehensively and routinely recognized in diagnostics.
Affiliation(s)
- Jenny Lord
- School of Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Diana Baralle
- School of Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Wessex Clinical Genetics Service, University Hospital Southampton NHS Foundation Trust, Southampton, United Kingdom
61. Rentzsch P, Schubach M, Shendure J, Kircher M. CADD-Splice-improving genome-wide variant effect prediction using deep learning-derived splice scores. Genome Med 2021; 13:31. [PMID: 33618777 PMCID: PMC7901104 DOI: 10.1186/s13073-021-00835-9] [Citation(s) in RCA: 419] [Impact Index Per Article: 104.8] [Received: 06/29/2020] [Accepted: 01/20/2021] [Indexed: 02/08/2023] Open
Abstract
Background: Splicing of genomic exons into mRNAs is a critical prerequisite for the accurate synthesis of human proteins. Genetic variants impacting splicing underlie a substantial proportion of genetic disease, but are challenging to identify beyond those occurring at donor and acceptor dinucleotides. To address this, various methods aim to predict variant effects on splicing. Recently, deep neural networks (DNNs) have been shown to achieve better results in predicting splice variants than other strategies. Methods: It has been unclear how best to integrate such process-specific scores into genome-wide variant effect predictors. Here, we use a recently published experimental data set to compare several machine learning methods that score variant effects on splicing. We integrate the best of those approaches into general variant effect prediction models and observe the effect on classification of known pathogenic variants. Results: We integrate two specialized splicing scores into CADD (Combined Annotation Dependent Depletion; cadd.gs.washington.edu), a widely used tool for genome-wide variant effect prediction that we previously developed to weight and integrate diverse collections of genomic annotations. With this new model, CADD-Splice, we show that inclusion of splicing DNN effect scores substantially improves predictions across multiple variant categories, without compromising overall performance. Conclusions: While splice effect scores show superior performance on splice variants, specialized predictors cannot compete with other variant scores in general variant interpretation, as the latter account for nonsense and missense effects that do not alter splicing. Although only shown here for splice scores, we believe that the applied approach will generalize to other specific molecular processes, providing a path for the further improvement of genome-wide variant effect prediction.
Supplementary Information: The online version contains supplementary material available at 10.1186/s13073-021-00835-9.
Affiliation(s)
- Philipp Rentzsch
- Charité - Universitätsmedizin Berlin, 10117, Berlin, Germany; Berlin Institute of Health (BIH), 10178, Berlin, Germany
- Max Schubach
- Charité - Universitätsmedizin Berlin, 10117, Berlin, Germany; Berlin Institute of Health (BIH), 10178, Berlin, Germany
- Jay Shendure
- Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA, 98195, USA; Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
- Martin Kircher
- Charité - Universitätsmedizin Berlin, 10117, Berlin, Germany; Berlin Institute of Health (BIH), 10178, Berlin, Germany
62. Liao SE, Regev O. Splicing at the phase-separated nuclear speckle interface: a model. Nucleic Acids Res 2021; 49:636-645. [PMID: 33337476 PMCID: PMC7826271 DOI: 10.1093/nar/gkaa1209] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 10/15/2020] [Revised: 11/24/2020] [Accepted: 12/03/2020] [Indexed: 02/07/2023] Open
Abstract
Phase-separated membraneless bodies play important roles in nucleic acid biology. While current models for the roles of phase separation largely focus on the compartmentalization of constituent proteins, we reason that other properties of phase separation may play functional roles. Specifically, we propose that interfaces of phase-separated membraneless bodies could have functional roles in spatially organizing biochemical reactions. Here we propose such a model for the nuclear speckle, a membraneless body implicated in RNA splicing. In our model, sequence-dependent RNA positioning along the nuclear speckle interface coordinates RNA splicing. Our model asserts that exons are preferentially sequestered into nuclear speckles through binding by SR proteins, while introns are excluded through binding by nucleoplasmic hnRNP proteins. As a result, splice sites at exon-intron boundaries are preferentially positioned at nuclear speckle interfaces. This positioning exposes splice sites to interface-localized spliceosomes, enabling the subsequent splicing reaction. Our model provides a simple mechanism that seamlessly explains much of the complex logic of splicing. This logic includes experimental results such as the antagonistic duality between splicing factors, the position dependence of splicing sequence motifs, and the collective contribution of many motifs to splicing decisions. Similar functional roles for phase-separated interfaces may exist for other membraneless bodies.
Affiliation(s)
- Susan E Liao
- Computer Science Department, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
- Oded Regev
- Computer Science Department, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
63. Amoah K, Hsiao YHE, Bahn JH, Sun Y, Burghard C, Tan BX, Yang EW, Xiao X. Allele-specific alternative splicing and its functional genetic variants in human tissues. Genome Res 2021; 31:359-371. [PMID: 33452016 PMCID: PMC7919445 DOI: 10.1101/gr.265637.120] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 05/05/2020] [Accepted: 01/14/2021] [Indexed: 02/07/2023]
Abstract
Alternative splicing is an RNA processing mechanism that affects most genes in human, contributing to disease mechanisms and phenotypic diversity. The regulation of splicing involves an intricate network of cis-regulatory elements and trans-acting factors. Due to their high sequence specificity, cis-regulation of splicing can be altered by genetic variants, significantly affecting splicing outcomes. Recently, multiple methods have been applied to understanding the regulatory effects of genetic variants on splicing. However, it is still challenging to go beyond apparent association to pinpoint functional variants. To fill in this gap, we utilized large-scale data sets of the Genotype-Tissue Expression (GTEx) project to study genetically modulated alternative splicing (GMAS) via identification of allele-specific splicing events. We demonstrate that GMAS events are shared across tissues and individuals more often than expected by chance, consistent with their genetically driven nature. Moreover, although the allelic bias of GMAS exons varies across samples, the degree of variation is similar across tissues versus individuals. Thus, genetic background drives the GMAS pattern to a similar degree as tissue-specific splicing mechanisms. Leveraging the genetically driven nature of GMAS, we developed a new method to predict functional splicing-altering variants, built upon a genotype-phenotype concordance model across samples. Complemented by experimental validations, this method predicted >1000 functional variants, many of which may alter RNA-protein interactions. Lastly, 72% of GMAS-associated SNPs were in linkage disequilibrium with GWAS-reported SNPs, and such association was enriched in tissues of relevance for specific traits/diseases. Our study enables a comprehensive view of genetically driven splicing variations in human tissues.
Affiliation(s)
- Kofi Amoah
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, California 90095, USA
- Yun-Hua Esther Hsiao
- Department of Bioengineering, University of California, Los Angeles, California 90095, USA
- Jae Hoon Bahn
- Department of Integrative Biology and Physiology, University of California, Los Angeles, California 90095, USA
- Yiwei Sun
- Department of Integrative Biology and Physiology, University of California, Los Angeles, California 90095, USA
- Christina Burghard
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, California 90095, USA
- Boon Xin Tan
- Department of Integrative Biology and Physiology, University of California, Los Angeles, California 90095, USA
- Ei-Wen Yang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, California 90095, USA
- Xinshu Xiao
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, California 90095, USA; Department of Bioengineering, University of California, Los Angeles, California 90095, USA; Department of Integrative Biology and Physiology, University of California, Los Angeles, California 90095, USA; Molecular Biology Institute, University of California, Los Angeles, California 90095, USA; Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, California 90095, USA
64. Jia X, Burugula BB, Chen V, Lemons RM, Jayakody S, Maksutova M, Kitzman JO. Massively parallel functional testing of MSH2 missense variants conferring Lynch syndrome risk. Am J Hum Genet 2021; 108:163-175. [PMID: 33357406 PMCID: PMC7820803 DOI: 10.1016/j.ajhg.2020.12.003] [Citation(s) in RCA: 77] [Impact Index Per Article: 19.3] [Received: 08/11/2020] [Accepted: 12/03/2020] [Indexed: 12/20/2022] Open
Abstract
The lack of functional evidence for the majority of missense variants limits their clinical interpretability and poses a key barrier to the broad utility of carrier screening. In Lynch syndrome (LS), one of the most highly prevalent cancer syndromes, nearly 90% of clinically observed missense variants are deemed “variants of uncertain significance” (VUS). To systematically resolve their functional status, we performed a massively parallel screen in human cells to identify loss-of-function missense variants in the key DNA mismatch repair factor MSH2. The resulting functional effect map is substantially complete, covering 94% of the 17,746 possible variants, and is highly concordant (96%) with existing functional data and expert clinicians’ interpretations. The large majority (89%) of missense variants were functionally neutral, perhaps unexpectedly in light of its evolutionary conservation. These data provide ready-to-use functional evidence to resolve the ∼1,300 extant missense VUSs in MSH2 and may facilitate the prospective classification of newly discovered variants in the clinic.
65. Hartin SN, Means JC, Alaimo JT, Younger ST. Expediting rare disease diagnosis: a call to bridge the gap between clinical and functional genomics. Mol Med 2020; 26:117. [PMID: 33238891 PMCID: PMC7691058 DOI: 10.1186/s10020-020-00244-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/27/2020] [Accepted: 11/18/2020] [Indexed: 11/10/2022] Open
Abstract
Approximately 400 million people throughout the world suffer from a rare disease. Although advances in whole exome and whole genome sequencing have greatly facilitated rare disease diagnosis, overall diagnostic rates remain below 50%. Furthermore, in cases where accurate diagnosis is achieved the process requires an average of 4.8 years. Reducing the time required for disease diagnosis is among the most critical needs of patients impacted by a rare disease. In this perspective we describe current challenges associated with rare disease diagnosis and discuss several cutting-edge functional genomic screening technologies that have the potential to rapidly accelerate the process of distinguishing pathogenic variants that lead to disease.
Affiliation(s)
- Samantha N Hartin
- Center for Pediatric Genomic Medicine, Children's Mercy Kansas City, Kansas City, MO, 64108, USA; Children's Mercy Research Institute, Children's Mercy Kansas City, Kansas City, MO, 64108, USA
- John C Means
- Center for Pediatric Genomic Medicine, Children's Mercy Kansas City, Kansas City, MO, 64108, USA; Children's Mercy Research Institute, Children's Mercy Kansas City, Kansas City, MO, 64108, USA
- Joseph T Alaimo
- Center for Pediatric Genomic Medicine, Children's Mercy Kansas City, Kansas City, MO, 64108, USA; Children's Mercy Research Institute, Children's Mercy Kansas City, Kansas City, MO, 64108, USA; Department of Pediatrics, University of Missouri-Kansas City School of Medicine, Kansas City, MO, 64110, USA; Department of Pathology and Laboratory Medicine, Children's Mercy Kansas City, Kansas City, MO, 64108, USA
- Scott T Younger
- Center for Pediatric Genomic Medicine, Children's Mercy Kansas City, Kansas City, MO, 64108, USA; Children's Mercy Research Institute, Children's Mercy Kansas City, Kansas City, MO, 64108, USA; Department of Pediatrics, University of Missouri-Kansas City School of Medicine, Kansas City, MO, 64110, USA; Department of Pediatrics, University of Kansas Medical Center, Kansas City, KS, 66160, USA
66. Renganaath K, Chong R, Day L, Kosuri S, Kruglyak L, Albert FW. Systematic identification of cis-regulatory variants that cause gene expression differences in a yeast cross. eLife 2020; 9:e62669. [PMID: 33179598 PMCID: PMC7685706 DOI: 10.7554/elife.62669] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 09/02/2020] [Accepted: 11/11/2020] [Indexed: 02/06/2023] Open
Abstract
Sequence variation in regulatory DNA alters gene expression and shapes genetically complex traits. However, the identification of individual, causal regulatory variants is challenging. Here, we used a massively parallel reporter assay to measure the cis-regulatory consequences of 5832 natural DNA variants in the promoters of 2503 genes in the yeast Saccharomyces cerevisiae. We identified 451 causal variants, which underlie genetic loci known to affect gene expression. Several promoters harbored multiple causal variants. In five promoters, pairs of variants showed non-additive, epistatic interactions. Causal variants were enriched at conserved nucleotides, tended to have low derived allele frequency, and were depleted from promoters of essential genes, which is consistent with the action of negative selection. Causal variants were also enriched for alterations in transcription factor binding sites. Models integrating these features provided modest, but statistically significant, ability to predict causal variants. This work revealed a complex molecular basis for cis-acting regulatory variation.
Affiliation(s)
- Kaushik Renganaath
- Department of Genetics, Cell Biology, & Development, University of Minnesota, Minneapolis, United States
- Rockie Chong
- Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, United States
- Laura Day
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, United States
- Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, United States
- Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, United States
- Sriram Kosuri
- Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, United States
- Leonid Kruglyak
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, United States
- Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, United States
- Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, United States
- Frank W Albert
- Department of Genetics, Cell Biology, & Development, University of Minnesota, Minneapolis, United States
67. Baeza-Centurion P, Miñana B, Valcárcel J, Lehner B. Mutations primarily alter the inclusion of alternatively spliced exons. eLife 2020; 9:59959. [PMID: 33112234 PMCID: PMC7673789 DOI: 10.7554/elife.59959] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 06/12/2020] [Accepted: 10/27/2020] [Indexed: 12/17/2022] Open
Abstract
Genetic analyses and systematic mutagenesis have revealed that synonymous, non-synonymous and intronic mutations frequently alter the inclusion levels of alternatively spliced exons, consistent with the concept that altered splicing might be a common mechanism by which mutations cause disease. However, most exons expressed in any cell are highly-included in mature mRNAs. Here, by performing deep mutagenesis of highly-included exons and by analysing the association between genome sequence variation and exon inclusion across the transcriptome, we report that mutations only very rarely alter the inclusion of highly-included exons. This is true for both exonic and intronic mutations as well as for perturbations in trans. Therefore, mutations that affect splicing are not evenly distributed across primary transcripts but are focussed in and around alternatively spliced exons with intermediate inclusion levels. These results provide a resource for prioritising synonymous and other variants as disease-causing mutations.
Affiliation(s)
- Pablo Baeza-Centurion
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- Belén Miñana
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- Juan Valcárcel
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Ben Lehner
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
68. Jones EM, Lubock NB, Venkatakrishnan AJ, Wang J, Tseng AM, Paggi JM, Latorraca NR, Cancilla D, Satyadi M, Davis JE, Babu MM, Dror RO, Kosuri S. Structural and functional characterization of G protein-coupled receptors with deep mutational scanning. eLife 2020; 9:54895. [PMID: 33084570 PMCID: PMC7707821 DOI: 10.7554/elife.54895] [Citation(s) in RCA: 92] [Impact Index Per Article: 18.4] [Received: 01/05/2020] [Accepted: 10/16/2020] [Indexed: 01/14/2023] Open
Abstract
The >800 human G protein–coupled receptors (GPCRs) are responsible for transducing diverse chemical stimuli to alter cell state and are the largest class of drug targets. Their myriad structural conformations and various modes of signaling make it challenging to understand their structure and function. Here, we developed a platform to characterize large libraries of GPCR variants in human cell lines with a barcoded transcriptional reporter of G protein signal transduction. We tested 7800 of 7828 possible single amino acid substitutions to the beta-2 adrenergic receptor (β2AR) at four concentrations of the agonist isoproterenol. We identified residues specifically important for β2AR signaling, mutations in the human population that are potentially loss of function, and residues that modulate basal activity. Using unsupervised learning, we identify residues critical for signaling, including all major structural motifs and molecular interfaces. We also find a previously uncharacterized structural latch spanning the first two extracellular loops that is highly conserved across Class A GPCRs and is conformationally rigid in both the inactive and active states of the receptor. More broadly, by linking deep mutational scanning with engineered transcriptional reporters, we establish a generalizable method for exploring pharmacogenomics, structure and function across broad classes of drug receptors.
Affiliation(s)
- Eric M Jones
- Department of Chemistry and Biochemistry, UCLA-DOE Institute for Genomics and Proteomics, Molecular Biology Institute, Quantitative and Computational Biology Institute, Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, and Jonsson Comprehensive Cancer Center, UCLA, Los Angeles, United States
- Nathan B Lubock
- Department of Chemistry and Biochemistry, UCLA-DOE Institute for Genomics and Proteomics, Molecular Biology Institute, Quantitative and Computational Biology Institute, Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, and Jonsson Comprehensive Cancer Center, UCLA, Los Angeles, United States
- A J Venkatakrishnan
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom; Department of Computer Science, Stanford University, Department of Computer Science, Institute for Computational and Mathematical Engineering, Stanford University, Department of Computer Science, Department of Molecular and Cellular Physiology, Stanford University School of Medicine, Department of Computer Science, Department of Structural Biology, Stanford University School of Medicine, Stanford, United States
- Jeffrey Wang
- Department of Chemistry and Biochemistry, UCLA-DOE Institute for Genomics and Proteomics, Molecular Biology Institute, Quantitative and Computational Biology Institute, Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, and Jonsson Comprehensive Cancer Center, UCLA, Los Angeles, United States
- Alex M Tseng
- Department of Computer Science, Stanford University, Department of Computer Science, Institute for Computational and Mathematical Engineering, Stanford University, Department of Computer Science, Department of Molecular and Cellular Physiology, Stanford University School of Medicine, Department of Computer Science, Department of Structural Biology, Stanford University School of Medicine, Stanford, United States
- Joseph M Paggi
- Department of Computer Science, Stanford University, Department of Computer Science, Institute for Computational and Mathematical Engineering, Stanford University, Department of Computer Science, Department of Molecular and Cellular Physiology, Stanford University School of Medicine, Department of Computer Science, Department of Structural Biology, Stanford University School of Medicine, Stanford, United States
- Naomi R Latorraca
- Department of Computer Science, Stanford University, Department of Computer Science, Institute for Computational and Mathematical Engineering, Stanford University, Department of Computer Science, Department of Molecular and Cellular Physiology, Stanford University School of Medicine, Department of Computer Science, Department of Structural Biology, Stanford University School of Medicine, Stanford, United States
- Daniel Cancilla
- Department of Chemistry and Biochemistry, UCLA-DOE Institute for Genomics and Proteomics, Molecular Biology Institute, Quantitative and Computational Biology Institute, Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, and Jonsson Comprehensive Cancer Center, UCLA, Los Angeles, United States
- Megan Satyadi
- Department of Chemistry and Biochemistry, UCLA-DOE Institute for Genomics and Proteomics, Molecular Biology Institute, Quantitative and Computational Biology Institute, Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, and Jonsson Comprehensive Cancer Center, UCLA, Los Angeles, United States
- Jessica E Davis
- Department of Chemistry and Biochemistry, UCLA-DOE Institute for Genomics and Proteomics, Molecular Biology Institute, Quantitative and Computational Biology Institute, Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, and Jonsson Comprehensive Cancer Center, UCLA, Los Angeles, United States
- M Madan Babu
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Ron O Dror
- Department of Computer Science, Stanford University, Department of Computer Science, Institute for Computational and Mathematical Engineering, Stanford University, Department of Computer Science, Department of Molecular and Cellular Physiology, Stanford University School of Medicine, Department of Computer Science, Department of Structural Biology, Stanford University School of Medicine, Stanford, United States
| | - Sriram Kosuri
- Department of Chemistry and Biochemistry, UCLA-DOE Institute for Genomics and Proteomics, Molecular Biology Institute, Quantitative and Computational Biology Institute, Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, and Jonsson Comprehensive Cancer Center, UCLA, Los Angeles, United States

69
Tubeuf H, Charbonnier C, Soukarieh O, Blavier A, Lefebvre A, Dauchel H, Frebourg T, Gaildrat P, Martins A. Large-scale comparative evaluation of user-friendly tools for predicting variant-induced alterations of splicing regulatory elements. Hum Mutat 2020; 41:1811-1829. [PMID: 32741062 DOI: 10.1002/humu.24091] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 02/05/2020] [Revised: 07/11/2020] [Accepted: 07/26/2020] [Indexed: 12/20/2022]
Abstract
Discriminating which nucleotide variants cause disease or contribute to phenotypic traits remains a major challenge in human genetics. In theory, any intragenic variant can potentially affect RNA splicing by altering splicing regulatory elements (SREs). However, these alterations are often ignored mainly because pioneer SRE predictors have proved inefficient. Here, we report the first large-scale comparative evaluation of four user-friendly SRE-dedicated algorithms (QUEPASA, HEXplorer, SPANR, and HAL) tested both as standalone tools and in multiple combined ways based on two independent benchmark datasets adding up to >1,300 exonic variants studied at the messenger RNA level and mapping to 89 different disease-causing genes. These methods display good predictive power, based on decision thresholds derived from the receiver operating characteristics curve analyses, with QUEPASA and HAL having the best accuracies either as standalone or in combination. Still, overall there was a tight race between the four predictors, suggesting that all methods may be of use. Additionally, QUEPASA and HEXplorer may be beneficial as well for predicting variant-induced creation of pseudoexons deep within introns. Our study highlights the potential of SRE predictors as filtering tools for identifying disease-causing candidates among the plethora of variants detected by high-throughput DNA sequencing and provides guidance for their use in genomic medicine settings.
Affiliation(s)
- Hélène Tubeuf
- Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France; Interactive Biosoftware, Rouen, France
- Camille Charbonnier
- Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Omar Soukarieh
- Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Arnaud Lefebvre
- Computer Science, Information Processing and Systems Laboratory, UNIROUEN, Normandie University, Mont-Saint-Aignan, France
- Hélène Dauchel
- Computer Science, Information Processing and Systems Laboratory, UNIROUEN, Normandie University, Mont-Saint-Aignan, France
- Thierry Frebourg
- Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France; Department of Genetics, University Hospital, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Pascaline Gaildrat
- Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France
- Alexandra Martins
- Inserm U1245, UNIROUEN, Normandie University, Normandy Centre for Genomic and Personalized Medicine, Rouen, France

70
Rong S, Buerer L, Rhine CL, Wang J, Cygan KJ, Fairbrother WG. Mutational bias and the protein code shape the evolution of splicing enhancers. Nat Commun 2020; 11:2845. [PMID: 32504065 PMCID: PMC7275064 DOI: 10.1038/s41467-020-16673-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/30/2019] [Accepted: 04/28/2020] [Indexed: 02/06/2023] Open
Abstract
Exonic splicing enhancers (ESEs) are enriched in exons relative to introns and bind splicing activators. This study considers a fundamental question of co-evolution: How did ESE motifs become enriched in exons prior to the evolution of ESE recognition? We hypothesize that the high exon to intron motif ratios necessary for ESE function were created by mutational bias coupled with purifying selection on the protein code. These two forces retain certain coding motifs in exons while passively depleting them from introns. Through the use of simulations, genomic analyses, and high throughput splicing assays, we confirm the key predictions of this hypothesis, including an overlap between protein and splicing information in ESEs. We discuss the implications of mutational bias as an evolutionary driver in other cis-regulatory systems.

Splicing is regulated by cis-acting elements in pre-mRNAs such as exonic or intronic splicing enhancers and silencers. Here the authors show that exonic splicing enhancers are enriched in exons compared to introns due to mutational bias coupled with purifying selection on the protein code.
Affiliation(s)
- Stephen Rong
- Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA; Ecology and Evolutionary Biology, Brown University, Providence, RI, 02912, USA
- Luke Buerer
- Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
- Christy L Rhine
- Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- Jing Wang
- Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- Kamil J Cygan
- Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA; Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- William G Fairbrother
- Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA; Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA; Hassenfeld Child Health Innovation Institute of Brown University, Providence, RI, 02912, USA

71
Emerging Roles for 3' UTRs in Neurons. Int J Mol Sci 2020; 21:ijms21103413. [PMID: 32408514 PMCID: PMC7279237 DOI: 10.3390/ijms21103413] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 04/08/2020] [Revised: 05/06/2020] [Accepted: 05/09/2020] [Indexed: 12/14/2022] Open
Abstract
The 3′ untranslated regions (3′ UTRs) of mRNAs serve as hubs for post-transcriptional control as the targets of microRNAs (miRNAs) and RNA-binding proteins (RBPs). Sequences in 3′ UTRs confer alterations in mRNA stability, direct mRNA localization to subcellular regions, and impart translational control. Thousands of mRNAs are localized to subcellular compartments in neurons—including axons, dendrites, and synapses—where they are thought to undergo local translation. Despite an established role for 3′ UTR sequences in imparting mRNA localization in neurons, the specific RNA sequences and structural features at play remain poorly understood. The nervous system selectively expresses longer 3′ UTR isoforms via alternative polyadenylation (APA). The regulation of APA in neurons and the neuronal functions of longer 3′ UTR mRNA isoforms are starting to be uncovered. Surprising roles for 3′ UTRs are emerging beyond the regulation of protein synthesis and include roles as RBP delivery scaffolds and regulators of alternative splicing. Evidence is also emerging that 3′ UTRs can be cleaved, leading to stable, isolated 3′ UTR fragments which are of unknown function. Mutations in 3′ UTRs are implicated in several neurological disorders—more studies are needed to uncover how these mutations impact gene regulation and what is their relationship to disease severity.

72
Matreyek KA, Stephany JJ, Chiasson MA, Hasle N, Fowler DM. An improved platform for functional assessment of large protein libraries in mammalian cells. Nucleic Acids Res 2020; 48:e1. [PMID: 31612958 PMCID: PMC7145622 DOI: 10.1093/nar/gkz910] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 08/12/2019] [Revised: 09/30/2019] [Accepted: 10/02/2019] [Indexed: 12/19/2022] Open
Abstract
Multiplex genetic assays can simultaneously test thousands of genetic variants for a property of interest. However, limitations of existing multiplex assay methods in cultured mammalian cells hinder the breadth, speed and scale of these experiments. Here, we describe a series of improvements that greatly enhance the capabilities of a Bxb1 recombinase-based landing pad system for conducting different types of multiplex genetic assays in various mammalian cell lines. We incorporate the landing pad into a lentiviral vector, easing the process of generating new landing pad cell lines. We also develop several new landing pad versions, including one where the Bxb1 recombinase is expressed from the landing pad itself, improving recombination efficiency more than 2-fold and permitting rapid prototyping of transgenic constructs. Other versions incorporate positive and negative selection markers that enable drug-based enrichment of recombinant cells, enabling the use of larger libraries and reducing costs. A version with dual convergent promoters allows enrichment of recombinant cells independent of transgene expression, permitting the assessment of libraries of transgenes that perturb cell growth and survival. Lastly, we demonstrate these improvements by assessing the effects of a combinatorial library of oncogenes and tumor suppressors on cell growth. Collectively, these advancements make multiplex genetic assays in diverse cultured cell lines easier, cheaper and more effective, facilitating future studies probing how proteins impact cell function, using transgenic variant libraries tested individually or in combination.
Affiliation(s)
- Kenneth A Matreyek
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Jason J Stephany
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Melissa A Chiasson
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Nicholas Hasle
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Department of Bioengineering, University of Washington, Seattle, WA 98195, USA

73
Esposito D, Weile J, Shendure J, Starita LM, Papenfuss AT, Roth FP, Fowler DM, Rubin AF. MaveDB: an open-source platform to distribute and interpret data from multiplexed assays of variant effect. Genome Biol 2019; 20:223. [PMID: 31679514 PMCID: PMC6827219 DOI: 10.1186/s13059-019-1845-6] [Citation(s) in RCA: 154] [Impact Index Per Article: 25.7] [Received: 03/31/2019] [Accepted: 10/01/2019] [Indexed: 11/10/2022] Open
Abstract
Multiplex assays of variant effect (MAVEs), such as deep mutational scans and massively parallel reporter assays, test thousands of sequence variants in a single experiment. Despite the importance of MAVE data for basic and clinical research, there is no standard resource for their discovery and distribution. Here, we present MaveDB (https://www.mavedb.org), a public repository for large-scale measurements of sequence variant impact, designed for interoperability with applications to interpret these datasets. We also describe the first such application, MaveVis, which retrieves, visualizes, and contextualizes variant effect maps. Together, the database and applications will empower the community to mine these powerful datasets.
Affiliation(s)
- Daniel Esposito
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Jochen Weile
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Lea M Starita
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Anthony T Papenfuss
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, VIC, Australia
- Department of Mathematics and Statistics, University of Melbourne, Melbourne, VIC, Australia
- Frederick P Roth
- The Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Canadian Institute for Advanced Research, Toronto, ON, Canada
- Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Canadian Institute for Advanced Research, Toronto, ON, Canada
- Department of Bioengineering, University of Washington, Seattle, WA, USA
- Alan F Rubin
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Bioinformatics and Cancer Genomics Laboratory, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia

74
Shirley BC, Mucaki EJ, Rogan PK. Pan-cancer repository of validated natural and cryptic mRNA splicing mutations. F1000Res 2019; 7:1908. [PMID: 31275557 DOI: 10.12688/f1000research.17204.1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Accepted: 11/30/2018] [Indexed: 12/26/2022] Open
Abstract
We present a major public resource of mRNA splicing mutations validated according to multiple lines of evidence of abnormal gene expression. Likely mutations present in all tumor types reported in the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) were identified based on the comparative strengths of splice sites in tumor versus normal genomes, and then validated by respectively comparing counts of splice junction spanning and abundance of transcript reads in RNA-Seq data from matched tissues and tumors lacking these mutations. The comprehensive resource features 341,486 of these validated mutations, the majority of which (69.9%) are not present in the Single Nucleotide Polymorphism Database (dbSNP 150). There are 131,347 unique mutations which weaken or abolish natural splice sites, and 222,071 mutations which strengthen cryptic splice sites (11,932 affect both simultaneously). 28,812 novel or rare flagged variants (with <1% population frequency in dbSNP) were observed in multiple tumor tissue types. An algorithm was developed to classify variants into splicing molecular phenotypes that integrates germline heterozygosity, degree of information change and impact on expression. The classification thresholds were calibrated against the ClinVar clinical database phenotypic assignments. Variants are partitioned into allele-specific alternative splicing, likely aberrant and aberrant splicing phenotypes. Single variants or chromosome ranges can be queried using a Global Alliance for Genomics and Health (GA4GH)-compliant, web-based Beacon "Validated Splicing Mutations" either separately or in aggregate alongside other Beacons through the public Beacon Network, as well as through our website. The website provides additional information, such as a visual representation of supporting RNAseq results, gene expression in the corresponding normal tissues, and splicing molecular phenotypes.
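The mutation counts quoted in this abstract agree with one another under inclusion-exclusion: mutations that affect both a natural and a cryptic site appear in both subtotals and must be subtracted once. The following is a minimal Python check that uses only the figures stated above; the variable names are illustrative and do not come from the resource itself.

```python
# Counts as quoted in the abstract above.
weakened_natural = 131_347      # weaken or abolish natural splice sites
strengthened_cryptic = 222_071  # strengthen cryptic splice sites
affect_both = 11_932            # counted in both groups above

# Inclusion-exclusion: subtract the overlap so it is counted only once.
total_unique = weakened_natural + strengthened_cryptic - affect_both

print(total_unique)  # 341486, the total number of validated mutations reported
```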
Affiliation(s)
- Eliseos J Mucaki
- Biochemistry, University of Western Ontario, London, Ontario, N6A 2C1, Canada
- Peter K Rogan
- CytoGnomix Inc., London, Ontario, N5X 3X5, Canada; Biochemistry, University of Western Ontario, London, Ontario, N6A 2C1, Canada; Computer Science, University of Western Ontario, London, Ontario, N6A 2C1, Canada; Oncology, University of Western Ontario, London, Ontario, N6A 2C1, Canada

75
Mount SM, Avsec Ž, Carmel L, Casadio R, Çelik MH, Chen K, Cheng J, Cohen NE, Fairbrother WG, Fenesh T, Gagneur J, Gotea V, Holzer T, Lin CF, Martelli PL, Naito T, Nguyen TYD, Savojardo C, Unger R, Wang R, Yang Y, Zhao H. Assessing predictions of the impact of variants on splicing in CAGI5. Hum Mutat 2019; 40:1215-1224. [PMID: 31301154 DOI: 10.1002/humu.23869] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 01/28/2019] [Revised: 06/20/2019] [Accepted: 07/10/2019] [Indexed: 12/28/2022]
Abstract
Precision medicine and sequence-based clinical diagnostics seek to predict disease risk or to identify causative variants from sequencing data. The Critical Assessment of Genome Interpretation (CAGI) is a community experiment consisting of genotype-phenotype prediction challenges; participants build models, undergo assessment, and share key findings. In the past, few CAGI challenges have addressed the impact of sequence variants on splicing. In CAGI5, two challenges (Vex-seq and MaPSY) involved prediction of the effect of variants, primarily single-nucleotide changes, on splicing. Although there are significant differences between these two challenges, both involved prediction of results from high-throughput exon inclusion assays. Here, we discuss the methods used to predict the impact of these variants on splicing, their performance, strengths, and weaknesses, and prospects for predicting the impact of sequence variation on splicing and disease phenotypes.
Affiliation(s)
- Stephen M Mount
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland
- Žiga Avsec
- Department of Informatics, Technical University of Munich, Garching, Germany
- Liran Carmel
- Department of Genetics, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Rita Casadio
- Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
- Ken Chen
- School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
- Jun Cheng
- Department of Informatics, Technical University of Munich, Garching, Germany
- Noa E Cohen
- Department of Genetics, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel; The integrated program for Computer Science and Computational Biology, School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- William G Fairbrother
- Department of Molecular Biology, Cell Biology, and Biochemistry, Center For Computational Biology, Brown University, Providence, Rhode Island
- Tzila Fenesh
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
- Julien Gagneur
- Department of Informatics, Technical University of Munich, Garching, Germany
- Valer Gotea
- National Human Genome Research Institute (NHGRI), National Institutes of Health (NIH), Bethesda, Maryland
- Tamar Holzer
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
- Chiao-Feng Lin
- Translational Informatics, DNAnexus, Mountain View, California
- Pier Luigi Martelli
- Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
- Tatsuhiko Naito
- Department of Neurology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Castrense Savojardo
- Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
- Ron Unger
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
- Robert Wang
- Department of Bioengineering, University of California, Berkeley, California; Department of Plant and Microbial Biology, University of California, Berkeley, California
- Yuedong Yang
- School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
- Huiying Zhao
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China

76
Wang R, Wang Y, Hu Z. Using secondary structure to predict the effects of genetic variants on alternative splicing. Hum Mutat 2019; 40:1270-1279. [PMID: 31074545 DOI: 10.1002/humu.23790] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/08/2019] [Revised: 04/15/2019] [Accepted: 05/06/2019] [Indexed: 01/29/2023]
Abstract
Accurate interpretation of genomic variants that alter RNA splicing is critical to precision medicine. We present a computational framework, Prediction of variant Effect on Percent Spliced In (PEPSI), that predicts the splicing impact of coding and noncoding variants for the Fifth Critical Assessment of Genome Interpretation (CAGI5) "Vex-seq" challenge. PEPSI is a random forest regression model trained on multiple layers of features associated with sequence conservation and regulatory sequence elements. Compared to other splicing defect prediction tools from the literature, our framework integrates secondary structure information in predicting variants that disrupt splicing regulatory elements (SREs). We applied our model to classify splice-disrupting variants among 2,094 single-nucleotide polymorphisms from the Exome Aggregation Consortium using model-predicted changes in percent spliced in (ΔPSI) associated with tested variants. Benchmarking our model against widely used state-of-the-art tools, we demonstrate that PEPSI achieves comparable performance in terms of sensitivity and precision. Moreover, we also show that using secondary structure context can help resolve several cases where changes in the counts of SREs do not correspond with the directionality of ΔPSI measured for tested variants.
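For readers unfamiliar with the ΔPSI quantity mentioned in this abstract, the sketch below shows how percent spliced in is commonly computed from read counts. It assumes the usual definition (reads supporting exon inclusion as a percentage of all informative reads); it is not the PEPSI model, and the function names and read counts are hypothetical, chosen only for illustration.

```python
def psi(inclusion_reads: int, exclusion_reads: int) -> float:
    """Percent spliced in: share of informative reads that support exon inclusion."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads support either isoform")
    return 100.0 * inclusion_reads / total


def delta_psi(reference: tuple[int, int], variant: tuple[int, int]) -> float:
    """Change in PSI for the variant allele relative to the reference allele."""
    return psi(*variant) - psi(*reference)


# Hypothetical (inclusion, exclusion) read counts for illustration only.
reference_counts = (80, 20)  # PSI = 80.0
variant_counts = (50, 50)    # PSI = 50.0

print(delta_psi(reference_counts, variant_counts))  # -30.0
```

A negative ΔPSI indicates that the variant reduces exon inclusion relative to the reference allele.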
Affiliation(s)
- Robert Wang
- Department of Bioengineering, University of California, Berkeley, California; Department of Plant and Microbial Biology, University of California, Berkeley, California
- Yaqiong Wang
- Department of Plant and Microbial Biology, University of California, Berkeley, California
- Zhiqiang Hu
- Department of Plant and Microbial Biology, University of California, Berkeley, California

77
Rotival M. Characterising the genetic basis of immune response variation to identify causal mechanisms underlying disease susceptibility. HLA 2019; 94:275-284. [PMID: 31115186 DOI: 10.1111/tan.13598] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/17/2019] [Accepted: 05/15/2019] [Indexed: 12/12/2022]
Abstract
Over the last 10 years, genome-wide association studies (GWAS) have identified hundreds of susceptibility loci for autoimmune diseases. However, despite increasing power for the detection of both common and rare coding variants affecting disease susceptibility, a large fraction of disease heritability has remained unexplained. In addition, a majority of the identified loci are located in noncoding regions, and translation of disease-associated loci into new biological insights on the etiology of immune disorders has been lagging. This highlights the need for a better understanding of noncoding variation and new strategies to identify causal genes at disease loci. In this review, I will first detail the molecular basis of gene expression and review the various mechanisms that contribute to altering gene activity at the transcriptional and post-transcriptional levels. I will then review the findings from 10 years of functional genomics studies regarding the genetics of gene expression, in particular in the context of infection. Finally, I will discuss the extent to which genetic variants that modulate gene expression at the transcriptional and post-transcriptional levels contribute to disease susceptibility and present strategies to leverage this information for the identification of causal mechanisms at disease loci in the era of whole genome sequencing.
Affiliation(s)
- Maxime Rotival
- Unit of Human Evolutionary Genetics, CNRS UMR2000, Institut Pasteur, Paris, France

78
Qiu C, Kaplan CD. Functional assays for transcription mechanisms in high-throughput. Methods 2019; 159-160:115-123. [PMID: 30797033 PMCID: PMC6589137 DOI: 10.1016/j.ymeth.2019.02.017] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 01/29/2019] [Accepted: 02/18/2019] [Indexed: 01/12/2023] Open
Abstract
Dramatic increases in the scale of programmed synthesis of nucleic acid libraries coupled with deep sequencing have powered advances in understanding nucleic acid and protein biology. Biological systems centering on nucleic acids or encoded proteins greatly benefit from such high-throughput studies, given that large DNA variant pools can be synthesized and DNA, or RNA products of transcription, can be easily analyzed by deep sequencing. Here we review the scope of various high-throughput functional assays for studies of nucleic acids and proteins in general, followed by discussion of how these types of study have yielded insights into the RNA Polymerase II (Pol II) active site as an example. We discuss methodological considerations in the design and execution of these experiments that should be valuable to studies in any system.
Affiliation(s)
- Chenxi Qiu
- Department of Medicine, Division of Translational Therapeutics, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA; Cancer Research Institute, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Craig D Kaplan
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA

79
Rotival M, Quach H, Quintana-Murci L. Defining the genetic and evolutionary architecture of alternative splicing in response to infection. Nat Commun 2019; 10:1671. [PMID: 30975994 PMCID: PMC6459842 DOI: 10.1038/s41467-019-09689-7] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 06/19/2018] [Accepted: 03/21/2019] [Indexed: 12/17/2022] Open
Abstract
Host and environmental factors contribute to variation in human immune responses, yet the genetic and evolutionary drivers of alternative splicing in response to infection remain largely uncharacterised. Leveraging 970 RNA-sequencing profiles of resting and stimulated monocytes from 200 individuals of African- and European-descent, we show that immune activation elicits a marked remodelling of the isoform repertoire, while increasing the levels of erroneous splicing. We identify 1,464 loci associated with variation in isoform usage (sQTLs), 9% of them being stimulation-specific, which are enriched in disease-related loci. Furthermore, we detect a longstanding increased plasticity of immune gene splicing, and show that positive selection and Neanderthal introgression have both contributed to diversify the splicing landscape of human populations. Together, these findings suggest that differential isoform usage has been an important substrate of innovation in the long-term evolution of immune responses and a more recent vehicle of population local adaptation.

Genetic ancestry might influence immunological response to infection at different regulatory levels. Here, the authors use RNA-Seq to investigate the variability of alternative splicing patterns in resting and stimulated monocytes of African- and European-descent.
Affiliation(s)
- Maxime Rotival
- Human Evolutionary Genetics Unit, Institut Pasteur, CNRS UMR2000, 25-28 rue Dr Roux, Paris, 75015, France
- Hélène Quach
- Human Evolutionary Genetics Unit, Institut Pasteur, CNRS UMR2000, 25-28 rue Dr Roux, Paris, 75015, France
- Lluis Quintana-Murci
- Human Evolutionary Genetics Unit, Institut Pasteur, CNRS UMR2000, 25-28 rue Dr Roux, Paris, 75015, France

80
On fitness: how do mutations shape the biology of cancer? Biochem Soc Trans 2019; 47:559-569. [DOI: 10.1042/bst20180224] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/09/2018] [Revised: 01/31/2019] [Accepted: 02/14/2019] [Indexed: 12/14/2022]
Abstract
The theory of evolution by natural selection shapes our understanding of the living world. While natural selection has given rise to all the intricacies of life on the planet, those responsible for treating cancer have a darker view of adaptation and selection. Revolutionary changes in DNA sequencing technology have allowed us to survey the complexities that constitute the cancer genome, while advances in genetic engineering are allowing us to functionally interrogate these alterations. These approaches are providing new insights into how mutations influence cancer biology. It is possible that with time, this new knowledge will allow us to take control of the evolutionary processes that shape the disease, to develop more effective treatments.

81
Cheng J, Nguyen TYD, Cygan KJ, Çelik MH, Fairbrother WG, Avsec Ž, Gagneur J. MMSplice: modular modeling improves the predictions of genetic variant effects on splicing. Genome Biol 2019; 20:48. [PMID: 30823901 PMCID: PMC6396468 DOI: 10.1186/s13059-019-1653-z] [Citation(s) in RCA: 136] [Impact Index Per Article: 22.7] [Received: 10/02/2018] [Accepted: 02/12/2019] [Indexed: 12/15/2022] Open
Abstract
Predicting the effects of genetic variants on splicing is highly relevant for human genetics. We describe the framework MMSplice (modular modeling of splicing) with which we built the winning model of the CAGI5 exon skipping prediction challenge. The MMSplice modules are neural networks scoring exon, intron, and splice sites, trained on distinct large-scale genomics datasets. These modules are combined to predict effects of variants on exon skipping, splice site choice, splicing efficiency, and pathogenicity, with matched or higher performance than state-of-the-art. Our models, available in the repository Kipoi, apply to variants including indels directly from VCF files.
Affiliation(s)
- Jun Cheng
- Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748 Germany
- Graduate School of Quantitative Biosciences (QBM), Ludwig-Maximilians-Universität München, München, Germany
| | - Thi Yen Duong Nguyen
- Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748 Germany
| | - Kamil J. Cygan
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island USA
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island USA
| | - Muhammed Hasan Çelik
- Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748 Germany
| | - William G. Fairbrother
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island USA
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island USA
| | - Žiga Avsec
- Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748 Germany
- Graduate School of Quantitative Biosciences (QBM), Ludwig-Maximilians-Universität München, München, Germany
| | - Julien Gagneur
- Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748 Germany
|
82
|
Shirley BC, Mucaki EJ, Rogan PK. Pan-cancer repository of validated natural and cryptic mRNA splicing mutations. F1000Res 2018; 7:1908. [PMID: 31275557 PMCID: PMC6544075 DOI: 10.12688/f1000research.17204.3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Accepted: 08/27/2019] [Indexed: 11/20/2022] Open
Abstract
We present a major public resource of mRNA splicing mutations validated according to multiple lines of evidence of abnormal gene expression. Likely mutations present in all tumor types reported in the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) were identified based on the comparative strengths of splice sites in tumor versus normal genomes, and then validated by respectively comparing counts of splice junction spanning and abundance of transcript reads in RNA-Seq data from matched tissues and tumors lacking these mutations. The comprehensive resource features 341,486 of these validated mutations, the majority of which (69.9%) are not present in the Single Nucleotide Polymorphism Database (dbSNP 150). There are 131,347 unique mutations which weaken or abolish natural splice sites, and 222,071 mutations which strengthen cryptic splice sites (11,932 affect both simultaneously). 28,812 novel or rare flagged variants (with <1% population frequency in dbSNP) were observed in multiple tumor tissue types. An algorithm was developed to classify variants into splicing molecular phenotypes that integrates germline heterozygosity, degree of information change and impact on expression. The classification thresholds were calibrated against the ClinVar clinical database phenotypic assignments. Variants are partitioned into allele-specific alternative splicing, likely aberrant and aberrant splicing phenotypes. Single variants or chromosome ranges can be queried using a Global Alliance for Genomics and Health (GA4GH)-compliant, web-based Beacon "Validated Splicing Mutations" either separately or in aggregate alongside other Beacons through the public Beacon Network, as well as through our website. The website provides additional information, such as a visual representation of supporting RNAseq results, gene expression in the corresponding normal tissues, and splicing molecular phenotypes.
Affiliation(s)
| | - Eliseos J Mucaki
- Biochemistry, University of Western Ontario, London, Ontario, N6A 2C1, Canada
| | - Peter K Rogan
- CytoGnomix Inc., London, Ontario, N5X 3X5, Canada
- Biochemistry, University of Western Ontario, London, Ontario, N6A 2C1, Canada
- Computer Science, University of Western Ontario, London, Ontario, N6A 2C1, Canada
- Oncology, University of Western Ontario, London, Ontario, N6A 2C1, Canada
|