1
Ciesielski TH, Bartlett J, Iyengar SK, Williams SM. Hemizygosity can reveal variant pathogenicity on the X-chromosome. Hum Genet 2023; 142:11-19. [PMID: 35994124 PMCID: PMC9840679 DOI: 10.1007/s00439-022-02478-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/10/2022] [Accepted: 08/10/2022] [Indexed: 01/24/2023]
Abstract
Pathogenic variants on the X-chromosome can have more severe consequences for hemizygous males, while heterozygote females can avoid severe consequences due to diploidy and the capacity for nonrandom expression. Thus, when an allele is more common in females this could indicate that it increases the probability of early death in the male hemizygous state, which can be considered a measure of pathogenicity. Importantly, large-scale genomic data now make it possible to compare allele proportions between the sexes. To discover pathogenic variants on the X-chromosome, we analyzed exome data from 125,748 ancestrally diverse participants in the Genome Aggregation Database (gnomAD). After filtering out duplicates and extremely rare variants, 44,606 of the original 348,221 remained for analysis. We divided the proportion of variant alleles in females by the proportion in males for all variant sites, and then placed each variant into one of three a priori categories: (1) Reference (Primarily synonymous and intronic), (2) Unlikely-to-be-tolerated (Primarily missense), and (3) Least-likely-to-be-tolerated (Primarily frameshift). To assess the impact of ploidy, we compared the distribution of these ratios between pseudoautosomal and non-pseudoautosomal regions. In the non-pseudoautosomal regions, mean female-to-male ratios were lowest among Reference (2.40), greater for Unlikely-to-be-tolerated (2.77) and highest for Least-likely-to-be-tolerated (3.28) variants. Corresponding ratios were lower in the pseudoautosomal regions (1.52, 1.57, and 1.68, respectively), with the most extreme ratio being just below 11. Because pathogenic effects in the pseudoautosomal regions should not drive ratio increases, this maximum ratio provides an upper bound for baseline noise. In the non-pseudoautosomal regions, 319 variants had a ratio over 11. In sum, we identified a measure with a dataset-specific threshold for identifying pathogenicity in non-pseudoautosomal X-chromosome variants: the female-to-male allele proportion ratio.
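The female-to-male allele proportion ratio described in this abstract can be illustrated with a minimal Python sketch. The `VariantCounts` record and its field names are hypothetical and do not reflect gnomAD's actual schema; the default threshold of 11 is taken from the abstract and is dataset-specific. How the authors handled variants with no male carriers is not stated, so the sketch simply treats that ratio as undefined.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class VariantCounts:
    """Hypothetical per-variant allele counts, split by sex."""
    variant_id: str
    ac_female: int  # variant allele count in females
    an_female: int  # total alleles genotyped in females
    ac_male: int    # variant allele count in males
    an_male: int    # total alleles genotyped in males


def female_to_male_ratio(v: VariantCounts) -> Optional[float]:
    """Female allele proportion divided by male allele proportion.

    Returns None when the ratio is undefined (assumption: no genotyped
    alleles in one sex, or no variant alleles observed in males).
    """
    if v.an_female == 0 or v.an_male == 0 or v.ac_male == 0:
        return None
    return (v.ac_female / v.an_female) / (v.ac_male / v.an_male)


def flag_candidates(variants: List[VariantCounts],
                    threshold: float = 11.0) -> List[Tuple[str, float]]:
    """Return (variant_id, ratio) pairs whose ratio exceeds the threshold."""
    flagged = []
    for v in variants:
        ratio = female_to_male_ratio(v)
        if ratio is not None and ratio > threshold:
            flagged.append((v.variant_id, ratio))
    return flagged
```

A variant seen in 24 of 1,000 female alleles and 1 of 500 male alleles has proportions 0.024 and 0.002, so its ratio is about 12 and it would exceed the threshold of 11.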
Affiliation(s)
- Timothy H. Ciesielski
  - The Department of Population and Quantitative Health Sciences at Case Western Reserve University School of Medicine, Cleveland, OH
  - Mary Ann Swetland Center for Environmental Health at Case Western Reserve University School of Medicine, Cleveland, OH
  - Ronin Institute, Montclair, NJ
- Jacquelaine Bartlett
  - The Department of Population and Quantitative Health Sciences at Case Western Reserve University School of Medicine, Cleveland, OH
- Sudha K. Iyengar
  - The Department of Population and Quantitative Health Sciences at Case Western Reserve University School of Medicine, Cleveland, OH
  - The Department of Genetics and Genome Sciences at Case Western Reserve University School of Medicine, Cleveland, OH
  - Cleveland Institute for Computational Biology, Cleveland, OH
- Scott M. Williams
  - The Department of Population and Quantitative Health Sciences at Case Western Reserve University School of Medicine, Cleveland, OH
  - The Department of Genetics and Genome Sciences at Case Western Reserve University School of Medicine, Cleveland, OH
  - Cleveland Institute for Computational Biology, Cleveland, OH
2
Song J, Merrill RA, Usachev AY, Strack S. The X-linked intellectual disability gene product and E3 ubiquitin ligase KLHL15 degrades doublecortin proteins to constrain neuronal dendritogenesis. J Biol Chem 2020; 296:100082. [PMID: 33199366 PMCID: PMC7948412 DOI: 10.1074/jbc.ra120.016210] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/02/2020] [Revised: 10/27/2020] [Accepted: 11/16/2020] [Indexed: 11/13/2022]
Abstract
Proper brain development and function requires finely controlled mechanisms for protein turnover, and disruption of genes involved in proteostasis is a common cause of neurodevelopmental disorders. Kelch-like 15 (KLHL15) is a substrate adaptor for cullin3-containing E3 ubiquitin ligases, and KLHL15 gene mutations were recently described as a cause of severe X-linked intellectual disability. Here, we used a bioinformatics approach to identify a family of neuronal microtubule-associated proteins as KLHL15 substrates, which are themselves critical for early brain development. We biochemically validated doublecortin (DCX), also an X-linked disease protein, and doublecortin-like kinase 1 and 2 as bona fide KLHL15 interactors and mapped KLHL15 interaction regions to their tandem DCX domains. Shared with two previously identified KLHL15 substrates, a FRY tripeptide at the C-terminal edge of the second DCX domain is necessary for KLHL15-mediated ubiquitination of DCX and doublecortin-like kinase 1 and 2 and subsequent proteasomal degradation. Conversely, silencing endogenous KLHL15 markedly stabilizes these DCX domain-containing proteins and prolongs their half-life. Functionally, overexpression of KLHL15 in the presence of WT DCX reduces dendritic complexity of cultured hippocampal neurons, whereas neurons expressing FRY-mutant DCX are resistant to KLHL15. Collectively, our findings highlight the critical importance of the E3 ubiquitin ligase adaptor KLHL15 in proteostasis of neuronal microtubule-associated proteins and identify a regulatory network important for development of the mammalian nervous system.
Affiliation(s)
- Jianing Song
  - Department of Neuroscience and Pharmacology and the Iowa Neuroscience Institute, University of Iowa, Iowa City, Iowa, USA
- Ronald A Merrill
  - Department of Neuroscience and Pharmacology and the Iowa Neuroscience Institute, University of Iowa, Iowa City, Iowa, USA
- Andrew Y Usachev
  - Department of Neuroscience and Pharmacology and the Iowa Neuroscience Institute, University of Iowa, Iowa City, Iowa, USA
- Stefan Strack
  - Department of Neuroscience and Pharmacology and the Iowa Neuroscience Institute, University of Iowa, Iowa City, Iowa, USA
3
Wiel L, Baakman C, Gilissen D, Veltman JA, Vriend G, Gilissen C. MetaDome: Pathogenicity analysis of genetic variants through aggregation of homologous human protein domains. Hum Mutat 2019; 40:1030-1038. [PMID: 31116477 PMCID: PMC6772141 DOI: 10.1002/humu.23798] [Citation(s) in RCA: 130] [Impact Index Per Article: 21.7] [Received: 01/11/2019] [Revised: 04/21/2019] [Accepted: 05/15/2019] [Indexed: 01/19/2023]
Abstract
The growing availability of human genetic variation has given rise to novel methods of measuring genetic tolerance that better interpret variants of unknown significance. We recently developed a concept based on protein domain homology in the human genome to improve variant interpretation. For this purpose, we mapped population variation from the Exome Aggregation Consortium (ExAC) and pathogenic mutations from the Human Gene Mutation Database (HGMD) onto Pfam protein domains. The aggregation of these variation data across homologous domains into meta-domains allowed us to generate amino acid resolution of genetic intolerance profiles for human protein domains. Here, we developed MetaDome, a fast and easy-to-use web server that visualizes meta-domain information and gene-wide profiles of genetic tolerance. We updated the underlying data of MetaDome to contain information from 56,319 human transcripts, 71,419 protein domains, 12,164,292 genetic variants from gnomAD, and 34,076 pathogenic mutations from ClinVar. MetaDome allows researchers to easily investigate their variants of interest for the presence or absence of variation at corresponding positions within homologous domains. We illustrate the added value of MetaDome by an example that highlights how it may help in the interpretation of variants of unknown significance. The MetaDome web server is freely accessible at https://stuart.radboudumc.nl/metadome.
Affiliation(s)
- Laurens Wiel
  - Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
  - Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Coos Baakman
  - Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Daan Gilissen
  - Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
  - Bio‐informatica, HAN University of Applied Sciences, Nijmegen, The Netherlands
- Joris A. Veltman
  - Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
  - Institute of Genetic Medicine, International Centre for Life, Newcastle University, Newcastle upon Tyne, United Kingdom
- Gerrit Vriend
  - Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Christian Gilissen
  - Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
4
Affiliation(s)
- Joseph T. C. Shieh
  - Division of Medical Genetics, Department of Pediatrics, Institute for Human Genetics, University of California San Francisco, UCSF Benioff Children's Hospital, San Francisco, California, United States of America
5
Abstract
BACKGROUND: Genetic data have the potential to impact patient care significantly. In primary care and in the ICU, patients are undergoing genetic testing. Genetics is also transforming cancer care and the diagnosis of undiagnosed diseases. Optimal personalized medicine relies on the understanding of disease penetrance. In this article, I examine the complexity of penetrance. METHODS: I assess how variable penetrance can be seen with many diseases, including those of different modes of inheritance, and how genomic testing is being applied effectively for many diseases. I also identify challenges in the field, including the interpretation of gene variants. RESULTS: Using advancing bioinformatics and detailed phenotypic assessment, we can increase the yield of genomic testing, particularly for highly penetrant conditions. The technologies are useful and applicable to different medical situations. CONCLUSIONS: There are now effective genome diagnostics for many diseases. However, the best personalized application of these data still requires skilled interpretation.
Affiliation(s)
- Joseph T.C. Shieh
  - Division of Medical Genetics, Department of Pediatrics, Institute for Human Genetics, University of California, San Francisco, San Francisco, California
6
Alyousfi D, Baralle D, Collins A. Gene-specific metrics to facilitate identification of disease genes for molecular diagnosis in patient genomes: a systematic review. Brief Funct Genomics 2018; 18:23-29. [DOI: 10.1093/bfgp/ely033] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/27/2018] [Revised: 08/30/2018] [Accepted: 09/20/2018] [Indexed: 11/13/2022]
Affiliation(s)
- Dareen Alyousfi
  - Genetic Epidemiology and Bioinformatics Research Group, Human Development and Health, Faculty of Medicine, University of Southampton, UK
- Diana Baralle
  - Human Development and Health, Faculty of Medicine, University of Southampton, UK
  - Wessex Clinical Genetics Service, Princess Anne Hospital, Southampton, UK
- Andrew Collins
  - Genetic Epidemiology and Bioinformatics Research Group, Human Development and Health, Faculty of Medicine, University of Southampton, UK
7
Penon M, Zahed H, Berger V, Su I, Shieh JT. Using exome sequencing to decipher family history in a healthy individual: Comparison of pathogenic and population MTM1 variants. Mol Genet Genomic Med 2018; 6:722-727. [PMID: 30047259 PMCID: PMC6160706 DOI: 10.1002/mgg3.405] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/02/2017] [Revised: 03/11/2018] [Accepted: 03/13/2018] [Indexed: 12/31/2022]
Abstract
Background: When a family encounters the loss of a child early in life, extensive genetic testing of the affected neonate is sometimes not performed or not possible. However, the increasing availability of genomic sequencing may allow for direct application to families in cases where there is a condition inherited from parental gene(s). When neonatal testing is not possible, it is feasible to perform family testing as long as there is optimal interpretation of the genomic information. Here, we present an example of a healthy adult woman with a history of recurrent male neonatal losses due to severe respiratory distress who presented to Medical Genetics for evaluation. A family history of additional male neonatal loss was present, suggesting a potential inherited genetic etiology. Methods: Although there was no DNA available from the neonates, by performing exome sequencing on the healthy adult woman, we found a missense variant in MTM1 as a potential candidate, which was deemed pathogenic based on multiple criteria including past report. Results: By performing an analysis of all known MTM1‐disease associated mutations and control population variation, we can also better infer the effects of missense variations on MTM1, as not all variants are truncating. MTM1‐X‐linked myotubular myopathy is a condition that leads to male perinatal respiratory failure and a high risk for early mortality. Conclusions: The application of genetic testing in the healthy population here highlights the broader utility of genomic sequencing in evaluating unexplained recurrent neonatal loss, especially when genetic testing is not available on the affected neonates.
Affiliation(s)
- Monica Penon
  - Department of Pediatrics, Division of Medical Genetics, University of California San Francisco, San Francisco, California
- Hengameh Zahed
  - Department of Pediatrics, Division of Medical Genetics, University of California San Francisco, San Francisco, California
- Victoria Berger
  - Department of Pediatrics, Division of Medical Genetics, University of California San Francisco, San Francisco, California
  - Department of Obstetrics, Gynecology and Reproductive Science, University of California San Francisco, San Francisco, California
- Irene Su
  - Department of Pediatrics, Division of Medical Genetics, University of California San Francisco, San Francisco, California
- Joseph T Shieh
  - Department of Pediatrics, Division of Medical Genetics, University of California San Francisco, San Francisco, California
  - Institute for Human Genetics, University of California San Francisco, San Francisco, California
8
O-GlcNAc in cancer: An Oncometabolism-fueled vicious cycle. J Bioenerg Biomembr 2018; 50:155-173. [PMID: 29594839 DOI: 10.1007/s10863-018-9751-2] [Citation(s) in RCA: 105] [Impact Index Per Article: 15.0] [Received: 10/12/2017] [Accepted: 03/15/2018] [Indexed: 12/17/2022]
Abstract
Cancer cells exhibit unregulated growth, altered metabolism, enhanced metastatic potential and altered cell surface glycans. Fueled by oncometabolism and elevated uptake of glucose and glutamine, the hexosamine biosynthetic pathway (HBP) sustains glycosylation in the endomembrane system. In addition, the elevated pools of UDP-GlcNAc drive the O-GlcNAc modification of key targets in the cytoplasm, nucleus and mitochondrion. These targets include transcription factors, kinases, key cytoplasmic enzymes of intermediary metabolism, and electron transport chain complexes. O-GlcNAcylation can thereby alter epigenetics, transcription, signaling, proteostasis, and bioenergetics, key 'hallmarks of cancer'. In this review, we summarize accumulating evidence that many cancer hallmarks are linked to dysregulation of O-GlcNAc cycling on cancer-relevant targets. We argue that onconutrient and oncometabolite-fueled elevation increases HBP flux and triggers O-GlcNAcylation of key regulatory enzymes in glycolysis, the Krebs cycle, the pentose-phosphate pathway, and the HBP itself. The resulting rerouting of glucose metabolites leads to elevated O-GlcNAcylation of oncogenes and tumor suppressors, further escalating the elevation in HBP flux and creating a 'vicious cycle'. Downstream, elevated O-GlcNAcylation alters DNA repair and cellular stress pathways which influence oncogenesis. The elevated steady-state levels of O-GlcNAcylated targets found in many cancers may also provide these cells with a selective advantage for sustained growth, enhanced metastatic potential, and immune evasion in the tumor microenvironment.
9
Alhuzimi E, Leal LG, Sternberg MJE, David A. Properties of human genes guided by their enrichment in rare and common variants. Hum Mutat 2017; 39:365-370. [PMID: 29197136 PMCID: PMC5838408 DOI: 10.1002/humu.23377] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 06/27/2017] [Revised: 11/20/2017] [Accepted: 11/26/2017] [Indexed: 01/01/2023]
Abstract
We analyzed 563,099 common (minor allele frequency, MAF≥0.01) and rare (MAF < 0.01) genetic variants annotated in ExAC and UniProt and 26,884 disease‐causing variants from ClinVar and UniProt occurring in the coding region of 17,975 human protein‐coding genes. Three novel sets of genes were identified: those enriched in rare variants (n = 32 genes), in common variants (n = 282 genes), and in disease‐causing variants (n = 800 genes). Genes enriched in rare variants have far greater similarities in terms of biological and network properties to genes enriched in disease‐causing variants, than to genes enriched in common variants. However, in half of the genes enriched in rare variants (AOC2, MAMDC4, ANKHD1, CDC42BPB, SPAG5, TRRAP, TANC2, IQCH, USP54, SRRM2, DOPEY2, and PITPNM1), no disease‐causing variants have been identified in major, publicly available databases. Thus, genetic variants in these genes are strong candidates for disease and their identification, as part of sequencing studies, should prompt further in vitro analyses.
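The common/rare split used in this abstract (MAF ≥ 0.01 versus MAF < 0.01) can be sketched in Python as below. The `(gene, maf)` input shape and the function names are illustrative assumptions; the abstract does not describe the enrichment statistic itself, so the sketch stops at per-gene counts.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def classify_by_maf(maf: float, threshold: float = 0.01) -> str:
    """Label a variant 'common' (MAF >= threshold) or 'rare' (MAF < threshold)."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    return "common" if maf >= threshold else "rare"


def count_per_gene(variants: Iterable[Tuple[str, float]]) -> Dict[str, Counter]:
    """Tally common and rare variants per gene from (gene, maf) pairs."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for gene, maf in variants:
        counts[gene][classify_by_maf(maf)] += 1
    return counts
```

Note that a variant with MAF exactly 0.01 falls in the common class, matching the "≥" in the abstract's definition.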
Affiliation(s)
- Eman Alhuzimi
  - Structural Bioinformatics Group, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
- Luis G Leal
  - Structural Bioinformatics Group, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
- Michael J E Sternberg
  - Structural Bioinformatics Group, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
- Alessia David
  - Structural Bioinformatics Group, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
10
Akan I, Olivier-Van Stichelen S, Bond MR, Hanover JA. Nutrient-driven O-GlcNAc in proteostasis and neurodegeneration. J Neurochem 2017; 144:7-34. [PMID: 29049853 DOI: 10.1111/jnc.14242] [Citation(s) in RCA: 59] [Impact Index Per Article: 7.4] [Received: 08/08/2017] [Revised: 09/28/2017] [Accepted: 10/11/2017] [Indexed: 12/14/2022]
Abstract
Proteostasis is essential in the mammalian brain where post-mitotic cells must function for decades to maintain synaptic contacts and memory. The brain is dependent on glucose and other metabolites for proper function and is spared from metabolic deficits even during starvation. In this review, we outline how the nutrient-sensitive nucleocytoplasmic post-translational modification O-linked N-acetylglucosamine (O-GlcNAc) regulates protein homeostasis. The O-GlcNAc modification is highly abundant in the mammalian brain and has been linked to proteopathies, including neurodegenerative diseases such as Alzheimer's, Parkinson's, and Huntington's. C. elegans, Drosophila, and mouse models harboring O-GlcNAc transferase- and O-GlcNAcase-knockout alleles have helped define the role O-GlcNAc plays in development as well as age-associated neurodegenerative disease. These enzymes add and remove the single monosaccharide from protein serine and threonine residues, respectively. Blocking O-GlcNAc cycling is detrimental to mammalian brain development and interferes with neurogenesis, neural migration, and proteostasis. Findings in C. elegans and Drosophila model systems indicate that the dynamic turnover of O-GlcNAc is critical for maintaining levels of key transcriptional regulators responsible for neurodevelopmental cell fate decisions. In addition, pathways of autophagy and proteasomal degradation depend on a transcriptional network that is also reliant on O-GlcNAc cycling. Like the quality control system in the endoplasmic reticulum which uses a 'mannose timer' to monitor protein folding, we propose that cytoplasmic proteostasis relies on an 'O-GlcNAc timer' to help regulate the lifetime and fate of nuclear and cytoplasmic proteins. O-GlcNAc-dependent developmental alterations impact metabolism and growth of the developing mouse embryo and persist into adulthood. Brain-selective knockout mouse models will be an important tool for understanding the role of O-GlcNAc in the physiology of the brain and its susceptibility to neurodegenerative injury.
Affiliation(s)
- Ilhan Akan
  - Laboratory of Cell and Molecular Biology, NIDDK, National Institutes of Health, Bethesda, Maryland, USA
- Michelle R Bond
  - Laboratory of Cell and Molecular Biology, NIDDK, National Institutes of Health, Bethesda, Maryland, USA
- John A Hanover
  - Laboratory of Cell and Molecular Biology, NIDDK, National Institutes of Health, Bethesda, Maryland, USA
11
Lelieveld SH, Wiel L, Venselaar H, Pfundt R, Vriend G, Veltman JA, Brunner HG, Vissers LE, Gilissen C. Spatial Clustering of de Novo Missense Mutations Identifies Candidate Neurodevelopmental Disorder-Associated Genes. Am J Hum Genet 2017; 101:478-484. [PMID: 28867141 DOI: 10.1016/j.ajhg.2017.08.004] [Citation(s) in RCA: 78] [Impact Index Per Article: 9.8] [Received: 02/14/2017] [Accepted: 08/04/2017] [Indexed: 10/19/2022]
Abstract
Haploinsufficiency (HI) is the best characterized mechanism through which dominant mutations exert their effect and cause disease. Non-haploinsufficiency (NHI) mechanisms, such as gain-of-function and dominant-negative mechanisms, are often characterized by the spatial clustering of mutations, thereby affecting only particular regions or base pairs of a gene. Variants leading to haploinsufficiency might occasionally cluster as well, for example in critical domains, but such clustering is on the whole less pronounced, with mutations often spread throughout the gene. Here we exploit this property and develop a method to specifically identify genes with significant spatial clustering patterns of de novo mutations in large cohorts. We apply our method to a dataset of 4,061 de novo missense mutations from published exome studies of trios with intellectual disability and developmental disorders (ID/DD) and successfully identify 15 genes with clustering mutations, including 12 genes for which mutations are known to cause neurodevelopmental disorders. For 11 out of these 12, NHI mutation mechanisms have been reported. Additionally, we identify three candidate ID/DD-associated genes of which two have an established role in neuronal processes. We further observe a higher intolerance to normal genetic variation of the identified genes compared to known genes for which mutations lead to HI. Finally, 3D modeling of these mutations on their protein structures shows that 81% of the observed mutations are unlikely to affect the overall structural integrity and that they therefore most likely act through a mechanism other than HI.
12
Wiel L, Venselaar H, Veltman JA, Vriend G, Gilissen C. Aggregation of population-based genetic variation over protein domain homologues and its potential use in genetic diagnostics. Hum Mutat 2017; 38:1454-1463. [PMID: 28815929 PMCID: PMC5656839 DOI: 10.1002/humu.23313] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 05/01/2017] [Revised: 08/03/2017] [Accepted: 08/08/2017] [Indexed: 12/11/2022]
Abstract
Whole exomes of patients with a genetic disorder are nowadays routinely sequenced but interpretation of the identified genetic variants remains a major challenge. The increased availability of population‐based human genetic variation has given rise to measures of genetic tolerance that have been used, for example, to predict disease‐causing genes in neurodevelopmental disorders. Here, we investigated whether combining variant information from homologous protein domains can improve variant interpretation. For this purpose, we developed a framework that maps population variation and known pathogenic mutations onto 2,750 “meta‐domains.” These meta‐domains consist of 30,853 homologous Pfam protein domain instances that cover 36% of all human protein coding sequences. We find that genetic tolerance is consistent across protein domain homologues, and that patterns of genetic tolerance faithfully mimic patterns of evolutionary conservation. Furthermore, for a significant fraction (68%) of the meta‐domains high‐frequency population variation re‐occurs at the same positions across domain homologues more often than expected. In addition, we observe that the presence of pathogenic missense variants at an aligned homologous domain position is often paired with the absence of population variation and vice versa. The use of these meta‐domains can improve the interpretation of genetic variation.
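The meta-domain idea in this abstract, pooling variation from homologous domain instances at aligned positions, can be illustrated with a small Python sketch. The observation tuples `(domain_instance, aligned_position, kind)` and both function names are hypothetical; this is not the authors' framework, only one way to show the aggregation and the "pathogenic but no population variation" pattern the abstract mentions.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# (domain instance id, aligned meta-domain position, "population" or "pathogenic")
Observation = Tuple[str, int, str]


def aggregate_meta_domain(observations: Iterable[Observation]) -> Tuple[Counter, Counter]:
    """Pool variant observations from homologous domain instances by aligned position."""
    population: Counter = Counter()
    pathogenic: Counter = Counter()
    for _instance, position, kind in observations:
        if kind == "population":
            population[position] += 1
        elif kind == "pathogenic":
            pathogenic[position] += 1
        else:
            raise ValueError(f"unknown variant kind: {kind!r}")
    return population, pathogenic


def pathogenic_only_positions(population: Counter, pathogenic: Counter) -> List[int]:
    """Aligned positions with pathogenic variants but no population variation."""
    return sorted(pos for pos in pathogenic if population[pos] == 0)
```

Pooling across instances is what lets a position with sparse data in any single gene accumulate enough observations to be informative.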
Affiliation(s)
- Laurens Wiel
  - Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
  - Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
- Hanka Venselaar
  - Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
- Joris A Veltman
  - Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
  - Institute of Genetic Medicine, International Centre for Life, Newcastle University, Newcastle upon Tyne, NE1 3BZ, United Kingdom
- Gert Vriend
  - Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
- Christian Gilissen
  - Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
13
Ge X, Gong H, Dumas K, Litwin J, Phillips JJ, Waisfisz Q, Weiss MM, Hendriks Y, Stuurman KE, Nelson SF, Grody WW, Lee H, Kwok PY, Shieh JT. Missense-depleted regions in population exomes implicate ras superfamily nucleotide-binding protein alteration in patients with brain malformation. NPJ Genom Med 2016; 1. [PMID: 28868155 PMCID: PMC5576364 DOI: 10.1038/npjgenmed.2016.36] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Indexed: 12/12/2022]
Abstract
Genomic sequence interpretation can miss clinically relevant missense variants for several reasons. Rare missense variants are numerous in the exome and difficult to prioritise. Affected genes may also not have existing disease association. To improve variant prioritisation, we leverage population exome data to identify intragenic missense-depleted regions (MDRs) genome-wide that may be important in disease. We then use missense depletion analyses to help prioritise undiagnosed disease exome variants. We demonstrate application of this strategy to identify a novel gene association for human brain malformation. We identified de novo missense variants that affect the GDP/GTP-binding site of ARF1 in three unrelated patients. Corresponding functional analysis suggests ARF1 GDP/GTP-activation is affected by the specific missense mutations associated with heterotopia. These findings expand the genetic pathway underpinning neurologic disease that classically includes FLNA. ARF1 along with ARFGEF2 add further evidence implicating ARF/GEFs in the brain. Using functional ontology, top MDR-containing genes were highly enriched for nucleotide-binding function, suggesting these may be candidates for human disease. Routine consideration of MDR in the interpretation of exome data for rare diseases may help identify strong genetic factors for many severe conditions, infertility/reduction in reproductive capability, and embryonic conditions contributing to preterm loss.
Affiliation(s)
- Xiaoyan Ge
  - Department of Pediatrics, Division of Medical Genetics, University of California San Francisco, San Francisco, CA, USA
  - Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Henry Gong
  - Department of Pediatrics, Division of Medical Genetics, University of California San Francisco, San Francisco, CA, USA
  - Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Kevin Dumas
  - Department of Pediatrics, Division of Medical Genetics, University of California San Francisco, San Francisco, CA, USA
  - Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Jessica Litwin
  - Department of Neurology, University of California San Francisco, San Francisco, CA, USA
  - Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
- Joanna J Phillips
  - Department of Neurologic Surgery, University of California San Francisco, San Francisco, CA, USA
  - Department of Pathology, University of California San Francisco, San Francisco, CA, USA
- Quinten Waisfisz
  - Department of Clinical Genetics, VU University Medical Center, Amsterdam, The Netherlands
- Marjan M Weiss
  - Department of Clinical Genetics, VU University Medical Center, Amsterdam, The Netherlands
- Yvonne Hendriks
  - Department of Clinical Genetics, VU University Medical Center, Amsterdam, The Netherlands
- Kyra E Stuurman
  - Department of Clinical Genetics, VU University Medical Center, Amsterdam, The Netherlands
- Stanley F Nelson
  - Departments of Pathology and Laboratory Medicine, Pediatrics, and Human Genetics, Divisions of Medical Genetics and Molecular Diagnostics, University of California Los Angeles, Los Angeles, CA, USA
- Wayne W Grody
  - Department of Pathology and Laboratory Medicine and Department of Human Genetics, University of California Los Angeles, Los Angeles, CA, USA
- Hane Lee
  - Department of Pathology and Laboratory Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Pui-Yan Kwok
  - Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
  - Department of Dermatology, University of California San Francisco, San Francisco, CA, USA
  - Cardiovascular Research Institute, University of California San Francisco, San Francisco, CA, USA
- Joseph TC Shieh
  - Department of Pediatrics, Division of Medical Genetics, University of California San Francisco, San Francisco, CA, USA
  - Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA