1
Cacheiro P, Lawson S, Van den Veyver IB, Marengo G, Zocche D, Murray SA, Duyzend M, Robinson PN, Smedley D. Lethal phenotypes in Mendelian disorders. Genet Med 2024; 26:101141. [PMID: 38629401 DOI: 10.1016/j.gim.2024.101141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 04/08/2024] [Accepted: 04/09/2024] [Indexed: 04/26/2024] Open
Abstract
PURPOSE: Existing resources that characterize the essentiality status of genes are based on either proliferation assessment in human cell lines, viability evaluation in mouse knockouts, or constraint metrics derived from human population sequencing studies. Several repositories document phenotypic annotations for rare disorders; however, there is a lack of comprehensive reporting on lethal phenotypes.
METHODS: We queried Online Mendelian Inheritance in Man for terms related to lethality and classified all Mendelian genes according to the earliest age of death recorded for the associated disorders, from prenatal death to no reports of premature death. We characterized the genes across these lethality categories, examined the evidence on viability from mouse models and explored how this information could be used for novel gene discovery.
RESULTS: We developed the Lethal Phenotypes Portal to showcase this curated catalog of human essential genes. Differences in the mode of inheritance, physiological systems affected, and disease class were found for genes in different lethality categories, as well as discrepancies between the lethal phenotypes observed in mouse and human.
CONCLUSION: We anticipate that this resource will aid clinicians in the diagnosis of early lethal conditions and assist researchers in investigating the properties that make these genes essential for human development.
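The classification step in the abstract above (assigning each gene the earliest age of death recorded across its associated disorders) can be illustrated with a minimal Python sketch. The category labels, their ordering, and the example genes are hypothetical placeholders, not the portal's actual vocabulary or data.

```python
# Hypothetical lethality categories, ordered from earliest to latest.
# These labels are illustrative assumptions, not the portal's own terms.
CATEGORIES = [
    "prenatal",
    "neonatal",
    "infant",
    "childhood",
    "adolescent_or_later",
    "no_premature_death",
]
RANK = {name: position for position, name in enumerate(CATEGORIES)}


def classify_gene(disorder_categories):
    """Return the earliest lethality category among a gene's disorders."""
    if not disorder_categories:
        # No lethality report at all is treated as the least severe category.
        return "no_premature_death"
    return min(disorder_categories, key=RANK.__getitem__)


def classify_all(gene_to_disorders):
    """Apply classify_gene to every gene in a mapping of gene -> categories."""
    return {gene: classify_gene(cats) for gene, cats in gene_to_disorders.items()}


# Made-up example input: one list of per-disorder categories per gene.
example = {
    "GENE_A": ["childhood", "prenatal"],
    "GENE_B": ["no_premature_death"],
    "GENE_C": [],
}
result = classify_all(example)
```

Taking the minimum over an explicit rank table keeps the ordering of categories in one place, so relabelling or adding a category only touches the `CATEGORIES` list.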
Affiliation(s)
- Pilar Cacheiro
  - William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
- Samantha Lawson
  - ITS Research, Queen Mary University of London, London, United Kingdom
- Ignatia B Van den Veyver
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX; Department of Obstetrics and Gynecology, Baylor College of Medicine, Houston, TX
- Gabriel Marengo
  - William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
- David Zocche
  - North West Thames Regional Genetics Service, Northwick Park and St Mark's Hospitals, London, United Kingdom
- Michael Duyzend
  - Massachusetts General Hospital, Boston, MA; Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA; Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Peter N Robinson
  - Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany
- Damian Smedley
  - William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
2
Cacheiro P, Lawson S, Van den Veyver IB, Marengo G, Zocche D, Murray SA, Duyzend M, Robinson PN, Smedley D. Lethal phenotypes in Mendelian disorders. medRxiv 2024:2024.01.12.24301168. [PMID: 38260283 PMCID: PMC10802756 DOI: 10.1101/2024.01.12.24301168] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/24/2024]
Abstract
Essential genes are those whose function is required for cell proliferation and/or organism survival. A gene's intolerance to loss-of-function can be allocated within a spectrum, as opposed to being considered a binary feature, since this function might be essential at different stages of development, genetic backgrounds or other contexts. Existing resources that collect and characterise the essentiality status of genes are based on either proliferation assessment in human cell lines, embryonic and postnatal viability evaluation in different model organisms, and gene metrics such as intolerance to variation scores derived from human population sequencing studies. There are also several repositories available that document phenotypic annotations for rare disorders in humans such as the Online Mendelian Inheritance in Man (OMIM) and the Human Phenotype Ontology (HPO) knowledgebases. This raises the prospect of being able to use clinical data, including lethality as the most severe phenotypic manifestation, to further our characterisation of gene essentiality. Here we queried OMIM for terms related to lethality and classified all Mendelian genes into categories, according to the earliest age of death recorded for the associated disorders, from prenatal death to no reports of premature death. To showcase this curated catalogue of human essential genes, we developed the Lethal Phenotypes Portal (https://lethalphenotypes.research.its.qmul.ac.uk), where we also explore the relationships between these lethality categories, constraint metrics and viability in cell lines and mouse. Further analysis of the genes in these categories reveals differences in the mode of inheritance of the associated disorders, physiological systems affected and disease class. We highlight how the phenotypic similarity between genes in the same lethality category combined with gene family/group information can be used for novel disease gene discovery. 
Finally, we explore the overlaps and discrepancies between the lethal phenotypes observed in mouse and human and discuss potential explanations that include differences in transcriptional regulation, functional compensation and molecular disease mechanisms. We anticipate that this resource will aid clinicians in the diagnosis of early lethal conditions and assist researchers in investigating the properties that make these genes essential for human development.
Affiliation(s)
- Pilar Cacheiro
  - William Harvey Research Institute, Queen Mary University of London, London, UK
- Ignatia B. Van den Veyver
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
  - Department of Obstetrics and Gynecology, Baylor College of Medicine, Houston, TX, USA
- Gabriel Marengo
  - William Harvey Research Institute, Queen Mary University of London, London, UK
- David Zocche
  - North West Thames Regional Genetics Service, Northwick Park & St Mark’s Hospitals, London, UK
- Peter N. Robinson
  - Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany
- Damian Smedley
  - William Harvey Research Institute, Queen Mary University of London, London, UK
3
Jabalameli M, Lin JR, Zhang Q, Wang Z, Mitra J, Nguyen N, Gao T, Khusidman M, Atzmon G, Milman S, Vijg J, Barzilai N, Zhang ZD. Polygenic prediction of human longevity on the supposition of pervasive pleiotropy. medRxiv 2023:2023.12.10.23299795. [PMID: 38168353 PMCID: PMC10760260 DOI: 10.1101/2023.12.10.23299795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
The highly polygenic nature of human longevity renders cross-trait pleiotropy an indispensable feature of its genetic architecture. Leveraging the genetic correlation between the aging-related traits (ARTs), we sought to model the additive variance in lifespan as a function of cumulative liability from pleiotropic segregating variants. We tracked allele frequency changes as a function of viability across different age bins and prioritized 34 variants with an immediate implication on lipid metabolism, body mass index (BMI), and cognitive performance, among other traits, revealed by PheWAS analysis in the UK Biobank. Given the highly complex and non-linear interactions between the genetic determinants of longevity, we reasoned that a composite polygenic score would approximate a substantial portion of the variance in lifespan and developed the integrated longevity genetic scores (iLGSs) for distinguishing exceptional survival. We showed that coefficients derived from our ensemble model could potentially reveal an interesting pattern of genomic pleiotropy specific to lifespan. We assessed the predictive performance of our model for distinguishing the enrichment of exceptional longevity among long-lived individuals in two replication cohorts and showed that the median lifespan in the highest decile of our composite prognostic index is up to 4.8 years longer. Finally, using the proteomic correlates of iLGS, we identified protein markers associated with exceptional longevity irrespective of chronological age and prioritized drugs with repurposing potentials for gerotherapeutics. Together, our approach demonstrates a promising framework for polygenic modeling of additive liability conferred by ARTs in defining exceptional longevity and assisting the identification of individuals at higher risk of mortality for targeted lifestyle modifications earlier in life.
Furthermore, the proteomic signature associated with iLGS highlights the functional pathway upstream of the PI3K-Akt that can be effectively targeted to slow down aging and extend lifespan.
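The idea of a composite polygenic score can be sketched in a few lines of Python. This is a generic illustration of weighted-sum polygenic scoring and top-decile selection, not the authors' iLGS ensemble model; every weight, trait name, and dosage below is a made-up assumption.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def composite_score(trait_scores, trait_weights):
    """Combine per-trait scores into one index using one weight per trait."""
    return sum(trait_weights[trait] * score for trait, score in trait_scores.items())


def top_decile(scores_by_person):
    """Return the identifiers of the highest-scoring 10% of individuals."""
    ranked = sorted(scores_by_person, key=scores_by_person.get, reverse=True)
    n_top = max(1, len(ranked) // 10)
    return ranked[:n_top]


# Hypothetical per-trait scores for one person (all numbers are invented).
lipid = polygenic_score([0, 1, 2], [0.10, 0.20, -0.05])
bmi = polygenic_score([2, 0, 1], [0.30, 0.10, 0.05])
index = composite_score({"lipid": lipid, "bmi": bmi}, {"lipid": 0.5, "bmi": -0.5})

# Hypothetical cohort of 20 people with scores 0..19.
cohort = {f"p{i}": float(i) for i in range(20)}
best = top_decile(cohort)
```

A comparison such as the one reported in the abstract (median lifespan in the highest decile versus the rest) would then be computed on the group returned by `top_decile`.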
Affiliation(s)
- M.Reza Jabalameli
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Jhih-Rong Lin
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Quanwei Zhang
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Zhen Wang
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Joydeep Mitra
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Nha Nguyen
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Tina Gao
  - Department of Medicine, Albert Einstein College of Medicine, New York, NY, USA
- Mark Khusidman
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Gil Atzmon
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Sofiya Milman
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
  - Department of Medicine, Albert Einstein College of Medicine, New York, NY, USA
- Jan Vijg
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Nir Barzilai
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
  - Department of Medicine, Albert Einstein College of Medicine, New York, NY, USA
- Zhengdong D. Zhang
  - Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
4
Gu X. A Simple Evolutionary Model of Genetic Robustness After Gene Duplication. J Mol Evol 2022; 90:352-361. [PMID: 35913597 DOI: 10.1007/s00239-022-10065-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2022] [Accepted: 06/23/2022] [Indexed: 10/16/2022]
Abstract
When a dispensable gene is duplicated (ancestral dispensability, denoted O+), genetic buffering and duplicate compensation together maintain duplicate redundancy, whereas duplicate compensation is the only mechanism when an essential gene is duplicated (ancestral essentiality, denoted O-). To investigate these evolutionary scenarios of genetic robustness, I formulated a simple mixture model for analyzing duplicate pairs in one of the following states: double dispensable (DD), semi-dispensable (one dispensable, one essential; DE), or double essential (EE). This model was applied to yeast duplicate pairs from a whole-genome duplication (WGD) that occurred about 100 million years ago (mya), and to mouse duplicate pairs from a WGD that occurred more than 500 mya. Both case studies revealed that the proportion of essentiality for duplicates with ancestral essentiality [PE(O-)] was much higher than that for duplicates with ancestral dispensability [PE(O+)]. While it was negligible in the yeast duplicate pairs, PE(O+) (about 20%) was shown to be statistically significant in the mouse duplicate pairs. Together, these findings support the hypothesis that both sub-functionalization and neo-functionalization may play some role after gene duplication, though the former may be much faster than the latter.
Affiliation(s)
- Xun Gu
  - The Laurence H. Baker Center in Bioinformatics on Biological Statistics, Department of Genetics, Development and Cell Biology, Program of Ecological and Evolutionary Biology, Iowa State University, Ames, IA, 50011, USA.
5
Seaby EG, Ennis S. Challenges in the diagnosis and discovery of rare genetic disorders using contemporary sequencing technologies. Brief Funct Genomics 2021; 19:243-258. [PMID: 32393978 DOI: 10.1093/bfgp/elaa009] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 12/13/2022] Open
Abstract
Next generation sequencing (NGS) has revolutionised rare disease diagnostics. Concomitant with advancing technologies has been a rise in the number of new gene disorders discovered and diagnoses made for patients and their families. However, despite the trend towards whole exome and whole genome sequencing, diagnostic rates remain suboptimal. On average, only ~30% of patients receive a molecular diagnosis. National sequencing projects launched in the last 5 years are integrating clinical diagnostic testing with research avenues to widen the spectrum of known genetic disorders. Consequently, efforts to diagnose genetic disorders in a clinical setting are now often shared with efforts to prioritise candidate variants for the detection of new disease genes. Herein we discuss some of the biggest obstacles precluding molecular diagnosis and discovery of new gene disorders. We consider bioinformatic and analytical challenges faced when interpreting next generation sequencing data and showcase some of the newest tools available to mitigate these issues. We consider how incomplete penetrance, non-coding variation and structural variants are likely to impact diagnostic rates, and we further discuss methods for uplifting novel gene discovery by adopting a gene-to-patient-based approach.
6
Alyousfi D, Baralle D, Collins A. Essentiality-specific pathogenicity prioritization gene score to improve filtering of disease sequence data. Brief Bioinform 2020; 22:1782-1789. [PMID: 32186701 DOI: 10.1093/bib/bbaa029] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/07/2019] [Revised: 02/17/2020] [Accepted: 02/18/2020] [Indexed: 11/12/2022] Open
Abstract
The causal genetic variants underlying more than 50% of single gene (monogenic) disorders are yet to be discovered. Many patients with conditions likely to have a monogenic basis do not receive a confirmed molecular diagnosis which has potential impacts on clinical management. We have developed a gene-specific score, essentiality-specific pathogenicity prioritization (ESPP), to guide the recognition of genes likely to underlie monogenic disease variation to assist in filtering of genome sequence data. When a patient genome is sequenced, there are frequently several plausibly pathogenic variants identified in different genes. Recognition of the single gene most likely to include pathogenic variation can guide the identification of a causal variant. The ESPP score integrates gene-level scores which are broadly related to gene essentiality. Previous work towards the recognition of monogenic disease genes proposed a model with increasing gene essentiality from 'non-essential' to 'essential' genes (for which pathogenic variation may be incompatible with survival) with genes liable to contain disease variation positioned between these two extremes. We demonstrate that the ESPP score is useful for recognizing genes with high potential for pathogenic disease-related variation. Genes classed as essential have particularly high scores, as do genes recently recognized as strong candidates for developmental disorders. Through the integration of individual gene-specific scores, which have different properties and assumptions, we demonstrate the utility of an essentiality-based gene score to improve sequence genome filtering.
7
Cacheiro P, Muñoz-Fuentes V, Murray SA, Dickinson ME, Bucan M, Nutter LMJ, Peterson KA, Haselimashhadi H, Flenniken AM, Morgan H, Westerberg H, Konopka T, Hsu CW, Christiansen A, Lanza DG, Beaudet AL, Heaney JD, Fuchs H, Gailus-Durner V, Sorg T, Prochazka J, Novosadova V, Lelliott CJ, Wardle-Jones H, Wells S, Teboul L, Cater H, Stewart M, Hough T, Wurst W, Sedlacek R, Adams DJ, Seavitt JR, Tocchini-Valentini G, Mammano F, Braun RE, McKerlie C, Herault Y, de Angelis MH, Mallon AM, Lloyd KCK, Brown SDM, Parkinson H, Meehan TF, Smedley D. Human and mouse essentiality screens as a resource for disease gene discovery. Nat Commun 2020; 11:655. [PMID: 32005800 PMCID: PMC6994715 DOI: 10.1038/s41467-020-14284-2] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 05/23/2019] [Accepted: 12/12/2019] [Indexed: 12/31/2022] Open
Abstract
The identification of causal variants in sequencing studies remains a considerable challenge that can be partially addressed by new gene-specific knowledge. Here, we integrate measures of how essential a gene is to supporting life, as inferred from viability and phenotyping screens performed on knockout mice by the International Mouse Phenotyping Consortium and essentiality screens carried out on human cell lines. We propose a cross-species gene classification across the Full Spectrum of Intolerance to Loss-of-function (FUSIL) and demonstrate that genes in five mutually exclusive FUSIL categories have differing biological properties. Most notably, Mendelian disease genes, particularly those associated with developmental disorders, are highly overrepresented among genes non-essential for cell survival but required for organism development. After screening developmental disorder cases from three independent disease sequencing consortia, we identify potentially pathogenic variants in genes not previously associated with rare diseases. We therefore propose FUSIL as an efficient approach for disease gene discovery.
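As a rough illustration of how cell-line essentiality and mouse viability might be combined into mutually exclusive categories, the Python sketch below assigns one label per gene. The five category names and the decision rules are illustrative assumptions made for this example; the abstract states only that there are five mutually exclusive FUSIL categories and that one group is non-essential for cell survival but required for organism development.

```python
def fusil_category(cell_essential, mouse_viability, has_phenotype=False):
    """Assign a gene to one of five mutually exclusive bins (assumed labels).

    cell_essential  -- True if the gene is essential in human cell-line screens
    mouse_viability -- "lethal", "subviable" or "viable" for the knockout line
    has_phenotype   -- True if a viable knockout shows a significant phenotype
    """
    if mouse_viability == "lethal":
        # Lethal in mouse: split by whether cells also need the gene.
        return "cellular lethal" if cell_essential else "developmental lethal"
    if mouse_viability == "subviable":
        return "subviable"
    if mouse_viability == "viable":
        return "viable with phenotype" if has_phenotype else "viable no phenotype"
    raise ValueError(f"unknown mouse viability: {mouse_viability!r}")


# Hypothetical genes: (cell_essential, mouse_viability, has_phenotype).
genes = {
    "GENE_A": (True, "lethal", False),
    "GENE_B": (False, "lethal", False),
    "GENE_C": (False, "viable", True),
}
bins = {name: fusil_category(*args) for name, args in genes.items()}
```

Because each gene falls through exactly one branch, the bins are mutually exclusive by construction.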
Grants
- UM1 HG008900 NHGRI NIH HHS
- UM1 HG006504 NHGRI NIH HHS
- MC_UP_1502/1 Medical Research Council
- UM1 HG006542 NHGRI NIH HHS
- UM1 OD023221 NIH HHS
- MC_U142684171 Medical Research Council
- MR/S006753/1 Medical Research Council
- UM1 HG006370 NHGRI NIH HHS
- UM1 HG006493 NHGRI NIH HHS
- U54 HG006370 NHGRI NIH HHS
- U54 HG006364 NHGRI NIH HHS
- MC_U142684172 Medical Research Council
- UM1 HG006348 NHGRI NIH HHS
- U42 OD011174 NIH HHS
- U42 OD011175 NIH HHS
- Wellcome Trust
- This work was supported by NIH grant U54 HG006370. IMPC-related mouse production and phenotyping was funded by the Government of Canada through Genome Canada and Ontario Genomics (OGI-051) for NorCOMM2 (C.M.) and the National Institutes of Health and OD, NCRR, NIDDK and NHLBI for KOMP and KOMP2 Projects U42 OD011175 and UM1OD023221 (C.M., K.C.K.L), Infrafrontier grant 01KX1012, EU Horizon2020: IPAD-MD funding 653961 (M.H.d.A); EUCOMM: LSHM-CT-2005-018931, EUCOMMTOOLS: FP7-HEALTH-F4-2010-261492 (W.G.W). UM1 HG006348; U42 OD011174; U54 HG005348 (A.L.B), NIH U54706HG006364 (A.L.B). Wellcome Trust grants WT098051 and WT206194 (D.A). The French National Centre for Scientific Research (CNRS), the French National Institute of Health and Medical Research (INSERM), the University of Strasbourg and the “Centre Europeen de Recherche en Biomedecine”, and the French state funds through the “Agence Nationale de la Recherche” under the frame programme Investissements d’Avenir labelled (ANR-10-IDEX-0002-02, ANR-10-LABX-0030-INRT, ANR-10-INBS-07 PHENOMIN (J.H.). This research was made possible through access to the data and findings generated by the 100,000 Genomes Project. The 100,000 Genomes Project is managed by Genomics England Limited (a wholly owned company of the Department of Health). The 100,000 Genomes Project is funded by the National Institute for Health Research and NHS England. The Wellcome Trust, Cancer Research UK and the Medical Research Council have also funded research infrastructure. The 100,000 Genomes Project uses data provided by patients and collected by the National Health Service as part of their care and support. We are also grateful for the data access provided by the DDD and CMG projects. The DDD study presents independent research commissioned by the Health Innovation Challenge Fund [grant number HICF-1009-003], a parallel funding partnership between Wellcome and the Department of Health, and the Wellcome Sanger Institute [grant number WT098051]. 
The views expressed in this publication are those of the author(s) and not necessarily those of Wellcome or the Department of Health. The study has UK Research Ethics Committee approval (10/H0305/83, granted by the Cambridge South REC, and GEN/284/12 granted by the Republic of Ireland REC). The research team acknowledges the support of the National Institute for Health Research, through the Comprehensive Clinical Research Network. The Centers for Mendelian Genomics are funded by the National Human Genome Research Institute, the National Heart, Lung, and Blood Institute, and the National Eye Institute. Broad Institute (UM1 HG008900), Johns Hopkins University School of Medicine/Baylor College of Medicine (UM1 HG006542), University of Washington (UM1 HG006493), Yale University (UM1 HG006504).
Affiliation(s)
- Pilar Cacheiro
  - Clinical Pharmacology, William Harvey Research Institute, School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK
- Violeta Muñoz-Fuentes
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Mary E Dickinson
  - Departments of Molecular Physiology and Biophysics, Baylor College of Medicine, Houston, TX, 77030, USA
  - Departments of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Maja Bucan
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Lauryl M J Nutter
  - The Centre for Phenogenomics, The Hospital for Sick Children, Toronto, ON, M5T 3H7, Canada
- Hamed Haselimashhadi
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Ann M Flenniken
  - The Centre for Phenogenomics, Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, M5T 3H7, Canada
- Hugh Morgan
  - Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Henrik Westerberg
  - Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Tomasz Konopka
  - Clinical Pharmacology, William Harvey Research Institute, School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK
- Chih-Wei Hsu
  - Departments of Molecular Physiology and Biophysics, Baylor College of Medicine, Houston, TX, 77030, USA
- Audrey Christiansen
  - Departments of Molecular Physiology and Biophysics, Baylor College of Medicine, Houston, TX, 77030, USA
- Denise G Lanza
  - Departments of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Arthur L Beaudet
  - Departments of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jason D Heaney
  - Departments of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Helmut Fuchs
  - German Mouse Clinic, Institute of Experimental Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, 85764, Neuherberg, Germany
- Valerie Gailus-Durner
  - German Mouse Clinic, Institute of Experimental Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, 85764, Neuherberg, Germany
- Tania Sorg
  - Université de Strasbourg, CNRS, INSERM, Institut Clinique de la Souris, PHENOMIN-ICS, 67404, Illkirch, France
- Jan Prochazka
  - Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Vestec, 252 50, Prague, Czech Republic
- Vendula Novosadova
  - Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Vestec, 252 50, Prague, Czech Republic
- Sara Wells
  - Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Lydia Teboul
  - Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Heather Cater
  - Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Michelle Stewart
  - Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Tertius Hough
  - Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Wolfgang Wurst
  - Institute of Developmental Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, 85764, Neuherberg, Germany
  - Department of Developmental Genetics, Center of Life and Food Sciences Weihenstephan, Technische Universität München, 85764, Neuherberg, Germany
  - Deutsches Institut für Neurodegenerative Erkrankungen (DZNE) Site Munich, Munich Cluster for Systems Neurology (SyNergy), Adolf-Butenandt-Institut, Ludwig-Maximilians-Universität München, 80336, Munich, Germany
- Radislav Sedlacek
  - Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Vestec, 252 50, Prague, Czech Republic
- David J Adams
  - Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
- John R Seavitt
  - Departments of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Glauco Tocchini-Valentini
  - Monterotondo Mouse Clinic, Italian National Research Council (CNR), Institute of Cell Biology and Neurobiology, 00015, Monterotondo Scalo, Italy
- Fabio Mammano
  - Monterotondo Mouse Clinic, Italian National Research Council (CNR), Institute of Cell Biology and Neurobiology, 00015, Monterotondo Scalo, Italy
- Colin McKerlie
  - The Centre for Phenogenomics, The Hospital for Sick Children, Toronto, ON, M5T 3H7, Canada
  - Translational Medicine, The Hospital for Sick Children, Toronto, ON, M5T 3H7, Canada
- Yann Herault
  - Université de Strasbourg, CNRS, INSERM, Institut de Génétique, Biologie Moléculaire et Cellulaire, Institut Clinique de la Souris, IGBMC, PHENOMIN-ICS, 67404, Illkirch, France
- Martin Hrabě de Angelis
  - German Mouse Clinic, Institute of Experimental Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, 85764, Neuherberg, Germany
  - Department of Experimental Genetics, Center of Life and Food Sciences Weihenstephan, Technische Universität München, 85354, Freising-Weihenstephan, Germany
  - German Center for Diabetes Research (DZD), 85764, Neuherberg, Germany
- Ann-Marie Mallon
  - Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- K C Kent Lloyd
  - Mouse Biology Program, University of California, Davis, CA, 95618, USA
- Steve D M Brown
  - Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Helen Parkinson
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Terrence F Meehan
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Damian Smedley
  - Clinical Pharmacology, William Harvey Research Institute, School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK.
8
Gene-dense autosomal chromosomes show evidence for increased selection. Heredity (Edinb) 2019; 123:774-783. [PMID: 31576017 DOI: 10.1038/s41437-019-0272-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/30/2019] [Accepted: 09/16/2019] [Indexed: 12/20/2022] Open
Abstract
Purifying selection tends to reduce nucleotide and haplotype diversity leading to increased linkage disequilibrium. However, detection of evidence for selection is difficult as the signature is confounded by wide variation in the recombination rate which has a complex relationship with selection. The effective bottleneck time (the ratio of the linkage disequilibrium map to the genetic map in Morgans) controls for variability in the recombination rate. Reduced effective bottleneck times indicate stronger residual linkage disequilibrium, consistent with increased selection. Using whole genome sequence data from one European and three Sub-Saharan African human populations we find, in the African samples, strong correlations between high gene densities and reduced effective bottleneck time for autosomal chromosomes. This suggests that gene-dense autosomes have been subject to increased purifying selection reducing effective bottleneck times compared to gene-poor autosomes. Although previous studies have shown unusually strong linkage disequilibrium for the sex chromosomes variation within the autosomes has not been recognised. The strongest relationship is between effective bottleneck time and the density of essential genes, which are likely targets of greater selective pressure (p = 0.006, for the 22 autosomes). The magnitude of the reduction in chromosome-specific effective bottleneck times from the least to the most gene-dense autosomes is ~17-21% for Sub-Saharan African populations. The effect size is greater in Sub-Saharan African populations, compared to a European sample, consistent with increased efficiency of selection in populations with larger effective population sizes which have not been subject to intense population bottlenecks as experienced by populations of European ancestry. The findings highlight the value of deeper analyses of selection within Sub-Saharan African populations.
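The effective bottleneck time is defined in the abstract as the ratio of the linkage disequilibrium map to the genetic map in Morgans. A minimal Python sketch of that ratio, and of the percentage reduction between two chromosomes, follows; the map lengths are invented numbers chosen only to make the arithmetic easy to follow, and the function names are hypothetical.

```python
def effective_bottleneck_time(ld_map_length, genetic_map_morgans):
    """Ratio of LD map length to genetic map length (in Morgans)."""
    if genetic_map_morgans <= 0:
        raise ValueError("genetic map length must be positive")
    return ld_map_length / genetic_map_morgans


def percent_reduction(reference, value):
    """Percentage by which `value` is lower than `reference`."""
    return 100.0 * (reference - value) / reference


# Invented map lengths for a gene-poor and a gene-dense autosome.
gene_poor = effective_bottleneck_time(1000.0, 2.0)
gene_dense = effective_bottleneck_time(600.0, 1.5)
reduction = percent_reduction(gene_poor, gene_dense)
```

Dividing by the genetic map length is what controls for recombination rate: two chromosomes with the same LD map length but different recombination totals end up with different values.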
9
GenePy - a score for estimating gene pathogenicity in individuals using next-generation sequencing data. BMC Bioinformatics 2019; 20:254. [PMID: 31096927 PMCID: PMC6524327 DOI: 10.1186/s12859-019-2877-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 09/14/2018] [Accepted: 05/06/2019] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Next-generation sequencing is revolutionising diagnosis and treatment of rare diseases; however, its application to understanding common disease aetiology is limited. Rare disease applications binarily attribute genetic change(s) at a single locus to a specific phenotype. In common diseases, where multiple genetic variants within and across genes contribute to disease, binary modelling cannot capture the burden of pathogenicity harboured by an individual across a given gene/pathway. We present GenePy, a novel gene-level scoring system for integration and analysis of next-generation sequencing data on a per-individual basis that transforms NGS data interpretation from variant-level to gene-level. This simple and flexible scoring system is intuitive and amenable to integration for machine learning, network and topological approaches, facilitating the investigation of complex phenotypes. RESULTS Whole-exome sequencing data from 508 individuals were used to generate GenePy scores. For each variant, a score is calculated incorporating: i) population allele frequency estimates; ii) individual zygosity, determined through standard variant calling pipelines; and iii) any user-defined deleteriousness metric to inform on functional impact. GenePy then combines the scores generated for all variants observed into a single gene score for each individual. We generated a matrix of ~14,000 GenePy scores for all individuals for each of sixteen popular deleteriousness metrics. All per-gene scores are corrected for gene length. The majority of genes generate GenePy scores < 0.01, although individuals harbouring multiple rare, highly deleterious mutations can accumulate extremely high GenePy scores. In the absence of a comparator metric, we examine GenePy performance in discriminating genes known to be associated with three common, complex diseases.
A Mann-Whitney U test conducted on GenePy scores for this positive control gene in cases versus controls demonstrates markedly more significant results (p = 1.37 × 10⁻⁴) compared to the most commonly applied association tool that combines common and rare variation (p = 0.003). CONCLUSIONS Per-gene per-individual GenePy scores are intuitive when assessing genetic variation in individual patients or comparing scores between groups. GenePy outperforms the currently accepted best practice tools for combining common and rare variation. GenePy scores are suitable for downstream data integration with transcriptomic and proteomic data that also report at the gene level.
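The abstract above names the three ingredients of a per-variant score (allele frequency, zygosity, a deleteriousness metric) and a gene-length correction, but does not state the formula. The Python sketch below is therefore only one plausible way to combine them: the log-scaled frequency weighting, the treatment of heterozygotes, the division by length in kilobases, and every name in the code are assumptions made for illustration, not the published GenePy method.

```python
import math


def variant_score(deleteriousness: float, allele_freq: float, homozygous: bool) -> float:
    """Assumed per-variant score: deleteriousness weighted by allele rarity.

    Assumption: a homozygous call uses the variant allele frequency twice;
    a heterozygous call pairs it with the other allele's frequency (1 - f).
    """
    if not 0.0 < allele_freq < 1.0:
        raise ValueError("allele frequency must lie strictly between 0 and 1")
    f1 = allele_freq
    f2 = allele_freq if homozygous else 1.0 - allele_freq
    return -deleteriousness * math.log10(f1 * f2)


def gene_score(variants, gene_length_kb: float) -> float:
    """Sum the variant scores for one individual and one gene, then divide
    by gene length (an assumed form of the length correction)."""
    if gene_length_kb <= 0:
        raise ValueError("gene length must be positive")
    total = sum(variant_score(d, f, hom) for d, f, hom in variants)
    return total / gene_length_kb


# Hypothetical individual: (deleteriousness, allele frequency, homozygous?)
observed = [(0.9, 0.001, False), (0.2, 0.30, True)]
print(gene_score(observed, gene_length_kb=2.0))
```

Under these assumptions, a rare, highly deleterious variant contributes far more than a common, benign one, which matches the abstract's remark that multiple rare deleterious mutations can produce very high scores.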
|
10
|
Vergara-Lope A, Ennis S, Vorechovsky I, Pengelly RJ, Collins A. Heterogeneity in the extent of linkage disequilibrium among exonic, intronic, non-coding RNA and intergenic chromosome regions. Eur J Hum Genet 2019; 27:1436-1444. [PMID: 31053778 DOI: 10.1038/s41431-019-0419-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 12/13/2018] [Revised: 03/04/2019] [Accepted: 04/16/2019] [Indexed: 11/09/2022]
Abstract
Whole-genome sequence data enable construction of high-resolution linkage disequilibrium (LD) maps revealing the LD structure of functional elements within genic and subgenic sequences. The Malecot-Morton model defines LD map distances in linkage disequilibrium units (LDUs), analogous to the centimorgan scale of linkage maps. For whole-genome sequence-derived LD maps, we introduce the ratio of corresponding map lengths (kilobases/LDU) to describe the extent of LD within genome components. The extent of LD is highly variable across the genome, ranging from ~38 kb for intergenic sequences to ~858 kb for centromeric regions. LD is ~16% more extensive in genic than in intergenic sequences, reflecting relatively increased selection and/or reduced recombination in genes. The LD profile across 18,268 autosomal genes reveals reduced extent of LD, consistent with elevated recombination, in exonic regions near the 5' end of genes but more extensive LD, compared with intronic sequences, across more centrally located exons. Genes classified as essential and genes linked to Mendelian phenotypes show more extensive LD compared with genes associated with complex traits, perhaps reflecting differences in selective pressure. Significant differences between exonic, intronic and intergenic components demonstrate that fine-scale LD structure provides important insights into genome function, which cannot be revealed by LD analysis of much lower-resolution array-based genotyping and conventional linkage maps.
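The kilobases/LDU ratio described in the abstract above can be sketched in Python as follows. Pooling segments by summing their physical and LD map lengths is an assumption about how a genome component might be aggregated, and the segment values are hypothetical, picked only so the results echo the ~38 kb and ~16% figures quoted in the abstract.

```python
def ld_extent_kb_per_ldu(segments: list) -> float:
    """Extent of LD for a genome component, as total kb divided by total LDU.

    `segments` is a list of (physical_length_kb, ld_map_length_ldu) pairs.
    """
    total_kb = sum(kb for kb, _ in segments)
    total_ldu = sum(ldu for _, ldu in segments)
    if total_ldu <= 0:
        raise ValueError("total LD map length must be positive")
    return total_kb / total_ldu


# Hypothetical segments (illustrative values, not data from the study).
intergenic = [(380.0, 10.0), (190.0, 5.0)]
genic = [(264.0, 6.0), (176.0, 4.0)]

intergenic_extent = ld_extent_kb_per_ldu(intergenic)
genic_extent = ld_extent_kb_per_ldu(genic)
relative_increase = 100.0 * (genic_extent - intergenic_extent) / intergenic_extent

print(intergenic_extent, genic_extent, round(relative_increase, 1))
```

A larger kb/LDU value means more physical distance is covered per unit of LD decay, that is, more extensive LD.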
Affiliation(s)
- Alejandra Vergara-Lope
- Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK
- Sarah Ennis
- Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK
- Igor Vorechovsky
- Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK
- Reuben J Pengelly
- Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK
- Andrew Collins
- Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK
|