1
Ng JK, Chen Y, Akinwe TM, Heins HB, Mehinovic E, Chang Y, Payne ZL, Manuel JG, Karchin R, Turner TN. Proteome-Wide Assessment of Clustering of Missense Variants in Neurodevelopmental Disorders Versus Cancer. medRxiv 2024:2024.02.02.24302238. [PMID: 38352539 PMCID: PMC10863034 DOI: 10.1101/2024.02.02.24302238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/19/2024]
Abstract
Missense de novo variants (DNVs) and missense somatic variants contribute to neurodevelopmental disorders (NDDs) and cancer, respectively. Proteins with statistical enrichment based on analyses of these variants exhibit convergence in the differing NDD and cancer phenotypes. Herein, the question of why some of the same proteins are identified in both phenotypes is examined through investigation of clustering of missense variation at the protein level. Our hypothesis is that missense variation is present in different protein locations in the two phenotypes leading to the distinct phenotypic outcomes. We tested this hypothesis in 1D protein space using our software CLUMP. Furthermore, we newly developed 3D-CLUMP that uses 3D protein structures to spatially test clustering of missense variation for proteome-wide significance. We examined missense DNVs in 39,883 parent-child sequenced trios with NDDs and missense somatic variants from 10,543 sequenced tumors covering five TCGA cancer types and two COSMIC pan-cancer aggregates of tissue types. There were 57 proteins with proteome-wide significant missense variation clustering in NDDs when compared to cancers and 79 proteins with proteome-wide significant missense clustering in cancers compared to NDDs. While our main objective was to identify differences in patterns of missense variation, we also identified a novel NDD protein BLTP2. Overall, our study is innovative, provides new insights into differential missense variation in NDDs and cancer at the protein-level, and contributes necessary information toward building a framework for thinking about prognostic and therapeutic aspects of these proteins.
Affiliation(s)
- Jeffrey K. Ng
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Yilin Chen
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Titilope M. Akinwe
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Molecular Genetics & Genomics Graduate Program, Washington University School of Medicine, St. Louis, MO 63110, USA
- Hillary B. Heins
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Elvisa Mehinovic
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Yoonhoo Chang
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Human & Statistical Genetics Graduate Program, Washington University School of Medicine, St. Louis, MO 63110, USA
- Zachary L. Payne
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Molecular Genetics & Genomics Graduate Program, Washington University School of Medicine, St. Louis, MO 63110, USA
- Juana G. Manuel
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Rachel Karchin
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
  - The Sidney Kimmel Comprehensive Cancer Center, School of Medicine, Johns Hopkins University, Baltimore, MD, USA
  - Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD, USA
- Tychele N. Turner
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Intellectual and Developmental Disabilities Research Center, Washington University School of Medicine, St. Louis, MO, USA
4
Padhi EM, Ng JK, Mehinovic E, Sams EI, Turner TN. ACES: Analysis of Conservation with an Extensive list of Species. Bioinformatics 2021; 37:3920-3922. [PMID: 34601580 PMCID: PMC8570785 DOI: 10.1093/bioinformatics/btab684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2021] [Revised: 09/20/2021] [Accepted: 09/24/2021] [Indexed: 11/18/2022]
Abstract
Motivation: An abundance of new reference genomes is becoming available through large-scale sequencing efforts. While the reference FASTA for each genome is available, there is currently no automated mechanism to query a specific sequence across all new reference genomes.
Results: We developed ACES (Analysis of Conservation with an Extensive list of Species) as a computational workflow to query specific sequences of interest (e.g. enhancers, promoters, exons) against reference genomes with an available reference FASTA. This automated workflow generates BLAST hits against each of the reference genomes, a multiple sequence alignment file, a graphical fragment assembly file and a phylogenetic tree file. These data files can then be used by the researcher in several ways to provide key insights into conservation of the query sequence.
Availability and implementation: ACES is available at https://github.com/TNTurnerLab/ACES
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Evin M Padhi
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Jeffrey K Ng
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Elvisa Mehinovic
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Eleanor I Sams
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Tychele N Turner
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO, 63110, USA
5
Padhi EM, Hayeck TJ, Cheng Z, Chatterjee S, Mannion BJ, Byrska-Bishop M, Willems M, Pinson L, Redon S, Benech C, Uguen K, Audebert-Bellanger S, Le Marechal C, Férec C, Efthymiou S, Rahman F, Maqbool S, Maroofian R, Houlden H, Musunuri R, Narzisi G, Abhyankar A, Hunter RD, Akiyama J, Fries LE, Ng JK, Mehinovic E, Stong N, Allen AS, Dickel DE, Bernier RA, Gorkin DU, Pennacchio LA, Zody MC, Turner TN. Coding and noncoding variants in EBF3 are involved in HADDS and simplex autism. Hum Genomics 2021; 15:44. [PMID: 34256850 PMCID: PMC8278787 DOI: 10.1186/s40246-021-00342-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/03/2021] [Accepted: 06/17/2021] [Indexed: 01/01/2023]
Abstract
BACKGROUND: Previous research in autism and other neurodevelopmental disorders (NDDs) has indicated an important contribution of protein-coding (coding) de novo variants (DNVs) within specific genes. The role of de novo noncoding variation has been observable as a general increase in genetic burden but has yet to be resolved to individual functional elements. In this study, we assessed whole-genome sequencing data in 2671 families with autism (discovery cohort of 516 families, replication cohort of 2155 families). We focused on DNVs in enhancers with characterized in vivo activity in the brain and identified an excess of DNVs in an enhancer named hs737.
RESULTS: We adapted the fitDNM statistical model to work in noncoding regions and tested enhancers for excess of DNVs in families with autism. We found only one enhancer (hs737) with nominal significance in the discovery (p = 0.0172), replication (p = 2.5 × 10⁻³), and combined dataset (p = 1.1 × 10⁻⁴). Each individual with a DNV in hs737 had shared phenotypes including being male, intact cognitive function, and hypotonia or motor delay. Our in vitro assessment of the DNVs showed they all reduce enhancer activity in a neuronal cell line. By epigenomic analyses, we found that hs737 is brain-specific and targets the transcription factor gene EBF3 in human fetal brain. EBF3 is genome-wide significant for coding DNVs in NDDs (missense p = 8.12 × 10⁻³⁵, loss-of-function p = 2.26 × 10⁻¹³) and is widely expressed in the body. Through characterization of promoters bound by EBF3 in neuronal cells, we saw enrichment for binding to NDD genes (p = 7.43 × 10⁻⁶, OR = 1.87) involved in gene regulation. Individuals with coding DNVs have greater phenotypic severity (hypotonia, ataxia, and delayed development syndrome [HADDS]) in comparison to individuals with noncoding DNVs that have autism and hypotonia.
CONCLUSIONS: In this study, we identify DNVs in the hs737 enhancer in individuals with autism. Through multiple approaches, we find hs737 targets the gene EBF3 that is genome-wide significant in NDDs. By assessment of noncoding variation and the genes they affect, we are beginning to understand their impact on gene regulatory networks in NDDs.
Affiliation(s)
- Evin M Padhi
  - Department of Genetics, Washington University School of Medicine, 4523 Clayton Avenue, Campus Box 8232, St. Louis, MO, 63110, USA
- Tristan J Hayeck
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Zhang Cheng
  - Center for Epigenomics, University of California San Diego School of Medicine, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- Sumantra Chatterjee
  - Center for Human Genetics and Genomics, NYU School of Medicine, New York, NY, 10016, USA
- Brandon J Mannion
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Marjolaine Willems
  - University of Montpellier, département de Génétique, maladies rares médecine personnalisée, U 1298, CHU Montpellier, University of Montpellier, Montpellier, France
- Lucile Pinson
  - University of Montpellier, département de Génétique, maladies rares médecine personnalisée, U 1298, CHU Montpellier, University of Montpellier, Montpellier, France
- Sylvia Redon
  - CHU Brest, Inserm, Univ Brest, EFS, UMR 1078, GGB, F-29200, Brest, France
- Caroline Benech
  - CHU Brest, Inserm, Univ Brest, EFS, UMR 1078, GGB, F-29200, Brest, France
- Kevin Uguen
  - CHU Brest, Inserm, Univ Brest, EFS, UMR 1078, GGB, F-29200, Brest, France
- Cédric Le Marechal
  - CHU Brest, Inserm, Univ Brest, EFS, UMR 1078, GGB, F-29200, Brest, France
- Claude Férec
  - CHU Brest, Inserm, Univ Brest, EFS, UMR 1078, GGB, F-29200, Brest, France
- Stephanie Efthymiou
  - Department of Neuromuscular Disorders, UCL Institute of Neurology, Queen Square, London, WC1N 3BG, UK
- Fatima Rahman
  - Development and Behavioral Pediatrics Department, Institute of Child Health and Children Hospital, Lahore, Pakistan
- Shazia Maqbool
  - Department of Neuromuscular Disorders, UCL Institute of Neurology, Queen Square, London, WC1N 3BG, UK
  - Development and Behavioral Pediatrics Department, Institute of Child Health and Children Hospital, Lahore, Pakistan
- Reza Maroofian
  - Department of Neuromuscular Disorders, UCL Institute of Neurology, Queen Square, London, WC1N 3BG, UK
- Henry Houlden
  - Department of Neuromuscular Disorders, UCL Institute of Neurology, Queen Square, London, WC1N 3BG, UK
- Riana D Hunter
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Jennifer Akiyama
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Lauren E Fries
  - Center for Human Genetics and Genomics, NYU School of Medicine, New York, NY, 10016, USA
- Jeffrey K Ng
  - Department of Genetics, Washington University School of Medicine, 4523 Clayton Avenue, Campus Box 8232, St. Louis, MO, 63110, USA
- Elvisa Mehinovic
  - Department of Genetics, Washington University School of Medicine, 4523 Clayton Avenue, Campus Box 8232, St. Louis, MO, 63110, USA
- Nick Stong
  - Institute for Genomic Medicine, Columbia University, New York, NY, 10027, USA
- Andrew S Allen
  - Center for Statistical Genetics and Genomics, Duke University, Durham, NC, 27708, USA
  - Division of Integrative Genomics, Duke University, Durham, NC, 27708, USA
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, 27708, USA
- Diane E Dickel
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Raphael A Bernier
  - Department of Psychiatry and Behavioral Sciences, University of Washington, Seattle, WA, 98195, USA
- David U Gorkin
  - Center for Epigenomics, University of California San Diego School of Medicine, 9500 Gilman Drive, La Jolla, CA, 92093, USA
  - Department of Biology, Emory University, Atlanta, GA, 30322, USA
- Len A Pennacchio
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
  - U.S. Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
- Tychele N Turner
  - Department of Genetics, Washington University School of Medicine, 4523 Clayton Avenue, Campus Box 8232, St. Louis, MO, 63110, USA