1
Gilbert MA, Keefer-Jacques E, Jadhav T, Antfolk D, Ming Q, Valente N, Shaw GTW, Sottolano CJ, Matwijec G, Luca VC, Loomes KM, Rajagopalan R, Hayeck TJ, Spinner NB. Functional characterization of 2,832 JAG1 variants supports reclassification for Alagille syndrome and improves guidance for clinical variant interpretation. Am J Hum Genet 2024; 111:1656-1672. [PMID: 39043182 PMCID: PMC11339624 DOI: 10.1016/j.ajhg.2024.06.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2024] [Revised: 06/15/2024] [Accepted: 06/24/2024] [Indexed: 07/25/2024]
Abstract
Pathogenic variants in the JAG1 gene are a primary cause of the multi-system disorder Alagille syndrome. Although variant detection rates are high for this disease, there is uncertainty associated with the classification of missense variants that leads to reduced diagnostic yield. Consequently, up to 85% of reported JAG1 missense variants have uncertain or conflicting classifications. We generated a library of 2,832 JAG1 nucleotide variants within exons 1-7, a region with a high number of reported missense variants, and designed a high-throughput assay to measure JAG1 membrane expression, a requirement for normal function. After calibration using a set of 175 known or predicted pathogenic and benign variants included within the variant library, 486 variants were characterized as functionally abnormal (n = 277 abnormal and n = 209 likely abnormal), of which 439 (90.3%) were missense. We identified divergent membrane expression occurring at specific residues, indicating that loss of the wild-type residue itself does not drive pathogenicity, a finding supported by structural modeling data and with broad implications for clinical variant classification both for Alagille syndrome and globally across other disease genes. Of 144 uncertain variants reported in patients undergoing clinical or research testing, 27 had functionally abnormal membrane expression, and inclusion of our data resulted in the reclassification of 26 to likely pathogenic. Functional evidence augments the classification of genomic variants, reducing uncertainty and improving diagnostics. Inclusion of this repository of functional evidence during JAG1 variant reclassification will significantly affect resolution of variant pathogenicity, making a critical impact on the molecular diagnosis of Alagille syndrome.
Affiliation(s)
- Melissa A Gilbert
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA; Division of Pediatric Gastroenterology, Hepatology, and Nutrition, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Ernest Keefer-Jacques
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Tanaya Jadhav
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Daniel Antfolk
  - Department of Immunology, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL 33612, USA
- Qianqian Ming
  - Department of Immunology, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL 33612, USA
- Nicolette Valente
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Grace Tzun-Wen Shaw
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Christopher J Sottolano
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA
- Grace Matwijec
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Vincent C Luca
  - Department of Immunology, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL 33612, USA
- Kathleen M Loomes
  - Division of Pediatric Gastroenterology, Hepatology, and Nutrition, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pediatrics, The Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
- Ramakrishnan Rajagopalan
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA
- Tristan J Hayeck
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA
- Nancy B Spinner
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA
2
Chu SKS, Narang K, Siegel JB. Protein stability prediction by fine-tuning a protein language model on a mega-scale dataset. PLoS Comput Biol 2024; 20:e1012248. [PMID: 39038042 PMCID: PMC11293664 DOI: 10.1371/journal.pcbi.1012248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Revised: 08/01/2024] [Accepted: 06/13/2024] [Indexed: 07/24/2024]
Abstract
Protein stability plays a crucial role in a variety of applications, such as food processing, therapeutics, and the identification of pathogenic mutations. Engineering campaigns commonly seek to improve protein stability, and there is a strong interest in streamlining these processes to enable rapid optimization of highly stabilized proteins with fewer iterations. In this work, we explore utilizing a mega-scale dataset to develop a protein language model optimized for stability prediction. ESMtherm is trained on the folding stability of 528k natural and de novo sequences derived from 461 protein domains and can accommodate deletions, insertions, and multiple-point mutations. We show that a protein language model can be fine-tuned to predict folding stability. ESMtherm performs reasonably on small protein domains and generalizes to sequences distal from the training set. Lastly, we discuss our model's limitations compared to other state-of-the-art methods in generalizing to larger protein scaffolds. Our results highlight the need for large-scale stability measurements on a diverse dataset that mirrors the distribution of sequence lengths commonly observed in nature.
Affiliation(s)
- Simon K. S. Chu
  - Biophysics Graduate Program, University of California Davis, Davis, California, United States of America
- Kush Narang
  - College of Biological Sciences, University of California Davis, Davis, California, United States of America
- Justin B. Siegel
  - Genome Center, University of California Davis, Davis, California, United States of America
  - Department of Chemistry, University of California Davis, Davis, California, United States of America
  - Department of Biochemistry and Molecular Medicine, University of California Davis, Davis, California, United States of America
3
Kang SC, Sarn NB, Venegas J, Tan Z, Hitomi M, Eng C. Germline PTEN genotype-dependent phenotypic divergence during the early neural developmental process of forebrain organoids. Mol Psychiatry 2024; 29:1767-1781. [PMID: 38030818 DOI: 10.1038/s41380-023-02325-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Revised: 10/22/2023] [Accepted: 11/13/2023] [Indexed: 12/01/2023]
Abstract
PTEN germline mutations account for ~0.2-1% of all autism spectrum disorder (ASD) cases, as well as ~17% of ASD patients with macrocephaly, making it one of the top ASD-associated risk genes. Individuals with germline PTEN mutations receive the molecular diagnosis of PTEN Hamartoma Tumor Syndrome (PHTS), an inherited cancer predisposition syndrome, about 20-23% of whom are diagnosed with ASD. We generated forebrain organoid cultures from gene-edited isogenic human induced pluripotent stem cells (hiPSCs) harboring a PTEN G132D (ASD) or PTEN M134R (cancer) mutant allele to model how these mutations interrupt neurodevelopmental processes. Here, we show that the PTEN G132D allele disrupts early neuroectoderm formation during the first several days of organoid generation, and results in deficient electrophysiology. While organoids generated from PTEN M134R hiPSCs remained morphologically similar to wild-type organoids during this early stage in development, we observed disrupted neuronal differentiation, radial glia positioning, and cortical layering in both PTEN-mutant organoids at the later stage of 72+ days of development. Perifosine, an AKT inhibitor, reduced over-activated AKT and partially corrected the abnormalities in cellular organization observed in PTEN G132D organoids. Single cell RNAseq analyses on early-stage organoids revealed that genes related to neural cell fate were decreased in PTEN G132D mutant organoids, and AKT inhibition was capable of upregulating gene signatures related to neuronal cell fate and CNS maturation pathways. These findings demonstrate that different PTEN missense mutations can have a profound impact on neurodevelopment at diverse stages which in turn may predispose PHTS individuals to ASD. Further study will shed light on ways to mitigate pathological impact of PTEN mutants on neurodevelopment by stage-specific manipulation of downstream PTEN signaling components.
Affiliation(s)
- Shin Chung Kang
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
- Nicholas B Sarn
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
- Juan Venegas
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
- Zhibing Tan
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
  - Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, 44195, USA
- Masahiro Hitomi
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
  - Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, 44195, USA
- Charis Eng
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
  - Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, 44195, USA
  - Center for Personalized Genetic Healthcare, Medical Specialties Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
  - Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
  - Taussig Cancer Institute, Cleveland Clinic Foundation, Cleveland, OH, 44195, USA
  - Department of Genetics and Genome Sciences, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
4
Zhong G, Zhao Y, Zhuang D, Chung WK, Shen Y. PreMode predicts mode-of-action of missense variants by deep graph representation learning of protein sequence and structural context. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.20.581321. [PMID: 38746140 PMCID: PMC11092447 DOI: 10.1101/2024.02.20.581321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Accurate prediction of the functional impact of missense variants is important for disease gene discovery, clinical genetic diagnostics, therapeutic strategies, and protein engineering. Previous efforts have focused on predicting a binary pathogenicity classification, but the functional impact of missense variants is multi-dimensional. Pathogenic missense variants in the same gene may act through different modes of action (i.e., gain/loss-of-function) by affecting different aspects of protein function. They may result in distinct clinical conditions that require different treatments. We developed a new method, PreMode, to perform gene-specific mode-of-action predictions. PreMode models effects of coding sequence variants using SE(3)-equivariant graph neural networks on protein sequences and structures. Using the largest-to-date set of missense variants with known modes of action, we showed that PreMode reached state-of-the-art performance in multiple types of mode-of-action predictions by efficient transfer-learning. Additionally, PreMode's prediction of G/LoF variants in a kinase is consistent with inactive-active conformation transition energy changes. Finally, we show that PreMode enables efficient study design of deep mutational scans and optimization in protein engineering.
5
Gersing S, Schulze TK, Cagiada M, Stein A, Roth FP, Lindorff-Larsen K, Hartmann-Petersen R. Characterizing glucokinase variant mechanisms using a multiplexed abundance assay. Genome Biol 2024; 25:98. [PMID: 38627865 PMCID: PMC11021015 DOI: 10.1186/s13059-024-03238-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Accepted: 04/04/2024] [Indexed: 04/19/2024]
Abstract
BACKGROUND: Amino acid substitutions can perturb protein activity in multiple ways. Understanding their mechanistic basis may pinpoint how residues contribute to protein function. Here, we characterize the mechanisms underlying variant effects in human glucokinase (GCK) variants, building on our previous comprehensive study on GCK variant activity. RESULTS: Using a yeast growth-based assay, we score the abundance of 95% of GCK missense and nonsense variants. When combining the abundance scores with our previously determined activity scores, we find that 43% of hypoactive variants also decrease cellular protein abundance. The low-abundance variants are enriched in the large domain, while residues in the small domain are tolerant to mutations with respect to abundance. Instead, many variants in the small domain perturb GCK conformational dynamics which are essential for appropriate activity. CONCLUSIONS: In this study, we identify residues important for GCK metabolic stability and conformational dynamics. These residues could be targeted to modulate GCK activity, and thereby affect glucose homeostasis.
Affiliation(s)
- Sarah Gersing
  - The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200, Copenhagen, Denmark
- Thea K Schulze
  - The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200, Copenhagen, Denmark
- Matteo Cagiada
  - The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200, Copenhagen, Denmark
- Amelie Stein
  - The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200, Copenhagen, Denmark
- Frederick P Roth
  - Donnelly Centre, University of Toronto, M5S 3E1, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, M5S 1A8, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, M5G 1X5, Toronto, ON, Canada
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, 15213, Pittsburgh, USA
- Kresten Lindorff-Larsen
  - The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200, Copenhagen, Denmark
- Rasmus Hartmann-Petersen
  - The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200, Copenhagen, Denmark
6
Swint-Kruse L, Fenton AW. Rheostats, toggles, and neutrals, Oh my! A new framework for understanding how amino acid changes modulate protein function. J Biol Chem 2024; 300:105736. [PMID: 38336297 PMCID: PMC10914490 DOI: 10.1016/j.jbc.2024.105736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 01/09/2024] [Accepted: 01/25/2024] [Indexed: 02/12/2024]
Abstract
Advances in personalized medicine and protein engineering require accurately predicting outcomes of amino acid substitutions. Many algorithms correctly predict that evolutionarily-conserved positions show "toggle" substitution phenotypes, which is defined when a few substitutions at that position retain function. In contrast, predictions often fail for substitutions at the less-studied "rheostat" positions, which are defined when different amino acid substitutions at a position sample at least half of the possible functional range. This review describes efforts to understand the impact and significance of rheostat positions: (1) They have been observed in globular soluble, integral membrane, and intrinsically disordered proteins; within single proteins, their prevalence can be up to 40%. (2) Substitutions at rheostat positions can have biological consequences and ∼10% of substitutions gain function. (3) Although both rheostat and "neutral" (defined when all substitutions exhibit wild-type function) positions are nonconserved, the two classes have different evolutionary signatures. (4) Some rheostat positions have pleiotropic effects on function, simultaneously modulating multiple parameters (e.g., altering both affinity and allosteric coupling). (5) In structural studies, substitutions at rheostat positions appear to cause only local perturbations; the overall conformations appear unchanged. (6) Measured functional changes show promising correlations with predicted changes in protein dynamics; the emergent properties of predicted, dynamically coupled amino acid networks might explain some of the complex functional outcomes observed when substituting rheostat positions. Overall, rheostat positions provide unique opportunities for using single substitutions to tune protein function. Future studies of these positions will yield important insights into the protein sequence/function relationship.
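Illustrative note (not part of the published abstract): the three position classes defined above can be sketched as a simple labeling rule over per-position functional scores. This is a minimal sketch under stated assumptions, not the authors' scoring method: the function name `classify_position`, the histogram binning, and every threshold (`n_bins`, `neutral_tol`, `toggle_frac`, `rheostat_frac`) are hypothetical choices made only to illustrate the definitions.

```python
from typing import Sequence


def classify_position(
    scores: Sequence[float],
    wt: float = 1.0,             # assumed wild-type functional score
    dead: float = 0.0,           # assumed score of a non-functional protein
    n_bins: int = 10,            # hypothetical histogram resolution
    neutral_tol: float = 0.1,    # hypothetical tolerance, as fraction of range
    toggle_frac: float = 0.7,    # hypothetical share of "dead" substitutions
    rheostat_frac: float = 0.5,  # "at least half of the functional range"
) -> str:
    """Label one position from the functional scores of its substitutions."""
    if not scores:
        raise ValueError("need at least one substitution score")
    full_range = wt - dead
    if full_range <= 0:
        raise ValueError("wt must be greater than dead")

    # Rescale to [0, 1], where 0 is non-functional and 1 is wild-type.
    norm = [min(max((s - dead) / full_range, 0.0), 1.0) for s in scores]

    # Neutral: every substitution stays close to wild-type function.
    if all(1.0 - x <= neutral_tol for x in norm):
        return "neutral"

    # Toggle: most substitutions abolish function; only a few retain it.
    dead_like = sum(x <= neutral_tol for x in norm) / len(norm)
    if dead_like >= toggle_frac:
        return "toggle"

    # Rheostat: substitutions occupy at least half of the histogram bins
    # that span the functional range.
    occupied = {min(int(x * n_bins), n_bins - 1) for x in norm}
    if len(occupied) / n_bins >= rheostat_frac:
        return "rheostat"

    return "intermediate"
```

Neutral is tested first and toggle second so that a position whose substitutions cluster at either extreme is never mislabeled as a rheostat merely because its scores span the full range.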
Affiliation(s)
- Liskin Swint-Kruse
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Aron W Fenton
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
7
Kamath ND, Matreyek KA. Multiplex Functional Characterization of Protein Variant Libraries in Mammalian Cells with Single-Copy Genomic Integration and High-Throughput DNA Sequencing. Methods Mol Biol 2024; 2774:135-152. [PMID: 38441763 DOI: 10.1007/978-1-0716-3718-0_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2024]
Abstract
Sequencing-based, massively parallel genetic assays have enabled simultaneous characterization of the genotype-phenotype relationships for libraries encoding thousands of unique protein variants. Since plasmid transfection and lentiviral transduction have characteristics that limit multiplexing with pooled libraries, we developed a mammalian synthetic biology platform that harnesses the Bxb1 bacteriophage DNA recombinase to insert single promoterless plasmids encoding a transgene of interest into a pre-engineered "landing pad" site within the cell genome. The transgene is expressed behind a genomically integrated promoter, ensuring only one transgene is expressed per cell, preserving a strict genotype-phenotype link. Upon selecting cells based on a desired phenotype, the transgene can be sequenced to ascribe each variant a phenotypic score. We describe how to create and utilize landing pad cells for large-scale, library-based genetic experiments. Using the provided examples, the experimental template can be adapted to explore protein variants in diverse biological problems within mammalian cells.
Affiliation(s)
- Nisha D Kamath
  - Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, OH, USA
- Kenneth A Matreyek
  - Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, OH, USA
8
Notin P, Kollasch AW, Ritter D, van Niekerk L, Paul S, Spinner H, Rollins N, Shaw A, Weitzman R, Frazer J, Dias M, Franceschi D, Orenbuch R, Gal Y, Marks DS. ProteinGym: Large-Scale Benchmarks for Protein Design and Fitness Prediction. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.12.07.570727. [PMID: 38106144 PMCID: PMC10723403 DOI: 10.1101/2023.12.07.570727] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 12/19/2023]
Abstract
Predicting the effects of mutations in proteins is critical to many applications, from understanding genetic disease to designing novel proteins that can address our most pressing challenges in climate, agriculture and healthcare. Despite a surge in machine learning-based protein models to tackle these questions, an assessment of their respective benefits is challenging due to the use of distinct, often contrived, experimental datasets, and the variable performance of models across different protein families. Addressing these challenges requires scale. To that end we introduce ProteinGym, a large-scale and holistic set of benchmarks specifically designed for protein fitness prediction and design. It encompasses both a broad collection of over 250 standardized deep mutational scanning assays, spanning millions of mutated sequences, as well as curated clinical datasets providing high-quality expert annotations about mutation effects. We devise a robust evaluation framework that combines metrics for both fitness prediction and design, factors in known limitations of the underlying experimental methods, and covers both zero-shot and supervised settings. We report the performance of a diverse set of over 70 high-performing models from various subfields (e.g., alignment-based, inverse folding) in a unified benchmark suite. We open source the corresponding codebase, datasets, MSAs, structures, and model predictions, and develop a user-friendly website that facilitates data access and analysis.
Affiliation(s)
- Ada Shaw
  - Applied Mathematics, Harvard University
- Mafalda Dias
  - Centre for Genomic Regulation, Universitat Pompeu Fabra
- Yarin Gal
  - Computer Science, University of Oxford
9
Maes S, Deploey N, Peelman F, Eyckerman S. Deep mutational scanning of proteins in mammalian cells. CELL REPORTS METHODS 2023; 3:100641. [PMID: 37963462 PMCID: PMC10694495 DOI: 10.1016/j.crmeth.2023.100641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Revised: 07/06/2023] [Accepted: 10/20/2023] [Indexed: 11/16/2023]
Abstract
Protein mutagenesis is essential for unveiling the molecular mechanisms underlying protein function in health, disease, and evolution. In the past decade, deep mutational scanning methods have evolved to support the functional analysis of nearly all possible single-amino acid changes in a protein of interest. While historically these methods were developed in lower organisms such as E. coli and yeast, recent technological advancements have resulted in the increased use of mammalian cells, particularly for studying proteins involved in human disease. These advancements will aid significantly in the classification and interpretation of variants of unknown significance, which are being discovered at large scale due to the current surge in the use of whole-genome sequencing in clinical contexts. Here, we explore the experimental aspects of deep mutational scanning studies in mammalian cells and report the different methods used in each step of the workflow, ultimately providing a useful guide toward the design of such studies.
Affiliation(s)
- Stefanie Maes
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biochemistry and Microbiology, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Nick Deploey
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Frank Peelman
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Sven Eyckerman
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
10
Sinha S, Li J, Tam B, Wang SM. Classification of PTEN missense VUS through exascale simulations. Brief Bioinform 2023; 24:bbad361. [PMID: 37843401 DOI: 10.1093/bib/bbad361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 09/08/2023] [Accepted: 09/20/2023] [Indexed: 10/17/2023]
Abstract
Phosphatase and tensin homolog (PTEN), a tumor suppressor with dual phosphatase properties, is a key factor in the PI3K/AKT signaling pathway. Pathogenic germline variation in PTEN can abrogate its ability to dephosphorylate, causing high cancer risk. Because functional evidence is lacking, numerous PTEN variants are classified as variants of uncertain significance (VUS). Utilizing Molecular Dynamics (MD) simulations, we performed a thorough evaluation of 147 PTEN missense VUS, sorting them into 66 deleterious and 81 tolerated variants. Utilizing replica exchange molecular dynamics (REMD) simulations, we further assessed the variants situated in the catalytic core of PTEN's phosphatase domain and uncovered conformational alterations influencing the structural stability of the phosphatase domain. There was a high degree of agreement between our results and the variants classified by Variant Abundance by Massively Parallel Sequencing, saturation mutagenesis, multiplexed functional data and experimental assays. Our extensive analysis of PTEN missense VUS should benefit their clinical applications in PTEN-related cancer. SIGNIFICANCE STATEMENT: Classification of PTEN variants affecting its lipid phosphatase activity is important for understanding the roles of PTEN variation in the pathogenesis of hereditary and sporadic malignancies. Of the 3000 variants identified in PTEN, 1296 (43%) were assigned as VUS. Here, we applied MD and REMD simulations to investigate the effects of PTEN missense VUS on the structural integrity of the PTEN phosphatase domain comprising the WPD, P and TI active sites. We classified a total of 147 missense VUS into 66 deleterious and 81 tolerated variants by referring to the control group comprising 54 pathogenic and 12 benign variants. The classification was largely in concordance with those classified by experimental approaches.
Affiliation(s)
- Siddharth Sinha
  - Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau S.A.R, China
- Jiaheng Li
  - Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau S.A.R, China
- Benjamin Tam
  - Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau S.A.R, China
- San Ming Wang
  - Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau S.A.R, China
11
Gersing S, Schulze TK, Cagiada M, Stein A, Roth FP, Lindorff-Larsen K, Hartmann-Petersen R. Characterizing glucokinase variant mechanisms using a multiplexed abundance assay. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.24.542036. [PMID: 37292969 PMCID: PMC10245906 DOI: 10.1101/2023.05.24.542036] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/10/2023]
Abstract
Amino acid substitutions can perturb protein activity in multiple ways. Understanding their mechanistic basis may pinpoint how residues contribute to protein function. Here, we characterize the mechanisms of human glucokinase (GCK) variants, building on our previous comprehensive study on GCK variant activity. We assayed the abundance of 95% of GCK missense and nonsense variants, and found that 43% of hypoactive variants have a decreased cellular abundance. By combining our abundance scores with predictions of protein thermodynamic stability, we identify residues important for GCK metabolic stability and conformational dynamics. These residues could be targeted to modulate GCK activity, and thereby affect glucose homeostasis.
Collapse
Affiliation(s)
- Sarah Gersing
  - The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 Copenhagen, Denmark
- Thea K. Schulze
  - The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 Copenhagen, Denmark
- Matteo Cagiada
  - The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 Copenhagen, Denmark
- Amelie Stein
  - The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 Copenhagen, Denmark
- Frederick P. Roth
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, M5G 1X5, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, M5T 3A1, Canada
- Kresten Lindorff-Larsen
  - The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 Copenhagen, Denmark
- Rasmus Hartmann-Petersen
  - The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 Copenhagen, Denmark
|
12
|
Fu Y, Bedő J, Papenfuss AT, Rubin AF. Integrating deep mutational scanning and low-throughput mutagenesis data to predict the impact of amino acid variants. GigaScience 2022; 12:giad073. [PMID: 37721410 PMCID: PMC10506130 DOI: 10.1093/gigascience/giad073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 07/02/2023] [Accepted: 08/23/2023] [Indexed: 09/19/2023] Open
Abstract
BACKGROUND Evaluating the impact of amino acid variants has been a critical challenge for studying protein function and interpreting genomic data. High-throughput experimental methods like deep mutational scanning (DMS) can measure the effect of large numbers of variants in a target protein, but because DMS studies have not been performed on all proteins, researchers also model DMS data computationally to estimate variant impacts by predictors. RESULTS In this study, we extended a linear regression-based predictor to explore whether incorporating data from alanine scanning (AS), a widely used low-throughput mutagenesis method, would improve prediction results. To evaluate our model, we collected 146 AS datasets, mapping to 54 DMS datasets across 22 distinct proteins. CONCLUSIONS We show that improved model performance depends on the compatibility of the DMS and AS assays, and the scale of improvement is closely related to the correlation between DMS and AS results.
Affiliation(s)
- Yunfan Fu
  - The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
  - The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
- Justin Bedő
  - The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
  - The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
- Anthony T Papenfuss
  - The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
  - The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
  - Peter MacCallum Cancer Centre, Melbourne, Victoria 3000, Australia
- Alan F Rubin
  - The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
  - The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
|
13
|
Coyote-Maestas W, Nedrud D, He Y, Schmidt D. Determinants of trafficking, conduction, and disease within a K+ channel revealed through multiparametric deep mutational scanning. eLife 2022; 11:e76903. [PMID: 35639599 PMCID: PMC9273215 DOI: 10.7554/elife.76903] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 01/08/2022] [Accepted: 05/27/2022] [Indexed: 01/04/2023] Open
Abstract
A long-standing goal in protein science and clinical genetics is to develop quantitative models of sequence, structure, and function relationships to understand how mutations cause disease. Deep mutational scanning (DMS) is a promising strategy to map how amino acids contribute to protein structure and function and to advance clinical variant interpretation. Here, we introduce 7429 single-residue missense mutations into the inward rectifier K+ channel Kir2.1 and determine how this affects folding, assembly, and trafficking, as well as regulation by allosteric ligands and ion conduction. Our data provide high-resolution information on a cotranslationally folded biogenic unit, trafficking and quality control signals, and segregated roles of different structural elements in fold stability and function. We show that Kir2.1 surface trafficking mutants are underrepresented in variant effect databases, which has implications for clinical practice. By comparing fitness scores with expert-reviewed variant effects, we can predict the pathogenicity of 'variants of unknown significance' and disease mechanisms of known pathogenic mutations. Our study in Kir2.1 provides a blueprint for how multiparametric DMS can help us understand the mechanistic basis of genetic disorders and the structure-function relationships of proteins.
Affiliation(s)
- Willow Coyote-Maestas
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, United States
- David Nedrud
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, United States
- Yungui He
  - Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, United States
- Daniel Schmidt
  - Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, United States
|