1
Jain S, Trinidad M, Nguyen TB, Jones K, Neto SD, Ge F, Glagovsky A, Jones C, Moran G, Wang B, Rahimi K, Çalıcı SZ, Cedillo LR, Berardelli S, Özden B, Chen K, Katsonis P, Williams A, Lichtarge O, Rana S, Pradhan S, Srinivasan R, Sajeed R, Joshi D, Faraggi E, Jernigan R, Kloczkowski A, Xu J, Song Z, Özkan S, Padilla N, de la Cruz X, Acuna-Hidalgo R, Grafmüller A, Barrón LTJ, Manfredi M, Savojardo C, Babbi G, Martelli PL, Casadio R, Sun Y, Zhu S, Shen Y, Pucci F, Rooman M, Cia G, Raimondi D, Hermans P, Kwee S, Chen E, Astore C, Kamandula A, Pejaver V, Ramola R, Velyunskiy M, Zeiberg D, Mishra R, Sterling T, Goldstein JL, Lugo-Martinez J, Kazi S, Li S, Long K, Brenner SE, Bakolitsa C, Radivojac P, Suhr D, Suhr T, Clark WT. Evaluation of enzyme activity predictions for variants of unknown significance in Arylsulfatase A. Hum Genet 2025; 144:295-308. [PMID: 40055237 DOI: 10.1007/s00439-025-02731-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Accepted: 02/01/2025] [Indexed: 03/12/2025]
Abstract
Continued advances in variant effect prediction are necessary to demonstrate the ability of machine learning methods to accurately determine the clinical impact of variants of unknown significance (VUS). Towards this goal, the ARSA Critical Assessment of Genome Interpretation (CAGI) challenge was designed to characterize progress by utilizing 219 experimentally assayed missense VUS in the Arylsulfatase A (ARSA) gene to assess the performance of community-submitted predictions of variant functional effects. The challenge involved 15 teams, and evaluated additional predictions from established and recently released models. Notably, a model developed by participants of a genetics and coding bootcamp, trained with standard machine-learning tools in Python, demonstrated superior performance among submissions. Furthermore, the study observed that state-of-the-art deep learning methods provided small but statistically significant improvement in predictive performance compared to less elaborate techniques. These findings underscore the utility of variant effect prediction, and the potential for models trained with modest resources to accurately classify VUS in genetic and clinical research.
Affiliation(s)
- Shantanu Jain
- The Institute for Experiential AI, Northeastern University, Boston, MA, USA
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Marena Trinidad
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
- Howard Hughes Medical Institute, University of California, Berkeley, Berkeley, CA, USA
- Thanh Binh Nguyen
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Australia
- Fang Ge
- State Key Laboratory of Organic Electronics and Information Displays & Institute of Advanced Materials (IAM), Nanjing University of Posts & Telecommunications, Nanjing, China
- Boqi Wang
- Department of Bioinformatics and System Biology, University of California, San Diego, La Jolla, CA, USA
- Kobra Rahimi
- Department of Computational Biology, School of Life Sciences, Ochanomizu University, Tokyo, Japan
- Sümeyra Zeynep Çalıcı
- Department of Genomics, Faculty of Aquatic Science, Istanbul University, Istanbul, Turkey
- Silvia Berardelli
- Department of Electrical, Computer, and Biomedical Engineering, University of Pavia, Pavia, Italy
- enGenome srl, Pavia, Italy
- Buse Özden
- Department of Molecular Biology and Genetics, Faculty of Arts and Sciences, Istanbul Kültür University, Istanbul, Turkey
- Ken Chen
- University of California, Berkeley, Berkeley, CA, USA
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Amanda Williams
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Eshel Faraggi
- Research and Information Systems LLC, Indianapolis, IN, USA
- Physics Department, Indiana University-Purdue University, Indianapolis, IN, USA
- Robert Jernigan
- Roy J. Carver Department of Biochemistry, Iowa State University, Ames, IA, USA
- Andrzej Kloczkowski
- Institute for Genomic Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
- Jierui Xu
- University of California, Berkeley, Berkeley, CA, USA
- Selen Özkan
- Vall d'Hebron Institute of Research (VHIR), Barcelona, Spain
- Universitat Autònoma de Barcelona, Barcelona, Spain
- Natàlia Padilla
- Vall d'Hebron Institute of Research (VHIR), Barcelona, Spain
- Universitat Autònoma de Barcelona, Barcelona, Spain
- Xavier de la Cruz
- Vall d'Hebron Institute of Research (VHIR), Barcelona, Spain
- Universitat Autònoma de Barcelona, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Giulia Babbi
- Biocomputing Group, University of Bologna, Bologna, Italy
- Rita Casadio
- Biocomputing Group, University of Bologna, Bologna, Italy
- Yuanfei Sun
- Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX, USA
- Shaowen Zhu
- Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX, USA
- Yang Shen
- Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX, USA
- Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Bruxelles, Belgium
- Marianne Rooman
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Bruxelles, Belgium
- Gabriel Cia
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Bruxelles, Belgium
- Pauline Hermans
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Bruxelles, Belgium
- Sofia Kwee
- University of California, Berkeley, Berkeley, CA, USA
- Ella Chen
- University of California, Berkeley, Berkeley, CA, USA
- Akash Kamandula
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Vikas Pejaver
- Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Rashika Ramola
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Michelle Velyunskiy
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Daniel Zeiberg
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Reet Mishra
- Department of Bioengineering, University of California, Berkeley, CA, USA
- Department of Bioengineering, University of California, San Francisco, CA, USA
- Jennifer L Goldstein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Jose Lugo-Martinez
- Ray and Stephanie Lane Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA
- Sindy Li
- University of California, Berkeley, Berkeley, CA, USA
- Kinsey Long
- University of California, Berkeley, Berkeley, CA, USA
- Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
2
Joshi D, Pradhan S, Sajeed R, Srinivasan R, Rana S. An augmented transformer model trained on protein family specific variant data leads to improved prediction of variants of uncertain significance. Hum Genet 2025; 144:143-158. [PMID: 39869148 DOI: 10.1007/s00439-025-02727-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2024] [Accepted: 01/12/2025] [Indexed: 01/28/2025]
Abstract
Variants of uncertain significance (VUS) are variants that lack sufficient evidence to be confidently associated with a disease, and they therefore pose a challenge in the interpretation of genetic testing results. Here we report an improved method for predicting the effects of VUS in the Arylsulfatase A (ARSA) gene as part of the Critical Assessment of Genome Interpretation challenge (CAGI6). Our method uses a transfer learning approach that leverages a pre-trained protein language model to predict the impact of mutations on the activity of the ARSA enzyme, whose deficiency is known to cause a rare genetic disorder, metachromatic leukodystrophy. Our innovative framework combines zero-shot log odds scores and embeddings from ESM, an evolutionary scale model, as features for training a supervised model on gene variants functionally related to the ARSA gene. The zero-shot log odds score captures generic protein properties that the model learned during pre-training on millions of sequences in the UniProt data, while the ESM embeddings for the proteins in the ARSA family capture features specific to the family. We also tested our approach on another enzyme, N-acetyl-glucosaminidase (NAGLU), that belongs to the same superfamily as ARSA. Our results demonstrate that the performance of our family models (augmented ESM models) is comparable to or better than that of the ESM models. The ARSA model compares favorably with the majority of state-of-the-art predictors on the area under the precision-recall curve (AUPRC) metric, and the NAGLU model outperforms all pathogenicity predictors evaluated in this study on AUPRC. The improved AUPRC is relevant in a diagnostic setting, where variant prioritization generally entails identifying a small number of pathogenic variants among a larger number of benign variants. Our results also indicate that, for genes with sparse or no experimental variant impact data, family variant data can serve as proxy training data for making accurate predictions. Attention analysis of active sites and binding sites in the ARSA and NAGLU proteins sheds light on probable mechanisms of pathogenicity for highly attended positions.
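The two quantities named in this abstract, a zero-shot log odds score and AUPRC, can be illustrated with a small sketch. This is not the authors' pipeline: the residue probabilities below are made-up numbers standing in for a language model's output at one masked position (no ESM call is made), the function names are hypothetical, and AUPRC is computed in its step-wise "average precision" form without special handling of tied scores.

```python
import math

def zero_shot_log_odds(probs, wt, mut):
    """Log odds of the mutant residue versus the wild-type residue.

    probs: residue -> probability at one position (assumed here to come
    from a protein language model; the values used below are invented).
    Negative values mean the model finds the mutant less likely than wild type.
    """
    return math.log(probs[mut]) - math.log(probs[wt])

def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    labels: 1 = pathogenic, 0 = benign; a higher score means "more pathogenic".
    Ties in scores are not treated specially in this sketch.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos

# Toy position: wild-type L, candidate substitution L -> P (invented numbers).
toy_probs = {"L": 0.60, "P": 0.05, "V": 0.35}
log_odds = zero_shot_log_odds(toy_probs, wt="L", mut="P")

# Toy evaluation set: two pathogenic variants among five.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
auprc = average_precision(toy_labels, toy_scores)
```

Because a more negative log odds suggests a more disruptive substitution, it would need to be negated before being passed as a pathogenicity score to `average_precision`.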
Affiliation(s)
- Dinesh Joshi
- TCS Research, Tata Consultancy Services, Hyderabad, India
- Sadhna Rana
- TCS Research, Tata Consultancy Services, Hyderabad, India
3
Katsonis P, Lichtarge O. Meta-EA: a gene-specific combination of available computational tools for predicting missense variant effects. Nat Commun 2025; 16:159. [PMID: 39746940 PMCID: PMC11696468 DOI: 10.1038/s41467-024-55066-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Accepted: 11/27/2024] [Indexed: 01/04/2025] Open
Abstract
Computational methods for estimating missense variant impact suffer from inconsistent performance across genes, which poses a major challenge for their reliable use in clinical practice. While ensemble scores leverage multiple prediction methods to enhance consistency, the overrepresentation of certain genes in the training data can bias their outcomes. To address this critical limitation, we propose a gene-specific ensemble framework trained on reference computational annotations rather than on clinical or experimental data. Accordingly, we generate Meta-EA ensemble scores that achieve comparable performance to the top individual predicting method for each gene set. Incorporating the effects of splicing and the allele frequency of human polymorphisms further enhances the performance of Meta-EA, achieving an area under the receiver operating characteristic curve of 0.97 for both gene-balanced and imbalanced clinical assessments. In conclusion, this work leverages the wealth of existing variant impact prediction approaches to generate improved estimations for clinical interpretation.
Affiliation(s)
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Department of Biochemistry & Molecular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Department of Pharmacology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
4
Jain S, Trinidad M, Nguyen TB, Jones K, Neto SD, Ge F, Glagovsky A, Jones C, Moran G, Wang B, Rahimi K, Çalıcı SZ, Cedillo LR, Berardelli S, Özden B, Chen K, Katsonis P, Williams A, Lichtarge O, Rana S, Pradhan S, Srinivasan R, Sajeed R, Joshi D, Faraggi E, Jernigan R, Kloczkowski A, Xu J, Song Z, Özkan S, Padilla N, de la Cruz X, Acuna-Hidalgo R, Grafmüller A, Jiménez Barrón LT, Manfredi M, Savojardo C, Babbi G, Martelli PL, Casadio R, Sun Y, Zhu S, Shen Y, Pucci F, Rooman M, Cia G, Raimondi D, Hermans P, Kwee S, Chen E, Astore C, Kamandula A, Pejaver V, Ramola R, Velyunskiy M, Zeiberg D, Mishra R, Sterling T, Goldstein JL, Lugo-Martinez J, Kazi S, Li S, Long K, Brenner SE, Bakolitsa C, Radivojac P, Suhr D, Suhr T, Clark WT. Evaluation of enzyme activity predictions for variants of unknown significance in Arylsulfatase A. bioRxiv 2024:2024.05.16.594558. [PMID: 38798479 PMCID: PMC11118473 DOI: 10.1101/2024.05.16.594558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Affiliation(s)
- Shantanu Jain
- The Institute for Experiential AI, Northeastern University, Boston, MA, USA
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Marena Trinidad
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
- Howard Hughes Medical Institute, University of California, Berkeley, Berkeley, CA, USA
- Thanh Binh Nguyen
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Australia
- Fang Ge
- State Key Laboratory of Organic Electronics and Information Displays & Institute of Advanced Materials (IAM), Nanjing University of Posts & Telecommunications, Nanjing, China
- Boqi Wang
- Department of Bioinformatics and System Biology, University of California, San Diego, La Jolla, CA, USA
- Kobra Rahimi
- Department of Computational Biology, School of Life Sciences, Ochanomizu University, Tokyo, Japan
- Sümeyra Zeynep Çalıcı
- Department of Genomics, Faculty of Aquatic Science, Istanbul University, Istanbul, Türkiye
- Silvia Berardelli
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- enGenome srl, Pavia, Italy
- Buse Özden
- Program of Molecular Biotechnology and Genetics, Institute of Science, Istanbul University, Istanbul, Türkiye
- Ken Chen
- University of California, Berkeley, Berkeley, CA, USA
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Amanda Williams
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Eshel Faraggi
- Research and Information Systems LLC, Indianapolis, IN, USA
- Physics Department, Indiana University-Purdue University, Indianapolis, IN, USA
- Robert Jernigan
- Roy J. Carver Department of Biochemistry, Iowa State University, Ames, IA, USA
- Andrzej Kloczkowski
- Institute for Genomic Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
- Jierui Xu
- University of California, Berkeley, Berkeley, CA, USA
- Selen Özkan
- Vall d'Hebron Institute of Research (VHIR), Barcelona, Spain
- Universitat Autònoma de Barcelona, Barcelona, Spain
- Natàlia Padilla
- Vall d'Hebron Institute of Research (VHIR), Barcelona, Spain
- Universitat Autònoma de Barcelona, Barcelona, Spain
- Xavier de la Cruz
- Vall d'Hebron Institute of Research (VHIR), Barcelona, Spain
- Universitat Autònoma de Barcelona, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Giulia Babbi
- Biocomputing Group, University of Bologna, Bologna, Italy
- Rita Casadio
- Biocomputing Group, University of Bologna, Bologna, Italy
- Yuanfei Sun
- Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX, USA
- Shaowen Zhu
- Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX, USA
- Yang Shen
- Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX, USA
- Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Bruxelles, Belgium
- Marianne Rooman
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Bruxelles, Belgium
- Gabriel Cia
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Bruxelles, Belgium
- Pauline Hermans
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Bruxelles, Belgium
- Sofia Kwee
- University of California, Berkeley, Berkeley, CA, USA
- Ella Chen
- University of California, Berkeley, Berkeley, CA, USA
- Akash Kamandula
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Vikas Pejaver
- Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Rashika Ramola
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Michelle Velyunskiy
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Daniel Zeiberg
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Reet Mishra
- Department of Bioengineering, University of California, Berkeley, CA, USA
- Department of Bioengineering, University of California, San Francisco, CA, USA
- Jennifer L Goldstein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Jose Lugo-Martinez
- Ray and Stephanie Lane Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA
- Sindy Li
- University of California, Berkeley, Berkeley, CA, USA
- Kinsey Long
- University of California, Berkeley, Berkeley, CA, USA
- Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
5
Jain S, Bakolitsa C, Brenner SE, Radivojac P, Moult J, Repo S, Hoskins RA, Andreoletti G, Barsky D, Chellapan A, Chu H, Dabbiru N, Kollipara NK, Ly M, Neumann AJ, Pal LR, Odell E, Pandey G, Peters-Petrulewicz RC, Srinivasan R, Yee SF, Yeleswarapu SJ, Zuhl M, Adebali O, Patra A, Beer MA, Hosur R, Peng J, Bernard BM, Berry M, Dong S, Boyle AP, Adhikari A, Chen J, Hu Z, Wang R, Wang Y, Miller M, Wang Y, Bromberg Y, Turina P, Capriotti E, Han JJ, Ozturk K, Carter H, Babbi G, Bovo S, Di Lena P, Martelli PL, Savojardo C, Casadio R, Cline MS, De Baets G, Bonache S, Díez O, Gutiérrez-Enríquez S, Fernández A, Montalban G, Ootes L, Özkan S, Padilla N, Riera C, De la Cruz X, Diekhans M, Huwe PJ, Wei Q, Xu Q, Dunbrack RL, Gotea V, Elnitski L, Margolin G, Fariselli P, Kulakovskiy IV, Makeev VJ, Penzar DD, Vorontsov IE, Favorov AV, Forman JR, Hasenahuer M, Fornasari MS, Parisi G, Avsec Z, Çelik MH, Nguyen TYD, Gagneur J, Shi FY,
Edwards MD, Guo Y, Tian K, Zeng H, Gifford DK, Göke J, Zaucha J, Gough J, Ritchie GRS, Frankish A, Mudge JM, Harrow J, Young EL, Yu Y, Huff CD, Murakami K, Nagai Y, Imanishi T, Mungall CJ, Jacobsen JOB, Kim D, Jeong CS, Jones DT, Li MJ, Guthrie VB, Bhattacharya R, Chen YC, Douville C, Fan J, Kim D, Masica D, Niknafs N, Sengupta S, Tokheim C, Turner TN, Yeo HTG, Karchin R, Shin S, Welch R, Keles S, Li Y, Kellis M, Corbi-Verge C, Strokach AV, Kim PM, Klein TE, Mohan R, Sinnott-Armstrong NA, Wainberg M, Kundaje A, Gonzaludo N, Mak ACY, Chhibber A, Lam HYK, Dahary D, Fishilevich S, Lancet D, Lee I, Bachman B, Katsonis P, Lua RC, Wilson SJ, Lichtarge O, Bhat RR, Sundaram L, Viswanath V, Bellazzi R, Nicora G, Rizzo E, Limongelli I, Mezlini AM, Chang R, Kim S, Lai C, O’Connor R, Topper S, van den Akker J, Zhou AY, Zimmer AD, Mishne G, Bergquist TR, Breese MR, Guerrero RF, Jiang Y, Kiga N, Li B, Mort M, Pagel KA, Pejaver V, Stamboulian MH, Thusberg J, Mooney SD, Teerakulkittipong N, Cao C, Kundu K, Yin Y, Yu CH, Kleyman M, Lin CF, Stackpole M, Mount SM, Eraslan G, Mueller NS, Naito T, Rao AR, Azaria JR, Brodie A, Ofran Y, Garg A, Pal D, Hawkins-Hooker A, Kenlay H, Reid J, Mucaki EJ, Rogan PK, Schwarz JM, Searls DB, Lee GR, Seok C, Krämer A, Shah S, Huang CV, Kirsch JF, Shatsky M, Cao Y, Chen H, Karimi M, Moronfoye O, Sun Y, Shen Y, Shigeta R, Ford CT, Nodzak C, Uppal A, Shi X, Joseph T, Kotte S, Rana S, Rao A, Saipradeep VG, Sivadasan N, Sunderam U, Stanke M, Su A, Adzhubey I, Jordan DM, Sunyaev S, Rousseau F, Schymkowitz J, Van Durme J, Tavtigian SV, Carraro M, Giollo M, Tosatto SCE, Adato O, Carmel L, Cohen NE, Fenesh T, Holtzer T, Juven-Gershon T, Unger R, Niroula A, Olatubosun A, Väliaho J, Yang Y, Vihinen M, Wahl ME, Chang B, Chong KC, Hu I, Sun R, Wu WKK, Xia X, Zee BC, Wang MH, Wang M, Wu C, Lu Y, Chen K, Yang Y, Yates CM, Kreimer A, Yan Z, Yosef N, Zhao H, Wei Z, Yao Z, Zhou F, Folkman L, Zhou Y, Daneshjou R, Altman RB, Inoue F, Ahituv N, Arkin AP, Lovisa F, 
Bonvini P, Bowdin S, Gianni S, Mantuano E, Minicozzi V, Novak L, Pasquo A, Pastore A, Petrosino M, Puglisi R, Toto A, Veneziano L, Chiaraluce R, Ball MP, Bobe JR, Church GM, Consalvi V, Cooper DN, Buckley BA, Sheridan MB, Cutting GR, Scaini MC, Cygan KJ, Fredericks AM, Glidden DT, Neil C, Rhine CL, Fairbrother WG, Alontaga AY, Fenton AW, Matreyek KA, Starita LM, Fowler DM, Löscher BS, Franke A, Adamson SI, Graveley BR, Gray JW, Malloy MJ, Kane JP, Kousi M, Katsanis N, Schubach M, Kircher M, Mak ACY, Tang PLF, Kwok PY, Lathrop RH, Clark WT, Yu GK, LeBowitz JH, Benedicenti F, Bettella E, Bigoni S, Cesca F, Mammi I, Marino-Buslje C, Milani D, Peron A, Polli R, Sartori S, Stanzial F, Toldo I, Turolla L, Aspromonte MC, Bellini M, Leonardi E, Liu X, Marshall C, McCombie WR, Elefanti L, Menin C, Meyn MS, Murgia A, Nadeau KCY, Neuhausen SL, Nussbaum RL, Pirooznia M, Potash JB, Dimster-Denk DF, Rine JD, Sanford JR, Snyder M, Cote AG, Sun S, Verby MW, Weile J, Roth FP, Tewhey R, Sabeti PC, Campagna J, Refaat MM, Wojciak J, Grubb S, Schmitt N, Shendure J, Spurdle AB, Stavropoulos DJ, Walton NA, Zandi PP, Ziv E, Burke W, Chen F, Carr LR, Martinez S, Paik J, Harris-Wai J, Yarborough M, Fullerton SM, Koenig BA, McInnes G, Shigaki D, Chandonia JM, Furutsuki M, Kasak L, Yu C, Chen R, Friedberg I, Getz GA, Cong Q, Kinch LN, Zhang J, Grishin NV, Voskanian A, Kann MG, Tran E, Ioannidis NM, Hunter JM, Udani R, Cai B, Morgan AA, Sokolov A, Stuart JM, Minervini G, Monzon AM, Batzoglou S, Butte AJ, Greenblatt MS, Hart RK, Hernandez R, Hubbard TJP, Kahn S, O’Donnell-Luria A, Ng PC, Shon J, Veltman J, Zook JM. CAGI, the Critical Assessment of Genome Interpretation, establishes progress and prospects for computational genetic variant interpretation methods. Genome Biol 2024; 25:53. 
[PMID: 38389099 PMCID: PMC10882881 DOI: 10.1186/s13059-023-03113-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2023] [Accepted: 11/17/2023] [Indexed: 02/24/2024] Open
Abstract
BACKGROUND The Critical Assessment of Genome Interpretation (CAGI) aims to advance the state-of-the-art for computational prediction of genetic variant impact, particularly where relevant to disease. The five complete editions of the CAGI community experiment comprised 50 challenges, in which participants made blind predictions of phenotypes from genetic data, and these were evaluated by independent assessors. RESULTS Performance was particularly strong for clinical pathogenic variants, including some difficult-to-diagnose cases, and extends to interpretation of cancer-related variants. Missense variant interpretation methods were able to estimate biochemical effects with increasing accuracy. Assessment of methods for regulatory variants and complex trait disease risk was less definitive and indicates performance potentially suitable for auxiliary use in the clinic. CONCLUSIONS Results show that while current methods are imperfect, they have major utility for research and clinical applications. Emerging methods and increasingly large, robust datasets for training and assessment promise further progress ahead.
6
Trinidad M, Hong X, Froelich S, Daiker J, Sacco J, Nguyen HP, Campagna M, Suhr D, Suhr T, LeBowitz JH, Gelb MH, Clark WT. Predicting disease severity in metachromatic leukodystrophy using protein activity and a patient phenotype matrix. Genome Biol 2023; 24:172. [PMID: 37480112 PMCID: PMC10360315 DOI: 10.1186/s13059-023-03001-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/26/2022] [Accepted: 06/29/2023] [Indexed: 07/23/2023] Open
Abstract
BACKGROUND Metachromatic leukodystrophy (MLD) is a lysosomal storage disorder caused by mutations in the arylsulfatase A gene (ARSA) and categorized into three subtypes according to age of onset. The functional effect of most ARSA mutants remains unknown; better understanding of the genotype-phenotype relationship is required to support newborn screening (NBS) and guide treatment. RESULTS We collected a patient data set from the literature that relates disease severity to ARSA genotype in 489 individuals with MLD. Patient-based data were used to develop a phenotype matrix that predicts MLD phenotype given ARSA alleles in a patient's genotype with 76% accuracy. We then employed a high-throughput enzyme activity assay using mass spectrometry to explore the function of ARSA variants from the curated patient data set and the Genome Aggregation Database (gnomAD). We observed evidence that 36% of variants of unknown significance (VUS) in ARSA may be pathogenic. By classifying functional effects for 251 VUS from gnomAD, we reduced the incidence of genotypes of unknown significance (GUS) by over 98.5% in the overall population. CONCLUSIONS These results provide an additional tool for clinicians to anticipate the disease course in MLD patients, identifying individuals at high risk of severe disease to support treatment access. Our results suggest that more than 1 in 3 VUS in ARSA may be pathogenic. We show that combining genetic and biochemical information increases diagnostic yield. Our strategy may apply to other recessive diseases, providing a tool to address the challenge of interpreting VUS within genotype-phenotype relationships and NBS.
Affiliation(s)
- Marena Trinidad
- Translational Genomics Group, BioMarin Pharmaceutical Inc., Novato, CA, USA
- Xinying Hong
- Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Steven Froelich
- Translational Genomics Group, BioMarin Pharmaceutical Inc., Novato, CA, USA
- Jessica Daiker
- Department of Chemistry, University of Washington, Seattle, WA, USA
- James Sacco
- Translational Genomics Group, BioMarin Pharmaceutical Inc., Novato, CA, USA
- Hong Phuc Nguyen
- Translational Genomics Group, BioMarin Pharmaceutical Inc., Novato, CA, USA
- Madelynn Campagna
- Department of Chemistry, University of Washington, Seattle, WA, USA
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Michael H Gelb
- Department of Chemistry, University of Washington, Seattle, WA, USA
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Wyatt T Clark
- Translational Genomics Group, BioMarin Pharmaceutical Inc., Novato, CA, USA
7
Gorvin CM, Newey PJ, Thakker RV. Identification of prolactin receptor variants with diverse effects on receptor signalling. J Mol Endocrinol 2023; 70:e220164. [PMID: 36445946 PMCID: PMC7614258 DOI: 10.1530/jme-22-0164] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/26/2022] [Accepted: 11/29/2022] [Indexed: 12/02/2022]
Abstract
The prolactin receptor (PRLR) signals predominantly through the JAK2-STAT5 pathway regulating multiple physiological functions relating to fertility, lactation, and metabolism. However, the molecular pathology and role of PRLR mutations and signalling are incompletely defined, with progress hampered by a lack of reported disease-associated PRLR variants. To date, two common germline PRLR variants are reported to demonstrate constitutive activity, with one, Ile146Leu, overrepresented in benign breast disease, while a rare activating variant, Asn492Ile, is reported to be associated with an increased incidence of prolactinoma. In contrast, an inactivating germline heterozygous PRLR variant (His188Arg) was reported in a kindred with hyperprolactinaemia, while an inactivating compound heterozygous PRLR variant (Pro269Leu/Arg171Stop) was identified in an individual with hyperprolactinaemia and agalactia. We hypothesised that additional rare germline PRLR variants, identified from large-scale sequencing projects (ExAC and GnomAD), may be associated with altered in vitro PRLR signalling activity. We therefore evaluated >300 previously uncharacterised non-synonymous, germline PRLR variants and selected 10 variants for in vitro analysis based on protein prediction algorithms, proximity to known functional domains and structural modelling. Five variants, including extracellular and intracellular domain variants, were associated with altered responses when compared to the wild-type receptor. These altered responses included loss- and gain-of-function activities related to STAT5 signalling, Akt and FOXO1 activity, as well as cell viability and apoptosis. These studies provide further insight into PRLR structure-function and indicate that rare germline PRLR variants may have diverse modulating effects on PRLR signalling, although the pathophysiologic relevance of such alterations remains to be defined.
Affiliation(s)
- Caroline M Gorvin
- Academic Endocrine Unit, Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- Oxford NIHR Biomedical Research Centre, University of Oxford, Churchill Hospital, Oxford, UK
- Institute of Metabolism and Systems Research (IMSR) & Centre for Endocrinology, Diabetes and Metabolism (CEDAM), Birmingham Health Partners, University of Birmingham, Birmingham, UK
- Centre of Membrane Proteins and Receptors (COMPARE), University of Birmingham, Birmingham, UK
- Paul J Newey
- Academic Endocrine Unit, Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- Division of Molecular & Clinical Medicine (MCM), University of Dundee, Jacqui Wood Cancer Centre, Dundee, UK
- Rajesh V Thakker
- Academic Endocrine Unit, Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- Oxford NIHR Biomedical Research Centre, University of Oxford, Churchill Hospital, Oxford, UK
|
8
|
Muschol N, Koehn A, von Cossel K, Okur I, Ezgu F, Harmatz P, de Castro Lopez MJ, Couce ML, Lin SP, Batzios S, Cleary M, Solano M, Nestrasil I, Kaufman B, Shaywitz AJ, Maricich SM, Kuca B, Kovalchin J, Zanelli E. A phase I/II study on intracerebroventricular tralesinidase alfa in patients with Sanfilippo syndrome type B. J Clin Invest 2023; 133:165076. [PMID: 36413418 PMCID: PMC9843052 DOI: 10.1172/jci165076] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/02/2022] [Accepted: 11/17/2022] [Indexed: 11/23/2022]
Abstract
Background Sanfilippo type B is a mucopolysaccharidosis (MPS) with a major neuronopathic component characterized by heparan sulfate (HS) accumulation due to mutations in the NAGLU gene encoding alfa-N-acetyl-glucosaminidase. Enzyme replacement therapy for neuronopathic MPS requires efficient enzyme delivery throughout the brain in order to normalize HS levels, prevent brain atrophy, and potentially delay cognitive decline.
Methods In this phase I/II open-label study, patients with MPS type IIIB (n = 22) were treated with tralesinidase alfa administered i.c.v. The patients were monitored for drug exposure; total HS and HS nonreducing end (HS-NRE) levels in both cerebrospinal fluid (CSF) and plasma; anti-drug antibody response; brain, spleen, and liver volumes as measured by MRI; and cognitive development as measured by age-equivalent (AEq) scores.
Results In the Part 1 dose escalation (30, 100, and 300 mg) phase, a 300 mg dose of tralesinidase alfa was necessary to achieve normalization of HS and HS-NRE levels in the CSF and plasma. In Part 2, 300 mg tralesinidase alfa sustained HS and HS-NRE normalization in the CSF and stabilized cortical gray matter volume (CGMV) over 48 weeks of treatment. Resolution of hepatomegaly and a reduction in spleen volume were observed in most patients. Significant correlations were also established between the change in cognitive AEq score and plasma drug exposure, plasma HS-NRE levels, and CGMV.
Conclusion Administration of tralesinidase alfa i.c.v. effectively normalized HS and HS-NRE levels as a prerequisite for clinical efficacy. Peripheral drug exposure data suggest a role for the glymphatic system in altering tralesinidase alfa efficacy.
Trial registration ClinicalTrials.gov NCT02754076.
Funding BioMarin Pharmaceutical Inc. and Allievex Corporation.
Affiliation(s)
- Nicole Muschol
- University Medical Center Hamburg-Eppendorf, International Center for Lysosomal Disorders (ICLD), Hamburg, Germany
- Anja Koehn
- University Medical Center Hamburg-Eppendorf, International Center for Lysosomal Disorders (ICLD), Hamburg, Germany
- Katharina von Cossel
- University Medical Center Hamburg-Eppendorf, International Center for Lysosomal Disorders (ICLD), Hamburg, Germany
- Ilyas Okur
- Gazi University Faculty of Medicine, Departments of Pediatric Metabolism and Genetics, Ankara, Turkey
- Fatih Ezgu
- Gazi University Faculty of Medicine, Departments of Pediatric Metabolism and Genetics, Ankara, Turkey
- Paul Harmatz
- UCSF Benioff Children’s Hospital Oakland, Oakland, California, USA
- Maria J. de Castro Lopez
- Hospital Clínico Universitario de Santiago, University of Santiago de Compostela, IDIS, CIBERER, MetabERN, A Coruña, Spain
- Maria Luz Couce
- Hospital Clínico Universitario de Santiago, University of Santiago de Compostela, IDIS, CIBERER, MetabERN, A Coruña, Spain
- Igor Nestrasil
- Division of Clinical Behavioral Neuroscience, Department of Pediatrics, and Masonic Institute for the Developing Brain, University of Minnesota, Minneapolis, Minnesota, USA
- Brian Kaufman
- CLB Consulting, Falls of Neuse, Raleigh, North Carolina, USA
- Bernice Kuca
- Allievex Corporation, Marblehead, Massachusetts, USA
- Eric Zanelli
- Allievex Corporation, Marblehead, Massachusetts, USA
|
9
|
A Novel Variant in the LIPA Gene Associated with Distinct Phenotype. Balkan J Med Genet 2022; 25:93-100. [PMID: 36880034 PMCID: PMC9985358 DOI: 10.2478/bjmg-2022-0010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
Abstract
Deficiency of lysosomal acid lipase (LAL-D) is caused by biallelic pathogenic variants in the LIPA gene. The spectrum of LAL-D ranges from early onset of hepatosplenomegaly and psychomotor regression (Wolman disease) to a more chronic course (cholesteryl ester storage disease, CESD). The diagnosis is based on lipid and biomarker profiles, specific liver histopathology, enzyme deficiency, and identification of causative genetic variants. Biomarker findings, including a high plasma concentration of chitotriosidase and elevated oxysterols, are useful for the diagnosis of LAL-D. Current treatment options include enzyme replacement therapy (sebelipase-alpha), statins, liver transplantation, and stem cell transplantation. We present two pairs of siblings from Serbia with a distinctive phenotype resembling LAL-D, a novel variant of unknown significance (VUS) detected in the LIPA gene, and residual LAL activity. All patients presented with hepatosplenomegaly in early childhood. In the siblings from family 1, compound heterozygosity for a pathogenic c.419G>A (p.Trp140Ter) variant and a novel VUS c.851C>T (p.Ser284Phe) was detected. The patients from family 2 were homozygous for the c.851C>T VUS, and both had typical histopathologic findings for LAL-D in the liver. Enzyme activity of LAL was tested in three patients and reported as sufficient, and therefore enzyme replacement therapy could not be approved. When diagnosing an inherited metabolic disorder, several aspects are taken into consideration: clinical manifestations, specific biomarkers, enzyme assay results, and molecular genetic findings. This report brings to light cases with a considerable discrepancy between those aspects, namely preserved LAL enzyme activity in the presence of clinical manifestations and rare variants in the LIPA gene.
|
10
|
Borges P, Pasqualim G, Giugliani R, Vairo F, Matte U. Estimated prevalence of mucopolysaccharidoses from population-based exomes and genomes. Orphanet J Rare Dis 2020; 15:324. [PMID: 33208168 PMCID: PMC7672855 DOI: 10.1186/s13023-020-01608-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 08/04/2020] [Accepted: 11/09/2020] [Indexed: 11/22/2022]
Abstract
Background In this study, the prevalence of different types of mucopolysaccharidoses (MPS) was estimated based on data from the exome aggregation consortium (ExAC) and the genome aggregation database (gnomAD). The population-based allele frequencies were used to identify potential disease-causing variants on each gene related to MPS I to IX (except MPS II).
Methods We evaluated the canonical transcripts and excluded homozygous, intronic, 3′, and 5′ UTR variants. Frameshift and in-frame insertions and deletions were evaluated using the SIFT Indel tool. Splice variants were evaluated using SpliceAI and Human Splice Finder 3.0 (HSF). Loss-of-function single nucleotide variants in coding regions were classified as potentially pathogenic, while synonymous variants outside the exon–intron boundaries were deemed non-pathogenic. Missense variants were evaluated by five in silico prediction tools, and only those predicted to be damaging by at least three different algorithms were considered disease-causing.
Results The combined frequencies of selected variants (ranging from 127 in GNS to 259 in IDUA) were used to calculate prevalence based on Hardy–Weinberg equilibrium. The maximum estimated prevalence ranged from 0.46 per 100,000 for MPS IIID to 7.1 per 100,000 for MPS I. Overall, the estimated prevalence of all types of MPS was higher than what has been published in the literature. This difference may be due to misdiagnoses and/or underdiagnoses, especially of the attenuated forms of MPS. However, overestimation of the number of disease-causing variants by in silico predictors cannot be ruled out. Even so, the disease prevalences are similar to those reported in diagnosis-based prevalence studies.
Conclusion We report on an approach to estimate the prevalence of different types of MPS based on publicly available population-based genomic data, which may help health systems to be better prepared to deal with these conditions and provide support to initiatives on diagnosis and management of MPS.
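The estimation approach described in this abstract can be illustrated with a minimal Python sketch. It assumes an autosomal recessive disease, so that under Hardy–Weinberg equilibrium the expected prevalence is the square of the combined frequency of disease-causing alleles. The function names and the allele frequencies are hypothetical and are not taken from the study; only the "at least three of five predictors" rule comes from the abstract.

```python
def is_disease_causing(predictions, min_damaging=3):
    """Apply the consensus rule from the abstract: a missense variant counts
    as disease-causing if at least `min_damaging` tools call it damaging.
    `predictions` is a list of booleans, one per in silico tool."""
    return sum(predictions) >= min_damaging


def recessive_prevalence_per_100k(allele_frequencies):
    """Estimate prevalence per 100,000 for an autosomal recessive disease.
    Assumption: Hardy-Weinberg equilibrium, so prevalence = q**2, where q is
    the combined frequency of all disease-causing alleles."""
    q = sum(allele_frequencies)
    return q * q * 100_000


# Hypothetical allele frequencies for illustration only (not study data).
example_frequencies = [0.001, 0.0005, 0.0015]
print(recessive_prevalence_per_100k(example_frequencies))
```

Because the estimate grows with the square of the combined allele frequency, including variants that are wrongly predicted to be damaging inflates it quickly, which is consistent with the overestimation caveat raised in the abstract.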
Affiliation(s)
- Pâmella Borges
- Cell, Tissue and Gene Laboratory, Clinicas Hospital of Porto Alegre, Rio Grande do Sul, Brazil
- Experimental Research Centre, Bioinformatics Core, Clinicas Hospital of Porto Alegre, Rio Grande do Sul, Brazil
- Graduate Programme in Genetics and Molecular Biology, Federal University of Rio Grande Do Sul (UFRGS), Rio Grande do Sul, Brazil
- Gabriela Pasqualim
- Genetics Laboratory, Biological Sciences Institute, Federal University of Rio Grande (FURG), Rio Grande do Sul, Brazil
- Roberto Giugliani
- Graduate Programme in Genetics and Molecular Biology, Federal University of Rio Grande Do Sul (UFRGS), Rio Grande do Sul, Brazil
- Department of Genetics, UFRGS, Porto Alegre, Brazil
- Medical Genetics Service, HCPA, Porto Alegre, Brazil
- Filippo Vairo
- Center for Individualized Medicine, Mayo Clinic, Rochester, MN, USA
- Department of Clinical Genomics, Mayo Clinic, Rochester, MN, USA
- Ursula Matte
- Cell, Tissue and Gene Laboratory, Clinicas Hospital of Porto Alegre, Rio Grande do Sul, Brazil
- Experimental Research Centre, Bioinformatics Core, Clinicas Hospital of Porto Alegre, Rio Grande do Sul, Brazil
- Graduate Programme in Genetics and Molecular Biology, Federal University of Rio Grande Do Sul (UFRGS), Rio Grande do Sul, Brazil
- Department of Genetics, UFRGS, Porto Alegre, Brazil
|
11
|
Clark WT, Kasak L, Bakolitsa C, Hu Z, Andreoletti G, Babbi G, Bromberg Y, Casadio R, Dunbrack R, Folkman L, Ford CT, Jones D, Katsonis P, Kundu K, Lichtarge O, Martelli PL, Mooney SD, Nodzak C, Pal LR, Radivojac P, Savojardo C, Shi X, Zhou Y, Uppal A, Xu Q, Yin Y, Pejaver V, Wang M, Wei L, Moult J, Yu GK, Brenner SE, LeBowitz JH. Assessment of predicted enzymatic activity of α-N-acetylglucosaminidase variants of unknown significance for CAGI 2016. Hum Mutat 2019; 40:1519-1529. [PMID: 31342580 PMCID: PMC7156275 DOI: 10.1002/humu.23875] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 04/22/2019] [Revised: 06/27/2019] [Accepted: 07/15/2019] [Indexed: 12/25/2022]
Abstract
The NAGLU challenge of the fourth edition of the Critical Assessment of Genome Interpretation experiment (CAGI4) in 2016, invited participants to predict the impact of variants of unknown significance (VUS) on the enzymatic activity of the lysosomal hydrolase α-N-acetylglucosaminidase (NAGLU). Deficiencies in NAGLU activity lead to a rare, monogenic, recessive lysosomal storage disorder, Sanfilippo syndrome type B (MPS type IIIB). This challenge attracted 17 submissions from 10 groups. We observed that top models were able to predict the impact of missense mutations on enzymatic activity with Pearson's correlation coefficients of up to .61. We also observed that top methods were significantly more correlated with each other than they were with observed enzymatic activity values, which we believe speaks to the importance of sequence conservation across the different methods. Improved functional predictions on the VUS will help population-scale analysis of disease epidemiology and rare variant association analysis.
Affiliation(s)
- Laura Kasak
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
- Institute of Biomedicine and Translational Medicine, University of Tartu, Tartu, Estonia
- Constantina Bakolitsa
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
- Zhiqiang Hu
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
- Gaia Andreoletti
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
- Giulia Babbi
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, USA
- Rita Casadio
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Lukas Folkman
- School of Information and Communication Technology, Griffith University, Southport, Australia
- Colby T. Ford
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, NC, USA
- David Jones
- Bioinformatics Group, Department of Computer Science, University College London, UK
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Kunal Kundu
- University of Maryland, College Park, MD, USA
- Olivier Lichtarge
- Departments of Molecular and Human Genetics, Biochemistry & Molecular Biology, Pharmacology, and Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX, USA
- Pier Luigi Martelli
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Conor Nodzak
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, NC, USA
- Predrag Radivojac
- Department of Computer Science, Indiana University, Bloomington, IN, USA
- Castrense Savojardo
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Xinghua Shi
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, NC, USA
- Yaoqi Zhou
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Australia
- Aneeta Uppal
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, NC, USA
- Qifang Xu
- Fox Chase Cancer Center, Philadelphia, PA, USA
- Yizhou Yin
- University of Maryland, College Park, MD, USA
- Vikas Pejaver
- Department of Computer Science and Informatics, Indiana University, Bloomington, IN, USA
- Meng Wang
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, P.R. China
- Liping Wei
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, P.R. China
- John Moult
- University of Maryland, College Park, MD, USA
- G. Karen Yu
- BioMarin Pharmaceutical, San Rafael, California, USA
- Steven E. Brenner
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
|
12
|
Yogalingam G, Luu AR, Prill H, Lo MJ, Yip B, Holtzinger J, Christianson T, Aoyagi-Scharber M, Lawrence R, Crawford BE, LeBowitz JH. BMN 250, a fusion of lysosomal alpha-N-acetylglucosaminidase with IGF2, exhibits different patterns of cellular uptake into critical cell types of Sanfilippo syndrome B disease pathogenesis. PLoS One 2019; 14:e0207836. [PMID: 30657762 PMCID: PMC6338363 DOI: 10.1371/journal.pone.0207836] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 11/01/2018] [Accepted: 12/19/2018] [Indexed: 01/27/2023]
Abstract
Sanfilippo syndrome type B (Sanfilippo B; Mucopolysaccharidosis type IIIB) occurs due to genetic deficiency of lysosomal alpha-N-acetylglucosaminidase (NAGLU) and subsequent lysosomal accumulation of heparan sulfate (HS), which coincides with devastating neurodegenerative disease. Because NAGLU expressed in Chinese hamster ovary cells is not mannose-6-phosphorylated, we developed an insulin-like growth factor 2 (IGF2)-tagged NAGLU molecule (BMN 250; tralesinidase alfa) that binds avidly to the IGF2 / cation-independent mannose 6-phosphate receptor (CI-MPR) for glycosylation-independent lysosomal targeting. BMN 250 is currently being developed as an investigational enzyme replacement therapy for Sanfilippo B. Here we distinguish two cellular uptake mechanisms by which BMN 250 is targeted to lysosomes. In normal rodent-derived neurons and astrocytes, the majority of BMN 250 uptake over 24 hours reaches saturation and can be competitively inhibited with IGF2, suggestive of CI-MPR-mediated uptake. Kuptake, defined as the concentration of enzyme at half-maximal uptake, is 5 nM and 3 nM in neurons and astrocytes, with a maximal uptake capacity (Vmax) corresponding to 764 nmol/hr/mg and 5380 nmol/hr/mg, respectively. Similar to neurons and astrocytes, BMN 250 uptake in Sanfilippo B patient fibroblasts is predominantly CI-MPR-mediated, resulting in augmentation of NAGLU activity with doses of enzyme that fall well below the Kuptake (5 nM), which are sufficient to prevent HS accumulation. In contrast, uptake of the untagged recombinant human NAGLU (rhNAGLU) enzyme in neurons, astrocytes and fibroblasts is negligible at the same doses tested. In microglia, receptor-independent uptake, defined as enzyme uptake resistant to competition with excess IGF2, results in appreciable lysosomal delivery of BMN 250 and rhNAGLU (Vmax = 12,336 nmol/hr/mg and 5469 nmol/hr/mg, respectively). These results suggest that while receptor-independent mechanisms exist for lysosomal targeting of rhNAGLU in microglia, BMN 250, by its IGF2 tag moiety, confers increased CI-MPR-mediated lysosomal targeting to neurons and astrocytes, two additional critical cell types of Sanfilippo B disease pathogenesis.
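The relationship between Kuptake and Vmax reported in this abstract can be illustrated with a short Python sketch. It assumes a simple hyperbolic (Michaelis–Menten-like) saturation curve, which the abstract does not state explicitly; the function name `uptake_rate` is hypothetical. The parameter values are the neuron and astrocyte figures quoted in the abstract.

```python
def uptake_rate(concentration_nM, vmax, k_uptake_nM):
    """Saturable uptake under an assumed hyperbolic model:
    rate = Vmax * [E] / (Kuptake + [E]).
    At [E] == Kuptake the rate is half of Vmax, matching the abstract's
    definition of Kuptake as the concentration at half-maximal uptake."""
    return vmax * concentration_nM / (k_uptake_nM + concentration_nM)


# Values quoted in the abstract (Vmax in nmol/hr/mg, Kuptake in nM).
neurons = {"vmax": 764, "k_uptake_nM": 5}
astrocytes = {"vmax": 5380, "k_uptake_nM": 3}

# Evaluate each cell type at its own Kuptake: the rate is Vmax / 2.
print(uptake_rate(5, **neurons))
print(uptake_rate(3, **astrocytes))
```

Under this assumed model, doses well below Kuptake give an uptake rate that is roughly proportional to concentration, which is the regime the abstract describes for patient fibroblasts.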
Affiliation(s)
- Gouri Yogalingam
- Research, BioMarin Pharmaceutical, Inc., Novato, CA, United States of America
- Amanda R. Luu
- Research, BioMarin Pharmaceutical, Inc., Novato, CA, United States of America
- Heather Prill
- Research, BioMarin Pharmaceutical, Inc., Novato, CA, United States of America
- Melanie J. Lo
- Research, BioMarin Pharmaceutical, Inc., Novato, CA, United States of America
- Bryan Yip
- Research, BioMarin Pharmaceutical, Inc., Novato, CA, United States of America
- John Holtzinger
- Research, BioMarin Pharmaceutical, Inc., Novato, CA, United States of America
- Terri Christianson
- Research, BioMarin Pharmaceutical, Inc., Novato, CA, United States of America
- Roger Lawrence
- Research, BioMarin Pharmaceutical, Inc., Novato, CA, United States of America
- Brett E. Crawford
- Research, BioMarin Pharmaceutical, Inc., Novato, CA, United States of America
|