1
|
Zitnik M, Li MM, Wells A, Glass K, Morselli Gysi D, Krishnan A, Murali TM, Radivojac P, Roy S, Baudot A, Bozdag S, Chen DZ, Cowen L, Devkota K, Gitter A, Gosline SJC, Gu P, Guzzi PH, Huang H, Jiang M, Kesimoglu ZN, Koyuturk M, Ma J, Pico AR, Pržulj N, Przytycka TM, Raphael BJ, Ritz A, Sharan R, Shen Y, Singh M, Slonim DK, Tong H, Yang XH, Yoon BJ, Yu H, Milenković T. Current and future directions in network biology. BIOINFORMATICS ADVANCES 2024; 4:vbae099. [PMID: 39143982 PMCID: PMC11321866 DOI: 10.1093/bioadv/vbae099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/27/2023] [Revised: 05/31/2024] [Accepted: 07/08/2024] [Indexed: 08/16/2024]
Abstract
Summary Network biology is an interdisciplinary field bridging computational and biological sciences that has proved pivotal in advancing the understanding of cellular functions and diseases across biological systems and scales. Although the field has been around for two decades, it remains nascent. It has witnessed rapid evolution, accompanied by emerging challenges. These stem from various factors, notably the growing complexity and volume of data together with the increased diversity of data types describing different tiers of biological organization. We discuss prevailing research directions in network biology, focusing on molecular/cellular networks but also on other biological network types such as biomedical knowledge graphs, patient similarity networks, brain networks, and social/contact networks relevant to disease spread. In more detail, we highlight areas of inference and comparison of biological networks, multimodal data integration and heterogeneous networks, higher-order network analysis, machine learning on networks, and network-based personalized medicine. Following the overview of recent breakthroughs across these five areas, we offer a perspective on future directions of network biology. Additionally, we discuss scientific communities, educational initiatives, and the importance of fostering diversity within the field. This article establishes a roadmap for an immediate and long-term vision for network biology. Availability and implementation Not applicable.
Collapse
Affiliation(s)
- Marinka Zitnik
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
| | - Michelle M Li
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
| | - Aydin Wells
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, United States
- Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, United States
| | - Kimberly Glass
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, United States
| | - Deisy Morselli Gysi
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Department of Statistics, Federal University of Paraná, Curitiba, Paraná 81530-015, Brazil
- Department of Physics, Northeastern University, Boston, MA 02115, United States
| | - Arjun Krishnan
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, United States
| | - T M Murali
- Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, United States
| | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, United States
| | - Sushmita Roy
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States
- Wisconsin Institute for Discovery, Madison, WI 53715, United States
| | - Anaïs Baudot
- Aix Marseille Université, INSERM, MMG, Marseille, France
| | - Serdar Bozdag
- Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States
- Department of Mathematics, University of North Texas, Denton, TX 76203, United States
| | - Danny Z Chen
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
| | - Lenore Cowen
- Department of Computer Science, Tufts University, Medford, MA 02155, United States
| | - Kapil Devkota
- Department of Computer Science, Tufts University, Medford, MA 02155, United States
| | - Anthony Gitter
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States
- Morgridge Institute for Research, Madison, WI 53715, United States
| | - Sara J C Gosline
- Biological Sciences Division, Pacific Northwest National Laboratory, Seattle, WA 98109, United States
| | - Pengfei Gu
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
| | - Pietro H Guzzi
- Department of Medical and Surgical Sciences, University Magna Graecia of Catanzaro, Catanzaro, 88100, Italy
| | - Heng Huang
- Department of Computer Science, University of Maryland College Park, College Park, MD 20742, United States
| | - Meng Jiang
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
| | - Ziynet Nesibe Kesimoglu
- Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States
- National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, United States
| | - Mehmet Koyuturk
- Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, OH 44106, United States
| | - Jian Ma
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, United States
| | - Alexander R Pico
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, United States
| | - Nataša Pržulj
- Department of Computer Science, University College London, London, WC1E 6BT, England
- ICREA, Catalan Institution for Research and Advanced Studies, Barcelona, 08010, Spain
- Barcelona Supercomputing Center (BSC), Barcelona, 08034, Spain
| | - Teresa M Przytycka
- National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, United States
| | - Benjamin J Raphael
- Department of Computer Science, Princeton University, Princeton, NJ 08544, United States
| | - Anna Ritz
- Department of Biology, Reed College, Portland, OR 97202, United States
| | - Roded Sharan
- School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
| | - Yang Shen
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, United States
| | - Mona Singh
- Department of Computer Science, Princeton University, Princeton, NJ 08544, United States
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, United States
| | - Donna K Slonim
- Department of Computer Science, Tufts University, Medford, MA 02155, United States
| | - Hanghang Tong
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
| | - Xinan Holly Yang
- Department of Pediatrics, University of Chicago, Chicago, IL 60637, United States
| | - Byung-Jun Yoon
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, United States
- Computational Science Initiative, Brookhaven National Laboratory, Upton, NY 11973, United States
| | - Haiyuan Yu
- Department of Computational Biology, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, United States
| | - Tijana Milenković
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, United States
- Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, United States
| |
Collapse
|
2
|
Ding M, Xu W, Pei G, Li P. Long way up: rethink diseases in light of phase separation and phase transition. Protein Cell 2024; 15:475-492. [PMID: 38069453 PMCID: PMC11214837 DOI: 10.1093/procel/pwad057] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2023] [Accepted: 11/24/2023] [Indexed: 07/02/2024] Open
Abstract
Biomolecular condensation, driven by multivalency, serves as a fundamental mechanism within cells, facilitating the formation of distinct compartments, including membraneless organelles that play essential roles in various cellular processes. Perturbations in the delicate equilibrium of condensation, whether resulting in gain or loss of phase separation, have robustly been associated with cellular dysfunction and physiological disorders. As ongoing research endeavors wholeheartedly embrace this newly acknowledged principle, a transformative shift is occurring in our comprehension of disease. Consequently, significant strides have been made in unraveling the profound relevance and potential causal connections between abnormal phase separation and various diseases. This comprehensive review presents compelling recent evidence that highlight the intricate associations between aberrant phase separation and neurodegenerative diseases, cancers, and infectious diseases. Additionally, we provide a succinct summary of current efforts and propose innovative solutions for the development of potential therapeutics to combat the pathological consequences attributed to aberrant phase separation.
Collapse
Affiliation(s)
- Mingrui Ding
- State Key Laboratory of Membrane Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Tsinghua-Peking Center for Life Sciences, Beijing 100084, China
- NuPhase Therapeutics, Beijing 100083, China
| | - Weifan Xu
- State Key Laboratory of Membrane Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Tsinghua-Peking Center for Life Sciences, Beijing 100084, China
- NuPhase Therapeutics, Beijing 100083, China
| | - Gaofeng Pei
- State Key Laboratory of Membrane Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Tsinghua-Peking Center for Life Sciences, Beijing 100084, China
| | - Pilong Li
- State Key Laboratory of Membrane Biology & Frontier Research Center for Biological Structure, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Tsinghua-Peking Center for Life Sciences, Beijing 100084, China
| |
Collapse
|
3
|
The Critical Assessment of Genome Interpretation Consortium, Jain S, Bakolitsa C, Brenner SE, Radivojac P, Moult J, Repo S, Hoskins RA, Andreoletti G, Barsky D, Chellapan A, Chu H, Dabbiru N, Kollipara NK, Ly M, Neumann AJ, Pal LR, Odell E, Pandey G, Peters-Petrulewicz RC, Srinivasan R, Yee SF, Yeleswarapu SJ, Zuhl M, Adebali O, Patra A, Beer MA, Hosur R, Peng J, Bernard BM, Berry M, Dong S, Boyle AP, Adhikari A, Chen J, Hu Z, Wang R, Wang Y, Miller M, Wang Y, Bromberg Y, Turina P, Capriotti E, Han JJ, Ozturk K, Carter H, Babbi G, Bovo S, Di Lena P, Martelli PL, Savojardo C, Casadio R, Cline MS, De Baets G, Bonache S, Díez O, Gutiérrez-Enríquez S, Fernández A, Montalban G, Ootes L, Özkan S, Padilla N, Riera C, De la Cruz X, Diekhans M, Huwe PJ, Wei Q, Xu Q, Dunbrack RL, Gotea V, Elnitski L, Margolin G, Fariselli P, Kulakovskiy IV, Makeev VJ, Penzar DD, Vorontsov IE, Favorov AV, Forman JR, Hasenahuer M, Fornasari MS, Parisi G, Avsec Z, Çelik MH, Nguyen TYD, Gagneur J, Shi FY, Edwards MD, Guo Y, Tian K, Zeng H, Gifford DK, Göke J, Zaucha J, Gough J, Ritchie GRS, Frankish A, Mudge JM, Harrow J, Young EL, et alThe Critical Assessment of Genome Interpretation Consortium, Jain S, Bakolitsa C, Brenner SE, Radivojac P, Moult J, Repo S, Hoskins RA, Andreoletti G, Barsky D, Chellapan A, Chu H, Dabbiru N, Kollipara NK, Ly M, Neumann AJ, Pal LR, Odell E, Pandey G, Peters-Petrulewicz RC, Srinivasan R, Yee SF, Yeleswarapu SJ, Zuhl M, Adebali O, Patra A, Beer MA, Hosur R, Peng J, Bernard BM, Berry M, Dong S, Boyle AP, Adhikari A, Chen J, Hu Z, Wang R, Wang Y, Miller M, Wang Y, Bromberg Y, Turina P, Capriotti E, Han JJ, Ozturk K, Carter H, Babbi G, Bovo S, Di Lena P, Martelli PL, Savojardo C, Casadio R, Cline MS, De Baets G, Bonache S, Díez O, Gutiérrez-Enríquez S, Fernández A, Montalban G, Ootes L, Özkan S, Padilla N, Riera C, De la Cruz X, Diekhans M, Huwe PJ, Wei Q, Xu Q, Dunbrack RL, Gotea V, Elnitski L, Margolin G, Fariselli P, Kulakovskiy IV, Makeev VJ, Penzar DD, Vorontsov IE, Favorov AV, Forman JR, Hasenahuer M, Fornasari MS, Parisi G, Avsec Z, Çelik MH, Nguyen TYD, Gagneur J, Shi FY, Edwards MD, Guo Y, Tian K, Zeng H, Gifford DK, Göke J, Zaucha J, Gough J, Ritchie GRS, Frankish A, Mudge JM, Harrow J, Young EL, Yu Y, Huff CD, Murakami K, Nagai Y, Imanishi T, Mungall CJ, Jacobsen JOB, Kim D, Jeong CS, Jones DT, Li MJ, Guthrie VB, Bhattacharya R, Chen YC, Douville C, Fan J, Kim D, Masica D, Niknafs N, Sengupta S, Tokheim C, Turner TN, Yeo HTG, Karchin R, Shin S, Welch R, Keles S, Li Y, Kellis M, Corbi-Verge C, Strokach AV, Kim PM, Klein TE, Mohan R, Sinnott-Armstrong NA, Wainberg M, Kundaje A, Gonzaludo N, Mak ACY, Chhibber A, Lam HYK, Dahary D, Fishilevich S, Lancet D, Lee I, Bachman B, Katsonis P, Lua RC, Wilson SJ, Lichtarge O, Bhat RR, Sundaram L, Viswanath V, Bellazzi R, Nicora G, Rizzo E, Limongelli I, Mezlini AM, Chang R, Kim S, Lai C, O’Connor R, Topper S, van den Akker J, Zhou AY, Zimmer AD, Mishne G, Bergquist TR, Breese MR, Guerrero RF, Jiang Y, Kiga N, Li B, Mort M, Pagel KA, Pejaver V, Stamboulian MH, Thusberg J, Mooney SD, Teerakulkittipong N, Cao C, Kundu K, Yin Y, Yu CH, Kleyman M, Lin CF, Stackpole M, Mount SM, Eraslan G, Mueller NS, Naito T, Rao AR, Azaria JR, Brodie A, Ofran Y, Garg A, Pal D, Hawkins-Hooker A, Kenlay H, Reid J, Mucaki EJ, Rogan PK, Schwarz JM, Searls DB, Lee GR, Seok C, Krämer A, Shah S, Huang CV, Kirsch JF, Shatsky M, Cao Y, Chen H, Karimi M, Moronfoye O, Sun Y, Shen Y, Shigeta R, Ford CT, Nodzak C, Uppal A, Shi X, Joseph T, Kotte S, Rana S, Rao A, Saipradeep VG, Sivadasan N, Sunderam U, Stanke M, Su A, Adzhubey I, Jordan DM, Sunyaev S, Rousseau F, Schymkowitz J, Van Durme J, Tavtigian SV, Carraro M, Giollo M, Tosatto SCE, Adato O, Carmel L, Cohen NE, Fenesh T, Holtzer T, Juven-Gershon T, Unger R, Niroula A, Olatubosun A, Väliaho J, Yang Y, Vihinen M, Wahl ME, Chang B, Chong KC, Hu I, Sun R, Wu WKK, Xia X, Zee BC, Wang MH, Wang M, Wu C, Lu Y, Chen K, Yang Y, Yates CM, Kreimer A, Yan Z, Yosef N, Zhao H, Wei Z, Yao Z, Zhou F, Folkman L, Zhou Y, Daneshjou R, Altman RB, Inoue F, Ahituv N, Arkin AP, Lovisa F, Bonvini P, Bowdin S, Gianni S, Mantuano E, Minicozzi V, Novak L, Pasquo A, Pastore A, Petrosino M, Puglisi R, Toto A, Veneziano L, Chiaraluce R, Ball MP, Bobe JR, Church GM, Consalvi V, Cooper DN, Buckley BA, Sheridan MB, Cutting GR, Scaini MC, Cygan KJ, Fredericks AM, Glidden DT, Neil C, Rhine CL, Fairbrother WG, Alontaga AY, Fenton AW, Matreyek KA, Starita LM, Fowler DM, Löscher BS, Franke A, Adamson SI, Graveley BR, Gray JW, Malloy MJ, Kane JP, Kousi M, Katsanis N, Schubach M, Kircher M, Mak ACY, Tang PLF, Kwok PY, Lathrop RH, Clark WT, Yu GK, LeBowitz JH, Benedicenti F, Bettella E, Bigoni S, Cesca F, Mammi I, Marino-Buslje C, Milani D, Peron A, Polli R, Sartori S, Stanzial F, Toldo I, Turolla L, Aspromonte MC, Bellini M, Leonardi E, Liu X, Marshall C, McCombie WR, Elefanti L, Menin C, Meyn MS, Murgia A, Nadeau KCY, Neuhausen SL, Nussbaum RL, Pirooznia M, Potash JB, Dimster-Denk DF, Rine JD, Sanford JR, Snyder M, Cote AG, Sun S, Verby MW, Weile J, Roth FP, Tewhey R, Sabeti PC, Campagna J, Refaat MM, Wojciak J, Grubb S, Schmitt N, Shendure J, Spurdle AB, Stavropoulos DJ, Walton NA, Zandi PP, Ziv E, Burke W, Chen F, Carr LR, Martinez S, Paik J, Harris-Wai J, Yarborough M, Fullerton SM, Koenig BA, McInnes G, Shigaki D, Chandonia JM, Furutsuki M, Kasak L, Yu C, Chen R, Friedberg I, Getz GA, Cong Q, Kinch LN, Zhang J, Grishin NV, Voskanian A, Kann MG, Tran E, Ioannidis NM, Hunter JM, Udani R, Cai B, Morgan AA, Sokolov A, Stuart JM, Minervini G, Monzon AM, Batzoglou S, Butte AJ, Greenblatt MS, Hart RK, Hernandez R, Hubbard TJP, Kahn S, O’Donnell-Luria A, Ng PC, Shon J, Veltman J, Zook JM. CAGI, the Critical Assessment of Genome Interpretation, establishes progress and prospects for computational genetic variant interpretation methods. Genome Biol 2024; 25:53. [PMID: 38389099 PMCID: PMC10882881 DOI: 10.1186/s13059-023-03113-6] [Show More Authors] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2023] [Accepted: 11/17/2023] [Indexed: 02/24/2024] Open
Abstract
BACKGROUND The Critical Assessment of Genome Interpretation (CAGI) aims to advance the state-of-the-art for computational prediction of genetic variant impact, particularly where relevant to disease. The five complete editions of the CAGI community experiment comprised 50 challenges, in which participants made blind predictions of phenotypes from genetic data, and these were evaluated by independent assessors. RESULTS Performance was particularly strong for clinical pathogenic variants, including some difficult-to-diagnose cases, and extends to interpretation of cancer-related variants. Missense variant interpretation methods were able to estimate biochemical effects with increasing accuracy. Assessment of methods for regulatory variants and complex trait disease risk was less definitive and indicates performance potentially suitable for auxiliary use in the clinic. CONCLUSIONS Results show that while current methods are imperfect, they have major utility for research and clinical applications. Emerging methods and increasingly large, robust datasets for training and assessment promise further progress ahead.
Collapse
|
4
|
Tu KJ, Diplas BH, Regal JA, Waitkus MS, Pirozzi CJ, Reitman ZJ. Mining cancer genomes for change-of-metabolic-function mutations. Commun Biol 2023; 6:1143. [PMID: 37950065 PMCID: PMC10638295 DOI: 10.1038/s42003-023-05475-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2023] [Accepted: 10/17/2023] [Indexed: 11/12/2023] Open
Abstract
Enzymes with novel functions are needed to enable new organic synthesis techniques. Drawing inspiration from gain-of-function cancer mutations that functionally alter proteins and affect cellular metabolism, we developed METIS (Mutated Enzymes from Tumors In silico Screen). METIS identifies metabolism-altering cancer mutations using mutation recurrence rates and protein structure. We used METIS to screen 298,517 cancer mutations and identify 48 candidate mutations, including those previously identified to alter enzymatic function. Unbiased metabolomic profiling of cells exogenously expressing a candidate mutant (OGDHLp.A400T) supports an altered phenotype that boosts in vitro production of xanthosine, a pharmacologically useful chemical that is currently produced using unsustainable, water-intensive methods. We then applied METIS to 49 million cancer mutations, yielding a refined set of candidates that may impart novel enzymatic functions or contribute to tumor progression. Thus, METIS can be used to identify and catalog potentially-useful cancer mutations for green chemistry and therapeutic applications.
Collapse
Affiliation(s)
- Kevin J Tu
- Department of Radiation Oncology, Duke University, Durham, NC, 27710, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, 21044, USA
- Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, CB2 0RE, UK
| | - Bill H Diplas
- Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
| | - Joshua A Regal
- Department of Radiation Oncology, Duke University, Durham, NC, 27710, USA
| | | | | | - Zachary J Reitman
- Department of Radiation Oncology, Duke University, Durham, NC, 27710, USA.
- Department of Neurosurgery, Duke University, Durham, NC, 27710, USA.
- Department of Pathology, Duke University, Durham, NC, 27710, USA.
| |
Collapse
|
5
|
Spielmann M, Kircher M. Computational and experimental methods for classifying variants of unknown clinical significance. Cold Spring Harb Mol Case Stud 2022; 8:mcs.a006196. [PMID: 35483875 PMCID: PMC9059783 DOI: 10.1101/mcs.a006196] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022] Open
Abstract
The increase in sequencing capacity, reduction in costs, and national and international coordinated efforts have led to the widespread introduction of next-generation sequencing (NGS) technologies in patient care. More generally, human genetics and genomic medicine are gaining importance for more and more patients. Some communities are already discussing the prospect of sequencing each individual's genome at time of birth. Together with digital health records, this shall enable individualized treatments and preventive measures, so-called precision medicine. A central step in this process is the identification of disease causal mutations or variant combinations that make us more susceptible for diseases. Although various technological advances have improved the identification of genetic alterations, the interpretation and ranking of the identified variants remains a major challenge. Based on our knowledge of molecular processes or previously identified disease variants, we can identify potentially functional genetic variants and, using different lines of evidence, we are sometimes able to demonstrate their pathogenicity directly. However, the vast majority of variants are classified as variants of uncertain clinical significance (VUSs) with not enough experimental evidence to determine their pathogenicity. In these cases, computational methods may be used to improve the prioritization and an increasing toolbox of experimental methods is emerging that can be used to assay the molecular effects of VUSs. Here, we discuss how computational and experimental methods can be used to create catalogs of variant effects for a variety of molecular and cellular phenotypes. We discuss the prospects of integrating large-scale functional data with machine learning and clinical knowledge for the development of accurate pathogenicity predictions for clinical applications.
Collapse
Affiliation(s)
- Malte Spielmann
- Institute of Human Genetics, University of Lübeck, 23562 Lübeck, Germany;,Institute of Human Genetics, Christian-Albrechts-Universität, 24105 Kiel, Germany;,Human Molecular Genomics Group, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany;,DZHK (German Centre for Cardiovascular Research), partner site Hamburg/Lübeck/Kiel, 23562 Lübeck, Germany
| | - Martin Kircher
- Institute of Human Genetics, University of Lübeck, 23562 Lübeck, Germany;,Berlin Institute of Health at Charité—Universitätsmedizin Berlin, 10117 Berlin, Germany;,DZHK (German Centre for Cardiovascular Research), partner site Berlin, 10115 Berlin, Germany
| |
Collapse
|
6
|
Riedmayr LM, Hinrichsmeyer KS, Karguth N, Böhm S, Splith V, Michalakis S, Becirovic E. dCas9-VPR-mediated transcriptional activation of functionally equivalent genes for gene therapy. Nat Protoc 2022; 17:781-818. [PMID: 35132255 DOI: 10.1038/s41596-021-00666-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2021] [Accepted: 11/18/2021] [Indexed: 12/19/2022]
Abstract
Many disease-causing genes possess functionally equivalent counterparts, which are often expressed in distinct cell types. An attractive gene therapy approach for inherited disorders caused by mutations in such genes is to transcriptionally activate the appropriate counterpart(s) to compensate for the missing gene function. This approach offers key advantages over conventional gene therapies because it is mutation- and gene size-independent. Here, we describe a protocol for the design, execution and evaluation of such gene therapies using dCas9-VPR. We offer guidelines on how to identify functionally equivalent genes, design and clone single guide RNAs and evaluate transcriptional activation in vitro. Moreover, focusing on inherited retinal diseases, we provide a detailed protocol on how to apply this strategy in mice using dual recombinant adeno-associated virus vectors and how to evaluate its functionality and off-target effects in the target tissue. This strategy is in principle applicable to all organisms that possess functionally equivalent genes suitable for transcriptional activation and addresses pivotal unmet needs in gene therapy with high translational potential. The protocol can be completed in 15-20 weeks.
Collapse
Affiliation(s)
- Lisa M Riedmayr
- Department of Pharmacy-Center for Drug Research, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Klara S Hinrichsmeyer
- Department of Pharmacy-Center for Drug Research, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Nina Karguth
- Department of Pharmacy-Center for Drug Research, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Sybille Böhm
- Department of Pharmacy-Center for Drug Research, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Victoria Splith
- Department of Pharmacy-Center for Drug Research, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Stylianos Michalakis
- Department of Ophthalmology, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Elvir Becirovic
- Department of Pharmacy-Center for Drug Research, Ludwig-Maximilians-Universität München, Munich, Germany.
| |
Collapse
|
7
|
Zha J, Li M, Kong R, Lu S, Zhang J. Explaining and Predicting Allostery with Allosteric Database and Modern Analytical Techniques. J Mol Biol 2022; 434:167481. [DOI: 10.1016/j.jmb.2022.167481] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2021] [Revised: 01/25/2022] [Accepted: 01/31/2022] [Indexed: 12/17/2022]
|
8
|
Jiang Y, Urresti J, Pagel KA, Pramod AB, Iakoucheva LM, Radivojac P. Prioritizing de novo autism risk variants with calibrated gene- and variant-scoring models. Hum Genet 2021; 141:1595-1613. [PMID: 34549350 DOI: 10.1007/s00439-021-02356-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2021] [Accepted: 08/26/2021] [Indexed: 12/17/2022]
Abstract
Whole-exome and whole-genome sequencing studies in autism spectrum disorder (ASD) have identified hundreds of thousands of exonic variants. Only a handful of them, primarily loss-of-function variants, have been shown to increase the risk for ASD, while the contributory roles of other variants, including most missense variants, remain unknown. New approaches that combine tissue-specific molecular profiles with patients' genetic data can thus play an important role in elucidating the functional impact of exonic variation and improve understanding of ASD pathogenesis. Here, we integrate spatio-temporal gene co-expression networks from the developing human brain and protein-protein interaction networks to first reach accurate prioritization of ASD risk genes based on their connectivity patterns with previously known high-confidence ASD risk genes. We subsequently integrate these gene scores with variant pathogenicity predictions to further prioritize individual exonic variants based on the positive-unlabeled learning framework with gene- and variant-score calibration. We demonstrate that this approach discriminates among variants between cases and controls at the high end of the prediction range. Finally, we experimentally validate our top-scoring de novo mutation NP_001243143.1:p.Phe309Ser in the sodium/potassium-transporting ATPase ATP1A3 to disrupt protein binding with different partners.
Collapse
Affiliation(s)
- Yuxiang Jiang
- Department of Computer Science, Indiana University, Bloomington, IN, USA
| | - Jorge Urresti
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
| | - Kymberleigh A Pagel
- Department of Computer Science, Indiana University, Bloomington, IN, USA.,Institute for Computational Medicine, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Akula Bala Pramod
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
| | - Lilia M Iakoucheva
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA.
| | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA.
| |
Collapse
|
9
|
Mazmanian K, Sargsyan K, Lim C. How the Local Environment of Functional Sites Regulates Protein Function. J Am Chem Soc 2020; 142:9861-9871. [PMID: 32407086 DOI: 10.1021/jacs.0c02430] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
Proteins form complex biological machineries whose functions in the cell are highly regulated at both the cellular and molecular levels. Cellular regulation of protein functions involves differential gene expressions, post-translation modifications, and signaling cascades. Molecular regulation, on the other hand, involves tuning an optimal local protein environment for the functional site. Precisely how a protein achieves such an optimal environment around a given functional site is not well understood. Herein, by surveying the literature, we first summarize the various reported strategies used by certain proteins to ensure their correct functioning. We then formulate three key physicochemical factors for regulating a protein's functional site, namely, (i) its immediate interactions, (ii) its solvent accessibility, and (iii) its conformational flexibility. We illustrate how these factors are applied to regulate the functions of free/metal-bound Cys and Zn sites in proteins.
Collapse
Affiliation(s)
- Karine Mazmanian
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
| | - Karen Sargsyan
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
| | - Carmay Lim
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan.,Department of Chemistry, National Tsing Hua University, Hsinchu 300, Taiwan
| |
Collapse
|
10
|
Pejaver V, Babbi G, Casadio R, Folkman L, Katsonis P, Kundu K, Lichtarge O, Martelli PL, Miller M, Moult J, Pal LR, Savojardo C, Yin Y, Zhou Y, Radivojac P, Bromberg Y. Assessment of methods for predicting the effects of PTEN and TPMT protein variants. Hum Mutat 2019; 40:1495-1506. [PMID: 31184403 PMCID: PMC6744362 DOI: 10.1002/humu.23838] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2019] [Revised: 05/27/2019] [Accepted: 06/06/2019] [Indexed: 01/16/2023]
Abstract
Thermodynamic stability is a fundamental property shared by all proteins. Changes in stability due to mutation are a widespread molecular mechanism in genetic diseases. Methods for the prediction of mutation-induced stability change have typically been developed and evaluated on incomplete and/or biased data sets. As part of the Critical Assessment of Genome Interpretation, we explored the utility of high-throughput variant stability profiling (VSP) assay data as an alternative for the assessment of computational methods and evaluated state-of-the-art predictors against over 7,000 nonsynonymous variants from two proteins. We found that predictions were modestly correlated with actual experimental values. Predictors fared better when evaluated as classifiers of extreme stability effects. While different methods emerging as top performers depending on the metric, it is nontrivial to draw conclusions on their adoption or improvement. Our analyses revealed that only 16% of all variants in VSP assays could be confidently defined as stability-affecting. Furthermore, it is unclear as to what extent VSP abundance scores were reasonable proxies for the stability-related quantities that participating methods were designed to predict. Overall, our observations underscore the need for clearly defined objectives when developing and using both computational and experimental methods in the context of measuring variant impact.
Collapse
Affiliation(s)
- Vikas Pejaver
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington
- The eScience Institute, University of Washington, Seattle, Washington
| | - Giulia Babbi
- Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
| | - Rita Casadio
- Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
| | - Lukas Folkman
- School of Information and Communication Technology, Griffith University, Southport, Australia
| | - Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
| | - Kunal Kundu
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
- Computational Biology, Bioinformatics and Genomics, Biological Sciences Graduate Program, University of Maryland, College Park, Maryland
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Department of Biochemistry & Molecular Biology, Baylor College of Medicine, Houston, Texas
- Department of Pharmacology, Baylor College of Medicine, Houston, Texas
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
| | - Pier Luigi Martelli
- Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
| | - Maximilian Miller
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey
| | - John Moult
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland
| | - Lipika R Pal
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
| | - Castrense Savojardo
- Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
| | - Yizhou Yin
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
| | - Yaoqi Zhou
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Australia
| | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, Massachusetts
| | - Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey
- Department of Genetics, Human Genetics Institute, Rutgers University, Piscataway, New Jersey
- Institute for Advanced Study at Technische Universität München (TUM-IAS), Garching/Munich, Germany
| |
Collapse
|
11
|
Pagel KA, Antaki D, Lian A, Mort M, Cooper DN, Sebat J, Iakoucheva LM, Mooney SD, Radivojac P. Pathogenicity and functional impact of non-frameshifting insertion/deletion variation in the human genome. PLoS Comput Biol 2019; 15:e1007112. [PMID: 31199787 PMCID: PMC6594643 DOI: 10.1371/journal.pcbi.1007112] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2018] [Revised: 06/26/2019] [Accepted: 05/17/2019] [Indexed: 11/19/2022] Open
Abstract
Differentiation between phenotypically neutral and disease-causing genetic variation remains an open and relevant problem. Among different types of variation, non-frameshifting insertions and deletions (indels) represent an understudied group with widespread phenotypic consequences. To address this challenge, we present a machine learning method, MutPred-Indel, that predicts pathogenicity and identifies types of functional residues impacted by non-frameshifting insertion/deletion variation. The model shows good predictive performance as well as the ability to identify impacted structural and functional residues including secondary structure, intrinsic disorder, metal and macromolecular binding, post-translational modifications, allosteric sites, and catalytic residues. We identify structural and functional mechanisms impacted preferentially by germline variation from the Human Gene Mutation Database, recurrent somatic variation from COSMIC in the context of different cancers, as well as de novo variants from families with autism spectrum disorder. Further, the distributions of pathogenicity prediction scores generated by MutPred-Indel are shown to differentiate highly recurrent from non-recurrent somatic variation. Collectively, we present a framework to facilitate the interrogation of both pathogenicity and the functional effects of non-frameshifting insertion/deletion variants. The MutPred-Indel webserver is available at http://mutpred.mutdb.org/.
Collapse
Affiliation(s)
- Kymberleigh A. Pagel
- School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana, United States of America
| | - Danny Antaki
- Department of Psychiatry, University of California San Diego, La Jolla, California, United States of America
| | - AoJie Lian
- Department of Psychiatry, University of California San Diego, La Jolla, California, United States of America
- Center for Medical Genetics, School of Life Sciences, Central South University, Changsha, China
| | - Matthew Mort
- Institute of Medical Genetics, Cardiff University, Cardiff, United Kingdom
| | - David N. Cooper
- Institute of Medical Genetics, Cardiff University, Cardiff, United Kingdom
| | - Jonathan Sebat
- Department of Psychiatry, University of California San Diego, La Jolla, California, United States of America
| | - Lilia M. Iakoucheva
- Department of Psychiatry, University of California San Diego, La Jolla, California, United States of America
| | - Sean D. Mooney
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, United States of America
| | - Predrag Radivojac
- School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana, United States of America
- Khoury College of Computer Sciences, Northeastern University, Boston, Massachusetts, United States of America
| |
Collapse
|
12
|
Li Y, Zhang Y, Li X, Yi S, Xu J. Gain-of-Function Mutations: An Emerging Advantage for Cancer Biology. Trends Biochem Sci 2019; 44:659-674. [PMID: 31047772 DOI: 10.1016/j.tibs.2019.03.009] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2018] [Revised: 03/21/2019] [Accepted: 03/26/2019] [Indexed: 02/08/2023]
Abstract
Advances in next-generation sequencing have identified thousands of genomic variants that perturb the normal functions of proteins, further contributing to diverse phenotypic consequences in cancer. Elucidating the functional pathways altered by loss-of-function (LOF) or gain-of-function (GOF) mutations will be crucial for prioritizing cancer-causing variants and their resultant therapeutic liabilities. In this review, we highlight the fundamental function of GOF mutations and discuss the potential mechanistic effects in the context of signaling networks. We also summarize advances in experimental and computational resources, which will dramatically help with studies on the functional and phenotypic consequences of mutations. Together, systematic investigations of the function of GOF mutations will provide an important missing piece for cancer biology and precision therapy.
Collapse
Affiliation(s)
- Yongsheng Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China; Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
| | - Yunpeng Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China; College of Bioinformatics, Hainan Medical University, Haikou 570100, China.
| | - Song Yi
- Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA; Department of Biomedical Engineering, Cockrell School of Engineering, The University of Texas at Austin, Austin, TX 78712, USA.
| | - Juan Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China.
| |
Collapse
|
13
|
Dobson L, Mészáros B, Tusnády GE. Structural Principles Governing Disease-Causing Germline Mutations. J Mol Biol 2018; 430:4955-4970. [DOI: 10.1016/j.jmb.2018.10.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2018] [Accepted: 10/11/2018] [Indexed: 01/03/2023]
|
14
|
Aggarwal S, Nayek A, Pradhan D, Verma R, Yadav M, Ponnusamy K, Jain AK. dbGAPs: A comprehensive database of genes and genetic markers associated with psoriasis and its subtypes. Genomics 2017; 110:S0888-7543(17)30115-5. [PMID: 29031638 DOI: 10.1016/j.ygeno.2017.10.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2017] [Revised: 10/04/2017] [Accepted: 10/10/2017] [Indexed: 01/12/2023]
Abstract
Psoriasis is a systemic hyperproliferative inflammatory skin disorder, although rarely fatal but significantly reduces quality of life. Understanding the full genetic component of the disease association may provide insight into biological pathways as well as targets and biomarkers for diagnosis, prognosis and therapy. Studies related to psoriasis associated genes and genetic markers are scattered and not easily amendable to data-mining. To alleviate difficulties, we have developed dbGAPs an integrated knowledgebase representing a gateway to psoriasis associated genomic data. The database contains annotation for 202 manually curated genes associated with psoriasis and its subtypes with cross-references. Functional enrichment of these genes, in context of Gene Ontology and pathways, provide insight into their important role in psoriasis etiology and pathogenesis. The dbGAPs interface is enriched with an interactive search engine for data retrieval along with unique customized tools for Single Nucleotide Polymorphism (SNP)/indel detection and SNP/indel annotations. dbGAPs is accessible at http://www.bmicnip.in/dbgaps/.
Collapse
Affiliation(s)
- Shweta Aggarwal
- Biomedical Informatics Centre, National Institute of Pathology - ICMR, New Delhi, India
| | - Arnab Nayek
- Biomedical Informatics Centre, National Institute of Pathology - ICMR, New Delhi, India
| | - Dibyabhaba Pradhan
- Biomedical Informatics Centre, National Institute of Pathology - ICMR, New Delhi, India
| | - Rashi Verma
- Biomedical Informatics Centre, National Institute of Pathology - ICMR, New Delhi, India
| | - Monika Yadav
- Biomedical Informatics Centre, National Institute of Pathology - ICMR, New Delhi, India
| | | | - Arun Kumar Jain
- Biomedical Informatics Centre, National Institute of Pathology - ICMR, New Delhi, India.
| |
Collapse
|
15
|
Spatial distribution of disease-associated variants in three-dimensional structures of protein complexes. Oncogenesis 2017; 6:e380. [PMID: 28945216 PMCID: PMC5623905 DOI: 10.1038/oncsis.2017.79] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2016] [Revised: 07/26/2017] [Accepted: 08/06/2017] [Indexed: 01/06/2023] Open
Abstract
Next-generation sequencing enables simultaneous analysis of hundreds of human genomes
associated with a particular phenotype, for example, a disease. These genomes
naturally contain a lot of sequence variation that ranges from single-nucleotide
variants (SNVs) to large-scale structural rearrangements. In order to establish a
functional connection between genotype and disease-associated phenotypes, one needs
to distinguish disease drivers from neutral passenger variants. Functional annotation
based on experimental assays is feasible only for a limited number of candidate
mutations. Thus alternative computational tools are needed. A possible approach to
annotating mutations functionally is to consider their spatial location relative to
functionally relevant sites in three-dimensional (3D) structures of the harboring
proteins. This is impeded by the lack of available protein 3D structures.
Complementing experimentally resolved structures with reliable computational models
is an attractive alternative. We developed a structure-based approach to
characterizing comprehensive sets of non-synonymous single-nucleotide variants
(nsSNVs): associated with cancer, non-cancer diseases and putatively functionally
neutral. We searched experimentally resolved protein 3D structures for potential
homology-modeling templates for proteins harboring corresponding mutations. We found
such templates for all proteins with disease-associated nsSNVs, and 51 and 66%
of proteins carrying common polymorphisms and annotated benign variants. Many
mutations caused by nsSNVs can be found in protein–protein,
protein–nucleic acid or protein–ligand complexes. Correction for the
number of available templates per protein reveals that protein–protein
interaction interfaces are not enriched in either cancer nsSNVs, or nsSNVs associated
with non-cancer diseases. Whereas cancer-associated mutations are enriched in
DNA-binding proteins, they are rarely located directly in DNA-interacting interfaces.
In contrast, mutations associated with non-cancer diseases are in general rare in
DNA-binding proteins, but enriched in DNA-interacting interfaces in these proteins.
All disease-associated nsSNVs are overrepresented in ligand-binding pockets, and
nsSNVs associated with non-cancer diseases are additionally enriched in protein core,
where they probably affect overall protein stability.
Collapse
|
16
|
Pejaver V, Mooney SD, Radivojac P. Missense variant pathogenicity predictors generalize well across a range of function-specific prediction challenges. Hum Mutat 2017; 38:1092-1108. [PMID: 28508593 PMCID: PMC5561458 DOI: 10.1002/humu.23258] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2016] [Revised: 03/16/2017] [Accepted: 03/26/2017] [Indexed: 11/08/2022]
Abstract
The steady advances in machine learning and accumulation of biomedical data have contributed to the development of numerous computational models that assess the impact of missense variants. Different methods, however, operationalize impact differently. Two common tasks in this context are the prediction of the pathogenicity of variants and the prediction of their effects on a protein's function. These are related but distinct problems, and it is unclear whether methods developed for one are optimized for the other. The Critical Assessment of Genome Interpretation (CAGI) experiment provides a means to address this question empirically. To this end, we participated in various protein-specific challenges in CAGI with two objectives in mind. First, to compare the performance of methods in the MutPred family with the state-of-the-art. Second and more importantly, to investigate the applicability of general-purpose pathogenicity predictors to the classification of specific function-altering variants without additional training or calibration. We find that our pathogenicity predictors performed competitively with other methods, outputting score distributions in agreement with experimental outcomes. Overall, we conclude that binary classifiers learned from disease-causing mutations are capable of modeling important aspects of the underlying biology and the alteration of protein function resulting from mutations.
Collapse
Affiliation(s)
- Vikas Pejaver
- Department of Computer Science and Informatics, Indiana University, Bloomington, Indiana 47405
| | - Sean D. Mooney
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington 98109
| | - Predrag Radivojac
- Department of Computer Science and Informatics, Indiana University, Bloomington, Indiana 47405
| |
Collapse
|
17
|
Stenson PD, Mort M, Ball EV, Evans K, Hayden M, Heywood S, Hussain M, Phillips AD, Cooper DN. The Human Gene Mutation Database: towards a comprehensive repository of inherited mutation data for medical research, genetic diagnosis and next-generation sequencing studies. Hum Genet 2017. [PMID: 28349240 DOI: 10.1007/s00439‐017‐1779‐6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2022]
Abstract
The Human Gene Mutation Database (HGMD®) constitutes a comprehensive collection of published germline mutations in nuclear genes that underlie, or are closely associated with human inherited disease. At the time of writing (March 2017), the database contained in excess of 203,000 different gene lesions identified in over 8000 genes manually curated from over 2600 journals. With new mutation entries currently accumulating at a rate exceeding 17,000 per annum, HGMD represents de facto the central unified gene/disease-oriented repository of heritable mutations causing human genetic disease used worldwide by researchers, clinicians, diagnostic laboratories and genetic counsellors, and is an essential tool for the annotation of next-generation sequencing data. The public version of HGMD ( http://www.hgmd.org ) is freely available to registered users from academic institutions and non-profit organisations whilst the subscription version (HGMD Professional) is available to academic, clinical and commercial users under license via QIAGEN Inc.
Collapse
Affiliation(s)
- Peter D Stenson
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK.
| | - Matthew Mort
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Edward V Ball
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Katy Evans
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Matthew Hayden
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Sally Heywood
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Michelle Hussain
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Andrew D Phillips
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - David N Cooper
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK.
| |
Collapse
|
18
|
Stenson PD, Mort M, Ball EV, Evans K, Hayden M, Heywood S, Hussain M, Phillips AD, Cooper DN. The Human Gene Mutation Database: towards a comprehensive repository of inherited mutation data for medical research, genetic diagnosis and next-generation sequencing studies. Hum Genet 2017; 136:665-677. [PMID: 28349240 PMCID: PMC5429360 DOI: 10.1007/s00439-017-1779-6] [Citation(s) in RCA: 975] [Impact Index Per Article: 121.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2017] [Accepted: 03/14/2017] [Indexed: 02/06/2023]
Abstract
The Human Gene Mutation Database (HGMD®) constitutes a comprehensive collection of published germline mutations in nuclear genes that underlie, or are closely associated with human inherited disease. At the time of writing (March 2017), the database contained in excess of 203,000 different gene lesions identified in over 8000 genes manually curated from over 2600 journals. With new mutation entries currently accumulating at a rate exceeding 17,000 per annum, HGMD represents de facto the central unified gene/disease-oriented repository of heritable mutations causing human genetic disease used worldwide by researchers, clinicians, diagnostic laboratories and genetic counsellors, and is an essential tool for the annotation of next-generation sequencing data. The public version of HGMD (http://www.hgmd.org) is freely available to registered users from academic institutions and non-profit organisations whilst the subscription version (HGMD Professional) is available to academic, clinical and commercial users under license via QIAGEN Inc.
Collapse
Affiliation(s)
- Peter D Stenson
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK.
| | - Matthew Mort
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Edward V Ball
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Katy Evans
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Matthew Hayden
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Sally Heywood
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Michelle Hussain
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Andrew D Phillips
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - David N Cooper
- School of Medicine, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK.
| |
Collapse
|