1. Zhang Y, Leung AK, Kang JJ, Sun Y, Wu G, Li L, Sun J, Cheng L, Qiu T, Zhang J, Wierbowski SD, Gupta S, Booth JG, Yu H. A multiscale functional map of somatic mutations in cancer integrating protein structure and network topology. Nat Commun 2025; 16:975. [PMID: 39856048 PMCID: PMC11760531 DOI: 10.1038/s41467-024-54176-3] [Received: 03/11/2024] [Accepted: 11/04/2024] [Indexed: 01/27/2025]
Abstract
A major goal of cancer biology is to understand the mechanisms driven by somatically acquired mutations. Two distinct methodologies-one analyzing mutation clustering within protein sequences and 3D structures, the other leveraging protein-protein interaction network topology-offer complementary strengths. We present NetFlow3D, a unified, end-to-end 3D structurally-informed protein interaction network propagation framework that maps the multiscale mechanistic effects of mutations. Built upon the Human Protein Structurome, which incorporates the 3D structures of every protein and the binding interfaces of all known protein interactions, NetFlow3D integrates atomic, residue, protein and network-level information: It clusters mutations on 3D protein structures to identify driver mutations and propagates their impacts anisotropically across the protein interaction network, guided by the involved interaction interfaces, to reveal systems-level impacts. Applied to 33 cancer types, NetFlow3D identifies 2 times more 3D clusters and incorporates 8 times more proteins in significantly interconnected network modules compared to traditional methods.
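As a rough illustration of what propagating mutation impacts across an interaction network can look like, the sketch below runs a generic random walk with restart on a small directed, weighted graph. It is not NetFlow3D's algorithm: the direction-dependent edge weights (standing in here for interface-guided, anisotropic propagation), the restart probability, and the node names `A`, `B`, `C` are all hypothetical choices made for illustration.

```python
def propagate(edges, heat, restart=0.5, iters=200):
    """Random walk with restart on a directed, weighted graph.

    edges: dict mapping (source, target) -> non-negative weight
    heat:  dict mapping node -> initial score (e.g. a mutation-cluster signal)
    Returns a dict mapping node -> steady-state score (scores sum to 1).
    """
    nodes = sorted({n for edge in edges for n in edge} | set(heat))
    out_total = {n: 0.0 for n in nodes}
    for (src, _dst), w in edges.items():
        out_total[src] += w

    total_heat = sum(heat.values())
    p0 = {n: heat.get(n, 0.0) / total_heat for n in nodes}  # normalized seeds
    p = dict(p0)

    for _ in range(iters):
        # restart term: a fixed fraction of the mass returns to the seeds
        nxt = {n: restart * p0[n] for n in nodes}
        # walk term: remaining mass follows outgoing edges in proportion to weight
        for (src, dst), w in edges.items():
            nxt[dst] += (1 - restart) * p[src] * w / out_total[src]
        # nodes without outgoing edges hand their mass back to the seeds
        for n in nodes:
            if out_total[n] == 0.0:
                for m in nodes:
                    nxt[m] += (1 - restart) * p[n] * p0[m]
        p = nxt
    return p


# Hypothetical toy network: flow is stronger in the A -> B -> C direction.
toy_edges = {
    ("A", "B"): 1.0, ("B", "A"): 0.2,
    ("B", "C"): 1.0, ("C", "B"): 0.2,
}
scores = propagate(toy_edges, heat={"A": 1.0})
for node, score in sorted(scores.items()):
    print(f"{node}: {score:.3f}")
```

Using a different weight for each direction of an edge is one simple way to make propagation direction-dependent; a symmetric weight per edge would recover ordinary isotropic propagation.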
Affiliation(s)
- Yingying Zhang
  - Department of Computational Biology, Cornell University, Ithaca, 14853, NY, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, 14853, NY, USA
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, 14853, NY, USA
- Alden K Leung
  - Department of Computational Biology, Cornell University, Ithaca, 14853, NY, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, 14853, NY, USA
- Jin Joo Kang
  - Department of Computational Biology, Cornell University, Ithaca, 14853, NY, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, 14853, NY, USA
- Yu Sun
  - Department of Computational Biology, Cornell University, Ithaca, 14853, NY, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, 14853, NY, USA
- Guanxi Wu
  - College of Agriculture and Life Sciences, Cornell University, Ithaca, 14853, NY, USA
- Le Li
  - Department of Computational Biology, Cornell University, Ithaca, 14853, NY, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, 14853, NY, USA
- Jiayang Sun
  - Department of Computational Biology, Cornell University, Ithaca, 14853, NY, USA
- Lily Cheng
  - Department of Science and Technology Studies, Cornell University, Ithaca, 14853, NY, USA
- Tian Qiu
  - School of Electrical and Computer Engineering, Cornell University, Ithaca, 14853, NY, USA
- Junke Zhang
  - Department of Computational Biology, Cornell University, Ithaca, 14853, NY, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, 14853, NY, USA
- Shayne D Wierbowski
  - Department of Computational Biology, Cornell University, Ithaca, 14853, NY, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, 14853, NY, USA
- Shagun Gupta
  - Department of Computational Biology, Cornell University, Ithaca, 14853, NY, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, 14853, NY, USA
- James G Booth
  - Department of Computational Biology, Cornell University, Ithaca, 14853, NY, USA
  - Department of Statistics and Data Science, Cornell University, Ithaca, 14853, NY, USA
- Haiyuan Yu
  - Department of Computational Biology, Cornell University, Ithaca, 14853, NY, USA.
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, 14853, NY, USA.
2. Chhibbar P, Guha Roy P, Harioudh MK, McGrail DJ, Yang D, Singh H, Hinterleitner R, Gong YN, Yi SS, Sahni N, Sarkar SN, Das J. Uncovering cell-type-specific immunomodulatory variants and molecular phenotypes in COVID-19 using structurally resolved protein networks. Cell Rep 2024; 43:114930. [PMID: 39504244 DOI: 10.1016/j.celrep.2024.114930] [Received: 12/17/2023] [Revised: 07/22/2024] [Accepted: 10/15/2024] [Indexed: 11/08/2024]
Abstract
Immunomodulatory variants that lead to the loss or gain of specific protein interactions often manifest only as organismal phenotypes in infectious disease. Here, we propose a network-based approach to integrate genetic variation with a structurally resolved human protein interactome network to prioritize immunomodulatory variants in COVID-19. We find that, in addition to variants that pass genome-wide significance thresholds, variants at the interface of specific protein-protein interactions, even though they do not meet genome-wide thresholds, are equally immunomodulatory. The integration of these variants with single-cell epigenomic and transcriptomic data prioritizes myeloid and T cell subsets as the most affected by these variants across both the peripheral blood and the lung compartments. Of particular interest is a common coding variant that disrupts the OAS1-PRMT6 interaction and affects downstream interferon signaling. Critically, our framework is generalizable across infectious disease contexts and can be used to implicate immunomodulatory variants that do not meet genome-wide significance thresholds.
Affiliation(s)
- Prabal Chhibbar
  - Center for Systems Immunology, Departments of Immunology and Computational & Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA; Integrative Systems Biology PhD Program, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Priyamvada Guha Roy
  - Center for Systems Immunology, Departments of Immunology and Computational & Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA; Human Genetics PhD Program, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Munesh K Harioudh
  - Department of Microbiology and Molecular Genetics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Daniel J McGrail
  - Center for Immunotherapy and Precision Immuno Oncology, Cleveland Clinic, Cleveland, OH, USA; Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Donghui Yang
  - Department of Immunology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Harinder Singh
  - Center for Systems Immunology, Departments of Immunology and Computational & Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Reinhard Hinterleitner
  - Department of Immunology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Yi-Nan Gong
  - Department of Immunology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- S Stephen Yi
  - Livestrong Cancer Institutes, Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX, USA; Department of Biomedical Engineering, Oden Institute for Computational Engineering and Sciences (ICES) and Interdisciplinary Life Sciences Graduate Programs, The University of Texas at Austin, Austin, TX, USA
- Nidhi Sahni
  - Department of Epigenetics and Molecular Carcinogenesis, MD Anderson Cancer Center, Houston, TX, USA; Program in Quantitative and Computational Biosciences (QCB), Baylor College of Medicine, Houston, TX, USA; Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Saumendra N Sarkar
  - Department of Microbiology and Molecular Genetics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA; Department of Immunology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Jishnu Das
  - Center for Systems Immunology, Departments of Immunology and Computational & Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA.
3. Xiong D, Qiu Y, Zhao J, Zhou Y, Lee D, Gupta S, Torres M, Lu W, Liang S, Kang JJ, Eng C, Loscalzo J, Cheng F, Yu H. A structurally informed human protein-protein interactome reveals proteome-wide perturbations caused by disease mutations. Nat Biotechnol 2024:10.1038/s41587-024-02428-4. [PMID: 39448882 DOI: 10.1038/s41587-024-02428-4] [Received: 02/01/2024] [Accepted: 09/11/2024] [Indexed: 10/26/2024]
Abstract
To assist the translation of genetic findings to disease pathobiology and therapeutics discovery, we present an ensemble deep learning framework, termed PIONEER (Protein-protein InteractiOn iNtErfacE pRediction), that predicts protein-binding partner-specific interfaces for all known protein interactions in humans and seven other common model organisms to generate comprehensive structurally informed protein interactomes. We demonstrate that PIONEER outperforms existing state-of-the-art methods and experimentally validate its predictions. We show that disease-associated mutations are enriched in PIONEER-predicted protein-protein interfaces and explore their impact on disease prognosis and drug responses. We identify 586 significant protein-protein interactions (PPIs) enriched with PIONEER-predicted interface somatic mutations (termed oncoPPIs) from analysis of approximately 11,000 whole exomes across 33 cancer types and show significant associations of oncoPPIs with patient survival and drug responses. PIONEER, implemented as both a web server platform and a software package, identifies functional consequences of disease-associated alleles and offers a deep learning tool for precision medicine at multiscale interactome network levels.
Grants
- R01GM124559 U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
- R01GM125639 U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
- R01GM130885 U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
- RM1GM139738 U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
- R01DK115398 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
- U01HG007691 U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute (NHGRI)
- R01HL155107 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01HL155096 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01HL166137 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U54HL119145 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- AHA957729 American Heart Association
- 24MERIT1185447 American Heart Association
- R01AG084250 U.S. Department of Health & Human Services | NIH | National Institute on Aging (NIA)
- R56AG074001 U.S. Department of Health & Human Services | NIH | National Institute on Aging (NIA)
- U01AG073323 U.S. Department of Health & Human Services | NIH | National Institute on Aging (NIA)
- R01AG066707 U.S. Department of Health & Human Services | NIH | National Institute on Aging (NIA)
- R01AG076448 U.S. Department of Health & Human Services | NIH | National Institute on Aging (NIA)
- R01AG082118 U.S. Department of Health & Human Services | NIH | National Institute on Aging (NIA)
- RF1AG082211 U.S. Department of Health & Human Services | NIH | National Institute on Aging (NIA)
- R21AG083003 U.S. Department of Health & Human Services | NIH | National Institute on Aging (NIA)
- RF1NS133812 U.S. Department of Health & Human Services | NIH | National Institute of Neurological Disorders and Stroke (NINDS)
Affiliation(s)
- Dapeng Xiong
  - Department of Computational Biology, Cornell University, Ithaca, NY, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
  - Center for Innovative Proteomics, Cornell University, Ithaca, NY, USA
- Yunguang Qiu
  - Cleveland Clinic Genome Center, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Junfei Zhao
  - Department of Systems Biology, Herbert Irving Comprehensive Center, Columbia University, New York, NY, USA
- Yadi Zhou
  - Cleveland Clinic Genome Center, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Dongjin Lee
  - Department of Computational Biology, Cornell University, Ithaca, NY, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Shobhita Gupta
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
  - Center for Innovative Proteomics, Cornell University, Ithaca, NY, USA
  - Biophysics Program, Cornell University, Ithaca, NY, USA
- Mateo Torres
  - Department of Computational Biology, Cornell University, Ithaca, NY, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
  - Center for Innovative Proteomics, Cornell University, Ithaca, NY, USA
- Weiqiang Lu
  - Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
- Siqi Liang
  - Department of Computational Biology, Cornell University, Ithaca, NY, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Jin Joo Kang
  - Department of Computational Biology, Cornell University, Ithaca, NY, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
  - Center for Innovative Proteomics, Cornell University, Ithaca, NY, USA
- Charis Eng
  - Cleveland Clinic Genome Center, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, USA
  - Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH, USA
- Joseph Loscalzo
  - Channing Division of Network Medicine, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Feixiong Cheng
  - Cleveland Clinic Genome Center, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA.
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA.
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, USA.
  - Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH, USA.
- Haiyuan Yu
  - Department of Computational Biology, Cornell University, Ithaca, NY, USA.
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA.
  - Center for Innovative Proteomics, Cornell University, Ithaca, NY, USA.
4. Faure AJ, Martí-Aranda A, Hidalgo-Carcedo C, Beltran A, Schmiedel JM, Lehner B. The genetic architecture of protein stability. Nature 2024; 634:995-1003. [PMID: 39322666 PMCID: PMC11499273 DOI: 10.1038/s41586-024-07966-0] [Received: 10/27/2023] [Accepted: 08/20/2024] [Indexed: 09/27/2024]
Abstract
There are more ways to synthesize a 100-amino acid (aa) protein (20^100) than there are atoms in the universe. Only a very small fraction of such a vast sequence space can ever be experimentally or computationally surveyed. Deep neural networks are increasingly being used to navigate high-dimensional sequence spaces [1]. However, these models are extremely complicated. Here, by experimentally sampling from sequence spaces larger than 10^10, we show that the genetic architecture of at least some proteins is remarkably simple, allowing accurate genetic prediction in high-dimensional sequence spaces with fully interpretable energy models. These models capture the nonlinear relationships between free energies and phenotypes but otherwise consist of additive free energy changes with a small contribution from pairwise energetic couplings. These energetic couplings are sparse and associated with structural contacts and backbone proximity. Our results indicate that protein genetics is actually both rather simple and intelligible.
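The size comparison in the abstract's opening sentence can be checked with a few lines of arithmetic: 20 possible residues at each of 100 positions gives 20 to the power 100 sequences. The figure of roughly 10^80 atoms in the observable universe used below is a commonly quoted order-of-magnitude estimate and is an assumption here; it does not come from the abstract.

```python
import math

ALPHABET = 20      # standard amino acids
LENGTH = 100       # residues in the example protein
ATOMS_LOG10 = 80   # assumed order of magnitude for atoms in the observable universe

# log10 of the number of possible sequences, i.e. log10(20^100)
seq_log10 = LENGTH * math.log10(ALPHABET)

# Python integers are arbitrary precision, so the exact count is also available
n_sequences = ALPHABET ** LENGTH

print(f"log10 of sequence count: {seq_log10:.3f}")
print(f"decimal digits in the exact count: {len(str(n_sequences))}")
print(f"orders of magnitude above the atom estimate: {seq_log10 - ATOMS_LOG10:.1f}")
```

By the same reasoning, a library of 10^10 variants, large as it is experimentally, covers only a vanishing fraction of this space.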
Affiliation(s)
- Andre J Faure
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain.
  - ALLOX, Barcelona, Spain.
- Aina Martí-Aranda
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Cristina Hidalgo-Carcedo
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Antoni Beltran
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Jörn M Schmiedel
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - factorize.bio, Berlin, Germany
- Ben Lehner
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain.
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain.
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.
5. Zhang Y, Leung AK, Kang JJ, Sun Y, Wu G, Li L, Sun J, Cheng L, Qiu T, Zhang J, Wierbowski S, Gupta S, Booth J, Yu H. A multiscale functional map of somatic mutations in cancer integrating protein structure and network topology. bioRxiv 2024:2023.03.06.531441. [PMID: 36945530 PMCID: PMC10028849 DOI: 10.1101/2023.03.06.531441] [Indexed: 03/09/2023]
Abstract
A major goal of cancer biology is to understand the mechanisms underlying tumorigenesis driven by somatically acquired mutations. Two distinct types of computational methodologies have emerged: one focuses on analyzing clustering of mutations within protein sequences and 3D structures, while the other characterizes mutations by leveraging the topology of protein-protein interaction network. Their insights are largely non-overlapping, offering complementary strengths. Here, we established a unified, end-to-end 3D structurally-informed protein interaction network propagation framework, NetFlow3D, that systematically maps the multiscale mechanistic effects of somatic mutations in cancer. The establishment of NetFlow3D hinges upon the Human Protein Structurome, a comprehensive repository we compiled that incorporates the 3D structures of every single protein as well as the binding interfaces of all known protein interactions in humans. NetFlow3D leverages the Structurome to integrate information across atomic, residue, protein and network levels: It conducts 3D clustering of mutations across atomic and residue levels on protein structures to identify potential driver mutations. It then anisotropically propagates their impacts across the protein interaction network, with propagation guided by the specific 3D structural interfaces involved, to identify significantly interconnected network "modules", thereby uncovering key biological processes underlying disease etiology. Applied to 1,038,899 somatic protein-altering mutations in 9,946 TCGA tumors across 33 cancer types, NetFlow3D identified 1,4444 significant 3D clusters throughout the Human Protein Structurome, of which ~55% would not have been found if using only experimentally-determined structures. It then identified 26 significantly interconnected modules that encompass ~8-fold more proteins than applying standard network analyses. NetFlow3D and our pan-cancer results can be accessed from http://netflow3d.yulab.org.
Affiliation(s)
- Yingying Zhang
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
  - Department of Molecular Biology and Genetics, Cornell University; Ithaca, 14853, USA
- Alden K. Leung
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
- Jin Joo Kang
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
- Yu Sun
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
- Guanxi Wu
  - College of Agriculture and Life Sciences, Cornell University; Ithaca, 14853, USA
- Le Li
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
- Jiayang Sun
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
- Lily Cheng
  - Department of Science and Technology Studies, Cornell University; Ithaca, 14853, USA
- Tian Qiu
  - School of Electrical and Computer Engineering, Cornell University; Ithaca, 14853, USA
- Junke Zhang
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
- Shayne Wierbowski
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
- Shagun Gupta
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
- James Booth
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Department of Statistics and Data Science, Cornell University; Ithaca, 14853, USA
- Haiyuan Yu
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
6. Omelchenko AA, Siwek JC, Chhibbar P, Arshad S, Nazarali I, Nazarali K, Rosengart A, Rahimikollu J, Tilstra J, Shlomchik MJ, Koes DR, Joglekar AV, Das J. Sliding Window INteraction Grammar (SWING): a generalized interaction language model for peptide and protein interactions. bioRxiv 2024:2024.05.01.592062. [PMID: 38746274 PMCID: PMC11092674 DOI: 10.1101/2024.05.01.592062] [Indexed: 05/16/2024]
Abstract
The explosion of sequence data has allowed the rapid growth of protein language models (pLMs). pLMs have now been employed in many frameworks including variant-effect and peptide-specificity prediction. Traditionally, for protein-protein or peptide-protein interactions (PPIs), corresponding sequences are either co-embedded followed by post-hoc integration or the sequences are concatenated prior to embedding. Interestingly, no method utilizes a language representation of the interaction itself. We developed an interaction LM (iLM), which uses a novel language to represent interactions between protein/peptide sequences. Sliding Window Interaction Grammar (SWING) leverages differences in amino acid properties to generate an interaction vocabulary. This vocabulary is the input into a LM followed by a supervised prediction step where the LM's representations are used as features. SWING was first applied to predicting peptide:MHC (pMHC) interactions. SWING was not only successful at generating Class I and Class II models that have comparable prediction to state-of-the-art approaches, but the unique Mixed Class model was also successful at jointly predicting both classes. Further, the SWING model trained only on Class I alleles was predictive for Class II, a complex prediction task not attempted by any existing approach. For de novo data, using only Class I or Class II data, SWING also accurately predicted Class II pMHC interactions in murine models of SLE (MRL/lpr model) and T1D (NOD model), that were validated experimentally. To further evaluate SWING's generalizability, we tested its ability to predict the disruption of specific protein-protein interactions by missense mutations. Although modern methods like AlphaMissense and ESM1b can predict interfaces and variant effects/pathogenicity per mutation, they are unable to predict interaction-specific disruptions. 
SWING was successful at accurately predicting the impact of both Mendelian mutations and population variants on PPIs. This is the first generalizable approach that can accurately predict interaction-specific disruptions by missense mutations with only sequence information. Overall, SWING is a first-in-class generalizable zero-shot iLM that learns the language of PPIs.
Affiliation(s)
- Alisa A. Omelchenko
  - Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, PA, USA
  - The joint CMU-Pitt PhD program in computational biology, School of Medicine, University of Pittsburgh, PA, USA
- Jane C. Siwek
  - Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, PA, USA
  - The joint CMU-Pitt PhD program in computational biology, School of Medicine, University of Pittsburgh, PA, USA
- Prabal Chhibbar
  - Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Integrative systems biology PhD program, School of Medicine, University of Pittsburgh, PA, USA
- Sanya Arshad
  - Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Iliyan Nazarali
  - Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Kiran Nazarali
  - Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- AnnaElaine Rosengart
  - Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Javad Rahimikollu
  - Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, PA, USA
  - The joint CMU-Pitt PhD program in computational biology, School of Medicine, University of Pittsburgh, PA, USA
- Jeremy Tilstra
  - Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Division of Rheumatology and Clinical Immunology, Department of Medicine, School of Medicine, University of Pittsburgh, PA, USA
- Mark J. Shlomchik
  - Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- David R. Koes
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, PA, USA
- Alok V. Joglekar
  - Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, PA, USA
- Jishnu Das
  - Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, PA, USA
7. Xiong D, Qiu Y, Zhao J, Zhou Y, Lee D, Gupta S, Torres M, Lu W, Liang S, Kang JJ, Eng C, Loscalzo J, Cheng F, Yu H. Structurally-informed human interactome reveals proteome-wide perturbations by disease mutations. bioRxiv 2024:2023.04.24.538110. [PMID: 37162909 PMCID: PMC10168245 DOI: 10.1101/2023.04.24.538110] [Indexed: 05/11/2023]
Abstract
Human genome sequencing studies have identified numerous loci associated with complex diseases. However, translating human genetic and genomic findings to disease pathobiology and therapeutic discovery remains a major challenge at multiscale interactome network levels. Here, we present a deep-learning-based ensemble framework, termed PIONEER (Protein-protein InteractiOn iNtErfacE pRediction), that accurately predicts protein binding partner-specific interfaces for all known protein interactions in humans and seven other common model organisms, generating comprehensive structurally-informed protein interactomes. We demonstrate that PIONEER outperforms existing state-of-the-art methods. We further systematically validated PIONEER predictions experimentally through generating 2,395 mutations and testing their impact on 6,754 mutation-interaction pairs, confirming the high quality and validity of PIONEER predictions. We show that disease-associated mutations are enriched in PIONEER-predicted protein-protein interfaces after mapping mutations from ~60,000 germline exomes and ~36,000 somatic genomes. We identify 586 significant protein-protein interactions (PPIs) enriched with PIONEER-predicted interface somatic mutations (termed oncoPPIs) from pan-cancer analysis of ~11,000 tumor whole-exomes across 33 cancer types. We show that PIONEER-predicted oncoPPIs are significantly associated with patient survival and drug responses from both cancer cell lines and patient-derived xenograft mouse models. We identify a landscape of PPI-perturbing tumor alleles upon ubiquitination by E3 ligases, and we experimentally validate the tumorigenic KEAP1-NRF2 interface mutation p.Thr80Lys in non-small cell lung cancer. 
We show that PIONEER-predicted PPI-perturbing alleles alter protein abundance and correlate with drug responses and patient survival in colon and uterine cancers, as demonstrated by proteogenomic data from the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium. PIONEER, implemented as both a web server platform and a software package, identifies functional consequences of disease-associated alleles and offers a deep learning tool for precision medicine at multiscale interactome network levels.
Affiliation(s)
- Dapeng Xiong
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
- Yunguang Qiu
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Junfei Zhao
- Department of Systems Biology, Herbert Irving Comprehensive Center, Columbia University, New York, NY 10032, USA
- Yadi Zhou
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Dongjin Lee
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Shobhita Gupta
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
- Biophysics Program, Cornell University, Ithaca, NY 14853, USA
- Mateo Torres
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
- Weiqiang Lu
- Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai 200241, China
- Siqi Liang
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Jin Joo Kang
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
- Charis Eng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Joseph Loscalzo
- Channing Division of Network Medicine, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Feixiong Cheng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Center for Innovative Proteomics, Cornell University, Ithaca, NY 14853, USA
|
8
|
Ding X, Singh P, Schimenti K, Tran TN, Fragoza R, Hardy J, Orwig KE, Olszewska M, Kurpisz MK, Yatsenko AN, Conrad DF, Yu H, Schimenti JC. In vivo versus in silico assessment of potentially pathogenic missense variants in human reproductive genes. Proc Natl Acad Sci U S A 2023; 120:e2219925120. [PMID: 37459509 PMCID: PMC10372637 DOI: 10.1073/pnas.2219925120] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/22/2022] [Accepted: 05/25/2023] [Indexed: 07/20/2023] Open
Abstract
Infertility is a heterogeneous condition, with genetic causes thought to underlie a substantial fraction of cases. Genome sequencing is becoming increasingly important for genetic diagnosis of diseases including idiopathic infertility; however, most rare or minor alleles identified in patients are variants of uncertain significance (VUS). Interpreting the functional impacts of VUS is challenging but profoundly important for clinical management and genetic counseling. To determine the consequences of these variants in key fertility genes, we functionally evaluated 11 missense variants in the genes ANKRD31, BRDT, DMC1, EXO1, FKBP6, MCM9, M1AP, MEI1, MSH4 and SEPT12 by generating genome-edited mouse models. Nine variants were classified as deleterious by most functional prediction algorithms, and two disrupted a protein-protein interaction (PPI) in the yeast two hybrid (Y2H) assay. Though these genes are essential for normal meiosis or spermiogenesis in mice, only one variant, observed in the MCM9 gene of a male infertility patient, compromised fertility or gametogenesis in the mouse models. To explore the disconnect between predictions and outcomes, we compared pathogenicity calls of missense variants made by ten widely used algorithms to 1) those annotated in ClinVar and 2) those evaluated in mice. All the algorithms performed poorly in terms of predicting the effects of human missense variants modeled in mice. These studies emphasize caution in the genetic diagnoses of infertile patients based primarily on pathogenicity prediction algorithms and emphasize the need for alternative and efficient in vitro or in vivo functional validation models for more effective and accurate VUS description to either pathogenic or benign categories.
Affiliation(s)
- Xinbao Ding
- College of Veterinary Medicine, Department of Biomedical Sciences, Cornell University, Ithaca, NY 14853
- Priti Singh
- College of Veterinary Medicine, Department of Biomedical Sciences, Cornell University, Ithaca, NY 14853
- Kerry Schimenti
- College of Veterinary Medicine, Department of Biomedical Sciences, Cornell University, Ithaca, NY 14853
- Tina N. Tran
- College of Veterinary Medicine, Department of Biomedical Sciences, Cornell University, Ithaca, NY 14853
- Robert Fragoza
- Department of Computational Biology, Cornell University, Ithaca, NY 14853
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853
- Jimmaline Hardy
- School of Medicine, Department of Obstetrics, Gynecology, and Reproductive Sciences, Magee-Womens Research Institute, University of Pittsburgh, Pittsburgh, PA 15213
- Kyle E. Orwig
- School of Medicine, Department of Obstetrics, Gynecology, and Reproductive Sciences, Magee-Womens Research Institute, University of Pittsburgh, Pittsburgh, PA 15213
- Marta Olszewska
- Institute of Human Genetics, Polish Academy of Sciences, Poznan 60-479, Poland
- Maciej K. Kurpisz
- Institute of Human Genetics, Polish Academy of Sciences, Poznan 60-479, Poland
- Alexander N. Yatsenko
- School of Medicine, Department of Obstetrics, Gynecology, and Reproductive Sciences, Magee-Womens Research Institute, University of Pittsburgh, Pittsburgh, PA 15213
- Donald F. Conrad
- Oregon Health & Science University, Division of Genetics, Oregon National Primate Research Center, Beaverton, OR 97006
- Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY 14853
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853
- John C. Schimenti
- College of Veterinary Medicine, Department of Biomedical Sciences, Cornell University, Ithaca, NY 14853
|
9
|
Llargués-Sistac G, Bonjoch L, Castellvi-Bel S. HAP1, a new revolutionary cell model for gene editing using CRISPR-Cas9. Front Cell Dev Biol 2023; 11:1111488. [PMID: 36936678 PMCID: PMC10020200 DOI: 10.3389/fcell.2023.1111488] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/29/2022] [Accepted: 02/22/2023] [Indexed: 03/06/2023] Open
Abstract
The use of next-generation sequencing (NGS) technologies has been instrumental in the characterization of the mutational landscape of complex human diseases like cancer. But despite the enormous rise in the identification of disease candidate genetic variants, their functionality is yet to be fully elucidated in order to have a clear implication in patient care. Haploid human cell models have become the tool of choice for functional gene studies, since they only contain one copy of the genome and can therefore show the unmasked phenotype of genetic variants. Over the past few years, the human near-haploid cell line HAP1 has widely been consolidated as one of the favorite cell line models for functional genetic studies. Its rapid turnover coupled with the fact that only one allele needs to be modified in order to express the subsequent desired phenotype has made this human cell line a valuable tool for gene editing by CRISPR-Cas9 technologies. This review examines the recent uses of the HAP1 cell line model in functional genetic studies and high-throughput genetic screens using the CRISPR-Cas9 system. It covers its use in an attempt to develop new and relevant disease models to further elucidate gene function, and create new ways to understand the genetic basis of human diseases. We will cover the advantages and potential of the use of CRISPR-Cas9 technology on HAP1 to easily and efficiently study the functional interpretation of gene function and human single-nucleotide genetic variants of unknown significance identified through NGS technologies, and its implications for changes in clinical practice and patient care.
Affiliation(s)
- Gemma Llargués-Sistac
- Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Gastroenterology Department, Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Barcelona, Spain
- Sergi Castellvi-Bel
- Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Gastroenterology Department, Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Barcelona, Spain
|
10
|
Ozturk K, Carter H. Predicting functional consequences of mutations using molecular interaction network features. Hum Genet 2022; 141:1195-1210. [PMID: 34432150 PMCID: PMC8873243 DOI: 10.1007/s00439-021-02329-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/20/2021] [Accepted: 07/31/2021] [Indexed: 12/13/2022]
Abstract
Variant interpretation remains a central challenge for precision medicine. Missense variants are particularly difficult to understand as they change only a single amino acid in a protein sequence yet can have large and varied effects on protein activity. Numerous tools have been developed to identify missense variants with putative disease consequences from protein sequence and structure. However, biological function arises through higher order interactions among proteins and molecules within cells. We therefore sought to capture information about the potential of missense mutations to perturb protein interaction networks by integrating protein structure and interaction data. We developed 16 network-based annotations for missense mutations that provide orthogonal information to features classically used to prioritize variants. We then evaluated them in the context of a proven machine-learning framework for variant effect prediction across multiple benchmark datasets to demonstrate their potential to improve variant classification. Interestingly, network features resulted in larger performance gains for classifying somatic mutations than for germline variants, possibly due to different constraints on what mutations are tolerated at the cellular versus organismal level. Our results suggest that modeling variant potential to perturb context-specific interactome networks is a fruitful strategy to advance in silico variant effect prediction.
Affiliation(s)
- Kivilcim Ozturk
- Division of Medical Genetics, Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, USA
- Hannah Carter
- Division of Medical Genetics, Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, USA
- Moores Cancer Center, University of California San Diego, La Jolla, CA, USA
|
11
|
Faure AJ, Domingo J, Schmiedel JM, Hidalgo-Carcedo C, Diss G, Lehner B. Mapping the energetic and allosteric landscapes of protein binding domains. Nature 2022; 604:175-183. [PMID: 35388192 DOI: 10.1038/s41586-022-04586-4] [Citation(s) in RCA: 124] [Impact Index Per Article: 41.3] [Received: 09/14/2021] [Accepted: 02/25/2022] [Indexed: 11/09/2022]
Abstract
Allosteric communication between distant sites in proteins is central to biological regulation but still poorly characterized, limiting understanding, engineering and drug development (refs. 1-6). An important reason for this is the lack of methods to comprehensively quantify allostery in diverse proteins. Here we address this shortcoming and present a method that uses deep mutational scanning to globally map allostery. The approach uses an efficient experimental design to infer en masse the causal biophysical effects of mutations by quantifying multiple molecular phenotypes (here we examine binding and protein abundance) in multiple genetic backgrounds and fitting thermodynamic models using neural networks. We apply the approach to two of the most common protein interaction domains found in humans, an SH3 domain and a PDZ domain, to produce comprehensive atlases of allosteric communication. Allosteric mutations are abundant, with a large mutational target space of network-altering 'edgetic' variants. Mutations are more likely to be allosteric closer to binding interfaces, at glycine residues and at specific residues connecting to an opposite surface within the PDZ domain. This general approach of quantifying mutational effects for multiple molecular phenotypes and in multiple genetic backgrounds should enable the energetic and allosteric landscapes of many proteins to be rapidly and comprehensively mapped.
Affiliation(s)
- Andre J Faure
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Júlia Domingo
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- New York Genome Center (NYGC), New York, NY, USA
- Jörn M Schmiedel
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Cristina Hidalgo-Carcedo
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Guillaume Diss
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- Ben Lehner
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
|
12
|
Kunowska N, Stelzl U. Decoding the cellular effects of genetic variation through interaction proteomics. Curr Opin Chem Biol 2022; 66:102100. [PMID: 34801969 DOI: 10.1016/j.cbpa.2021.102100] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/16/2021] [Revised: 10/07/2021] [Accepted: 10/14/2021] [Indexed: 12/24/2022]
Abstract
It is often unclear how genetic variation translates into cellular phenotypes, including how much of the coding variation can be recovered in the proteome. Proteogenomic analyses of heterogeneous cell lines revealed that the genetic differences impact mostly the abundance and stoichiometry of protein complexes, with the effects propagating post-transcriptionally via protein interactions onto other subunits. Conversely, large-scale binary interaction analyses of missense variants revealed that loss of interaction is widespread and caused by about 50% of disease-associated mutations, while deep scanning mutagenesis of binary interactions identified thousands of interaction-deficient variants per interaction. The idea that phenotypes arise from genetic variation through protein-protein interaction is therefore substantiated by both forward and reverse interaction proteomics. With improved methodologies, these two approaches combined can close the knowledge gap between nucleotide sequence variation and its functional consequences on the cellular proteome.
Affiliation(s)
- Natalia Kunowska
- Institute of Pharmaceutical Sciences, Pharmaceutical Chemistry, University of Graz, Austria
- Ulrich Stelzl
- Institute of Pharmaceutical Sciences, Pharmaceutical Chemistry, University of Graz, Austria
- BioTechMed-Graz, Austria
- Field of Excellence BioHealth - University of Graz, Austria
|
13
|
Chen S, Liu Y, Zhang Y, Wierbowski SD, Lipkin SM, Wei X, Yu H. A full-proteome, interaction-specific characterization of mutational hotspots across human cancers. Genome Res 2022; 32:135-149. [PMID: 34963661 PMCID: PMC8744679 DOI: 10.1101/gr.275437.121] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/24/2021] [Accepted: 11/22/2021] [Indexed: 11/24/2022]
Abstract
Rapid accumulation of cancer genomic data has led to the identification of an increasing number of mutational hotspots with uncharacterized significance. Here we present a biologically informed computational framework that characterizes the functional relevance of all 1107 published mutational hotspots identified in approximately 25,000 tumor samples across 41 cancer types in the context of a human 3D interactome network, in which the interface of each interaction is mapped at residue resolution. Hotspots reside in network hub proteins and are enriched on protein interaction interfaces, suggesting that alteration of specific protein-protein interactions is critical for the oncogenicity of many hotspot mutations. Our framework enables, for the first time, systematic identification of specific protein interactions affected by hotspot mutations at the full proteome scale. Furthermore, by constructing a hotspot-affected network that connects all hotspot-affected interactions throughout the whole-human interactome, we uncover genome-wide relationships among hotspots and implicate novel cancer proteins that do not harbor hotspot mutations themselves. Moreover, applying our network-based framework to specific cancer types identifies clinically significant hotspots that can be used for prognosis and therapy targets. Overall, we show that our framework bridges the gap between the statistical significance of mutational hotspots and their biological and clinical significance in human cancers.
Affiliation(s)
- Siwei Chen
- Department of Computational Biology, Cornell University, Ithaca, New York 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York 14853, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
- Yuan Liu
- Department of Computational Biology, Cornell University, Ithaca, New York 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York 14853, USA
- Yingying Zhang
- Department of Computational Biology, Cornell University, Ithaca, New York 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York 14853, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
- Shayne D Wierbowski
- Department of Computational Biology, Cornell University, Ithaca, New York 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York 14853, USA
- Steven M Lipkin
- Department of Medicine, Weill Cornell Medicine, New York, New York 10021, USA
- Xiaomu Wei
- Department of Computational Biology, Cornell University, Ithaca, New York 14853, USA
- Department of Medicine, Weill Cornell Medicine, New York, New York 10021, USA
- Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, New York 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York 14853, USA
|
14
|
A 3D structural SARS-CoV-2-human interactome to explore genetic and drug perturbations. Nat Methods 2021; 18:1477-1488. [PMID: 34845387 PMCID: PMC8665054 DOI: 10.1038/s41592-021-01318-w] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/18/2020] [Accepted: 10/05/2021] [Indexed: 01/08/2023]
Abstract
Emergence of new viral agents is driven by evolution of interactions between viral proteins and host targets. For instance, increased infectivity of SARS-CoV-2 compared to SARS-CoV-1 arose in part through rapid evolution along the interface between the spike protein and its human receptor ACE2, leading to increased binding affinity. To facilitate broader exploration of how pathogen-host interactions might impact transmission and virulence in the ongoing COVID-19 pandemic, we performed state-of-the-art interface prediction followed by molecular docking to construct a three-dimensional structural interactome between SARS-CoV-2 and human. We additionally carried out downstream meta-analyses to investigate enrichment of sequence divergence between SARS-CoV-1 and SARS-CoV-2 or human population variants along viral-human protein-interaction interfaces, predict changes in binding affinity by these mutations/variants and further prioritize drug repurposing candidates predicted to competitively bind human targets. We believe this resource (http://3D-SARS2.yulab.org) will aid in development and testing of informed hypotheses for SARS-CoV-2 etiology and treatments.
|
15
|
Singh P, Fragoza R, Blengini CS, Tran TN, Pannafino G, Al-Sweel N, Schimenti KJ, Schindler K, Alani EA, Yu H, Schimenti JC. Human MLH1/3 variants causing aneuploidy, pregnancy loss, and premature reproductive aging. Nat Commun 2021; 12:5005. [PMID: 34408140 PMCID: PMC8373927 DOI: 10.1038/s41467-021-25028-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/22/2021] [Accepted: 07/20/2021] [Indexed: 01/12/2023] Open
Abstract
Embryonic aneuploidy from mis-segregation of chromosomes during meiosis causes pregnancy loss. Proper disjunction of homologous chromosomes requires the mismatch repair (MMR) genes MLH1 and MLH3, essential in mice for fertility. Variants in these genes can increase colorectal cancer risk, yet the reproductive impacts are unclear. To determine if MLH1/3 single nucleotide polymorphisms (SNPs) in human populations could cause reproductive abnormalities, we use computational predictions, yeast two-hybrid assays, and MMR and recombination assays in yeast, selecting nine MLH1 and MLH3 variants to model in mice via genome editing. We identify seven alleles causing reproductive defects in mice including female subfertility and male infertility. Remarkably, in females these alleles cause age-dependent decreases in litter size and increased embryo resorption, likely a consequence of fewer chiasmata that increase univalents at meiotic metaphase I. Our data suggest that hypomorphic alleles of meiotic recombination genes can predispose females to increased incidence of pregnancy loss from gamete aneuploidy. Proper meiotic chromosome segregation requires mismatch repair genes MLH1 and MLH3, of which variants occur in the human population. Here, the authors use computational predictions and yeast assays to select human MLH1/3 variants for modelling in mice, observing reproductive defects from abnormal levels of crossing over.
Affiliation(s)
- Priti Singh
- Dept of Biomedical Sciences, Cornell University College of Veterinary Medicine, Ithaca, NY, USA
- Preclinical Modeling Core Lab, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Robert Fragoza
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Tina N Tran
- Dept of Biomedical Sciences, Cornell University College of Veterinary Medicine, Ithaca, NY, USA
- Gianno Pannafino
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Najla Al-Sweel
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Kerry J Schimenti
- Dept of Biomedical Sciences, Cornell University College of Veterinary Medicine, Ithaca, NY, USA
- Eric A Alani
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Haiyuan Yu
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- John C Schimenti
- Dept of Biomedical Sciences, Cornell University College of Veterinary Medicine, Ithaca, NY, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
|
16
|
Moesslacher CS, Kohlmayr JM, Stelzl U. Exploring absent protein function in yeast: assaying post translational modification and human genetic variation. Microb Cell 2021; 8:164-183. [PMID: 34395585 PMCID: PMC8329848 DOI: 10.15698/mic2021.08.756] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/08/2021] [Revised: 06/13/2021] [Accepted: 06/18/2021] [Indexed: 01/08/2023]
Abstract
Yeast is a valuable eukaryotic model organism that has evolved many processes conserved up to humans, yet many protein functions, including certain DNA and protein modifications, are absent. It is this absence of protein function that is fundamental to approaches using yeast as an in vivo test system to investigate human proteins. Functionality of the heterologously expressed proteins is connected to a quantitative, selectable phenotype, enabling the systematic analysis of mechanisms and specificity of DNA modification, post-translational protein modifications as well as the impact of annotated cancer mutations and coding variation on protein activity and interaction. Through continuous improvements of yeast screening systems, this is increasingly carried out on a global scale using deep mutational scanning approaches. Here we discuss the applicability of yeast systems to investigate absent human protein function with a specific focus on the impact of protein variation on protein-protein interaction modulation.
Affiliation(s)
- Christina S Moesslacher
- Institute of Pharmaceutical Sciences and BioTechMed-Graz, University of Graz, Graz, Austria
- Contributed equally to the writing of this review
- Johanna M Kohlmayr
- Institute of Pharmaceutical Sciences and BioTechMed-Graz, University of Graz, Graz, Austria
- Contributed equally to the writing of this review
- Ulrich Stelzl
- Institute of Pharmaceutical Sciences and BioTechMed-Graz, University of Graz, Graz, Austria
- Contributed equally to the writing of this review
|
17
|
Lanz MC, Yugandhar K, Gupta S, Sanford EJ, Faça VM, Vega S, Joiner AMN, Fromme JC, Yu H, Smolka MB. In-depth and 3-dimensional exploration of the budding yeast phosphoproteome. EMBO Rep 2021; 22:e51121. [PMID: 33491328 PMCID: PMC7857435 DOI: 10.15252/embr.202051121] [Citation(s) in RCA: 91] [Impact Index Per Article: 22.8] [Received: 06/17/2020] [Revised: 11/30/2020] [Accepted: 12/03/2020] [Indexed: 01/11/2023] Open
Abstract
Phosphorylation is one of the most dynamic and widespread post-translational modifications regulating virtually every aspect of eukaryotic cell biology. Here, we assemble a dataset from 75 independent phosphoproteomic experiments performed in our laboratory using Saccharomyces cerevisiae. We report 30,902 phosphosites identified from cells cultured in a range of DNA damage conditions and/or arrested in distinct cell cycle stages. To generate a comprehensive resource for the budding yeast community, we aggregate our dataset with the Saccharomyces Genome Database and another recently published study, resulting in over 46,000 budding yeast phosphosites. With the goal of enhancing the identification of functional phosphorylation events, we perform computational positioning of phosphorylation sites on available 3D protein structures and systematically identify events predicted to regulate protein complex architecture. Results reveal hundreds of phosphorylation sites mapping to or near protein interaction interfaces, many of which result in steric or electrostatic "clashes" predicted to disrupt the interaction. With the advancement of Cryo-EM and the increasing number of available structures, our approach should help drive the functional and spatial exploration of the phosphoproteome.
Collapse
Affiliation(s)
- Michael C Lanz
- Department of Molecular Biology and Genetics, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Present address: Department of Biology, Stanford University, Stanford, CA, USA
| | - Kumar Yugandhar
- Department of Computational Biology, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| | - Shagun Gupta
- Department of Computational Biology, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| | - Ethan J Sanford
- Department of Molecular Biology and Genetics, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| | - Vitor M Faça
- Department of Molecular Biology and Genetics, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| | - Stephanie Vega
- Department of Molecular Biology and Genetics, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| | - Aaron M N Joiner
- Department of Molecular Biology and Genetics, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| | - J Christopher Fromme
- Department of Molecular Biology and Genetics, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| | - Haiyuan Yu
- Department of Computational Biology, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| | - Marcus B Smolka
- Department of Molecular Biology and Genetics, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| |
|
18
|
Ding X, Schimenti JC. Strategies to Identify Genetic Variants Causing Infertility. Trends Mol Med 2021; 27:792-806. [PMID: 33431240 DOI: 10.1016/j.molmed.2020.12.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/27/2020] [Revised: 11/26/2020] [Accepted: 12/11/2020] [Indexed: 12/19/2022]
Abstract
Genetic causes are thought to underlie about half of infertility cases, but understanding the genetic bases has been a major challenge. Modern genomics tools allow more sophisticated exploration of genetic causes of infertility through population, family-based, and individual studies. Nevertheless, potential therapies based on genetic diagnostics will be limited until there is certainty regarding the causality of genetic variants identified in an individual. Genome modulation and editing technologies have revolutionized our ability to functionally test such variants, and also provide a potential means for clinical correction of infertility variants. This review addresses strategies being used to identify causative variants of infertility.
Affiliation(s)
- Xinbao Ding
- Cornell University, College of Veterinary Medicine, Department of Biomedical Sciences, Ithaca, NY 14853, USA
| | - John C Schimenti
- Cornell University, College of Veterinary Medicine, Department of Biomedical Sciences, Ithaca, NY 14853, USA.
| |
|
19
|
Tippens ND, Liang J, Leung AKY, Wierbowski SD, Ozer A, Booth JG, Lis JT, Yu H. Transcription imparts architecture, function and logic to enhancer units. Nat Genet 2020; 52:1067-1075. [PMID: 32958950 PMCID: PMC7541647 DOI: 10.1038/s41588-020-0686-2] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 08/12/2019] [Accepted: 07/28/2020] [Indexed: 01/09/2023]
Abstract
Distal enhancers play pivotal roles in development and disease yet remain one of the least understood regulatory elements. We used massively parallel reporter assays to perform functional comparisons of two leading enhancer models and find that gene-distal transcription start sites are robust predictors of active enhancers with higher resolution than histone modifications. We show that active enhancer units are precisely delineated by active transcription start sites, validate that these boundaries are sufficient for capturing enhancer function, and confirm that core promoter sequences are necessary for this activity. We assay adjacent enhancers and find that their joint activity is often driven by the stronger unit within the cluster. Finally, we validate these results through functional dissection of a distal enhancer cluster using CRISPR-Cas9 deletions. In summary, definition of high-resolution enhancer boundaries enables deconvolution of complex regulatory loci into modular units.
Affiliation(s)
- Nathaniel D Tippens
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Tri-Institutional Training Program in Computational Biology and Medicine, Cornell University, Ithaca, NY, USA
| | - Jin Liang
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| | - Alden King-Yung Leung
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
| | - Shayne D Wierbowski
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
| | - Abdullah Ozer
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
| | - James G Booth
- Department of Statistics and Data Science, Cornell University, Ithaca, NY, USA
| | - John T Lis
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA.
- Tri-Institutional Training Program in Computational Biology and Medicine, Cornell University, Ithaca, NY, USA.
| | - Haiyuan Yu
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA.
- Department of Computational Biology, Cornell University, Ithaca, NY, USA.
- Tri-Institutional Training Program in Computational Biology and Medicine, Cornell University, Ithaca, NY, USA.
| |
|
20
|
Zhang ZD, Milman S, Lin JR, Wierbowski S, Yu H, Barzilai N, Gorbunova V, Ladiges WC, Niedernhofer LJ, Suh Y, Robbins PD, Vijg J. Genetics of extreme human longevity to guide drug discovery for healthy ageing. Nat Metab 2020; 2:663-672. [PMID: 32719537 PMCID: PMC7912776 DOI: 10.1038/s42255-020-0247-0] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 08/12/2019] [Accepted: 06/22/2020] [Indexed: 02/07/2023]
Abstract
Ageing is the greatest risk factor for most common chronic human diseases, and it therefore is a logical target for developing interventions to prevent, mitigate or reverse multiple age-related morbidities. Over the past two decades, genetic and pharmacologic interventions targeting conserved pathways of growth and metabolism have consistently led to substantial extension of the lifespan and healthspan in model organisms as diverse as nematodes, flies and mice. Recent genetic analysis of long-lived individuals is revealing common and rare variants enriched in these same conserved pathways that significantly correlate with longevity. In this Perspective, we summarize recent insights into the genetics of extreme human longevity and propose the use of this rare phenotype to identify genetic variants as molecular targets for gaining insight into the physiology of healthy ageing and the development of new therapies to extend the human healthspan.
Affiliation(s)
- Zhengdong D Zhang
- Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA.
| | - Sofiya Milman
- Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Department of Medicine, Albert Einstein College of Medicine, New York, NY, USA
| | - Jhih-Rong Lin
- Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
| | - Shayne Wierbowski
- Department of Computational Biology, Weill Institute for Cell and Molecular Biology, Cornell University, New York, NY, USA
| | - Haiyuan Yu
- Department of Computational Biology, Weill Institute for Cell and Molecular Biology, Cornell University, New York, NY, USA
| | - Nir Barzilai
- Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Department of Medicine, Albert Einstein College of Medicine, New York, NY, USA
| | - Vera Gorbunova
- Department of Biology, University of Rochester, Rochester, NY, USA
| | - Warren C Ladiges
- Department of Comparative Medicine, School of Medicine, University of Washington, Seattle, WA, USA
| | - Laura J Niedernhofer
- Institute on the Biology of Aging and Metabolism and Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN, USA
| | - Yousin Suh
- Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Departments of Obstetrics and Gynecology, Genetics and Development, Columbia University, New York, NY, USA
| | - Paul D Robbins
- Institute on the Biology of Aging and Metabolism and Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN, USA
| | - Jan Vijg
- Department of Genetics, Albert Einstein College of Medicine, New York, NY, USA
- Center for Single-Cell Omics in Aging and Disease, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| |
|
21
|
Bonjoch L, Franch-Expósito S, Garre P, Belhadj S, Muñoz J, Arnau-Collell C, Díaz-Gay M, Gratacós-Mulleras A, Raimondi G, Esteban-Jurado C, Soares de Lima Y, Herrera-Pariente C, Cuatrecasas M, Ocaña T, Castells A, Fillat C, Capellá G, Balaguer F, Caldés T, Valle L, Castellví-Bel S. Germline Mutations in FAF1 Are Associated With Hereditary Colorectal Cancer. Gastroenterology 2020; 159:227-240.e7. [PMID: 32179092 DOI: 10.1053/j.gastro.2020.03.015] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 08/30/2019] [Revised: 02/19/2020] [Accepted: 03/08/2020] [Indexed: 01/03/2023]
Abstract
BACKGROUND & AIMS: A significant proportion of colorectal cancer (CRC) cases have familial aggregation, but little is known about the genetic factors that contribute to these cases. We performed an exhaustive functional characterization of genetic variants associated with familial CRC. METHODS: We performed whole-exome sequencing analyses of 75 patients from 40 families with a history of CRC (including early-onset cases) of an unknown germline basis (discovery cohort). We also sequenced specific genes in DNA from an external replication cohort of 473 families, including 488 patients with colorectal tumors that had normal expression of mismatch repair proteins (validation cohort). We disrupted the Fas-associated factor 1 gene (FAF1) in DLD-1 CRC cells using CRISPR/Cas9 gene editing; some cells were transfected with plasmids that express FAF1 missense variants. Cells were analyzed by immunoblots, quantitative real-time polymerase chain reaction, and functional assays monitoring apoptosis, proliferation, and assays for Wnt signaling or nuclear factor (NF)-kappa-B activity. RESULTS: We identified a predicted pathogenic variant in the FAF1 gene (c.1111G>A; p.Asp371Asn) in the discovery cohort; it was present in 4 patients of the same family. We identified a second variant in FAF1 in the validation cohort (c.254G>C; p.Arg85Pro). Both variants encoded unstable FAF1 proteins. Expression of these variants in CRC cells caused them to become resistant to apoptosis, accumulate beta-catenin in the cytoplasm, and translocate NF-kappa-B to the nucleus. CONCLUSIONS: In whole-exome sequencing analyses of patients from families with a history of CRC, we identified variants in FAF1 that associate with development of CRC. These variants encode unstable forms of FAF1 that increase resistance of CRC cells to apoptosis and increase activity of beta-catenin and NF-kappa-B.
Affiliation(s)
- Laia Bonjoch
- Gastroenterology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Universitat de Barcelona, Barcelona, Spain
| | - Sebastià Franch-Expósito
- Gastroenterology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Universitat de Barcelona, Barcelona, Spain
| | - Pilar Garre
- Molecular Oncology Laboratory, Centro Investigación Biomédica en Red de Cáncer (CIBERONC). Hospital Clínico San Carlos. Instituto de Investigación Sanitaria San Carlos (IdISSC), Madrid, Spain
| | - Sami Belhadj
- Hereditary Cancer Program, Catalan Institute of Oncology, Oncobell, Institut d'Investigació Biomèdica de Bellvitge (IDIBELL), Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Barcelona, Spain
| | - Jenifer Muñoz
- Gastroenterology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Universitat de Barcelona, Barcelona, Spain
| | - Coral Arnau-Collell
- Gastroenterology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Universitat de Barcelona, Barcelona, Spain
| | - Marcos Díaz-Gay
- Gastroenterology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Universitat de Barcelona, Barcelona, Spain
| | - Anna Gratacós-Mulleras
- Gastroenterology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Universitat de Barcelona, Barcelona, Spain
| | - Giulia Raimondi
- Gene Therapy and Cancer, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Universitat de Barcelona, Barcelona, Spain
| | - Clara Esteban-Jurado
- Gastroenterology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Universitat de Barcelona, Barcelona, Spain
| | - Yasmin Soares de Lima
- Gastroenterology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Universitat de Barcelona, Barcelona, Spain
| | - Cristina Herrera-Pariente
- Gastroenterology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Universitat de Barcelona, Barcelona, Spain
| | - Miriam Cuatrecasas
- Pathology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD) and Tumor Bank-Biobank, Hospital Clínic, Barcelona, Spain
| | - Teresa Ocaña
- Gastroenterology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Universitat de Barcelona, Barcelona, Spain
| | - Antoni Castells
- Gastroenterology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Universitat de Barcelona, Barcelona, Spain
| | - Cristina Fillat
- Gene Therapy and Cancer, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Universitat de Barcelona, Barcelona, Spain
| | - Gabriel Capellá
- Hereditary Cancer Program, Catalan Institute of Oncology, Oncobell, Institut d'Investigació Biomèdica de Bellvitge (IDIBELL), Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Barcelona, Spain
| | - Francesc Balaguer
- Gastroenterology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Universitat de Barcelona, Barcelona, Spain
| | - Trinidad Caldés
- Molecular Oncology Laboratory, Centro Investigación Biomédica en Red de Cáncer (CIBERONC). Hospital Clínico San Carlos. Instituto de Investigación Sanitaria San Carlos (IdISSC), Madrid, Spain
| | - Laura Valle
- Hereditary Cancer Program, Catalan Institute of Oncology, Oncobell, Institut d'Investigació Biomèdica de Bellvitge (IDIBELL), Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Barcelona, Spain
| | - Sergi Castellví-Bel
- Gastroenterology Department, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Hospital Clínic, Universitat de Barcelona, Barcelona, Spain.
| |
|
22
|
Yadav A, Vidal M, Luck K. Precision medicine - networks to the rescue. Curr Opin Biotechnol 2020; 63:177-189. [PMID: 32199228 PMCID: PMC7308189 DOI: 10.1016/j.copbio.2020.02.005] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 12/31/2019] [Accepted: 02/13/2020] [Indexed: 12/11/2022]
Abstract
Genetic variants are often not predictive of the phenotypic outcome. Individuals carrying the same pathogenic variant, associated with Mendelian or complex disease, can manifest to different extents, from severe-to-mild to no disease. Improving the accuracy of predicted clinical manifestations of genetic variants has emerged as one of the biggest challenges in precision medicine, which can only be addressed by understanding the mechanisms underlying genotype-phenotype relationships. Efforts to understand the molecular basis of these relationships have identified complex systems of interacting biomolecules that underlie cellular function. Here, we review recent advances in how modeling cellular systems as networks of interacting proteins has fueled identification of disease-associated processes, delineation of underlying molecular mechanisms, and prediction of the pathogenicity of variants. This review is intended to be inspiring for clinicians, geneticists, and network biologists alike who aim to jointly advance our understanding of human disease and accelerate progress toward precision medicine.
Affiliation(s)
- Anupama Yadav
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
| | - Marc Vidal
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
| | - Katja Luck
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA; Current address: Institute of Molecular Biology, Mainz, Germany.
| |
|
23
|
Pahari S, Li G, Murthy AK, Liang S, Fragoza R, Yu H, Alexov E. SAAMBE-3D: Predicting Effect of Mutations on Protein-Protein Interactions. Int J Mol Sci 2020; 21:E2563. [PMID: 32272725 PMCID: PMC7177817 DOI: 10.3390/ijms21072563] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Received: 03/11/2020] [Revised: 04/04/2020] [Accepted: 04/05/2020] [Indexed: 12/26/2022] Open
Abstract
Maintaining wild-type protein-protein interactions is essential for the normal function of the cell, and any mutation that alters their characteristics can cause disease. Therefore, the ability to correctly and quickly predict the effect of amino acid mutations is crucial for understanding disease effects and for carrying out genome-wide studies. Here, we report a new development of the SAAMBE method, SAAMBE-3D, a machine learning-based approach that yields accurate predictions and is extremely fast. It achieves a Pearson correlation coefficient ranging from 0.78 to 0.82, depending on the training protocol, in a five-fold validation benchmark against the SKEMPI v2.0 database, and it outperforms currently existing algorithms on various blind tests. Furthermore, optimized and tested via five-fold cross-validation on the Cornell University dataset, SAAMBE-3D achieves AUCs of 1.0 and 0.96 on homo-dimer and hetero-dimer test datasets, respectively. Another important feature of SAAMBE-3D is its speed: a prediction completes in a fraction of a second. SAAMBE-3D is available as a web server as well as stand-alone code, the latter allowing other researchers to download the code and run it directly on their local computers. Taken together, SAAMBE-3D is accurate and fast software applicable to genome-wide studies that assess the effect of amino acid mutations on protein-protein interactions. The web server and the stand-alone codes (SAAMBE-3D for predicting the change of binding free energy and SAAMBE-3D-DN for predicting whether a mutation is disruptive or non-disruptive) are available.
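The Pearson correlation coefficient quoted in this abstract measures linear agreement between predicted and experimental binding free energy changes. A minimal sketch of that calculation follows; the function name and the sample values are hypothetical and are not drawn from SAAMBE-3D or SKEMPI.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical experimental vs. predicted values (kcal/mol), for illustration only.
experimental = [0.5, 1.2, 2.0, 0.1, 3.1]
predicted = [0.7, 1.0, 1.8, 0.4, 2.6]
print(round(pearson_r(experimental, predicted), 3))
```

A value near 1 indicates that predictions rise and fall with the measurements; it does not by itself show that the predicted magnitudes are correct.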
Affiliation(s)
- Swagata Pahari
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA; (S.P.); (G.L.); (A.K.M.)
| | - Gen Li
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA; (S.P.); (G.L.); (A.K.M.)
| | - Adithya Krishna Murthy
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA; (S.P.); (G.L.); (A.K.M.)
| | - Siqi Liang
- Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA; (S.L.); (R.F.); (H.Y.)
| | - Robert Fragoza
- Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA; (S.L.); (R.F.); (H.Y.)
| | - Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA; (S.L.); (R.F.); (H.Y.)
| | - Emil Alexov
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA; (S.P.); (G.L.); (A.K.M.)
| |
|
24
|
Lyozin GT, Brunelli L. Live-cell PCR and one-step purification streamline DNA engineering. FASEB J 2020; 34:3448-3460. [PMID: 31944382 DOI: 10.1096/fj.201902261r] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 09/06/2019] [Revised: 11/21/2019] [Accepted: 12/16/2019] [Indexed: 01/12/2023]
Abstract
In vivo DNA engineering such as recombineering (recombination-mediated genetic engineering) and DNA gap repair typically involves growing Escherichia coli (E. coli) containing plasmids, followed by plasmid DNA extraction and purification prior to downstream PCR-mediated DNA modifications and DNA sequencing. We previously demonstrated that crude cell lysates could be used for some limited downstream DNA applications. Here, we show how live E. coli cell PCR and one-step LiCl-isopropanol purification can streamline DNA engineering. In DNA gap repair, live-cell PCR allowed the convenient elimination of clones containing background plasmids prior to DNA sequencing. Live-cell PCR also enabled the generation of specific DNA sequences for DNA engineering up to 11 kilobase pairs in length and with up to 80 base pairs of terminal non-homology. Using gel electrophoresis and DNA melting curve analysis, we showed that LiCl-isopropanol DNA precipitation removed primers and small, nonspecific PCR products from live-cell PCR products in only ~10 minutes. DNA sequencing of purified products yielded Phred quality score values of ~55%. These data indicate that live-cell PCR and LiCl-isopropanol DNA precipitation are ideal for preparing DNA for sequencing and other downstream DNA applications, and might therefore accelerate high-throughput DNA engineering pipelines.
Affiliation(s)
- George T Lyozin
- Department of Pediatrics, University of Nebraska Medical Center, Omaha, NE, USA; Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, USA
| | - Luca Brunelli
- Department of Pediatrics, University of Nebraska Medical Center, Omaha, NE, USA; Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, USA
| |
|
25
|
Fragoza R, Das J, Wierbowski SD, Liang J, Tran TN, Liang S, Beltran JF, Rivera-Erick CA, Ye K, Wang TY, Yao L, Mort M, Stenson PD, Cooper DN, Wei X, Keinan A, Schimenti JC, Clark AG, Yu H. Extensive disruption of protein interactions by genetic variants across the allele frequency spectrum in human populations. Nat Commun 2019; 10:4141. [PMID: 31515488 PMCID: PMC6742646 DOI: 10.1038/s41467-019-11959-3] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Received: 01/09/2019] [Accepted: 08/06/2019] [Indexed: 12/19/2022] Open
Abstract
Each human genome carries tens of thousands of coding variants. The extent to which this variation is functional and the mechanisms by which it exerts its influence remain largely unexplored. To address this gap, we leverage the ExAC database of 60,706 human exomes to investigate experimentally the impact of 2009 missense single nucleotide variants (SNVs) across 2185 protein-protein interactions, generating interaction profiles for 4797 SNV-interaction pairs, of which 421 SNVs segregate at >1% allele frequency in human populations. We find that interaction-disruptive SNVs are prevalent at both rare and common allele frequencies. Furthermore, these results suggest that 10.5% of missense variants carried per individual are disruptive, a higher proportion than previously reported; this indicates that each individual's genetic makeup may be significantly more complex than expected. Finally, we demonstrate that candidate disease-associated mutations can be identified through shared interaction perturbations between variants of interest and known disease mutations.
Affiliation(s)
- Robert Fragoza
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Jishnu Das
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA, 02139, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Shayne D Wierbowski
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Jin Liang
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Tina N Tran
- Department of Biomedical Science, Cornell University, Ithaca, NY, 14853, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, 14853, USA
| | - Siqi Liang
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Juan F Beltran
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Christen A Rivera-Erick
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Kaixiong Ye
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Ting-Yi Wang
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Li Yao
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Matthew Mort
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Peter D Stenson
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - David N Cooper
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Xiaomu Wei
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Alon Keinan
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
| | - John C Schimenti
- Department of Biomedical Science, Cornell University, Ithaca, NY, 14853, USA
| | - Andrew G Clark
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, 14853, USA
| | - Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA.
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA.
| |
|
26
|
Lou S, Cotter KA, Li T, Liang J, Mohsen H, Liu J, Zhang J, Cohen S, Xu J, Yu H, Rubin MA, Gerstein M. GRAM: A GeneRAlized Model to predict the molecular effect of a non-coding variant in a cell-type specific manner. PLoS Genet 2019; 15:e1007860. [PMID: 31469829 PMCID: PMC6742416 DOI: 10.1371/journal.pgen.1007860] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/26/2018] [Revised: 09/12/2019] [Accepted: 07/22/2019] [Indexed: 12/19/2022] Open
Abstract
There has been much effort to prioritize genomic variants with respect to their impact on "function". However, function is often not precisely defined: sometimes it is the disease association of a variant; on other occasions, it reflects a molecular effect on transcription or epigenetics. Here, we coupled multiple genomic predictors to build GRAM, a GeneRAlized Model, to predict a well-defined experimental target: the expression-modulating effect of a non-coding variant on its associated gene, in a transferable, cell-specific manner. First, we performed feature engineering: using LASSO, a regularized linear model, we found transcription factor (TF) binding most predictive, especially for TFs that are hubs in the regulatory network; in contrast, evolutionary conservation, a popular feature in many other variant-impact predictors, has almost no contribution. Moreover, TF binding inferred from in vitro SELEX is as effective as that from in vivo ChIP-Seq. Second, we implemented GRAM integrating only SELEX features and expression profiles; thus, the program combines a universal regulatory score with an easily obtainable modifier reflecting the particular cell type. We benchmarked GRAM on large-scale MPRA datasets, achieving AUROC scores of 0.72 in GM12878 and 0.66 in a multi-cell line dataset. We then evaluated the performance of GRAM on targeted regions using luciferase assays in the MCF7 and K562 cell lines. We noted that changing the insertion position of the construct relative to the reporter gene gave very different results, highlighting the importance of carefully defining the exact prediction target of the model. Finally, we illustrated the utility of GRAM in fine-mapping causal variants and developed a practical software pipeline to carry this out. In particular, we demonstrated in specific examples how the pipeline could pinpoint variants that directly modulate gene expression within a larger linkage-disequilibrium block associated with a phenotype of interest (e.g., for an eQTL).
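The AUROC scores reported in this abstract can be read as the probability that a randomly chosen positive variant receives a higher score than a randomly chosen negative one. The sketch below computes that quantity directly by pairwise comparison, counting ties as one half. The labels and scores are hypothetical, and this is not the GRAM evaluation code.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 ground-truth values
    scores: iterable of model scores, higher meaning more likely positive
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = expression-modulating) and model scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(auroc(labels, scores))  # 3 of 4 positive/negative pairs are ordered correctly
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations; a rank-based implementation would be preferable for large MPRA datasets.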
Affiliation(s)
- Shaoke Lou
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
| | - Kellie A. Cotter
- Department for BioMedical Research, University of Bern, CH, Bern, Switzerland
| | - Tianxiao Li
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
| | - Jin Liang
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, United States of America
| | - Hussein Mohsen
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
- Program in the History of Science and Medicine, Yale University, New Haven, Connecticut, United States of America
| | - Jason Liu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
| | - Jing Zhang
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
| | - Sandra Cohen
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, Cornell University, New York, New York, United States of America
| | - Jinrui Xu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
| | - Haiyuan Yu
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, United States of America
- Department of Computational Biology, Cornell University, Ithaca, New York, United States of America
| | - Mark A. Rubin
- Department for BioMedical Research, University of Bern, CH, Bern, Switzerland
- Weill Cornell Medicine, New York, United States of America
| | - Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
| |
Collapse
|
27
|
Woodsmith J, Stelzl U. Understanding Disease Variants through the Lens of Protein Interactions. Cell Syst 2019; 5:544-546. [PMID: 29284128 DOI: 10.1016/j.cels.2017.12.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Abstract
High-density interaction mapping of mitochondrial proteins provides clues to molecular mechanisms implicated in the progression of neurological disorders.
Affiliation(s)
- Jonathan Woodsmith
- Institute of Pharmaceutical Sciences, University of Graz and BioTechMed-Graz, Graz, Austria.
- Ulrich Stelzl
- Institute of Pharmaceutical Sciences, University of Graz and BioTechMed-Graz, Graz, Austria.
|
28
|
Bonjoch L, Mur P, Arnau-Collell C, Vargas-Parra G, Shamloo B, Franch-Expósito S, Pineda M, Capellà G, Erman B, Castellví-Bel S. Approaches to functionally validate candidate genetic variants involved in colorectal cancer predisposition. Mol Aspects Med 2019; 69:27-40. [PMID: 30935834 DOI: 10.1016/j.mam.2019.03.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/10/2019] [Revised: 03/26/2019] [Accepted: 03/26/2019] [Indexed: 02/07/2023]
Abstract
Most next-generation sequencing (NGS) studies identify candidate genetic variants predisposing to colorectal cancer (CRC) but do not address their functional interpretation, which is needed to unequivocally recognize a new hereditary CRC gene. In addition, germline variants in already established hereditary CRC-predisposing genes, as well as somatic variants, share the same need when trying to categorize those with relevant significance. Functional genomics approaches have an important role in identifying the causal links between genetic architecture and phenotypes, in order to decipher cellular function in health and disease. Therefore, functional interpretation of genetic variants identified by NGS platforms is now essential. Currently available approaches include bioinformatics, cell and molecular biology, and animal models. Recent advances, such as the CRISPR-Cas9, ZFN and TALEN systems, have already been used as powerful tools with this objective. However, the use of cell lines is of limited value due to CRC heterogeneity and its close interaction with the microenvironment. Access to three-dimensional cultures or organoids and xenograft models that mimic the in vivo tissue architecture could revolutionize functional analysis. This review focuses on the application of state-of-the-art functional studies to better characterize new genes involved in germline predisposition to this neoplasm.
Affiliation(s)
- Laia Bonjoch
- Gastroenterology Department, Hospital Clínic, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), University of Barcelona, Barcelona, Spain
- Pilar Mur
- Hereditary Cancer Program, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge (IDIBELL), ONCOBELL Program, L'Hospitalet de Llobregat, Barcelona, Spain; Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Spain
- Coral Arnau-Collell
- Gastroenterology Department, Hospital Clínic, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), University of Barcelona, Barcelona, Spain
- Gardenia Vargas-Parra
- Hereditary Cancer Program, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge (IDIBELL), ONCOBELL Program, L'Hospitalet de Llobregat, Barcelona, Spain; Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Spain
- Bahar Shamloo
- Molecular Biology, Genetics, and Bioengineering Department, Legacy Research Institute, Portland, OR, USA
- Sebastià Franch-Expósito
- Gastroenterology Department, Hospital Clínic, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), University of Barcelona, Barcelona, Spain
- Marta Pineda
- Hereditary Cancer Program, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge (IDIBELL), ONCOBELL Program, L'Hospitalet de Llobregat, Barcelona, Spain; Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Spain
- Gabriel Capellà
- Hereditary Cancer Program, Catalan Institute of Oncology, Institut d'Investigació Biomèdica de Bellvitge (IDIBELL), ONCOBELL Program, L'Hospitalet de Llobregat, Barcelona, Spain; Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Spain
- Batu Erman
- Molecular Biology, Genetics and Bioengineering Program, Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul, Turkey
- Sergi Castellví-Bel
- Gastroenterology Department, Hospital Clínic, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), University of Barcelona, Barcelona, Spain.
|
29
|
Capriotti E, Ozturk K, Carter H. Integrating molecular networks with genetic variant interpretation for precision medicine. Wiley Interdisciplinary Reviews: Systems Biology and Medicine 2018; 11:e1443. [PMID: 30548534 PMCID: PMC6450710 DOI: 10.1002/wsbm.1443] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 08/15/2018] [Revised: 10/23/2018] [Accepted: 10/30/2018] [Indexed: 02/01/2023]
Abstract
More reliable and cheaper sequencing technologies have revealed the vast mutational landscapes characteristic of many phenotypes. The analysis of such genetic variants has led to successful identification of altered proteins underlying many Mendelian disorders. Nevertheless the simple one‐variant one‐phenotype model valid for many monogenic diseases does not capture the complexity of polygenic traits and disorders. Although experimental and computational approaches have improved detection of functionally deleterious variants and important interactions between gene products, the development of comprehensive models relating genotype and phenotypes remains a challenge in the field of genomic medicine. In this context, a new view of the pathologic state as significant perturbation of the network of interactions between biomolecules is crucial for the identification of biochemical pathways associated with complex phenotypes. Seminal studies in systems biology combined the analysis of genetic variation with protein–protein interaction networks to demonstrate that even as biological systems evolve to be robust to genetic variation, their topologies create disease vulnerabilities. More recent analyses model the impact of genetic variants as changes to the “wiring” of the interactome to better capture heterogeneity in genotype–phenotype relationships. These studies lay the foundation for using networks to predict variant effects at scale using machine‐learning or algorithmic approaches. A wealth of databases and resources for the annotation of genotype–phenotype relationships have been developed to support developments in this area. This overview describes how study of the molecular interactome has generated insights linking the organization of biological systems to disease mechanism, and how this information can enable precision medicine. This article is categorized under:
- Translational, Genomic, and Systems Medicine > Translational Medicine
- Biological Mechanisms > Cell Signaling
- Models of Systems Properties and Processes > Mechanistic Models
- Analytical and Computational Methods > Computational Methods
Affiliation(s)
- Emidio Capriotti
- Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Bologna, Italy
- Kivilcim Ozturk
- Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, California
- Hannah Carter
- Department of Medicine and Institute for Genomic Medicine, University of California, San Diego, La Jolla, California
|
30
|
Wierbowski SD, Fragoza R, Liang S, Yu H. Extracting Complementary Insights from Molecular Phenotypes for Prioritization of Disease-Associated Mutations. Current Opinion in Systems Biology 2018; 11:107-116. [PMID: 31086831 PMCID: PMC6510504 DOI: 10.1016/j.coisb.2018.09.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 12/31/2022]
Abstract
Rapid advances in next-generation sequencing technology have resulted in an explosion of whole-exome/genome sequencing data, providing an unprecedented opportunity to identify disease- and trait-associated variants in humans on a large scale. To date, the long-standing paradigm has leveraged fitness-based approximations to translate this ever-expanding sequencing data into causal insights in disease. However, while this approach robustly identifies variants under evolutionary constraint, it fails to provide molecular insights. Moreover, complex disease phenomena often violate the standard assumption of a direct relationship between organismal phenotype and overall fitness effect. Here we discuss the potential of a molecular phenotype-oriented paradigm to uniquely identify candidate disease-causing mutations from the human genetic background. By providing a direct connection between single nucleotide mutations and observable organismal and cellular phenotypes associated with disease, we suggest that molecular phenotypes can be readily incorporated alongside established fitness-based methodologies to provide complementary insights into the functional impact of human mutations. Lastly, we discuss how integrated approaches between molecular phenotypes and fitness-based perspectives facilitate new insights into the molecular mechanisms underlying disease-associated mutations while also providing a platform for improved interpretation of epistasis in human disease.
Affiliation(s)
- Shayne D. Wierbowski
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Robert Fragoza
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Siqi Liang
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Haiyuan Yu
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
|
31
|
An interactome perturbation framework prioritizes damaging missense mutations for developmental disorders. Nat Genet 2018; 50:1032-1040. [PMID: 29892012 DOI: 10.1038/s41588-018-0130-z] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Received: 10/04/2017] [Accepted: 04/06/2018] [Indexed: 01/20/2023]
Abstract
Identifying disease-associated missense mutations remains a challenge, especially in large-scale sequencing studies. Here we establish an experimentally and computationally integrated approach to investigate the functional impact of missense mutations in the context of the human interactome network and test our approach by analyzing ~2,000 de novo missense mutations found in autism subjects and their unaffected siblings. Interaction-disrupting de novo missense mutations are more common in autism probands, principally affect hub proteins, and disrupt a significantly higher fraction of hub interactions than in unaffected siblings. Moreover, they tend to disrupt interactions involving genes previously implicated in autism, providing complementary evidence that strengthens previously identified associations and enhances the discovery of new ones. Importantly, by analyzing de novo missense mutation data from six disorders, we demonstrate that our interactome perturbation approach offers a generalizable framework for identifying and prioritizing missense mutations that contribute to the risk of human disease.
|
32
|
Meyer MJ, Beltrán JF, Liang S, Fragoza R, Rumack A, Liang J, Wei X, Yu H. Interactome INSIDER: a structural interactome browser for genomic studies. Nat Methods 2018; 15:107-114. [PMID: 29355848 PMCID: PMC6026581 DOI: 10.1038/nmeth.4540] [Citation(s) in RCA: 112] [Impact Index Per Article: 16.0] [Received: 01/24/2017] [Accepted: 10/22/2017] [Indexed: 02/07/2023]
Abstract
We present Interactome INSIDER, a tool to link genomic variant information with structural protein-protein interactomes. Underlying this tool is the application of machine learning to predict protein interaction interfaces for 185,957 protein interactions with previously unresolved interfaces in human and seven model organisms, including the entire experimentally determined human binary interactome. Predicted interfaces exhibit functional properties similar to those of known interfaces, including enrichment for disease mutations and recurrent cancer mutations. Through 2,164 de novo mutagenesis experiments, we show that mutations of predicted and known interface residues disrupt interactions at a similar rate and much more frequently than mutations outside of predicted interfaces. To spur functional genomic studies, Interactome INSIDER (http://interactomeinsider.yulab.org) enables users to identify whether variants or disease mutations are enriched in known and predicted interaction interfaces at various resolutions. Users may explore known population variants, disease mutations, and somatic cancer mutations, or they may upload their own set of mutations for this purpose.
Affiliation(s)
- Michael J. Meyer
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853, USA
- Tri-Institutional Training Program in Computational Biology and Medicine, New York, New York, 10065, USA
- Juan Felipe Beltrán
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853, USA
- Siqi Liang
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853, USA
- Robert Fragoza
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Aaron Rumack
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853, USA
- Jin Liang
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853, USA
- Xiaomu Wei
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853, USA
- Department of Medicine, Weill Cornell College of Medicine, New York, New York, 10065, USA
- Haiyuan Yu
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853, USA
|
33
|
Protein interaction perturbation profiling at amino-acid resolution. Nat Methods 2017; 14:1213-1221. [PMID: 29039417 DOI: 10.1038/nmeth.4464] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 04/27/2017] [Accepted: 09/06/2017] [Indexed: 12/13/2022]
Abstract
The identification of genomic variants in healthy and diseased individuals continues to rapidly outpace our ability to functionally annotate these variants. Techniques that both systematically assay the functional consequences of nucleotide-resolution variation and can scale to hundreds of genes are urgently required. We designed a sensitive yeast two-hybrid-based 'off switch' for positive selection of interaction-disruptive variants from complex genetic libraries. Combined with massively parallel programmed mutagenesis and a sequencing readout, this method enables systematic profiling of protein-interaction determinants at amino-acid resolution. We defined >1,000 interaction-disrupting amino acid mutations across eight subunits of the BBSome, the major human cilia protein complex associated with the pleiotropic genetic disorder Bardet-Biedl syndrome. These high-resolution interaction-perturbation profiles provide a framework for interpreting patient-derived mutations across the entire protein complex and thus highlight how the impact of disease variation on interactome networks can be systematically assessed.
|
34
|
Abstract
A complete understanding of human cancer variants requires new methods to systematically and efficiently assess the functional effects of genomic mutations at a large scale. Here, we describe a set of tools to rapidly clone and stratify thousands of cancer mutations at base resolution. This protocol provides a massively parallel pipeline to achieve high stringency and throughput. The approach includes high-throughput generation of mutant clones by Gateway, confirmation of variant identity by barcoding and next-generation sequencing, and stratification of cancer variants by multiplexed interaction profiling. Compared with alternative site-directed mutagenesis methods, our protocol requires less sequencing effort and enables robust statistical calling of allele-specific effects. To ensure the precision of variant interaction profiling, we further describe two complementary methods: a high-throughput enhanced yeast two-hybrid (HT-eY2H) assay and a mammalian-cell-based Gaussia princeps luciferase protein-fragment complementation assay (GPCA). These independent assays with standard controls validate mutational interaction profiles with high quality. This protocol provides experimentally derived guidelines for classifying candidate cancer alleles emerging from whole-genome or whole-exome sequencing projects as 'drivers' or 'passengers'. For ∼100 genomic mutations, the protocol (including target primer design, variant library construction, and sequence verification) can be completed within as little as 2-3 weeks, and cancer variant stratification can be completed within 2 weeks.
|
35
|
Woodsmith J, Stelzl U, Vinayagam A. Bioinformatics Analysis of PTM-Modified Protein Interaction Networks and Complexes. Methods Mol Biol 2017; 1558:321-332. [PMID: 28150245 DOI: 10.1007/978-1-4939-6783-4_15] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/06/2023]
Abstract
Normal cellular functioning is maintained by macromolecular machines that control both core and specialized molecular tasks. These machines are in large part multi-subunit protein complexes that undergo regulation at multiple levels, from expression of requisite components to a vast array of post-translational modifications (PTMs). PTMs such as phosphorylation, ubiquitination, and acetylation currently number more than 200,000 in the human proteome and function within all molecular pathways. Here we provide a framework for systematically studying these PTMs in the context of global protein-protein interaction networks. This analytical framework allows insight into which functions specific PTMs tend to cluster in, and furthermore which complexes either single or multiple PTM signaling pathways converge on.
Affiliation(s)
- Jonathan Woodsmith
- Otto-Warburg Laboratory, Max-Planck Institute for Molecular Genetics (MPIMG), Ihnestrasse 63-73, Berlin, Germany
- Ulrich Stelzl
- Otto-Warburg Laboratory, Max-Planck Institute for Molecular Genetics (MPIMG), Ihnestrasse 63-73, Berlin, Germany.
- Department of Pharmaceutical Chemistry, Institute of Pharmaceutical Sciences, University of Graz, Universitätsplatz 1, Graz, Austria.
- Arunachalam Vinayagam
- Department of Genetics, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA, 02115, USA
|
36
|
Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, Fair BJ, Degatano AG, Fragoza R, Liu LG, Matsuyama A, Trickey M, Horibata S, Grimson A, Yamano H, Yoshida M, Roth FP, Pleiss JA, Xia Y, Yu H. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell 2016; 164:310-323. [PMID: 26771498 DOI: 10.1016/j.cell.2015.11.037] [Citation(s) in RCA: 79] [Impact Index Per Article: 8.8] [Received: 07/02/2015] [Revised: 10/12/2015] [Accepted: 11/04/2015] [Indexed: 01/01/2023]
Abstract
Here, we present FissionNet, a proteome-wide binary protein interactome for S. pombe, comprising 2,278 high-quality interactions, of which ∼50% were previously not reported in any species. FissionNet unravels previously unreported interactions implicated in processes such as gene silencing and pre-mRNA splicing. We developed a rigorous network comparison framework that accounts for assay sensitivity and specificity, revealing extensive species-specific network rewiring between fission yeast, budding yeast, and human. Surprisingly, although genes are better conserved between the yeasts, S. pombe interactions are significantly better conserved in human than in S. cerevisiae. Our framework also reveals that different modes of gene duplication influence the extent to which paralogous proteins are functionally repurposed. Finally, cross-species interactome mapping demonstrates that coevolution of interacting proteins is remarkably prevalent, a result with important implications for studying human disease in model organisms. Overall, FissionNet is a valuable resource for understanding protein functions and their evolution.
Affiliation(s)
- Tommy V Vo
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Jishnu Das
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Michael J Meyer
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY 10065, USA
- Nicolas A Cordero
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Nurten Akturk
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Xiaomu Wei
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Department of Medicine, Weill Cornell College of Medicine, New York, NY 10021, USA
- Benjamin J Fair
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Andrew G Degatano
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Robert Fragoza
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Lisa G Liu
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Akihisa Matsuyama
- Chemical Genomics Research Group, RIKEN Center for Sustainable Resource Science, Wako, Saitama 351-0198, Japan
- Michelle Trickey
- University College London Cancer Institute, Paul O'Gorman Building, 72 Huntley Street, London WC1E 6BT, UK
- Sachi Horibata
- Department of Biomedical Sciences, Baker Institute for Animal Health, Cornell University, Ithaca, NY 14853, USA
- Andrew Grimson
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Hiroyuki Yamano
- University College London Cancer Institute, Paul O'Gorman Building, 72 Huntley Street, London WC1E 6BT, UK
- Minoru Yoshida
- Chemical Genomics Research Group, RIKEN Center for Sustainable Resource Science, Wako, Saitama 351-0198, Japan
- Frederick P Roth
- Center for Cancer Systems Biology and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Donnelly Centre and Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON M5S 3E1, Canada; Canadian Institute for Advanced Research, Toronto, ON M5G 1Z8, Canada; Lunenfeld-Tanenbaum Research Institute, Mt. Sinai Hospital, Toronto, ON M5G 1X5, Canada
- Jeffrey A Pleiss
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Yu Xia
- Center for Cancer Systems Biology and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, QC H3A 0C3, Canada
- Haiyuan Yu
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA.
|
37
|
Meyer MJ, Lapcevic R, Romero AE, Yoon M, Das J, Beltrán JF, Mort M, Stenson PD, Cooper DN, Paccanaro A, Yu H. mutation3D: Cancer Gene Prediction Through Atomic Clustering of Coding Variants in the Structural Proteome. Hum Mutat 2016; 37:447-456. [PMID: 26841357 DOI: 10.1002/humu.22963] [Citation(s) in RCA: 75] [Impact Index Per Article: 8.3] [Received: 10/27/2015] [Accepted: 01/14/2016] [Indexed: 12/20/2022]
Abstract
A new algorithm and Web server, mutation3D (http://mutation3d.org), proposes driver genes in cancer by identifying clusters of amino acid substitutions within tertiary protein structures. We demonstrate the feasibility of using a 3D clustering approach to implicate proteins in cancer based on explorations of single proteins using the mutation3D Web interface. On a large scale, we show that clustering with mutation3D is able to separate functional from nonfunctional mutations by analyzing a combination of 8,869 known inherited disease mutations and 2,004 SNPs overlaid together upon the same sets of crystal structures and homology models. Further, we present a systematic analysis of whole-genome and whole-exome cancer datasets to demonstrate that mutation3D identifies many known cancer genes as well as previously underexplored target genes. The mutation3D Web interface allows users to analyze their own mutation data in a variety of popular formats and provides seamless access to explore mutation clusters derived from over 975,000 somatic mutations reported by 6,811 cancer sequencing studies. The mutation3D Web interface is freely available with all major browsers supported.
Collapse
Affiliation(s)
- Michael J Meyer
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853
- Tri-Institutional Training Program in Computational Biology and Medicine, New York, New York, 10065
- Ryan Lapcevic
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853
- Alfonso E Romero
- Department of Computer Science and Centre for Systems and Synthetic Biology, Royal Holloway, University of London, Egham TW20 0EX, UK
- Mark Yoon
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853
- Jishnu Das
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853
- Juan Felipe Beltrán
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853
- Matthew Mort
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- Peter D Stenson
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- Alberto Paccanaro
- Department of Computer Science and Centre for Systems and Synthetic Biology, Royal Holloway, University of London, Egham TW20 0EX, UK
- Haiyuan Yu
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853
38
Bakail M, Ochsenbein F. Targeting protein–protein interactions, a wide open field for drug design. CR CHIM 2016. [DOI: 10.1016/j.crci.2015.12.004] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Indexed: 12/16/2022]
39
Das J, Meyer MJ, Yu H. Studying Autism in Context. Cell Syst 2015; 1:312-3. [PMID: 27136240 DOI: 10.1016/j.cels.2015.11.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Abstract
Studying autism genes in the context of the protein complexes to which they belong illustrates the potential of network-centric approaches for understanding complex genetic disease.
Affiliation(s)
- Jishnu Das
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Michael J Meyer
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY 10065, USA
- Haiyuan Yu
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
40
Biancalana V, Laporte J. Diagnostic use of Massively Parallel Sequencing in Neuromuscular Diseases: Towards an Integrated Diagnosis. J Neuromuscul Dis 2015; 2:193-203. [PMID: 27858740 PMCID: PMC5240547 DOI: 10.3233/jnd-150092] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 12/22/2022]
Abstract
Massively parallel sequencing is revolutionizing genetic testing in diagnostic laboratories, replacing gene-by-gene investigations with a "gene panel" strategy. This new approach is particularly promising for the diagnosis of neuromuscular disorders affecting children as well as adults, which is constrained by strong clinical and genetic heterogeneity. While it leads to a strong improvement in molecular diagnosis, this new approach is dramatically changing the whole diagnostic process, establishing new decision trees and requiring integrated strategies between clinicians and laboratories. To provide an overview of the implementation and benefit of these novel sequencing strategies for the diagnosis of neuromuscular disorders, we surveyed the current literature on the application of targeted gene panel sequencing, exome sequencing and genome sequencing. We highlight advantages and disadvantages of these different strategies in a diagnostic setting, discuss unresolved cases, and point to potential validation approaches and outcomes of massively parallel sequencing. It appears important to integrate such novel strategies with clinical, histopathological and imaging investigations, for a faster and more accurate diagnosis and patient care, and to foster research projects and clinical trials.
Affiliation(s)
- Valérie Biancalana
- Faculté de Médecine, Laboratoire de Diagnostic Génétique, Nouvel Hôpital Civil, Strasbourg, France
- Department of Translational Medicine and Neurogenetics, IGBMC, INSERM U964, CNRS UMR7104, University of Strasbourg, Collège de France, Illkirch, France
- Jocelyn Laporte
- Department of Translational Medicine and Neurogenetics, IGBMC, INSERM U964, CNRS UMR7104, University of Strasbourg, Collège de France, Illkirch, France
41
Yeger-Lotem E, Sharan R. Human protein interaction networks across tissues and diseases. Front Genet 2015; 6:257. [PMID: 26347769 PMCID: PMC4541328 DOI: 10.3389/fgene.2015.00257] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Received: 05/28/2015] [Accepted: 07/17/2015] [Indexed: 11/13/2022] Open
Abstract
Protein interaction networks are an important framework for studying protein function, cellular processes, and genotype-to-phenotype relationships. While our view of the human interaction network is constantly expanding, less is known about networks that form in biologically important contexts such as within distinct tissues or in disease conditions. Here we review efforts to characterize these networks and to harness them to gain insights into the molecular mechanisms underlying human disease.
Affiliation(s)
- Esti Yeger-Lotem
- Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Roded Sharan
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
42
Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, Stelzl U. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol 2015; 11:794. [PMID: 25814554 PMCID: PMC4380928 DOI: 10.15252/msb.20145968] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Indexed: 02/02/2023] Open
Abstract
Post-translational protein modifications, such as tyrosine phosphorylation, regulate protein–protein interactions (PPIs) critical for signal processing and cellular phenotypes. We extended an established yeast two-hybrid system employing human protein kinases for the analyses of phospho-tyrosine (pY)-dependent PPIs in a direct experimental, large-scale approach. We identified 292 mostly novel pY-dependent PPIs which showed high specificity with respect to kinases and interacting proteins and validated a large fraction in co-immunoprecipitation experiments from mammalian cells. About one-sixth of the interactions are mediated by known linear sequence binding motifs while the majority of pY-PPIs are mediated by other linear epitopes or governed by alternative recognition modes. Network analysis revealed that pY-mediated recognition events are tied to a highly connected protein module dedicated to signaling and cell growth pathways related to cancer. Using binding assays, protein complementation and phenotypic readouts to characterize the pY-dependent interactions of TSPAN2 (tetraspanin 2) and GRB2 or PIK3R3 (p55γ), we exemplarily provide evidence that the two pY-dependent PPIs dictate cellular cancer phenotypes.
Affiliation(s)
- Arndt Grossmann
- Otto-Warburg Laboratory, Max-Planck Institute for Molecular Genetics (MPIMG), Berlin, Germany
- Nouhad Benlasfer
- Otto-Warburg Laboratory, Max-Planck Institute for Molecular Genetics (MPIMG), Berlin, Germany
- Petra Birth
- Otto-Warburg Laboratory, Max-Planck Institute for Molecular Genetics (MPIMG), Berlin, Germany
- Anna Hegele
- Otto-Warburg Laboratory, Max-Planck Institute for Molecular Genetics (MPIMG), Berlin, Germany
- Franziska Wachsmuth
- Otto-Warburg Laboratory, Max-Planck Institute for Molecular Genetics (MPIMG), Berlin, Germany
- Luise Apelt
- Otto-Warburg Laboratory, Max-Planck Institute for Molecular Genetics (MPIMG), Berlin, Germany
- Ulrich Stelzl
- Otto-Warburg Laboratory, Max-Planck Institute for Molecular Genetics (MPIMG), Berlin, Germany