1
Li Z, Chen L. Predicting functional consequences of SNPs on mRNA translation via machine learning. Nucleic Acids Res 2023; 51:7868-7881. [PMID: 37427781 PMCID: PMC10450169 DOI: 10.1093/nar/gkad576] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/05/2022] [Revised: 05/18/2023] [Accepted: 06/23/2023] [Indexed: 07/11/2023] [Open Access]
Abstract
The functional impact of single nucleotide polymorphisms (SNPs) on translation has yet to be considered when prioritizing disease-causing SNPs from genome-wide association studies (GWAS). Here we apply machine learning models to genome-wide ribosome profiling data to predict SNP function by forecasting ribosome collisions during mRNA translation. SNPs causing remarkable ribosome occupancy changes are named RibOc-SNPs (Ribosome-Occupancy-SNPs). We found that disease-related SNPs tend to cause notable changes in ribosome occupancy, suggesting translational regulation as an essential pathogenesis step. Nucleotide conversions, such as 'G → T', 'T → G' and 'C → A', are enriched in RibOc-SNPs, with the most significant impact on ribosome occupancy, while 'A → G' (or 'A → I' RNA editing) and 'G → A' are less deterministic. Among amino acid conversions, 'Glu → stop (codon)' shows the most significant enrichment in RibOc-SNPs. Interestingly, there is selection pressure on stop codons with a lower collision likelihood. RibOc-SNPs are enriched at the 5'-coding sequence regions, implying hot spots of translation initiation regulation. Strikingly, ∼22.1% of the RibOc-SNPs lead to opposite changes in ribosome occupancy on alternative transcript isoforms, suggesting that SNPs can amplify the differences between splicing isoforms by oppositely regulating their translation efficiency.
Affiliation(s)
- Zheyu Li
  - Mork Family Department of Chemical Engineering and Materials Science, University of Southern California, 925 Bloom Walk, Los Angeles, CA 90089, USA
- Liang Chen
  - Department of Quantitative and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, USA
2
Tichkule S, Myung Y, Naung MT, Ansell BRE, Guy AJ, Srivastava N, Mehra S, Cacciò SM, Mueller I, Barry AE, van Oosterhout C, Pope B, Ascher DB, Jex AR. VIVID: a web application for variant interpretation and visualisation in multidimensional analyses. Mol Biol Evol 2022; 39:6697981. [PMID: 36103257 PMCID: PMC9514033 DOI: 10.1093/molbev/msac196] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/13/2022] [Open Access]
Abstract
Large-scale comparative genomic and population genetic studies generate enormous amounts of polymorphism data in the form of DNA variants. Ultimately, the goal of many of these studies is to associate genetic variants with phenotypes or fitness. We introduce VIVID, an interactive, user-friendly web application that integrates a wide range of approaches for encoding genotypic to phenotypic information in any organism or disease, from an individual or population, in three-dimensional (3D) space. It allows mutation mapping and annotation, calculation of interactions and conservation scores, prediction of harmful effects, analysis of diversity and selection, and 3D visualization of genotypic information encoded in Variant Call Format on AlphaFold2 protein models. VIVID enables the rapid assessment of genes of interest in the study of adaptive evolution and the genetic load, and it helps prioritize targets for experimental validation. We demonstrate the utility of VIVID by exploring the evolutionary genetics of the parasitic protist Plasmodium falciparum, revealing geographic variation in the signature of balancing selection in potential targets of functional antibodies.
Affiliation(s)
- Swapnil Tichkule
  - Population Health and Immunity, Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
  - Department of Medical Biology, University of Melbourne, Melbourne, Australia
- Yoochan Myung
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes, Melbourne, Australia
- Myo T Naung
  - Population Health and Immunity, Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
  - Department of Medical Biology, University of Melbourne, Melbourne, Australia
- Brendan R E Ansell
  - Population Health and Immunity, Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Andrew J Guy
  - School of Science, RMIT University, Melbourne, Australia
- Namrata Srivastava
  - Department of Data Science and AI, Monash University, Melbourne, Australia
- Somya Mehra
  - Life Sciences Discipline, Burnet Institute, Melbourne, Australia
- Simone M Cacciò
  - Department of Infectious Disease, Istituto Superiore di Sanità, Rome, Italy
- Ivo Mueller
  - Population Health and Immunity, Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
- Alyssa E Barry
  - Life Sciences Discipline, Burnet Institute, Melbourne, Australia
  - Institute of Mental and Physical Health and Clinical Translation (IMPACT) and School of Medicine, Deakin University, Geelong, Australia
- Cock van Oosterhout
  - School of Environmental Sciences, University of East Anglia, Norwich Research Park, Norwich, UK
- Bernard Pope
  - Melbourne Bioinformatics, University of Melbourne, Melbourne, Australia
  - Australian BioCommons, Sydney, Australia
  - Department of Clinical Pathology, University of Melbourne, Melbourne, Australia
  - Department of Surgery (Royal Melbourne Hospital), University of Melbourne, Melbourne, Australia
- David B Ascher
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes, Melbourne, Australia
- Aaron R Jex
  - Population Health and Immunity, Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
  - Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Melbourne, Australia
3
Ammar A, Cavill R, Evelo C, Willighagen E. PSnpBind: a database of mutated binding site protein-ligand complexes constructed using a multithreaded virtual screening workflow. J Cheminform 2022; 14:8. [PMID: 35227289 PMCID: PMC8886843 DOI: 10.1186/s13321-021-00573-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2021] [Accepted: 11/18/2021] [Indexed: 11/15/2022] [Open Access]
Abstract
A key concept in drug design is how natural variants, especially the ones occurring in the binding site of drug targets, affect the inter-individual drug response and efficacy by altering binding affinity. These effects have been studied on very limited and small datasets while, ideally, a large dataset of binding affinity changes due to binding site single-nucleotide polymorphisms (SNPs) is needed for evaluation. However, to the best of our knowledge, such a dataset does not exist. Thus, a reference dataset of ligand binding affinities to proteins with all their reported binding site variants was constructed using a molecular docking approach. Having a large database of protein-ligand complexes covering a wide range of binding pocket mutations and a broad small-molecule landscape is of great importance for several types of studies. For example, developing machine learning algorithms to predict protein-ligand affinity or an SNP effect on it requires an extensive amount of data. In this work, we present PSnpBind: a large database of 0.6 million mutated binding site protein-ligand complexes constructed using a multithreaded virtual screening workflow. It provides a web interface to explore and visualize the protein-ligand complexes and a REST API to programmatically access the different aspects of the database contents. PSnpBind is open source and freely available at https://psnpbind.org.
Affiliation(s)
- Ammar Ammar
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Rachel Cavill
  - Department of Data Science and Knowledge Engineering, Maastricht University, Maastricht, The Netherlands
- Chris Evelo
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Egon Willighagen
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
4
Krebs FS, Zoete V, Trottet M, Pouchon T, Bovigny C, Michielin O. Swiss-PO: a new tool to analyze the impact of mutations on protein three-dimensional structures for precision oncology. NPJ Precis Oncol 2021; 5:19. [PMID: 33737716 PMCID: PMC7973488 DOI: 10.1038/s41698-021-00156-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/28/2020] [Accepted: 02/04/2021] [Indexed: 12/12/2022] [Open Access]
Abstract
Swiss-PO is a new web tool to map gene mutations on the 3D structure of corresponding proteins and to intuitively assess the structural implications of protein variants for precision oncology. Swiss-PO is constructed around a manually curated database of 3D structures, variant annotations, and sequence alignments, for a list of 50 genes taken from the Ion AmpliSeq™ Custom Cancer Hotspot Panel. The website was designed to guide users in the choice of the most appropriate structure to analyze regarding the mutated residue, the role of the protein domain it belongs to, or the drug that could be selected to treat the patient. The importance of the mutated residue for the structure and activity of the protein can be assessed based on the molecular interactions exchanged with neighbor residues in 3D within the same protein or between different biomacromolecules, its conservation in orthologs, or the known effect of reported mutations in its 3D or sequence-based vicinity. Swiss-PO is available free of charge and without login at https://www.swiss-po.ch.
Affiliation(s)
- Fanny S Krebs
  - Computer-Aided Molecular Engineering, Department of Oncology, Ludwig Institute for Cancer Research Lausanne Branch, University of Lausanne, Lausanne, Switzerland
- Vincent Zoete
  - Computer-Aided Molecular Engineering, Department of Oncology, Ludwig Institute for Cancer Research Lausanne Branch, University of Lausanne, Lausanne, Switzerland
  - Molecular Modelling Group, Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Maxence Trottet
  - Computer-Aided Molecular Engineering, Department of Oncology, Ludwig Institute for Cancer Research Lausanne Branch, University of Lausanne, Lausanne, Switzerland
  - Molecular Modelling Group, Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Timothée Pouchon
  - Molecular Modelling Group, Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Christophe Bovigny
  - Molecular Modelling Group, Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Olivier Michielin
  - Computer-Aided Molecular Engineering, Department of Oncology, Ludwig Institute for Cancer Research Lausanne Branch, University of Lausanne, Lausanne, Switzerland
  - Molecular Modelling Group, Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
  - Department of Oncology, Ludwig Institute for Cancer Research, University Hospital of Lausanne, Lausanne, Switzerland
5
Konc J, Skrlj B, Erzen N, Kunej T, Janezic D. GenProBiS: web server for mapping of sequence variants to protein binding sites. Nucleic Acids Res 2017; 45:W253-W259. [PMID: 28498966 PMCID: PMC5570222 DOI: 10.1093/nar/gkx420] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 02/20/2017] [Accepted: 05/02/2017] [Indexed: 02/02/2023] [Open Access]
Abstract
Discovery of potentially deleterious sequence variants is important and has wide implications for research and generation of new hypotheses in human and veterinary medicine, and drug discovery. The GenProBiS web server maps sequence variants to protein structures from the Protein Data Bank (PDB), and further to protein–protein, protein–nucleic acid, protein–compound, and protein–metal ion binding sites. The concept of a protein–compound binding site is understood in the broadest sense, which includes glycosylation and other post-translational modification sites. Binding sites were defined by local structural comparisons of whole protein structures using the Protein Binding Sites (ProBiS) algorithm and transposition of ligands from the similar binding sites found to the query protein using the ProBiS-ligands approach with new improvements introduced in GenProBiS. Binding site surfaces were generated as three-dimensional grids encompassing the space occupied by predicted ligands. The server allows intuitive visual exploration of comprehensively mapped variants, such as human somatic missense mutations related to cancer and non-synonymous single nucleotide polymorphisms from 21 species, within the predicted binding site regions for about 80 000 PDB protein structures using fast WebGL graphics. The GenProBiS web server is open and free to all users at http://genprobis.insilab.org.
Affiliation(s)
- Janez Konc
  - National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
  - University of Primorska, Faculty of Mathematics, Natural Sciences and Information Technologies, 6000 Koper, Slovenia
- Blaz Skrlj
  - National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
- Nika Erzen
  - National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
- Tanja Kunej
  - Biotechnical Faculty, University of Ljubljana, 1000 Ljubljana, Slovenia
- Dusanka Janezic
  - University of Primorska, Faculty of Mathematics, Natural Sciences and Information Technologies, 6000 Koper, Slovenia
6
Ofoegbu TC, David A, Kelley LA, Mezulis S, Islam SA, Mersmann SF, Strömich L, Vakser IA, Houlston RS, Sternberg MJE. PhyreRisk: A Dynamic Web Application to Bridge Genomics, Proteomics and 3D Structural Data to Guide Interpretation of Human Genetic Variants. J Mol Biol 2019; 431:2460-2466. [PMID: 31075275 PMCID: PMC6597944 DOI: 10.1016/j.jmb.2019.04.043] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 02/12/2019] [Revised: 04/02/2019] [Accepted: 04/29/2019] [Indexed: 12/12/2022]
Abstract
PhyreRisk is an open-access, publicly accessible web application for interactively bridging genomic, proteomic and structural data facilitating the mapping of human variants onto protein structures. A major advance over other tools for sequence-structure variant mapping is that PhyreRisk provides information on 20,214 human canonical proteins and an additional 22,271 alternative protein sequences (isoforms). Specifically, PhyreRisk provides structural coverage (partial or complete) for 70% (14,035 of 20,214 canonical proteins) of the human proteome, by storing 18,874 experimental structures and 84,818 pre-built models of canonical proteins and their isoforms generated using our in-house Phyre2. PhyreRisk reports 55,732 experimentally, multi-validated protein interactions from IntAct and 24,260 experimental structures of protein complexes. Another major feature of PhyreRisk is that, rather than presenting a limited set of precomputed variant-structure mappings of known genetic variants, it allows the user to explore novel variants using, as input, genomic coordinate formats (Ensembl, VCF, reference SNP ID and HGVS notations) and Human Build GRCh37 and GRCh38. PhyreRisk also supports mapping variants using amino acid coordinates and searching for genes or proteins of interest. PhyreRisk is designed to empower researchers to translate genetic data into protein structural information, thereby providing a more comprehensive appreciation of the functional impact of variants. PhyreRisk is freely available at http://phyrerisk.bc.ic.ac.uk.
Affiliation(s)
- Tochukwu C Ofoegbu
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
- Alessia David
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
- Lawrence A Kelley
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
- Stefans Mezulis
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
- Suhail A Islam
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
- Sophia F Mersmann
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
- Léonie Strömich
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
- Ilya A Vakser
  - Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS 66045, USA
- Richard S Houlston
  - Division of Genetics and Epidemiology, The Institute of Cancer Research, London, SM2 5NG, UK
- Michael J E Sternberg
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
7
Kreft L, Turan D, Hulstaert N, Botzki A, Martens L, Vandermarliere E. Scop3D: Online Visualization of Mutation Rates on Protein Structure. J Proteome Res 2019; 18:765-769. [PMID: 30540477 DOI: 10.1021/acs.jproteome.8b00681] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/28/2022]
Abstract
Scop3D is a tool that automatically annotates protein structure with sequence conservation starting from a set of protein sequence variants. We present a complete upgrade and rewrite of Scop3D. We have included a DNA module that allows the analysis of single nucleotide polymorphisms in relation to the structural context of the protein. Scop3D therefore forms a bridge between genomics and protein structure. Moreover, Scop3D is now also available through an intuitive web-interface that makes the tool highly user-friendly.
Affiliation(s)
- Lukasz Kreft
  - VIB Bioinformatics Core, VIB, Ghent 120-9052, Belgium
- Demet Turan
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9000, Belgium
  - Department of Biochemistry, Ghent University, Ghent 9000, Belgium
- Niels Hulstaert
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9000, Belgium
  - Department of Biochemistry, Ghent University, Ghent 9000, Belgium
- Lennart Martens
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9000, Belgium
  - Department of Biochemistry, Ghent University, Ghent 9000, Belgium
- Elien Vandermarliere
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9000, Belgium
  - Department of Biochemistry, Ghent University, Ghent 9000, Belgium
  - VIB Headquarters, VIB, Ghent 120-9052, Belgium
8
Glusman G, Rose PW, Prlić A, Dougherty J, Duarte JM, Hoffman AS, Barton GJ, Bendixen E, Bergquist T, Bock C, Brunk E, Buljan M, Burley SK, Cai B, Carter H, Gao J, Godzik A, Heuer M, Hicks M, Hrabe T, Karchin R, Leman JK, Lane L, Masica DL, Mooney SD, Moult J, Omenn GS, Pearl F, Pejaver V, Reynolds SM, Rokem A, Schwede T, Song S, Tilgner H, Valasatava Y, Zhang Y, Deutsch EW. Mapping genetic variations to three-dimensional protein structures to enhance variant interpretation: a proposed framework. Genome Med 2017; 9:113. [PMID: 29254494 PMCID: PMC5735928 DOI: 10.1186/s13073-017-0509-y] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Indexed: 11/10/2022] [Open Access]
Abstract
The translation of personal genomics to precision medicine depends on the accurate interpretation of the multitude of genetic variants observed for each individual. However, even when genetic variants are predicted to modify a protein, their functional implications may be unclear. Many diseases are caused by genetic variants affecting important protein features, such as enzyme active sites or interaction interfaces. The scientific community has catalogued millions of genetic variants in genomic databases and thousands of protein structures in the Protein Data Bank. Mapping mutations onto three-dimensional (3D) structures enables atomic-level analyses of protein positions that may be important for the stability or formation of interactions; these may explain the effect of mutations and in some cases even open a path for targeted drug development. To accelerate progress in the integration of these data types, we held a two-day Gene Variation to 3D (GVto3D) workshop to report on the latest advances and to discuss unmet needs. The overarching goal of the workshop was to address the question: what can be done together as a community to advance the integration of genetic variants and 3D protein structures that could not be done by a single investigator or laboratory? Here we describe the workshop outcomes, review the state of the field, and propose the development of a framework with which to promote progress in this arena. The framework will include a set of standard formats, common ontologies, a common application programming interface to enable interoperation of the resources, and a Tool Registry to make it easy to find and apply the tools to specific analysis problems. Interoperability will enable integration of diverse data sources and tools and collaborative development of variant effect prediction methods.
Affiliation(s)
- Peter W Rose
  - San Diego Supercomputer Center, University of California San Diego, La Jolla, CA, 98093, USA
- Andreas Prlić
  - San Diego Supercomputer Center, University of California San Diego, La Jolla, CA, 98093, USA
  - RCSB Protein Data Bank, University of California San Diego, La Jolla, CA, 98093, USA
- José M Duarte
  - RCSB Protein Data Bank, University of California San Diego, La Jolla, CA, 98093, USA
- Andrew S Hoffman
  - Human Centered Design & Engineering, University of Washington, Seattle, WA, 98195, USA
- Geoffrey J Barton
  - Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, DD1 5EH, UK
- Emøke Bendixen
  - Department of Molecular Biology and Genetics, Aarhus University, 8000, Aarhus, Denmark
- Timothy Bergquist
  - Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
- Christian Bock
  - Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
- Elizabeth Brunk
  - University of California San Diego, La Jolla, CA, 92093, USA
- Marija Buljan
  - Institute of Molecular Systems Biology, ETH Zurich, CH-8093, Zurich, Switzerland
- Stephen K Burley
  - San Diego Supercomputer Center, University of California San Diego, La Jolla, CA, 98093, USA
  - RCSB Protein Data Bank, University of California San Diego, La Jolla, CA, 98093, USA
  - Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
- Binghuang Cai
  - Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
- Hannah Carter
  - University of California San Diego, La Jolla, CA, 92093, USA
- JianJiong Gao
  - Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Adam Godzik
  - SBP Medical Discovery Institute, La Jolla, CA, 92037, USA
- Michael Heuer
  - AMPLab, University of California, Berkeley, CA, 94720, USA
- Thomas Hrabe
  - SBP Medical Discovery Institute, La Jolla, CA, 92037, USA
- Rachel Karchin
  - Department of Biomedical Engineering, Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD, 21218, USA
  - Department of Oncology, Johns Hopkins Medicine, Baltimore, MD, 21287, USA
- Julia Koehler Leman
  - Flatiron Institute, Center for Computational Biology, Simons Foundation, New York, NY, 10010, USA
  - Department of Biology and Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA
- Lydie Lane
  - SIB Swiss Institute of Bioinformatics and University of Geneva, CH-1211, Geneva, Switzerland
- David L Masica
  - Department of Biomedical Engineering, Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD, 21218, USA
- Sean D Mooney
  - Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
- John Moult
  - Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD, 20850, USA
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, 20742, USA
- Gilbert S Omenn
  - Institute for Systems Biology, Seattle, WA, 98109, USA
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, 48109-2218, USA
- Frances Pearl
  - School of Life Sciences, University of Sussex, Brighton, BN1 9QG, UK
- Vikas Pejaver
  - Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
  - The University of Washington eScience Institute, Seattle, WA, 98195, USA
- Ariel Rokem
  - The University of Washington eScience Institute, Seattle, WA, 98195, USA
- Torsten Schwede
  - SIB Swiss Institute of Bioinformatics and Biozentrum University of Basel, CH-4056, Basel, Switzerland
- Sicheng Song
  - Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
- Hagen Tilgner
  - Brain and Mind Research Institute, Weill Cornell Medicine, New York City, NY, 10021, USA
- Yana Valasatava
  - RCSB Protein Data Bank, University of California San Diego, La Jolla, CA, 98093, USA
- Yang Zhang
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, 48109-2218, USA
9
Are Next-Generation Sequencing Tools Ready for the Cloud? Trends Biotechnol 2017; 35:486-489. [DOI: 10.1016/j.tibtech.2017.03.005] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 01/12/2017] [Revised: 02/23/2017] [Accepted: 03/03/2017] [Indexed: 11/22/2022]
10
Gress A, Ramensky V, Büch J, Keller A, Kalinina OV. StructMAn: annotation of single-nucleotide polymorphisms in the structural context. Nucleic Acids Res 2016; 44:W463-8. [PMID: 27150811 PMCID: PMC4987916 DOI: 10.1093/nar/gkw364] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 02/15/2016] [Accepted: 04/22/2016] [Indexed: 01/08/2023] [Open Access]
Abstract
Next-generation sequencing technologies produce unprecedented amounts of data on the genetic sequence of individual organisms. These sequences carry a substantial amount of variation that may or may not be related to a phenotype. The phenotypically important part of this variation often comes in the form of protein-sequence-altering (non-synonymous) single nucleotide variants (nsSNVs). Here we present StructMAn, a Web-based tool for annotation of human and non-human nsSNVs in the structural context. StructMAn analyzes the spatial location of the amino acid residue corresponding to nsSNVs in the three-dimensional (3D) protein structure relative to other proteins, nucleic acids and low molecular-weight ligands. We make use of all experimentally available 3D structures of query proteins, and also, unlike other tools in the field, of structures of proteins with detectable sequence identity to them. This allows us to provide a structural context for around 20% of all nsSNVs in a typical human sequencing sample, for up to 60% of nsSNVs in genes related to human diseases and for around 35% of nsSNVs in a typical bacterial sample. Each nsSNV can be visualized and inspected by the user in the corresponding 3D structure of a protein or protein complex. The StructMAn server is available at http://structman.mpi-inf.mpg.de.
Affiliation(s)
- Alexander Gress
  - Department for Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Campus E1 4, 66123 Saarbrücken, Germany
  - Graduate School of Computer Science, Saarland University, Campus E1 3, 66123 Saarbrücken, Germany
- Vasily Ramensky
  - Center for Neurobehavioral Genetics, University of California, Los Angeles, 695 Charles E. Young Drive South, Los Angeles, CA 90095, USA
- Joachim Büch
  - Department for Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Campus E1 4, 66123 Saarbrücken, Germany
- Andreas Keller
  - Chair for Medical Bioinformatics, Saarland University, Campus E2 2, 66123 Saarbrücken, Germany
- Olga V Kalinina
  - Department for Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Campus E1 4, 66123 Saarbrücken, Germany