1
Identifying the impact of structurally and functionally high-risk nonsynonymous SNPs on human patched protein using in-silico approach. GENE REPORTS 2021. [DOI: 10.1016/j.genrep.2021.101097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
2
Ajadi MB, Soremekun OS, Adewumi AT, Kumalo HM, Soliman MES. Functional Analysis of Single Nucleotide Polymorphism in ZUFSP Protein and Implication in Pathogenesis. Protein J 2021; 40:28-40. [PMID: 33512633 DOI: 10.1007/s10930-021-09962-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/04/2021] [Indexed: 11/25/2022]
Abstract
Research has revealed that functional non-synonymous single nucleotide polymorphisms (nsSNPs) present in the Zinc-finger with UFM1-Specific Peptidase domain protein (ZUFSP) may be involved in genetic instability and carcinogenesis. For the first time, we employed an in-silico approach using predictive tools to identify and validate potential nsSNPs that could be pathogenic. Our results revealed that 8 nsSNPs (rs112738382, rs140094037, rs201652589, rs201847265, rs202076827, rs373634906, rs375114528, rs772591104) are pathogenic after being subjected to a rigorous filtering process. The structural impact of the nsSNPs on ZUFSP structure indicated that the nsSNPs lower ZUFSP protein stability. Furthermore, conservation analysis showed that rs201652589, rs140094037, rs201847265, and rs772591104 were highly conserved. Interestingly, the protein-protein affinity between ZUFSP and Ubiquitin was altered: rs201652589, rs140094037, rs201847265, and rs772591104 had binding affinities of -0.46, -0.83, -1.62, and -1.12 kcal/mol, respectively. Our study has identified potential nsSNPs that could be used as genetic biomarkers for some diseases arising from aberrations in the ZUFSP structure; however, being a predictive study, the identified nsSNPs need to be experimentally investigated.
Affiliation(s)
- Mary B Ajadi
- Department of Medical Biochemistry, School of Laboratory Medicine and Medical Sciences, College of Health Sciences, University of KwaZulu-Natal, Howard Campus, Durban, 4000, South Africa
- Chemical Pathology Department, Faculty of Basic Medical Sciences, College of Health Sciences, Ladoke Akintola University of Technology, PMB 4400, Osogbo, Nigeria
- Opeyemi S Soremekun
- Molecular Bio-Computation and Drug Design Laboratory, School of Health Sciences, University of KwaZulu-Natal, Westville Campus, Durban, 4001, South Africa
- Adeniyi T Adewumi
- Molecular Bio-Computation and Drug Design Laboratory, School of Health Sciences, University of KwaZulu-Natal, Westville Campus, Durban, 4001, South Africa
- Hezekiel M Kumalo
- Department of Medical Biochemistry, School of Laboratory Medicine and Medical Sciences, College of Health Sciences, University of KwaZulu-Natal, Howard Campus, Durban, 4000, South Africa
- Mahmoud E S Soliman
- Molecular Bio-Computation and Drug Design Laboratory, School of Health Sciences, University of KwaZulu-Natal, Westville Campus, Durban, 4001, South Africa.
3
Emadi E, Akhoundi F, Kalantar SM, Emadi-Baygi M. Predicting the most deleterious missense nsSNPs of the protein isoforms of the human HLA-G gene and in silico evaluation of their structural and functional consequences. BMC Genet 2020; 21:94. [PMID: 32867672 PMCID: PMC7457528 DOI: 10.1186/s12863-020-00890-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/24/2020] [Accepted: 07/19/2020] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND The Human Leukocyte Antigen G (HLA-G) protein is an immune tolerogenic molecule with 7 isoforms. Changes in the expression level and some polymorphisms of the HLA-G gene are involved in various pathologies. Therefore, this study aimed to predict the most deleterious missense non-synonymous single nucleotide polymorphisms (nsSNPs) in HLA-G isoforms via in silico analyses and to examine structural and functional effects of the predicted nsSNPs on HLA-G isoforms. RESULTS Out of 301 reported SNPs in dbSNP, 35 missense SNPs in isoform 1, 35 missense SNPs in isoform 5, 8 missense SNPs in all membrane-bound HLA-G isoforms and 8 missense SNPs in all soluble HLA-G isoforms were predicted as deleterious by all eight servers (SIFT, PROVEAN, PolyPhen-2, I-Mutant 3.0, SNPs&GO, PhD-SNP, SNAP2, and MUpro). The structural and functional effects of the predicted nsSNPs on HLA-G isoforms were determined by MutPred2 and HOPE servers, respectively. ConSurf analyses showed that the majority of the predicted nsSNPs occur in conserved sites. I-TASSER and Chimera were used for modeling of the predicted nsSNPs. rs182801644 and rs771111444 were related to creating functional patterns in 5'UTR. 5 SNPs in 3'UTR of the HLA-G gene were predicted to affect the miRNA target sites. Kaplan-Meier analysis showed the HLA-G deregulation can serve as a prognostic marker for some cancers. CONCLUSIONS The implementation of in silico SNP prioritization methods provides a great framework for the recognition of functional SNPs. The results obtained from the current study call for laboratory investigations.
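The "predicted as deleterious by all eight servers" criterion amounts to a set intersection over per-tool calls. The following is a minimal sketch of that idea, not the authors' pipeline: the function name `consensus_deleterious`, the three-tool subset, and the variant labels (`var_A` and so on) are all hypothetical placeholders, and no values here come from the study.

```python
from typing import Dict, Set


def consensus_deleterious(calls: Dict[str, Set[str]]) -> Set[str]:
    """Return the variants flagged as deleterious by every tool in `calls`."""
    if not calls:
        return set()
    # A variant survives only if it appears in every tool's set.
    return set.intersection(*calls.values())


# Hypothetical per-tool calls; labels are placeholders, not study results.
calls = {
    "SIFT": {"var_A", "var_B", "var_C"},
    "PROVEAN": {"var_A", "var_C"},
    "PolyPhen-2": {"var_A", "var_C", "var_D"},
}

consensus = consensus_deleterious(calls)
print(sorted(consensus))  # ['var_A', 'var_C']
```

Requiring agreement from every tool is a strict filter: adding tools can only shrink the consensus set, which is one reason such studies report few high-confidence variants.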
Affiliation(s)
- Elaheh Emadi
- Department of Genetics, Faculty of Medicine, Shahid Sadoughi University of Medical Sciences, Yazd, Iran
- Fatemeh Akhoundi
- Department of Genetics, Faculty of Basic Sciences, Shahrekord University, Shahrekord, Iran
- Seyed Mehdi Kalantar
- Department of Genetics, Faculty of Medicine, Shahid Sadoughi University of Medical Sciences, Yazd, Iran
- Modjtaba Emadi-Baygi
- Department of Genetics, Faculty of Basic Sciences, Shahrekord University, Shahrekord, Iran.
- Research Institute of Biotechnology, Shahrekord University, Shahrekord, Iran.
4
Computational analysis of high-risk SNPs in human CHK2 gene responsible for hereditary breast cancer: A functional and structural impact. PLoS One 2019; 14:e0220711. [PMID: 31398194 PMCID: PMC6688789 DOI: 10.1371/journal.pone.0220711] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 02/04/2019] [Accepted: 07/22/2019] [Indexed: 12/18/2022] Open
Abstract
CHK2 mutations are now frequently studied in hereditary breast and ovarian cancer patients, in addition to BRCA1/BRCA2. CHK2 is a tumor suppressor gene that encodes a serine/threonine kinase involved in pathways such as DNA repair, cell cycle regulation and apoptosis in response to DNA damage. CHK2 is a well-studied moderate penetrance gene that correlates with third high risk susceptibility gene with an increased risk for breast cancer. Hence, before planning a large population study, it is better to scrutinize putative functional SNPs of CHK2 using different computational tools. In this study, we have used various computational approaches to identify nsSNPs that are deleterious to the structure and/or function of the CHK2 protein and that might be causing this disease. Computational analysis was performed with different in silico tools including SIFT, Align GVGD, SNAP2, PROVEAN, PolyPhen-2, PANTHER, PhD-SNP, MUpro, iPTREE-STAB, ConSurf, InterPro, NCBI Conserved Domain Search tool, ModPred, SPARKS-X, RAMPAGE, Verify3D, FTSite, COACH and PyMOL. Out of 78 nsSNPs of the human CHK2 gene, seven were predicted to be the most functionally significant. Among these seven nsSNPs, p.Arg160Gly, p.Gly210Arg and p.Ser415Phe affect highly conserved residues with a conservation score of 9, and three nsSNPs were predicted to be involved in post-translational modification. The p.Arg160Gly and p.Gly210Arg variants may interfere with the phosphopeptide binding site on the FHA conserved domain. The p.Ser415Phe variant may interfere with formation of the activation loop of the protein-kinase domain and with interactions of CHK2 with ligand. The study concludes that mutation of serine to phenylalanine at position 415 is a major mutation in native CHK2 protein which might contribute to its malfunction, ultimately causing disease.
This is the first comprehensive study in which CHK2 gene variants are analyzed using in silico tools; hence it will be of great help when considering large-scale studies and in developing precision medicines related to these polymorphisms in the era of personalized medicine.
5
Desai M, Chauhan JB. Predicting the functional and structural consequences of nsSNPs in human methionine synthase gene using computational tools. Syst Biol Reprod Med 2019; 65:288-300. [DOI: 10.1080/19396368.2019.1568611] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 02/06/2023]
Affiliation(s)
- Mansi Desai
- P. G. Department of Genetics, Ashok and Rita Patel Institute of Integrated Study and Research in Biotechnology and Allied Science (ARIBAS), New Vallabh Vidyanagar, India
- Jenabhai B. Chauhan
- P. G. Department of Genetics, Ashok and Rita Patel Institute of Integrated Study and Research in Biotechnology and Allied Science (ARIBAS), New Vallabh Vidyanagar, India
6
Nailwal M, Chauhan JB. Computational Analysis of High-Risk SNPs in Human DBY Gene Responsible for Male Infertility: A Functional and Structural Impact. Interdiscip Sci 2018. [PMID: 29520635 DOI: 10.1007/s12539-018-0290-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Abstract
BACKGROUND DEAD-box helicase 3, Y-linked (DBY) is a candidate gene of the AZF region which is involved in the spermatogenesis process. Mutations in the DBY gene may disrupt spermatogenesis and lead to infertility in men. Distinguishing functionally neutral mutations from disease-causing mutations is the biggest challenge in human genetic variation analysis. Owing to the importance of DBY in male infertility, functional analysis was carried out to reveal the association between genetic mutation and phenotypic variation through various in silico approaches. METHODS The present study analyzed the functional consequences of the nsSNPs in the human DBY gene using SIFT, PolyPhen-2, PROVEAN, SNAP2, PMut, nsSNPAnalyzer, PhD-SNP and SNPs&GO, along with stability analysis through I-Mutant2.0, MUpro and iPTREE-STAB. The conservation of amino acid residues, biophysical properties and conserved domains of the DBY protein were analyzed using various computational tools. The 3D structure of the protein was generated using SPARKS-X and validated using RAMPAGE. RESULTS Out of 1130 SNPs reported in dbSNP, only one nsSNP (G300D) was found to affect both the stability and the function of the DBY protein. The results showed the presence of G300 in the putative structure of the DBY domain. CONCLUSION To the best of our knowledge, this is the first study to detect a pathologically significant nsSNP (G300D) in DBY through a computational approach, which can be useful for drug discovery studies.
Affiliation(s)
- Mili Nailwal
- P.G. Department of Genetics, Ashok and Rita Patel Institute of Integrated Study and Research in Biotechnology and Allied Sciences (ARIBAS), New Vallabh Vidyanagar, Dist-Anand, Gujarat, 388121, India
- Jenabhai B Chauhan
- P.G. Department of Genetics, Ashok and Rita Patel Institute of Integrated Study and Research in Biotechnology and Allied Sciences (ARIBAS), New Vallabh Vidyanagar, Dist-Anand, Gujarat, 388121, India.
7
Abstract
Methylenetetrahydrofolate reductase (MTHFR) is a key enzyme involved in folate metabolism and plays a central role in DNA methylation and biosynthesis. MTHFR mutations may alter the cellular folate supply, which in turn affects nucleic acid synthesis, DNA methylation and chromosomal damage. The number of SNPs identified in the human genome is growing, and hence evaluating the functional and structural consequences of these SNPs by experimental analysis is very laborious. Therefore, in the present study, various recently developed computational algorithms were used to predict the functional and structural consequences of the SNPs. Computational tools such as SIFT, PolyPhen-2, PROVEAN, SNAP2, nsSNPAnalyzer, SNPs&GO, PhD-SNP, PMut, I-Mutant, iPTREE-STAB and MUpro were used to predict the most deleterious SNPs. Additionally, ConSurf was used to assess amino acid conservation and the NCBI conserved domain search tool to find conserved domains in MTHFR. Post-translational modification sites were predicted using ModPred. SPARKS-X was used to generate 3D structures of the native and mutant MTHFR protein, ModRefiner for further refinement, and Verify3D and RAMPAGE to validate the structures. Ligand binding sites were predicted using FTSite, RaptorX binding and COACH. Three SNPs, i.e. R157Q, L323P and W500C, were predicted to be the most deleterious by all the tools used for functional and stability analysis. Moreover, all three residues R157, L323 and W500 were predicted by ConSurf to be highly conserved, buried and structural residues. Post-translational modification sites were also predicted at R157 and W500. The ligand binding sites were predicted at R157, L323 and W500.
Affiliation(s)
- Mansi Desai
- P. G. Department of Genetics, Ashok and Rita Patel Institute of Integrated Study and Research in Biotechnology and Allied Science (ARIBAS), New Vallabh Vidyanagar, Affiliated to Sardar Patel University, India.
- J B Chauhan
- P. G. Department of Genetics, Ashok and Rita Patel Institute of Integrated Study and Research in Biotechnology and Allied Science (ARIBAS), New Vallabh Vidyanagar, Affiliated to Sardar Patel University, India.
8
Desai M, Chauhan JB. Computational analysis for the determination of deleterious nsSNPs in human MTHFD1 gene. Comput Biol Chem 2017; 70:7-14. [PMID: 28734179 DOI: 10.1016/j.compbiolchem.2017.07.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 02/07/2017] [Revised: 06/20/2017] [Accepted: 07/09/2017] [Indexed: 11/24/2022]
Abstract
Single nucleotide polymorphisms (SNPs) are the most common genetic polymorphisms and play a major role in many inherited diseases. Methylenetetrahydrofolate dehydrogenase 1 (MTHFD1) is one of the enzymes involved in folate metabolism. In the present study, the functional and structural consequences of nsSNPs of the human MTHFD1 gene were analyzed using various computational tools such as SIFT, PolyPhen-2, PANTHER, PROVEAN, SNAP2, nsSNPAnalyzer, PhD-SNP, SNPs&GO, I-Mutant, MUpro, ConSurf, InterPro, NCBI Conserved Domain Search tool, ModPred, SPARKS-X, RAMPAGE, FTSite and PyMOL. Out of 327 nsSNPs from the human MTHFD1 gene, a total of 45 SNPs were predicted as functionally most significant, among which 17 were highly conserved and functional residues and 17 were highly conserved and structural residues. Among the 45 most significant SNPs, 15 were predicted to be involved in post-translational modifications. The p.Gly165Arg variant may interfere with homodimer interface formation. The p.Asn439Lys and p.Asp445Asn variants may interfere with binding interactions of the MTHFD1 protein with cesium cation and potassium. Two SNPs (p.Asp562Gly and p.Gly637Cys) might interfere with interactions of MTHFD1 with ligand.
Affiliation(s)
- Mansi Desai
- Department of Genetics, Ashok and Rita Patel Institute of Integrated Study and Research in Biotechnology and Allied Science (ARIBAS), Affiliated to Sardar Patel University, New Vallabh Vidyanagar 388121, Gujarat, India.
- J B Chauhan
- Department of Genetics, Ashok and Rita Patel Institute of Integrated Study and Research in Biotechnology and Allied Science (ARIBAS), Affiliated to Sardar Patel University, New Vallabh Vidyanagar 388121, Gujarat, India.
9
Khomtchouk BB, Hennessy JR, Wahlestedt C. shinyheatmap: Ultra fast low memory heatmap web interface for big data genomics. PLoS One 2017; 12:e0176334. [PMID: 28493881 PMCID: PMC5426587 DOI: 10.1371/journal.pone.0176334] [Citation(s) in RCA: 77] [Impact Index Per Article: 11.0] [Received: 12/17/2016] [Accepted: 04/10/2017] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND Transcriptomics, metabolomics, metagenomics, and other various next-generation sequencing (-omics) fields are known for their production of large datasets, especially across single-cell sequencing studies. Visualizing such big data has posed technical challenges in biology, both in terms of available computational resources as well as programming acumen. Since heatmaps are used to depict high-dimensional numerical data as a colored grid of cells, efficiency and speed have often proven to be critical considerations in the process of successfully converting data into graphics. For example, rendering interactive heatmaps from large input datasets (e.g., 100k+ rows) has been computationally infeasible on both desktop computers and web browsers. In addition to memory requirements, programming skills and knowledge have frequently been barriers-to-entry for creating highly customizable heatmaps. RESULTS We propose shinyheatmap: an advanced user-friendly heatmap software suite capable of efficiently creating highly customizable static and interactive biological heatmaps in a web browser. shinyheatmap is a low memory footprint program, making it particularly well-suited for the interactive visualization of extremely large datasets that cannot typically be computed in-memory due to size restrictions. Also, shinyheatmap features a built-in high performance web plug-in, fastheatmap, for rapidly plotting interactive heatmaps of datasets as large as 10^5 to 10^7 rows within seconds, effectively shattering previous performance benchmarks of heatmap rendering speed. CONCLUSIONS shinyheatmap is hosted online as a freely available web server with an intuitive graphical user interface: http://shinyheatmap.com. The methods are implemented in R, and are available as part of the shinyheatmap project at: https://github.com/Bohdan-Khomtchouk/shinyheatmap.
Users can access fastheatmap directly from within the shinyheatmap web interface, and all source code has been made publicly available on Github: https://github.com/Bohdan-Khomtchouk/fastheatmap.
Affiliation(s)
- Bohdan B. Khomtchouk
- Center for Therapeutic Innovation, University of Miami Miller School of Medicine, 1501 NW 10th Ave., Miami, FL, 33136, United States of America
- Department of Psychiatry and Behavioral Sciences, University of Miami Miller School of Medicine, 1120 NW 14th St., Miami, FL, 33136, United States of America
- James R. Hennessy
- Department of Mathematics, University of Miami, 1365 Memorial Drive, Coral Gables, FL, 33146, United States of America
- Claes Wahlestedt
- Center for Therapeutic Innovation, University of Miami Miller School of Medicine, 1501 NW 10th Ave., Miami, FL, 33136, United States of America
- Department of Psychiatry and Behavioral Sciences, University of Miami Miller School of Medicine, 1120 NW 14th St., Miami, FL, 33136, United States of America
10
O’Halloran DM. phylo-node: A molecular phylogenetic toolkit using Node.js. PLoS One 2017; 12:e0175480. [PMID: 28410421 PMCID: PMC5391935 DOI: 10.1371/journal.pone.0175480] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/09/2016] [Accepted: 03/27/2017] [Indexed: 12/05/2022] Open
Abstract
Background Node.js is an open-source and cross-platform environment that provides a JavaScript codebase for back-end server-side applications. JavaScript has been used to develop very fast and user-friendly front-end tools for bioinformatic and phylogenetic analyses. However, no such toolkits are available using Node.js to conduct comprehensive molecular phylogenetic analysis. Results To address this problem, I have developed phylo-node, a stable and scalable toolkit built with Node.js that allows the user to perform diverse molecular and phylogenetic tasks. phylo-node can execute the analysis and process the resulting outputs from a suite of software options that provides tools for read processing and genome alignment, sequence retrieval, multiple sequence alignment, primer design, evolutionary modeling, and phylogeny reconstruction. Furthermore, phylo-node enables the user to deploy server-dependent applications, and also provides simple integration and interoperation with other Node modules and languages using Node inheritance patterns, and a customized piping module to support the production of diverse pipelines. Conclusions phylo-node is open-source and freely available to all users without sign-up or login requirements. All source code and user guidelines are openly available at the GitHub repository: https://github.com/dohalloran/phylo-node.
Affiliation(s)
- Damien M. O’Halloran
- Department of Biological Sciences, The George Washington University, Washington, DC, United States of America
- Institute for Neuroscience, The George Washington University, Washington, DC, United States of America
11
Pavlopoulos GA, Malliarakis D, Papanikolaou N, Theodosiou T, Enright AJ, Iliopoulos I. Visualizing genome and systems biology: technologies, tools, implementation techniques and trends, past, present and future. Gigascience 2015; 4:38. [PMID: 26309733 PMCID: PMC4548842 DOI: 10.1186/s13742-015-0077-2] [Citation(s) in RCA: 74] [Impact Index Per Article: 8.2] [Received: 05/26/2015] [Accepted: 08/03/2015] [Indexed: 01/31/2023] Open
Abstract
"A picture is worth a thousand words." This widely used adage sums up in a few words the notion that a successful visual representation of a concept should enable easy and rapid absorption of large amounts of information. Although, in general, the notion of capturing complex ideas using images is very appealing, would 1000 words be enough to describe the unknown in a research field such as the life sciences? Life sciences is one of the biggest generators of enormous datasets, mainly as a result of recent and rapid technological advances; their complexity can make these datasets incomprehensible without effective visualization methods. Here we discuss the past, present and future of genomic and systems biology visualization. We briefly comment on many visualization and analysis tools and the purposes that they serve. We focus on the latest libraries and programming languages that enable more effective, efficient and faster approaches for visualizing biological concepts, and also comment on the future human-computer interaction trends that would further enhance visualization.
Affiliation(s)
- Georgios A Pavlopoulos
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
- Nikolas Papanikolaou
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
- Theodosis Theodosiou
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
- Anton J Enright
- EMBL - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SD UK
- Ioannis Iliopoulos
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
12
Skuta C, Bartůněk P, Svozil D. InCHlib - interactive cluster heatmap for web applications. J Cheminform 2014; 6:44. [PMID: 25264459 PMCID: PMC4173117 DOI: 10.1186/s13321-014-0044-4] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 07/15/2014] [Accepted: 09/08/2014] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND Hierarchical clustering is an exploratory data analysis method that reveals the groups (clusters) of similar objects. The result of the hierarchical clustering is a tree structure called dendrogram that shows the arrangement of individual clusters. To investigate the row/column hierarchical cluster structure of a data matrix, a visualization tool called 'cluster heatmap' is commonly employed. In the cluster heatmap, the data matrix is displayed as a heatmap, a 2-dimensional array in which the colour of each element corresponds to its value. The rows/columns of the matrix are ordered such that similar rows/columns are near each other. The ordering is given by the dendrogram which is displayed on the side of the heatmap. RESULTS We developed InCHlib (Interactive Cluster Heatmap Library), a highly interactive and lightweight JavaScript library for cluster heatmap visualization and exploration. InCHlib enables the user to select individual or clustered heatmap rows, to zoom in and out of clusters or to flexibly modify heatmap appearance. The cluster heatmap can be augmented with additional metadata displayed in a different colour scale. In addition, to further enhance the visualization, the cluster heatmap can be interconnected with external data sources or analysis tools. Data clustering and the preparation of the input file for InCHlib is facilitated by the Python utility script inchlib_clust. CONCLUSIONS The cluster heatmap is one of the most popular visualizations of large chemical and biomedical data sets originating, e.g., in high-throughput screening, genomics or transcriptomics experiments. The presented JavaScript library InCHlib is a client-side solution for cluster heatmap exploration. InCHlib can be easily deployed into any modern web application and configured to cooperate with external tools and data sources. 
Though InCHlib is primarily intended for the analysis of chemical or biological data, it is a versatile tool whose application domain is not limited to the life sciences.
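The row ordering described in the abstract (similar rows placed near each other, following the dendrogram) can be illustrated with a small, generic sketch. This is not InCHlib's or inchlib_clust's implementation; it assumes average linkage with Euclidean distance, and the function name `cluster_row_order` and the toy matrix are hypothetical.

```python
from itertools import combinations
from math import dist


def cluster_row_order(rows):
    """Agglomerative clustering (average linkage, Euclidean distance).

    Returns row indices in dendrogram leaf order, so that rows merged
    early (the most similar ones) end up adjacent in the heatmap.
    """
    # Each cluster is the ordered list of row indices (leaves) it contains.
    clusters = [[i] for i in range(len(rows))]

    def linkage(a, b):
        # Average pairwise distance between the members of two clusters.
        return sum(dist(rows[i], rows[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return clusters[0] if clusters else []


# Toy data matrix (hypothetical): rows 0 and 2 are similar, as are rows 1 and 3.
matrix = [
    [1.0, 1.1, 0.9],
    [9.0, 9.2, 8.8],
    [1.2, 1.0, 1.0],
    [8.9, 9.1, 9.1],
]

order = cluster_row_order(matrix)
heatmap_rows = [matrix[i] for i in order]
print(order)  # [0, 2, 1, 3]
```

This naive version recomputes all pairwise linkages at every merge, so it only suits small matrices; it is meant to show the ordering principle, not to scale.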
Affiliation(s)
- Ctibor Skuta
- Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic ; CZ-OPENSCREEN, Institute of Molecular Genetics of the ASCR, v. v. i, Vídeňská 1083, CZ-142 20 Prague, Czech Republic
- Petr Bartůněk
- CZ-OPENSCREEN, Institute of Molecular Genetics of the ASCR, v. v. i, Vídeňská 1083, CZ-142 20 Prague, Czech Republic
- Daniel Svozil
- Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic ; CZ-OPENSCREEN, Institute of Molecular Genetics of the ASCR, v. v. i, Vídeňská 1083, CZ-142 20 Prague, Czech Republic
13
Yachdav G, Kloppmann E, Kajan L, Hecht M, Goldberg T, Hamp T, Hönigschmid P, Schafferhans A, Roos M, Bernhofer M, Richter L, Ashkenazy H, Punta M, Schlessinger A, Bromberg Y, Schneider R, Vriend G, Sander C, Ben-Tal N, Rost B. PredictProtein--an open resource for online prediction of protein structural and functional features. Nucleic Acids Res 2014; 42:W337-43. [PMID: 24799431 PMCID: PMC4086098 DOI: 10.1093/nar/gku366] [Citation(s) in RCA: 435] [Impact Index Per Article: 43.5] [Indexed: 11/14/2022] Open
Abstract
PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org.
Affiliation(s)
- Guy Yachdav
- Department of Informatics, Bioinformatics & Computational Biology i12, TUM (Technische Universität München), Garching/Munich 85748, Germany Biosof LLC, New York, NY 10001, USA TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), TUM (Technische Universität München), Garching/Munich 85748, Germany
- Edda Kloppmann
- Department of Informatics, Bioinformatics & Computational Biology i12, TUM (Technische Universität München), Garching/Munich 85748, Germany New York Consortium on Membrane Protein Structure (NYCOMPS), Columbia University, New York, NY 10032, USA
- Laszlo Kajan
- Department of Informatics, Bioinformatics & Computational Biology i12, TUM (Technische Universität München), Garching/Munich 85748, Germany
- Maximilian Hecht
- Department of Informatics, Bioinformatics & Computational Biology i12, TUM (Technische Universität München), Garching/Munich 85748, Germany TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), TUM (Technische Universität München), Garching/Munich 85748, Germany
- Tatyana Goldberg
- Department of Informatics, Bioinformatics & Computational Biology i12, TUM (Technische Universität München), Garching/Munich 85748, Germany TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), TUM (Technische Universität München), Garching/Munich 85748, Germany
- Tobias Hamp
- Department of Informatics, Bioinformatics & Computational Biology i12, TUM (Technische Universität München), Garching/Munich 85748, Germany
- Peter Hönigschmid
- Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising 85354, Germany
- Andrea Schafferhans
- Department of Informatics, Bioinformatics & Computational Biology i12, TUM (Technische Universität München), Garching/Munich 85748, Germany
- Manfred Roos
- Department of Informatics, Bioinformatics & Computational Biology i12, TUM (Technische Universität München), Garching/Munich 85748, Germany
- Michael Bernhofer
- Department of Informatics, Bioinformatics & Computational Biology i12, TUM (Technische Universität München), Garching/Munich 85748, Germany
- Lothar Richter
- Department of Informatics, Bioinformatics & Computational Biology i12, TUM (Technische Universität München), Garching/Munich 85748, Germany
- Haim Ashkenazy
- The Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, 69978 Tel Aviv, Israel
- Marco Punta
- Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK Institute for Food and Plant Sciences WZW-Weihenstephan, Alte Akademie 8, Freising 85350, Germany
- Avner Schlessinger
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridgeshire, CB10 1SD, UK
- Yana Bromberg
- Biosof LLC, New York, NY 10001, USA Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
- Reinhard Schneider
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08901, USA
- Gerrit Vriend
- Luxembourg University & Luxembourg Centre for Systems Biomedicine, 4362 Belval, Luxembourg
- Chris Sander
- CMBI, NCMLS, Radboudumc Nijmegen Medical Centre, 6525 GA Nijmegen, The Netherlands
- Nir Ben-Tal
- Computational Biology Program, Memorial Sloan Kettering Cancer Center, New York, 10065 NY, USA
- Burkhard Rost
- Department of Informatics, Bioinformatics & Computational Biology i12, TUM (Technische Universität München), Garching/Munich 85748, Germany Biosof LLC, New York, NY 10001, USA New York Consortium on Membrane Protein Structure (NYCOMPS), Columbia University, New York, NY 10032, USA The Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, 69978 Tel Aviv, Israel Department of Biochemistry and Molecular Biophysics & New York Consortium on Membrane Protein Structure (NYCOMPS), Columbia University, New York, NY 10032, USA Institute for Advanced Study (TUM-IAS), Garching/Munich 85748, Germany
14
Abstract
Data-driven research has gained momentum in the life sciences. Visualisation of these data is essential for quick generation of hypotheses and their translation into useful knowledge. BioJS is a new proposed standard for JavaScript-based components to visualise biological data. BioJS is an open source community project that to date provides 39 different components contributed by a global community. Here, we present the BioJS F1000Research collection series. A total of 12 components and a project status article are published in bulk. This collection does not intend to be an all-encompassing, comprehensive source of BioJS articles, but an initial set; future submissions from BioJS contributors are welcome.
Affiliation(s)
- Manuel Corpas
- The Genome Analysis Centre, Norwich Research Park, Norwich, NR4 7UH, UK