1
V M DD, Sivaramakrishnan V, Arvind Kumar K. Structural systems biology approach delineate the functional implications of SNPs in exon junction complex interaction network. J Biomol Struct Dyn 2023; 41:11969-11986. [PMID: 36617892 DOI: 10.1080/07391102.2022.2164355] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/03/2022] [Accepted: 12/26/2022] [Indexed: 01/10/2023]
Abstract
In eukaryotes, transcripts that carry premature termination codons (PTC) leading to truncated proteins are degraded by the Nonsense Mediated Decay (NMD) machinery. Missense and nonsense Single Nucleotide Polymorphisms (SNPs) in proteins belonging to the Exon junction complex (EJC) and up-frameshift protein (UPF) will compromise NMD, leading to the accumulation of truncated proteins in various diseases. The EJC and UPF, which are involved in NMD, are a good model system to study the effect of SNPs at a systems level. Despite the availability of crystal structures, computational tools, and data on mutational and deletion studies with functional implications, an integrated effort to understand the impact of SNPs at the systems level is lacking. To study the functional consequences of missense SNPs, sequence-based techniques like SIFT and PolyPhen, which classify SNPs as deleterious or non-deleterious, and structure-based methods like FoldX, which calculate the Delta Delta G (ddG, ∆∆G), are used. Using FoldX, the ddG for mutations with experimentally validated functional effects is calculated and compared with the ddG calculated for SNPs in the same protein-protein interaction interface. Further, a model is conceived to explain the functional implications of SNPs based on the effects observed for known mutants. The results are visualized in a network format. The effects of nonsense mutations are discerned by comparing with deletion mutation studies and loss of interaction in the crystal structure. The present work not only integrates genomics, proteomics, and classical genetics with 'Structural Biology' but also helps to integrate it into a 'systems-level functional network'. Communicated by Ramaswamy H. Sarma.
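The comparison described in this abstract (ddG of validated mutants versus ddG of SNPs at the same interface) can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: the `Variant` type, the threshold rule (smallest ddG among validated disruptive mutants), and all variant names and ddG values are hypothetical. It assumes the usual FoldX sign convention, in which a positive ddG is destabilizing.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Variant:
    name: str   # hypothetical label, not a real mutation
    ddg: float  # kcal/mol; positive is assumed to mean destabilizing


def reference_threshold(known_disruptive: Iterable[Variant]) -> float:
    """Smallest ddG seen among mutants with a validated loss of interaction."""
    return min(v.ddg for v in known_disruptive)


def flag_snps(snps: Iterable[Variant],
              known_disruptive: Iterable[Variant]) -> Dict[str, bool]:
    """Flag SNPs whose ddG is at least as large as the reference threshold."""
    threshold = reference_threshold(list(known_disruptive))
    return {s.name: s.ddg >= threshold for s in snps}


# Hypothetical numbers, for illustration only.
validated = [Variant("mutant_A", 2.4), Variant("mutant_B", 3.1)]
candidates = [Variant("snp_1", 0.3), Variant("snp_2", 2.9)]
flags = flag_snps(candidates, validated)
```

Here `flags` marks `snp_2` as comparable to the validated disruptive mutants and `snp_1` as not; a real analysis would also need to account for the uncertainty of the ddG estimates.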
Affiliation(s)
- Datta Darshan V M
- Disease Biology Lab, Department of Biosciences, Sri Sathya Sai Institute of Higher Learning, Prasanthi Nilayam, Anantapur, Andhra Pradesh, India
- Venketesh Sivaramakrishnan
- Disease Biology Lab, Department of Biosciences, Sri Sathya Sai Institute of Higher Learning, Prasanthi Nilayam, Anantapur, Andhra Pradesh, India
- K Arvind Kumar
- Disease Biology Lab, Department of Biosciences, Sri Sathya Sai Institute of Higher Learning, Prasanthi Nilayam, Anantapur, Andhra Pradesh, India
- Department of Physiology and Biophysics, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
2
María Hernández-Domínguez E, Sofía Castillo-Ortega L, García-Esquivel Y, Mandujano-González V, Díaz-Godínez G, Álvarez-Cervantes J. Bioinformatics as a Tool for the Structural and Evolutionary Analysis of Proteins. Comput Biol Chem 2020. [DOI: 10.5772/intechopen.89594] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 01/02/2023]
Abstract
This chapter covers bioinformatics: the computational, mathematical, and statistical tools applied to biology that are essential for the analysis and characterization of biological molecules, in particular proteins, which play an important role in all cellular and evolutionary processes of organisms. In recent decades, next-generation sequencing technologies and bioinformatics have facilitated the collection and analysis of large amounts of genomic, transcriptomic, proteomic, and metabolomic data from different organisms. These data have allowed predictions on the regulation of expression, transcription, translation, structure, and mechanisms of action of proteins, as well as on homology, mutations, and evolutionary processes that generate structural and functional changes over time. Although the information in the databases grows every day, bioinformatics tools continue to be modified to improve performance and yield more accurate predictions of protein functionality, which is why bioinformatics research remains a great challenge.
3
Havranek B, Islam SM. Prediction and evaluation of deleterious and disease causing non-synonymous SNPs (nsSNPs) in human NF2 gene responsible for neurofibromatosis type 2 (NF2). J Biomol Struct Dyn 2020; 39:7044-7055. [PMID: 32787631 DOI: 10.1080/07391102.2020.1805018] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/20/2022]
Abstract
The majority of genetic variations in the human genome that lead to a variety of different diseases are caused by non-synonymous single nucleotide polymorphisms (nsSNPs). Neurofibromatosis type 2 (NF2) is a deadly disease caused by nsSNPs in the NF2 gene, which encodes a protein called merlin. This study used various in silico methods, SIFT, Polyphen-2, PhD-SNP and MutPred, to investigate the pathogenic effect of 14 nsSNPs in the merlin FERM domain. The G197C and L234R mutations were found to be two deleterious and disease mutations associated with the mild and severe forms of NF2, respectively. Molecular dynamics (MD) simulations were conducted to understand the stability, structure and dynamics of these mutations. Both mutant structures experienced larger flexibility compared to the wildtype. The L234R mutant suffered from more prominent structural instability, which may help to explain why it is associated with the more severe form of NF2. The intramolecular hydrogen bonding in the L234R mutant decreased relative to the wildtype, while intermolecular hydrogen bonding of the L234R mutant with solvent greatly increased. The native contacts were also found to be important. Protein-protein docking revealed that the L234R mutation decreased the binding complementarity and binding affinity of LATS2 to merlin, which may have an impact on merlin's ability to regulate the Hippo signaling pathway. The calculated binding affinity of LATS2 to the L234R mutant and wildtype merlin protein is found to be 21.73 and -11 kcal/mol, respectively. The binding affinity of the wildtype merlin agreed very well with the experimental value, -8 kcal/mol. Communicated by Ramaswamy H. Sarma.
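To put the binding free energies quoted above on a more familiar scale, a free energy can be converted to a dissociation constant with the standard relation ΔG = RT ln(Kd). The sketch below is an illustration only and is not taken from the paper; it assumes a temperature of 298.15 K and a 1 M standard state, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g_kcal: float,
                          temperature_k: float = 298.15) -> float:
    """Kd in mol/L from a binding free energy, using dG = RT ln(Kd)."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


# Values quoted in the abstract for wildtype merlin (kcal/mol).
kd_experimental = dissociation_constant(-8.0)   # on the order of 1e-6 M
kd_calculated = dissociation_constant(-11.0)    # tighter than the -8 value
```

Under these assumptions, -8 kcal/mol corresponds to roughly micromolar binding, and every additional -1.4 kcal/mol or so tightens Kd by about a factor of ten.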
Affiliation(s)
- Brandon Havranek
- Department of Chemistry, University of Illinois at Chicago, Chicago, IL, USA
- Shahidul M Islam
- Department of Chemistry, University of Illinois at Chicago, Chicago, IL, USA
4
Alzahrani FA, Ahmed F, Sharma M, Rehan M, Mahfuz M, Baeshen MN, Hawsawi Y, Almatrafi A, Alsagaby SA, Kamal MA, Warsi MK, Choudhry H, Jamal MS. Investigating the pathogenic SNPs in BLM helicase and their biological consequences by computational approach. Sci Rep 2020; 10:12377. [PMID: 32704157 PMCID: PMC7378827 DOI: 10.1038/s41598-020-69033-8] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 04/18/2018] [Accepted: 07/06/2020] [Indexed: 12/15/2022]
Abstract
The BLM helicase protein plays a vital role in DNA replication and the maintenance of genomic integrity. Variation in the BLM helicase gene resulted in defects in the DNA repair mechanism and was reported to be associated with Bloom syndrome (BS) and cancer. Despite extensive investigation of helicase proteins in humans, no attempt has previously been made to comprehensively analyse the single nucleotide polymorphism (SNPs) of the BLM gene. In this study, a comprehensive analysis of SNPs on the BLM gene was performed to identify, characterize and validate the pathogenic SNPs using computational approaches. We obtained SNP data from the dbSNP database version 150 and mapped these data to the genomic coordinates of the "NM_000057.3" transcript expressing BLM helicase (P54132). There were 607 SNPs mapped to missense, 29 SNPs mapped to nonsense, and 19 SNPs mapped to 3'-UTR regions. Initially, we used many consensus tools of SIFT, PROVEAN, Condel, and PolyPhen-2, which together increased the accuracy of prediction and identified 18 highly pathogenic non-synonymous SNPs (nsSNPs) out of 607 SNPs. Subsequently, these 18 high-confidence pathogenic nsSNPs were analysed for BLM protein stability, structure-function relationships and disease associations using various bioinformatics tools. These 18 mutants of the BLM protein along with the native protein were further investigated using molecular dynamics simulations to examine the structural consequences of the mutations, which might reveal their malfunction and contribution to disease. In addition, 28 SNPs were predicted as "stop gained" nonsense SNPs and one SNP was predicted as "start lost". Two SNPs in the 3'UTR were found to abolish miRNA binding and thus may enhance the expression of BLM. Interestingly, we found that BLM mRNA overexpression is associated with different types of cancers. 
Further investigation showed that the dysregulation of BLM is associated with poor overall survival (OS) for lung and gastric cancer patients and hence led to the conclusion that BLM has the potential to be used as an important prognostic marker for the detection of lung and gastric cancer.
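The consensus filtering mentioned in this abstract (combining SIFT, PROVEAN, Condel, and PolyPhen-2 calls) can be sketched as a simple vote count. This is a minimal sketch under stated assumptions, not the authors' code: it takes each tool's call as an already-computed boolean, requires agreement of all four tools by default, and uses hypothetical variant names and calls.

```python
from typing import Dict, List

TOOLS = ("SIFT", "PROVEAN", "Condel", "PolyPhen-2")


def consensus_deleterious(calls: Dict[str, Dict[str, bool]],
                          min_votes: int = len(TOOLS)) -> List[str]:
    """Return variants called deleterious by at least `min_votes` tools."""
    selected = []
    for variant, per_tool in calls.items():
        # A missing call is counted as "not deleterious".
        votes = sum(bool(per_tool.get(tool, False)) for tool in TOOLS)
        if votes >= min_votes:
            selected.append(variant)
    return sorted(selected)


# Hypothetical calls, for illustration only.
example_calls = {
    "variant_1": {"SIFT": True, "PROVEAN": True, "Condel": True, "PolyPhen-2": True},
    "variant_2": {"SIFT": True, "PROVEAN": False, "Condel": True, "PolyPhen-2": True},
    "variant_3": {"SIFT": False, "PROVEAN": False, "Condel": False, "PolyPhen-2": True},
}
unanimous = consensus_deleterious(example_calls)
majority = consensus_deleterious(example_calls, min_votes=3)
```

Requiring unanimity trades sensitivity for specificity, which is one plausible way a list of 607 missense SNPs could shrink to a small high-confidence set; the paper's exact rule may differ.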
Affiliation(s)
- Faisal A Alzahrani
- Department of Biochemistry, Faculty of Science, Stem Cells Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Aston Medical Research Institute, Aston Medical School, Aston University, Birmingham, B4 7ET, UK
- Firoz Ahmed
- Department of Biochemistry, College of Science, University of Jeddah, Jeddah, 21589, Saudi Arabia
- University of Jeddah Centre for Scientific and Medical Research (UJ-CSMR), University of Jeddah, Jeddah, 21589, Saudi Arabia
- Monika Sharma
- Department of Chemical Sciences, Indian Institute of Science Education and Research (IISER), Mohali, India
- Mohd Rehan
- King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
- Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- Maryam Mahfuz
- Department of Computer Science, Jamia Millia Islamia, New Delhi, Delhi, India
- Mohammed N Baeshen
- Department of Biology, College of Science, University of Jeddah, Jeddah, 21589, Saudi Arabia
- Yousef Hawsawi
- Department of Genetics, Research Center, King Faisal Specialist Hospital, and Research Center, MBC-03, PO Box 3354, Riyadh, 11211, Saudi Arabia
- Ahmed Almatrafi
- Department of Biology, Faculty of Science, University of Taibah, Medinah, Saudi Arabia
- Suliman Abdallah Alsagaby
- Department of Medical Laboratories, Central Biosciences Research Laboratories, College of Science in Al Zulfi, Majmaah University, Al Majma'ah, Saudi Arabia
- Mohammad Azhar Kamal
- Department of Biochemistry, College of Science, University of Jeddah, Jeddah, 21589, Saudi Arabia
- University of Jeddah Centre for Scientific and Medical Research (UJ-CSMR), University of Jeddah, Jeddah, 21589, Saudi Arabia
- Mohiuddin Khan Warsi
- Department of Biochemistry, College of Science, University of Jeddah, Jeddah, 21589, Saudi Arabia
- University of Jeddah Centre for Scientific and Medical Research (UJ-CSMR), University of Jeddah, Jeddah, 21589, Saudi Arabia
- Hani Choudhry
- Department of Biochemistry, Cancer Metabolism and Epigenetic Unit, Faculty of Science, Cancer and Mutagenesis Unit, King Fahd Center for Medical Research, King Abdulaziz University, Jeddah, Saudi Arabia
- Mohammad Sarwar Jamal
- King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
- Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- Integrative Biosciences Center, Wayne State University, Detroit, MI, 48202, USA
5
Siegel RJ, Bridges SL, Ahmed S. HLA-C: An Accomplice in Rheumatic Diseases. ACR Open Rheumatol 2019; 1:571-579. [PMID: 31777841 PMCID: PMC6858028 DOI: 10.1002/acr2.11065] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 06/18/2019] [Accepted: 07/08/2019] [Indexed: 01/14/2023]
Abstract
Human leukocyte antigen C (HLA-C) is a polymorphic membrane protein encoded by the HLA-C gene in the class I major histocompatibility complex. HLA-C plays an essential role in protection against cancer and viruses but has also been implicated in allograft rejection, preeclampsia, and autoimmune disease. This review summarizes reports and proposed mechanisms for the accessory role of HLA-C in rheumatic diseases. Historically, contributions of HLA-C to rheumatic diseases were eclipsed by the stronger association of HLA-DRB1 alleles containing the "shared epitope" with rheumatoid arthritis. Larger genetic association studies and more powerful analytical approaches have revealed independent associations of HLA-C with rheumatic disease-associated phenotypes, including development of anticitrullinated peptide antibodies. HLA-C functions by presenting antigens to T cells and by binding activatory and inhibitory receptors on natural killer (NK) cells, but the exact mechanisms by which the HLA-C locus contributes to autoimmunity are largely undefined. Studies have suggested that HLA-C and NK cell receptor polymorphisms may predict responsiveness to pharmacotherapy. Understanding the mechanisms of the role of HLA-C in rheumatic disease could uncover therapeutic targets or guide precision pharmacologic treatments.
Affiliation(s)
- Ruby J. Siegel
- Department of Pharmaceutical Sciences, Washington State University College of Pharmacy and Pharmaceutical Sciences, Spokane, Washington
- S. Louis Bridges
- Division of Clinical Immunology and Rheumatology, University of Alabama at Birmingham, Birmingham, Alabama
- Salahuddin Ahmed
- Department of Pharmaceutical Sciences, Washington State University College of Pharmacy and Pharmaceutical Sciences, Spokane, Washington
- Division of Rheumatology, University of Washington School of Medicine, Seattle, Washington
6
López-Ferrando V, Gazzo A, de la Cruz X, Orozco M, Gelpí JL. PMut: a web-based tool for the annotation of pathological variants on proteins, 2017 update. Nucleic Acids Res 2017; 45:W222-W228. [PMID: 28453649 PMCID: PMC5793831 DOI: 10.1093/nar/gkx313] [Citation(s) in RCA: 168] [Impact Index Per Article: 28.0] [Received: 02/19/2017] [Accepted: 04/18/2017] [Indexed: 12/18/2022]
Abstract
We present here a full update of the PMut predictor, active since 2005 and widely accepted in the field of predicting Mendelian pathological mutations. The PMut internal engine has been renewed and converted into a fully featured standalone training and prediction engine that not only powers the PMut web portal but can also generate custom predictors with alternative training sets or validation schemas. The PMut web portal allows the user to perform pathology predictions, to access a complete repository of pre-calculated predictions, and to generate and validate new predictors. The default predictor performs with good quality scores (MCC values of 0.61 on 10-fold cross validation, and 0.42 on a blind test with SwissVar 2016 mutations). The PMut portal is freely accessible at http://mmb.irbbarcelona.org/PMut. Complete help and a tutorial are available at http://mmb.irbbarcelona.org/PMut/help.
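The MCC values quoted above refer to the Matthews correlation coefficient, which summarizes a binary confusion matrix in a single number between -1 and 1. The sketch below shows the standard formula; the function name is illustrative, the counts in the example are hypothetical, and returning 0.0 for an empty row or column is a common convention rather than something stated in the abstract.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention: undefined cases (an empty row or column) map to 0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical confusion matrices, for illustration only.
perfect = matthews_corrcoef(tp=50, tn=50, fp=0, fn=0)     # all calls correct
chance = matthews_corrcoef(tp=25, tn=25, fp=25, fn=25)    # no better than random
inverted = matthews_corrcoef(tp=0, tn=0, fp=50, fn=50)    # all calls wrong
```

Because MCC uses all four cells of the confusion matrix, it stays informative when pathological and neutral variants are unevenly represented, which is one reason it is a common choice for reporting predictor quality.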
Affiliation(s)
- Víctor López-Ferrando
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Joint Program BSC-CRG-IRB Research Program for Computational Biology, Barcelona, Spain
- Andrea Gazzo
- Joint Program BSC-CRG-IRB Research Program for Computational Biology, Barcelona, Spain
- Institute for Research in Biomedicine (IRB) Barcelona, The Barcelona Institute of Science and Technology, Barcelona, Spain
- Xavier de la Cruz
- Vall d'Hebron Institute of Research (VHIR), Universitat Autònoma de Barcelona, Barcelona, Spain
- ICREA, Barcelona, Spain
- Modesto Orozco
- Joint Program BSC-CRG-IRB Research Program for Computational Biology, Barcelona, Spain
- Institute for Research in Biomedicine (IRB) Barcelona, The Barcelona Institute of Science and Technology, Barcelona, Spain
- Dept. of Biochemistry and Molecular Biomedicine, University of Barcelona, Barcelona, Spain
- Josep Ll Gelpí
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Joint Program BSC-CRG-IRB Research Program for Computational Biology, Barcelona, Spain
- Dept. of Biochemistry and Molecular Biomedicine, University of Barcelona, Barcelona, Spain
7
Rashid MU, Khan FA, Muhammad N, Loya A, Hamann U. Prevalence of PALB2 Germline Mutations in Early-onset and Familial Breast/Ovarian Cancer Patients from Pakistan. Cancer Res Treat 2019; 51:992-1000. [PMID: 30309218 PMCID: PMC6639217 DOI: 10.4143/crt.2018.356] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 06/14/2018] [Accepted: 10/10/2018] [Indexed: 01/02/2023]
Abstract
PURPOSE Partner and localizer of BRCA2 (PALB2) is a breast cancer susceptibility gene that plays an important role in DNA repair. This is the first study assessing the prevalence of PALB2 mutations in early-onset and familial breast/ovarian cancer patients from Pakistan. MATERIALS AND METHODS PALB2 mutation screening was performed in 370 Pakistani patients with early-onset and familial breast/ovarian cancer, who were negative for BRCA1, BRCA2, TP53, CHEK2, and RAD51C mutations, using denaturing high-performance liquid chromatography analysis. Mutations were confirmed by DNA sequencing. Novel PALB2 alterations were analyzed for their potential effect on protein function or splicing using various in silico prediction tools. Three-hundred and seventy-two healthy controls were screened for the presence of the identified (potentially) functional mutations. RESULTS A novel nonsense mutation, p.Y743*, was identified in one familial breast cancer patient (1/127, 0.8%). Besides, four in silico-predicted potentially functional mutations including three missense mutations and one 5' untranslated region mutation were identified: p.D498Y, novel p.G644R, novel p.E744K, and novel c.-134_-133delTCinsGGGT. The mutations p.Y743* and p.D498Y were identified in two familial patients diagnosed with unilateral or synchronous bilateral breast cancer at the ages of 29 and 39, respectively. The other mutations were identified in an early-onset (≤ 30 years of age) breast cancer patient each. All five mutations were absent in 372 healthy controls suggesting that they are disease associated. CONCLUSION Our findings show that PALB2 mutations account for a small proportion of early-onset and hereditary breast/ovarian cancer cases in Pakistan.
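The prevalence figure quoted in the results (1/127, 0.8%) is a plain proportion. The short sketch below reproduces that arithmetic; the function name is illustrative and nothing beyond the two numbers in the abstract is assumed.

```python
def prevalence_percent(carriers: int, screened: int) -> float:
    """Percentage of screened patients who carry the mutation."""
    if screened <= 0:
        raise ValueError("screened must be a positive count")
    return 100.0 * carriers / screened


# One p.Y743* carrier among 127 familial breast cancer patients.
familial_prevalence = round(prevalence_percent(1, 127), 1)
```

With a single carrier the estimate is necessarily imprecise, which fits the abstract's cautious conclusion that PALB2 mutations account for only a small proportion of cases.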
Affiliation(s)
- Muhammad Usman Rashid
- Department of Basic Sciences Research, Shaukat Khanum Memorial Cancer Hospital and Research Centre (SKMCH&RC), Lahore, Pakistan
- Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Faiz Ali Khan
- Department of Basic Sciences Research, Shaukat Khanum Memorial Cancer Hospital and Research Centre (SKMCH&RC), Lahore, Pakistan
- Noor Muhammad
- Department of Basic Sciences Research, Shaukat Khanum Memorial Cancer Hospital and Research Centre (SKMCH&RC), Lahore, Pakistan
- Asif Loya
- Department of Pathology, Shaukat Khanum Memorial Cancer Hospital and Research Centre (SKMCH&RC), Lahore, Pakistan
- Ute Hamann
- Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany
8
Sutherland HG, Albury CL, Griffiths LR. Advances in genetics of migraine. J Headache Pain 2019; 20:72. [PMID: 31226929 PMCID: PMC6734342 DOI: 10.1186/s10194-019-1017-9] [Citation(s) in RCA: 125] [Impact Index Per Article: 20.8] [Received: 02/02/2019] [Accepted: 05/24/2019] [Indexed: 02/06/2023]
Abstract
Background Migraine is a complex neurovascular disorder with a strong genetic component. There are rare monogenic forms of migraine, as well as more common polygenic forms; research into the genes involved in both types has provided insights into the many contributing genetic factors. This review summarises advances that have been made in the knowledge and understanding of the genes and genetic variations implicated in migraine etiology. Findings Migraine is characterised into two main types, migraine without aura (MO) and migraine with aura (MA). Hemiplegic migraine is a rare monogenic MA subtype caused by mutations in three main genes - CACNA1A, ATP1A2 and SCN1A - which encode ion channel and transport proteins. Functional studies in cellular and animal models show that, in general, mutations result in impaired glutamatergic neurotransmission and cortical hyperexcitability, which make the brain more susceptible to cortical spreading depression, a phenomenon thought to coincide with aura symptoms. Variants in other genes encoding ion channels and solute carriers, or with roles in regulating neurotransmitters at neuronal synapses, or in vascular function, can also cause monogenic migraine, hemiplegic migraine and related disorders with overlapping symptoms. Next-generation sequencing will accelerate the finding of new potentially causal variants and genes, with high-throughput bioinformatics analysis methods and functional analysis pipelines important in prioritising, confirming and understanding the mechanisms of disease-causing variants. With respect to common migraine forms, large genome-wide association studies (GWAS) have greatly expanded our knowledge of the genes involved, emphasizing the role of both neuronal and vascular pathways. Dissecting the genetic architecture of migraine leads to greater understanding of what underpins relationships between subtypes and comorbid disorders, and may have utility in diagnosis or tailoring treatments. 
Further work is required to identify causal polymorphisms and the mechanism of their effect, and studies of gene expression and epigenetic factors will help bridge the genetics with migraine pathophysiology. Conclusions The complexity of migraine disorders is mirrored by their genetic complexity. A comprehensive knowledge of the genetic factors underpinning migraine will lead to improved understanding of molecular mechanisms and pathogenesis, to enable better diagnosis and treatments for migraine sufferers.
Affiliation(s)
- Heidi G Sutherland
- Genomics Research Centre, Institute of Health and Biomedical Innovation. School of Biomedical Sciences, Queensland University of Technology (QUT), Brisbane, QLD, Australia
- Cassie L Albury
- Genomics Research Centre, Institute of Health and Biomedical Innovation. School of Biomedical Sciences, Queensland University of Technology (QUT), Brisbane, QLD, Australia
- Lyn R Griffiths
- Genomics Research Centre, Institute of Health and Biomedical Innovation. School of Biomedical Sciences, Queensland University of Technology (QUT), Brisbane, QLD, Australia
9
Pereira GRC, Tellini GHAS, De Mesquita JF. In silico analysis of PFN1 related to amyotrophic lateral sclerosis. PLoS One 2019; 14:e0215723. [PMID: 31216283 PMCID: PMC6583998 DOI: 10.1371/journal.pone.0215723] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 06/05/2018] [Accepted: 04/09/2019] [Indexed: 12/11/2022]
Abstract
Profilin 1 (PFN1) protein plays key roles in neuronal growth and differentiation, membrane trafficking, and regulation of the actin cytoskeleton. Four natural variants of PFN1 were described as related to ALS, the most common adult-onset motor neuron disorder. However, the pathological mechanism of PFN1 in ALS is not yet completely understood. The goal of this work is to thoroughly analyze the effects of the ALS-related mutations on PFN1 structure and function using computational simulations. Here, PhD-SNP, PMUT, PolyPhen-2, SIFT, SNAP, SNPs&GO, SAAP, nsSNPAnalyzer, SNPeffect4.0 and I-Mutant2.0 were used to predict the functional and stability effects of PFN1 mutations. ConSurf was used for the evolutionary conservation analysis, and GROMACS was used to perform the MD simulations. The mutations C71G, M114T, and G118V, but not E117G, were predicted as deleterious by most of the functional prediction algorithms that were used. The stability prediction indicated that the ALS-related mutations could destabilize PFN1. The ConSurf analysis indicated that the mutations C71G, M114T, E117G, and G118V occur in highly conserved positions. The MD results indicated that the studied mutations could affect the PFN1 flexibility at the actin and PLP-binding domains, and consequently, their intermolecular interactions. These changes may therefore be related to the functional impairment of PFN1 upon the C71G, M114T, E117G and G118V mutations, and to their involvement in ALS development. We also developed a database, SNPMOL (http://www.snpmol.org/), containing the results presented in this paper for biologists and clinicians to exploit PFN1 and its natural variants.
Affiliation(s)
- Gabriel Rodrigues Coutinho Pereira
- Department of Genetics and Molecular Biology, Bioinformatics and Computational Biology Laboratory, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
- Giovanni Henrique Almeida Silva Tellini
- Department of Genetics and Molecular Biology, Bioinformatics and Computational Biology Laboratory, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
- Joelma Freire De Mesquita
- Department of Genetics and Molecular Biology, Bioinformatics and Computational Biology Laboratory, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
10
De Oliveira CCS, Pereira GRC, De Alcantara JYS, Antunes D, Caffarena ER, De Mesquita JF. In silico analysis of the V66M variant of human BDNF in psychiatric disorders: An approach to precision medicine. PLoS One 2019; 14:e0215508. [PMID: 30998730 PMCID: PMC6472887 DOI: 10.1371/journal.pone.0215508] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 01/17/2018] [Accepted: 04/04/2019] [Indexed: 11/19/2022]
Abstract
Brain-derived neurotrophic factor (BDNF) plays an important role in neurogenesis and synapse formation. V66M is the most prevalent BDNF mutation in humans and impairs the function and distribution of BDNF. This mutation is related to several psychiatric disorders. The pro-region of BDNF, particularly position 66 and its adjacent residues, is a determinant of the intracellular sorting and activity-dependent secretion of BDNF. However, it has not yet been fully elucidated. The present study aims to analyze the effects of the V66M mutation on BDNF structure and function. Here, we applied nine algorithms, including SIFT and PolyPhen-2, for functional and stability prediction of the V66M mutation. The complete theoretical model of BDNF was generated by Rosetta and validated by the PROCHECK, RAMPAGE, ProSa, QMEAN and Verify-3D algorithms. Structural alignment was performed using TM-align. Phylogenetic analysis was performed using the ConSurf server. Molecular dynamics (MD) simulations were performed and analyzed using the GROMACS 2018.2 package. The V66M mutation was predicted as deleterious by PolyPhen-2 and SIFT in addition to being predicted as destabilizing by I-Mutant. According to SNPeffect, the V66M mutation does not affect protein aggregation, amyloid propensity, and chaperone binding. The complete theoretical structure of BDNF proved to be a reliable model. Phylogenetic analysis indicated that the V66M mutation of BDNF occurs at a non-conserved position of the protein. MD analyses indicated that the V66M mutation does not affect the BDNF flexibility and surface-to-volume ratio, but affects the BDNF essential motions, hydrogen-bonding and secondary structure, particularly at its pre- and pro-domain, which are crucial for its activity and distribution. Thus, considering that these parameters are determinants of protein interactions and, consequently, protein function, the alterations observed throughout the MD analyses may be related to the functional impairment of BDNF upon V66M mutation, as well as its involvement in psychiatric disorders.
Affiliation(s)
- Clara Carolina Silva De Oliveira
- Department of Genetics and Molecular Biology, Bioinformatics and Computational Biology Laboratory, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
- Gabriel Rodrigues Coutinho Pereira
- Department of Genetics and Molecular Biology, Bioinformatics and Computational Biology Laboratory, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
- Jamile Yvis Santos De Alcantara
- Department of Genetics and Molecular Biology, Bioinformatics and Computational Biology Laboratory, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
- Deborah Antunes
- Computational Biophysics and Molecular Modeling Group, Scientific Computing Program (PROCC), Fundação Oswaldo Cruz, Manguinhos, Rio de Janeiro, Brazil
- Ernesto Raul Caffarena
- Computational Biophysics and Molecular Modeling Group, Scientific Computing Program (PROCC), Fundação Oswaldo Cruz, Manguinhos, Rio de Janeiro, Brazil
- Joelma Freire De Mesquita
- Department of Genetics and Molecular Biology, Bioinformatics and Computational Biology Laboratory, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Rio de Janeiro, Brazil
11
Chen HH, Petty LE, Bush W, Naj AC, Below JE. GWAS and Beyond: Using Omics Approaches to Interpret SNP Associations. Current Genetic Medicine Reports 2019; 7:30-40. [PMID: 33312764 PMCID: PMC7731888 DOI: 10.1007/s40142-019-0159-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
Abstract
PURPOSE OF REVIEW: Neurodegenerative diseases, neuropsychiatric disorders, and related traits have highly complex etiologies but are also highly heritable, and identifying the causal genes and biological pathways underlying these traits may advance the development of treatments and preventive strategies. While many genome-wide association studies (GWAS) have successfully identified variants contributing to polygenic neurodegenerative and neuropsychiatric phenotypes, including Alzheimer's disease (AD), schizophrenia (SCZ), and bipolar disorder (BPD), among others, interpreting the biological roles of significantly associated variants in the genetic architecture of these traits remains a significant challenge. Here we review several 'omics' approaches that attempt to bridge the gap from associated genetic variants to phenotype by helping define the functional roles of GWAS loci in the development of neuropsychiatric disorders and traits.
RECENT FINDINGS: Several common 'omics' approaches have been applied to examine neuropsychiatric traits, such as nearest-gene mapping, trans-ethnic fine mapping, annotation enrichment analysis, transcriptomic analysis, and pathway analysis, and each of these approaches has strengths and limitations in providing insight into biological mechanisms. One popular emerging method is the examination of tissue-specific genetically-regulated gene expression (GReX), which aggregates the genetic variants' effects at the gene level. Furthermore, proteomic, metabolomic, and microbiomic studies and phenome-wide association studies will further enhance our understanding of neuropsychiatric traits.
SUMMARY: GWAS has been applied to neuropsychiatric traits for a decade, but our understanding of the biological function of identified variants remains limited. Today, technological advancements have created analytical approaches for integrating transcriptomics, metabolomics, proteomics, pharmacology, and toxicology as tools for understanding the functional roles of genetic variants. These data, as well as the broader clinical information provided by electronic health records, can provide additional insight and complement genomic analyses.
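The gene-level aggregation behind GReX can be illustrated with a minimal sketch. It assumes the commonly used linear form, in which predicted expression for a gene is a weighted sum of allele dosages at nearby variants; the variant IDs, weights, and dosages are hypothetical and are not taken from the review.

```python
# Minimal sketch of genetically regulated expression (GReX) for one gene.
# Assumption: a linear model, GReX = sum_i(weight_i * dosage_i), where the
# weights would come from a tissue-specific expression reference panel.
# All variant IDs and numbers below are hypothetical.

def predict_grex(weights, dosages):
    """Aggregate variant-level effects into a single gene-level value."""
    total = 0.0
    for variant_id, weight in weights.items():
        # Variants absent from the genotype data contribute nothing.
        total += weight * dosages.get(variant_id, 0.0)
    return total


# Hypothetical per-variant weights for one gene in one tissue.
weights = {"var_a": 0.30, "var_b": -0.10, "var_c": 0.05}
# Hypothetical allele dosages (0 to 2) for one individual; var_c not genotyped.
dosages = {"var_a": 2.0, "var_b": 1.0}

grex = predict_grex(weights, dosages)
print(f"Predicted GReX: {grex:.2f}")
```

The resulting gene-level values, rather than individual variants, would then be tested for association with the trait.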
Affiliation(s)
- Hung-Hsin Chen
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Lauren E. Petty
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - William Bush
- Institute for Computational Biology, Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
| | - Adam C. Naj
- Department of Biostatistics, Epidemiology, and Informatics; Department of Pathology and Laboratory Medicine; Center for Clinical Epidemiology and Biostatistics; Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Jennifer E. Below
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| |
|
12
|
Pereira GRC, Da Silva ANR, Do Nascimento SS, De Mesquita JF. In silico analysis and molecular dynamics simulation of human superoxide dismutase 3 (SOD3) genetic variants. J Cell Biochem 2018; 120:3583-3598. [DOI: 10.1002/jcb.27636] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2018] [Accepted: 08/16/2018] [Indexed: 01/05/2023]
Affiliation(s)
- G. R. C. Pereira
- Department of Genetics and Molecular Biology, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Brazil
| | - A. N. R. Da Silva
- Department of Genetics and Molecular Biology, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Brazil
| | - S. S. Do Nascimento
- Department of Genetics and Molecular Biology, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Brazil
| | - J. F. De Mesquita
- Department of Genetics and Molecular Biology, Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Brazil
| |
|
13
|
Goodacre N, Edwards N, Danielsen M, Uetz P, Wu C. Predicting nsSNPs that Disrupt Protein-Protein Interactions Using Docking. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:1082-1093. [PMID: 26812731 DOI: 10.1109/tcbb.2016.2520931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
The human genome contains a large number of protein polymorphisms due to individual genome variation. How many of these polymorphisms lead to altered protein-protein interaction is unknown. We have developed a method to address this question. The intersection of the SKEMPI database (of affinity constants among interacting proteins) and the CAPRI 4.0 docking benchmark was docked using HADDOCK, leading to a training set of 166 mutant pairs. A random forest classifier based on the differences in docking scores between the 166 mutant pairs and their wild types was used to distinguish between variants that have either completely or partially lost binding ability. Fifty percent of non-binders were correctly predicted with a false discovery rate of only 2 percent. The model was tested on a set of 15 HIV-1-human, as well as seven human-human glioblastoma-related, mutant protein pairs: 50 percent of combined non-binders were correctly predicted with a false discovery rate of 10 percent. The model was also used to identify 10 protein-protein interactions between human proteins and their HIV-1 partners that are likely to be abolished by rare non-synonymous single-nucleotide polymorphisms (nsSNPs). These nsSNPs may represent novel and potentially therapeutically valuable targets for anti-viral therapy by disruption of viral binding.
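The two performance figures quoted in this abstract, the fraction of non-binders recovered and the false discovery rate, follow from simple counts. The sketch below shows the arithmetic; the counts are hypothetical values chosen to give 50 percent recovery and a 2 percent false discovery rate, and are not the study's data.

```python
# Minimal sketch: fraction of true non-binders recovered (recall) and
# false discovery rate (FDR) from prediction counts.
# The counts used below are hypothetical, not taken from the study.

def recall_and_fdr(true_pos, false_pos, false_neg):
    """Return (recall, FDR) for the 'non-binder' class."""
    predicted_pos = true_pos + false_pos   # everything called a non-binder
    actual_pos = true_pos + false_neg      # every real non-binder
    recall = true_pos / actual_pos if actual_pos else 0.0
    fdr = false_pos / predicted_pos if predicted_pos else 0.0
    return recall, fdr


recall, fdr = recall_and_fdr(true_pos=49, false_pos=1, false_neg=49)
print(f"recall = {recall:.0%}, FDR = {fdr:.0%}")
```

A classifier tuned this way makes few calls, but the calls it does make are rarely wrong, which suits prioritizing candidates for experimental follow-up.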
|
14
|
Balasubramanian S, Fu Y, Pawashe M, McGillivray P, Jin M, Liu J, Karczewski KJ, MacArthur DG, Gerstein M. Using ALoFT to determine the impact of putative loss-of-function variants in protein-coding genes. Nat Commun 2017; 8:382. [PMID: 28851873 PMCID: PMC5575292 DOI: 10.1038/s41467-017-00443-5] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2016] [Accepted: 06/29/2017] [Indexed: 11/09/2022] Open
Abstract
Variants predicted to result in the loss of function of human genes have attracted interest because of their clinical impact and surprising prevalence in healthy individuals. Here, we present ALoFT (annotation of loss-of-function transcripts), a method to annotate and predict the disease-causing potential of loss-of-function variants. Using data from Mendelian disease-gene discovery projects, we show that ALoFT can distinguish between loss-of-function variants that are deleterious as heterozygotes and those causing disease only in the homozygous state. Investigation of variants discovered in healthy populations suggests that each individual carries at least two heterozygous premature stop alleles that could potentially lead to disease if present as homozygotes. When applied to de novo putative loss-of-function variants in autism-affected families, ALoFT distinguishes between deleterious variants in patients and benign variants in unaffected siblings. Finally, analysis of somatic variants in >6500 cancer exomes shows that putative loss-of-function variants predicted to be deleterious by ALoFT are enriched in known driver genes.
Variants causing loss of function (LoF) of human genes have clinical implications. Here, the authors present a method to predict disease-causing potential of LoF variants, ALoFT (annotation of Loss-of-Function Transcripts) and show its application to interpreting LoF variants in different contexts.
Affiliation(s)
- Suganthi Balasubramanian
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06520, USA.
- Molecular Biophysics and Biochemistry Department, Yale University, New Haven, CT, 06520, USA.
- Regeneron Genetics Center, Tarrytown, NY, 10591, USA.
| | - Yao Fu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06520, USA
- Bina Technologies, Part of Roche Sequencing, Belmont, CA, 94002, USA
| | - Mayur Pawashe
- Molecular Biophysics and Biochemistry Department, Yale University, New Haven, CT, 06520, USA
| | - Patrick McGillivray
- Molecular Biophysics and Biochemistry Department, Yale University, New Haven, CT, 06520, USA
| | - Mike Jin
- Molecular Biophysics and Biochemistry Department, Yale University, New Haven, CT, 06520, USA
| | - Jeremy Liu
- Molecular Biophysics and Biochemistry Department, Yale University, New Haven, CT, 06520, USA
| | - Konrad J Karczewski
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, 02142, USA
| | - Daniel G MacArthur
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, 02142, USA
| | - Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06520, USA.
- Molecular Biophysics and Biochemistry Department, Yale University, New Haven, CT, 06520, USA.
- Department of Computer Science, Yale University, New Haven, CT, 06520, USA.
| |
|
15
|
Lugo-Martinez J, Pejaver V, Pagel KA, Jain S, Mort M, Cooper DN, Mooney SD, Radivojac P. The Loss and Gain of Functional Amino Acid Residues Is a Common Mechanism Causing Human Inherited Disease. PLoS Comput Biol 2016; 12:e1005091. [PMID: 27564311 PMCID: PMC5001644 DOI: 10.1371/journal.pcbi.1005091] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2015] [Accepted: 08/02/2016] [Indexed: 01/12/2023] Open
Abstract
Elucidating the precise molecular events altered by disease-causing genetic variants represents a major challenge in translational bioinformatics. To this end, many studies have investigated the structural and functional impact of amino acid substitutions. Most of these studies were however limited in scope to either individual molecular functions or were concerned with functional effects (e.g. deleterious vs. neutral) without specifically considering possible molecular alterations. The recent growth of structural, molecular and genetic data presents an opportunity for more comprehensive studies to consider the structural environment of a residue of interest, to hypothesize specific molecular effects of sequence variants and to statistically associate these effects with genetic disease. In this study, we analyzed data sets of disease-causing and putatively neutral human variants mapped to protein 3D structures as part of a systematic study of the loss and gain of various types of functional attribute potentially underlying pathogenic molecular alterations. We first propose a formal model to assess probabilistically function-impacting variants. We then develop an array of structure-based functional residue predictors, evaluate their performance, and use them to quantify the impact of disease-causing amino acid substitutions on catalytic activity, metal binding, macromolecular binding, ligand binding, allosteric regulation and post-translational modifications. We show that our methodology generates actionable biological hypotheses for up to 41% of disease-causing genetic variants mapped to protein structures suggesting that it can be reliably used to guide experimental validation. Our results suggest that a significant fraction of disease-causing human variants mapping to protein structures are function-altering both in the presence and absence of stability disruption. 
Identifying the molecular changes caused by mutations is a major challenge in understanding and treating human genetic disease. To address this problem, we have developed a wide range of profiling tools designed to predict specific types of functional site from protein 3D structures. We then apply these tools to data sets of inherited disease-associated and putatively neutral amino acid substitutions and estimate the relative contribution of the loss and gain of functional residues in disease. Our results suggest that alterations of molecular function are involved in a significant number of cases of human genetic disease and are over-represented as compared to putatively neutral variants. Additionally, we use experimental data to show that it is possible to computationally identify the loss of specific functional events in disease pathogenesis. Finally, our methodology can be used to reliably identify the potential molecular consequences of disease-causing genetic variants and hence prioritize experimental validation.
Affiliation(s)
- Jose Lugo-Martinez
- Department of Computer Science and Informatics, Indiana University, Bloomington, Indiana, United States of America
| | - Vikas Pejaver
- Department of Computer Science and Informatics, Indiana University, Bloomington, Indiana, United States of America
| | - Kymberleigh A. Pagel
- Department of Computer Science and Informatics, Indiana University, Bloomington, Indiana, United States of America
| | - Shantanu Jain
- Department of Computer Science and Informatics, Indiana University, Bloomington, Indiana, United States of America
| | - Matthew Mort
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom
| | - David N. Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom
| | - Sean D. Mooney
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, United States of America
| | - Predrag Radivojac
- Department of Computer Science and Informatics, Indiana University, Bloomington, Indiana, United States of America
| |
|
16
|
Kamaraj B, Purohit R. Mutational Analysis on Membrane Associated Transporter Protein (MATP) and Their Structural Consequences in Oculocutaeous Albinism Type 4 (OCA4)-A Molecular Dynamics Approach. J Cell Biochem 2016; 117:2608-19. [PMID: 27019209 DOI: 10.1002/jcb.25555] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2016] [Accepted: 03/24/2016] [Indexed: 12/11/2022]
Abstract
Oculocutaneous albinism type IV (OCA4) is an autosomal recessive disorder characterized by reduced biosynthesis of melanin pigment in the skin, hair, and eyes, and caused by mutations in the membrane-associated transporter protein (MATP) encoded by the SLC45A2 gene. MATP consists of 530 amino acids, contains 12 putative transmembrane domains, plays an important role in pigmentation, and probably functions as a membrane transporter in melanosomes. We examined the OCA4 disease-associated mutations in the SLC45A2 gene and their structural consequences. To understand the atomic arrangement in 3D space, the native and mutant structures were modeled. The structural behavior of native and mutant MATP was then investigated by molecular dynamics simulation (MDS) in an explicit lipid and water environment. We found Y317C to be the most deleterious disease-associated SNP in the SLC45A2 gene. In MDS, mutant MATP lost stability and became more flexible, which alters its structural conformation and function. This behavior points to a significant role in inducing OCA4. Our study advances the understanding of the molecular mechanism of MATP upon mutation at the atomic level and may help pharmacogenomics efforts to develop personalized medicine for OCA4. J. Cell. Biochem. 117: 2608-2619, 2016. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Balu Kamaraj
- Research Group PLASMANT, Department of Chemistry, University of Antwerp, Universiteitsplein 1, 2610, Wilrijk-Antwerp, Belgium
| | - Rituraj Purohit
- Department of Biotechnology, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India.
| |
|
17
|
Riahi A, Messaoudi A, Mrad R, Fourati A, Chabouni-Bouhamed H, Kharrat M. Molecular characterization, homology modeling and docking studies of the R2787H missense variation in BRCA2 gene: Association with breast cancer. J Theor Biol 2016; 403:188-196. [DOI: 10.1016/j.jtbi.2016.05.013] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2015] [Revised: 03/12/2016] [Accepted: 05/04/2016] [Indexed: 10/21/2022]
|
18
|
Sheynkman GM, Shortreed MR, Cesnik AJ, Smith LM. Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation. ANNUAL REVIEW OF ANALYTICAL CHEMISTRY (PALO ALTO, CALIF.) 2016; 9:521-45. [PMID: 27049631 PMCID: PMC4991544 DOI: 10.1146/annurev-anchem-071015-041722] [Citation(s) in RCA: 73] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/09/2023]
Abstract
Mass spectrometry-based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications.
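The core proteogenomic step described above, deriving candidate protein sequences from nucleotide sequences, can be sketched in a few lines. The example below applies a single-nucleotide variant to a coding sequence and translates both versions with the standard genetic code; the sequence, the variant position, and the helper names `apply_snv` and `translate` are hypothetical, chosen only for illustration.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def apply_snv(cds, position, alt_base):
    """Return the coding sequence with one base substituted (0-based position)."""
    return cds[:position] + alt_base + cds[position + 1:]


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy coding sequence and variant (C>T at 0-based position 4).
reference_cds = "ATGGCTAAAGAATGA"
variant_cds = apply_snv(reference_cds, 4, "T")

reference_protein = translate(reference_cds)
variant_protein = translate(variant_cds)
print(reference_protein, variant_protein)
```

In a real workflow the variant sequences would come from the sample's own sequencing data, and the translated variants would be appended to the search database so that sample-specific peptides can be matched.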
Affiliation(s)
- Gloria M Sheynkman
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts 02215
- Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
| | - Michael R Shortreed
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
| | - Anthony J Cesnik
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
| | - Lloyd M Smith
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
- Genome Center of Wisconsin, University of Wisconsin, Madison, Wisconsin 53706
| |
|
19
|
Tang H, Thomas PD. Tools for Predicting the Functional Impact of Nonsynonymous Genetic Variation. Genetics 2016; 203:635-47. [PMID: 27270698 PMCID: PMC4896183 DOI: 10.1534/genetics.116.190033] [Citation(s) in RCA: 75] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2015] [Accepted: 04/01/2016] [Indexed: 01/09/2023] Open
Abstract
As personal genome sequencing becomes a reality, understanding the effects of genetic variants on phenotype, particularly the impact of germline variants on disease risk and the impact of somatic variants on cancer development and treatment, continues to increase in importance. Because of their clear potential for affecting phenotype, nonsynonymous genetic variants (variants that cause a change in the amino acid sequence of a protein encoded by a gene) have long been the target of efforts to predict the effects of genetic variation. Whole-genome sequencing is identifying large numbers of nonsynonymous variants in each genome, intensifying the need for computational methods that accurately predict which of these are likely to impact disease phenotypes. This review focuses on nonsynonymous variant prediction with two aims in mind: (1) to review the prioritization methods that have been developed to date and the principles on which they are based and (2) to discuss the challenges to further improving these methods.
Affiliation(s)
- Haiming Tang
- Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, California 90033
| | - Paul D Thomas
- Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, California 90033
| |
|
20
|
Masica DL, Karchin R. Towards Increasing the Clinical Relevance of In Silico Methods to Predict Pathogenic Missense Variants. PLoS Comput Biol 2016; 12:e1004725. [PMID: 27171182 PMCID: PMC4865359 DOI: 10.1371/journal.pcbi.1004725] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023] Open
Affiliation(s)
- David L. Masica
- Department of Biomedical Engineering and The Institute for Computational Medicine, The Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Rachel Karchin
- Department of Biomedical Engineering and The Institute for Computational Medicine, The Johns Hopkins University, Baltimore, Maryland, United States of America
- Department of Oncology, The Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
| |
|
21
|
Niroula A, Vihinen M. Variation Interpretation Predictors: Principles, Types, Performance, and Choice. Hum Mutat 2016; 37:579-97. [DOI: 10.1002/humu.22987] [Citation(s) in RCA: 90] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2015] [Accepted: 03/07/2016] [Indexed: 12/18/2022]
Affiliation(s)
- Abhishek Niroula
- Department of Experimental Medical Science; Lund University; BMC B13 Lund SE-22184 Sweden
| | - Mauno Vihinen
- Department of Experimental Medical Science; Lund University; BMC B13 Lund SE-22184 Sweden
| |
|
22
|
Babbitt GA, Coppola EE, Alawad MA, Hudson AO. Can all heritable biology really be reduced to a single dimension? Gene 2016; 578:162-8. [DOI: 10.1016/j.gene.2015.12.043] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2015] [Revised: 12/16/2015] [Accepted: 12/17/2015] [Indexed: 12/23/2022]
|
23
|
Moorcraft SY, Gonzalez D, Walker BA. Understanding next generation sequencing in oncology: A guide for oncologists. Crit Rev Oncol Hematol 2015; 96:463-74. [DOI: 10.1016/j.critrevonc.2015.06.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2014] [Revised: 05/21/2015] [Accepted: 06/17/2015] [Indexed: 12/17/2022] Open
|
24
|
Li MJ, Liu Z, Wang P, Wong MP, Nelson MR, Kocher JPA, Yeager M, Sham PC, Chanock SJ, Xia Z, Wang J. GWASdb v2: an update database for human genetic variants identified by genome-wide association studies. Nucleic Acids Res 2015; 44:D869-76. [PMID: 26615194 PMCID: PMC4702921 DOI: 10.1093/nar/gkv1317] [Citation(s) in RCA: 150] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2015] [Accepted: 11/10/2015] [Indexed: 12/19/2022] Open
Abstract
Genome-wide association studies (GWASs), now a routine approach for studying single-nucleotide polymorphism (SNP)-trait associations, have uncovered over ten thousand significant trait/disease associated SNPs (TASs). Here, we updated GWASdb (GWASdb v2, http://jjwanglab.org/gwasdb), which provides comprehensive data curation and knowledge integration for GWAS TASs. These updates include: (i) Up to August 2015, we collected 2479 unique publications from PubMed and other resources; (ii) We further curated moderate SNP-trait associations (P-value < 1.0×10⁻³) from each original publication, and generated a total of 252 530 unique TASs in all GWASdb v2 collected studies; (iii) We manually mapped 1610 GWAS traits to 501 Human Phenotype Ontology (HPO) terms, 435 Disease Ontology (DO) terms and 228 Disease Ontology Lite (DOLite) terms. For each ontology term, we also predicted the putative causal genes; (iv) We curated the detailed sub-populations and related sample size for each study; (v) Importantly, we performed extensive functional annotation for each TAS by incorporating gene-based information, ENCODE ChIP-seq assays, eQTL, population haplotype, functional prediction across multiple biological domains, evolutionary signals and disease-related annotation; (vi) Additionally, we compiled a SNP-drug response association dataset for 650 pharmacogenetic studies involving 257 drugs in this update; (vii) Last, we improved the user interface of the website.
Affiliation(s)
- Mulin Jun Li
- Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Zipeng Liu
- Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Department of Anaesthesiology, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Panwen Wang
- Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Maria P Wong
- Department of Pathology, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Matthew R Nelson
- Quantitative Sciences, GlaxoSmithKline, Research Triangle Park, NC, USA
| | - Jean-Pierre A Kocher
- Division of Biomedical Statistics and Informatics, Mayo Clinic College of Medicine, Rochester, MN, USA
| | - Meredith Yeager
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Bethesda, MD, USA
| | - Pak Chung Sham
- Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- State Key Laboratory of Brain and Cognitive Sciences and Department of Psychiatry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Stephen J Chanock
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Bethesda, MD, USA
| | - Zhengyuan Xia
- Department of Anaesthesiology, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Junwen Wang
- Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| |
|
25
|
Pavlopoulos GA, Malliarakis D, Papanikolaou N, Theodosiou T, Enright AJ, Iliopoulos I. Visualizing genome and systems biology: technologies, tools, implementation techniques and trends, past, present and future. Gigascience 2015; 4:38. [PMID: 26309733 PMCID: PMC4548842 DOI: 10.1186/s13742-015-0077-2] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2015] [Accepted: 08/03/2015] [Indexed: 01/31/2023] Open
Abstract
"A picture is worth a thousand words." This widely used adage sums up in a few words the notion that a successful visual representation of a concept should enable easy and rapid absorption of large amounts of information. Although, in general, the notion of capturing complex ideas using images is very appealing, would 1000 words be enough to describe the unknown in a research field such as the life sciences? Life sciences is one of the biggest generators of enormous datasets, mainly as a result of recent and rapid technological advances; their complexity can make these datasets incomprehensible without effective visualization methods. Here we discuss the past, present and future of genomic and systems biology visualization. We briefly comment on many visualization and analysis tools and the purposes that they serve. We focus on the latest libraries and programming languages that enable more effective, efficient and faster approaches for visualizing biological concepts, and also comment on the future human-computer interaction trends that would further enhance visualization.
Affiliation(s)
- Georgios A Pavlopoulos
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
| | | | - Nikolas Papanikolaou
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
| | - Theodosis Theodosiou
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
| | - Anton J Enright
- EMBL - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SD UK
| | - Ioannis Iliopoulos
- Bioinformatics & Computational Biology Laboratory, Division of Basic Sciences, University of Crete, Medical School, 70013 Heraklion, Crete Greece
| |
|
26
|
Assessment of the predictive accuracy of five in silico prediction tools, alone or in combination, and two metaservers to classify long QT syndrome gene mutations. BMC MEDICAL GENETICS 2015; 16:34. [PMID: 25967940 PMCID: PMC4630850 DOI: 10.1186/s12881-015-0176-z] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/03/2014] [Accepted: 04/22/2015] [Indexed: 11/27/2022]
Abstract
Background Long QT syndrome (LQTS) is an autosomal dominant condition predisposing to sudden death from malignant arrhythmia. Genetic testing identifies many missense single nucleotide variants of uncertain pathogenicity. Establishing genetic pathogenicity is an essential prerequisite to family cascade screening. Many laboratories use in silico prediction tools, either alone or in combination, or metaservers, in order to predict pathogenicity; however, their accuracy in the context of LQTS is unknown. We evaluated the accuracy of five in silico programs and two metaservers in the analysis of LQTS 1–3 gene variants. Methods The in silico tools SIFT, PolyPhen-2, PROVEAN, SNPs&GO and SNAP, either alone or in all possible combinations, and the metaservers Meta-SNP and PredictSNP, were tested on 312 KCNQ1, KCNH2 and SCN5A gene variants that have previously been characterised by either in vitro or co-segregation studies as either “pathogenic” (283) or “benign” (29). The accuracy, sensitivity, specificity and Matthews Correlation Coefficient (MCC) were calculated to determine the best combination of in silico tools for each LQTS gene, and when all genes are combined. Results The best combination of in silico tools for KCNQ1 is PROVEAN, SNPs&GO and SIFT (accuracy 92.7%, sensitivity 93.1%, specificity 100% and MCC 0.70). The best combination of in silico tools for KCNH2 is SIFT and PROVEAN or PROVEAN, SNPs&GO and SIFT. Both combinations have the same scores for accuracy (91.1%), sensitivity (91.5%), specificity (87.5%) and MCC (0.62). In the case of SCN5A, SNAP and PROVEAN provided the best combination (accuracy 81.4%, sensitivity 86.9%, specificity 50.0%, and MCC 0.32). When all three LQT genes are combined, SIFT, PROVEAN and SNAP is the combination with the best performance (accuracy 82.7%, sensitivity 83.0%, specificity 80.0%, and MCC 0.44). 
Both metaservers performed better than the single in silico tools; however, they did not perform better than the best performing combination of in silico tools. Conclusions The combination of in silico tools with the best performance is gene-dependent. The in silico tools reported here may have some value in assessing variants in the KCNQ1 and KCNH2 genes, but caution should be taken when the analysis is applied to SCN5A gene variants.
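The four performance measures reported in this abstract are all derived from a 2×2 confusion matrix. The sketch below shows the standard formulas; the counts are hypothetical and deliberately imbalanced toward pathogenic variants, and they do not reproduce the study's data.

```python
import math

# Minimal sketch: accuracy, sensitivity, specificity and Matthews
# Correlation Coefficient (MCC) from a 2x2 confusion matrix.
# "Positive" means predicted/known pathogenic; counts are hypothetical.

def classification_metrics(tp, fp, tn, fn):
    """Compute the four measures from true/false positive/negative counts."""
    total = tp + fp + tn + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # pathogenic variants recovered
        "specificity": tn / (tn + fp),   # benign variants recovered
        # MCC is defined as 0 when any marginal total is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical imbalanced set: 100 pathogenic and 10 benign variants.
metrics = classification_metrics(tp=90, fp=2, tn=8, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these hypothetical counts, sensitivity is 0.90 and specificity 0.80, yet the MCC is only about 0.54. This illustrates why a data set dominated by pathogenic variants can show high accuracy alongside a modest MCC.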
|
27
|
Leong IUS, Stuckey A, Lai D, Skinner JR, Love DR. Assessment of the predictive accuracy of five in silico prediction tools, alone or in combination, and two metaservers to classify long QT syndrome gene mutations. BMC MEDICAL GENETICS 2015. [PMID: 25967940 DOI: 10.1186/s12881-015-0176-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
BACKGROUND Long QT syndrome (LQTS) is an autosomal dominant condition predisposing to sudden death from malignant arrhythmia. Genetic testing identifies many missense single nucleotide variants of uncertain pathogenicity. Establishing genetic pathogenicity is an essential prerequisite to family cascade screening. Many laboratories use in silico prediction tools, either alone or in combination, or metaservers, in order to predict pathogenicity; however, their accuracy in the context of LQTS is unknown. We evaluated the accuracy of five in silico programs and two metaservers in the analysis of LQTS 1-3 gene variants. METHODS The in silico tools SIFT, PolyPhen-2, PROVEAN, SNPs&GO and SNAP, either alone or in all possible combinations, and the metaservers Meta-SNP and PredictSNP, were tested on 312 KCNQ1, KCNH2 and SCN5A gene variants that have previously been characterised by either in vitro or co-segregation studies as either "pathogenic" (283) or "benign" (29). The accuracy, sensitivity, specificity and Matthews Correlation Coefficient (MCC) were calculated to determine the best combination of in silico tools for each LQTS gene, and when all genes are combined. RESULTS The best combination of in silico tools for KCNQ1 is PROVEAN, SNPs&GO and SIFT (accuracy 92.7%, sensitivity 93.1%, specificity 100% and MCC 0.70). The best combination of in silico tools for KCNH2 is SIFT and PROVEAN or PROVEAN, SNPs&GO and SIFT. Both combinations have the same scores for accuracy (91.1%), sensitivity (91.5%), specificity (87.5%) and MCC (0.62). In the case of SCN5A, SNAP and PROVEAN provided the best combination (accuracy 81.4%, sensitivity 86.9%, specificity 50.0%, and MCC 0.32). When all three LQT genes are combined, SIFT, PROVEAN and SNAP is the combination with the best performance (accuracy 82.7%, sensitivity 83.0%, specificity 80.0%, and MCC 0.44). 
Both metaservers performed better than the single in silico tools; however, they did not perform better than the best performing combination of in silico tools. CONCLUSIONS The combination of in silico tools with the best performance is gene-dependent. The in silico tools reported here may have some value in assessing variants in the KCNQ1 and KCNH2 genes, but caution should be taken when the analysis is applied to SCN5A gene variants.
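The four performance measures named in this abstract can all be derived from a binary confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' code; the function name and the example counts are hypothetical and are not taken from the study.

```python
from math import sqrt


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts.

    tp/fn: pathogenic variants predicted pathogenic / benign
    tn/fp: benign variants predicted benign / pathogenic
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one pathogenic and one benign variant")
    total = tp + fn + tn + fp
    # The MCC denominator is zero when every variant gets the same prediction;
    # reporting 0.0 in that case is a common convention.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts for illustration only (not data from the study).
metrics = classification_metrics(tp=8, fn=2, tn=6, fp=4)
```

With a reference set as imbalanced as the one described (283 pathogenic versus 29 benign variants), accuracy is dominated by sensitivity, which is one reason to report MCC alongside it.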
Affiliation(s)
- Ivone U S Leong
- Diagnostic Genetics, LabPlus, Auckland City Hospital, Auckland, New Zealand.
| | - Alexander Stuckey
- Bioinformatics Institute, University of Auckland, Auckland, New Zealand.
| | - Daniel Lai
- Green Lane Paediatric and Congenital Cardiac Services, Starship Children's Hospital, Private Bag 92024, Auckland, 1142, New Zealand.
| | - Jonathan R Skinner
- Green Lane Paediatric and Congenital Cardiac Services, Starship Children's Hospital, Private Bag 92024, Auckland, 1142, New Zealand; Cardiac Inherited Disease Group, Auckland City Hospital, Auckland, New Zealand; Department of Child Health, University of Auckland, Auckland, New Zealand.
| | - Donald R Love
- Department of Child Health, University of Auckland, Auckland, New Zealand.
| |
|
28
|
Ehmke N, Caliebe A, Koenig R, Kant SG, Stark Z, Cormier-Daire V, Wieczorek D, Gillessen-Kaesbach G, Hoff K, Kawalia A, Thiele H, Altmüller J, Fischer-Zirnsak B, Knaus A, Zhu N, Heinrich V, Huber C, Harabula I, Spielmann M, Horn D, Kornak U, Hecht J, Krawitz PM, Nürnberg P, Siebert R, Manzke H, Mundlos S. Homozygous and compound-heterozygous mutations in TGDS cause Catel-Manzke syndrome. Am J Hum Genet 2014; 95:763-70. [PMID: 25480037 PMCID: PMC4259972 DOI: 10.1016/j.ajhg.2014.11.004] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 09/22/2014] [Accepted: 11/10/2014] [Indexed: 12/30/2022] Open
Abstract
Catel-Manzke syndrome is characterized by Pierre Robin sequence and a unique form of bilateral hyperphalangy causing a clinodactyly of the index finger. We describe the identification of homozygous and compound heterozygous mutations in TGDS in seven unrelated individuals with typical Catel-Manzke syndrome by exome sequencing. Six different TGDS mutations were detected: c.892A>G (p.Asn298Asp), c.270_271del (p.Lys91Asnfs*22), c.298G>T (p.Ala100Ser), c.294T>G (p.Phe98Leu), c.269A>G (p.Glu90Gly), and c.700T>C (p.Tyr234His), all predicted to be disease causing. By using haplotype reconstruction we showed that the mutation c.298G>T is probably a founder mutation. Due to the spectrum of the amino acid changes, we suggest that loss of function in TGDS is the underlying mechanism of Catel-Manzke syndrome. TGDS (dTDP-D-glucose 4,6-dehydrogenase) is a conserved protein belonging to the SDR family and probably plays a role in nucleotide sugar metabolism.
Affiliation(s)
- Nadja Ehmke
- Institute of Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany.
| | - Almuth Caliebe
- Institute of Human Genetics, Christian-Albrechts-University Kiel & University Hospital Schleswig-Holstein, Campus Kiel, 24105 Kiel, Germany
| | - Rainer Koenig
- Institute of Human Genetics, Goethe-University Frankfurt, 60590 Frankfurt am Main, Germany
| | - Sarina G Kant
- Department of Clinical Genetics, Leiden University Medical Center, 2300 RC Leiden, the Netherlands
| | - Zornitza Stark
- Victorian Clinical Genetics Service, Murdoch Children's Research Institute, Parkville, VIC 3052, Australia
| | - Valérie Cormier-Daire
- Department of Genetics, INSERM UMR 1163, Université Paris Descartes-Sorbonne Paris Cité, Imagine Institute, Hôpital Necker-Enfants Malades, 75015 Paris, France
| | - Dagmar Wieczorek
- Institut für Humangenetik, Universitätsklinikum Essen, Universität Duisburg-Essen, 45122 Essen, Germany
| | | | - Kirstin Hoff
- Institute of Human Genetics, Christian-Albrechts-University Kiel & University Hospital Schleswig-Holstein, Campus Kiel, 24105 Kiel, Germany; Department of Congenital Heart Disease and Pediatric Cardiology, Christian-Albrechts-University Kiel & University Hospital Schleswig-Holstein, Campus Kiel, 24105 Kiel, Germany; DZHK (German Centre for Cardiovascular Research), partner site Hamburg/Kiel/Lübeck, 24105 Kiel, Germany
| | - Amit Kawalia
- Cologne Center for Genomics (CCG), University of Cologne, 50931 Cologne, Germany
| | - Holger Thiele
- Cologne Center for Genomics (CCG), University of Cologne, 50931 Cologne, Germany
| | - Janine Altmüller
- Cologne Center for Genomics (CCG), University of Cologne, 50931 Cologne, Germany; Institute of Human Genetics, University of Cologne, 50931 Cologne, Germany
| | - Björn Fischer-Zirnsak
- Institute of Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Alexej Knaus
- Institute of Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Na Zhu
- Institute of Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany
| | - Verena Heinrich
- Institute of Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany
| | - Celine Huber
- Department of Genetics, INSERM UMR 1163, Université Paris Descartes-Sorbonne Paris Cité, Imagine Institute, Hôpital Necker-Enfants Malades, 75015 Paris, France
| | - Izabela Harabula
- Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Malte Spielmann
- Institute of Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Denise Horn
- Institute of Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany
| | - Uwe Kornak
- Institute of Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Jochen Hecht
- Institute of Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Peter M Krawitz
- Institute of Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
| | - Peter Nürnberg
- Cologne Center for Genomics (CCG), University of Cologne, 50931 Cologne, Germany; Center for Molecular Medicine Cologne (CMMC), University of Cologne, 50931 Cologne, Germany; Cologne Excellence Cluster on Cellular Stress Responses in Aging-Associated Diseases (CECAD), University of Cologne, 50931 Cologne, Germany
| | - Reiner Siebert
- Institute of Human Genetics, Christian-Albrechts-University Kiel & University Hospital Schleswig-Holstein, Campus Kiel, 24105 Kiel, Germany
| | | | - Stefan Mundlos
- Institute of Medical and Human Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany.
| |
|
29
|
Katsonis P, Koire A, Wilson SJ, Hsu TK, Lua RC, Wilkins AD, Lichtarge O. Single nucleotide variations: biological impact and theoretical interpretation. Protein Sci 2014; 23:1650-66. [PMID: 25234433 PMCID: PMC4253807 DOI: 10.1002/pro.2552] [Citation(s) in RCA: 91] [Impact Index Per Article: 8.3] [Received: 08/06/2014] [Revised: 09/12/2014] [Accepted: 09/15/2014] [Indexed: 12/27/2022]
Abstract
Genome-wide association studies (GWAS) and whole-exome sequencing (WES) generate massive amounts of genomic variant information, and a major challenge is to identify which variations drive disease or contribute to phenotypic traits. Because the majority of known disease-causing mutations are exonic non-synonymous single nucleotide variations (nsSNVs), most studies focus on whether these nsSNVs affect protein function. Computational studies show that the impact of nsSNVs on protein function reflects sequence homology and structural information and predict the impact through statistical methods, machine learning techniques, or models of protein evolution. Here, we review impact prediction methods and discuss their underlying principles, their advantages and limitations, and how they compare to and complement one another. Finally, we present current applications and future directions for these methods in biological research and medical genetics.
Affiliation(s)
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
| | - Amanda Koire
- Department of Structural and Computational Biology and Molecular Biophysics, Houston, Texas
| | - Stephen Joseph Wilson
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas
| | - Teng-Kuei Hsu
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas
| | - Rhonald C Lua
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
| | - Angela Dawn Wilkins
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Department of Structural and Computational Biology and Molecular Biophysics, Houston, Texas
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
- Department of Pharmacology, Baylor College of Medicine, Houston, Texas
| |
|
30
|
Computational Analysis Reveals the Association of Threonine 118 Methionine Mutation in PMP22 Resulting in CMT-1A. Adv Bioinformatics 2014; 2014:502618. [PMID: 25400662 PMCID: PMC4220619 DOI: 10.1155/2014/502618] [Citation(s) in RCA: 83] [Impact Index Per Article: 7.5] [Received: 07/31/2014] [Revised: 09/26/2014] [Accepted: 09/26/2014] [Indexed: 12/31/2022] Open
Abstract
The T118M mutation in PMP22 gene is associated with Charcot Marie Tooth, type 1A (CMT1A). CMT1A is a form of Charcot-Marie-Tooth disease, the most common inherited disorder of the peripheral nervous system. Mutations in CMT related disorder are seen to increase the stability of the protein resulting in the diseased state. We performed SNP analysis for all the nsSNPs of PMP22 protein and carried out molecular dynamics simulation for T118M mutation to compare the stability difference between the wild type protein structure and the mutant protein structure. The mutation T118M resulted in the overall increase in the stability of the mutant protein. The superimposed structure shows marked structural variation between the wild type and the mutant protein structures.
|
31
|
Kumar CV, Swetha RG, Ramaiah S, Anbarasu A. Tryptophan to Glycine mutation in the position 116 leads to protein aggregation and decreases the stability of the LITAF protein. J Biomol Struct Dyn 2014; 33:1695-709. [DOI: 10.1080/07391102.2014.968211] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/19/2022]
|
32
|
Computational screening of disease associated mutations on NPC1 gene and its structural consequence in Niemann-Pick type-C1. ACTA ACUST UNITED AC 2014. [DOI: 10.1007/s11515-014-1314-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/24/2023]
|
33
|
Integrating in silico prediction methods, molecular docking, and molecular dynamics simulation to predict the impact of ALK missense mutations in structural perspective. Biomed Res Int 2014; 2014:895831. [PMID: 25054154 PMCID: PMC4098886 DOI: 10.1155/2014/895831] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 12/27/2013] [Revised: 03/05/2014] [Accepted: 03/06/2014] [Indexed: 01/13/2023]
Abstract
Over the past decade, advances in next-generation sequencing technology have placed personalized genomic medicine on the horizon. Classifying disease-associated mutations in complex diseases as pathogenic or neutral remains a major task, and it is often impractical in a structural context because the experiments are time-consuming and expensive. Among the various disease-causing mutations, single nucleotide polymorphisms (SNPs) play a vital role in defining an individual's susceptibility to disease and drug response. Understanding the genotype-phenotype relationship through SNPs is the first and most important step in drug research and development. Detailed understanding of the effect of SNPs on patient drug response is a key factor in the establishment of personalized medicine. In this paper, we present a computational pipeline for the SNP-centred study of anaplastic lymphoma kinase (ALK) that applies in silico prediction methods, molecular docking, and molecular dynamics simulation. Combining computational methods provides a way to understand how deleterious mutations alter protein drug targets and eventually lead to variable drug response among patients. We hope this rapid and cost-effective pipeline will also serve as a bridge connecting clinicians and in silico resources in tailoring treatments to patients' specific genotypes.
|
34
|
Panda R, P.K. S. Computational identification and analysis of functional polymorphisms involved in the activation and detoxification genes implicated in endometriosis. Gene 2014; 542:89-97. [DOI: 10.1016/j.gene.2014.03.058] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/20/2013] [Revised: 02/28/2014] [Accepted: 03/29/2014] [Indexed: 02/09/2023]
|
35
|
Kamaraj B, Rajendran V, Sethumadhavan R, Kumar CV, Purohit R. Mutational analysis of FUS gene and its structural and functional role in amyotrophic lateral sclerosis 6. J Biomol Struct Dyn 2014; 33:834-44. [DOI: 10.1080/07391102.2014.915762] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Indexed: 12/27/2022]
Affiliation(s)
- Balu Kamaraj
- School of Bio Sciences and Technology (SBST), Bioinformatics Division, Vellore Institute of Technology University, Vellore 632014, Tamil Nadu, India
| | - Vidya Rajendran
- School of Bio Sciences and Technology (SBST), Bioinformatics Division, Vellore Institute of Technology University, Vellore 632014, Tamil Nadu, India
| | - Rao Sethumadhavan
- School of Bio Sciences and Technology (SBST), Bioinformatics Division, Vellore Institute of Technology University, Vellore 632014, Tamil Nadu, India
| | - Chundi Vinay Kumar
- School of Bio Sciences and Technology (SBST), Bioinformatics Division, Vellore Institute of Technology University, Vellore 632014, Tamil Nadu, India
| | - Rituraj Purohit
- School of Bio Sciences and Technology (SBST), Bioinformatics Division, Vellore Institute of Technology University, Vellore 632014, Tamil Nadu, India
| |
|
36
|
Abstract
Proteins are macromolecules that serve a cell’s myriad processes and functions in all living organisms via dynamic interactions with other proteins, small molecules and cellular components. Genetic variations in the protein-encoding regions of the human genome account for >85% of all known Mendelian diseases, and play an influential role in shaping complex polygenic diseases. Proteins also serve as the predominant target class for the design of small molecule drugs to modulate their activity. Knowledge of the shape and form of proteins, by means of their three-dimensional structures, is therefore instrumental to understanding their roles in disease and their potentials for drug development. In this chapter we outline, with the wide readership of non-structural biologists in mind, the various experimental and computational methods available for protein structure determination. We summarize how the wealth of structure information, contributed to a large extent by the technological advances in structure determination to date, serves as a useful tool to decipher the molecular basis of genetic variations for disease characterization and diagnosis, particularly in the emerging era of genomic medicine, and becomes an integral component in the modern day approach towards rational drug development.
Affiliation(s)
- Nelson L.S. Tang
- Dept. of Chemical Pathology and Lab. of Genetics of Disease Suscept., The Chinese University of Hong Kong, Hong Kong, People's Republic of China
| | - Terence Poon
- Department of Paediatrics and Proteomics Laboratory, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
| |
|
37
|
Bendl J, Stourac J, Salanda O, Pavelka A, Wieben ED, Zendulka J, Brezovsky J, Damborsky J. PredictSNP: robust and accurate consensus classifier for prediction of disease-related mutations. PLoS Comput Biol 2014; 10:e1003440. [PMID: 24453961 PMCID: PMC3894168 DOI: 10.1371/journal.pcbi.1003440] [Citation(s) in RCA: 579] [Impact Index Per Article: 52.6] [Received: 08/20/2013] [Accepted: 12/03/2013] [Indexed: 02/07/2023] Open
Abstract
Single nucleotide variants represent a prevalent form of genetic variation. Mutations in the coding regions are frequently associated with the development of various genetic diseases. Computational tools for the prediction of the effects of mutations on protein function are very important for analysis of single nucleotide variants and their prioritization for experimental characterization. Many computational tools are already widely employed for this purpose. Unfortunately, their comparison and further improvement are hindered by large overlaps between the training datasets and benchmark datasets, which lead to biased and overly optimistic reported performances. In this study, we have constructed three independent datasets by removing all duplicates, inconsistencies and mutations previously used in the training of evaluated tools. The benchmark dataset containing over 43,000 mutations was employed for the unbiased evaluation of eight established prediction tools: MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP. The six best performing tools were combined into a consensus classifier PredictSNP, resulting in significantly improved prediction performance, and at the same time returned results for all mutations, confirming that consensus prediction represents an accurate and robust alternative to the predictions delivered by individual tools. A user-friendly web interface enables easy access to all eight prediction tools, the consensus classifier PredictSNP and annotations from the Protein Mutant Database and the UniProt database. The web server and the datasets are freely available to the academic community at http://loschmidt.chemi.muni.cz/predictsnp.
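The general idea of a consensus classifier can be illustrated with a plain, unweighted majority vote over per-tool calls. This is a simplified sketch and should not be read as the PredictSNP scoring scheme, which the abstract does not describe; the function name, the label strings and the example calls are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Optional


def majority_vote(calls: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the most frequent label among tools that produced a call.

    Tools that returned no prediction are passed as None and ignored.
    Returns None when no tool made a call or when the top labels tie.
    """
    votes = Counter(label for label in calls.values() if label is not None)
    if not votes:
        return None
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the two most frequent labels
    return ranked[0][0]


# Hypothetical per-tool calls for one variant (illustrative only).
example_calls = {
    "SIFT": "deleterious",
    "PolyPhen-2": "deleterious",
    "SNAP": "neutral",
    "MAPP": None,  # this tool returned no prediction
}
consensus = majority_vote(example_calls)
```

Ignoring missing calls rather than counting them as a vote mirrors the point made in the abstract that a consensus can still return a result when an individual tool cannot.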
Affiliation(s)
- Jaroslav Bendl
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment, Faculty of Science, Masaryk University, Brno, Czech Republic
- Department of Information Systems, Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic
- Center of Biomolecular and Cellular Engineering, International Centre for Clinical Research, St. Anne's University Hospital Brno, Brno, Czech Republic
| | - Jan Stourac
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment, Faculty of Science, Masaryk University, Brno, Czech Republic
- Center of Biomolecular and Cellular Engineering, International Centre for Clinical Research, St. Anne's University Hospital Brno, Brno, Czech Republic
| | - Ondrej Salanda
- Department of Information Systems, Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic
| | - Antonin Pavelka
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment, Faculty of Science, Masaryk University, Brno, Czech Republic
| | - Eric D. Wieben
- Department of Biochemistry and Molecular Biology, Mayo Clinic, Rochester, Minnesota, United States of America
| | - Jaroslav Zendulka
- Department of Information Systems, Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic
| | - Jan Brezovsky
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment, Faculty of Science, Masaryk University, Brno, Czech Republic
| | - Jiri Damborsky
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment, Faculty of Science, Masaryk University, Brno, Czech Republic
- Center of Biomolecular and Cellular Engineering, International Centre for Clinical Research, St. Anne's University Hospital Brno, Brno, Czech Republic
| |
|
38
|
Thomas DC, Yang Z, Yang F. Two-phase and family-based designs for next-generation sequencing studies. Front Genet 2013; 4:276. [PMID: 24379824 PMCID: PMC3861783 DOI: 10.3389/fgene.2013.00276] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 09/28/2013] [Accepted: 11/19/2013] [Indexed: 12/21/2022] Open
Abstract
The cost of next-generation sequencing is now approaching that of early GWAS panels, but it is still out of reach for large epidemiologic studies, and the millions of rare variants expected pose challenges for distinguishing causal from non-causal variants. We review two types of designs for sequencing studies: two-phase designs for targeted follow-up of genomewide association studies using unrelated individuals; and family-based designs exploiting co-segregation for prioritizing variants and genes. Two-phase designs subsample subjects for sequencing from a larger case-control study jointly on the basis of their disease and carrier status; the discovered variants are then tested for association in the parent study. The analysis combines the full sequence data from the substudy with the more limited SNP data from the main study. We discuss various methods for selecting this subset of variants and describe the expected yield of true positive associations in the context of an on-going study of second breast cancers following radiotherapy. While the sharing of variants within families means that family-based designs are less efficient for discovery than sequencing unrelated individuals, the ability to exploit co-segregation of variants with disease within families helps distinguish causal from non-causal ones. Furthermore, by enriching for family history, the yield of causal variants can be improved, and use of identity-by-descent information improves imputation of genotypes for other family members. We compare the relative efficiency of these designs with those using unrelated individuals for discovering and prioritizing variants or genes for testing association in larger studies. While associations can be tested with single variants, power is low for rare ones. Recent generalizations of burden or kernel tests for gene-level associations to family-based data are appealing. These approaches are illustrated in the context of a family-based study of colorectal cancer.
Affiliation(s)
- Duncan C Thomas
- Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
| | - Zhao Yang
- Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
| | - Fan Yang
- Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
| |
|
39
|
Status quo of annotation of human disease variants. BMC Bioinformatics 2013; 14:352. [PMID: 24305467 PMCID: PMC4234487 DOI: 10.1186/1471-2105-14-352] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/15/2012] [Accepted: 09/06/2013] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND The ever on-going technical developments in Next Generation Sequencing have led to an increase in detected disease related mutations. Many bioinformatics approaches exist to analyse these variants, and of those the methods that use 3D structure information generally outperform those that do not use this information. 3D structure information today is available for about twenty percent of the human exome, and homology modelling can double that fraction. This percentage is rapidly increasing so that we can expect to analyse the majority of all human exome variants in the near future using protein structure information. RESULTS We collected a test dataset of well-described mutations in proteins for which 3D-structure information is available. This test dataset was used to analyse the possibilities and the limitations of methods based on sequence information alone, hybrid methods, machine learning based methods, and structure based methods. CONCLUSIONS Our analysis shows that the use of structural features improves the classification of mutations. This study suggests strategies for future analyses of disease causing mutations, and it suggests which bioinformatics approaches should be developed to make progress in this field.
|
40
|
Morris G, Maes M. A neuro-immune model of Myalgic Encephalomyelitis/Chronic fatigue syndrome. Metab Brain Dis 2013; 28:523-40. [PMID: 22718491 DOI: 10.1007/s11011-012-9324-8] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.5] [Received: 04/18/2012] [Accepted: 06/07/2012] [Indexed: 12/15/2022]
Abstract
This paper proposes a neuro-immune model for Myalgic Encephalomyelitis/Chronic fatigue syndrome (ME/CFS). A wide range of immunological and neurological abnormalities have been reported in people suffering from ME/CFS. They include abnormalities in proinflammatory cytokines, raised production of nuclear factor-κB, mitochondrial dysfunctions, autoimmune responses, autonomic disturbances and brain pathology. Raised levels of oxidative and nitrosative stress (O&NS), together with reduced levels of antioxidants are indicative of an immuno-inflammatory pathology. A number of different pathogens have been reported either as triggering or maintaining factors. Our model proposes that initial infection and immune activation caused by a number of possible pathogens leads to a state of chronic peripheral immune activation driven by activated O&NS pathways that lead to progressive damage of self epitopes even when the initial infection has been cleared. Subsequent activation of autoreactive T cells conspiring with O&NS pathways causes further damage and provokes chronic activation of immuno-inflammatory pathways. The subsequent upregulation of proinflammatory compounds may activate microglia via the vagus nerve. Elevated proinflammatory cytokines together with raised O&NS conspire to produce mitochondrial damage. The subsequent ATP deficit together with inflammation and O&NS are responsible for the landmark symptoms of ME/CFS, including post-exertional malaise. Raised levels of O&NS subsequently cause progressive elevation of autoimmune activity facilitated by molecular mimicry, bystander activation or epitope spreading. These processes provoke central nervous system (CNS) activation in an attempt to restore immune homeostasis. This model proposes that the antagonistic activities of the CNS response to peripheral inflammation, O&NS and chronic immune activation are responsible for the remitting-relapsing nature of ME/CFS. Leads for future research are suggested based on this neuro-immune model.
|
41
|
Izarzugaza JMG, Vazquez M, del Pozo A, Valencia A. wKinMut: an integrated tool for the analysis and interpretation of mutations in human protein kinases. BMC Bioinformatics 2013; 14:345. [PMID: 24289158 PMCID: PMC3879071 DOI: 10.1186/1471-2105-14-345] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 11/07/2012] [Accepted: 05/30/2013] [Indexed: 11/13/2022] Open
Abstract
Background Protein kinases are involved in relevant physiological functions and a broad number of mutations in this superfamily have been reported in the literature to affect protein function and stability. Unfortunately, the exploration of the consequences on the phenotypes of each individual mutation remains a considerable challenge. Results The wKinMut web-server offers direct prediction of the potential pathogenicity of the mutations from a number of methods, including our recently developed prediction method based on the combination of information from a range of diverse sources, including physicochemical properties and functional annotations from FireDB and Swissprot and kinase-specific characteristics such as the membership to specific kinase groups, the annotation with disease-associated GO terms or the occurrence of the mutation in PFAM domains, and the relevance of the residues in determining kinase subfamily specificity from S3Det. This predictor yields interesting results that compare favourably with other methods in the field when applied to protein kinases. Together with the predictions, wKinMut offers a number of integrated services for the analysis of mutations. These include: the classification of the kinase, information about associations of the kinase with other proteins extracted from iHop, the mapping of the mutations onto PDB structures, pathogenicity records from a number of databases and the classification of mutations in large-scale cancer studies. Importantly, wKinMut is connected with the SNP2L system that extracts mentions of mutations directly from the literature, and therefore increases the possibilities of finding interesting functional information associated to the studied mutations. 
Conclusions wKinMut facilitates the exploration of the information available about individual mutations by integrating prediction approaches with the automatic extraction of information from the literature (text mining) and several state-of-the-art databases. wKinMut has been used during the last year for the analysis of the consequences of mutations in the context of a number of cancer genome projects, including the recent analysis of Chronic Lymphocytic Leukemia cases, and is publicly available at http://wkinmut.bioinfo.cnio.es.
Affiliation(s)
- Jose M G Izarzugaza
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), C/Melchor Fernandez Almagro, 3, E-28029 Madrid, Spain.
|
42
|
Monteiro ANA, Freedman ML. Lessons from postgenome-wide association studies: functional analysis of cancer predisposition loci. J Intern Med 2013; 274:414-24. [PMID: 24127939 PMCID: PMC3801430 DOI: 10.1111/joim.12085] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Abstract
In the last few years, genome-wide association studies (GWASs) have identified hundreds of predisposition loci for several types of human cancers. Recent progress has been made in determining the underlying mechanisms through which different single-nucleotide polymorphisms (SNPs) affect predisposition to cancer. Although there has been much debate about the clinical utility of GWASs, less attention has been paid to how GWASs and post-GWASs functional analysis have contributed to understanding the aetiology of cancer. Most common variants associated with cancer risk are localized in nonprotein-coding regions highlighting transcriptional regulation as a common theme in the mechanism of cancer predisposition. Here, we outline strategies to functionally dissect predisposition loci and discuss their limitations as well as challenges for future studies.
Affiliation(s)
- A N A Monteiro
- Cancer Epidemiology Program, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
|
43
|
Frousios K, Iliopoulos CS, Schlitt T, Simpson MA. Predicting the functional consequences of non-synonymous DNA sequence variants — evaluation of bioinformatics tools and development of a consensus strategy. Genomics 2013; 102:223-8. [DOI: 10.1016/j.ygeno.2013.06.005] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Received: 01/10/2013] [Revised: 06/11/2013] [Accepted: 06/21/2013] [Indexed: 01/27/2023]
|
44
|
Pavlopoulos GA, Oulas A, Iacucci E, Sifrim A, Moreau Y, Schneider R, Aerts J, Iliopoulos I. Unraveling genomic variation from next generation sequencing data. BioData Min 2013; 6:13. [PMID: 23885890 PMCID: PMC3726446 DOI: 10.1186/1756-0381-6-13] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 03/22/2013] [Accepted: 07/18/2013] [Indexed: 12/29/2022] Open
Abstract
Elucidating the content of a DNA sequence is critical to understanding and decoding the genetic information of any biological system. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of these areas, which are being shaped by the new technological advances, involve evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAs), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, the analysis and interpretation of the vast amount of data produced is not an easy or trivial task and remains a great challenge in the field of bioinformatics. Therefore, efficient tools to cope with information overload, tackle the high complexity and provide meaningful visualizations that make knowledge extraction easier are essential. In this article, we briefly refer to the sequencing methodologies and the available equipment that serve these analyses, and we describe the data formats of the files they produce. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data, with emphasis on structural variation analysis and comparative genomics. We finally comment on their functionality, strengths and weaknesses, and we discuss how future applications could further develop in this field.
Affiliation(s)
- Georgios A Pavlopoulos
- Division of Basic Sciences, University of Crete Medical School, Heraklion 71110, Greece.
|
45
|
Kamaraj B, Purohit R. Computational Screening of Disease-Associated Mutations in OCA2 Gene. Cell Biochem Biophys 2013; 68:97-109. [DOI: 10.1007/s12013-013-9697-2] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Indexed: 01/13/2023]
|
46
|
In silico screening and molecular dynamics simulation of disease-associated nsSNP in TYRP1 gene and its structural consequences in OCA3. BIOMED RESEARCH INTERNATIONAL 2013; 2013:697051. [PMID: 23862152 PMCID: PMC3703794 DOI: 10.1155/2013/697051] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Received: 04/20/2013] [Revised: 05/23/2013] [Accepted: 05/23/2013] [Indexed: 11/17/2022]
Abstract
Oculocutaneous albinism type III (OCA3), caused by mutations of the TYRP1 gene, is an autosomal recessive disorder characterized by reduced biosynthesis of melanin pigment in the hair, skin, and eyes. The TYRP1 gene encodes a protein called tyrosinase-related protein-1 (Tyrp1). Tyrp1 is involved in maintaining the stability of tyrosinase protein and modulating its catalytic activity in eumelanin synthesis. Tyrp1 is also involved in maintenance of melanosome structure and affects melanocyte proliferation and cell death. In this work we implemented computational analysis to filter the most probable mutations that might be associated with OCA3. We found R326H and R356Q to be the most deleterious and disease associated by using PolyPhen 2.0, SIFT, PANTHER, I-mutant 3.0, PhD-SNP, SNP&GO, Pmut, and Mutpred tools. To understand the atomic arrangement in 3D space, the native and mutant (R326H and R356Q) structures were modelled. Finally, the structures of native and mutant Tyrp1 proteins were analysed using a molecular dynamics simulation (MDS) approach. MDS results showed more flexibility in the native Tyrp1 structure. Upon mutation, the Tyrp1 protein became more rigid, which might disturb the structural conformation and catalytic function of the protein and might also play a significant role in inducing OCA3. The results obtained from this study would help wet-lab researchers to develop potent drug therapies against OCA3.
|
47
|
Roadmap to determine the point mutations involved in cardiomyopathy disorder: A Bayesian approach. Gene 2013; 519:34-40. [DOI: 10.1016/j.gene.2013.01.056] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 10/26/2012] [Revised: 12/31/2012] [Accepted: 01/27/2013] [Indexed: 11/18/2022]
|
48
|
Liu L, Kumar S. Evolutionary balancing is critical for correctly forecasting disease-associated amino acid variants. Mol Biol Evol 2013; 30:1252-7. [PMID: 23462317 DOI: 10.1093/molbev/mst037] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 12/25/2022] Open
Abstract
Computational predictions have become indispensable for evaluating the disease-related impact of nonsynonymous single-nucleotide variants discovered in exome sequencing. Many such methods have their roots in molecular evolution, as they use information derived from multiple sequence alignments. We show that the performance of current methods (e.g., PolyPhen-2 and SIFT) is improved significantly by optimizing their statistical models on evolutionarily balanced training data, where equal numbers of positive and negative controls within each evolutionary conservation class are used. Evolutionary balancing significantly reduces the false-positive rates for variants observed at highly conserved sites and false-negative rates for variants observed at fast evolving sites. Use of these improved methods enables more accurate forecasting when concordant diagnosis from multiple methods is regarded as a more reliable indicator of the prediction. Applied to a large exome variation data set, we find that the current methods produce concordant predictions for less than half of the population variants. These advances are implemented in a web resource for use in practical applications (www.mypeg.info, last accessed March 13, 2013).
Affiliation(s)
- Li Liu
- Center for Evolutionary Medicine and Informatics, Biodesign Institute, Arizona State University, USA
|
49
|
Douville C, Carter H, Kim R, Niknafs N, Diekhans M, Stenson PD, Cooper DN, Ryan M, Karchin R. CRAVAT: cancer-related analysis of variants toolkit. Bioinformatics 2013; 29:647-8. [PMID: 23325621 PMCID: PMC3582272 DOI: 10.1093/bioinformatics/btt017] [Citation(s) in RCA: 109] [Impact Index Per Article: 9.1] [Received: 10/01/2012] [Revised: 11/19/2012] [Accepted: 01/08/2013] [Indexed: 11/13/2022] Open
Abstract
SUMMARY: Advances in sequencing technology have greatly reduced the costs incurred in collecting raw sequencing data. Academic laboratories and researchers therefore now have access to very large datasets of genomic alterations but limited time and computational resources to analyse their potential biological importance. Here, we provide a web-based application, Cancer-Related Analysis of Variants Toolkit, designed with an easy-to-use interface to facilitate the high-throughput assessment and prioritization of genes and missense alterations important for cancer tumorigenesis. Cancer-Related Analysis of Variants Toolkit provides predictive scores for germline variants, somatic mutations and relative gene importance, as well as annotations from published literature and databases. Results are emailed to users as MS Excel spreadsheets and/or tab-separated text files. AVAILABILITY: http://www.cravat.us/
Affiliation(s)
- Christopher Douville
- Department of Biomedical Engineering, Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD 21218, USA
|
50
|
Affiliation(s)
- Monique Ohanian
- Molecular Cardiology Division, Victor Chang Cardiac Research Institute, Sydney, New South Wales, Australia
|