551
Chella Krishnan K, Kurt Z, Barrere-Cain R, Sabir S, Das A, Floyd R, Vergnes L, Zhao Y, Che N, Charugundla S, Qi H, Zhou Z, Meng Y, Pan C, Seldin MM, Norheim F, Hui S, Reue K, Lusis AJ, Yang X. Integration of Multi-omics Data from Mouse Diversity Panel Highlights Mitochondrial Dysfunction in Non-alcoholic Fatty Liver Disease. Cell Syst 2018; 6:103-115.e7. [PMID: 29361464 DOI: 10.1016/j.cels.2017.12.006] [Citation(s) in RCA: 109] [Impact Index Per Article: 15.6] [Received: 06/23/2017] [Revised: 10/24/2017] [Accepted: 12/08/2017] [Indexed: 12/25/2022]
Abstract
The etiology of non-alcoholic fatty liver disease (NAFLD), the most common form of chronic liver disease, is poorly understood. To understand the causal mechanisms underlying NAFLD, we conducted a multi-omics, multi-tissue integrative study using the Hybrid Mouse Diversity Panel, consisting of ∼100 strains of mice with various degrees of NAFLD. We identified both tissue-specific biological processes and processes that were shared between adipose and liver tissues. We then used gene network modeling to predict candidate regulatory genes of these NAFLD processes, including Fasn, Thrsp, Pklr, and Chchd6. In vivo knockdown experiments of the candidate genes improved both steatosis and insulin resistance. Further in vitro testing demonstrated that downregulation of both Pklr and Chchd6 lowered mitochondrial respiration and led to a shift toward glycolytic metabolism, thus highlighting mitochondria dysfunction as a key mechanistic driver of NAFLD.
Affiliation(s)
- Karthickeyan Chella Krishnan
  - Department of Medicine/Division of Cardiology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Zeyneb Kurt
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, CA, USA
- Rio Barrere-Cain
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, CA, USA
- Simon Sabir
  - Department of Psychology, College of Letters and Science, University of California, Los Angeles, CA, USA
- Aditi Das
  - Department of Psychology, College of Letters and Science, University of California, Los Angeles, CA, USA
- Raquel Floyd
  - Department of Microbiology, Immunology and Molecular Genetics, College of Letters and Science, University of California, Los Angeles, CA, USA
- Laurent Vergnes
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Yuqi Zhao
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, CA, USA
- Nam Che
  - Department of Medicine/Division of Cardiology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Sarada Charugundla
  - Department of Medicine/Division of Cardiology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Hannah Qi
  - Department of Medicine/Division of Cardiology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Zhiqiang Zhou
  - Department of Medicine/Division of Cardiology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Yonghong Meng
  - Department of Medicine/Division of Cardiology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Calvin Pan
  - Department of Medicine/Division of Cardiology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Marcus M Seldin
  - Department of Medicine/Division of Cardiology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Frode Norheim
  - Department of Medicine/Division of Cardiology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Simon Hui
  - Department of Medicine/Division of Cardiology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Karen Reue
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Aldons J Lusis
  - Department of Medicine/Division of Cardiology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
  - Department of Microbiology, Immunology and Molecular Genetics, College of Letters and Science, University of California, Los Angeles, CA, USA
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Xia Yang
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, CA, USA
  - Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA, USA
552
Lemaçon A, Joly Beauparlant C, Soucy P, Allen J, Easton D, Kraft P, Simard J, Droit A. VEXOR: an integrative environment for prioritization of functional variants in fine-mapping analysis. Bioinformatics 2018; 33:1389-1391. [PMID: 28453673 DOI: 10.1093/bioinformatics/btw826] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/21/2016] [Accepted: 12/27/2016] [Indexed: 11/14/2022]
Abstract
Motivation: The identification of the functional variants responsible for observed genome-wide association studies (GWAS) signals is one of the most challenging tasks of the post-GWAS research era. Several tools have been developed to annotate genetic variants by their genomic location and potential functional implications. Each of these tools has its own requirements and internal logic, which forces the user to become acquainted with each interface. Results: From an awareness of the amount of work needed to analyze a single locus, we have built a flexible, versatile and easy-to-use web interface designed to help in prioritizing variants and predicting their potential functional implications. This interface acts as a single point of entry linking association results with reference tools and relevant experiments. Availability and Implementation: VEXOR is an integrative web application implemented through the Shiny framework and available at: http://romix.genome.ulaval.ca/vexor. Contact: arnaud.droit@crchuq.ulaval.ca. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Audrey Lemaçon
  - Genomics Center, Centre Hospitalier Universitaire de Québec - Université Laval Research Center, Quebec, Canada
- Charles Joly Beauparlant
  - Genomics Center, Centre Hospitalier Universitaire de Québec - Université Laval Research Center, Quebec, Canada
- Penny Soucy
  - Genomics Center, Centre Hospitalier Universitaire de Québec - Université Laval Research Center, Quebec, Canada
- Jamie Allen
  - Department of Public Health and Primary Care, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, UK
- Douglas Easton
  - Department of Public Health and Primary Care, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, UK
  - Department of Oncology, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, UK
- Peter Kraft
  - Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Jacques Simard
  - Genomics Center, Centre Hospitalier Universitaire de Québec - Université Laval Research Center, Quebec, Canada
- Arnaud Droit
  - Genomics Center, Centre Hospitalier Universitaire de Québec - Université Laval Research Center, Quebec, Canada
553
Srivastava PK, Bagnati M, Delahaye-Duriez A, Ko JH, Rotival M, Langley SR, Shkura K, Mazzuferi M, Danis B, van Eyll J, Foerch P, Behmoaras J, Kaminski RM, Petretto E, Johnson MR. Genome-wide analysis of differential RNA editing in epilepsy. Genome Res 2018; 27:440-450. [PMID: 28250018 PMCID: PMC5340971 DOI: 10.1101/gr.210740.116] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Received: 05/31/2016] [Accepted: 01/10/2017] [Indexed: 02/03/2023]
Abstract
The recoding of genetic information through RNA editing contributes to proteomic diversity, but the extent and significance of RNA editing in disease is poorly understood. In particular, few studies have investigated the relationship between RNA editing and disease at a genome-wide level. Here, we developed a framework for the genome-wide detection of RNA sites that are differentially edited in disease. Using RNA-sequencing data from 100 hippocampi from mice with epilepsy (pilocarpine–temporal lobe epilepsy model) and 100 healthy control hippocampi, we identified 256 RNA sites (overlapping with 87 genes) that were significantly differentially edited between epileptic cases and controls. The degree of differential RNA editing in epileptic mice correlated with frequency of seizures, and the set of genes differentially RNA-edited between case and control mice were enriched for functional terms highly relevant to epilepsy, including “neuron projection” and “seizures.” Genes with differential RNA editing were preferentially enriched for genes with a genetic association to epilepsy. Indeed, we found that they are significantly enriched for genes that harbor nonsynonymous de novo mutations in patients with epileptic encephalopathy and for common susceptibility variants associated with generalized epilepsy. These analyses reveal a functional convergence between genes that are differentially RNA-edited in acquired symptomatic epilepsy and those that contribute risk for genetic epilepsy. Taken together, our results suggest a potential role for RNA editing in the epileptic hippocampus in the occurrence and severity of epileptic seizures.
Affiliation(s)
- Marta Bagnati
  - Centre for Complement and Inflammation Research (CCIR), Imperial College London, London W12 0NN, United Kingdom
- Andree Delahaye-Duriez
  - Division of Brain Sciences, Imperial College Faculty of Medicine, London W12 0NN, United Kingdom
- Jeong-Hun Ko
  - Centre for Complement and Inflammation Research (CCIR), Imperial College London, London W12 0NN, United Kingdom
- Maxime Rotival
  - Institut Pasteur, Unit of Human Evolutionary Genetics, Paris 75015, France
- Sarah R Langley
  - Duke-NUS Medical School, Singapore 169857, Republic of Singapore
- Kirill Shkura
  - Division of Brain Sciences, Imperial College Faculty of Medicine, London W12 0NN, United Kingdom
- Patrik Foerch
  - Neuroscience TA, UCB Pharma, 1420 Braine-l'Alleud, Belgium
- Jacques Behmoaras
  - Centre for Complement and Inflammation Research (CCIR), Imperial College London, London W12 0NN, United Kingdom
- Enrico Petretto
  - Duke-NUS Medical School, Singapore 169857, Republic of Singapore
- Michael R Johnson
  - Division of Brain Sciences, Imperial College Faculty of Medicine, London W12 0NN, United Kingdom
554
Vlaic S, Conrad T, Tokarski-Schnelle C, Gustafsson M, Dahmen U, Guthke R, Schuster S. ModuleDiscoverer: Identification of regulatory modules in protein-protein interaction networks. Sci Rep 2018; 8:433. [PMID: 29323246 PMCID: PMC5764996 DOI: 10.1038/s41598-017-18370-2] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 08/25/2017] [Accepted: 12/06/2017] [Indexed: 02/08/2023]
Abstract
The identification of disease-associated modules based on protein-protein interaction networks (PPINs) and gene expression data has provided new insights into the mechanistic nature of diverse diseases. However, their identification is hampered by the detection of protein communities within large-scale, whole-genome PPINs. A presented successful strategy detects a PPIN's community structure based on the maximal clique enumeration problem (MCE), which is a non-deterministic polynomial time-hard problem. This renders the approach computationally challenging for large PPINs implying the need for new strategies. We present ModuleDiscoverer, a novel approach for the identification of regulatory modules from PPINs and gene expression data. Following the MCE-based approach, ModuleDiscoverer uses a randomization heuristic-based approximation of the community structure. Given a PPIN of Rattus norvegicus and public gene expression data, we identify the regulatory module underlying a rodent model of non-alcoholic steatohepatitis (NASH), a severe form of non-alcoholic fatty liver disease (NAFLD). The module is validated using single-nucleotide polymorphism (SNP) data from independent genome-wide association studies and gene enrichment tests. Based on gene enrichment tests, we find that ModuleDiscoverer performs comparably to three existing module-detecting algorithms. However, only our NASH-module is significantly enriched with genes linked to NAFLD-associated SNPs. ModuleDiscoverer is available at http://www.hki-jena.de/index.php/0/2/490 (Others/ModuleDiscoverer).
Affiliation(s)
- Sebastian Vlaic
  - Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Systems Biology and Bioinformatics, Jena, 07745, Germany
  - Friedrich-Schiller-University, Department of Bioinformatics, Jena, 07743, Germany
- Theresia Conrad
  - Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Systems Biology and Bioinformatics, Jena, 07745, Germany
- Christian Tokarski-Schnelle
  - Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Systems Biology and Bioinformatics, Jena, 07745, Germany
  - University Hospital Jena, Friedrich-Schiller-University, General, Visceral and Vascular Surgery, Jena, 07749, Germany
- Mika Gustafsson
  - Linköping University, Bioinformatics, Department of Physics, Chemistry and Biology, Linköping, 581 83, Sweden
- Uta Dahmen
  - University Hospital Jena, Friedrich-Schiller-University, General, Visceral and Vascular Surgery, Jena, 07749, Germany
- Reinhard Guthke
  - Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Systems Biology and Bioinformatics, Jena, 07745, Germany
- Stefan Schuster
  - Friedrich-Schiller-University, Department of Bioinformatics, Jena, 07743, Germany
555
Alaei S, Sadeghi B, Najafi A, Masoudi-Nejad A. LncRNA and mRNA integration network reconstruction reveals novel key regulators in esophageal squamous-cell carcinoma. Genomics 2018; 111:76-89. [PMID: 29317304 DOI: 10.1016/j.ygeno.2018.01.003] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 09/16/2017] [Revised: 01/05/2018] [Accepted: 01/05/2018] [Indexed: 12/19/2022]
Abstract
Many experimental and computational studies have identified key protein coding genes in the initiation and progression of esophageal squamous cell carcinoma (ESCC). However, the number of studies that have tried to reveal the role of long non-coding RNAs (lncRNAs) in ESCC has been limited. LncRNAs are important regulators of cancers and are transcribed dominantly in the genome and in various conditions. The main goal of this study was to use a systems biology approach to predict novel lncRNAs as well as protein coding genes associated with ESCC and assess their prognostic values. Using microarray expression data for mRNAs and lncRNAs from a large number of ESCC patients, we applied the Weighted Gene Co-expression Network Analysis (WGCNA) method to build a large coding-non-coding gene co-expression network and discovered important functional modules. Gene set enrichment and pathway analysis revealed major biological processes and pathways involved in these modules. After selecting some protein coding genes involved in biological processes and pathways related to cancer, we used LncTar, a computational tool, to predict potential interactions between these genes and lncRNAs. By combining interaction results with Pearson correlations, we introduced some novel lncRNAs with putative key regulatory roles in the network. Survival analysis with the Kaplan-Meier estimator and the log-rank test confirmed that most of the introduced genes are associated with poor prognosis in ESCC. Overall, our study reveals novel protein coding genes and lncRNAs associated with ESCC, along with their predicted interactions. Based on the promising results of survival analysis, these genes can be used as good estimators of patients' survival, or can be analyzed further as new potential signatures or targets for the therapy of ESCC.
Affiliation(s)
- Shervin Alaei
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Balal Sadeghi
  - Food Hygiene and Public Health Department, Faculty of Veterinary Medicine, Shahid Bahonar University of Kerman, Kerman, Iran
- Ali Najafi
  - Molecular Biology Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Ali Masoudi-Nejad
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
556
Wang Z, Zhang Q, Zhang W, Lin JR, Cai Y, Mitra J, Zhang ZD. HEDD: Human Enhancer Disease Database. Nucleic Acids Res 2018; 46:D113-D120. [PMID: 29077884 PMCID: PMC5753236 DOI: 10.1093/nar/gkx988] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Received: 08/12/2017] [Revised: 10/09/2017] [Accepted: 10/11/2017] [Indexed: 12/26/2022]
Abstract
Enhancers, as specialized genomic cis-regulatory elements, activate transcription of their target genes and play an important role in pathogenesis of many human complex diseases. Despite recent systematic identification of them in the human genome, currently there is an urgent need for comprehensive annotation databases of human enhancers with a focus on their disease connections. In response, we built the Human Enhancer Disease Database (HEDD) to facilitate studies of enhancers and their potential roles in human complex diseases. HEDD currently provides comprehensive genomic information for ∼2.8 million human enhancers identified by ENCODE, FANTOM5 and RoadMap with disease association scores based on enhancer-gene and gene-disease connections. It also provides Web-based analytical tools to visualize enhancer networks and score enhancers given a set of selected genes in a specific gene network. HEDD is freely accessible at http://zdzlab.einstein.yu.edu/1/hedd.php.
Affiliation(s)
- Zhen Wang
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, USA
- Quanwei Zhang
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, USA
- Wen Zhang
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, USA
- Jhih-Rong Lin
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, USA
- Ying Cai
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, USA
- Joydeep Mitra
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, USA
- Zhengdong D Zhang
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, USA
557
Abstract
Following the elucidation of the human genome, chemogenomics emerged in the beginning of the twenty-first century as an interdisciplinary research field with the aim to accelerate target and drug discovery by making best usage of the genomic data and the data linkable to it. What started as a systematization approach within protein target families now encompasses all types of chemical compounds and gene products. A key objective of chemogenomics is the establishment, extension, analysis, and prediction of a comprehensive SAR matrix which by application will enable further systematization in drug discovery. Herein we outline future perspectives of chemogenomics including the extension to new molecular modalities, or the potential extension beyond the pharma to the agro and nutrition sectors, and the importance for environmental protection. The focus is on computational sciences with potential applications for compound library design, virtual screening, hit assessment, analysis of phenotypic screens, lead finding and optimization, and systems biology-based prediction of toxicology and translational research.
Affiliation(s)
- Edgar Jacoby
  - Janssen Research & Development, Beerse, Belgium
- J B Brown
  - Life Science Informatics Research Unit, Laboratory of Molecular Biosciences, Kyoto University Graduate School of Medicine, Kyoto, Japan
558
Gupta S, Dingerdissen H, Ross KE, Hu Y, Wu CH, Mazumder R, Vijay-Shanker K. DEXTER: Disease-Expression Relation Extraction from Text. Database (Oxford) 2018; 2018:5025486. [PMID: 29860481 PMCID: PMC6007211 DOI: 10.1093/database/bay045] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 12/15/2017] [Revised: 04/02/2018] [Accepted: 04/19/2018] [Indexed: 01/23/2023]
Abstract
Gene expression levels affect biological processes and play a key role in many diseases. Characterizing expression profiles is useful for clinical research, and diagnostics and prognostics of diseases. There are currently several high-quality databases that capture gene expression information, obtained mostly from large-scale studies, such as microarray and next-generation sequencing technologies, in the context of disease. The scientific literature is another rich source of information on gene expression-disease relationships that not only have been captured from large-scale studies but have also been observed in thousands of small-scale studies. Expression information obtained from literature through manual curation can extend expression databases. While many of the existing databases include information from literature, they are limited by the time-consuming nature of manual curation and have difficulty keeping up with the explosion of publications in the biomedical field. In this work, we describe an automated text-mining tool, Disease-Expression Relation Extraction from Text (DEXTER) to extract information from literature on gene and microRNA expression in the context of disease. One of the motivations in developing DEXTER was to extend the BioXpress database, a cancer-focused gene expression database that includes data derived from large-scale experiments and manual curation of publications. The literature-based portion of BioXpress lags behind significantly compared to expression information obtained from large-scale studies and can benefit from our text-mined results. We have conducted two different evaluations to measure the accuracy of our text-mining tool and achieved average F-scores of 88.51 and 81.81% for the two evaluations, respectively. 
Also, to demonstrate the ability to extract rich expression information in different disease-related scenarios, we used DEXTER to extract differential expression information for 2024 genes in lung cancer, 115 glycosyltransferases in 62 cancers and 826 microRNAs in 171 cancers. All extractions using DEXTER are integrated in the literature-based portion of BioXpress. Database URL: http://biotm.cis.udel.edu/DEXTER.
Affiliation(s)
- Samir Gupta
  - Department of Computer and Information Sciences, University of Delaware, 18 Amstel Avenue, Newark, DE 19716, USA
- Hayley Dingerdissen
  - Department of Biochemistry and Molecular Medicine, The George Washington University, Ross Hall, 2300 Eye Street N.W., Washington, DC 20037, USA
- Karen E Ross
  - Department of Biochemistry and Molecular and Cellular Biology, Georgetown University Medical Center, 3300 Whitehaven St. NW, Suite 1200 Washington, DC 20007, USA
- Yu Hu
  - Department of Biochemistry and Molecular Medicine, The George Washington University, Ross Hall, 2300 Eye Street N.W., Washington, DC 20037, USA
- Cathy H Wu
  - Department of Computer and Information Sciences, University of Delaware, 18 Amstel Avenue, Newark, DE 19716, USA
  - Center for Bioinformatics and Computational Biology, University of Delaware, 15 Innovation Way, Suite 205 Newark, DE 19711, USA
- Raja Mazumder
  - Department of Biochemistry and Molecular Medicine, The George Washington University, Ross Hall, 2300 Eye Street N.W., Washington, DC 20037, USA
- K Vijay-Shanker
  - Department of Computer and Information Sciences, University of Delaware, 18 Amstel Avenue, Newark, DE 19716, USA
559
Felgueiras J, Silva JV, Fardilha M. Adding biological meaning to human protein-protein interactions identified by yeast two-hybrid screenings: A guide through bioinformatics tools. J Proteomics 2018; 171:127-140. [PMID: 28526529 DOI: 10.1016/j.jprot.2017.05.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 02/07/2017] [Revised: 04/26/2017] [Accepted: 05/13/2017] [Indexed: 02/02/2023]
560
Martín-Gálvez D, Dunoyer de Segonzac D, Ma MCJ, Kwitek AE, Thybert D, Flicek P. Genome variation and conserved regulation identify genomic regions responsible for strain specific phenotypes in rat. BMC Genomics 2017; 18:986. [PMID: 29272997 PMCID: PMC5741965 DOI: 10.1186/s12864-017-4351-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 08/07/2017] [Accepted: 11/27/2017] [Indexed: 11/10/2022]
Abstract
Background: The genomes of laboratory rat strains are characterised by a mosaic haplotype structure caused by their unique breeding history. These mosaic haplotypes have been recently mapped by extensive sequencing of key strains. Comparison of genomic variation between two closely related rat strains with different phenotypes has been proposed as an effective strategy for the discovery of candidate strain-specific regions involved in phenotypic differences. We developed a method to prioritise strain-specific haplotypes by integrating genomic variation and genomic regulatory data predicted to be involved in specific phenotypes. Specifically, we aimed to identify genomic regions associated with Metabolic Syndrome (MetS), a disorder of energy utilization and storage affecting several organ systems. Results: We compared two Lyon rat strains, Lyon Hypertensive (LH), which is susceptible to MetS, and Lyon Low pressure (LL), which is susceptible to obesity as an intermediate MetS phenotype, with a third strain (Lyon Normotensive, LN) that is resistant to both MetS and obesity. Applying a novel metric, we ranked the identified strain-specific haplotypes using evolutionary conservation of the occupancy of three liver-specific transcription factors (HNF4A, CEBPA, and FOXA1) in five rodents including rat. Consideration of regulatory information effectively identified regions with liver-associated genes and rat orthologues of human GWAS variants related to obesity and metabolic traits. We attempted to find possible causative variants and compared them with the candidate genes proposed by previous studies. In strain-specific regions with conserved regulation, we found a significant enrichment for published evidence related to obesity (one of the metabolic symptoms shown by the Lyon strains) amongst the genes assigned to promoters with strain-specific variation.
Conclusions: Our results show that the use of functional regulatory conservation is a potentially effective approach to select strain-specific genomic regions associated with phenotypic differences among Lyon rats and could be extended to other systems. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-017-4351-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- David Martín-Gálvez
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Denis Dunoyer de Segonzac
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Man Chun John Ma
  - Department of Pharmacology, University of Iowa, Iowa City, IA, USA
  - Iowa Institute of Human Genetics, University of Iowa, Iowa City, IA, USA
  - Present address: MD Anderson Cancer Center, University of Texas, Houston, TX, USA
- Anne E Kwitek
  - Department of Pharmacology, University of Iowa, Iowa City, IA, USA
  - Iowa Institute of Human Genetics, University of Iowa, Iowa City, IA, USA
- David Thybert
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
  - Present address: Earlham Institute, Norwich Research Park, Norwich, NR4 7UH, UK
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
561
Shi JY, Huang H, Zhang YN, Long YX, Yiu SM. Predicting binary, discrete and continued lncRNA-disease associations via a unified framework based on graph regression. BMC Med Genomics 2017; 10:65. [PMID: 29322937 PMCID: PMC5763297 DOI: 10.1186/s12920-017-0305-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 11/15/2022]
Abstract
BACKGROUND: In human genomes, long non-coding RNAs (lncRNAs) have attracted more and more attention because their dysfunctions are involved in many diseases. However, the associations between lncRNAs and diseases (LDA) still remain unknown in most cases. While identifying disease-related lncRNAs in vivo is costly, computational approaches are promising to not only accelerate the possible identification of associations but also provide clues on the underlying mechanism of various lncRNA-caused diseases. Former computational approaches usually only focus on predicting new associations between lncRNAs having known associations with diseases and other lncRNA-associated diseases. They also only work on binary lncRNA-disease associations (whether the pair has an association or not), which cannot reflect and reveal other biological facts, such as the number of proteins involved in LDA or how strong the association is (i.e., the intensity of LDA). RESULTS: To address the abovementioned issues, we propose a graph regression-based unified framework (GRUF). In particular, our method can work on lncRNAs that have no previously known disease association and on diseases that have no known association with any lncRNAs. Also, instead of giving only a binary answer for the association, our method tries to uncover more of the biological relationship between an lncRNA and a disease, which may provide better clues for researchers. We compared GRUF with three state-of-the-art approaches and demonstrated the superiority of GRUF, which achieves a 5%-16% improvement in terms of the area under the receiver operating characteristic curve (AUC). GRUF also provides a predicted confidence score for the predicted LDA, which reveals a significant correlation between the score and the number of RNA-Binding Proteins involved in LDAs. Lastly, three out of the top-5 LDA candidates generated by GRUF in novel prediction are verified indirectly by medical literature and known biological facts.
CONCLUSIONS The proposed GRUF has two advantages over existing approaches. Firstly, it can be used to work on lncRNAs that have no known disease association and diseases that have no known association with any lncRNAs. Secondly, instead of providing a binary answer (with or without association), GRUF works for both discrete and continued LDA, which help revealing the pathological implications between lncRNAs and diseases.
Affiliation(s)
- Jian-Yu Shi
  - School of Life Sciences, Northwestern Polytechnical University, Xi’an, 710072 China
- Hua Huang
  - School of Software and Microelectronics, Northwestern Polytechnical University, Xi’an, 710072 China
- Yan-Ning Zhang
  - School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072 China
- Yu-Xi Long
  - School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072 China
- Siu-Ming Yiu
  - Department of Computer Science, the University of Hong Kong, Hong Kong, 999077 China
562
Zhang W, Xin L, Lu Y. Integrative Analysis to Identify Common Genetic Markers of Metabolic Syndrome, Dementia, and Diabetes. Med Sci Monit 2017; 23:5885-5891. [PMID: 29229897 PMCID: PMC5737114 DOI: 10.12659/msm.905521] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 05/27/2017] [Accepted: 06/18/2017] [Indexed: 11/24/2022]
Abstract
BACKGROUND Emerging data have established links between systemic metabolic dysfunction, such as diabetes and metabolic syndrome (MetS), and neurocognitive impairment, including dementia. The common gene signature and the associated signaling pathways of MetS, diabetes, and dementia have not been widely studied. MATERIAL AND METHODS We used a translational bioinformatics approach to select the common gene signatures for both dementia and MetS, employing the DisGeNET discovery platform. RESULTS Gene mining analysis revealed a total of 173 genes (86 genes common to all three diseases), which comprised 43% of the total genes associated with dementia. The gene enrichment analysis showed that these genes were involved in dysregulation in the neurological system (23.2%) and the central nervous system (20.8%) phenotype processes. The network analysis revealed APOE, APP, PARK2, CEPBP, PARP1, MT-CO2, CXCR4, IGFIR, CCR5, and PIK3CD as important nodes with significant interacting partners. The meta-regression analysis showed a modest association of APOE with dementia and metabolic complications. The directionality of the effects of the variants on Alzheimer disease is generally consistent with previous observations and did not differ by race/ethnicity (p>0.05), although our study had low power for this test. CONCLUSIONS Our novel approach showed APOE as a common gene signature with a link to dementia, MetS, and diabetes. Future gene association studies should focus on the association of gene polymorphisms with multiple disease models to identify novel putative drug targets.
Affiliation(s)
- Ying Lu
  - Corresponding Author: Ying Lu, e-mail:
563
Sanchez CG, Molinski SV, Gongora R, Sosulski M, Fuselier T, MacKinnon SS, Mondal D, Lasky JA. The Antiretroviral Agent Nelfinavir Mesylate. Arthritis Rheumatol 2017; 70:115-126. [DOI: 10.1002/art.40326] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 02/03/2017] [Accepted: 09/13/2017] [Indexed: 12/28/2022]
Affiliation(s)
- Rafael Gongora
  - Tulane University Health Sciences Center, New Orleans, Louisiana
- Taylor Fuselier
  - Tulane University Health Sciences Center, New Orleans, Louisiana
- Debasis Mondal
  - Tulane University School of Medicine, New Orleans, Louisiana
- Joseph A. Lasky
  - Tulane University Health Sciences Center, New Orleans, Louisiana
564
Mukund K, Subramaniam S. Co-expression Network Approach Reveals Functional Similarities among Diseases Affecting Human Skeletal Muscle. Front Physiol 2017; 8:980. [PMID: 29249983 PMCID: PMC5717538 DOI: 10.3389/fphys.2017.00980] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 09/06/2017] [Accepted: 11/16/2017] [Indexed: 12/27/2022]
Abstract
Diseases affecting skeletal muscle exhibit considerable heterogeneity in intensity, etiology, phenotypic manifestation and gene expression. Systems biology approaches using network theory allow for a holistic understanding of functional similarities among diseases. Here we propose a co-expression-based, network-theoretic approach to extract functional similarities from 20 heterogeneous diseases comprising dystrophinopathies, inflammatory myopathies, neuromuscular, and muscle metabolic diseases. Using this framework, we identified seven closely associated disease clusters with 20 disease pairs exhibiting significant correlation (p < 0.05). Mapping the diseases onto a human protein-protein interaction network enabled the inference of a common program of regulation underlying more than half the muscle diseases considered here, referred to as the “protein signature.” Enrichment analysis of 17 protein modules identified as part of this signature revealed a statistically non-random dysregulation of muscle bioenergetic pathways and calcium homeostasis. Further, analysis of mechanistic similarities of less explored significant disease associations [such as between amyotrophic lateral sclerosis (ALS) and cerebral palsy (CP)] using a proposed “functional module” framework revealed adaptation of the calcium signaling machinery. Integrating drug-gene information into the quantitative framework highlighted therapeutic opportunities through drug repurposing for diseases affecting skeletal muscle.
Affiliation(s)
- Kavitha Mukund
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, United States
- Shankar Subramaniam
  - Departments of Cellular and Molecular Medicine, Computer Science and Engineering, University of California, San Diego, La Jolla, CA, United States
565
Zhao H, Yang Y, Lu Y, Mort M, Cooper DN, Zuo Z, Zhou Y. Quantitative mapping of genetic similarity in human heritable diseases by shared mutations. Hum Mutat 2017; 39:292-301. [PMID: 29044887 DOI: 10.1002/humu.23358] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 06/09/2017] [Revised: 09/22/2017] [Accepted: 09/27/2017] [Indexed: 01/12/2023]
Abstract
Many genetic diseases exhibit considerable epidemiological comorbidity and common symptoms, which provokes debate about the extent of their etiological overlap. The rapid growth in the number of known disease-causing mutations in the Human Gene Mutation Database (HGMD) has allowed us to characterize genetic similarities between diseases by ascertaining the extent to which identical genetic mutations are shared between diseases. Using this approach, we show that 41.6% of disease pairs in all possible pairs (42,083) exhibit a significant sharing of mutations (P value < 0.05). These mutation-related disease pairs are in agreement with heritability-based disease-disease relations in 48 neurological and psychiatric disease pairs (Spearman's correlation coefficient = 0.50; P value = 3.4 × 10⁻⁵), and share over-expressed genes significantly more often than unrelated disease pairs (1.5- to 1.8-fold higher; P value ≤ 1.6 × 10⁻⁴). The usefulness of mutation-related disease pairs was further demonstrated for predicting novel mutations and identifying individuals susceptible to Crohn disease. Moreover, the mutation-based disease network concurs closely with that based on phenotypes.
Affiliation(s)
- Huiying Zhao
  - Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, People's Republic of China; Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Queensland, Australia
- Yuedong Yang
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou, People's Republic of China
- Yutong Lu
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou, People's Republic of China
- Matthew Mort
  - Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, UK
- David N Cooper
  - Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, UK
- Zhiyi Zuo
  - Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, People's Republic of China; Department of Anesthesiology, University of Virginia, Charlottesville, Virginia
- Yaoqi Zhou
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast, Queensland, Australia
566
Erdoğan C, Kurt Z, Diri B. Estimation of the proteomic cancer co-expression sub networks by using association estimators. PLoS One 2017; 12:e0188016. [PMID: 29145449 PMCID: PMC5690670 DOI: 10.1371/journal.pone.0188016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/11/2017] [Accepted: 10/29/2017] [Indexed: 01/02/2023]
Abstract
In this study, association estimators, which strongly influence gene network inference methods and are used to determine molecular interactions, were examined within the co-expression network inference concept. Using proteomic data from five different cancer types, the hub genes/proteins within the disease-associated gene-gene/protein-protein interaction sub networks were identified. Proteomic data from the various cancer types were collected from The Cancer Proteome Atlas (TCPA). Nine correlation- and mutual information (MI)-based association estimators that are commonly used in the literature were compared in this study. As the gold standard to measure the association estimators’ performance, a multi-layer data integration platform on gene-disease associations (DisGeNET) and the Molecular Signatures Database (MSigDB) were used. Fisher's exact test was used to evaluate the performance of the association estimators by comparing the created co-expression networks with the disease-associated pathways. The MI-based estimators provided more successful results than the Pearson and Spearman correlation approaches, which are used in the estimation of biological networks in the weighted correlation network analysis (WGCNA) package. In correlation-based methods, the best average success rate for the five cancer types was 60%, while in MI-based methods the average success rate was 71% for the James-Stein Shrinkage (Shrink) and 64% for the Schurmann-Grassberger (SG) association estimator. Moreover, the hub genes and the inferred sub networks are presented for the consideration of researchers and experimentalists.
Affiliation(s)
- Cihat Erdoğan
  - Department of Computer Engineering, Namik Kemal University, Tekirdag, Turkey
  - * E-mail:
- Zeyneb Kurt
  - Department of Integrative Biology and Physiology, University of California Los Angeles, Los Angeles, California, United States of America
  - Department of Computer Engineering, Yildiz Technical University, Istanbul, Turkey
- Banu Diri
  - Department of Computer Engineering, Yildiz Technical University, Istanbul, Turkey
567
Miryala SK, Anbarasu A, Ramaiah S. Discerning molecular interactions: A comprehensive review on biomolecular interaction databases and network analysis tools. Gene 2017; 642:84-94. [PMID: 29129810 DOI: 10.1016/j.gene.2017.11.028] [Citation(s) in RCA: 102] [Impact Index Per Article: 12.8] [Received: 07/25/2017] [Revised: 10/17/2017] [Accepted: 11/08/2017] [Indexed: 12/12/2022]
Abstract
Computational analysis of biomolecular interaction networks is gaining importance as a way to understand the functions of novel genes/proteins. Gene interaction (GI) network analysis and protein-protein interaction (PPI) network analysis play a major role in predicting the functionality of interacting genes or proteins and give insight into the functional relationships and evolutionary conservation of interactions among the genes. An interaction network is a graphical representation of the gene/protein interactome, where each gene/protein is a node and each interaction between genes/proteins is an edge. In this review, we discuss the popular open source databases that serve as data repositories for searching and collecting protein/gene interaction data, as well as the tools available for interaction network generation, visualization and network analysis. We also discuss various network analysis approaches, such as topological and clustering approaches for studying network properties, and functional enrichment servers that illustrate the functions and pathways of the genes and proteins. The distinctive aim of this review is thus not only to provide an overview of tools and web servers for gene and protein-protein interaction (PPI) network analysis but also to show how to extract useful and meaningful information from the interaction networks.
Affiliation(s)
- Sravan Kumar Miryala
  - Medical and Biological Computing Laboratory, School of Biosciences and Technology, VIT University, Vellore 632014, Tamil Nadu, India
- Anand Anbarasu
  - Medical and Biological Computing Laboratory, School of Biosciences and Technology, VIT University, Vellore 632014, Tamil Nadu, India
- Sudha Ramaiah
  - Medical and Biological Computing Laboratory, School of Biosciences and Technology, VIT University, Vellore 632014, Tamil Nadu, India
568
England J, Drouin S, Beaulieu P, St-Onge P, Krajinovic M, Laverdière C, Levy E, Marcil V, Sinnett D. Genomic determinants of long-term cardiometabolic complications in childhood acute lymphoblastic leukemia survivors. BMC Cancer 2017; 17:751. [PMID: 29126409 PMCID: PMC5681795 DOI: 10.1186/s12885-017-3722-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 01/10/2017] [Accepted: 10/30/2017] [Indexed: 12/21/2022]
Abstract
BACKGROUND While cure rates for childhood acute lymphoblastic leukemia (cALL) now exceed 80%, over 60% of survivors will face treatment-related long-term sequelae, including cardiometabolic complications such as obesity, insulin resistance, dyslipidemia and hypertension. Although genetic susceptibility contributes to the development of these problems, there are very few studies that have so far addressed this issue in a cALL survivorship context. METHODS In this study, we aimed at evaluating the associations between common and rare genetic variants and long-term cardiometabolic complications in survivors of cALL. We examined the cardiometabolic profile and performed whole-exome sequencing in 209 cALL survivors from the PETALE cohort. Variants associated with cardiometabolic outcomes were identified using PLINK (common) or SKAT (common and rare) and a logistic regression was used to evaluate their impact in multivariate models. RESULTS Our results showed that rare and common variants in the BAD and FCRL3 genes were associated (p<0.05) with an extreme cardiometabolic phenotype (3 or more cardiometabolic risk factors). Common variants in OGFOD3 and APOB as well as rare and common BAD variants were significantly (p<0.05) associated with dyslipidemia. Common BAD and SERPINA6 variants were associated (p<0.05) with obesity and insulin resistance, respectively. CONCLUSIONS In summary, we identified genetic susceptibility loci as contributing factors to the development of late treatment-related cardiometabolic complications in cALL survivors. These biomarkers could be used as early detection strategies to identify susceptible individuals and implement appropriate measures and follow-up to prevent the development of risk factors in this high-risk population.
Affiliation(s)
- Jade England
  - Research Centre, Sainte-Justine University Health Center, 3175 chemin de la Côte-Sainte-Catherine, Montreal, Quebec, H3T 1C5 Canada
- Simon Drouin
  - Research Centre, Sainte-Justine University Health Center, 3175 chemin de la Côte-Sainte-Catherine, Montreal, Quebec, H3T 1C5 Canada
- Patrick Beaulieu
  - Research Centre, Sainte-Justine University Health Center, 3175 chemin de la Côte-Sainte-Catherine, Montreal, Quebec, H3T 1C5 Canada
- Pascal St-Onge
  - Research Centre, Sainte-Justine University Health Center, 3175 chemin de la Côte-Sainte-Catherine, Montreal, Quebec, H3T 1C5 Canada
- Maja Krajinovic
  - Research Centre, Sainte-Justine University Health Center, 3175 chemin de la Côte-Sainte-Catherine, Montreal, Quebec, H3T 1C5 Canada
- Caroline Laverdière
  - Research Centre, Sainte-Justine University Health Center, 3175 chemin de la Côte-Sainte-Catherine, Montreal, Quebec, H3T 1C5 Canada
  - Departments of Pediatrics, Université de Montréal, Montreal, Quebec, H3T 1C5 Canada
- Emile Levy
  - Research Centre, Sainte-Justine University Health Center, 3175 chemin de la Côte-Sainte-Catherine, Montreal, Quebec, H3T 1C5 Canada
  - Departments of Nutrition, Université de Montréal, Montreal, Quebec, H3T 1C5 Canada
- Valérie Marcil
  - Research Centre, Sainte-Justine University Health Center, 3175 chemin de la Côte-Sainte-Catherine, Montreal, Quebec, H3T 1C5 Canada
  - Departments of Nutrition, Université de Montréal, Montreal, Quebec, H3T 1C5 Canada
- Daniel Sinnett
  - Research Centre, Sainte-Justine University Health Center, 3175 chemin de la Côte-Sainte-Catherine, Montreal, Quebec, H3T 1C5 Canada
  - Departments of Pediatrics, Université de Montréal, Montreal, Quebec, H3T 1C5 Canada
569
Identification of CDC42BPG as a novel susceptibility locus for hyperuricemia in a Japanese population. Mol Genet Genomics 2017; 293:371-379. [PMID: 29124443 PMCID: PMC5854719 DOI: 10.1007/s00438-017-1394-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 08/07/2017] [Accepted: 11/04/2017] [Indexed: 12/29/2022]
Abstract
Chronic kidney disease and hyperuricemia are serious global health problems. Recent genome-wide association studies have identified various genetic variants related to these disorders. However, most studies have been conducted in a cross-sectional manner. To identify novel susceptibility loci for chronic kidney disease or hyperuricemia, we performed longitudinal exome-wide association studies (EWASs), using ~244,000 genetic variants and clinical data of Japanese individuals who had undergone annual health checkups for several years. After establishing quality controls, the association of renal function-related traits in 5648 subjects (excluding patients with dialysis and population outliers) with 24,579 single nucleotide variants (SNVs) for three genetic models (P < 3.39 × 10⁻⁷) was tested using generalized estimating equation models. The longitudinal EWASs revealed novel relations of five SNVs to renal function-related traits. Cross-sectional data for renal function-related traits in 7699 Japanese subjects were examined in a replication study. Among the five SNVs, rs55975541 in CDC42BPG was significantly (P < 4.90 × 10⁻⁴) related to the serum concentration of uric acid in the replication cohort. We also examined the SNVs detected in our longitudinal EWASs with the information on P values in GKDGEN meta-analysis data. Four SNVs in SLC15A2 were significantly associated with the estimated glomerular filtration rate in European ancestry populations, although these SNVs were related to the serum concentration of uric acid with borderline significance in our longitudinal EWASs. Our findings indicate that CDC42BPG may be a novel susceptibility locus for hyperuricemia.
570
Joo YB, Kim Y, Park Y, Kim K, Ryu JA, Lee S, Bang SY, Lee HS, Yi GS, Bae SC. Biological function integrated prediction of severe radiographic progression in rheumatoid arthritis: a nested case control study. Arthritis Res Ther 2017; 19:244. [PMID: 29065906 PMCID: PMC5655942 DOI: 10.1186/s13075-017-1414-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 06/15/2017] [Accepted: 08/31/2017] [Indexed: 12/05/2022]
Abstract
Background Radiographic progression is reported to be highly heritable in rheumatoid arthritis (RA). However, a previous study using genetic loci showed insufficient accuracy in predicting radiographic progression. The aim of this study is to identify a biologically relevant prediction model of radiographic progression in patients with RA using a genome-wide association study (GWAS) combined with bioinformatics analysis. Methods We obtained genome-wide single nucleotide polymorphism (SNP) data for 374 Korean patients with RA using Illumina HumanOmni2.5Exome-8 arrays. Radiographic progression was measured using the yearly Sharp/van der Heijde modified score rate and categorized as no or severe progression. Significant SNPs for severe radiographic progression from GWAS were mapped onto functional genes and reprioritized by post-GWAS analysis. For robust prediction of radiographic progression, tenfold cross-validation using a support vector machine (SVM) classifier was conducted. Accuracy was used to select the optimal SNP set in the Hanyang Bae RA cohort. The performance of our final model was compared with that of other models based on GWAS results and SPOT (one of the post-GWAS analyses) using receiver operating characteristic (ROC) curves. The reliability of our model was confirmed using GWAS data of Caucasian patients with RA. Results A total of 36,091 significant SNPs with a p value <0.05 from GWAS were reprioritized using post-GWAS analysis, and approximately 2700 were identified as SNPs related to RA biological features. The best average accuracy of ten groups was 0.6015 with 85 SNPs, and this increased to 0.7481 when combined with clinical information. In comparisons of model performance, the area under the curve (AUC) of 0.7872 in our model was superior to that obtained with GWAS (AUC 0.6586, p value 8.97 × 10⁻⁵) or SPOT (AUC 0.7449, p value 0.0423). Our model strategy also showed superior prediction accuracy in Caucasian patients with RA compared with GWAS (p value 0.0049) and SPOT (p value 0.0151). Conclusions Using various biological functions of SNPs and repeated machine learning, our model could predict severe radiographic progression relevantly and robustly in patients with RA compared with models using only GWAS results or other post-GWAS tools. Electronic supplementary material The online version of this article (doi:10.1186/s13075-017-1414-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Young Bin Joo
  - Department of Rheumatology, St. Vincent's Hospital, The Catholic University of Korea, Suwon, Republic of Korea
- Yul Kim
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Youngho Park
  - Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea
- Kwangwoo Kim
  - Department of Biology, Kyung Hee University, Seoul, Republic of Korea
- Jeong Ah Ryu
  - Department of Radiology, Hanyang University Hospital, Seoul, Republic of Korea
- Seunghun Lee
  - Department of Radiology, Hanyang University Hospital, Seoul, Republic of Korea
- So-Young Bang
  - Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea
- Hye-Soon Lee
  - Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea
- Gwan-Su Yi
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Sang-Cheol Bae
  - Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea
571
Dou J, Zhang L, Xie X, Ye L, Yang C, Wen L, Shen C, Zhu C, Zhao S, Zhu Z, Liang B, Wang Z, Li H, Fan X, Liu S, Yin X, Zheng X, Sun L, Yang S, Cui Y, Zhou F, Zhang X. Integrative analyses reveal biological pathways and key genes in psoriasis. Br J Dermatol 2017; 177:1349-1357. [PMID: 28542811 DOI: 10.1111/bjd.15682] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Accepted: 05/12/2017] [Indexed: 12/25/2022]
Abstract
BACKGROUND Psoriasis is a complex disease influenced by both genetic and environmental factors, with abnormal gene expression in lesional skin. However, no studies are available on genome-scale gene expression of psoriatic lesions in the Chinese population. In addition, the biological pathways, pathogenicity and interaction networks of psoriasis-related genes with abnormal expression profiles require further systematic investigation. OBJECTIVES To further explore the associated pathways in psoriasis by functional analysis and to identify the key genes by gene pathogenicity analysis. METHODS We performed RNA sequencing on 60 skin biopsy samples from patients with psoriasis and healthy controls to identify the primary differentially expressed genes in psoriatic lesional skin. We retrieved all reported psoriasis-associated genes and performed integrative analyses covering gene expression profiling, pathway analysis, gene pathogenicities and protein-protein interaction networks. RESULTS We found that internal and external stimuli may activate immunoinflammatory responses to promote the development of psoriasis. Pathways associated with infectious diseases and cancers were identified by functional and pathway analyses. The gene pathogenicity analysis revealed five key genes in psoriasis: PPARD, GATA3, TIMP3, WNT5A and PTTG1. CONCLUSIONS Our analyses showed that genes contributed to the pathogenesis of psoriasis by activating risk pathways whose components are abnormally expressed. We identified five potentially pathogenic genes for psoriasis that may serve as important biomarkers for diagnosis and treatment.
Affiliation(s)
- J Dou
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- L Zhang
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- X Xie
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- L Ye
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- C Yang
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- L Wen
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- C Shen
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- C Zhu
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- S Zhao
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- Z Zhu
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- B Liang
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- Z Wang
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- H Li
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- X Fan
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- S Liu
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- X Yin
  - Department of Genetics, and Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, U.S.A
- X Zheng
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- L Sun
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- S Yang
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- Y Cui
  - Department of Dermatology, China-Japan Friendship Hospital, Beijing, China
- F Zhou
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
- X Zhang
  - Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, China; Key Laboratory of Dermatology, Anhui Medical University, Ministry of Education, China
572
Rampogu S, Son M, Park C, Kim HH, Suh JK, Lee KW. Sulfonanilide Derivatives in Identifying Novel Aromatase Inhibitors by Applying Docking, Virtual Screening, and MD Simulations Studies. BioMed Research International 2017; 2017:2105610. [PMID: 29312992 PMCID: PMC5664374 DOI: 10.1155/2017/2105610] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 05/11/2017] [Revised: 07/31/2017] [Accepted: 08/27/2017] [Indexed: 01/04/2023]
Abstract
Breast cancer is one of the leading causes of death noticed in women across the world. Of late the most successful treatments rendered are the use of aromatase inhibitors (AIs). In the current study, a two-way approach for the identification of novel leads has been adapted. 81 chemical compounds were assessed to understand their potentiality against aromatase along with the four known drugs. Docking was performed employing the CDOCKER protocol available on the Discovery Studio (DS v4.5). Exemestane has displayed a higher dock score among the known drug candidates and is labeled as reference. Out of 81 ligands 14 have exhibited higher dock scores than the reference. In the second approach, these 14 compounds were utilized for the generation of the pharmacophore. The validated four-featured pharmacophore was then allowed to screen Chembridge database and the potential Hits were obtained after subjecting them to Lipinski's rule of five and the ADMET properties. Subsequently, the acquired 3,050 Hits were escalated to molecular docking utilizing GOLD v5.0. Finally, the obtained Hits were consequently represented to be ideal lead candidates that were escalated to the MD simulations and binding free energy calculations. Additionally, the gene-disease association was performed to delineate the associated disease caused by CYP19A1.
Affiliation(s)
- Shailima Rampogu
- Division of Applied Life Science (BK21 Plus), Plant Molecular Biology and Biotechnology Research Center (PMBBRC), Systems and Synthetic Agrobiotech Center (SSAC), Research Institute of Natural Science (RINS), Gyeongsang National University (GNU), 501 Jinju-daero, Jinju 52828, Republic of Korea
| | - Minky Son
- Division of Applied Life Science (BK21 Plus), Plant Molecular Biology and Biotechnology Research Center (PMBBRC), Systems and Synthetic Agrobiotech Center (SSAC), Research Institute of Natural Science (RINS), Gyeongsang National University (GNU), 501 Jinju-daero, Jinju 52828, Republic of Korea
| | - Chanin Park
- Division of Applied Life Science (BK21 Plus), Plant Molecular Biology and Biotechnology Research Center (PMBBRC), Systems and Synthetic Agrobiotech Center (SSAC), Research Institute of Natural Science (RINS), Gyeongsang National University (GNU), 501 Jinju-daero, Jinju 52828, Republic of Korea
| | - Hyong-Ha Kim
- Division of Quality of Life, Korea Research Institute of Standards and Science, Daejeon 34113, Republic of Korea
| | - Jung-Keun Suh
- Bio-Computing Major, Korean German Institute of Technology, Seoul 07582, Republic of Korea
| | - Keun Woo Lee
- Division of Applied Life Science (BK21 Plus), Plant Molecular Biology and Biotechnology Research Center (PMBBRC), Systems and Synthetic Agrobiotech Center (SSAC), Research Institute of Natural Science (RINS), Gyeongsang National University (GNU), 501 Jinju-daero, Jinju 52828, Republic of Korea
|
573
|
Wu H, Miller E, Wijegunawardana D, Regan K, Payne PRO, Li F. MD-Miner: a network-based approach for personalized drug repositioning. BMC SYSTEMS BIOLOGY 2017; 11:86. [PMID: 28984195 PMCID: PMC5629618 DOI: 10.1186/s12918-017-0462-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 01/20/2023]
Abstract
BACKGROUND Due to advances in next generation sequencing technologies and corresponding reductions in cost, it is now attainable to investigate genome-wide gene expression and variants at a patient-level, so as to better understand and anticipate heterogeneous responses to therapy. Consequently, it is feasible to inform personalized drug treatment decisions using personal genomics data. However, these efforts are limited due to a lack of reliable computational approaches for predicting effective drugs for individual patients. The reverse gene set enrichment analysis (i.e., connectivity mapping) approach and its variants have been widely and successfully used for drug prediction. However, the performance of these methods is limited by undefined mechanism of action (MoA) of drugs and reliance on cohorts of patients rather than personalized predictions for individual patients. RESULTS In this study, we have developed and evaluated a computational approach, known as Mechanism and Drug Miner (MD-Miner), using a network-based computational approach to predict effective drugs and reveal potential drug mechanisms of action at the level of signaling pathways. Specifically, the patient-specific signaling network is constructed by integrating known disease associated genes with patient-derived gene expression profiles. In parallel, a drug mechanism of action network is constructed by integrating drug targets and z-score profiles of drug-induced gene expression (pre vs. post-drug treatment). Potentially effective candidate drugs are prioritized according to the number of common genes between the patient-specific dysfunctional signaling network and drug MoA network. We evaluated the MD-Miner method on the PC-3 prostate cancer cell line, and showed that it significantly improved the success rate of discovering effective drugs compared with the random selection, and could provide insight into potential mechanisms of action. 
CONCLUSIONS This work provides a signaling network-based drug repositioning approach. Compared with the reverse gene signature based drug repositioning approaches, the proposed method can provide clues of mechanism of action in terms of signaling transduction networks.
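The prioritization step described in this abstract (ranking drugs by the number of genes shared between the patient-specific network and each drug's MoA network) can be illustrated with a minimal sketch. The function name, gene sets, and drug labels below are hypothetical and are not taken from the paper; network construction from expression profiles and z-scores is not modeled here.

```python
# Illustrative sketch only: rank drugs by gene overlap between a patient
# network and each drug's mechanism-of-action (MoA) network.
# All names and gene sets are hypothetical, not data from the MD-Miner paper.

def rank_drugs_by_overlap(patient_genes, drug_moa_genes):
    """Return (drug, shared_gene_count) pairs, highest overlap first."""
    patient = set(patient_genes)
    scores = {
        drug: len(patient & set(genes))
        for drug, genes in drug_moa_genes.items()
    }
    # Sort by descending overlap, then by drug name so ties are ordered stably.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical patient-specific network genes.
patient_network = {"AR", "PTEN", "TP53", "MYC", "AKT1"}

# Hypothetical drug MoA network genes.
drug_networks = {
    "drug_A": {"AKT1", "PTEN", "MTOR"},
    "drug_B": {"EGFR", "KRAS"},
    "drug_C": {"AR", "MYC", "TP53", "CDK4"},
}

ranking = rank_drugs_by_overlap(patient_network, drug_networks)
print(ranking)
```

In this toy input, drug_C shares three genes with the patient network and therefore ranks first. A raw overlap count favors drugs with large MoA networks, so a real implementation would likely need some normalization; the abstract does not say how this is handled.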
Affiliation(s)
- Haoyang Wu
- Department of BioMedical Informatics (BMI), The Ohio State University, Columbus, OH, 43210, USA.,College of Engineering, The Ohio State University, Columbus, OH, 43210, USA
| | - Elise Miller
- Department of BioMedical Informatics (BMI), The Ohio State University, Columbus, OH, 43210, USA.,College of Engineering, Northeastern University, Boston, MA, 02115, USA
| | - Denethi Wijegunawardana
- Department of BioMedical Informatics (BMI), The Ohio State University, Columbus, OH, 43210, USA.,College of Art and Science, The Ohio State University, Columbus, OH, 43210, USA
| | - Kelly Regan
- Department of BioMedical Informatics (BMI), The Ohio State University, Columbus, OH, 43210, USA
| | - Philip R O Payne
- Institute for Informatics, Washington University in St. Louis School of Medicine, St. Louis, MO, 63110, USA
| | - Fuhai Li
- Department of BioMedical Informatics (BMI), The Ohio State University, Columbus, OH, 43210, USA.
|
574
|
Timing and localization of human dystrophin isoform expression provide insights into the cognitive phenotype of Duchenne muscular dystrophy. Sci Rep 2017; 7:12575. [PMID: 28974727 PMCID: PMC5626779 DOI: 10.1038/s41598-017-12981-5] [Citation(s) in RCA: 139] [Impact Index Per Article: 17.4] [Received: 02/01/2017] [Accepted: 09/13/2017] [Indexed: 01/14/2023] Open
Abstract
Duchenne muscular dystrophy (DMD) is a muscular dystrophy with high incidence of learning and behavioural problems and is associated with neurodevelopmental disorders. To gain more insights into the role of dystrophin in this cognitive phenotype, we performed a comprehensive analysis of the expression patterns of dystrophin isoforms across human brain development, using unique transcriptomic data from Allen Human Brain and BrainSpan atlases. Dystrophin isoforms show large changes in expression through life with pronounced differences between the foetal and adult human brain. The Dp140 isoform was expressed in the cerebral cortex only in foetal life stages, while in the cerebellum it was also expressed postnatally. The Purkinje isoform Dp427p was virtually absent. The expression of dystrophin isoforms was significantly associated with genes implicated in neurodevelopmental disorders, like autism spectrum disorders or attention-deficit hyper-activity disorders, which are known to be associated to DMD. We also identified relevant functional associations of the different isoforms, like an association with axon guidance or neuron differentiation during early development. Our results point to the crucial role of several dystrophin isoforms in the development and function of the human brain.
|
575
|
Zaki N, Tennakoon C. BioCarian: search engine for exploratory searches in heterogeneous biological databases. BMC Bioinformatics 2017; 18:435. [PMID: 28969593 PMCID: PMC5625622 DOI: 10.1186/s12859-017-1840-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 04/06/2017] [Accepted: 09/21/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND There are a large number of biological databases publicly available for scientists on the web. Also, there are many private databases generated in the course of research projects. These databases are in a wide variety of formats. Web standards have evolved in recent times and semantic web technologies are now available to interconnect diverse and heterogeneous sources of data. Therefore, integration and querying of biological databases can be facilitated by techniques used in the semantic web. Heterogeneous databases can be converted into Resource Description Framework (RDF) and queried using the SPARQL language. Searching for exact queries in these databases is trivial. However, exploratory searches need customized solutions, especially when multiple databases are involved. This process is cumbersome and time consuming for those without a sufficient background in computer science. In this context, a search engine facilitating exploratory searches of databases would be of great help to the scientific community. RESULTS We present BioCarian, an efficient and user-friendly search engine for performing exploratory searches on biological databases. The search engine is an interface for SPARQL queries over RDF databases. We note that many of the databases can be converted to tabular form. We first convert the tabular databases to RDF. The search engine provides a graphical interface based on facets to explore the converted databases. The facet interface is more advanced than conventional facets. It allows complex queries to be constructed, and has additional features such as ranking of facet values based on several criteria, visually indicating the relevance of a facet value, and presenting the most important facet values when a large number of choices are available. For advanced users, SPARQL queries can be run directly on the databases. Using this feature, users will be able to incorporate federated searches of SPARQL endpoints. We used the search engine to do an exploratory search on previously published viral integration data and were able to deduce the main conclusions of the original publication. BioCarian is accessible via http://www.biocarian.com . CONCLUSIONS We have developed a search engine to explore RDF databases that can be used by both novice and advanced users.
Affiliation(s)
- Nazar Zaki
- Department of Comp. Science and Software Engineering, College of Info. Technology, United Arab Emirates University (UAEU), Al Ain, PO Box 15551 United Arab Emirates
| | - Chandana Tennakoon
- Department of Comp. Science and Software Engineering, College of Info. Technology, United Arab Emirates University (UAEU), Al Ain, PO Box 15551 United Arab Emirates
|
576
|
Identification of susceptible genes for complex chronic diseases based on disease risk functional SNPs and interaction networks. J Biomed Inform 2017; 74:137-144. [DOI: 10.1016/j.jbi.2017.09.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/09/2017] [Revised: 09/15/2017] [Accepted: 09/16/2017] [Indexed: 01/05/2023]
|
577
|
Venkatesan A, Kim JH, Talo F, Ide-Smith M, Gobeill J, Carter J, Batista-Navarro R, Ananiadou S, Ruch P, McEntyre J. SciLite: a platform for displaying text-mined annotations as a means to link research articles with biological data. Wellcome Open Res 2017. [PMID: 28948232 DOI: 10.12688/wellcomeopenres.10210.1] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 12/24/2022] Open
Abstract
The tremendous growth in biological data has resulted in an increase in the number of research papers being published. This presents a great challenge for scientists in searching and assimilating facts described in those papers. Particularly, biological databases depend on curators to add highly precise and useful information that are usually extracted by reading research articles. Therefore, there is an urgent need to find ways to improve linking literature to the underlying data, thereby minimising the effort in browsing content and identifying key biological concepts. As part of the development of Europe PMC, we have developed a new platform, SciLite, which integrates text-mined annotations from different sources and overlays those outputs on research articles. The aim is to aid researchers and curators using Europe PMC in finding key concepts more easily and provide links to related resources or tools, bridging the gap between literature and biological data.
Affiliation(s)
- Aravind Venkatesan
- Literature Service group, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Jee-Hyub Kim
- Literature Service group, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Francesco Talo
- Literature Service group, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Michele Ide-Smith
- Literature Service group, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Julien Gobeill
- SIB Text Mining, Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Jacob Carter
- National Centre for Text Mining (NaCTeM), Manchester Institute of Biotechnology, Manchester, UK
| | - Riza Batista-Navarro
- National Centre for Text Mining (NaCTeM), Manchester Institute of Biotechnology, Manchester, UK
| | - Sophia Ananiadou
- National Centre for Text Mining (NaCTeM), Manchester Institute of Biotechnology, Manchester, UK
| | - Patrick Ruch
- SIB Text Mining, Swiss Institute of Bioinformatics, Geneva, Switzerland.,Bibliomics and Text Mining Group (BiTeM), HES-SO, Geneva, Switzerland
| | - Johanna McEntyre
- Literature Service group, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
|
578
|
Latourelle JC, Beste MT, Hadzi TC, Miller RE, Oppenheim JN, Valko MP, Wuest DM, Church BW, Khalil IG, Hayete B, Venuto CS. Large-scale identification of clinical and genetic predictors of motor progression in patients with newly diagnosed Parkinson's disease: a longitudinal cohort study and validation. Lancet Neurol 2017; 16:908-916. [PMID: 28958801 PMCID: PMC5693218 DOI: 10.1016/s1474-4422(17)30328-9] [Citation(s) in RCA: 112] [Impact Index Per Article: 14.0] [Received: 05/09/2017] [Revised: 08/15/2017] [Accepted: 08/17/2017] [Indexed: 01/21/2023]
Abstract
Background Better understanding and prediction of PD progression could improve disease management and clinical trial design. We aimed to use longitudinal clinical, molecular, and genetic data to develop predictive models, compare potential biomarkers, and identify novel predictors for motor progression in PD. We also sought to assess the use of these models in the design of treatment trials in PD. Methods A Bayesian multivariate predictive inference platform was applied to data from the Parkinson’s Progression Markers Initiative (PPMI) study (NCT01141023). We used genetic data and baseline molecular and clinical variables from PD patients and healthy controls to construct an ensemble of models to predict the annualised rate of the Movement Disorder Society-Unified Parkinson’s Disease Rating Scale parts II and III combined. We tested our overall explanatory power, as assessed by the coefficient of determination (R²), and replicated novel findings in an independent clinical cohort of PD patients from the Longitudinal and Biomarker Study in PD (LABS-PD; NCT00605163). The potential utility of these models for clinical trial design was quantified by comparing simulated randomized placebo-controlled trials within the out-of-sample LABS-PD cohort. Findings A total of 117 controls and 312 PD cases were available for analysis. Our model ensemble exhibited strong performance in-cohort (5-fold cross-validated R²=41%, 95% CI: 35% – 47%) and significant, though reduced, performance out-of-cohort (R²=9%, 95% CI: 4% – 16%). Individual predictive features identified from PPMI data were confirmed in the LABS-PD cohort of 317 PD patients. These included significant replication of higher baseline motor score, male sex, and increased age, as well as a novel PD-specific epistatic interaction all indicative of faster motor progression. Genetic variation was the most useful predictive marker of motor progression (2.9%, 95% CI: 1.5–4.3%). CSF biomarkers at baseline showed a more modest (0.3%; 95% CI: 0.1–0.5%), but still significant effect on motor progression prediction. The simulations (n=5000) showed that incorporating the predicted rates of motor progression into the final models of treatment effect reduced the variability in the study outcome allowing significant differences to be detected at sample sizes up to 20% smaller than in naïve trials. Interpretation Our model ensemble confirmed established and identified novel predictors of PD motor progression. Improving existing prognostic models through machine learning approaches should benefit trial design and evaluation, as well as clinical disease monitoring and treatment. Funding Michael J. Fox Foundation for Parkinson’s Research and National Institute of Neurological Disorders and Stroke (1P20NS092529-01).
Affiliation(s)
- Charles S Venuto
- Center for Health and Technology and Department of Neurology, University of Rochester, Rochester, NY, USA
|
579
|
Himmelstein DS, Lizee A, Hessler C, Brueggeman L, Chen SL, Hadley D, Green A, Khankhanian P, Baranzini SE. Systematic integration of biomedical knowledge prioritizes drugs for repurposing. eLife 2017; 6:26726. [PMID: 28936969 PMCID: PMC5640425 DOI: 10.7554/elife.26726] [Citation(s) in RCA: 278] [Impact Index Per Article: 34.8] [Received: 03/11/2017] [Accepted: 09/11/2017] [Indexed: 12/16/2022] Open
Abstract
The ability to computationally predict whether a compound treats a disease would improve the economy and success rate of drug approval. This study describes Project Rephetio to systematically model drug efficacy based on 755 existing treatments. First, we constructed Hetionet (neo4j.het.io), an integrative network encoding knowledge from millions of biomedical studies. Hetionet v1.0 consists of 47,031 nodes of 11 types and 2,250,197 relationships of 24 types. Data were integrated from 29 public resources to connect compounds, diseases, genes, anatomies, pathways, biological processes, molecular functions, cellular components, pharmacologic classes, side effects, and symptoms. Next, we identified network patterns that distinguish treatments from non-treatments. Then, we predicted the probability of treatment for 209,168 compound-disease pairs (het.io/repurpose). Our predictions validated on two external sets of treatment and provided pharmacological insights on epilepsy, suggesting they will help prioritize drug repurposing candidates. This study was entirely open and received realtime feedback from 40 community members.
Affiliation(s)
- Daniel Scott Himmelstein
- Biological and Medical Informatics Program, University of California, San Francisco, San Francisco, United States.,Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, United States
| | - Antoine Lizee
- Department of Neurology, University of California, San Francisco, San Francisco, United States.,ITUN-CRTI-UMR 1064 Inserm, University of Nantes, Nantes, France
| | - Christine Hessler
- Department of Neurology, University of California, San Francisco, San Francisco, United States
| | - Leo Brueggeman
- Department of Neurology, University of California, San Francisco, San Francisco, United States.,University of Iowa, Iowa City, United States
| | - Sabrina L Chen
- Department of Neurology, University of California, San Francisco, San Francisco, United States.,Johns Hopkins University, Baltimore, United States
| | - Dexter Hadley
- Department of Pediatrics, University of California, San Francisco, San Francisco, United States.,Institute for Computational Health Sciences, University of California, San Francisco, San Francisco, United States
| | - Ari Green
- Department of Neurology, University of California, San Francisco, San Francisco, United States
| | - Pouya Khankhanian
- Department of Neurology, University of California, San Francisco, San Francisco, United States.,Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, United States
| | - Sergio E Baranzini
- Biological and Medical Informatics Program, University of California, San Francisco, San Francisco, United States.,Department of Neurology, University of California, San Francisco, San Francisco, United States
|
580
|
Systems Biology Genetic Approach Identifies Serotonin Pathway as a Possible Target for Obstructive Sleep Apnea: Results from a Literature Search Review. SLEEP DISORDERS 2017; 2017:6768323. [PMID: 29057124 PMCID: PMC5625807 DOI: 10.1155/2017/6768323] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 01/30/2017] [Accepted: 06/14/2017] [Indexed: 12/25/2022]
Abstract
Rationale Overall validity of existing genetic biomarkers in the diagnosis of obstructive sleep apnea (OSA) remains unclear. The objective of this systematic genetic study is to identify “novel” biomarkers for OSA using systems biology approach. Methods Candidate genes for OSA were extracted from PubMed, MEDLINE, and Embase search engines and DisGeNET database. The gene ontology (GO) analyses and candidate genes prioritization were performed using Enrichr tool. Genes pertaining to the top 10 pathways were extracted and used for Ingenuity Pathway Analysis. Results In total, we have identified 153 genes. The top 10 pathways associated with OSA include (i) serotonin receptor interaction, (ii) pathways in cancer, (iii) AGE-RAGE signaling in diabetes, (iv) infectious diseases, (v) serotonergic synapse, (vi) inflammatory bowel disease, (vii) HIF-1 signaling pathway, (viii) PI3-AKT signaling pathway, (ix) regulation lipolysis in adipocytes, and (x) rheumatoid arthritis. After removing the overlapping genes, we have identified 23 candidate genes, out of which >30% of the genes were related to the genes involved in the serotonin pathway. Among these 4 serotonin receptors SLC6A4, HTR2C, HTR2A, and HTR1B were strongly associated with OSA. Conclusions This preliminary report identifies several potential candidate genes associated with OSA and also describes the possible regulatory mechanisms.
|
581
|
Cheng SJ, Shi FY, Liu H, Ding Y, Jiang S, Liang N, Gao G. Accurately annotate compound effects of genetic variants using a context-sensitive framework. Nucleic Acids Res 2017; 45:e82. [PMID: 28158838 PMCID: PMC5449550 DOI: 10.1093/nar/gkx041] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 10/12/2016] [Accepted: 01/24/2017] [Indexed: 02/07/2023] Open
Abstract
In genomics, effectively identifying the biological effects of genetic variants is crucial. Current methods handle each variant independently, assuming that each variant acts in a context-free manner. However, variants within the same gene may interfere with each other, producing combinational (compound) rather than individual effects. In this work, we introduce COPE, a gene-centric variant annotation tool that integrates the entire sequential context in evaluating the functional effects of intra-genic variants. Applying COPE to the 1000 Genomes dataset, we identified numerous cases of multiple-variant compound effects that frequently led to false-positive and false-negative loss-of-function calls by conventional variant-centric tools. Specifically, 64 disease-causing mutations were identified to be rescued in a specific genomic context, thus potentially contributing to the buffering effects for highly penetrant deleterious mutations. COPE is freely available for academic use at http://cope.cbi.pku.edu.cn.
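One way to see why annotating variants independently can mislead is the case of two indels in the same gene whose length changes cancel. The sketch below is a simplified illustration under that assumption, not COPE's actual algorithm (which the abstract does not detail); the function names and the example indels are hypothetical, and the altered residues between the two indels are ignored.

```python
# Illustrative sketch only: compare per-variant frameshift calls with a
# gene-level call that considers all indels in the gene together.
# This is a deliberate simplification and not the COPE implementation.

def is_frameshift(length_change):
    """An indel shifts the reading frame unless its net length is a multiple of 3."""
    return length_change % 3 != 0


def variant_centric_calls(length_changes):
    """Judge each indel in isolation, as a context-free annotator would."""
    return [is_frameshift(change) for change in length_changes]


def gene_centric_call(length_changes):
    """Judge the combined effect of all indels on the downstream frame."""
    return is_frameshift(sum(length_changes))


# Hypothetical gene carrying a 1-bp insertion followed by a 1-bp deletion.
indels = [+1, -1]

per_variant = variant_centric_calls(indels)
combined = gene_centric_call(indels)
print(per_variant, combined)
```

Each indel on its own is called a frameshift, yet together they leave the downstream reading frame intact, which is the kind of false-positive loss-of-function call the abstract attributes to variant-centric tools.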
Affiliation(s)
- Si-Jin Cheng
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, People's Republic of China
| | - Fang-Yuan Shi
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, People's Republic of China
| | - Huan Liu
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, People's Republic of China
| | - Yang Ding
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, People's Republic of China
| | - Shuai Jiang
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, People's Republic of China
| | - Nan Liang
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, People's Republic of China
| | - Ge Gao
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Center for Bioinformatics, Peking University, Beijing 100871, People's Republic of China
|
582
|
Shu L, Chan KHK, Zhang G, Huan T, Kurt Z, Zhao Y, Codoni V, Trégouët DA, Cardiogenics Consortium, Yang J, Wilson JG, Luo X, Levy D, Lusis AJ, Liu S, Yang X. Shared genetic regulatory networks for cardiovascular disease and type 2 diabetes in multiple populations of diverse ethnicities in the United States. PLoS Genet 2017; 13:e1007040. [PMID: 28957322 PMCID: PMC5634657 DOI: 10.1371/journal.pgen.1007040] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.9] [Received: 02/28/2017] [Revised: 10/10/2017] [Accepted: 09/21/2017] [Indexed: 12/18/2022] Open
Abstract
Cardiovascular diseases (CVD) and type 2 diabetes (T2D) are closely interrelated complex diseases likely sharing overlapping pathogenesis driven by aberrant activities in gene networks. However, the molecular circuitries underlying the pathogenic commonalities remain poorly understood. We sought to identify the shared gene networks and their key intervening drivers for both CVD and T2D by conducting a comprehensive integrative analysis driven by five multi-ethnic genome-wide association studies (GWAS) for CVD and T2D, expression quantitative trait loci (eQTLs), ENCODE, and tissue-specific gene network models (both co-expression and graphical models) from CVD and T2D relevant tissues. We identified pathways regulating the metabolism of lipids, glucose, and branched-chain amino acids, along with those governing oxidation, extracellular matrix, immune response, and neuronal system as shared pathogenic processes for both diseases. Further, we uncovered 15 key drivers including HMGCR, CAV1, IGF1 and PCOLCE, whose network neighbors collectively account for approximately 35% of known GWAS hits for CVD and 22% for T2D. Finally, we cross-validated the regulatory role of the top key drivers using in vitro siRNA knockdown, in vivo gene knockout, and two Hybrid Mouse Diversity Panels each comprised of >100 strains. Findings from this in-depth assessment of genetic and functional data from multiple human cohorts provide strong support that common sets of tissue-specific molecular networks drive the pathogenesis of both CVD and T2D across ethnicities and help prioritize new therapeutic avenues for both CVD and T2D.
Affiliation(s)
- Le Shu
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, United States of America
| | - Kei Hang K. Chan
- Departments of Epidemiology and Medicine and Center for Global Cardiometabolic Health, Brown University, Providence, RI, United States of America
- Hong Kong Institute of Diabetes and Obesity, Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong SAR, China
| | - Guanglin Zhang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, United States of America
| | - Tianxiao Huan
- The Framingham Heart Study, Framingham, MA, USA and the Population Sciences Branch, National Heart, Lung, and Blood Institute, Bethesda, MD, United States of America
| | - Zeyneb Kurt
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, United States of America
| | - Yuqi Zhao
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, United States of America
| | - Veronica Codoni
- Sorbonne Universités, UPMC Univ. Paris 06, INSERM, UMR_S 1166, Team Genomics & Pathophysiology of Cardiovascular Diseases, Paris, France
- ICAN Institute for Cardiometabolism and Nutrition, Paris, France
| | - David-Alexandre Trégouët
- Sorbonne Universités, UPMC Univ. Paris 06, INSERM, UMR_S 1166, Team Genomics & Pathophysiology of Cardiovascular Diseases, Paris, France
- ICAN Institute for Cardiometabolism and Nutrition, Paris, France
| | - Jun Yang
- Department of Public Health, Hangzhou Normal University School of Medicine, Hangzhou, China
- Collaborative Innovation Center for the Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University, Hangzhou, China
| | - James G. Wilson
- Department of Physiology and Biophysics, University of Mississippi Medical Center, Jackson, MS, United States of America
| | - Xi Luo
- Department of Biostatistics, Brown University, Providence, RI, United States of America
| | - Daniel Levy
- The Framingham Heart Study, Framingham, MA, USA and the Population Sciences Branch, National Heart, Lung, and Blood Institute, Bethesda, MD, United States of America
| | - Aldons J. Lusis
- Departments of Medicine, Human Genetics, and Microbiology, Immunology, and Molecular Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, United States of America
| | - Simin Liu
- Departments of Epidemiology and Medicine and Center for Global Cardiometabolic Health, Brown University, Providence, RI, United States of America
- Department of Endocrinology, Guangdong General Hospital/Guangdong Academy of Medical Sciences, Guangzhou, Guangdong, China
| | - Xia Yang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, United States of America
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, Los Angeles, CA, United States of America
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA, United States of America
| |
|
583
|
Keck M, Fournier A, Gualtieri F, Walker A, von Rüden EL, Russmann V, Deeg CA, Hauck SM, Krause R, Potschka H. A systems level analysis of epileptogenesis-associated proteome alterations. Neurobiol Dis 2017; 105:164-178. [PMID: 28576708 DOI: 10.1016/j.nbd.2017.05.017] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 10/26/2016] [Revised: 05/22/2017] [Accepted: 05/29/2017] [Indexed: 12/18/2022] Open
Abstract
Despite intense research efforts, knowledge about the mechanisms of epileptogenesis and epilepsy is still considered incomplete and limited. However, an in-depth understanding of molecular pathophysiological processes is crucial for the rational selection of innovative biomarkers and target candidates. Here, we subjected proteomic data from different phases of a chronic rat epileptogenesis model to a comprehensive systems level analysis. Weighted Gene Co-expression Network analysis identified several modules of interconnected protein groups reflecting distinct molecular aspects of epileptogenesis in the hippocampus and the parahippocampal cortex. Characterization of these modules not only further validated the data but also revealed regulation of molecular processes not described previously in the context of epilepsy development. The data sets also provide valuable information about temporal patterns, which should be taken into account in the development of preventive strategies, in particular when it comes to multi-targeting network pharmacology approaches. In addition, principal component analysis suggests candidate biomarkers, which might inform the design of novel molecular imaging approaches aiming to predict epileptogenesis during different phases or confirm epilepsy manifestation. Further studies are necessary to distinguish molecular alterations that correlate with epileptogenesis from those that merely reflect a consequence of the status epilepticus.
Affiliation(s)
- Michael Keck
- Institute of Pharmacology, Toxicology and Pharmacy, Ludwig-Maximilians-University (LMU), 80539 Munich, Germany
| | - Anna Fournier
- Bioinformatics Core, Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 4367 Belvaux, Luxembourg
| | - Fabio Gualtieri
- Institute of Pharmacology, Toxicology and Pharmacy, Ludwig-Maximilians-University (LMU), 80539 Munich, Germany
| | - Andreas Walker
- Institute of Pharmacology, Toxicology and Pharmacy, Ludwig-Maximilians-University (LMU), 80539 Munich, Germany
| | - Eva-Lotta von Rüden
- Institute of Pharmacology, Toxicology and Pharmacy, Ludwig-Maximilians-University (LMU), 80539 Munich, Germany
| | - Vera Russmann
- Institute of Pharmacology, Toxicology and Pharmacy, Ludwig-Maximilians-University (LMU), 80539 Munich, Germany
| | - Cornelia A Deeg
- Institute of Animal Physiology, Department of Veterinary Sciences, Ludwig-Maximilians-University (LMU), 80539 Munich, Germany; Experimental Ophthalmology, Philipps University of Marburg, 35037 Marburg, Germany
| | - Stefanie M Hauck
- Research Unit Protein Science, Helmholtz Center Munich, 85764 Neuherberg, Germany
| | - Roland Krause
- Bioinformatics Core, Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 4367 Belvaux, Luxembourg.
| | - Heidrun Potschka
- Institute of Pharmacology, Toxicology and Pharmacy, Ludwig-Maximilians-University (LMU), 80539 Munich, Germany.
| |
|
584
|
Ferrero E, Dunham I, Sanseau P. In silico prediction of novel therapeutic targets using gene-disease association data. J Transl Med 2017; 15:182. [PMID: 28851378 PMCID: PMC5576250 DOI: 10.1186/s12967-017-1285-6] [Citation(s) in RCA: 58] [Impact Index Per Article: 7.3] [Received: 07/07/2017] [Accepted: 08/22/2017] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Target identification and validation is a pressing challenge in the pharmaceutical industry, with many of the programmes that fail for efficacy reasons showing poor association between the drug target and the disease. Computational prediction of successful targets could have a considerable impact on attrition rates in the drug discovery pipeline by significantly reducing the initial search space. Here, we explore whether gene-disease association data from the Open Targets platform is sufficient to predict therapeutic targets that are actively being pursued by pharmaceutical companies or are already on the market. METHODS To test our hypothesis, we train four different classifiers (a random forest, a support vector machine, a neural network and a gradient boosting machine) on partially labelled data and evaluate their performance using nested cross-validation and testing on an independent set. We then select the best performing model and use it to make predictions on more than 15,000 genes. Finally, we validate our predictions by mining the scientific literature for proposed therapeutic targets. RESULTS We observe that the data types with the best predictive power are animal models showing a disease-relevant phenotype, differential expression in diseased tissue and genetic association with the disease under investigation. On a test set, the neural network classifier achieves over 71% accuracy with an AUC of 0.76 when predicting therapeutic targets in a semi-supervised learning setting. We use this model to gain insights into current and failed programmes and to predict 1431 novel targets, of which a highly significant proportion has been independently proposed in the literature. CONCLUSIONS Our in silico approach shows that data linking genes and diseases is sufficient to predict novel therapeutic targets effectively and confirms that this type of evidence is essential for formulating or strengthening hypotheses in the target discovery process. 
Ultimately, more rapid and automated target prioritisation holds the potential to reduce both the costs and the development times associated with bringing new medicines to patients.
Affiliation(s)
- Enrico Ferrero
- Computational Biology and Stats, Target Sciences, GSK Medicines Research Centre, Gunnels Wood Road, Stevenage, SG1 2NY UK
| | - Ian Dunham
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD UK
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD UK
| | - Philippe Sanseau
- Computational Biology and Stats, Target Sciences, GSK Medicines Research Centre, Gunnels Wood Road, Stevenage, SG1 2NY UK
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD UK
| |
|
585
|
Ghadie MA, Lambourne L, Vidal M, Xia Y. Domain-based prediction of the human isoform interactome provides insights into the functional impact of alternative splicing. PLoS Comput Biol 2017; 13:e1005717. [PMID: 28846689 PMCID: PMC5591010 DOI: 10.1371/journal.pcbi.1005717] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 02/19/2017] [Revised: 09/08/2017] [Accepted: 08/03/2017] [Indexed: 11/19/2022] Open
Abstract
Alternative splicing is known to remodel protein-protein interaction networks (“interactomes”), yet large-scale determination of isoform-specific interactions remains challenging. We present a domain-based method to predict the isoform interactome from the reference interactome. First, we construct the domain-resolved reference interactome by mapping known domain-domain interactions onto experimentally-determined interactions between reference proteins. Then, we construct the isoform interactome by predicting that an isoform loses an interaction if it loses the domain mediating the interaction. Our prediction framework is of high-quality when assessed by experimental data. The predicted human isoform interactome reveals extensive network remodeling by alternative splicing. Protein pairs interacting with different isoforms of the same gene tend to be more divergent in biological function, tissue expression, and disease phenotype than protein pairs interacting with the same isoforms. Our prediction method complements experimental efforts, and demonstrates that integrating structural domain information with interactomes provides insights into the functional impact of alternative splicing. Protein-protein interaction networks have been extensively used in systems biology to study the role of proteins in cell function and disease. However, current network biology studies typically assume that one gene encodes one protein isoform, ignoring the effect of alternative splicing. Alternative splicing allows a gene to produce multiple protein isoforms, by alternatively selecting distinct regions in the gene to be translated to protein products. Here, we present a computational method to predict and analyze the large-scale effect of alternative splicing on protein-protein interaction networks. 
Starting with a reference protein-protein interaction network determined by experiments, our method annotates protein-protein interactions with domain-domain interactions, and predicts that a protein isoform loses an interaction if it loses the domain mediating the interaction as a result of alternative splicing. Our predictions reveal the central role of alternative splicing in extensively remodeling the human protein-protein interaction network, and in increasing the functional complexity of the human cell. Our prediction method complements ongoing experimental efforts by predicting isoform-specific interactions for genes not tested yet by experiments and providing insights into the functional impact of alternative splicing.
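The prediction rule summarized in this abstract (an isoform loses an interaction when it loses the domain mediating that interaction) can be illustrated with a minimal sketch. This is not the authors' implementation. The gene, isoform, and domain names below are hypothetical toy data, and the choice to keep an interaction when at least one mediating domain survives is an assumption made for the example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy data (not from the paper): each reference interaction is
# annotated with the domain pairs assumed to mediate it, in the same order
# as the gene pair.
REFERENCE_DDI: Dict[Tuple[str, str], List[Tuple[str, str]]] = {
    ("GENE_A", "GENE_B"): [("SH3", "PRM")],
    ("GENE_A", "GENE_C"): [("KINASE", "SH2"), ("SH3", "WW")],
}


def predict_isoform_interactions(
    gene: str,
    isoform_domains: Set[str],
    reference_ddi: Dict[Tuple[str, str], List[Tuple[str, str]]],
) -> Dict[str, bool]:
    """Map each reference partner of `gene` to True (retained) or False (lost).

    Assumption for this sketch: an interaction is retained if the isoform
    still carries at least one of the domains annotated as mediating it.
    """
    retained: Dict[str, bool] = {}
    for (gene_1, gene_2), domain_pairs in reference_ddi.items():
        if gene == gene_1:
            partner = gene_2
            own_domains = [first for first, _ in domain_pairs]
        elif gene == gene_2:
            partner = gene_1
            own_domains = [second for _, second in domain_pairs]
        else:
            continue  # this reference interaction does not involve the gene
        retained[partner] = any(d in isoform_domains for d in own_domains)
    return retained


# A hypothetical alternative isoform of GENE_A that kept KINASE but lost SH3.
alt_isoform = predict_isoform_interactions("GENE_A", {"KINASE"}, REFERENCE_DDI)
print(alt_isoform)
```

In this toy case the alternative isoform loses its interaction with GENE_B, which depends only on SH3, and keeps the one with GENE_C through the KINASE domain.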
Affiliation(s)
- Mohamed Ali Ghadie
- Department of Bioengineering, McGill University, Montreal, Québec, Canada
| | - Luke Lambourne
- Department of Bioengineering, McGill University, Montreal, Québec, Canada
| | - Marc Vidal
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Yu Xia
- Department of Bioengineering, McGill University, Montreal, Québec, Canada
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
- * E-mail:
| |
|
586
|
Yea SJ, Kim BY, Kim C, Yi MY. A framework for the targeted selection of herbs with similar efficacy by exploiting drug repositioning technique and curated biomedical knowledge. J Ethnopharmacol 2017; 208:117-128. [PMID: 28687508 DOI: 10.1016/j.jep.2017.06.048] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/28/2017] [Revised: 06/27/2017] [Accepted: 06/27/2017] [Indexed: 06/07/2023]
Abstract
ETHNOPHARMACOLOGICAL RELEVANCE Plants have been the most important natural resources for traditional medicine and for the modern pharmaceutical industry. There has been demand for finding alternative medicinal herbs with similar efficacy. Due to the very low probability of discovering useful compounds by random screening, researchers have advocated for using targeted selection approaches. Furthermore, because drug repositioning can speed up the process of drug development, an integrated technique that exploits chemical, genetic, and disease information has been recently developed. Building upon these findings, in this paper, we propose a novel framework for the targeted selection of herbs with similar efficacy by exploiting drug repositioning technique and curated modern scientific biomedical knowledge, with the goal of improving the possibility of inferring the traditional empirical ethno-pharmacological knowledge. MATERIALS AND METHODS To rank candidate herbs on the basis of their similarity to a target herb, we proposed and evaluated a framework that comprises the following four layers: links, extract, similarity, and model. In the framework, multiple databases are linked to build an herb-compound-protein-disease network composed of one tripartite network and two bipartite networks, allowing comprehensive and detailed information to be extracted. Further, various similarity scores between herbs are calculated, and then prediction models are trained and tested on the basis of these similarity features. RESULTS The proposed framework has been found to be feasible in terms of link loss. Out of the 50 similarities, the best one enhanced the performance of ranking herbs with similar efficacy by about 120-320% compared with our previous study. Also, the prediction model showed improved performance by about 180-480%. 
While building the prediction model, we identified the compound information as being the most important knowledge source and structural similarity as the most useful measure. CONCLUSIONS In the proposed framework, we took the knowledge of herbal medicine, chemistry, biology, and medicine into consideration to rank candidate herbs with similar efficacy. The experimental results demonstrated that the framework outperformed the baselines and identified the important knowledge source and useful similarity measure.
Affiliation(s)
- Sang-Jun Yea
- Graduate School of Knowledge Service Engineering, Korea Advanced Institute of Science and Technology, Republic of Korea; K-herb Research Center, Korea Institute of Oriental Medicine, Republic of Korea
| | - Bu-Yeo Kim
- KM Convergence Research Division, Korea Institute of Oriental Medicine, Republic of Korea
| | - Chul Kim
- K-herb Research Center, Korea Institute of Oriental Medicine, Republic of Korea.
| | - Mun Yong Yi
- Graduate School of Knowledge Service Engineering, Korea Advanced Institute of Science and Technology, Republic of Korea.
| |
|
587
|
Tejera E, Cruz-Monteagudo M, Burgos G, Sánchez ME, Sánchez-Rodríguez A, Pérez-Castillo Y, Borges F, Cordeiro MNDS, Paz-Y-Miño C, Rebelo I. Consensus strategy in genes prioritization and combined bioinformatics analysis for preeclampsia pathogenesis. BMC Med Genomics 2017; 10:50. [PMID: 28789679 PMCID: PMC5549357 DOI: 10.1186/s12920-017-0286-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 02/08/2017] [Accepted: 07/28/2017] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Preeclampsia is a multifactorial disease with unknown pathogenesis. Although recent studies have explored this disease using several bioinformatics tools, their main objective was not pathogenesis. Additionally, consensus prioritization has proved to be highly efficient in recognizing gene-disease associations. However, no information is available about the ability of consensus prioritization to recognize early the genes directly involved in pathogenesis. Therefore, our aim in this study is to apply several theoretical approaches to explore preeclampsia, specifically those genes directly involved in the pathogenesis. METHODS We first evaluated the consensus between 12 prioritization strategies for the early recognition of pathogenic genes related to preeclampsia. A communality analysis in the protein-protein interaction network of previously selected genes was done, including further enrichment analysis. The enrichment analysis includes metabolic pathways as well as gene ontology. Microarray data were also collected and used in order to confirm our results or as a strategy to weight the previously enriched pathways. RESULTS The consensus prioritized gene list was rationally filtered to 476 genes using several criteria. The communality analysis showed an enrichment of communities connected with the VEGF-signaling pathway. This pathway is also enriched considering the microarray data. Our results point to VEGF, FLT1 and KDR as relevant pathogenic genes, as well as those connected with NO metabolism. CONCLUSION Our results revealed that the consensus strategy improves the detection and initial enrichment of pathogenic genes, at least in preeclampsia. Moreover, the combination of the first percent of the prioritized genes with the protein-protein interaction network followed by communality analysis reduces the gene space. This approach actually identifies well-known genes related to pathogenesis. 
However, genes like HSP90, PAK2, CD247 and others included in the first 1% of the prioritized list need to be further explored in preeclampsia pathogenesis through experimental approaches.
Affiliation(s)
- Eduardo Tejera
- Facultad de Medicina, Universidad de Las Américas, Av. de los Granados E12-41y Colimes esq, EC170125, Quito, Ecuador.
| | - Maykel Cruz-Monteagudo
- Department of Molecular and Cellular Pharmacology, Miller School of Medicine and Center for Computational Science, University of Miami, FL 33136, Miami, USA.,Department of General Education, West Coast University-Miami Campus, Doral, FL 33178, USA.,CIQUP/Departamento de Quimica e Bioquimica, Faculdade de Ciências, Universidade do Porto, 4169-007, Porto, Portugal.,REQUIMTE, Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007, Porto, Portugal
| | - Germán Burgos
- Facultad de Medicina, Universidad de Las Américas, Av. de los Granados E12-41y Colimes esq, EC170125, Quito, Ecuador
| | - María-Eugenia Sánchez
- Facultad de Medicina, Universidad de Las Américas, Av. de los Granados E12-41y Colimes esq, EC170125, Quito, Ecuador
| | - Aminael Sánchez-Rodríguez
- Departamento de Ciencias Naturales, Universidad Técnica Particular de Loja, Calle París S/N, EC1101608, Loja, Ecuador
| | | | - Fernanda Borges
- CIQUP/Departamento de Quimica e Bioquimica, Faculdade de Ciências, Universidade do Porto, 4169-007, Porto, Portugal
| | | | - César Paz-Y-Miño
- Centro de Investigaciones genética y genómica, Facultad de Ciencias de la Salud, Universidad Tecnológica Equinoccial, Quito, Ecuador
| | - Irene Rebelo
- Faculty of Pharmacy, University of Porto, Porto, Portugal.,UCIBIO@REQUIMTE, Caparica, Portugal
| |
|
588
|
Rani J, Mittal I, Pramanik A, Singh N, Dube N, Sharma S, Puniya BL, Raghunandanan MV, Mobeen A, Ramachandran S. T2DiACoD: A Gene Atlas of Type 2 Diabetes Mellitus Associated Complex Disorders. Sci Rep 2017; 7:6892. [PMID: 28761062 PMCID: PMC5537262 DOI: 10.1038/s41598-017-07238-0] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 09/06/2016] [Accepted: 06/28/2017] [Indexed: 12/11/2022] Open
Abstract
We performed an integrative analysis of genes associated with complications of type 2 Diabetes Mellitus (T2DM) by automated text mining with manual curation, together with gene expression analysis of data from the Gene Expression Omnibus. The genes were analysed for pathogenic or protective role, trends, interaction with risk factors, Gene Ontology enrichment and tissue-wise differential expression. The database T2DiACoD houses 650 genes and 34 microRNAs associated with T2DM complications. Seven genes, AGER, TNFRSF11B, CRK, PON1, ADIPOQ, CRP and NOS3, are associated with all 5 complications. Several genes are studied in multiple years in all complications, with a high proportion in cardiovascular (75.8%) and atherosclerosis (51.3%). Skeletal muscle tissues of T2DM patients showed high fold change in differentially expressed genes. Among the differentially expressed genes, VEGFA is associated with several complications of T2DM. A few genes (ACE2, ADCYAP1, HDAC4, NCF1, NFE2L2, OSM, SMAD1, TGFB1, BDNF, SYVN1, TXNIP, CD36, CYP2J2, NLRP3) are catalogued with details of their protective role. Obesity is clearly a dominant risk factor interacting with the genes of T2DM complications, followed by inflammation, diet and stress to variable extents. The information emerging from the integrative approach used in this work could benefit further therapeutic approaches. T2DiACoD is available at http://t2diacod.igib.res.in/.
Affiliation(s)
- Jyoti Rani
- G N Ramachandran Knowledge of Centre, Council of Scientific and Industrial Research - Institute of Genomics and Integrative Biology (CSIR-IGIB), Room No. 130, Mathura Road, New Delhi, 110025, India
| | - Inna Mittal
- G N Ramachandran Knowledge of Centre, Council of Scientific and Industrial Research - Institute of Genomics and Integrative Biology (CSIR-IGIB), Room No. 130, Mathura Road, New Delhi, 110025, India
| | - Atreyi Pramanik
- G N Ramachandran Knowledge of Centre, Council of Scientific and Industrial Research - Institute of Genomics and Integrative Biology (CSIR-IGIB), Room No. 130, Mathura Road, New Delhi, 110025, India
| | - Namita Singh
- G N Ramachandran Knowledge of Centre, Council of Scientific and Industrial Research - Institute of Genomics and Integrative Biology (CSIR-IGIB), Room No. 130, Mathura Road, New Delhi, 110025, India
| | - Namita Dube
- G N Ramachandran Knowledge of Centre, Council of Scientific and Industrial Research - Institute of Genomics and Integrative Biology (CSIR-IGIB), Room No. 130, Mathura Road, New Delhi, 110025, India
| | - Smriti Sharma
- G N Ramachandran Knowledge of Centre, Council of Scientific and Industrial Research - Institute of Genomics and Integrative Biology (CSIR-IGIB), Room No. 130, Mathura Road, New Delhi, 110025, India
| | - Bhanwar Lal Puniya
- G N Ramachandran Knowledge of Centre, Council of Scientific and Industrial Research - Institute of Genomics and Integrative Biology (CSIR-IGIB), Room No. 130, Mathura Road, New Delhi, 110025, India
| | - Muthukurussi Varieth Raghunandanan
- G N Ramachandran Knowledge of Centre, Council of Scientific and Industrial Research - Institute of Genomics and Integrative Biology (CSIR-IGIB), Room No. 130, Mathura Road, New Delhi, 110025, India
| | - Ahmed Mobeen
- G N Ramachandran Knowledge of Centre, Council of Scientific and Industrial Research - Institute of Genomics and Integrative Biology (CSIR-IGIB), Room No. 130, Mathura Road, New Delhi, 110025, India
- Academy of Scientific and Innovative Research, CSIR-IGIB South Campus, New Delhi, 110025, India
| | - Srinivasan Ramachandran
- G N Ramachandran Knowledge of Centre, Council of Scientific and Industrial Research - Institute of Genomics and Integrative Biology (CSIR-IGIB), Room No. 130, Mathura Road, New Delhi, 110025, India.
- Academy of Scientific and Innovative Research, CSIR-IGIB South Campus, New Delhi, 110025, India.
| |
|
589
|
Summer G, Kelder T, Radonjic M, van Bilsen M, Wopereis S, Heymans S. The Network Library: a framework to rapidly integrate network biology resources. Bioinformatics 2017; 32:i473-i478. [PMID: 27587664 DOI: 10.1093/bioinformatics/btw436] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 01/22/2023] Open
Abstract
MOTIVATION Much of the biological knowledge accumulated over the last decades is stored in different databases governed by various organizations and institutes. Integrating and connecting these vast knowledge repositories is an extremely useful method to support life sciences research and help formulate novel hypotheses. RESULTS We developed the Network Library (NL), a framework and toolset to rapidly integrate different knowledge sources to build a network biology resource that matches a specific research question. As a use-case we explore the interactions of genes related to heart failure with miRNAs and diseases through the integration of 6 databases. AVAILABILITY AND IMPLEMENTATION The NL is open-source, developed in Java and available on Github (https://github.com/gsummer). CONTACT georg.summer@gmail.com.
Affiliation(s)
- Georg Summer
- CARIM, Maastricht University, Maastricht, The Netherlands; TNO, Zeist, The Netherlands
| |
|
590
|
|
591
|
Regan KE, Payne PR, Li F. Integrative network and transcriptomics-based approach predicts genotype-specific drug combinations for melanoma. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2017; 2017:247-256. [PMID: 28815138 PMCID: PMC5543336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
Abstract
Computational methods for drug combination predictions are needed to identify effective therapies that improve durability and prevent drug resistance in an efficient manner. In this paper, we present SynGeNet, a computational method that integrates transcriptomics data characterizing disease and drug z-score profiles with network mining algorithms in order to predict synergistic drug combinations. We compare SynGeNet to other available transcriptomics-based tools to predict drug combinations validated across melanoma cell lines in three genotype groups: BRAF-mutant, NRAS-mutant and combined. We showed that SynGeNet outperforms other available tools in predicting validated drug combinations and single agents tested as part of additional drug pairs. Interestingly, we observed that the performance of SynGeNet decreased when the network construction step was removed and improved when the proportion of matched-genotype validation cell lines increased. These results suggest that delineating functional information from transcriptomics data via network mining and genomic features can improve drug combination predictions.
Affiliation(s)
- Kelly E. Regan
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
| | - Philip R.O. Payne
- Institute for Informatics, Washington University in St. Louis, St. Louis, MO, USA
| | - Fuhai Li
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
| |
|
592
|
Rubio-Perez C, Guney E, Aguilar D, Piñero J, Garcia-Garcia J, Iadarola B, Sanz F, Fernandez-Fuentes N, Furlong LI, Oliva B. Genetic and functional characterization of disease associations explains comorbidity. Sci Rep 2017; 7:6207. [PMID: 28740175 PMCID: PMC5524755 DOI: 10.1038/s41598-017-04939-4] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 07/05/2016] [Accepted: 05/23/2017] [Indexed: 12/19/2022] Open
Abstract
Understanding relationships between diseases, such as comorbidities, has important socio-economic implications, ranging from clinical study design to health care planning. Most studies characterize disease comorbidity using shared genetic origins, ignoring pathway-based commonalities between diseases. In this study, we define the disease pathways using an interactome-based extension of known disease-genes and introduce several measures of functional overlap. The analysis reveals 206 significant links among 94 diseases, giving rise to a highly clustered disease association network. We observe that around 95% of the links in the disease network, though not identified by genetic overlap, are discovered by functional overlap. This disease network portrays rheumatoid arthritis, asthma, atherosclerosis, pulmonary diseases and Crohn's disease as hubs, thus pointing to common inflammatory processes underlying disease pathophysiology. We identify several previously described associations, such as the inverse comorbidity relationship between Alzheimer's disease and neoplasms. Furthermore, we investigate the disruptions in protein interactions by mapping mutations onto the domains involved in the interaction, suggesting hypotheses on the causal link between diseases. Finally, we provide several proof-of-principle examples in which we model the effect of the mutation and the change of the association strength, which could explain the observed comorbidity between diseases caused by the same genetic alterations.
Affiliation(s)
- Carlota Rubio-Perez
- Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), 08028, Barcelona, Spain.,Structural Bioinformatics Group, GRIB, IMIM, Department of Experimental and Life Sciences, Universitat Pompeu Fabra, 08003, Barcelona, Catalonia, Spain
| | - Emre Guney
- Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), 08028, Barcelona, Spain.,Center for Complex Network Research and Department of Physics, Northeastern University, Boston, 02115, MA, USA
| | - Daniel Aguilar
- Structural Bioinformatics Group, GRIB, IMIM, Department of Experimental and Life Sciences, Universitat Pompeu Fabra, 08003, Barcelona, Catalonia, Spain.,Barcelona Institute for Global Health (ISGlobal), 08003, Barcelona, Catalonia, Spain
| | - Janet Piñero
- Integrative Biomedical Informatics Group, GRIB, IMIM, Department of Experimental and Life Sciences, Universitat Pompeu Fabra, Barcelona, 08003, Catalonia, Spain
| | - Javier Garcia-Garcia
- Structural Bioinformatics Group, GRIB, IMIM, Department of Experimental and Life Sciences, Universitat Pompeu Fabra, 08003, Barcelona, Catalonia, Spain.,Integrative Biomedical Informatics Group, GRIB, IMIM, Department of Experimental and Life Sciences, Universitat Pompeu Fabra, Barcelona, 08003, Catalonia, Spain
| | - Barbara Iadarola
- Structural Bioinformatics Group, GRIB, IMIM, Department of Experimental and Life Sciences, Universitat Pompeu Fabra, 08003, Barcelona, Catalonia, Spain
| | - Ferran Sanz
- Integrative Biomedical Informatics Group, GRIB, IMIM, Department of Experimental and Life Sciences, Universitat Pompeu Fabra, Barcelona, 08003, Catalonia, Spain
| | - Narcís Fernandez-Fuentes
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, SY23 3EB, United Kingdom.
| | - Laura I Furlong
- Integrative Biomedical Informatics Group, GRIB, IMIM, Department of Experimental and Life Sciences, Universitat Pompeu Fabra, Barcelona, 08003, Catalonia, Spain.
| | - Baldo Oliva
- Structural Bioinformatics Group, GRIB, IMIM, Department of Experimental and Life Sciences, Universitat Pompeu Fabra, 08003, Barcelona, Catalonia, Spain.
| |
|
593
|
Development of zebrafish medulloblastoma-like PNET model by TALEN-mediated somatic gene inactivation. Oncotarget 2017; 8:55280-55297. [PMID: 28903419 PMCID: PMC5589658 DOI: 10.18632/oncotarget.19424] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 11/22/2016] [Accepted: 07/11/2017] [Indexed: 01/09/2023] Open
Abstract
Genetically engineered animal tumor models have traditionally been generated by the gain of single or multiple oncogenes or the loss of tumor suppressor genes; however, the development of live animal models has been difficult given that cancer phenotypes are generally induced by somatic mutation rather than by germline genetic inactivation. In this study, we developed somatically mutated tumor models using TALEN-mediated somatic gene inactivation of cdkn2a/b or rb1 tumor suppressor genes in zebrafish. One-cell stage injection of cdkn2a/b-TALEN mRNA resulted in malignant peripheral nerve sheath tumors with high frequency (about 39%) and early onset (about 35 weeks of age) in F0 tp53e7/e7 mutant zebrafish. Injection of rb1-TALEN mRNA also led to the formation of brain tumors at high frequency (58%, 31 weeks of age) in F0 tp53e7/e7 mutant zebrafish. Analysis of each tumor induced by somatic inactivation showed that the targeted genes had bi-allelic mutations. Tumors induced by rb1 somatic inactivation were characterized as medulloblastoma-like primitive neuroectodermal tumors based on incidence location, histopathological features, and immunohistochemical tests. In addition, 3' mRNA Quanti-Seq analysis showed differential activation of genes involved in cell cycle, DNA replication, and protein synthesis; especially, genes involved in neuronal development were up-regulated.
|
594
|
Genome-Wide Linkage Analysis of Large Multiple Multigenerational Families Identifies Novel Genetic Loci for Coronary Artery Disease. Sci Rep 2017; 7:5472. [PMID: 28710368 PMCID: PMC5511258 DOI: 10.1038/s41598-017-05381-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 11/24/2016] [Accepted: 05/30/2017] [Indexed: 01/10/2023] Open
Abstract
Coronary artery disease (CAD) is the leading cause of death, and genetic factors contribute significantly to the risk of CAD. This study aims to identify new CAD genetic loci through a large-scale linkage analysis of 24 large, multigenerational families with 433 family members (GeneQuest II). All family members were genotyped with markers spaced every 10 cM, and a model-free nonparametric linkage (NPL-all) analysis was carried out. Two highly significant CAD loci were identified on chromosome 17q21.2 (NPL score of 6.20) and 7p22.2 (NPL score of 5.19). We also identified four loci with significant NPL scores between 4.09 and 4.99 on 2q33.3, 3q29, 5q13.2 and 9q22.33. Similar analyses in individual families confirmed the six significant CAD loci and identified seven new highly significant linkages on 9p24.2, 9q34.2, 12q13.13, 15q26.1, 17q22, 20p12.3, and 22q12.1, and two significant loci on 2q11.2 and 11q14.1. Two loci on 3q29 and 9q22.33 were also successfully replicated in our previous linkage analysis of 428 nuclear families. Moreover, two published risk variants identified by GWAS, SNP rs46522 in UBE2Z and SNP rs6725887 in WDR12, were found within the 17q21.2 and 2q33.3 loci. These studies lay a foundation for future identification of causative variants and genes for CAD.
|
595
|
Wu WS, Tu BW, Chen TT, Hou SW, Tseng JT. CSmiRTar: Condition-Specific microRNA targets database. PLoS One 2017; 12:e0181231. [PMID: 28704505 PMCID: PMC5509330 DOI: 10.1371/journal.pone.0181231] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 02/05/2017] [Accepted: 06/28/2017] [Indexed: 01/11/2023] Open
Abstract
MicroRNAs (miRNAs) are functional RNA molecules that play important roles in post-transcriptional regulation. miRNAs regulate their target genes by repressing translation or inducing degradation of the target genes’ mRNAs. Many databases have been constructed to provide computationally predicted miRNA targets. However, they cannot provide the miRNA targets expressed in a specific tissue and related to a specific disease at the same time. Moreover, they cannot provide the common targets of multiple miRNAs and the common miRNAs of multiple genes at the same time. To solve these two problems, we constructed a database called CSmiRTar (Condition-Specific miRNA Targets). CSmiRTar collects computationally predicted targets of 2588 human miRNAs and 1945 mouse miRNAs from the four most widely used miRNA target prediction databases (miRDB, TargetScan, microRNA.org and DIANA-microT) and implements functional filters that allow users to search (i) a miRNA’s targets expressed in a specific tissue and/or related to a specific disease, (ii) multiple miRNAs’ common targets expressed in a specific tissue and/or related to a specific disease, (iii) a gene’s miRNAs related to a specific disease, and (iv) multiple genes’ common miRNAs related to a specific disease. We believe that CSmiRTar will be a useful database for biologists to study the molecular mechanisms of post-transcriptional regulation in human or mouse. CSmiRTar is available at http://cosbi.ee.ncku.edu.tw/CSmiRTar/ or http://cosbi4.ee.ncku.edu.tw/CSmiRTar/.
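The "common targets of multiple miRNAs expressed in a specific tissue" filter described in this abstract amounts to a set intersection. The following is a minimal sketch of that idea only; it is not CSmiRTar's implementation or API, and the function name, data layout, and the miRNA and gene labels are all hypothetical placeholders.

```python
def common_targets(predictions, mirnas, tissue_expressed=None):
    """Return genes predicted as targets of every miRNA in `mirnas`.

    predictions: dict mapping a miRNA name to a set of predicted target genes
                 (hypothetical layout, not the database's schema).
    tissue_expressed: optional set of genes expressed in the tissue of interest;
                      when given, the result is restricted to those genes.
    """
    target_sets = [predictions.get(m, set()) for m in mirnas]
    if not target_sets:
        return set()
    # Genes shared by all requested miRNAs.
    common = set.intersection(*target_sets)
    if tissue_expressed is not None:
        # Keep only genes that are also expressed in the chosen tissue.
        common &= tissue_expressed
    return common


# Toy data with made-up labels; these are not real predictions.
toy_predictions = {
    "miR-A": {"GENE1", "GENE2", "GENE3"},
    "miR-B": {"GENE2", "GENE3", "GENE4"},
}
toy_tissue = {"GENE3", "GENE4"}

shared = common_targets(toy_predictions, ["miR-A", "miR-B"])
shared_in_tissue = common_targets(toy_predictions, ["miR-A", "miR-B"], toy_tissue)
```

A disease filter could be applied in the same way, by intersecting with a set of disease-related genes.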
Affiliation(s)
- Wei-Sheng Wu
- Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
- * E-mail: (WSW); (JTT)
| | - Bor-Wen Tu
- Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
| | - Tsung-Te Chen
- Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
| | - Shang-Wei Hou
- Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
| | - Joseph T. Tseng
- Department of Biotechnology and Bioindustry Sciences, National Cheng Kung University, Tainan, Taiwan
- * E-mail: (WSW); (JTT)
| |
|
596
|
Venkatesan A, Kim JH, Talo F, Ide-Smith M, Gobeill J, Carter J, Batista-Navarro R, Ananiadou S, Ruch P, McEntyre J. SciLite: a platform for displaying text-mined annotations as a means to link research articles with biological data. Wellcome Open Res 2017; 1:25. [PMID: 28948232 PMCID: PMC5527546 DOI: 10.12688/wellcomeopenres.10210.2] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Accepted: 07/06/2017] [Indexed: 12/31/2022] Open
Abstract
The tremendous growth in biological data has resulted in an increase in the number of research papers being published. This presents a great challenge for scientists in searching and assimilating facts described in those papers. In particular, biological databases depend on curators to add highly precise and useful information that is usually extracted by reading research articles. Therefore, there is an urgent need to find ways to improve linking literature to the underlying data, thereby minimising the effort in browsing content and identifying key biological concepts. As part of the development of Europe PMC, we have developed a new platform, SciLite, which integrates text-mined annotations from different sources and overlays those outputs on research articles. The aim is to help researchers and curators using Europe PMC find key concepts more easily and to provide links to related resources or tools, bridging the gap between literature and biological data.
Affiliation(s)
- Aravind Venkatesan
- Literature Service group, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Jee-Hyub Kim
- Literature Service group, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Francesco Talo
- Literature Service group, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Michele Ide-Smith
- Literature Service group, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Julien Gobeill
- SIB Text Mining, Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Jacob Carter
- National Centre for Text Mining (NaCTeM), Manchester Institute of Biotechnology, Manchester, UK
| | - Riza Batista-Navarro
- National Centre for Text Mining (NaCTeM), Manchester Institute of Biotechnology, Manchester, UK
| | - Sophia Ananiadou
- National Centre for Text Mining (NaCTeM), Manchester Institute of Biotechnology, Manchester, UK
| | - Patrick Ruch
- SIB Text Mining, Swiss Institute of Bioinformatics, Geneva, Switzerland.,Bibliomics and Text Mining Group (BiTeM), HES-SO, Geneva, Switzerland
| | - Johanna McEntyre
- Literature Service group, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| |
|
597
|
Lysenko A, Boroevich KA, Tsunoda T. Arete - candidate gene prioritization using biological network topology with additional evidence types. BioData Min 2017; 10:22. [PMID: 28694847 PMCID: PMC5501438 DOI: 10.1186/s13040-017-0141-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 08/24/2016] [Accepted: 06/12/2017] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND: Refinement of candidate gene lists to select the most promising candidates for further experimental verification remains an essential step between high-throughput exploratory analysis and the discovery of specific causal genes. Given the qualitative and semantic complexity of biological data, successfully addressing this challenge requires development of flexible and interoperable solutions for making the best possible use of the largest possible fraction of all available data. RESULTS: We have developed an easily accessible framework that links two established network-based gene prioritization approaches with a supporting isolation forest-based integrative ranking method. The defining feature of the method is that both topological information of the biological networks and additional sources of evidence can be considered at the same time. The implementation was realized as an app extension for the Cytoscape graph analysis suite, and therefore can further benefit from the synergy with other analysis methods available as part of this system. CONCLUSIONS: We provide efficient reference implementations of two popular gene prioritization algorithms, DIAMOnD and random walk with restart, for the Cytoscape system. An extension of those methods was also developed that allows outputs of these algorithms to be combined with additional data. To demonstrate the utility of our software, we present two example disease gene prioritization application cases and show how our tool can be used to evaluate these different approaches.
Affiliation(s)
- Artem Lysenko
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi, Yokohama, 230-0045 Japan
| | - Keith Anthony Boroevich
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi, Yokohama, 230-0045 Japan
| | - Tatsuhiko Tsunoda
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi, Yokohama, 230-0045 Japan.,Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, 1-5-45 Yushima, Bunkyo-ku, Tokyo, 113-8510 Japan.,CREST, JST, Tokyo, 113-8510 Japan
| |
|
598
|
Salhi A, Essack M, Alam T, Bajic VP, Ma L, Radovanovic A, Marchand B, Schmeier S, Zhang Z, Bajic VB. DES-ncRNA: A knowledgebase for exploring information about human micro and long noncoding RNAs based on literature-mining. RNA Biol 2017; 14:963-971. [PMID: 28387604 PMCID: PMC5546543 DOI: 10.1080/15476286.2017.1312243] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 01/04/2017] [Revised: 02/23/2017] [Accepted: 03/24/2017] [Indexed: 01/08/2023] Open
Abstract
Noncoding RNAs (ncRNAs), particularly microRNAs (miRNAs) and long ncRNAs (lncRNAs), are important players in diseases and are emerging as novel drug targets. Thus, unraveling the relationships between ncRNAs and other biomedical entities in cells is critical for better understanding ncRNA roles, which may eventually help develop their use in medicine. To support ncRNA research and facilitate retrieval of relevant information regarding miRNAs and lncRNAs from the plethora of published ncRNA-related research, we developed DES-ncRNA (www.cbrc.kaust.edu.sa/des_ncrna). DES-ncRNA is a knowledgebase containing text- and data-mined information from public scientific literature and other public resources. Exploration of mined information is enabled through terms and pairs of terms from 19 topic-specific dictionaries including, for example, antibiotics, toxins, drugs, enzymes, mutations, pathways, human genes and proteins, drug indications and side effects, diseases, etc. DES-ncRNA contains approximately 878,000 associations of terms from these dictionaries, of which 36,222 relate to miRNAs and 5,373 to lncRNAs. We provide users with several ways to explore information regarding ncRNAs, including controlled generation of association networks as well as hypothesis generation. We show an example of how DES-ncRNA can aid research on Alzheimer disease and suggest a potential therapeutic role for Fasudil. DES-ncRNA is a powerful tool that can be used on its own or as a complement to existing resources to support research in human ncRNA. To our knowledge, this is the only knowledgebase dedicated to human miRNAs and lncRNAs derived primarily through literature-mining that enables exploration of a broad spectrum of associated biomedical entities, not paralleled by any other resource.
Affiliation(s)
- Adil Salhi
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, Kingdom of Saudi Arabia
| | - Magbubah Essack
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, Kingdom of Saudi Arabia
| | - Tanvir Alam
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, Kingdom of Saudi Arabia
| | - Vladan P. Bajic
- VINCA Institute of Nuclear Sciences, Belgrade, Republic of Serbia
| | - Lina Ma
- BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences (CAS), Beijing, China
| | - Aleksandar Radovanovic
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, Kingdom of Saudi Arabia
| | | | - Sebastian Schmeier
- Massey University Auckland, Institute of Natural and Mathematical Sciences, Albany, Auckland, New Zealand
| | - Zhang Zhang
- BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences (CAS), Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Collaborative Innovation Center of Genetics and Development, Fudan University, Shanghai, China
| | - Vladimir B. Bajic
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, Kingdom of Saudi Arabia
| |
|
599
|
Ansari S, Donato M, Saberian N, Draghici S. An approach to infer putative disease-specific mechanisms using neighboring gene networks. Bioinformatics 2017; 33:1987-1994. [PMID: 28200075 PMCID: PMC5870849 DOI: 10.1093/bioinformatics/btx097] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/14/2016] [Revised: 01/18/2017] [Accepted: 02/10/2017] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: The ultimate goal of any experiment is to understand the biological phenomena underlying the condition investigated. This process often results in a gene network through which a certain biological mechanism is explained. Such networks have proven to be extremely useful for the prediction of mechanisms of action of drugs or the responses of an organism to a specific impact (e.g. a disease, a treatment, etc.). Here, we introduce an approach able to build a network that captures the putative mechanisms at play in the given condition, by using datasets from multiple experiments studying the same phenotype. This method takes advantage of known interactions extracted from multiple sources such as protein-protein interactions and curated biological pathways. Based on such prior knowledge, we overcome the drawbacks of snap-shot data by considering the possible effects of each gene on its neighbors. RESULTS: We show the effectiveness of this approach in three different case studies and validate the results in two ways, considering the identified genes and the interactions between them. We compare our findings with the results of two widely used methods in the same category as well as the classical approach of selecting differentially expressed (DE) genes in an investigated condition. The results show that 'neighbor-net' analysis is able to report biological mechanisms that are significantly relevant to the given diseases in all three case studies, and performs better than all reference methods using both validation approaches. AVAILABILITY AND IMPLEMENTATION: The proposed method is implemented in R and will be available as a Bioconductor package. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sahar Ansari
- Department of Computer Science, Wayne State University, Detroit, MI, USA
| | - Michele Donato
- Department of Computer Science, Wayne State University, Detroit, MI, USA
| | - Nafiseh Saberian
- Department of Computer Science, Wayne State University, Detroit, MI, USA
| | - Sorin Draghici
- Department of Computer Science, Wayne State University, Detroit, MI, USA
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, MI, USA
| |
|
600
|
Liu Y, Zeng X, He Z, Zou Q. Inferring microRNA-disease associations by random walk on a heterogeneous network with multiple data sources. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2017; 14:905-915. [PMID: 27076459 DOI: 10.1109/tcbb.2016.2550432] [Citation(s) in RCA: 209] [Impact Index Per Article: 26.1] [Indexed: 05/21/2023]
Abstract
Since the discovery of the regulatory function of microRNA (miRNA), increased attention has focused on identifying the relationship between miRNA and disease. It has been suggested that computational methods are an efficient way to identify potential disease-related miRNAs for further confirmation using biological experiments. In this paper, we first highlighted three limitations commonly associated with previous computational methods. To resolve these limitations, we established a disease similarity subnetwork and a miRNA similarity subnetwork by integrating multiple data sources, where the disease similarity is composed of disease semantic similarity and disease functional similarity, and the miRNA similarity is calculated using the miRNA-target gene and miRNA-lncRNA (long non-coding RNA) associations. Then, a heterogeneous network was constructed by connecting the disease similarity subnetwork and the miRNA similarity subnetwork using the known miRNA-disease associations. We extended random walk with restart to predict miRNA-disease associations in the heterogeneous network. The leave-one-out cross-validation achieved an average area under the curve (AUC) of 0.8049 across 341 diseases and 476 miRNAs. For five-fold cross-validation, our method achieved an AUC from 0.7970 to 0.9249 for 15 human diseases. Case studies further demonstrated the feasibility of our method to discover potential miRNA-disease associations. An online service for prediction is freely available at http://ifmda.aliapp.com.
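Random walk with restart, which this abstract extends to a heterogeneous miRNA-disease network, can be illustrated on a toy graph. The sketch below is the standard single-network form of the iteration, p(t+1) = (1 - r) W p(t) + r p(0), with W the column-normalised adjacency matrix; it is not the authors' extended method. The function name, the restart probability of 0.7, and the four-node network are assumptions made for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until it stops changing.

    adj: square list-of-lists of non-negative edge weights.
    seeds: indices of the nodes the walker restarts from.
    restart, tol, max_iter: illustrative defaults, not values from the paper.
    """
    n = len(adj)
    # Column-normalise the adjacency matrix; an isolated node keeps a zero column.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    W = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    # Initial distribution: uniform over the seed nodes.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        delta = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if delta < tol:
            break
    return p


# Toy heterogeneous network (made up): nodes 0 and 1 are miRNAs, 2 and 3 are diseases.
# Edges: miRNA0-miRNA1 (similarity), disease2-disease3 (similarity),
# miRNA0-disease2 (known association).
toy_adj = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
# Seed the walk at disease 3, which has no known miRNA association.
scores = random_walk_with_restart(toy_adj, seeds=[3])
```

In this toy case miRNA 0 receives a higher score than miRNA 1, because it is linked to a disease similar to the seed disease while miRNA 1 is reachable only through miRNA 0.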
|