1
Kim H, Westerman KE, Smith K, Chiou J, Cole JB, Majarian T, von Grotthuss M, Kwak SH, Kim J, Mercader JM, Florez JC, Gaulton K, Manning AK, Udler MS. High-throughput genetic clustering of type 2 diabetes loci reveals heterogeneous mechanistic pathways of metabolic disease. Diabetologia 2023; 66:495-507. [PMID: 36538063 PMCID: PMC10108373 DOI: 10.1007/s00125-022-05848-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 08/05/2022] [Accepted: 10/28/2022] [Indexed: 12/24/2022]
Abstract
AIMS/HYPOTHESIS Type 2 diabetes is highly polygenic and influenced by multiple biological pathways. Rapid expansion in the number of type 2 diabetes loci can be leveraged to identify such pathways.
METHODS We developed a high-throughput pipeline to enable clustering of type 2 diabetes loci based on variant-trait associations. Our pipeline extracted summary statistics from genome-wide association studies (GWAS) for type 2 diabetes and related traits to generate a matrix of 323 variants × 64 trait associations and applied Bayesian non-negative matrix factorisation (bNMF) to identify genetic components of type 2 diabetes. Epigenomic enrichment analysis was performed in 28 cell types and single pancreatic cells. We generated cluster-specific polygenic scores and performed regression analysis in an independent cohort (N=25,419) to assess for clinical relevance.
RESULTS We identified ten clusters of genetic loci, recapturing the five from our prior analysis as well as novel clusters related to beta cell dysfunction, pronounced insulin secretion, and levels of alkaline phosphatase, lipoprotein A and sex hormone-binding globulin. Four clusters related to mechanisms of insulin deficiency, five to insulin resistance and one had an unclear mechanism. The clusters displayed tissue-specific epigenomic enrichment, notably with the two beta cell clusters differentially enriched in functional and stressed pancreatic beta cell states. Additionally, cluster-specific polygenic scores were differentially associated with patient clinical characteristics and outcomes. The pipeline was applied to coronary artery disease and chronic kidney disease, identifying multiple overlapping clusters with type 2 diabetes.
CONCLUSIONS/INTERPRETATION Our approach stratifies type 2 diabetes loci into physiologically interpretable genetic clusters associated with distinct tissues and clinical outcomes. The pipeline allows for efficient updating as additional GWAS become available and can be readily applied to other conditions, facilitating clinical translation of GWAS findings. Software to perform this clustering pipeline is freely available.
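The factorisation step described in this abstract can be illustrated with a small sketch. It is not the authors' pipeline: plain multiplicative-update NMF stands in for the Bayesian variant (bNMF), the splitting of each trait into a positive and a negative column is an assumed way of making the matrix non-negative, and the toy z-scores and the names `split_signs`, `nmf` and `clusters` are hypothetical.

```python
import random


def split_signs(z):
    """Make a signed variants x traits matrix non-negative by giving each
    trait one column for positive and one for negative associations
    (an assumed preprocessing step for this sketch)."""
    return [[max(v, 0.0) for v in row] + [max(-v, 0.0) for v in row] for row in z]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def nmf(x, k, iters=1000, seed=0, eps=1e-9):
    """Plain NMF (x ~ w @ h) with multiplicative updates; NOT Bayesian NMF."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # Update trait loadings h, holding variant weights w fixed.
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # Update variant weights w, holding h fixed.
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


# Hypothetical z-scores: 4 variants x 2 traits.
z = [[3.0, -2.0], [2.5, -1.5], [-0.5, 4.0], [0.0, 3.5]]
w, h = nmf(split_signs(z), k=2)

# Assign each variant to the component on which it has the largest weight.
clusters = [max(range(2), key=lambda c: row[c]) for row in w]
```

In this toy setting the rows of `w` play the role of variant-to-cluster weights and the rows of `h` the trait loadings; a Bayesian formulation would additionally help choose the number of components, which is fixed by hand here.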
Affiliation(s)
- Hyunkyung Kim
  - Diabetes Unit, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Kenneth E Westerman
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Clinical and Translational Epidemiology Unit, Massachusetts General Hospital, Boston, MA, USA
- Kirk Smith
  - Diabetes Unit, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Joshua Chiou
  - Department of Pediatrics, University of California San Diego, San Diego, CA, USA
- Joanne B Cole
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
  - Division of Endocrinology, Boston Children's Hospital, Boston, MA, USA
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, USA
- Marcin von Grotthuss
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Takeda Pharmaceuticals, Cambridge, MA, USA
- Soo Heon Kwak
  - Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea
- Jaegil Kim
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - GlaxoSmithKline, Cambridge, MA, USA
- Josep M Mercader
  - Diabetes Unit, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Jose C Florez
  - Diabetes Unit, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Kyle Gaulton
  - Department of Pediatrics, University of California San Diego, San Diego, CA, USA
- Alisa K Manning
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Clinical and Translational Epidemiology Unit, Massachusetts General Hospital, Boston, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Miriam S Udler
  - Diabetes Unit, Massachusetts General Hospital, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
2
Louadi Z, Elkjaer ML, Klug M, Lio CT, Fenn A, Illes Z, Bongiovanni D, Baumbach J, Kacprowski T, List M, Tsoy O. Functional enrichment of alternative splicing events with NEASE reveals insights into tissue identity and diseases. Genome Biol 2021; 22:327. [PMID: 34857024 PMCID: PMC8638120 DOI: 10.1186/s13059-021-02538-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/15/2021] [Accepted: 11/10/2021] [Indexed: 01/27/2023]
Abstract
Alternative splicing (AS) is an important aspect of gene regulation. Nevertheless, its role in molecular processes and pathobiology is far from understood. A roadblock is that tools for the functional analysis of AS-set events are lacking. To mitigate this, we developed NEASE, a tool integrating pathways with structural annotations of protein-protein interactions to functionally characterize AS events. We show in four application cases how NEASE can identify pathways contributing to tissue identity and cell type development, and how it highlights splicing-related biomarkers. With a unique view on AS, NEASE generates unique and meaningful biological insights complementary to classical pathways analysis.
Affiliation(s)
- Zakaria Louadi
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
  - Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, 22607, Hamburg, Germany
- Maria L Elkjaer
  - Department of Neurology, Odense University Hospital, Odense, Denmark
  - Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
  - Institute of Molecular Medicine, University of Southern Denmark, Odense, Denmark
- Melissa Klug
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
  - Department of Internal Medicine I, School of Medicine, University hospital rechts der Isar, Technical University of Munich, Munich, Germany
  - German Center for Cardiovascular Research (DZHK), Partner Site Munich Heart Alliance, Munich, Germany
- Chit Tong Lio
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
  - Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, 22607, Hamburg, Germany
- Amit Fenn
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
  - Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, 22607, Hamburg, Germany
- Zsolt Illes
  - Department of Neurology, Odense University Hospital, Odense, Denmark
  - Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
  - Institute of Molecular Medicine, University of Southern Denmark, Odense, Denmark
- Dario Bongiovanni
  - Department of Internal Medicine I, School of Medicine, University hospital rechts der Isar, Technical University of Munich, Munich, Germany
  - German Center for Cardiovascular Research (DZHK), Partner Site Munich Heart Alliance, Munich, Germany
  - Department of Cardiovascular Medicine, Humanitas Clinical and Research Center IRCCS and Humanitas University, Rozzano, Milan, Italy
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, 22607, Hamburg, Germany
  - Institute of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, 5000, Odense, Denmark
- Tim Kacprowski
  - Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Markus List
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
- Olga Tsoy
  - Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, 22607, Hamburg, Germany
3
Farooq QUA, Shaukat Z, Aiman S, Li CH. Protein-protein interactions: Methods, databases, and applications in virus-host study. World J Virol 2021; 10:288-300. [PMID: 34909403 PMCID: PMC8641042 DOI: 10.5501/wjv.v10.i6.288] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/15/2021] [Revised: 04/19/2021] [Accepted: 07/30/2021] [Indexed: 02/06/2023]
Abstract
Almost all the cellular processes in a living system are controlled by proteins: they regulate gene expression, catalyze chemical reactions, transport small molecules across membranes, and transmit signals across membranes. Even a viral infection is often initiated through virus-host protein interactions. Protein-protein interactions (PPIs) are the physical contacts between two or more proteins, and they represent complex biological functions. PPIs have been used to construct PPI networks for studying complex pathways and revealing the functions of unknown proteins. Scientists have used PPIs to find the molecular basis of certain diseases and to identify potential drug targets. In this review, we discuss how PPI networks are essential to understanding the molecular basis of virus-host relationships, and we describe several databases dedicated to virus-host interaction studies. We present a short but comprehensive review of PPIs, including the experimental and computational methods of finding PPIs, the databases dedicated to virus-host PPIs, and various associated applications in protein interaction networks of some lethal viruses with their hosts.
Affiliation(s)
- Qurat ul Ain Farooq
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Zeeshan Shaukat
  - Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Sara Aiman
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Chun-Hua Li
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
4
Kosnik MB, Planchart A, Marvel SW, Reif DM, Mattingly CJ. Integration of curated and high-throughput screening data to elucidate environmental influences on disease pathways. Comput Toxicol 2019; 12:100094. [PMID: 31453412 PMCID: PMC6709694 DOI: 10.1016/j.comtox.2019.100094] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 01/07/2023]
Abstract
Addressing the complex relationship between public health and environmental exposure requires multiple types and sources of data. An important source of chemical data derives from high-throughput screening (HTS) efforts, such as the Tox21/ToxCast program, which aim to identify chemical hazard using primarily in vitro assays to probe toxicity. While most of these assays target specific genes, assessing the disease-relevance of these assays remains challenging. Integration with additional data sets may help to resolve these questions by providing broader context for individual assay results. The Comparative Toxicogenomics Database (CTD), a publicly available database that builds networks of chemical, gene, and disease information from manually curated literature sources, offers a promising solution for contextual integration with HTS data. Here, we tested the value of integrating data across Tox21/ToxCast and CTD by linking elements common to both databases (i.e., assays, genes, and chemicals). Using polymarcine and Parkinson's disease as a case study, we found that their union significantly increased chemical-gene associations and disease-pathway coverage. Integration also enabled new disease associations to be made with HTS assays, expanding coverage of chemical-gene data associated with diseases. We demonstrate how integration enables development of predictive adverse outcome pathways using 4-nonylphenol, branched as an example. Thus, we demonstrate enhancements to each data source through database integration, including scenarios where HTS data can efficiently probe chemical space that may be understudied in the literature, as well as how CTD can add biological context to those results.
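The idea of linking two sources through shared elements can be sketched as a union of chemical-gene pairs that records where each pair came from. This is a minimal illustration only: the `integrate` function, the record layout and the placeholder identifiers (`chemX`, `GENE1`, and so on) are hypothetical and do not reflect the actual Tox21/ToxCast or CTD schemas.

```python
def integrate(curated, screened):
    """Merge chemical-gene pairs from two sources, keeping provenance.

    Both inputs are sets of (chemical, gene) tuples; the result maps each
    pair to the set of sources that report it.
    """
    merged = {}
    for source, pairs in (("curated", curated), ("hts", screened)):
        for pair in pairs:
            merged.setdefault(pair, set()).add(source)
    return merged


# Hypothetical toy records standing in for curated and screening data.
curated = {("chemX", "GENE1"), ("chemX", "GENE2")}
screened = {("chemX", "GENE2"), ("chemX", "GENE3"), ("chemY", "GENE1")}

merged = integrate(curated, screened)

# Associations that only the screening data contributes.
new_from_hts = sorted(pair for pair, src in merged.items() if src == {"hts"})
```

Keeping the provenance set per pair makes it easy to ask either question raised in the abstract: which associations the screening data adds, and which screening results gain curated context.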
Affiliation(s)
- Marissa B. Kosnik
  - Toxicology Program, North Carolina State University, Raleigh, NC 27695-7617, United States
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695-7617, United States
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, United States
- Antonio Planchart
  - Toxicology Program, North Carolina State University, Raleigh, NC 27695-7617, United States
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, United States
  - Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695-7617, United States
- Skylar W. Marvel
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695-7617, United States
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, United States
- David M. Reif
  - Toxicology Program, North Carolina State University, Raleigh, NC 27695-7617, United States
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695-7617, United States
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, United States
  - Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695-7617, United States
- Carolyn J. Mattingly
  - Toxicology Program, North Carolina State University, Raleigh, NC 27695-7617, United States
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, United States
  - Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695-7617, United States
5
Abstract
PURPOSE OF REVIEW Type 2 diabetes (T2D), which accounts for the vast majority of diabetes cases, is essentially a diagnosis of exclusion in current clinical practice. Therefore, it is not surprising that T2D is heterogenous in terms of patients' clinical presentation, disease course, and response to treatment. This review summarizes published attempts to improve diabetes subclassification, with a particular focus on the role of genetics. RECENT FINDINGS A handful of diabetes subclassification schemas have been proposed using clinical data (patient characteristics and laboratory values), with some subgroups associated with distinct management trends or complication risks. However, phenotypically driven classifications suffer from dependencies on time of variable measurement and are not readily linked to disease mechanism. Germline genetic data, in contrast, are essentially unchanged over a person's lifetime and rooted in mechanism. Clustering of T2D genetic loci has identified at least five groupings of loci representing mechanisms of disease that may aid in deconstructing heterogeneity of T2D, but further work is needed to determine clinical utility. Exciting progress in subclassification of diabetes has demonstrated initial steps in deconstructing disease heterogeneity. Incorporation of genetics into classification schemas will require additional research but has the potential to improve our understanding and management of T2D, both as a single disease and as a part of an integrated metabolic disease network.
Affiliation(s)
- Miriam S Udler
  - Massachusetts General Hospital Diabetes Center, 50 Staniford St, Suite 340, Boston, MA, 02114, USA
6
Abstract
BACKGROUND Understanding the genetic basis of disease is an important challenge in biology and medicine. The observation that disease-related proteins often interact with one another has motivated numerous network-based approaches for deciphering disease mechanisms. In particular, protein-protein interaction networks were successfully used to illuminate disease modules, i.e., interacting proteins working in concert to drive a disease. The identification of these modules can further our understanding of disease mechanisms.
METHODS We devised GLADIATOR (GLobal Approach for DIsease AssociaTed mOdule Reconstruction), a global method for predicting multiple disease modules simultaneously. GLADIATOR relies on a gold-standard disease phenotypic similarity to obtain a pan-disease view of the underlying modules. To traverse the search space of potential disease modules, we applied a simulated annealing algorithm aimed at maximizing the correlation between module similarity and the gold-standard phenotypic similarity. Importantly, this optimization is employed over hundreds of diseases simultaneously.
RESULTS GLADIATOR's predicted modules highly agree with current knowledge about disease-related proteins. Furthermore, the modules exhibit high coherence with respect to functional annotations and are highly enriched with known curated pathways, outperforming previous methods. Examination of the predicted proteins shared by similar diseases demonstrates the diverse role of these proteins in mediating related processes across similar diseases. Last, we provide a detailed analysis of the suggested molecular mechanism predicted by GLADIATOR for hyperinsulinism, suggesting novel proteins involved in its pathology.
CONCLUSIONS GLADIATOR predicts disease modules by integrating knowledge of disease-related proteins and phenotypes across multiple diseases. The predicted modules are functionally coherent and are more in line with current biological knowledge compared to modules obtained using previous disease-centric methods. The source code for GLADIATOR can be downloaded from http://www.cs.tau.ac.il/~roded/GLADIATOR.zip.
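The optimisation described in this abstract, simulated annealing that maximises the correlation between module similarity and a gold-standard phenotypic similarity, can be sketched on a toy problem. This is not the GLADIATOR implementation: the Jaccard module similarity, the single-protein toggle move, the geometric cooling schedule, and all names and data (`anneal`, `seeds`, `gold`, proteins `p1` to `p6`) are assumptions made for illustration, and the protein interaction network is left out entirely.

```python
import math
import random


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    # Correlation is undefined for a constant vector; treat it as 0 here.
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def objective(modules, gold):
    """Correlation between pairwise module similarity and gold similarity."""
    pairs = sorted(gold)
    sims = [jaccard(modules[a], modules[b]) for a, b in pairs]
    return pearson(sims, [gold[p] for p in pairs])


def anneal(seeds, candidates, gold, steps=2000, t0=1.0, cooling=0.995, seed=0):
    rng = random.Random(seed)
    current = {d: set(p) for d, p in seeds.items()}
    score = objective(current, gold)
    best, best_score = {d: set(p) for d, p in current.items()}, score
    temp, diseases = t0, sorted(seeds)
    for _ in range(steps):
        d, p = rng.choice(diseases), rng.choice(candidates)
        temp *= cooling
        if p in seeds[d]:
            continue  # seed proteins are never removed from their module
        current[d] ^= {p}  # toggle one protein in one disease module
        new_score = objective(current, gold)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if new_score >= score or rng.random() < math.exp((new_score - score) / temp):
            score = new_score
            if score > best_score:
                best = {k: set(v) for k, v in current.items()}
                best_score = score
        else:
            current[d] ^= {p}  # reject: undo the toggle
    return best, best_score


# Hypothetical toy input: three diseases, one seed protein each.
seeds = {"A": {"p1"}, "B": {"p2"}, "C": {"p3"}}
candidates = ["p1", "p2", "p3", "p4", "p5", "p6"]
gold = {("A", "B"): 0.9, ("A", "C"): 0.1, ("B", "C"): 0.2}

best, best_score = anneal(seeds, candidates, gold)
```

Because all diseases are scored through one shared objective, a change to one module is judged by its effect on every disease pair, which is the "global" aspect the abstract emphasises.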
Affiliation(s)
- Yael Silberberg
  - Department of Molecular Microbiology and Biotechnology, Tel Aviv University, Tel Aviv, Israel
- Martin Kupiec
  - Department of Molecular Microbiology and Biotechnology, Tel Aviv University, Tel Aviv, Israel
- Roded Sharan
  - The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel