1
Li Z, Windels SFL, Malod-Dognin N, Weinberg SM, Marazita ML, Walsh S, Shriver MD, Fardo DW, Claes P, Pržulj N, Van Steen K. Clustering individuals using INMTD: a novel versatile multi-view embedding framework integrating omics and imaging data. Bioinformatics 2025; 41:btaf122. [PMID: 40119919 PMCID: PMC11978392 DOI: 10.1093/bioinformatics/btaf122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2024] [Revised: 02/19/2025] [Accepted: 03/20/2025] [Indexed: 03/25/2025]
Abstract
MOTIVATION: Combining omics and images can lead to a more comprehensive clustering of individuals than classic single-view approaches. Among the various approaches for multi-view clustering, nonnegative matrix tri-factorization (NMTF) and nonnegative Tucker decomposition (NTD) are advantageous in learning low-rank embeddings with promising interpretability. Besides, there is a need to handle unwanted drivers of clusterings (i.e. confounders).
RESULTS: In this work, we introduce a novel multi-view clustering method based on NMTF and NTD, named INMTD, which integrates omics and 3D imaging data to derive unconfounded subgroups of individuals. According to the adjusted Rand index, INMTD outperformed other clustering methods on a synthetic dataset with known clusters. In the application to real-life facial-genomic data, INMTD generated biologically relevant embeddings for individuals, genetics, and facial morphology. By removing confounded embedding vectors, we derived an unconfounded clustering with better internal and external quality; the genetic and facial annotations of each derived subgroup highlighted distinctive characteristics. In conclusion, INMTD can effectively integrate omics data and 3D images for unconfounded clustering with biologically meaningful interpretation.
AVAILABILITY AND IMPLEMENTATION: INMTD is freely available at https://github.com/ZuqiLi/INMTD.
Affiliation(s)
- Zuqi Li
  - Department of Human Genetics, KU Leuven, 3000 Leuven, Belgium
  - Medical Imaging Research Center, UZ Leuven, 3000 Leuven, Belgium
  - GIGA Molecular & Computational Biology, University of Liège, 4000 Liège, Belgium
- Seth M Weinberg
  - Department of Oral and Craniofacial Sciences, Center for Craniofacial and Dental Genetics, University of Pittsburgh, Pittsburgh, PA 15260, United States
  - Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA 15260, United States
- Mary L Marazita
  - Department of Oral and Craniofacial Sciences, Center for Craniofacial and Dental Genetics, University of Pittsburgh, Pittsburgh, PA 15260, United States
  - Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA 15260, United States
- Susan Walsh
  - Department of Biology, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, United States
- Mark D Shriver
  - Department of Anthropology, Pennsylvania State University, University Park, PA 16802, United States
- David W Fardo
  - Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY 40506, United States
- Peter Claes
  - Department of Human Genetics, KU Leuven, 3000 Leuven, Belgium
  - Medical Imaging Research Center, UZ Leuven, 3000 Leuven, Belgium
  - Department of Electrical Engineering, ESAT/PSI, KU Leuven, 3000 Leuven, Belgium
  - Murdoch Children’s Research Institute, Parkville, VIC 3052, Australia
- Nataša Pržulj
  - Barcelona Supercomputing Center, 08034 Barcelona, Spain
  - Department of Computer Science, University College London, London WC1E 6BT, United Kingdom
  - Catalan Institution for Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- Kristel Van Steen
  - Department of Human Genetics, KU Leuven, 3000 Leuven, Belgium
  - GIGA Molecular & Computational Biology, University of Liège, 4000 Liège, Belgium
2
Pržulj N, Malod-Dognin N. Simplicity within biological complexity. Bioinformatics Advances 2025; 5:vbae164. [PMID: 39927291 PMCID: PMC11805345 DOI: 10.1093/bioadv/vbae164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Revised: 10/01/2024] [Accepted: 10/23/2024] [Indexed: 02/11/2025]
Abstract
Motivation: Heterogeneous, interconnected, systems-level, molecular (multi-omic) data have become increasingly available and key in precision medicine. We need to utilize them to better stratify patients into risk groups, discover new biomarkers and targets, repurpose known and discover new drugs to personalize medical treatment. Existing methodologies are limited and a paradigm shift is needed to achieve quantitative and qualitative breakthroughs.
Results: In this perspective paper, we survey the literature and argue for the development of a comprehensive, general framework for embedding of multi-scale molecular network data that would enable their explainable exploitation in precision medicine in linear time. Network embedding methods (also called graph representation learning) map nodes to points in low-dimensional space, so that proximity in the learned space reflects the network's topology-function relationships. They have recently achieved unprecedented performance on hard problems of utilizing few omic data in various biomedical applications. However, research thus far has been limited to special variants of the problems and data, with the performance depending on the underlying topology-function network biology hypotheses, the biomedical applications, and evaluation metrics. The availability of multi-omic data, modern graph embedding paradigms and compute power call for a creation and training of efficient, explainable and controllable models, having no potentially dangerous, unexpected behaviour, that make a qualitative breakthrough. We propose to develop a general, comprehensive embedding framework for multi-omic network data, from models to efficient and scalable software implementation, and to apply it to biomedical informatics, focusing on precision medicine and personalized drug discovery. It will lead to a paradigm shift in the computational and biomedical understanding of data and diseases that will open up ways to solve some of the major bottlenecks in precision medicine and other domains.
Affiliation(s)
- Nataša Pržulj
  - Computational Biology Department, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, 00000, United Arab Emirates
  - Barcelona Supercomputing Center, Barcelona 08034, Spain
  - Department of Computer Science, University College London, London WC1E 6BT, United Kingdom
  - ICREA, Pg. Lluís Companys 23, Barcelona 08010, Spain
3
Windels SFL, Tello Velasco D, Rotkevich M, Malod-Dognin N, Pržulj N. Graphlet-based hyperbolic embeddings capture evolutionary dynamics in genetic networks. Bioinformatics 2024; 40:btae650. [PMID: 39495120 PMCID: PMC11568109 DOI: 10.1093/bioinformatics/btae650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 09/29/2024] [Accepted: 10/30/2024] [Indexed: 11/05/2024]
Abstract
MOTIVATION: Spatial Analysis of Functional Enrichment (SAFE) is a popular tool for biologists to investigate the functional organization of biological networks via highly intuitive 2D functional maps. To create these maps, SAFE uses Spring embedding to project a given network into a 2D space in which nodes connected in the network are near each other in space. However, many biological networks are scale-free, containing highly connected hub nodes. Because Spring embedding fails to separate hub nodes, it provides uninformative embeddings that resemble a 'hairball'. In addition, Spring embedding only captures direct node connectivity in the network and does not consider higher-order node wiring patterns, which are best captured by graphlets, small, connected, nonisomorphic, induced subgraphs. The scale-free structure of biological networks is hypothesized to stem from an underlying low-dimensional hyperbolic geometry, which novel hyperbolic embedding methods try to uncover. These include coalescent embedding, which projects a network onto a 2D disk.
RESULTS: To better capture the functional organization of scale-free biological networks, whilst also going beyond simple direct connectivity patterns, we introduce Graphlet Coalescent (GraCoal) embedding, which embeds nodes nearby on a disk if they frequently co-occur on a given graphlet together. We use GraCoal to extend SAFE-based network analysis. Through SAFE-enabled enrichment analysis, we show that GraCoal outperforms graphlet-based Spring embedding in capturing the functional organization of the genetic interaction networks of fruit fly, budding yeast, fission yeast and Escherichia coli. We show that depending on the underlying graphlet, GraCoal embeddings capture different topology-function relationships. We show that triangle-based GraCoal embedding captures functional redundancies between paralogs.
AVAILABILITY AND IMPLEMENTATION: https://gitlab.bsc.es/swindels/gracoal_embedding.
Affiliation(s)
- Daniel Tello Velasco
  - Barcelona Supercomputing Center, Barcelona 08034, Spain
  - Universitat de Barcelona, Barcelona 08007, Spain
- Mikhail Rotkevich
  - Barcelona Supercomputing Center, Barcelona 08034, Spain
  - Universitat Politècnica de Catalunya, Barcelona 08034, Spain
- Nataša Pržulj
  - Barcelona Supercomputing Center, Barcelona 08034, Spain
  - ICREA, Barcelona 08010, Spain
  - Department of Computer Science, University College London, London WC1E 6BT, United Kingdom
4
Nasser R, Schaffer LV, Ideker T, Sharan R. Multi-modal contrastive learning of subcellular organization using DICE. Bioinformatics 2024; 40:ii105-ii110. [PMID: 39230695 PMCID: PMC11520230 DOI: 10.1093/bioinformatics/btae387] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
The data deluge in biology calls for computational approaches that can integrate multiple datasets of different types to build a holistic view of biological processes or structures of interest. An emerging paradigm in this domain is the unsupervised learning of data embeddings that can be used for downstream clustering and classification tasks. While such approaches for integrating data of similar types are becoming common, there is scarcer work on consolidating different data modalities such as network and image information. Here, we introduce DICE (Data Integration through Contrastive Embedding), a contrastive learning model for multi-modal data integration. We apply this model to study the subcellular organization of proteins by integrating protein-protein interaction data and protein image data measured in HEK293 cells. We demonstrate the advantage of data integration over any single modality and show that our framework outperforms previous integration approaches. Availability: https://github.com/raminass/protein-contrastive Contact: raminass@gmail.com.
Affiliation(s)
- Rami Nasser
  - School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
- Leah V Schaffer
  - Department of Medicine, University of California, San Diego, San Diego, CA 92037, United States
- Trey Ideker
  - Department of Medicine, University of California, San Diego, San Diego, CA 92037, United States
  - Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA 92037, United States
  - Moores Cancer Center, University of California, San Diego, San Diego, CA 92037, United States
  - Department of Bioengineering, University of California, San Diego, San Diego, CA 92037, United States
- Roded Sharan
  - School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
5
Zitnik M, Li MM, Wells A, Glass K, Morselli Gysi D, Krishnan A, Murali TM, Radivojac P, Roy S, Baudot A, Bozdag S, Chen DZ, Cowen L, Devkota K, Gitter A, Gosline SJC, Gu P, Guzzi PH, Huang H, Jiang M, Kesimoglu ZN, Koyuturk M, Ma J, Pico AR, Pržulj N, Przytycka TM, Raphael BJ, Ritz A, Sharan R, Shen Y, Singh M, Slonim DK, Tong H, Yang XH, Yoon BJ, Yu H, Milenković T. Current and future directions in network biology. Bioinformatics Advances 2024; 4:vbae099. [PMID: 39143982 PMCID: PMC11321866 DOI: 10.1093/bioadv/vbae099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 05/31/2024] [Accepted: 07/08/2024] [Indexed: 08/16/2024]
Abstract
Summary: Network biology is an interdisciplinary field bridging computational and biological sciences that has proved pivotal in advancing the understanding of cellular functions and diseases across biological systems and scales. Although the field has been around for two decades, it remains nascent. It has witnessed rapid evolution, accompanied by emerging challenges. These stem from various factors, notably the growing complexity and volume of data together with the increased diversity of data types describing different tiers of biological organization. We discuss prevailing research directions in network biology, focusing on molecular/cellular networks but also on other biological network types such as biomedical knowledge graphs, patient similarity networks, brain networks, and social/contact networks relevant to disease spread. In more detail, we highlight areas of inference and comparison of biological networks, multimodal data integration and heterogeneous networks, higher-order network analysis, machine learning on networks, and network-based personalized medicine. Following the overview of recent breakthroughs across these five areas, we offer a perspective on future directions of network biology. Additionally, we discuss scientific communities, educational initiatives, and the importance of fostering diversity within the field. This article establishes a roadmap for an immediate and long-term vision for network biology.
Availability and implementation: Not applicable.
Affiliation(s)
- Marinka Zitnik
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
- Michelle M Li
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
- Aydin Wells
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
  - Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, United States
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, United States
- Kimberly Glass
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Deisy Morselli Gysi
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, United States
  - Department of Statistics, Federal University of Paraná, Curitiba, Paraná 81530-015, Brazil
  - Department of Physics, Northeastern University, Boston, MA 02115, United States
- Arjun Krishnan
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, United States
- T M Murali
  - Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, United States
- Predrag Radivojac
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, United States
- Sushmita Roy
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States
  - Wisconsin Institute for Discovery, Madison, WI 53715, United States
- Anaïs Baudot
  - Aix Marseille Université, INSERM, MMG, Marseille, France
- Serdar Bozdag
  - Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States
  - Department of Mathematics, University of North Texas, Denton, TX 76203, United States
- Danny Z Chen
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Lenore Cowen
  - Department of Computer Science, Tufts University, Medford, MA 02155, United States
- Kapil Devkota
  - Department of Computer Science, Tufts University, Medford, MA 02155, United States
- Anthony Gitter
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States
  - Morgridge Institute for Research, Madison, WI 53715, United States
- Sara J C Gosline
  - Biological Sciences Division, Pacific Northwest National Laboratory, Seattle, WA 98109, United States
- Pengfei Gu
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Pietro H Guzzi
  - Department of Medical and Surgical Sciences, University Magna Graecia of Catanzaro, Catanzaro, 88100, Italy
- Heng Huang
  - Department of Computer Science, University of Maryland College Park, College Park, MD 20742, United States
- Meng Jiang
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Ziynet Nesibe Kesimoglu
  - Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States
  - National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, United States
- Mehmet Koyuturk
  - Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, OH 44106, United States
- Jian Ma
  - Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, United States
- Alexander R Pico
  - Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, United States
- Nataša Pržulj
  - Department of Computer Science, University College London, London, WC1E 6BT, England
  - ICREA, Catalan Institution for Research and Advanced Studies, Barcelona, 08010, Spain
  - Barcelona Supercomputing Center (BSC), Barcelona, 08034, Spain
- Teresa M Przytycka
  - National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, United States
- Benjamin J Raphael
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, United States
- Anna Ritz
  - Department of Biology, Reed College, Portland, OR 97202, United States
- Roded Sharan
  - School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
- Yang Shen
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, United States
- Mona Singh
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, United States
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, United States
- Donna K Slonim
  - Department of Computer Science, Tufts University, Medford, MA 02155, United States
- Hanghang Tong
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
- Xinan Holly Yang
  - Department of Pediatrics, University of Chicago, Chicago, IL 60637, United States
- Byung-Jun Yoon
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, United States
  - Computational Science Initiative, Brookhaven National Laboratory, Upton, NY 11973, United States
- Haiyuan Yu
  - Department of Computational Biology, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, United States
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
  - Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, United States
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, United States
6
Arici MK, Tuncbag N. Unveiling hidden connections in omics data via pyPARAGON: an integrative hybrid approach for disease network construction. Brief Bioinform 2024; 25:bbae399. [PMID: 39163205 PMCID: PMC11334722 DOI: 10.1093/bib/bbae399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 06/26/2024] [Accepted: 08/07/2024] [Indexed: 08/22/2024]
Abstract
Network inference or reconstruction algorithms play an integral role in successfully analyzing and identifying causal relationships between omics hits for detecting dysregulated and altered signaling components in various contexts, encompassing disease states and drug perturbations. However, accurate representation of signaling networks and identification of context-specific interactions within sparse omics datasets in complex interactomes pose significant challenges in integrative approaches. To address these challenges, we present pyPARAGON (PAgeRAnk-flux on Graphlet-guided network for multi-Omic data integratioN), a novel tool that combines network propagation with graphlets. pyPARAGON enhances accuracy and minimizes the inclusion of nonspecific interactions in signaling networks by utilizing network rather than relying on pairwise connections among proteins. Through comprehensive evaluations on benchmark signaling pathways, we demonstrate that pyPARAGON outperforms state-of-the-art approaches in node propagation and edge inference. Furthermore, pyPARAGON exhibits promising performance in discovering cancer driver networks. Notably, we demonstrate its utility in network-based stratification of patient tumors by integrating phosphoproteomic data from 105 breast cancer tumors with the interactome and demonstrating tumor-specific signaling pathways. Overall, pyPARAGON is a novel tool for analyzing and integrating multi-omic data in the context of signaling networks. pyPARAGON is available at https://github.com/netlab-ku/pyPARAGON.
Affiliation(s)
- Muslum Kaan Arici
  - Graduate School of Informatics, Middle East Technical University, Ankara 06800, Turkey
- Nurcan Tuncbag
  - Chemical and Biological Engineering, College of Engineering, Koc University, Istanbul 34450, Turkey
  - School of Medicine, Koc University, Istanbul 34450, Turkey
  - Koc University Research Center for Translational Medicine (KUTTAM), Koc University, Istanbul 34450, Turkey
7
Nussinov R, Yavuz BR, Demirel HC, Arici MK, Jang H, Tuncbag N. Review: Cancer and neurodevelopmental disorders: multi-scale reasoning and computational guide. Front Cell Dev Biol 2024; 12:1376639. [PMID: 39015651 PMCID: PMC11249571 DOI: 10.3389/fcell.2024.1376639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Accepted: 06/10/2024] [Indexed: 07/18/2024]
Abstract
The connection and causality between cancer and neurodevelopmental disorders have been puzzling. How can the same cellular pathways, proteins, and mutations lead to pathologies with vastly different clinical presentations? And why do individuals with neurodevelopmental disorders, such as autism and schizophrenia, face higher chances of cancer emerging throughout their lifetime? Our broad review emphasizes the multi-scale aspect of this type of reasoning. As these examples demonstrate, rather than focusing on a specific organ system or disease, we aim at the new understanding that can be gained. Within this framework, our review calls attention to computational strategies which can be powerful in discovering connections, causalities, predicting clinical outcomes, and are vital for drug discovery. Thus, rather than centering on the clinical features, we draw on the rapidly increasing data on the molecular level, including mutations, isoforms, three-dimensional structures, and expression levels of the respective disease-associated genes. Their integrated analysis, together with chromatin states, can delineate how, despite being connected, neurodevelopmental disorders and cancer differ, and how the same mutations can lead to different clinical symptoms. Here, we seek to uncover the emerging connection between cancer, including pediatric tumors, and neurodevelopmental disorders, and the tantalizing questions that this connection raises.
Affiliation(s)
- Ruth Nussinov
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Cancer Innovation Laboratory, National Cancer Institute, Frederick, MD, United States
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv-Yafo, Israel
- Bengi Ruken Yavuz
  - Cancer Innovation Laboratory, National Cancer Institute, Frederick, MD, United States
- M. Kaan Arici
  - Graduate School of Informatics, Middle East Technical University, Ankara, Türkiye
- Hyunbum Jang
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Cancer Innovation Laboratory, National Cancer Institute, Frederick, MD, United States
- Nurcan Tuncbag
  - Department of Chemical and Biological Engineering, Koc University, Istanbul, Türkiye
  - School of Medicine, Koc University, Istanbul, Türkiye
  - Koc University Research Center for Translational Medicine (KUTTAM), Istanbul, Türkiye
8
Gureghian V, Herbst H, Kozar I, Mihajlovic K, Malod-Dognin N, Ceddia G, Angeli C, Margue C, Randic T, Philippidou D, Nomigni MT, Hemedan A, Tranchevent LC, Longworth J, Bauer M, Badkas A, Gaigneaux A, Muller A, Ostaszewski M, Tolle F, Pržulj N, Kreis S. A multi-omics integrative approach unravels novel genes and pathways associated with senescence escape after targeted therapy in NRAS mutant melanoma. Cancer Gene Ther 2023; 30:1330-1345. [PMID: 37420093 PMCID: PMC10581906 DOI: 10.1038/s41417-023-00640-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/24/2023] [Revised: 05/19/2023] [Accepted: 06/21/2023] [Indexed: 07/09/2023]
Abstract
Therapy Induced Senescence (TIS) leads to sustained growth arrest of cancer cells. The associated cytostasis has been shown to be reversible and cells escaping senescence further enhance the aggressiveness of cancers. Chemicals specifically targeting senescent cells, so-called senolytics, constitute a promising avenue for improved cancer treatment in combination with targeted therapies. Understanding how cancer cells evade senescence is needed to optimise the clinical benefits of this therapeutic approach. Here we characterised the response of three different NRAS mutant melanoma cell lines to a combination of CDK4/6 and MEK inhibitors over 33 days. Transcriptomic data show that all cell lines trigger a senescence programme coupled with strong induction of interferons. Kinome profiling revealed the activation of Receptor Tyrosine Kinases (RTKs) and enriched downstream signaling of neurotrophin, ErbB and insulin pathways. Characterisation of the miRNA interactome associates miR-211-5p with resistant phenotypes. Finally, iCell-based integration of bulk and single-cell RNA-seq data identifies biological processes perturbed during senescence and predicts 90 new genes involved in its escape. Overall, our data associate insulin signaling with persistence of a senescent phenotype and suggest a new role for interferon gamma in senescence escape through the induction of EMT and the activation of ERK5 signaling.
Affiliation(s)
- Vincent Gureghian
  - Department of Life Sciences and Medicine, University of Luxembourg, 6, Avenue du Swing, L-4367, Belvaux, Luxembourg
- Hailee Herbst
  - Department of Life Sciences and Medicine, University of Luxembourg, 6, Avenue du Swing, L-4367, Belvaux, Luxembourg
- Ines Kozar
  - Laboratoire National de Santé, Dudelange, Luxembourg
- Gaia Ceddia
  - Barcelona Supercomputing Center, 08034, Barcelona, Spain
- Cristian Angeli
  - Department of Life Sciences and Medicine, University of Luxembourg, 6, Avenue du Swing, L-4367, Belvaux, Luxembourg
- Christiane Margue
  - Department of Life Sciences and Medicine, University of Luxembourg, 6, Avenue du Swing, L-4367, Belvaux, Luxembourg
- Tijana Randic
  - Department of Life Sciences and Medicine, University of Luxembourg, 6, Avenue du Swing, L-4367, Belvaux, Luxembourg
- Demetra Philippidou
  - Department of Life Sciences and Medicine, University of Luxembourg, 6, Avenue du Swing, L-4367, Belvaux, Luxembourg
- Milène Tetsi Nomigni
  - Department of Life Sciences and Medicine, University of Luxembourg, 6, Avenue du Swing, L-4367, Belvaux, Luxembourg
- Ahmed Hemedan
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Leon-Charles Tranchevent
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Joseph Longworth
  - Experimental and Molecular Immunology, Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
- Mark Bauer
  - Department of Life Sciences and Medicine, University of Luxembourg, 6, Avenue du Swing, L-4367, Belvaux, Luxembourg
- Apurva Badkas
  - Department of Life Sciences and Medicine, University of Luxembourg, 6, Avenue du Swing, L-4367, Belvaux, Luxembourg
- Anthoula Gaigneaux
  - Department of Life Sciences and Medicine, University of Luxembourg, 6, Avenue du Swing, L-4367, Belvaux, Luxembourg
- Arnaud Muller
  - LuxGen, TMOH and Bioinformatics platform, Data Integration and Analysis unit, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
- Marek Ostaszewski
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Fabrice Tolle
  - Department of Life Sciences and Medicine, University of Luxembourg, 6, Avenue du Swing, L-4367, Belvaux, Luxembourg
- Nataša Pržulj
  - Barcelona Supercomputing Center, 08034, Barcelona, Spain
  - Department of Computer Science, University College London, London, WC1E 6BT, UK
  - ICREA, Pg. Lluís Companys 23, 08010, Barcelona, Spain
- Stephanie Kreis
  - Department of Life Sciences and Medicine, University of Luxembourg, 6, Avenue du Swing, L-4367, Belvaux, Luxembourg
9
Simonovsky E, Sharon M, Ziv M, Mauer O, Hekselman I, Jubran J, Vinogradov E, Argov CM, Basha O, Kerber L, Yogev Y, Segrè AV, Im HK, GTEx Consortium, Birk O, Rokach L, Yeger‐Lotem E. Predicting molecular mechanisms of hereditary diseases by using their tissue-selective manifestation. Mol Syst Biol 2023; 19:e11407. [PMID: 37232043 PMCID: PMC10407743 DOI: 10.15252/msb.202211407] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/20/2022] [Revised: 04/30/2023] [Accepted: 05/10/2023] [Indexed: 05/27/2023]
Abstract
How do aberrations in widely expressed genes lead to tissue-selective hereditary diseases? Previous attempts to answer this question were limited to testing a few candidate mechanisms. To answer this question at a larger scale, we developed "Tissue Risk Assessment of Causality by Expression" (TRACE), a machine learning approach to predict genes that underlie tissue-selective diseases and selectivity-related features. TRACE utilized 4,744 biologically interpretable tissue-specific gene features that were inferred from heterogeneous omics datasets. Application of TRACE to 1,031 disease genes uncovered known and novel selectivity-related features, the most common of which was previously overlooked. Next, we created a catalog of tissue-associated risks for 18,927 protein-coding genes (https://netbio.bgu.ac.il/trace/). As proof-of-concept, we prioritized candidate disease genes identified in 48 rare-disease patients. TRACE ranked the verified disease gene among the patient's candidate genes significantly better than gene prioritization methods that rank by gene constraint or tissue expression. Thus, tissue selectivity combined with machine learning enhances genetic and clinical understanding of hereditary diseases.
Affiliation(s)
- Eyal Simonovsky
- Department of Clinical Biochemistry and Pharmacology, Ben‐Gurion University of the Negev, Beer Sheva, Israel
- Moran Sharon
- Department of Clinical Biochemistry and Pharmacology, Ben‐Gurion University of the Negev, Beer Sheva, Israel
- Maya Ziv
- Department of Clinical Biochemistry and Pharmacology, Ben‐Gurion University of the Negev, Beer Sheva, Israel
- Omry Mauer
- Department of Clinical Biochemistry and Pharmacology, Ben‐Gurion University of the Negev, Beer Sheva, Israel
- Idan Hekselman
- Department of Clinical Biochemistry and Pharmacology, Ben‐Gurion University of the Negev, Beer Sheva, Israel
- Juman Jubran
- Department of Clinical Biochemistry and Pharmacology, Ben‐Gurion University of the Negev, Beer Sheva, Israel
- Ekaterina Vinogradov
- Department of Clinical Biochemistry and Pharmacology, Ben‐Gurion University of the Negev, Beer Sheva, Israel
- Chanan M Argov
- Department of Clinical Biochemistry and Pharmacology, Ben‐Gurion University of the Negev, Beer Sheva, Israel
- Omer Basha
- Department of Clinical Biochemistry and Pharmacology, Ben‐Gurion University of the Negev, Beer Sheva, Israel
- Lior Kerber
- Department of Clinical Biochemistry and Pharmacology, Ben‐Gurion University of the Negev, Beer Sheva, Israel
- Yuval Yogev
- Morris Kahn Laboratory of Human Genetics and the Genetics Institute at Soroka Medical Center, Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, Israel
- Ayellet V Segrè
- Ocular Genomics Institute, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Hae Kyung Im
- Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, IL, USA
- Ohad Birk
- Morris Kahn Laboratory of Human Genetics and the Genetics Institute at Soroka Medical Center, Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, Israel
- The National Institute for Biotechnology in the Negev, Ben‐Gurion University of the Negev, Beer Sheva, Israel
- Lior Rokach
- Department of Software & Information Systems Engineering, Ben‐Gurion University of the Negev, Beer Sheva, Israel
- Esti Yeger‐Lotem
- Department of Clinical Biochemistry and Pharmacology, Ben‐Gurion University of the Negev, Beer Sheva, Israel
- The National Institute for Biotechnology in the Negev, Ben‐Gurion University of the Negev, Beer Sheva, Israel
10
Nabuco Leva Ferreira de Freitas JA, Bischof O. Dynamic modeling of the cellular senescence gene regulatory network. Heliyon 2023; 9:e14007. [PMID: 36938415 PMCID: PMC10015196 DOI: 10.1016/j.heliyon.2023.e14007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2022] [Revised: 02/13/2023] [Accepted: 02/17/2023] [Indexed: 02/27/2023] Open
Abstract
Cellular senescence is a cell fate that prominently impacts physiological and pathophysiological processes. Diverse cellular stresses induce it, and dramatic gene expression changes accompany it. However, determining the interactions comprising the gene regulatory network (GRN) governing senescence remains challenging. Recent advances in signal processing techniques provide opportunities to reconstruct GRNs. Here, we describe a GRN for senescence integrating time-series transcriptome and transcription factor depletion datasets. Specifically, we infer a set of differential equations using the "Sparse Identification of Nonlinear Dynamics" (SINDy) algorithm, discriminate genes with potential hidden regulators, validate the inferred GRN for time-points not included in the training data, and comprehensively benchmark our approach. Our work is a proof of concept for a data-driven GRN reconstruction method, consolidating an iterative, powerful mathematical platform for senescence modeling that can be used to test hypotheses in silico and has the potential for future discoveries of clinical impact.
Affiliation(s)
- José Américo Nabuco Leva Ferreira de Freitas
- IMRB, Mondor Institute for Biomedical Research, INSERM U955 – Université Paris Est Créteil, UPEC, Faculté de Médecine de Créteil 8, rue du Général Sarrail, 94010 Créteil
- Sorbonne Université, UMR 8256, Biological Adaptation and Ageing B2A–IBPS, F-75005, Paris, France
- INSERM U1164, F-75005, Paris, France
- Oliver Bischof
- IMRB, Mondor Institute for Biomedical Research, INSERM U955 – Université Paris Est Créteil, UPEC, Faculté de Médecine de Créteil 8, rue du Général Sarrail, 94010 Créteil
- Corresponding author.
11
Malod-Dognin N, Ceddia G, Gvozdenov M, Tomić B, Dunjić Manevski S, Djordjević V, Pržulj N. A phenotype driven integrative framework uncovers molecular mechanisms of a rare hereditary thrombophilia. PLoS One 2023; 18:e0284084. [PMID: 37098010 PMCID: PMC10128975 DOI: 10.1371/journal.pone.0284084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Accepted: 03/23/2023] [Indexed: 04/26/2023] Open
Abstract
Antithrombin resistance is a rare subtype of hereditary thrombophilia caused by prothrombin gene variants, leading to thrombotic disorders. Recently, the Prothrombin Belgrade variant has been reported as a specific variant that leads to antithrombin resistance in two Serbian families with thrombosis. However, due to clinical data scarcity and the inapplicability of traditional genome-wide association studies (GWAS), a broader perspective on molecular and phenotypic mechanisms associated with the Prothrombin Belgrade variant is yet to be uncovered. Here, we propose an integrative framework to address the lack of genomic samples and support the genomic signal from the full genome sequences of five heterozygous subjects by integrating it with subjects' phenotypes and the genes' molecular interactions. Our goal is to identify candidate thrombophilia-related genes for which our subjects possess germline variants by focusing on the resulting gene clusters of our integrative framework. We applied a Non-negative Matrix Tri-Factorization-based method to simultaneously integrate different data sources, taking into account the observed phenotypes. In other words, our data-integration framework reveals gene clusters involved with this rare disease by fusing different datasets. Our results are in concordance with the current literature about antithrombin resistance. We also found candidate disease-related genes that need to be further investigated. CD320, RTEL1, UCP2, APOA5 and PROZ participate in healthy-specific or disease-specific subnetworks involving thrombophilia-annotated genes and are related to general thrombophilia mechanisms according to the literature. Moreover, the ADRA2A and TBXA2R subnetwork analysis suggested that their variants may have a protective effect due to their connection with decreased platelet activation. The results show that our method can give insights into antithrombin resistance even if a small amount of genetic data is available. Our framework is also customizable, meaning that it applies to any other rare disease.
Affiliation(s)
- Noël Malod-Dognin
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Department of Computer Science, University College London, London, United Kingdom
- Gaia Ceddia
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Maja Gvozdenov
- Institute of Molecular Genetics and Genetic Engineering (IMGGE), University of Belgrade, Belgrade, Serbia
- Branko Tomić
- Institute of Molecular Genetics and Genetic Engineering (IMGGE), University of Belgrade, Belgrade, Serbia
- Sofija Dunjić Manevski
- Institute of Molecular Genetics and Genetic Engineering (IMGGE), University of Belgrade, Belgrade, Serbia
- Valentina Djordjević
- Institute of Molecular Genetics and Genetic Engineering (IMGGE), University of Belgrade, Belgrade, Serbia
- Nataša Pržulj
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Department of Computer Science, University College London, London, United Kingdom
- ICREA, Barcelona, Spain
12
Adamer MF, Brüningk SC, Tejada-Arranz A, Estermann F, Basler M, Borgwardt K. reComBat: batch-effect removal in large-scale multi-source gene-expression data integration. Bioinformatics Advances 2022; 2:vbac071. [PMID: 36699372 PMCID: PMC9710604 DOI: 10.1093/bioadv/vbac071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Revised: 09/01/2022] [Accepted: 09/26/2022] [Indexed: 01/28/2023]
Abstract
Motivation With the steadily increasing abundance of omics data produced all over the world under vastly different experimental conditions residing in public databases, a crucial step in many data-driven bioinformatics applications is that of data integration. The challenge of batch-effect removal for entire databases lies in the large number of batches and biological variation, which can result in design matrix singularity. This problem can currently not be solved satisfactorily by any common batch-correction algorithm. Results We present reComBat, a regularized version of the empirical Bayes method to overcome this limitation and benchmark it against popular approaches for the harmonization of public gene-expression data (both microarray and bulk RNA-seq) of the human opportunistic pathogen Pseudomonas aeruginosa. Batch-effects are successfully mitigated while biologically meaningful gene-expression variation is retained. reComBat fills the gap in batch-correction approaches applicable to large-scale, public omics databases and opens up new avenues for data-driven analysis of complex biological processes beyond the scope of a single study. Availability and implementation The code is available at https://github.com/BorgwardtLab/reComBat, all data and evaluation code can be found at https://github.com/BorgwardtLab/batchCorrectionPublicData. Supplementary information Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Marek Basler
- Biozentrum, University of Basel, Basel 4056, Switzerland
- Karsten Borgwardt
- Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
- Swiss Institute for Bioinformatics (SIB), Lausanne 1015, Switzerland
13
Forster DT, Li SC, Yashiroda Y, Yoshimura M, Li Z, Isuhuaylas LAV, Itto-Nakama K, Yamanaka D, Ohya Y, Osada H, Wang B, Bader GD, Boone C. BIONIC: biological network integration using convolutions. Nat Methods 2022; 19:1250-1261. [PMID: 36192463 PMCID: PMC11236286 DOI: 10.1038/s41592-022-01616-x] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 12/04/2020] [Accepted: 08/16/2022] [Indexed: 01/21/2023]
Abstract
Biological networks constructed from varied data can be used to map cellular function, but each data type has limitations. Network integration promises to address these limitations by combining and automatically weighting input information to obtain a more accurate and comprehensive representation of the underlying biology. We developed a deep learning-based network integration algorithm that incorporates a graph convolutional network framework. Our method, BIONIC (Biological Network Integration using Convolutions), learns features that contain substantially more functional information compared to existing approaches. BIONIC has unsupervised and semisupervised learning modes, making use of available gene function annotations. BIONIC is scalable in both size and quantity of the input networks, making it feasible to integrate numerous networks on the scale of the human genome. To demonstrate the use of BIONIC in identifying new biology, we predicted and experimentally validated essential gene chemical-genetic interactions from nonessential gene profiles in yeast.
Affiliation(s)
- Duncan T Forster
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- Sheena C Li
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- RIKEN Center for Sustainable Resource Science, Wako, Saitama, Japan
- Yoko Yashiroda
- RIKEN Center for Sustainable Resource Science, Wako, Saitama, Japan
- Mami Yoshimura
- RIKEN Center for Sustainable Resource Science, Wako, Saitama, Japan
- Zhijian Li
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Kaori Itto-Nakama
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
- Daisuke Yamanaka
- Laboratory for Immunopharmacology of Microbial Products, School of Pharmacy, Tokyo University of Pharmacy and Life Sciences, Hachioji, Tokyo, Japan
- Yoshikazu Ohya
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
- Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Tokyo, Japan
- Hiroyuki Osada
- RIKEN Center for Sustainable Resource Science, Wako, Saitama, Japan
- Bo Wang
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- Peter Munk Cardiac Center, University Health Network, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
- Gary D Bader
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- The Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Charles Boone
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- RIKEN Center for Sustainable Resource Science, Wako, Saitama, Japan
14
Ziv M, Gruber G, Sharon M, Vinogradov E, Yeger-Lotem E. The TissueNet v.3 database: Protein-protein interactions in adult and embryonic human tissue contexts. J Mol Biol 2022; 434:167532. [DOI: 10.1016/j.jmb.2022.167532] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/30/2021] [Revised: 03/03/2022] [Accepted: 03/03/2022] [Indexed: 12/28/2022]
15
Chen Z, Chen S, Qiang X. Identification of Biomarker in Brain-specific Gene Regulatory Network Using Structural Controllability Analysis. Frontiers in Bioinformatics 2022; 2:812314. [PMID: 36304271 PMCID: PMC9580899 DOI: 10.3389/fbinf.2022.812314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2021] [Accepted: 01/05/2022] [Indexed: 12/09/2022] Open
Abstract
Brain tumor research has long been a staple of human health research, while brain network research is crucial for understanding brain activity. Here, structural controllability theory is applied to study three human brain-specific gene regulatory networks: the forebrain gene regulatory network, the hindbrain gene regulatory network, and the neuron-associated cell cancer-related gene regulatory network, whose nodes are neural genes and whose edges represent gene expression regulation among the genes. The nodes are classified into two classes, critical nodes and ordinary nodes, based on the change in the number of driver nodes upon a node's removal. Eight topological properties (out-degree DO, in-degree DI, degree D, betweenness B, closeness CA, in-closeness CI, out-closeness CO and clustering coefficient CC) are calculated in this paper, and the results show that the critical genes score higher on these topological properties than the ordinary genes. Then two bioinformatic analyses are used to explore the biological significance of the critical genes. On the one hand, the enrichment scores in several kinds of gene databases are calculated and reveal that the critical nodes are richer in essential genes, cancer genes and neuron-related disease genes than the ordinary nodes, which indicates that the critical nodes may be biomarkers in the brain-specific gene regulatory network. On the other hand, GO analysis and KEGG pathway analysis are applied to them, and the results show that the critical genes mainly take part in 14 KEGG pathways, including transcriptional misregulation in cancer and pathways in cancer, which indicates that the critical genes are related to brain tumors. Finally, the robustness of the node classification is assessed, and confirmed, by deleting edges or routines in the network. The comparison of the neuron-associated cell cancer-related GRN (Gene Regulatory Network) with the normal brain-specific GRNs (forebrain and hindbrain) shows that the cancer-related gene regulatory network is more robust than the other types.
Affiliation(s)
- Zhihua Chen
- The Institute of Computing Science and Technology, Guangzhou University, Guangzhou, China
- Siyuan Chen
- The School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China
- Xiaoli Qiang
- The Institute of Computing Science and Technology, Guangzhou University, Guangzhou, China
- *Correspondence: Xiaoli Qiang,
16
Buphamalai P, Kokotovic T, Nagy V, Menche J. Network analysis reveals rare disease signatures across multiple levels of biological organization. Nat Commun 2021; 12:6306. [PMID: 34753928 PMCID: PMC8578255 DOI: 10.1038/s41467-021-26674-1] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 04/30/2021] [Accepted: 10/19/2021] [Indexed: 01/26/2023] Open
Abstract
Rare genetic diseases are typically caused by a single gene defect. Despite this clear causal relationship between genotype and phenotype, identifying the pathobiological mechanisms at various levels of biological organization remains a practical and conceptual challenge. Here, we introduce a network approach for evaluating the impact of rare gene defects across biological scales. We construct a multiplex network consisting of over 20 million gene relationships that are organized into 46 network layers spanning six major biological scales between genotype and phenotype. A comprehensive analysis of 3,771 rare diseases reveals distinct phenotypic modules within individual layers. These modules can be exploited to mechanistically dissect the impact of gene defects and accurately predict rare disease gene candidates. Our results show that the disease module formalism can be applied to rare diseases and generalized beyond physical interaction networks. These findings open up new venues to apply network-based tools for cross-scale data integration.
Affiliation(s)
- Pisanu Buphamalai
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 14, AKH BT 25.3, 1090, Vienna, Austria
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna BioCenter 5, 1030, Vienna, Austria
- Tomislav Kokotovic
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 14, AKH BT 25.3, 1090, Vienna, Austria
- Ludwig Boltzmann Institute for Rare and Undiagnosed Diseases, Lazarettgasse 14, AKH BT 25.3, 1090, Vienna, Austria
- Department of Neurology, Medical University of Vienna, Währinger Gürtel 18-20, 1090, Vienna, Austria
- Vanja Nagy
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 14, AKH BT 25.3, 1090, Vienna, Austria
- Ludwig Boltzmann Institute for Rare and Undiagnosed Diseases, Lazarettgasse 14, AKH BT 25.3, 1090, Vienna, Austria
- Department of Neurology, Medical University of Vienna, Währinger Gürtel 18-20, 1090, Vienna, Austria
- Jörg Menche
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 14, AKH BT 25.3, 1090, Vienna, Austria
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna BioCenter 5, 1030, Vienna, Austria
- Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz 1, 1090, Vienna, Austria
17
Demirel HC, Arici MK, Tuncbag N. Computational approaches leveraging integrated connections of multi-omic data toward clinical applications. Mol Omics 2021; 18:7-18. [PMID: 34734935 DOI: 10.1039/d1mo00158b] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/18/2023]
Abstract
In line with the advances in high-throughput technologies, multiple omic datasets have accumulated to study biological systems and diseases coherently. No single omics data type is capable of fully representing cellular activity. The complexity of the biological processes arises from the interactions between omic entities such as genes, proteins, and metabolites. Therefore, multi-omic data integration is crucial but challenging. The impact of the molecular alterations in multi-omic data is not local in the neighborhood of the altered gene or protein; rather, the impact diffuses in the network and changes the functionality of multiple signaling pathways and regulation of the gene expression. Additionally, multi-omic data is high-dimensional and has background noise. Several integrative approaches have been developed to accurately interpret the multi-omic datasets, including machine learning, network-based methods, and their combination. In this review, we overview the most recent integrative approaches and tools with a focus on network-based methods. We then discuss these approaches according to their specific applications, from disease-network and biomarker identification to patient stratification, drug discovery, and repurposing.
Affiliation(s)
- Habibe Cansu Demirel
- Graduate School of Informatics, Middle East Technical University, Ankara, 06800, Turkey
- Muslum Kaan Arici
- Graduate School of Informatics, Middle East Technical University, Ankara, 06800, Turkey
- Foot and Mouth Diseases Institute, Ministry of Agriculture and Forestry, Ankara, 06044, Turkey
- Nurcan Tuncbag
- Chemical and Biological Engineering, College of Engineering, Koc University, Istanbul, 34450, Turkey
- School of Medicine, Koc University, Istanbul, 34450, Turkey
- Koc University Research Center for Translational Medicine (KUTTAM), Istanbul, Turkey
18
Arici MK, Tuncbag N. Performance Assessment of the Network Reconstruction Approaches on Various Interactomes. Front Mol Biosci 2021; 8:666705. [PMID: 34676243 PMCID: PMC8523993 DOI: 10.3389/fmolb.2021.666705] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/10/2021] [Accepted: 07/14/2021] [Indexed: 01/04/2023] Open
Abstract
Beyond the list of molecules, there is a necessity to collectively consider multiple sets of omic data and to reconstruct the connections between the molecules. Especially, pathway reconstruction is crucial to understanding disease biology because abnormal cellular signaling may be pathological. The main challenge is how to integrate the data together in an accurate way. In this study, we aim to comparatively analyze the performance of a set of network reconstruction algorithms on multiple reference interactomes. We first explored several human protein interactomes, including PathwayCommons, OmniPath, HIPPIE, iRefWeb, STRING, and ConsensusPathDB. The comparison is based on the coverage of each interactome in terms of cancer driver proteins, structural information of protein interactions, and the bias toward well-studied proteins. We next used these interactomes to evaluate the performance of network reconstruction algorithms including all-pair shortest path, heat diffusion with flux, personalized PageRank with flux, and prize-collecting Steiner forest (PCSF) approaches. Each approach has its own merits and weaknesses. Among them, PCSF had the most balanced performance in terms of precision and recall scores when 28 pathways from NetPath were reconstructed using the listed algorithms. Additionally, the reference interactome affects the performance of the network reconstruction approaches. The coverage and disease- or tissue-specificity of each interactome may vary, which may result in differences in the reconstructed networks.
Affiliation(s)
- M Kaan Arici
- Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
- Foot and Mouth Diseases Institute, Ministry of Agriculture and Forestry, Ankara, Turkey
- Nurcan Tuncbag
- Chemical and Biological Engineering, College of Engineering, Koc University, Istanbul, Turkey
- School of Medicine, Koc University, Istanbul, Turkey
19
Thomas JP, Modos D, Korcsmaros T, Brooks-Warburton J. Network Biology Approaches to Achieve Precision Medicine in Inflammatory Bowel Disease. Front Genet 2021; 12:760501. [PMID: 34745229 PMCID: PMC8566351 DOI: 10.3389/fgene.2021.760501] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/18/2021] [Accepted: 10/08/2021] [Indexed: 12/22/2022] Open
Abstract
Inflammatory bowel disease (IBD) is a chronic immune-mediated condition arising due to complex interactions between multiple genetic and environmental factors. Despite recent advances, the pathogenesis of the condition is not fully understood and patients still experience suboptimal clinical outcomes. Over the past few years, investigators are increasingly capturing multi-omics data from patient cohorts to better characterise the disease. However, reaching clinically translatable endpoints from these complex multi-omics datasets is an arduous task. Network biology, a branch of systems biology that utilises mathematical graph theory to represent, integrate and analyse biological data through networks, will be key to addressing this challenge. In this narrative review, we provide an overview of various types of network biology approaches that have been utilised in IBD including protein-protein interaction networks, metabolic networks, gene regulatory networks and gene co-expression networks. We also include examples of multi-layered networks that have combined various network types to gain deeper insights into IBD pathogenesis. Finally, we discuss the need to incorporate other data sources including metabolomic, histopathological, and high-quality clinical meta-data. Together with more robust network data integration and analysis frameworks, such efforts have the potential to realise the key goal of precision medicine in IBD.
Affiliation(s)
- John P Thomas
- Earlham Institute, Norwich, United Kingdom
- Quadram Institute Bioscience, Norwich, United Kingdom
- Department of Gastroenterology, Norfolk and Norwich University Hospital, Norwich, United Kingdom
- Dezso Modos
- Earlham Institute, Norwich, United Kingdom
- Quadram Institute Bioscience, Norwich, United Kingdom
- Tamas Korcsmaros
- Earlham Institute, Norwich, United Kingdom
- Quadram Institute Bioscience, Norwich, United Kingdom
- Johanne Brooks-Warburton
- Department of Gastroenterology, Lister Hospital, Stevenage, United Kingdom
- Department of Clinical, Pharmaceutical and Biological Sciences, University of Hertfordshire, Hatfield, United Kingdom
20
Zambrana C, Xenos A, Böttcher R, Malod-Dognin N, Pržulj N. Network neighbors of viral targets and differentially expressed genes in COVID-19 are drug target candidates. Sci Rep 2021; 11:18985. [PMID: 34556735 PMCID: PMC8460804 DOI: 10.1038/s41598-021-98289-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/05/2021] [Accepted: 08/23/2021] [Indexed: 12/12/2022] Open
Abstract
The COVID-19 pandemic is raging. It revealed the importance of rapid scientific advancement towards understanding and treating new diseases. To address this challenge, we adapt an explainable artificial intelligence algorithm for data fusion and utilize it on new omics data on viral-host interactions, human protein interactions, and drugs to better understand SARS-CoV-2 infection mechanisms and predict new drug-target interactions for COVID-19. We discover that in the human interactome, the human proteins targeted by SARS-CoV-2 proteins and the genes that are differentially expressed after the infection have common neighbors central in the interactome that may be key to the disease mechanisms. We uncover 185 new drug-target interactions targeting 49 of these key genes and suggest re-purposing of 149 FDA-approved drugs, including drugs targeting VEGF and nitric oxide signaling, whose pathways coincide with the observed COVID-19 symptoms. Our integrative methodology is universal and can enable insight into this and other serious diseases.
Affiliation(s)
- Noël Malod-Dognin
- Barcelona Supercomputing Center, Barcelona, Spain
- Department of Computer Science, University College London, London, WC1E 6BT, UK
- Nataša Pržulj
- Barcelona Supercomputing Center, Barcelona, Spain
- Department of Computer Science, University College London, London, WC1E 6BT, UK
- ICREA, Pg. Lluís Companys 23, Barcelona, Spain
21
Zhang T, Zhang SW, Li Y. Identifying Driver Genes for Individual Patients through Inductive Matrix Completion. Bioinformatics 2021; 37:4477-4484. [PMID: 34175939 DOI: 10.1093/bioinformatics/btab477] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/31/2020] [Revised: 04/30/2021] [Accepted: 06/25/2021] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION The driver genes play a key role in the evolutionary process of cancer. Effectively identifying these driver genes is crucial to cancer diagnosis and treatment. However, due to the high heterogeneity of cancers, it remains challenging to identify the driver genes for individual patients. Although some computational methods have been proposed to tackle this problem, they seldom consider the fact that genes functionally similar to well-established driver genes are likely to play similar roles in the cancer process, which could aid driver gene identification. Thus, we developed a novel approach, IMCDriver, to improve driver gene identification both for cohorts and for individual patients. RESULTS IMCDriver first takes the well-established driver genes as prior information and uses multi-omics data (e.g., somatic mutation, gene expression and protein-protein interaction) to compute the similarity between patients/genes. Then, IMCDriver prioritizes the personalized mutated genes according to their functional similarity to the well-established driver genes via Inductive Matrix Completion. Finally, IMCDriver identifies the top-ranked genes as the personalized driver genes. The results on five cancer datasets from TCGA show that IMCDriver outperforms other existing state-of-the-art methods in both cohort and patient-specific driver gene identification. IMCDriver also reveals some novel driver genes that potentially drive cancer development. In addition, even driver genes that are rarely mutated in a population can still be identified by IMCDriver and ranked highly. AVAILABILITY Code available at https://github.com/NWPU-903PR/IMCDriver. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tong Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, China
- School of Electrical and Mechanical Engineering, Pingdingshan University, Pingdingshan, China
| | - Shao-Wu Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, China
| | - Yan Li
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, China
| |
|
22
|
Gaudelet T, Malod-Dognin N, Pržulj N. Integrative Data Analytic Framework to Enhance Cancer Precision Medicine. NETWORK AND SYSTEMS MEDICINE 2021; 4:60-73. [PMID: 33796878 PMCID: PMC8006589 DOI: 10.1089/nsm.2020.0015] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 02/15/2021] [Indexed: 12/20/2022] Open
Abstract
With the advancement of high-throughput biotechnologies, we increasingly accumulate biomedical data about diseases, especially cancer. There is a need for computational models and methods to sift through, integrate, and extract new knowledge from the diverse available data, to improve the mechanistic understanding of diseases and patient care. To uncover molecular mechanisms and drug indications for specific cancer types, we develop an integrative framework able to harness a wide range of diverse molecular and pan-cancer data. We show that our approach outperforms the competing methods and can identify new associations. Furthermore, it captures the underlying biology predictive of drug response. Through the joint integration of data sources, our framework can also uncover links between cancer types and molecular entities for which no prior knowledge is available. Our new framework is flexible and can be easily reformulated to study any biomedical problem.
Affiliation(s)
- Thomas Gaudelet
- Department of Computer Science, University College London, London, United Kingdom
| | - Noël Malod-Dognin
- Department of Computer Science, University College London, London, United Kingdom
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
| | - Nataša Pržulj
- Department of Computer Science, University College London, London, United Kingdom
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- ICREA, Barcelona, Spain
| |
|
23
|
Ning N, Yang Y, Song C, Wu B. An adaptive node embedding framework for multiplex networks. INTELL DATA ANAL 2021. [DOI: 10.3233/ida-195065] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022]
Abstract
Network Embedding (NE) has emerged as a powerful tool in many applications. Many real-world networks have multiple types of relations between the same entities, which are appropriately modeled as multiplex networks. However, in random walk-based embedding studies for multiplex networks, very little attention has been paid to the problems of sampling bias and imbalanced relation types. In this paper, we propose an Adaptive Node Embedding Framework (ANEF) based on cross-layer sampling strategies of nodes for multiplex networks. ANEF is the first framework to focus on the bias issue of sampling strategies. Through Metropolis-Hastings random walk (MHRW) and forest fire sampling (FFS), ANEF is less likely to be trapped in local structures with high-degree nodes. We utilize a fixed-length queue to record previously visited layers, which can balance the edge distribution over different layers in the sampled node sequences. In addition, to adaptively sample the cross-layer context of nodes, we also propose a node metric called Neighbors Partition Coefficient (NPC). Experiments on real-world networks in diverse fields show that our framework outperforms the state-of-the-art methods in application tasks such as cross-domain link prediction and mutual community detection.
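As an illustration of why a Metropolis-Hastings random walk is less drawn to high-degree nodes, here is a minimal single-layer sketch in Python. It is the generic textbook MHRW, not the ANEF implementation: the cross-layer sampling, the layer queue, and the NPC metric are not reproduced. The function name `mhrw`, the adjacency-dictionary input format, and the toy star graph are assumptions made for this example, and every node is assumed to have at least one neighbour.

```python
import random


def mhrw(adj, start, length, rng=None):
    """Generic Metropolis-Hastings random walk on an undirected graph.

    adj maps each node to a list of its neighbours (assumed input format).
    Returns a list of `length` visited nodes, starting at `start`.
    """
    rng = rng or random.Random()
    walk = [start]
    current = start
    while len(walk) < length:
        candidate = rng.choice(adj[current])
        # Accept the move with probability min(1, deg(current) / deg(candidate)).
        # Moves towards higher-degree nodes are rejected more often, which
        # offsets the degree bias of a simple random walk.
        accept = min(1.0, len(adj[current]) / len(adj[candidate]))
        if rng.random() < accept:
            current = candidate
        # On rejection the walker stays put and the current node is repeated.
        walk.append(current)
    return walk


# Toy star graph (hypothetical data): one hub connected to four leaves.
adj = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "d": ["hub"],
}
walk = mhrw(adj, "a", 20, random.Random(0))
print(walk)
```

On the star graph, a simple random walk would return to the hub on every second step, whereas here a leaf-to-hub move is accepted only a quarter of the time.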
|
24
|
Savino A, Provero P, Poli V. Differential Co-Expression Analyses Allow the Identification of Critical Signalling Pathways Altered during Tumour Transformation and Progression. Int J Mol Sci 2020; 21:E9461. [PMID: 33322692 PMCID: PMC7764314 DOI: 10.3390/ijms21249461] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 10/30/2020] [Revised: 12/02/2020] [Accepted: 12/09/2020] [Indexed: 02/02/2023] Open
Abstract
Biological systems respond to perturbations through the rewiring of molecular interactions, organised in gene regulatory networks (GRNs). Among these, the increasingly high availability of transcriptomic data makes gene co-expression networks the most exploited ones. Differential co-expression networks are useful tools to identify changes in response to an external perturbation, such as mutations predisposing to cancer development, and leading to changes in the activity of gene expression regulators or signalling. They can help explain the robustness of cancer cells to perturbations and identify promising candidates for targeted therapy, moreover providing higher specificity with respect to standard co-expression methods. Here, we comprehensively review the literature about the methods developed to assess differential co-expression and their applications to cancer biology. Via the comparison of normal and diseased conditions and of different tumour stages, studies based on these methods led to the definition of pathways involved in gene network reorganisation upon oncogenes' mutations and tumour progression, often converging on immune system signalling. A relevant implementation still lagging behind is the integration of different data types, which would greatly improve network interpretability. Most importantly, performance and predictivity evaluation of the large variety of mathematical models proposed would urgently require experimental validations and systematic comparisons. We believe that future work on differential gene co-expression networks, complemented with additional omics data and experimentally tested, will considerably improve our insights into the biology of tumours.
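To make the idea of differential co-expression concrete, here is a minimal Python sketch of its simplest form: compute the Pearson correlation of each gene pair separately in two conditions and take the difference. This is only one of many approaches the review surveys and is not attributed to any specific method in it. The function names, the dictionary input format, and the toy expression values are assumptions for illustration, and each expression profile is assumed to be non-constant.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def differential_coexpression(expr_a, expr_b):
    """Correlation change (condition B minus condition A) for each gene pair.

    expr_a and expr_b map gene names to lists of expression values across
    the samples of one condition (assumed input format).
    """
    genes = sorted(set(expr_a) & set(expr_b))
    return {
        (g1, g2): pearson(expr_b[g1], expr_b[g2]) - pearson(expr_a[g1], expr_a[g2])
        for g1, g2 in combinations(genes, 2)
    }


# Toy data (hypothetical): G1 and G2 are co-expressed in the normal samples
# and anti-correlated in the tumour samples.
normal = {"G1": [1, 2, 3, 4], "G2": [2, 4, 6, 8], "G3": [4, 1, 3, 2]}
tumour = {"G1": [1, 2, 3, 4], "G2": [8, 6, 4, 2], "G3": [4, 1, 3, 2]}

diff = differential_coexpression(normal, tumour)
for pair, delta in sorted(diff.items(), key=lambda item: abs(item[1]), reverse=True):
    print(pair, round(delta, 3))
```

Pairs with the largest absolute change are the candidate rewired edges; practical methods add significance testing and correction for multiple comparisons, which this sketch omits.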
Affiliation(s)
- Aurora Savino
- Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Turin, Via Nizza 52, 10126 Turin, Italy
| | - Paolo Provero
- Department of Neurosciences “Rita Levi Montalcini”, University of Turin, Corso Massimo D’Ázeglio 52, 10126 Turin, Italy
- Center for Omics Sciences, Ospedale San Raffaele IRCCS, Via Olgettina 60, 20132 Milan, Italy
| | - Valeria Poli
- Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Turin, Via Nizza 52, 10126 Turin, Italy
| |
|
25
|
Freire PP, Fernandez GJ, de Moraes D, Cury SS, Dal Pai‐Silva M, Dos Reis PP, Rogatto SR, Carvalho RF. The authors reply: Comment on "The expression landscape of cachexia-inducing factors in human cancers" by Freire et al. J Cachexia Sarcopenia Muscle 2020; 11:1854-1857. [PMID: 32996709 PMCID: PMC7749551 DOI: 10.1002/jcsm.12635] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/24/2022] Open
Affiliation(s)
- Paula Paccielli Freire
- Department of Structural and Functional Biology, Institute of Biosciences, São Paulo State University, UNESP, Botucatu, Brazil
| | - Geysson Javier Fernandez
- Department of Structural and Functional Biology, Institute of Biosciences, São Paulo State University, UNESP, Botucatu, Brazil
- Faculty of Medicine, University of Antioquia, UdeA, Medellín, Colombia
| | - Diogo de Moraes
- Department of Structural and Functional Biology, Institute of Biosciences, São Paulo State University, UNESP, Botucatu, Brazil
| | - Sarah Santiloni Cury
- Department of Structural and Functional Biology, Institute of Biosciences, São Paulo State University, UNESP, Botucatu, Brazil
| | - Maeli Dal Pai‐Silva
- Department of Structural and Functional Biology, Institute of Biosciences, São Paulo State University, UNESP, Botucatu, Brazil
| | - Patrícia Pintor Dos Reis
- Department of Surgery and Orthopedics, Faculty of Medicine, São Paulo State University, UNESP, Botucatu, Brazil
- Experimental Research Unity, Faculty of Medicine, São Paulo State University, UNESP, Botucatu, Brazil
| | - Silvia Regina Rogatto
- Department of Clinical Genetics, University Hospital, Institute of Regional Health Research, University of Southern Denmark, Vejle, Denmark
- Danish Colorectal Cancer Center South, Vejle, Denmark
| | - Robson Francisco Carvalho
- Department of Structural and Functional Biology, Institute of Biosciences, São Paulo State University, UNESP, Botucatu, Brazil
| |
|
26
|
Abstract
The k-assignment problem (or the k-matching problem) on k-partite graphs is an NP-hard problem for k≥3. In this paper we introduce five new heuristics. Two algorithms, Bm and Cm, arise as natural improvements of Algorithm Am from (He et al., in: Graph Algorithms And Applications 2, World Scientific, 2004). The other three algorithms, Dm, Em, and Fm, incorporate randomization. Algorithm Dm can be considered a greedy version of Bm, whereas Em and Fm are versions of a local search algorithm specialized for the k-matching problem. The algorithms are implemented in Python and are run on three datasets. On the datasets available, all the algorithms clearly outperform Algorithm Am in terms of solution quality. On the first dataset with known optimal values, the average relative error ranges from 1.47% over optimum (algorithm Am) to 0.08% over optimum (algorithm Em). On the second dataset with known optimal values, the average relative error ranges from 4.41% over optimum (algorithm Am) to 0.45% over optimum (algorithm Fm). Better solution quality demands higher computation times; thus, the new algorithms provide a good compromise between solution quality and computation time.
|
27
|
Burke PEP, Campos CBDL, Costa LDF, Quiles MG. A biochemical network modeling of a whole-cell. Sci Rep 2020; 10:13303. [PMID: 32764598 PMCID: PMC7411072 DOI: 10.1038/s41598-020-70145-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/20/2019] [Accepted: 07/23/2020] [Indexed: 01/18/2023] Open
Abstract
All cellular processes can ultimately be understood in terms of fundamental biochemical interactions between molecules, which can be modeled as networks. Very often, these molecules are shared by more than one process, thereby interconnecting them. Despite this, cellular processes are usually described by separate networks with heterogeneous levels of detail, such as metabolic, protein-protein interaction, and transcription regulation networks. Aiming at a unified representation of cellular processes, we describe in this work an integrative framework that draws concepts from rule-based modeling. To probe the capabilities of the framework, we used an organism-specific database and genomic information to model the whole-cell biochemical network of the organism Mycoplasma genitalium. This modeling accounted for 15 cellular processes and resulted in a single-component network, indicating that all processes are somehow interconnected. The topological analysis of the network showed structural consistency with biological networks in the literature. To validate the network, we estimated gene essentiality by simulating gene deletions and compared the results with experimental data available in the literature. We classified 212 genes as essential, 95% of which were consistent with experimental results. Although we adopted a relatively simple organism as a case study, we suggest that the presented framework has the potential to pave the way to more integrated studies of whole organisms, leading to a systemic analysis of cells on a broader scale. The modeling of other organisms using this framework could provide useful large-scale models for different fields of research such as bioengineering, network biology, and synthetic biology, and also provide novel tools for medical and industrial applications.
Affiliation(s)
- Paulo E P Burke
- University of São Paulo, Bioinformatics Graduate Program, São Carlos, SP, Brazil.
| | - Claudia B de L Campos
- Institute of Science and Technology, Federal University of São Paulo, São José dos Campos, SP, Brazil
| | - Luciano da F Costa
- São Carlos Institute of Physics, University of São Paulo, São Carlos, SP, Brazil
| | - Marcos G Quiles
- Institute of Science and Technology, Federal University of São Paulo, São José dos Campos, SP, Brazil
| |
|
28
|
Naquet P, Kerr EW, Vickers SD, Leonardi R. Regulation of coenzyme A levels by degradation: the 'Ins and Outs'. Prog Lipid Res 2020; 78:101028. [PMID: 32234503 DOI: 10.1016/j.plipres.2020.101028] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Received: 12/11/2019] [Revised: 02/09/2020] [Accepted: 02/22/2020] [Indexed: 02/06/2023]
Abstract
Coenzyme A (CoA) is the predominant acyl carrier in mammalian cells and a cofactor that plays a key role in energy and lipid metabolism. CoA and its thioesters (acyl-CoAs) regulate a multitude of metabolic processes at different levels: as substrates, allosteric modulators, and via post-translational modification of histones and other non-histone proteins. Evidence is emerging that synthesis and degradation of CoA are regulated in a manner that enables metabolic flexibility in different subcellular compartments. Degradation of CoA occurs through distinct intra- and extracellular pathways that rely on the activity of specific hydrolases. The pantetheinase enzymes specifically hydrolyze pantetheine to cysteamine and pantothenate, the last step in the extracellular degradation pathway for CoA. This reaction releases pantothenate in the bloodstream, making this CoA precursor available for cellular uptake and de novo CoA synthesis. Intracellular degradation of CoA depends on specific mitochondrial and peroxisomal Nudix hydrolases. These enzymes are also active against a subset of acyl-CoAs and play a key role in the regulation of subcellular (acyl-)CoA pools and CoA-dependent metabolic reactions. The evidence currently available indicates that the extracellular and intracellular (acyl-)CoA degradation pathways are regulated in a coordinated and opposite manner by the nutritional state and maximize the changes in the total intracellular CoA levels that support the metabolic switch between fed and fasted states in organs like the liver. The objective of this review is to update the contribution of these pathways to the regulation of metabolism, physiology and pathology and to highlight the many questions that remain open.
Affiliation(s)
- Philippe Naquet
- Aix Marseille Univ, INSERM, CNRS, Centre d'Immunologie de Marseille-Luminy, Marseille, France.
| | - Evan W Kerr
- Department of Biochemistry, West Virginia University, Morgantown, West Virginia 26506, United States of America
| | - Schuyler D Vickers
- Department of Biochemistry, West Virginia University, Morgantown, West Virginia 26506, United States of America
| | - Roberta Leonardi
- Department of Biochemistry, West Virginia University, Morgantown, West Virginia 26506, United States of America.
| |
|
29
|
Basha O, Argov CM, Artzy R, Zoabi Y, Hekselman I, Alfandari L, Chalifa-Caspi V, Yeger-Lotem E. Differential network analysis of multiple human tissue interactomes highlights tissue-selective processes and genetic disorder genes. Bioinformatics 2020; 36:2821-2828. [DOI: 10.1093/bioinformatics/btaa034] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 04/18/2019] [Revised: 01/07/2020] [Accepted: 01/16/2020] [Indexed: 01/19/2023] Open
Abstract
Motivation
Differential network analysis, designed to highlight network changes between conditions, is an important paradigm in network biology. However, differential network analysis methods have been typically designed to compare between two conditions and were rarely applied to multiple protein interaction networks (interactomes). Importantly, large-scale benchmarks for their evaluation have been lacking.
Results
Here, we present a framework for assessing the ability of differential network analysis of multiple human tissue interactomes to highlight tissue-selective processes and disorders. For this, we created a benchmark of 6499 curated tissue-specific Gene Ontology biological processes. We applied five methods, including four differential network analysis methods, to construct weighted interactomes for 34 tissues. Rigorous assessment of this benchmark revealed that differential analysis methods perform well in revealing tissue-selective processes (AUCs of 0.82–0.9). Next, we applied differential network analysis to illuminate the genes underlying tissue-selective hereditary disorders. For this, we curated a dataset of 1305 tissue-specific hereditary disorders and their manifesting tissues. Focusing on subnetworks containing the top 1% differential interactions in disease-relevant tissue interactomes revealed significant enrichment for disorder-causing genes in 18.6% of the cases, with a significantly high success rate for blood, nerve, muscle and heart diseases.
Summary
Altogether, we offer a framework that includes expansive manually curated datasets of tissue-selective processes and disorders to be used as benchmarks or to illuminate tissue-selective processes and genes. Our results demonstrate that differential analysis of multiple human tissue interactomes is a powerful tool for highlighting processes and genes with tissue-selective functionality and clinical impact.
Availability and implementation
Datasets are available as part of the Supplementary data.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Omer Basha
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Chanan M Argov
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Raviv Artzy
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Yazeed Zoabi
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Idan Hekselman
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Liad Alfandari
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Vered Chalifa-Caspi
- National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
| | - Esti Yeger-Lotem
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
- National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
| |
|
30
|
Čopar A, Zupan B, Zitnik M. Fast optimization of non-negative matrix tri-factorization. PLoS One 2019; 14:e0217994. [PMID: 31185054 PMCID: PMC6559648 DOI: 10.1371/journal.pone.0217994] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 01/22/2019] [Accepted: 05/22/2019] [Indexed: 11/18/2022] Open
Abstract
Non-negative matrix tri-factorization (NMTF) is a popular technique for learning low-dimensional feature representations of relational data. Currently, NMTF learns a representation of a dataset through an optimization procedure that typically uses multiplicative update rules. This procedure has had limited success, and its failure cases have not been well understood. Here we perform an empirical study involving six large datasets, comparing multiplicative update rules with three alternative optimization methods: alternating least squares, projected gradients, and coordinate descent. We find that methods based on projected gradients and coordinate descent converge up to twenty-four times faster than multiplicative update rules. Furthermore, the alternating least squares method can quickly train NMTF models on sparse datasets but often fails on dense datasets. Coordinate descent-based NMTF converges up to sixteen times faster than well-established methods.
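For readers unfamiliar with the baseline that this study benchmarks against, here is a minimal pure-Python sketch of NMTF with multiplicative update rules, factorizing X ≈ U S Vᵀ under the Frobenius loss. It is the generic unconstrained textbook form, not the authors' code, and it does not cover the alternative optimizers the abstract compares. The helper names, the random initialisation, the small `eps` guard in the denominator, and the toy matrix are assumptions for illustration.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def mult_update(m, numer, denom, eps=1e-12):
    """Element-wise m * numer / denom; eps guards against division by zero."""
    return [[v * n / (d + eps) for v, n, d in zip(mr, nr, dr)]
            for mr, nr, dr in zip(m, numer, denom)]


def squared_error(x, u, s, v):
    approx = matmul(matmul(u, s), transpose(v))
    return sum((a - b) ** 2 for xr, ar in zip(x, approx) for a, b in zip(xr, ar))


def nmtf(x, k1, k2, iters=200, seed=0):
    """Factorize a non-negative n x m matrix x as U (n x k1), S (k1 x k2), V (m x k2)."""
    rng = random.Random(seed)

    def rand(rows, cols):
        return [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rows)]

    n, m = len(x), len(x[0])
    u, s, v = rand(n, k1), rand(k1, k2), rand(m, k2)
    xt = transpose(x)
    errors = [squared_error(x, u, s, v)]
    for _ in range(iters):
        vs_t = matmul(v, transpose(s))  # V S^T, shape m x k1
        # U <- U * (X V S^T) / (U S V^T V S^T)
        u = mult_update(u, matmul(x, vs_t), matmul(u, matmul(transpose(vs_t), vs_t)))
        us = matmul(u, s)  # U S, shape n x k2
        # V <- V * (X^T U S) / (V S^T U^T U S)
        v = mult_update(v, matmul(xt, us), matmul(v, matmul(transpose(us), us)))
        ut = transpose(u)
        # S <- S * (U^T X V) / (U^T U S V^T V)
        s = mult_update(s, matmul(matmul(ut, x), v),
                        matmul(matmul(matmul(ut, u), s), matmul(transpose(v), v)))
        errors.append(squared_error(x, u, s, v))
    return u, s, v, errors


# Toy matrix (hypothetical data) with two row blocks and two column blocks.
x = [[5, 5, 0, 0],
     [4, 5, 0, 1],
     [0, 0, 4, 5],
     [0, 1, 5, 5]]
u, s, v, errors = nmtf(x, 2, 2)
print(errors[0], errors[-1])
```

Because each update only multiplies by non-negative ratios, the factors stay non-negative without any projection step. That simplicity is the main appeal of multiplicative rules, while their convergence speed is what the study above examines.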
Affiliation(s)
- Andrej Čopar
- Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
| | - Blaž Zupan
- Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, United States of America
| | - Marinka Zitnik
- Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
- Department of Computer Science, Stanford University, Stanford, CA, United States of America
| |
|
31
|
He Y, Mohamedali A, Huang C, Baker MS, Nice EC. Oncoproteomics: Current status and future opportunities. Clin Chim Acta 2019; 495:611-624. [PMID: 31176645 DOI: 10.1016/j.cca.2019.06.006] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 04/25/2019] [Revised: 06/05/2019] [Accepted: 06/05/2019] [Indexed: 02/07/2023]
Abstract
Oncoproteomics is the systematic study of cancer samples using omics technologies to detect changes implicated in tumorigenesis. Recent progress in oncoproteomics is already opening new avenues for the identification of novel biomarkers for early clinical stage cancer detection, targeted molecular therapies, disease monitoring, and drug development. Such information will lead to new understandings of cancer biology and impact dramatically on the future care of cancer patients. In this review, we will summarize the advantages and limitations of the key technologies used in (onco)proteogenomics, (the Omics Pipeline), explain how they can assist us in understanding the biology behind the overarching "Hallmarks of Cancer", discuss how they can advance the development of precision/personalised medicine and the future directions in the field.
Affiliation(s)
- Yujia He
- West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu 610041, PR China
| | - Abidali Mohamedali
- Department of Molecular Sciences, Faculty of Science and Engineering, Macquarie University, New South Wales 2109, Australia
| | - Canhua Huang
- West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu 610041, PR China
| | - Mark S Baker
- Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Macquarie University, New South Wales 2109, Australia.
| | - Edouard C Nice
- West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu 610041, PR China; Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Macquarie University, New South Wales 2109, Australia; Department of Biochemistry and Molecular Biology, Monash University, Clayton, Australia.
| |
|
32
|
Sonawane AR, Weiss ST, Glass K, Sharma A. Network Medicine in the Age of Biomedical Big Data. Front Genet 2019; 10:294. [PMID: 31031797 PMCID: PMC6470635 DOI: 10.3389/fgene.2019.00294] [Citation(s) in RCA: 128] [Impact Index Per Article: 21.3] [Received: 12/26/2018] [Accepted: 03/19/2019] [Indexed: 12/13/2022] Open
Abstract
Network medicine is an emerging area of research dealing with molecular and genetic interactions, network biomarkers of disease, and therapeutic target discovery. Large-scale biomedical data generation offers a unique opportunity to assess the effect and impact of cellular heterogeneity and environmental perturbations on the observed phenotype. Marrying the two, network medicine with biomedical data provides a framework to build meaningful models and extract impactful results at a network level. In this review, we survey existing network types and biomedical data sources. More importantly, we delve into ways in which the network medicine approach, aided by phenotype-specific biomedical data, can be gainfully applied. We provide three paradigms, mainly dealing with three major biological network archetypes: protein-protein interaction, expression-based, and gene regulatory networks. For each of these paradigms, we discuss a broad overview of philosophies under which various network methods work. We also provide a few examples in each paradigm as a test case of its successful application. Finally, we delineate several opportunities and challenges in the field of network medicine. We hope this review provides a lexicon for researchers from biological sciences and network theory to come on the same page to work on research areas that require interdisciplinary expertise. Taken together, the understanding gained from combining biomedical data with networks can be useful for characterizing disease etiologies and identifying therapeutic targets, which, in turn, will lead to better preventive medicine with translational impact on personalized healthcare.
Affiliation(s)
- Abhijeet R. Sonawane
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
| | - Scott T. Weiss
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
| | - Kimberly Glass
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
| | - Amitabh Sharma
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
- Department of Medicine, Harvard Medical School, Boston, MA, United States
- Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women’s Hospital, Boston, MA, United States
| |
|