1
Hossain I, Fanfani V, Fischer J, Quackenbush J, Burkholz R. Biologically informed NeuralODEs for genome-wide regulatory dynamics. Genome Biol 2024; 25:127. [PMID: 38773638 PMCID: PMC11106922 DOI: 10.1186/s13059-024-03264-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Accepted: 04/30/2024] [Indexed: 05/24/2024] Open
Abstract
BACKGROUND Gene regulatory network (GRN) models that are formulated as ordinary differential equations (ODEs) can accurately explain temporal gene expression patterns and promise to yield new insights into important cellular processes, disease progression, and intervention design. Learning such gene regulatory ODEs is challenging, since we want to predict the evolution of gene expression in a way that accurately encodes the underlying GRN governing the dynamics and the nonlinear functional relationships between genes. Most widely used ODE estimation methods either impose too many parametric restrictions or are not guided by meaningful biological insights, which impedes scalability, explainability, or both. RESULTS We developed PHOENIX, a modeling framework based on neural ordinary differential equations (NeuralODEs) and Hill-Langmuir kinetics, that overcomes limitations of other methods by flexibly incorporating prior domain knowledge and biological constraints to promote sparse, biologically interpretable representations of GRN ODEs. We tested the accuracy of PHOENIX in a series of in silico experiments, benchmarking it against several currently used tools. We demonstrated PHOENIX's flexibility by modeling regulation of oscillating expression profiles obtained from synchronized yeast cells. We also assessed the scalability of PHOENIX by modeling genome-scale GRNs for breast cancer samples ordered in pseudotime and for B cells treated with Rituximab. CONCLUSIONS PHOENIX uses a combination of user-defined prior knowledge and functional forms from systems biology to encode biological "first principles" as soft constraints on the GRN, allowing us to predict subsequent gene expression patterns in a biologically explainable manner.
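For readers unfamiliar with the Hill-Langmuir kinetics named in the abstract, the following is a minimal sketch of the textbook functional form only. It is not PHOENIX's implementation; the function names and the parameters `k` (half-saturation level) and `n` (Hill coefficient) are illustrative assumptions.

```python
def hill_activation(x, k, n):
    """Textbook Hill-Langmuir activation: x**n / (k**n + x**n).

    x: regulator level (non-negative)
    k: level at which the response is half-maximal (assumed name)
    n: Hill coefficient controlling steepness (assumed name)
    """
    if x < 0 or k <= 0 or n <= 0:
        raise ValueError("expected x >= 0, k > 0 and n > 0")
    return x**n / (k**n + x**n)


def hill_repression(x, k, n):
    """Complementary repressive form: 1 - activation."""
    return 1.0 - hill_activation(x, k, n)


# At x == k the response is half-maximal regardless of n.
half = hill_activation(2.0, 2.0, 4)
```

The saturating shape is what makes this form attractive as a biological constraint: the response is bounded between 0 and 1 however large the regulator level becomes.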
Affiliation(s)
- Viola Fanfani
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Jonas Fischer
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Rebekka Burkholz
- CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
2
Hossain I, Fanfani V, Fischer J, Quackenbush J, Burkholz R. Biologically informed NeuralODEs for genome-wide regulatory dynamics. bioRxiv 2024:2023.02.24.529835. [PMID: 36909563 PMCID: PMC10002636 DOI: 10.1101/2023.02.24.529835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Modeling the dynamics of gene regulatory networks using ordinary differential equations (ODEs) allows a deeper understanding of disease progression and response to therapy, thus aiding in intervention optimization. Although there exist methods to infer regulatory ODEs, these are generally limited to small networks, rely on dimensional reduction, or impose non-biological parametric restrictions, all of which impede scalability and explainability. PHOENIX is a neural ODE framework incorporating prior domain knowledge as soft constraints to infer sparse, biologically interpretable dynamics. Extensive experiments on simulated and real data demonstrate PHOENIX's unique ability to learn key regulatory dynamics while scaling to the whole genome.
3
Hossain I, Fanfani V, Quackenbush J, Burkholz R. Biologically informed NeuralODEs for genome-wide regulatory dynamics. Res Sq 2023:rs.3.rs-2675584. [PMID: 36993392 PMCID: PMC10055646 DOI: 10.21203/rs.3.rs-2675584/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Models that are formulated as ordinary differential equations (ODEs) can accurately explain temporal gene expression patterns and promise to yield new insights into important cellular processes, disease progression, and intervention design. Learning such ODEs is challenging, since we want to predict the evolution of gene expression in a way that accurately encodes the causal gene-regulatory network (GRN) governing the dynamics and the nonlinear functional relationships between genes. Most widely used ODE estimation methods either impose too many parametric restrictions or are not guided by meaningful biological insights, both of which impede scalability and/or explainability. To overcome these limitations, we developed PHOENIX, a modeling framework based on neural ordinary differential equations (NeuralODEs) and Hill-Langmuir kinetics, that can flexibly incorporate prior domain knowledge and biological constraints to promote sparse, biologically interpretable representations of ODEs. We test the accuracy of PHOENIX in a series of in silico experiments, benchmarking it against several currently used tools for ODE estimation. We also demonstrate PHOENIX's flexibility by studying oscillating expression data from synchronized yeast cells and assess its scalability by modelling genome-scale breast cancer expression for samples ordered in pseudotime. Finally, we show how the combination of user-defined prior knowledge and functional forms from systems biology allows PHOENIX to encode key properties of the underlying GRN, and subsequently predict expression patterns in a biologically explainable way.
Affiliation(s)
- Intekhab Hossain
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Viola Fanfani
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- John Quackenbush
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Rebekka Burkholz
- Helmholtz Center for Information Security (CISPA), Saarbrücken, Germany
4
Ben Guebila M, Wang T, Lopes-Ramos CM, Fanfani V, Weighill D, Burkholz R, Schlauch D, Paulson JN, Altenbuchinger M, Shutta KH, Sonawane AR, Lim J, Calderer G, van IJzendoorn DGP, Morgan D, Marin A, Chen CY, Song Q, Saha E, DeMeo DL, Padi M, Platig J, Kuijjer ML, Glass K, Quackenbush J. The Network Zoo: a multilingual package for the inference and analysis of gene regulatory networks. Genome Biol 2023; 24:45. [PMID: 36894939 PMCID: PMC9999668 DOI: 10.1186/s13059-023-02877-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/16/2022] [Accepted: 02/15/2023] [Indexed: 03/11/2023] Open
Abstract
Inference and analysis of gene regulatory networks (GRNs) require software that integrates multi-omic data from various sources. The Network Zoo (netZoo; netzoo.github.io) is a collection of open-source methods to infer GRNs, conduct differential network analyses, estimate community structure, and explore the transitions between biological states. The netZoo builds on our ongoing development of network methods, harmonizing the implementations in various computing languages and between methods to allow better integration of these tools into analytical pipelines. We demonstrate the utility using multi-omic data from the Cancer Cell Line Encyclopedia. We will continue to expand the netZoo to incorporate additional methods.
Affiliation(s)
- Marouen Ben Guebila
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Tian Wang
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Present Address: Biology Department, Boston College, Chestnut Hill, MA, USA
- Camila M Lopes-Ramos
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Viola Fanfani
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Des Weighill
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Present Address: Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Rebekka Burkholz
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Present Address: CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
- Daniel Schlauch
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Present Address: Genospace, LLC, Boston, MA, USA
- Joseph N Paulson
- Department of Biochemistry and Molecular Biology, Pennsylvania State University College of Medicine, Hershey, PA, USA
- Michael Altenbuchinger
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Present Address: Department of Medical Bioinformatics, University Medical Center Göttingen, Göttingen, Germany
- Katherine H Shutta
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Abhijeet R Sonawane
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Present Address: Center for Interdisciplinary Cardiovascular Sciences, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- James Lim
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA
- Present Address: Monoceros Biosystems, LLC, San Diego, CA, USA
- Genis Calderer
- Center for Molecular Medicine Norway, Nordic EMBL Partnership, University of Oslo, Oslo, Norway
- David G P van IJzendoorn
- Department of Pathology, Leiden University Medical Center, Leiden, The Netherlands
- Present Address: Department of Pathology, Stanford University School of Medicine, Palo Alto, CA, USA
- Daniel Morgan
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Present Address: School of Biomedical Sciences, Hong Kong University, Pokfulam, Hong Kong
- Cho-Yi Chen
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Dana-Farber Cancer Institute, Boston, MA, USA
- Present Address: Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, 112, Taiwan
- Qi Song
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Present Address: Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA
- Enakshi Saha
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Dawn L DeMeo
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Megha Padi
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA
- John Platig
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Marieke L Kuijjer
- Center for Molecular Medicine Norway, Nordic EMBL Partnership, University of Oslo, Oslo, Norway
- Department of Pathology, Leiden University Medical Center, Leiden, The Netherlands
- Leiden Center for Computational Oncology, Leiden University, Leiden, The Netherlands
- Kimberly Glass
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- John Quackenbush
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Dana-Farber Cancer Institute, Boston, MA, USA
5
Shutta KH, Weighill D, Burkholz R, Guebila M, DeMeo DL, Zacharias HU, Quackenbush J, Altenbuchinger M. DRAGON: Determining Regulatory Associations using Graphical models on multi-Omic Networks. Nucleic Acids Res 2022; 51:e15. [PMID: 36533448 PMCID: PMC9943674 DOI: 10.1093/nar/gkac1157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Revised: 11/08/2022] [Accepted: 11/23/2022] [Indexed: 12/23/2022] Open
Abstract
The increasing quantity of multi-omic data, such as methylomic and transcriptomic profiles collected on the same specimen or even on the same cell, provides a unique opportunity to explore the complex interactions that define cell phenotype and govern cellular responses to perturbations. We propose a network approach based on Gaussian Graphical Models (GGMs) that facilitates the joint analysis of paired omics data. This method, called DRAGON (Determining Regulatory Associations using Graphical models on multi-Omic Networks), calibrates its parameters to achieve an optimal trade-off between the network's complexity and estimation accuracy, while explicitly accounting for the characteristics of each of the assessed omics 'layers.' In simulation studies, we show that DRAGON adapts to edge density and feature size differences between omics layers, improving model inference and edge recovery compared to state-of-the-art methods. We further demonstrate in an analysis of joint transcriptome-methylome data from TCGA breast cancer specimens that DRAGON can identify key molecular mechanisms such as gene regulation via promoter methylation. In particular, we identify Transcription Factor AP-2 Beta (TFAP2B) as a potential multi-omic biomarker for basal-type breast cancer. DRAGON is available as open-source code in Python through the Network Zoo package (netZooPy v0.8; netzoo.github.io).
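As background on the Gaussian Graphical Models mentioned in the abstract, edges in a GGM are commonly read off as partial correlations derived from the precision (inverse covariance) matrix. The sketch below shows only that standard relationship, under the assumption that a precision matrix is already available; it does not reproduce DRAGON's regularized estimator or the netZooPy API, and the function name is hypothetical.

```python
import math


def partial_correlations(precision):
    """Convert a precision matrix (list of lists) to partial correlations.

    Uses the standard identity rho_ij = -w_ij / sqrt(w_ii * w_jj),
    with the diagonal set to 1.0 by convention.
    """
    p = len(precision)
    result = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i == j:
                result[i][j] = 1.0
            else:
                scale = math.sqrt(precision[i][i] * precision[j][j])
                result[i][j] = -precision[i][j] / scale
    return result


# Toy 2x2 precision matrix (illustrative values, not real omics data).
toy = partial_correlations([[2.0, -1.0], [-1.0, 2.0]])
```

A zero off-diagonal entry in the precision matrix gives a zero partial correlation, which is why sparsity in the estimated precision matrix translates directly into a sparse network.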
Affiliation(s)
- Rebekka Burkholz
- CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
- Marouen Ben Guebila
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Dawn L DeMeo
- Channing Division of Network Medicine, Brigham and Women’s Hospital, and Department of Medicine, Harvard Medical School, Boston, MA, USA
- Helena U Zacharias
- Department of Internal Medicine I, University Medical Center Schleswig-Holstein, Campus Kiel, Kiel, Germany; Institute of Clinical Molecular Biology, Kiel University and University Medical Center Schleswig-Holstein, Campus Kiel, Kiel, Germany; Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Hannover, Germany
- Michael Altenbuchinger
- To whom correspondence should be addressed. Tel: +49 551 39 61788; Fax: +49 551 39 61783;
6
Laumer F, Di Vece D, Cammann VL, Würdinger M, Petkova V, Schönberger M, Schönberger A, Mercier JC, Niederseer D, Seifert B, Schwyzer M, Burkholz R, Corinzia L, Becker AS, Scherff F, Brouwers S, Pazhenkottil AP, Dougoud S, Messerli M, Tanner FC, Fischer T, Delgado V, Schulze PC, Hauck C, Maier LS, Nguyen H, Surikow SY, Horowitz J, Liu K, Citro R, Bax J, Ruschitzka F, Ghadri JR, Buhmann JM, Templin C. Assessment of Artificial Intelligence in Echocardiography Diagnostics in Differentiating Takotsubo Syndrome From Myocardial Infarction. JAMA Cardiol 2022; 7:494-503. [PMID: 35353118 PMCID: PMC8968683 DOI: 10.1001/jamacardio.2022.0183] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/13/2022]
Abstract
Importance Machine learning algorithms enable the automatic classification of cardiovascular diseases based on raw cardiac ultrasound imaging data. However, the utility of machine learning in distinguishing between takotsubo syndrome (TTS) and acute myocardial infarction (AMI) has not been studied. Objectives To assess the utility of machine learning systems for automatic discrimination of TTS and AMI. Design, Settings, and Participants This cohort study included clinical data and transthoracic echocardiogram results of patients with AMI from the Zurich Acute Coronary Syndrome Registry and patients with TTS obtained from 7 cardiovascular centers in the International Takotsubo Registry. Data from the validation cohort were obtained from April 2011 to February 2017. Data from the training cohort were obtained from March 2017 to May 2019. Data were analyzed from September 2019 to June 2021. Exposure Transthoracic echocardiograms of 224 patients with TTS and 224 patients with AMI were analyzed. Main Outcomes and Measures Area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity of the machine learning system evaluated on an independent data set and 4 practicing cardiologists for comparison. Echocardiography videos of 228 patients were used in the development and training of a deep learning model. The performance of the automated echocardiogram video analysis method was evaluated on an independent data set consisting of 220 patients. Data were matched according to age, sex, and ST-segment elevation/non-ST-segment elevation (1 patient with AMI for each patient with TTS). Predictions were compared with echocardiographic-based interpretations from 4 practicing cardiologists in terms of sensitivity, specificity, and AUC calculated from confidence scores concerning their binary diagnosis. 
Results In this cohort study, apical 2-chamber and 4-chamber echocardiographic views of 110 patients with TTS (mean [SD] age, 68.4 [12.1] years; 103 [90.4%] were female) and 110 patients with AMI (mean [SD] age, 69.1 [12.2] years; 103 [90.4%] were female) from an independent data set were evaluated. This approach achieved a mean (SD) AUC of 0.79 (0.01) with an overall accuracy of 74.8 (0.7%). In comparison, cardiologists achieved a mean (SD) AUC of 0.71 (0.03) and accuracy of 64.4 (3.5%) on the same data set. In a subanalysis based on 61 patients with apical TTS and 56 patients with AMI due to occlusion of the left anterior descending coronary artery, the model achieved a mean (SD) AUC score of 0.84 (0.01) and an accuracy of 78.6 (1.6%), outperforming the 4 practicing cardiologists (mean [SD] AUC, 0.72 [0.02]) and accuracy of 66.9 (2.8%). Conclusions and Relevance In this cohort study, a real-time system for fully automated interpretation of echocardiogram videos was established and trained to differentiate TTS from AMI. While this system was more accurate than cardiologists in echocardiography-based disease classification, further studies are warranted for clinical application.
Affiliation(s)
- Fabian Laumer
- Department of Computer Science, ETH Zurich, Zurich, Switzerland
- Davide Di Vece
- Department of Cardiology, University Hospital Zurich, Zurich, Switzerland
- Victoria L Cammann
- Department of Cardiology, University Hospital Zurich, Zurich, Switzerland
- Michael Würdinger
- Department of Cardiology, University Hospital Zurich, Zurich, Switzerland
- Vanya Petkova
- Department of Cardiology, University Hospital Zurich, Zurich, Switzerland
- Julien C Mercier
- Department of Cardiology, University Hospital Zurich, Zurich, Switzerland
- David Niederseer
- Department of Cardiology, University Hospital Zurich, Zurich, Switzerland
- Burkhardt Seifert
- Division of Biostatistics, Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
- Moritz Schwyzer
- Institute of Diagnostic and Interventional Radiology, University Hospital Zurich, Zurich, Switzerland
- Luca Corinzia
- Department of Computer Science, ETH Zurich, Zurich, Switzerland
- Anton S Becker
- Institute of Diagnostic and Interventional Radiology, University Hospital Zurich, Zurich, Switzerland
- Frank Scherff
- Department of Cardiology, University Hospital Zurich, Zurich, Switzerland
- Sofie Brouwers
- Department of Cardiology, University Hospital Zurich, Zurich, Switzerland
- Aju P Pazhenkottil
- Department of Cardiology, University Hospital Zurich, Zurich, Switzerland; Department of Nuclear Medicine, University Hospital Zurich, Zurich, Switzerland
- Svetlana Dougoud
- Department of Cardiology, University Hospital Zurich, Zurich, Switzerland
- Michael Messerli
- Department of Nuclear Medicine, University Hospital Zurich, Zurich, Switzerland
- Felix C Tanner
- Department of Cardiology, University Hospital Zurich, Zurich, Switzerland
- Thomas Fischer
- Department of Cardiology, Kantonsspital Winterthur, Winterthur, Switzerland
- Victoria Delgado
- Department of Cardiology, Leiden University Medical Centre, Leiden, the Netherlands
- P Christian Schulze
- Department of Internal Medicine I, University Hospital Jena, Friedrich-Schiller-University Jena, Jena, Germany
- Christian Hauck
- Klinik und Poliklinik für Innere Medizin II, Universitätsklinikum Regensburg, Regensburg, Germany
- Lars S Maier
- Klinik und Poliklinik für Innere Medizin II, Universitätsklinikum Regensburg, Regensburg, Germany
- Ha Nguyen
- Department of Cardiology, Basil Hetzel Institute, Queen Elizabeth Hospital, University of Adelaide, Adelaide, Australia
- Sven Y Surikow
- Department of Cardiology, Basil Hetzel Institute, Queen Elizabeth Hospital, University of Adelaide, Adelaide, Australia
- John Horowitz
- Department of Cardiology, Basil Hetzel Institute, Queen Elizabeth Hospital, University of Adelaide, Adelaide, Australia
- Kan Liu
- Division of Cardiology, Heart and Vascular Center, University of Iowa, Iowa City
- Rodolfo Citro
- Heart Department, University Hospital "San Giovanni di Dio e Ruggi d'Aragona", Salerno, Italy; IRCCS Neuromed, Pozzilli (Isernia), Italy
- Jeroen Bax
- Department of Cardiology, Leiden University Medical Centre, Leiden, the Netherlands
- Frank Ruschitzka
- Department of Cardiology, University Hospital Zurich, Zurich, Switzerland
- Jelena-Rima Ghadri
- Department of Cardiology, University Hospital Zurich, Zurich, Switzerland
- Christian Templin
- Division of Biostatistics, Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
7
Ben Guebila M, Lopes-Ramos CM, Weighill D, Sonawane A, Burkholz R, Shamsaei B, Platig J, Glass K, Kuijjer M, Quackenbush J. GRAND: a database of gene regulatory network models across human conditions. Nucleic Acids Res 2022; 50:D610-D621. [PMID: 34508353 PMCID: PMC8728257 DOI: 10.1093/nar/gkab778] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 06/01/2021] [Revised: 08/17/2021] [Accepted: 09/08/2021] [Indexed: 12/14/2022] Open
Abstract
Gene regulation plays a fundamental role in shaping tissue identity, function, and response to perturbation. Regulatory processes are controlled by complex networks of interacting elements, including transcription factors, miRNAs and their target genes. The structure of these networks helps to determine phenotypes and can ultimately influence the development of disease or response to therapy. We developed GRAND (https://grand.networkmedicine.org) as a database for computationally-inferred, context-specific gene regulatory network models that can be compared between biological states, or used to predict which drugs produce changes in regulatory network structure. The database includes 12 468 genome-scale networks covering 36 human tissues, 28 cancers, 1378 unperturbed cell lines, as well as 173 013 TF and gene targeting scores for 2858 small molecule-induced cell line perturbation paired with phenotypic information. GRAND allows the networks to be queried using phenotypic information and visualized using a variety of interactive tools. In addition, it includes a web application that matches disease states to potentially therapeutic small molecule drugs using regulatory network properties.
Affiliation(s)
- Marouen Ben Guebila
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- Deborah Weighill
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- Abhijeet Rajendra Sonawane
- Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Rebekka Burkholz
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- Behrouz Shamsaei
- Division of Biostatistics and Bioinformatics, Department of Environmental and Public Health Sciences, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- John Platig
- Channing Division of Network Medicine, Department of Medicine, Harvard Medical School and Brigham and Women’s Hospital, Boston, MA, USA
- Kimberly Glass
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Department of Medicine, Harvard Medical School and Brigham and Women’s Hospital, Boston, MA, USA
- Marieke L Kuijjer
- Center for Molecular Medicine Norway, Faculty of Medicine, University of Oslo, Oslo, Norway
- Leiden University Medical Center, Leiden, The Netherlands
- John Quackenbush
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Department of Medicine, Harvard Medical School and Brigham and Women’s Hospital, Boston, MA, USA
8
Thomès L, Burkholz R, Bojar D. Glycowork: A Python package for glycan data science and machine learning. Glycobiology 2021; 31:1240-1244. [PMID: 34192308 PMCID: PMC8600276 DOI: 10.1093/glycob/cwab067] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/23/2021] [Revised: 06/02/2021] [Accepted: 06/25/2021] [Indexed: 12/14/2022] Open
Abstract
While glycans are crucial for biological processes, existing analysis modalities make it difficult for researchers with limited computational background to include these diverse carbohydrates into workflows. Here, we present glycowork, an open-source Python package designed for glycan-related data science and machine learning by end users. Glycowork includes functions to, for instance, automatically annotate glycan motifs and analyze their distributions via heatmaps and statistical enrichment. We also provide visualization methods, routines to interact with stored databases, trained machine learning models and learned glycan representations. We envision that glycowork can extract further insights from glycan datasets and demonstrate this with workflows that analyze glycan motifs in various biological contexts. Glycowork can be freely accessed at https://github.com/BojarLab/glycowork/.
Affiliation(s)
- Luc Thomès
- Department of Chemistry and Molecular Biology and Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden
- Rebekka Burkholz
- Department of Biostatistics, Harvard School of Public Health, Boston, 02115 MA, USA
- Daniel Bojar
- Department of Chemistry and Molecular Biology and Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, 41390 Gothenburg, Sweden
9
Burkholz R, Quackenbush J, Bojar D. Using graph convolutional neural networks to learn a representation for glycans. Cell Rep 2021; 35:109251. [PMID: 34133929 PMCID: PMC9208909 DOI: 10.1016/j.celrep.2021.109251] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/02/2021] [Revised: 05/05/2021] [Accepted: 05/24/2021] [Indexed: 02/06/2023] Open
Abstract
As the only nonlinear and the most diverse biological sequence, glycans offer substantial challenges for computational biology. These carbohydrates participate in nearly all biological processes—from protein folding to viral cell entry—yet are still not well understood. There are few computational methods to link glycan sequences to functions, and they do not fully leverage all available information about glycans. SweetNet is a graph convolutional neural network that uses graph representation learning to facilitate a computational understanding of glycobiology. SweetNet explicitly incorporates the nonlinear nature of glycans and establishes a framework to map any glycan sequence to a representation. We show that SweetNet outperforms other computational methods in predicting glycan properties on all reported tasks. More importantly, we show that glycan representations, learned by SweetNet, are predictive of organismal phenotypic and environmental properties. Finally, we use glycan-focused machine learning to predict viral glycan binding, which can be used to discover viral receptors. Burkholz et al. develop an analysis platform for glycans, using graph convolutional neural networks, that considers the branched nature of these carbohydrates. They demonstrate that glycan-focused machine learning can be employed for various purposes, such as to cluster species according to their glycomic similarity or to identify viral receptors.
Affiliation(s)
- Rebekka Burkholz
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- John Quackenbush
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Daniel Bojar
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden; Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden
10
Weichwald S, Candreva A, Burkholz R, Klingenberg R, Räber L, Heg D, Manka R, Gencer B, Mach F, Nanchen D, Rodondi N, Windecker S, Laaksonen R, Hazen SL, von Eckardstein A, Ruschitzka F, Lüscher TF, Buhmann JM, Matter CM. Improving 1-year mortality prediction in ACS patients using machine learning. Eur Heart J Acute Cardiovasc Care 2021; 10:855-865. [PMID: 34015112 DOI: 10.1093/ehjacc/zuab030] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/06/2021] [Revised: 04/16/2021] [Accepted: 04/21/2021] [Indexed: 01/08/2023]
Abstract
BACKGROUND The Global Registry of Acute Coronary Events (GRACE) score is an established clinical risk stratification tool for patients with acute coronary syndromes (ACS). We developed and internally validated a model for 1-year all-cause mortality prediction in ACS patients. METHODS Between 2009 and 2012, 2'168 ACS patients were enrolled into the Swiss SPUM-ACS Cohort. Biomarkers were determined in 1'892 patients and follow-up was achieved in 95.8% of patients. 1-year all-cause mortality was 4.3% (n = 80). In our analysis we consider all linear models using combinations of 8 out of 56 variables to predict 1-year all-cause mortality and to derive a variable ranking. RESULTS 1.3% of 1'420'494'075 models outperformed the GRACE 2.0 Score. The SPUM-ACS Score includes age, plasma glucose, NT-proBNP, left ventricular ejection fraction (LVEF), Killip class, history of peripheral artery disease (PAD), malignancy, and cardio-pulmonary resuscitation. For predicting 1-year mortality after ACS, the SPUM-ACS Score outperformed the GRACE 2.0 Score which achieves a 5-fold cross-validated AUC of 0.81 (95% CI 0.78-0.84). Ranking individual features according to their importance across all multivariate models revealed age, trimethylamine N-oxide, creatinine, history of PAD or malignancy, LVEF, and haemoglobin as the most relevant variables for predicting 1-year mortality. CONCLUSIONS The variable ranking and the selection for the SPUM-ACS Score highlight the relevance of age, markers of heart failure, and comorbidities for prediction of all-cause death. Before application, this score needs to be externally validated and refined in larger cohorts. CLINICAL TRIAL REGISTRATION NCT01000701.
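The model count quoted in the abstract equals the number of ways to choose 8 variables out of 56. The short Python check below is only a sketch, assuming that each linear model corresponds to exactly one 8-variable subset; the function name `count_models` is illustrative and does not come from the paper.

```python
import math


def count_models(n_variables: int = 56, subset_size: int = 8) -> int:
    """Number of distinct variable subsets of the given size (binomial coefficient)."""
    return math.comb(n_variables, subset_size)


total = count_models()
print(total)  # 1420494075, matching the 1'420'494'075 models in the abstract

# Rough size of the 1.3% of models reported to outperform GRACE 2.0.
outperforming = round(0.013 * total)
print(outperforming)
```

Under this assumption, the 1.3% of models that beat the GRACE 2.0 Score corresponds to roughly 18.5 million variable combinations.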
Affiliation(s)
- Sebastian Weichwald
  - Department of Computer Science, Institute for Machine Learning, ETH Zurich, Switzerland; Max Planck Institute for Intelligent Systems, Tübingen, Germany
- Alessandro Candreva
  - Department of Cardiology, University Heart Center, University Hospital of Zurich, Switzerland
- Rebekka Burkholz
  - Department of Computer Science, Institute for Machine Learning, ETH Zurich, Switzerland
- Roland Klingenberg
  - Department of Cardiology, University Heart Center, University Hospital of Zurich, Switzerland; Kerckhoff Heart and Thorax Center, Department of Cardiology, Kerckhoff-Klinik, Bad Nauheim, Germany; Campus of the Justus Liebig University of Giessen, Germany; DZHK (German Center for Cardiovascular Research), Partner Site Rhine-Main, Bad Nauheim, Germany
- Lorenz Räber
  - Department of Cardiology, Cardiovascular Center, University Hospital of Bern, Switzerland
- Dik Heg
  - Clinical Trial Unit, University of Bern, Switzerland
- Robert Manka
  - Department of Cardiology, University Heart Center, University Hospital of Zurich, Switzerland
- Baris Gencer
  - Department of Cardiology, Cardiovascular Center, University Hospital of Geneva, Switzerland
- François Mach
  - Department of Cardiology, Cardiovascular Center, University Hospital of Geneva, Switzerland
- David Nanchen
  - Department of Ambulatory Care and Community Medicine, University of Lausanne, Switzerland
- Nicolas Rodondi
  - Institute of Primary Health Care (BIHAM), University of Bern, Switzerland; Department of General Internal Medicine, Inselspital, Bern University Hospital, University of Bern, Switzerland
- Stephan Windecker
  - Department of Cardiology, Cardiovascular Center, University Hospital of Bern, Switzerland
- Reijo Laaksonen
  - Zora Biosciences, Espoo, Finland; Finnish Cardiovascular Research Center Tampere, Tampere University, Tampere, Finland
- Stanley L Hazen
  - Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA; Department of Cardiovascular Medicine, Heart and Vascular Institute, Cleveland Clinic, Cleveland, OH, USA
- Frank Ruschitzka
  - Department of Cardiology, University Heart Center, University Hospital of Zurich, Switzerland
- Thomas F Lüscher
  - Center for Molecular Cardiology, University of Zurich, Switzerland; Cardiology, Royal Brompton & Harefield Hospitals, London, United Kingdom
- Joachim M Buhmann
  - Department of Computer Science, Institute for Machine Learning, ETH Zurich, Switzerland
- Christian M Matter
  - Department of Cardiology, University Heart Center, University Hospital of Zurich, Switzerland; Center for Molecular Cardiology, University of Zurich, Switzerland
|
11
|
Weighill D, Guebila MB, Lopes-Ramos C, Glass K, Quackenbush J, Platig J, Burkholz R. Gene regulatory network inference as relaxed graph matching. Proc AAAI Conf Artif Intell 2021; 35:10263-10272. [PMID: 34707916 PMCID: PMC8546743] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/17/2023]
Abstract
Bipartite network inference is a ubiquitous problem across disciplines. One important example in the field of molecular biology is gene regulatory network inference. Gene regulatory networks are an instrumental tool aiding in the discovery of the molecular mechanisms driving diverse diseases, including cancer. However, only noisy observations of the projections of these regulatory networks are typically assayed. In an effort to better estimate regulatory networks from their noisy projections, we formulate a non-convex but analytically tractable optimization problem called OTTER. This problem can be interpreted as relaxed graph matching between the two projections of the bipartite network. OTTER's solutions can be derived explicitly and inspire a spectral algorithm, for which we provide network recovery guarantees. We also provide an alternative approach based on gradient descent that is more robust to noise than the spectral algorithm. Interestingly, this gradient descent approach resembles the message passing equations of an established gene regulatory network inference method, PANDA. Using three cancer-related data sets, we show that OTTER outperforms state-of-the-art inference methods in predicting transcription factor binding to gene regulatory regions. To encourage new graph matching applications to this problem, we have made all networks and validation data publicly available.
Affiliation(s)
- Deborah Weighill
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115
- Marouen Ben Guebila
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115
- Camila Lopes-Ramos
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115
- Kimberly Glass
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115
  - Channing Division of Network Medicine, Brigham and Women's Hospital
  - Harvard Medical School, Boston, MA 02115
- John Quackenbush
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115
  - Channing Division of Network Medicine, Brigham and Women's Hospital
  - Harvard Medical School, Boston, MA 02115
- John Platig
  - Channing Division of Network Medicine, Brigham and Women's Hospital
  - Harvard Medical School, Boston, MA 02115
- Rebekka Burkholz
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115
|
12
|
Di Vece D, Laumer F, Schwyzer M, Burkholz R, Corinzia L, Cammann V, Citro R, Bax J, Ghadri J, Buhmann J, Templin C. Artificial intelligence in echocardiography diagnostics – detection of takotsubo syndrome. Eur Heart J 2020. [DOI: 10.1093/ehjci/ehaa946.1221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Background
Machine learning allows classifying diseases based only on raw echocardiographic imaging data and is therefore a landmark in the development of computer-assisted decision support systems in echocardiography.
Purpose
The present study sought to determine the value of deep (machine) learning systems for automatic discrimination of takotsubo syndrome and acute myocardial infarction.
Methods
Apical 2- and 4-chamber echocardiographic views of 110 patients with takotsubo syndrome and 110 patients with acute myocardial infarction were used in the development, training and validation of a deep learning approach, i.e. a convolutional autoencoder (CAE) for feature extraction followed by classical machine learning models for classification of the diseases.
Results
The deep learning model achieved an area under the receiver operating characteristic curve (AUC) of 0.801 with an overall accuracy of 74.5% in 5-fold cross-validation on a clinically relevant dataset. In comparison, experienced cardiologists achieved AUCs in the range 0.678–0.740 and an average accuracy of 64.5% on the same dataset.
Conclusions
A real-time system for fully automated interpretation of echocardiographic videos was established and trained to differentiate takotsubo syndrome from acute myocardial infarction. The framework provides insight into the algorithms' decision process for physicians and yields new and valuable information on the manifestation of disease patterns in echocardiographic data. While our system was superior to cardiologists in echocardiography-based disease classification, further studies should be conducted in a larger patient population to prove its clinical application.
Funding Acknowledgement
Type of funding source: None
Affiliation(s)
- D Di Vece
  - University Hospital Zurich, Zurich, Switzerland
- F Laumer
  - Swiss Federal Institute of Technology Zurich (ETH Zurich), Department of Computer Science, Zurich, Switzerland
- M Schwyzer
  - University Hospital Zurich, Institute of Diagnostic and Interventional Radiology, Zurich, Switzerland
- R Burkholz
  - Swiss Federal Institute of Technology Zurich (ETH Zurich), Department of Computer Science, Zurich, Switzerland
- L Corinzia
  - Swiss Federal Institute of Technology Zurich (ETH Zurich), Department of Computer Science, Zurich, Switzerland
- V.L Cammann
  - University Hospital Zurich, Zurich, Switzerland
- R Citro
  - AOU S. Giovanni di Dio e Ruggi d'Aragona, Heart Department, Salerno, Italy
- J Bax
  - Leiden University Medical Center, Department of Cardiology, Leiden, Netherlands (The)
- J.R Ghadri
  - University Hospital Zurich, Zurich, Switzerland
- J.M Buhmann
  - Swiss Federal Institute of Technology Zurich (ETH Zurich), Department of Computer Science, Zurich, Switzerland
- C Templin
  - University Hospital Zurich, Zurich, Switzerland
|
13
|
Abstract
Two node variables determine the evolution of cascades in random networks: a node's degree and threshold. Correlations between the two fundamentally change the robustness of a network, yet they are disregarded in standard analytic methods such as local tree or heterogeneous mean field approximations, since order statistics are difficult to capture analytically because of their combinatorial nature. We show how they become tractable in the thermodynamic limit of infinite network size. This enables the analytic description of node attacks that are characterized by threshold allocations based on node degree. Using two examples, we discuss possible implications of irregular phase transitions and different speeds of cascade evolution for the control of cascades.
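To make the two node variables concrete, the following toy simulation runs a fractional-threshold cascade (in the style of the Watts threshold model) on a small hand-built star graph. This is only an illustrative sketch: the graph, the threshold values, and the function name `run_cascade` are assumptions, and the simulation is not the paper's analytic method, which works in the limit of infinite network size.

```python
from typing import Dict, List, Set


def run_cascade(
    neighbors: Dict[int, List[int]],
    thresholds: Dict[int, float],
    seeds: Set[int],
) -> Set[int]:
    """Return the final set of failed nodes.

    A node fails once the fraction of its failed neighbors reaches its
    threshold. Failures are irreversible, so the final set does not
    depend on the order in which nodes are checked.
    """
    failed = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbors.items():
            if node in failed or not nbrs:
                continue
            failed_fraction = sum(n in failed for n in nbrs) / len(nbrs)
            if failed_fraction >= thresholds[node]:
                failed.add(node)
                changed = True
    return failed


# Star graph: hub 0 (degree 4) with leaves 1..4 (degree 1 each).
neighbors = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
# Assumed thresholds: the high-degree hub is assigned a lower threshold.
thresholds = {0: 0.5, 1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0}

print(sorted(run_cascade(neighbors, thresholds, {1})))     # one failed leaf
print(sorted(run_cascade(neighbors, thresholds, {1, 2})))  # two failed leaves
```

With a single failed leaf, only a quarter of the hub's neighbors have failed, so the cascade stops. With two failed leaves the hub reaches its threshold, fails, and takes the remaining leaves with it, which shows how the pairing of degree and threshold decides whether a cascade spreads.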
Affiliation(s)
- Rebekka Burkholz
  - ETH Zurich, Institute of Machine Learning, Universitätstrasse 6, 8092 Zurich, Switzerland
- Frank Schweitzer
  - ETH Zurich, Chair of Systems Design, Weinbergstrasse 56/58, 8092 Zurich, Switzerland
|
14
|
Weichwald S, Candreva A, Burkholz R, Klingenberg R, Raeber L, Mach F, Rodondi N, Laaksonen R, Manka R, Luescher TF, Ruschitzka F, Buhmann JM, Matter CM. P6246 Machine learning for improving risk stratification after ACS. Eur Heart J 2018. [DOI: 10.1093/eurheartj/ehy566.p6246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Affiliation(s)
- S Weichwald
  - Max Planck Institute for Intelligent Systems, Tübingen, Germany
- A Candreva
  - University Heart Center, Zurich, Switzerland
- R Burkholz
  - Swiss Federal Institute of Technology Zurich (ETH Zurich), Zurich, Switzerland
- L Raeber
  - University of Bern, Department of Cardiology, Bern, Switzerland
- F Mach
  - University of Geneva, Faculty of Medicine, Geneva, Switzerland
- N Rodondi
  - University of Bern, Institute of Primary Health Care, Bern, Switzerland
- R Laaksonen
  - University of Tampere, Faculty of Medicine and Life Sciences, Tampere, Finland
- R Manka
  - University Heart Center, Zurich, Switzerland
- J M Buhmann
  - Swiss Federal Institute of Technology Zurich (ETH Zurich), Zurich, Switzerland
- C M Matter
  - University Heart Center, Zurich, Switzerland
|
15
|
Abstract
We present a framework to calculate the cascade size evolution for a large class of cascade models on random network ensembles in the limit of infinite network size. Our method is exact and applies to network ensembles with almost arbitrary degree distribution, degree-degree correlations, and, in the case of threshold models, arbitrary threshold distributions. With our approach, we shift the perspective from the known branching process approximations to the iterative update of suitable probability distributions. Such distributions are key to capturing cascade dynamics that involve possibly continuous quantities and that depend on the cascade history, e.g., if load is accumulated over time. As a proof of concept, we provide two examples: (a) constant load models that cover many of the analytically tractable cascade models, and, as a highlight, (b) a fiber bundle model that was not previously tractable by branching process approximations. Our derivations cover the whole cascade dynamics, not only their steady state. This allows us to include interventions in time or further model complexity in the analysis.
Affiliation(s)
- Rebekka Burkholz
  - ETH Zurich, Chair of Systems Design, Weinbergstrasse 56/58, 8092 Zurich, Switzerland
- Frank Schweitzer
  - ETH Zurich, Chair of Systems Design, Weinbergstrasse 56/58, 8092 Zurich, Switzerland
|
16
|
Abstract
We study the influence of risk diversification on cascading failures in weighted complex networks, where weighted directed links represent exposures between nodes. These weights result from different diversification strategies, and their adjustment allows us to reduce systemic risk significantly by topological means. As an example, we contrast a classical exposure diversification (ED) approach with a damage diversification (DD) variant. The latter reduces the loss that the failure of high-degree nodes generally inflicts on their network neighbors and thus hampers the cascade amplification. To quantify the final cascade size and obtain our results, we develop a branching process approximation that takes into account that inflicted losses can depend not only on properties of the exposed node but also on those of the failing node. This analytic extension is a natural consequence of the paradigm shift from individual to system safety. To deepen our understanding of the cascade process, we complement this systemic perspective with a mesoscopic one: an analysis of the failure risk of nodes as a function of their degree. Additionally, we examine the role of these failures in the cascade amplification.
Affiliation(s)
- Rebekka Burkholz
  - ETH Zurich, Chair of Systems Design, Weinbergstrasse 56/58, 8092 Zurich, Switzerland
- Antonios Garas
  - ETH Zurich, Chair of Systems Design, Weinbergstrasse 56/58, 8092 Zurich, Switzerland
- Frank Schweitzer
  - ETH Zurich, Chair of Systems Design, Weinbergstrasse 56/58, 8092 Zurich, Switzerland
|
17
|
Butzer RJ, Burkholz R. [Not Available]. Luzif Amor 2001; 4:24-49. [PMID: 11640661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2023]
Affiliation(s)
- R J Butzer
  - Johann Wolfgang Goethe-Universität, Frankfurt/M
|