1
Matschinske J, Späth J, Bakhtiari M, Probul N, Kazemi Majdabadi MM, Nasirigerdeh R, Torkzadehmahani R, Hartebrodt A, Orban BA, Fejér SJ, Zolotareva O, Das S, Baumbach L, Pauling JK, Tomašević O, Bihari B, Bloice M, Donner NC, Fdhila W, Frisch T, Hauschild AC, Heider D, Holzinger A, Hötzendorfer W, Hospes J, Kacprowski T, Kastelitz M, List M, Mayer R, Moga M, Müller H, Pustozerova A, Röttger R, Saak CC, Saranti A, Schmidt HHHW, Tschohl C, Wenke NK, Baumbach J. The FeatureCloud Platform for Federated Learning in Biomedicine: Unified Approach. J Med Internet Res 2023; 25:e42621. [PMID: 37436815 PMCID: PMC10372562 DOI: 10.2196/42621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 01/13/2023] [Accepted: 02/26/2023] [Indexed: 07/13/2023] Open
Abstract
BACKGROUND Machine learning and artificial intelligence have shown promising results in many areas and are driven by the increasing amount of available data. However, these data are often distributed across different institutions and cannot be easily shared owing to strict privacy regulations. Federated learning (FL) allows the training of distributed machine learning models without sharing sensitive data. However, its implementation is time-consuming and requires advanced programming skills and complex technical infrastructures.
OBJECTIVE Various tools and frameworks have been developed to simplify the development of FL algorithms and provide the necessary technical infrastructure. Although there are many high-quality frameworks, most focus only on a single application case or method. To our knowledge, there are no generic frameworks, meaning that the existing solutions are restricted to a particular type of algorithm or application field. Furthermore, most of these frameworks provide an application programming interface that requires programming knowledge. There is no collection of ready-to-use FL algorithms that are extendable and allow users (eg, researchers) without programming knowledge to apply FL. A central FL platform for both FL algorithm developers and users does not exist. This study aimed to address this gap and make FL available to everyone by developing FeatureCloud, an all-in-one platform for FL in biomedicine and beyond.
METHODS The FeatureCloud platform consists of 3 main components: a global frontend, a global backend, and a local controller. Our platform uses Docker to separate the locally acting components of the platform from the sensitive data systems. We evaluated our platform using 4 different algorithms on 5 data sets for both accuracy and runtime.
RESULTS FeatureCloud removes the complexity of distributed systems for developers and end users by providing a comprehensive platform for executing multi-institutional FL analyses and implementing FL algorithms. Through its integrated artificial intelligence store, federated algorithms can easily be published and reused by the community. To protect sensitive raw data, FeatureCloud supports privacy-enhancing technologies that secure the shared local models and assures high standards in data privacy to comply with the strict General Data Protection Regulation. Our evaluation shows that applications developed in FeatureCloud can produce highly similar results compared with centralized approaches and scale well for an increasing number of participating sites.
CONCLUSIONS FeatureCloud provides a ready-to-use platform that integrates the development and execution of FL algorithms while reducing the complexity to a minimum and removing the hurdles of federated infrastructure. Thus, we believe that it has the potential to greatly increase the accessibility of privacy-preserving and distributed data analyses in biomedicine and beyond.
Affiliation(s)
- Linda Baumbach
  - University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Nina C Donner
  - Concentris Research Management gGmbH, Fürstenfeldbruck, Germany
- Jan Hospes
  - Research Institute AG & Co KG, Vienna, Austria
- Tim Kacprowski
  - Technical University Braunschweig and Hannover Medical School, Brunswick, Germany
- Markus List
  - Technical University Munich, Munich, Germany
2
Maier A, Hartung M, Abovsky M, Adamowicz K, Bader GD, Baier S, Blumenthal DB, Chen J, Elkjaer ML, Garcia-Hernandez C, Helmy M, Hoffmann M, Jurisica I, Kotlyar M, Lazareva O, Levi H, List M, Lobentanzer S, Loscalzo J, Malod-Dognin N, Manz Q, Matschinske J, Mee M, Oubounyt M, Pico AR, Pillich RT, Poschenrieder JM, Pratt D, Pržulj N, Sadegh S, Saez-Rodriguez J, Sarkar S, Shaked G, Shamir R, Trummer N, Turhan U, Wang R, Zolotareva O, Baumbach J. Drugst.One - A plug-and-play solution for online systems medicine and network-based drug repurposing. ArXiv 2023:arXiv:2305.15453v2. [PMID: 37332567 PMCID: PMC10274948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2023]
Abstract
In recent decades, the development of new drugs has become increasingly expensive and inefficient, and the molecular mechanisms of most pharmaceuticals remain poorly understood. In response, computational systems and network medicine tools have emerged to identify potential drug repurposing candidates. However, these tools often require complex installation and lack intuitive visual network mining capabilities. To tackle these challenges, we introduce Drugst.One, a platform that assists specialized computational medicine tools in becoming user-friendly, web-based utilities for drug repurposing. With just three lines of code, Drugst.One turns any systems biology software into an interactive web tool for modeling and analyzing complex protein-drug-disease networks. Demonstrating its broad adaptability, Drugst.One has been successfully integrated with 21 computational systems medicine tools. Available at https://drugst.one, Drugst.One has significant potential for streamlining the drug discovery process, allowing researchers to focus on essential aspects of pharmaceutical treatment research.
Affiliation(s)
- Andreas Maier
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Michael Hartung
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Mark Abovsky
  - Division of Orthopaedic Surgery, Schroeder Arthritis Institute, and Data Science Discovery Centre, Osteoarthritis Research Program, Krembil Research Institute, UHN, Toronto, Canada
  - Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, 60 Leonard Avenue, 5KD-407, Toronto, ON, M5T 0S8, Canada
- Klaudia Adamowicz
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Gary D Bader
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
  - The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada
- Sylvie Baier
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- David B Blumenthal
  - Department Artificial Intelligence in Biomedical Engineering (AIBE), Friedrich-Alexander University Erlangen-Nürnberg (FAU), 91052 Erlangen, Germany
- Jing Chen
  - Department of Medicine, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- Maria L Elkjaer
  - Department of Neurology, Odense University Hospital, Odense, Denmark
  - Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
  - Institute of Molecular Medicine, University of Southern Denmark, Odense, Denmark
- Mohamed Helmy
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Markus Hoffmann
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
  - Institute for Advanced Study (Lichtenbergstrasse 2a, D-85748 Garching, Germany), Technical University of Munich, Germany
  - National Institute of Diabetes, Digestive, and Kidney Diseases, Bethesda, MD 20892, United States of America
- Igor Jurisica
  - Division of Orthopaedic Surgery, Schroeder Arthritis Institute, and Data Science Discovery Centre, Osteoarthritis Research Program, Krembil Research Institute, UHN, Toronto, Canada
  - Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, 60 Leonard Avenue, 5KD-407, Toronto, ON, M5T 0S8, Canada
  - Departments of Medical Biophysics and Computer Science, University of Toronto, Toronto, Canada
  - Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia
- Max Kotlyar
  - Division of Orthopaedic Surgery, Schroeder Arthritis Institute, and Data Science Discovery Centre, Osteoarthritis Research Program, Krembil Research Institute, UHN, Toronto, Canada
  - Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, 60 Leonard Avenue, 5KD-407, Toronto, ON, M5T 0S8, Canada
- Olga Lazareva
  - Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - Junior Clinical Cooperation Unit Multiparametric methods for early detection of prostate cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany
- Hagai Levi
  - Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
- Markus List
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Sebastian Lobentanzer
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Joseph Loscalzo
  - Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Quirin Manz
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Julian Matschinske
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Miles Mee
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Mhaned Oubounyt
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Alexander R Pico
  - Institute of Data Science and Biotechnology, Gladstone Institutes, 1650 Owens Street, San Francisco, 94158, California, USA
- Rudolf T Pillich
  - Department of Medicine, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- Julian M Poschenrieder
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Dexter Pratt
  - Department of Medicine, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- Nataša Pržulj
  - Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
  - Department of Computer Science, University College London, London WC1E 6BT, UK
  - ICREA, Pg. Lluís Companys 23, 08010 Barcelona, Spain
- Sepideh Sadegh
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
  - Department of Clinical Genetics, Odense University Hospital, Odense, Denmark
- Julio Saez-Rodriguez
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Suryadipto Sarkar
  - Department Artificial Intelligence in Biomedical Engineering (AIBE), Friedrich-Alexander University Erlangen-Nürnberg (FAU), 91052 Erlangen, Germany
- Gideon Shaked
  - Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
- Ron Shamir
  - Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
- Nico Trummer
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Ugur Turhan
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Ruisheng Wang
  - Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Olga Zolotareva
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Computational Biomedicine Lab, Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
3
Wolff J, Matschinske J, Baumgart D, Pytlik A, Keck A, Natarajan A, von Schacky CE, Pauling JK, Baumbach J. Federated machine learning for a facilitated implementation of Artificial Intelligence in healthcare - a proof of concept study for the prediction of coronary artery calcification scores. J Integr Bioinform 2022; 19:jib-2022-0032. [PMID: 36054833 PMCID: PMC9800042 DOI: 10.1515/jib-2022-0032] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/14/2022] [Revised: 08/03/2022] [Accepted: 08/11/2022] [Indexed: 01/09/2023] Open
Abstract
The implementation of Artificial Intelligence (AI) still faces significant hurdles, and one key factor is access to data. One approach that could help is federated machine learning (FL), since it allows privacy-preserving data access. For this proof of concept, a prediction model for coronary artery calcification scores (CACS) was applied. The FL model was trained on the data held at the different institutions, while the centralized machine learning model was trained on one allocation of data. Both algorithms predict patients with risk scores ≥5 based on age, biological sex, waist circumference, dyslipidemia and HbA1c. The centralized model yields a sensitivity of approximately 66% and a specificity of approximately 70%. The FL model slightly outperforms it with a sensitivity of 67% and slightly underperforms it with a specificity of 69%. CACS prediction was shown to be feasible via both a centralized and an FL approach, and both show very comparable accuracy. To increase accuracy, additional patient data at a higher volume are required, and for that FL is necessary. The developed "CACulator" serves as a proof of concept, is available as a research tool, and is intended to support future research to facilitate AI implementation.
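The two metrics reported in this abstract can be made concrete with a small sketch. The function name and the toy labels below are illustrative assumptions and are not taken from the study; the code only restates the standard definitions of sensitivity (true positive rate) and specificity (true negative rate).

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 3 positives (risk score >= 5) and 5 negatives.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(toy_true, toy_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Comparing a centralized and a federated model in this way only requires running both on the same held-out labels.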
Affiliation(s)
- Justus Wolff
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
  - Syte – Strategy Institute for Digital Health, Hohe Bleichen 8, 20354 Hamburg, Germany
- Julian Matschinske
  - Chair of Computational Systems Biology, University of Hamburg, Notkestreet 9-11, 22607 Hamburg, Germany
- Dietrich Baumgart
  - Preventicum Essen, Theodor-Althoff-Str. 47, 45133 Essen, Germany
  - Preventicum Duesseldorf, Koenigsallee 11, 40212 Duesseldorf, Germany
- Anne Pytlik
  - Preventicum Essen, Theodor-Althoff-Str. 47, 45133 Essen, Germany
  - Preventicum Duesseldorf, Koenigsallee 11, 40212 Duesseldorf, Germany
- Andreas Keck
  - Syte – Strategy Institute for Digital Health, Hohe Bleichen 8, 20354 Hamburg, Germany
- Arunakiry Natarajan
  - Independent Researcher, Digital Health, Informatics and Data Science, Lower Saxony, Germany
- Claudio E. von Schacky
  - Department of Diagnostic and Interventional Radiology, Klinikum rechts der Isar, Technical University of Munich, Ismaningerstr. 22, 81675 Munich, Germany
- Josch K. Pauling
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
  - LipiTUM, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
- Jan Baumbach
  - Chair of Computational Systems Biology, University of Hamburg, Notkestreet 9-11, 22607 Hamburg, Germany
  - Computational BioMedicine Lab, Institute of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, 5230 Odense, Denmark
4
Cao H, Zhang Y, Baumbach J, Burton PR, Dwyer D, Koutsouleris N, Matschinske J, Marcon Y, Rajan S, Rieg T, Ryser-Welch P, Späth J, Herrmann C, Schwarz E. dsMTL - a computational framework for privacy-preserving, distributed multi-task machine learning. Bioinformatics 2022; 38:4919-4926. [PMID: 36073911 DOI: 10.1093/bioinformatics/btac616] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/31/2022] [Revised: 09/06/2022] [Accepted: 09/07/2022] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION In multi-cohort machine learning studies, it is critical to differentiate between effects that are reproducible across cohorts and those that are cohort-specific. Multi-task learning (MTL) is a machine learning approach that facilitates this differentiation through the simultaneous learning of prediction tasks across cohorts. Since multi-cohort data often cannot be combined into a single storage solution, an MTL application for geographically distributed data sources would be of substantial utility.
RESULTS Here, we describe the development of "dsMTL", a computational framework for privacy-preserving, distributed multi-task machine learning that includes three supervised algorithms and one unsupervised algorithm. First, we derive the theoretical properties of these methods and the relevant machine learning workflows to ensure the validity of the software implementation. Second, we implement dsMTL as a library for the R programming language, building on the DataSHIELD platform that supports the federated analysis of sensitive individual-level data. Third, we demonstrate the applicability of dsMTL for comorbidity modeling in distributed data. We show that comorbidity modeling using dsMTL outperformed conventional, federated machine learning, as well as the aggregation of multiple models built on the distributed datasets individually. The application of dsMTL was computationally efficient and highly scalable when applied to moderate-size (n < 500), real expression data given the actual network latency.
AVAILABILITY dsMTL is freely available at https://github.com/transbioZI/dsMTLBase (server-side package) and https://github.com/transbioZI/dsMTLClient (client-side package).
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Han Cao
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Youcheng Zhang
  - Health Data Science Unit, Medical Faculty Heidelberg & BioQuant, Heidelberg, 69120, Germany
- Jan Baumbach
  - Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Computational Biomedicine Lab, Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Paul R Burton
  - Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, United Kingdom
- Dominic Dwyer
  - Department of Psychiatry and Psychotherapy, Section for Neurodiagnostic Applications, Ludwig-Maximilian University, Munich 80638, Germany
- Nikolaos Koutsouleris
  - Department of Psychiatry and Psychotherapy, Section for Neurodiagnostic Applications, Ludwig-Maximilian University, Munich 80638, Germany
- Julian Matschinske
  - Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Sivanesan Rajan
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Thilo Rieg
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Patricia Ryser-Welch
  - Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, United Kingdom
- Julian Späth
  - Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Carl Herrmann
  - Health Data Science Unit, Medical Faculty Heidelberg & BioQuant, Heidelberg, 69120, Germany
- Emanuel Schwarz
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
5
Späth J, Matschinske J, Kamanu FK, Murphy SA, Zolotareva O, Bakhtiari M, Antman EM, Loscalzo J, Brauneck A, Schmalhorst L, Buchholtz G, Baumbach J. Privacy-aware multi-institutional time-to-event studies. PLOS Digit Health 2022; 1:e0000101. [PMID: 36812603 PMCID: PMC9931301 DOI: 10.1371/journal.pdig.0000101] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/31/2022] [Accepted: 08/06/2022] [Indexed: 11/19/2022]
Abstract
Clinical time-to-event studies are dependent on large sample sizes, often not available at a single institution. However, this is countered by the fact that, particularly in the medical field, individual institutions are often legally unable to share their data, as medical data is subject to strong privacy protection due to its particular sensitivity. But the collection, and especially aggregation into centralized datasets, is also fraught with substantial legal risks and often outright unlawful. Existing solutions using federated learning have already demonstrated considerable potential as an alternative for central data collection. Unfortunately, current approaches are incomplete or not easily applicable in clinical studies owing to the complexity of federated infrastructures. This work presents privacy-aware and federated implementations of the most used time-to-event algorithms (survival curve, cumulative hazard rate, log-rank test, and Cox proportional hazards model) in clinical trials, based on a hybrid approach of federated learning, additive secret sharing, and differential privacy. On several benchmark datasets, we show that all algorithms produce highly similar, or in some cases, even identical results compared to traditional centralized time-to-event algorithms. Furthermore, we were able to reproduce the results of a previous clinical time-to-event study in various federated scenarios. All algorithms are accessible through the intuitive web-app Partea (https://partea.zbh.uni-hamburg.de), offering a graphical user interface for clinicians and non-computational researchers without programming knowledge. Partea removes the high infrastructural hurdles derived from existing federated learning approaches and removes the complexity of execution. Therefore, it is an easy-to-use alternative to central data collection, reducing bureaucratic efforts but also the legal risks associated with the processing of personal data to a minimum.
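To illustrate how additive secret sharing can support a federated survival curve, the sketch below lets several sites jointly sum their per-time-point event and at-risk counts without revealing any single site's numbers, and then computes a Kaplan-Meier estimate from the totals. It is a simplified sketch under stated assumptions, not Partea's implementation: the modulus `PRIME`, all function names, and the toy counts are hypothetical, all parties are simulated in one process, and the differential-privacy component mentioned in the abstract is omitted.

```python
import secrets
from typing import List

PRIME = 2**61 - 1  # assumed public modulus; must exceed any true sum


def make_shares(value: int, n_parties: int) -> List[int]:
    """Split value into n_parties random shares that add up to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_sum(values: List[int]) -> int:
    """Sum one private integer per site using additive secret sharing."""
    n = len(values)
    all_shares = [make_shares(v, n) for v in values]
    # Party j receives share j from every site and reveals only its partial sum.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)]
    return sum(partial_sums) % PRIME


def kaplan_meier(events: List[int], at_risk: List[int]) -> List[float]:
    """Survival probability after each ordered time point: prod(1 - d_t / n_t)."""
    curve, surv = [], 1.0
    for d, n in zip(events, at_risk):
        surv *= 1.0 - d / n
        curve.append(surv)
    return curve


# Hypothetical counts for three sites at two shared time points.
site_events = [[1, 0], [2, 1], [0, 1]]
site_at_risk = [[10, 9], [20, 17], [5, 5]]

total_events = [secure_sum([site[t] for site in site_events]) for t in range(2)]
total_at_risk = [secure_sum([site[t] for site in site_at_risk]) for t in range(2)]
print(kaplan_meier(total_events, total_at_risk))
```

Because only the aggregated counts enter the estimator, the resulting curve matches the one computed on pooled data, which is consistent with the identical results the abstract reports for some algorithms.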
Affiliation(s)
- Julian Späth
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Julian Matschinske
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Frederick K. Kamanu
  - TIMI Study Group, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Sabina A. Murphy
  - TIMI Study Group, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Olga Zolotareva
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Chair of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Mohammad Bakhtiari
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Elliott M. Antman
  - Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Joseph Loscalzo
  - Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Alissa Brauneck
  - Faculty of Legal Sciences, University of Hamburg, Hamburg, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
6
Hauschild AC, Lemanczyk M, Matschinske J, Frisch T, Zolotareva O, Holzinger A, Baumbach J, Heider D. Federated Random Forests can improve local performance of predictive models for various healthcare applications. Bioinformatics 2022; 38:2278-2286. [PMID: 35139148 DOI: 10.1093/bioinformatics/btac065] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/04/2021] [Revised: 01/08/2022] [Accepted: 02/01/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Limited data access has hindered the field of precision medicine from exploring its full potential, e.g. concerning machine learning and privacy and data protection rules. Our study evaluates the efficacy of federated Random Forests (FRF) models, focusing particularly on the heterogeneity within and between datasets. We addressed three common challenges: (i) number of parties, (ii) sizes of datasets and (iii) imbalanced phenotypes, evaluated on five biomedical datasets.
RESULTS The FRF outperformed the average local models and performed comparably to the data-centralized models trained on the entire data. With an increasing number of models and decreasing dataset size, the performance of local models decreases drastically. The performance of the FRF, however, does not decrease significantly. When combining datasets of different sizes, the FRF vastly improve compared to the average local models. By analyzing different class imbalances, we demonstrate that the FRF remain more robust and outperform the local models. Our results support that FRF overcome boundaries of clinical research and enable collaborations across institutes without violating privacy or legal regulations. Clinicians benefit from a vast collection of unbiased data aggregated from different geographic locations, demographics and other varying factors. They can build more generalizable models to make better clinical decisions, which will have relevance, especially for patients in rural areas and rare or geographically uncommon diseases, enabling personalized treatment. In combination with secure multi-party computation, federated learning has the power to revolutionize clinical practice by increasing the accuracy and robustness of healthcare AI and thus paving the way for precision medicine.
AVAILABILITY AND IMPLEMENTATION The implementation of the federated random forests can be found at https://featurecloud.ai/.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marta Lemanczyk
  - Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany
- Julian Matschinske
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising-Weihenstephan, Germany
  - Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Tobias Frisch
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Olga Zolotareva
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising-Weihenstephan, Germany
- Andreas Holzinger
  - Institut für Medizinische Informatik, Statistik und Dokumentation, Medizinische Universität Graz, Graz, Austria
- Jan Baumbach
  - Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Dominik Heider
  - Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany
7
Nasirigerdeh R, Torkzadehmahani R, Matschinske J, Frisch T, List M, Späth J, Weiss S, Völker U, Pitkänen E, Heider D, Wenke NK, Kaissis G, Rueckert D, Kacprowski T, Baumbach J. sPLINK: a hybrid federated tool as a robust alternative to meta-analysis in genome-wide association studies. Genome Biol 2022; 23:32. [PMID: 35073941 PMCID: PMC8785575 DOI: 10.1186/s13059-021-02562-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/25/2020] [Accepted: 12/02/2021] [Indexed: 11/10/2022] Open
Abstract
Meta-analysis has been established as an effective approach to combining summary statistics of several genome-wide association studies (GWAS). However, the accuracy of meta-analysis can be attenuated in the presence of cross-study heterogeneity. We present sPLINK, a hybrid federated and user-friendly tool, which performs privacy-aware GWAS on distributed datasets while preserving the accuracy of the results. sPLINK is robust against heterogeneous distributions of data across cohorts while meta-analysis considerably loses accuracy in such scenarios. sPLINK achieves practical runtime and acceptable network usage for chi-square and linear/logistic regression tests. sPLINK is available at https://exbio.wzw.tum.de/splink.
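One way to see why a federated chi-square test can preserve accuracy is that the statistic depends only on allele counts, which can be summed across cohorts. The sketch below is a hedged illustration of that idea and not sPLINK's actual protocol (which, per the abstract, is a hybrid federated design with its own privacy mechanisms); the table layout, function names, and counts are all hypothetical.

```python
from typing import List, Tuple

# (case_minor, case_major, control_minor, control_major) allele counts for one SNP
Table = Tuple[int, int, int, int]


def aggregate(tables: List[Table]) -> Table:
    """Element-wise sum of the per-cohort contingency tables."""
    a, b, c, d = (sum(col) for col in zip(*tables))
    return (a, b, c, d)


def chi_square_2x2(table: Table) -> float:
    """Pearson chi-square statistic (1 degree of freedom, no continuity correction)."""
    a, b, c, d = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical counts from two cohorts; only these counts leave each site.
cohort_tables = [(30, 70, 20, 80), (25, 75, 15, 85)]
pooled = aggregate(cohort_tables)
print(pooled, round(chi_square_2x2(pooled), 4))
```

Since the summed table is exactly the table one would obtain from the pooled genotypes, the statistic equals the centralized result regardless of how the samples are distributed over cohorts, whereas combining per-cohort summary statistics in a meta-analysis is an approximation.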
Affiliation(s)
- Reza Nasirigerdeh
  - AI in Medicine and Healthcare, Technical University of Munich, Munich, Germany
  - Klinikum rechts der Isar, Munich, Germany
- Julian Matschinske
  - Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Tobias Frisch
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Markus List
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Julian Späth
  - Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Stefan Weiss
  - Department of Functional Genomics, University Medicine Greifswald, Greifswald, Germany
- Uwe Völker
  - Department of Functional Genomics, University Medicine Greifswald, Greifswald, Germany
- Esa Pitkänen
  - Institute for Molecular Medicine Finland (FIMM), Helsinki Institute of Life Science (HiLIFE), University of Helsinki, Helsinki, Finland
  - Applied Tumor Genomics Research Program, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Dominik Heider
  - Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany
- Nina Kerstin Wenke
  - Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Georgios Kaissis
  - AI in Medicine and Healthcare, Technical University of Munich, Munich, Germany
  - Klinikum rechts der Isar, Munich, Germany
  - Biomedical Image Analysis Group, Imperial College London, London, UK
  - OpenMined, Oxford, UK
- Daniel Rueckert
  - AI in Medicine and Healthcare, Technical University of Munich, Munich, Germany
  - Klinikum rechts der Isar, Munich, Germany
  - Biomedical Image Analysis Group, Imperial College London, London, UK
- Tim Kacprowski
  - Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Brunswick, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), Brunswick, Germany
- Jan Baumbach
  - Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
Collapse
|
8
|
Torkzadehmahani R, Nasirigerdeh R, Blumenthal DB, Kacprowski T, List M, Matschinske J, Spaeth J, Wenke NK, Baumbach J. Privacy-Preserving Artificial Intelligence Techniques in Biomedicine. Methods Inf Med 2022; 61:e12-e27. [PMID: 35062032 PMCID: PMC9246509 DOI: 10.1055/s-0041-1740630] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/15/2022]
Abstract
Background
Artificial intelligence (AI) has been successfully applied in numerous scientific domains. In biomedicine, AI has already shown tremendous potential, e.g., in the interpretation of next-generation sequencing data and in the design of clinical decision support systems.
Objectives
However, training an AI model on sensitive data raises concerns about the privacy of individual participants. For example, summary statistics of a genome-wide association study can be used to determine the presence or absence of an individual in a given dataset. This considerable privacy risk has led to restrictions in accessing genomic and other biomedical data, which is detrimental for collaborative research and impedes scientific progress. Hence, there has been a substantial effort to develop AI methods that can learn from sensitive data while protecting individuals' privacy.
Method
This paper provides a structured overview of recent advances in privacy-preserving AI techniques in biomedicine. It places the most important state-of-the-art approaches within a unified taxonomy and discusses their strengths, limitations, and open problems.
Conclusion
As the most promising direction, we suggest combining federated machine learning, as a more scalable approach, with additional privacy-preserving techniques. This would allow their advantages to be merged to provide privacy guarantees in a distributed way for biomedical applications. Nonetheless, more research is necessary, as hybrid approaches pose new challenges such as additional network or computation overhead.
Affiliation(s)
- Reihaneh Torkzadehmahani
  - Institute for Artificial Intelligence in Medicine and Healthcare, Technical University of Munich, Munich, Germany
- Reza Nasirigerdeh
  - Institute for Artificial Intelligence in Medicine and Healthcare, Technical University of Munich, Munich, Germany
  - Klinikum Rechts der Isar, Technical University of Munich, Munich, Germany
- David B Blumenthal
  - Department of Artificial Intelligence in Biomedical Engineering (AIBE), Friedrich-Alexander University Erlangen-Nürnberg (FAU), Erlangen, Germany
- Tim Kacprowski
  - Division of Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Medical School Hannover, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Markus List
  - Chair of Experimental Bioinformatics, Technical University of Munich, Munich, Germany
- Julian Matschinske
  - E.U. Horizon2020 FeatureCloud Project Consortium
  - Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Julian Spaeth
  - E.U. Horizon2020 FeatureCloud Project Consortium
  - Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Nina Kerstin Wenke
  - E.U. Horizon2020 FeatureCloud Project Consortium
  - Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Jan Baumbach
  - E.U. Horizon2020 FeatureCloud Project Consortium
  - Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Institute of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
|
9
|
Zolotareva O, Nasirigerdeh R, Matschinske J, Torkzadehmahani R, Bakhtiari M, Frisch T, Späth J, Blumenthal DB, Abbasinejad A, Tieri P, Kaissis G, Rückert D, Wenke NK, List M, Baumbach J. Flimma: a federated and privacy-aware tool for differential gene expression analysis. Genome Biol 2021; 22:338. [PMID: 34906207 PMCID: PMC8670124 DOI: 10.1186/s13059-021-02553-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/23/2020] [Accepted: 11/22/2021] [Indexed: 12/13/2022] Open
Abstract
Aggregating transcriptomics data across hospitals can increase sensitivity and robustness of differential expression analyses, yielding deeper clinical insights. As data exchange is often restricted by privacy legislation, meta-analyses are frequently employed to pool local results. However, the accuracy might drop if class labels are inhomogeneously distributed among cohorts. Flimma (https://exbio.wzw.tum.de/flimma/) addresses this issue by implementing the state-of-the-art workflow limma voom in a federated manner, i.e., patient data never leave their source site. Flimma results are identical to those generated by limma voom on aggregated datasets even in imbalanced scenarios where meta-analysis approaches fail.
Affiliation(s)
- Olga Zolotareva
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Reza Nasirigerdeh
  - AI in Medicine and Healthcare, Technical University of Munich, Munich, Germany
  - Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Julian Matschinske
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Mohammad Bakhtiari
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Tobias Frisch
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Julian Späth
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- David B Blumenthal
  - Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
- Amir Abbasinejad
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
  - Sapienza University of Rome, Rome, Italy
- Paolo Tieri
  - CNR National Research Council, IAC Institute for Applied Computing, Rome, Italy
  - Sapienza University of Rome, Rome, Italy
- Georgios Kaissis
  - AI in Medicine and Healthcare, Technical University of Munich, Munich, Germany
  - Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
  - Biomedical Image Analysis Group, Imperial College London, London, UK
  - OpenMined, Oxford, UK
- Daniel Rückert
  - AI in Medicine and Healthcare, Technical University of Munich, Munich, Germany
  - Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
  - Biomedical Image Analysis Group, Imperial College London, London, UK
- Nina K Wenke
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Markus List
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
|
10
|
Hufsky F, Lamkiewicz K, Almeida A, Aouacheria A, Arighi C, Bateman A, Baumbach J, Beerenwinkel N, Brandt C, Cacciabue M, Chuguransky S, Drechsel O, Finn RD, Fritz A, Fuchs S, Hattab G, Hauschild AC, Heider D, Hoffmann M, Hölzer M, Hoops S, Kaderali L, Kalvari I, von Kleist M, Kmiecinski R, Kühnert D, Lasso G, Libin P, List M, Löchel HF, Martin MJ, Martin R, Matschinske J, McHardy AC, Mendes P, Mistry J, Navratil V, Nawrocki EP, O’Toole ÁN, Ontiveros-Palacios N, Petrov AI, Rangel-Pineros G, Redaschi N, Reimering S, Reinert K, Reyes A, Richardson L, Robertson DL, Sadegh S, Singer JB, Theys K, Upton C, Welzel M, Williams L, Marz M. Computational strategies to combat COVID-19: useful tools to accelerate SARS-CoV-2 and coronavirus research. Brief Bioinform 2021; 22:642-663. [PMID: 33147627 PMCID: PMC7665365 DOI: 10.1093/bib/bbaa232] [Citation(s) in RCA: 81] [Impact Index Per Article: 27.0] [Received: 05/27/2020] [Revised: 07/28/2020] [Accepted: 08/26/2020] [Indexed: 12/16/2022] Open
Abstract
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need for fast detection, understanding and treatment of COVID-19. To control the ongoing COVID-19 pandemic, it is of utmost importance to get insight into the evolution and pathogenesis of the virus. In this review, we cover bioinformatics workflows and tools for the routine detection of SARS-CoV-2 infection, the reliable analysis of sequencing data, the tracking of the COVID-19 pandemic and evaluation of containment measures, the study of coronavirus evolution, the discovery of potential drug targets and development of therapeutic strategies. For each tool, we briefly describe its use case and how it advances research specifically for SARS-CoV-2. All tools are free to use and available online, either through web applications or public code repositories. Contact: evbc@unj-jena.de.
Affiliation(s)
- Christian Brandt
  - Institute of Infectious Disease and Infection Control at Jena University Hospital, Germany
- Marco Cacciabue
  - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) working on FMDV virology at the Instituto de Agrobiotecnología y Biología Molecular (IABiMo, INTA-CONICET) and at the Departamento de Ciencias Básicas, Universidad Nacional de Luján (UNLu), Argentina
- Oliver Drechsel
  - Bioinformatics department at the Robert Koch-Institute, Germany
- Adrian Fritz
  - Computational Biology of Infection Research group of Alice C. McHardy at the Helmholtz Centre for Infection Research, Germany
- Stephan Fuchs
  - Bioinformatics department at the Robert Koch-Institute, Germany
- Georges Hattab
  - Bioinformatics Division at Philipps-University Marburg, Germany
- Dominik Heider
  - Data Science in Biomedicine at the Philipps-University of Marburg, Germany
- Stefan Hoops
  - Biocomplexity Institute and Initiative at the University of Virginia, USA
- Lars Kaderali
  - Bioinformatics and head of the Institute of Bioinformatics at University Medicine Greifswald, Germany
- Max von Kleist
  - Bioinformatics department at the Robert Koch-Institute, Germany
- René Kmiecinski
  - Bioinformatics department at the Robert Koch-Institute, Germany
- Gorka Lasso
  - Chandran Lab, Albert Einstein College of Medicine, USA
- Alice C McHardy
  - Computational Biology of Infection Research Lab at the Helmholtz Centre for Infection Research in Braunschweig, Germany
- Pedro Mendes
  - Center for Quantitative Medicine of the University of Connecticut School of Medicine, USA
- Vincent Navratil
  - Bioinformatics and Systems Biology at the Rhône Alpes Bioinformatics core facility, Université de Lyon, France
- Nicole Redaschi
  - Development of the Swiss-Prot group at the SIB for UniProt and SIB resources that cover viral biology (ViralZone)
- Susanne Reimering
  - Computational Biology of Infection Research group of Alice C. McHardy at the Helmholtz Centre for Infection Research
- Sepideh Sadegh
  - Chair of Experimental Bioinformatics at Technical University of Munich, Germany
- Joshua B Singer
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, Scotland, UK
- Chris Upton
  - Department of Biochemistry and Microbiology, University of Victoria, Canada
- Manja Marz
  - Friedrich Schiller University Jena, Germany
|
11
|
Galindez G, Matschinske J, Rose TD, Sadegh S, Salgado-Albarrán M, Späth J, Baumbach J, Pauling JK. Lessons from the COVID-19 pandemic for advancing computational drug repurposing strategies. Nat Comput Sci 2021; 1:33-41. [PMID: 38217166 DOI: 10.1038/s43588-020-00007-6] [Citation(s) in RCA: 66] [Impact Index Per Article: 22.0] [Received: 09/16/2020] [Accepted: 12/01/2020] [Indexed: 12/15/2022]
Abstract
Responding quickly to unknown pathogens is crucial to stop uncontrolled spread of diseases that lead to epidemics, such as the novel coronavirus, and to keep protective measures at a level that causes as little social and economic harm as possible. This can be achieved through computational approaches that significantly speed up drug discovery. A powerful approach is to restrict the search to existing drugs through drug repurposing, which can vastly accelerate the usually long approval process. In this Review, we examine a representative set of currently used computational approaches to identify repurposable drugs for COVID-19, as well as their underlying data resources. Furthermore, we compare drug candidates predicted by computational methods to drugs being assessed by clinical trials. Finally, we discuss lessons learned from the reviewed research efforts, including how to successfully connect computational approaches with experimental studies, and propose a unified drug repurposing strategy for better preparedness in the case of future outbreaks.
Affiliation(s)
- Gihanna Galindez
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Julian Matschinske
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Tim Daniel Rose
  - LipiTUM, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Sepideh Sadegh
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Marisol Salgado-Albarrán
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
  - Natural Sciences Department, Universidad Autónoma Metropolitana-Cuajimalpa (UAM-C), Mexico City, Mexico
- Julian Späth
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
- Jan Baumbach
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
  - Computational Biomedicine Lab, Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Josch Konstantin Pauling
  - LipiTUM, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Munich, Germany
|
12
|
Matschinske J, Salgado-Albarrán M, Sadegh S, Bongiovanni D, Baumbach J, Blumenthal DB. Individuating Possibly Repurposable Drugs and Drug Targets for COVID-19 Treatment Through Hypothesis-Driven Systems Medicine Using CoVex. Assay Drug Dev Technol 2020; 18:348-355. [PMID: 33164550 PMCID: PMC7703307 DOI: 10.1089/adt.2020.1010] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/24/2022] Open
Abstract
Coronavirus disease 2019 (COVID-19), caused by the SARS-CoV-2 virus, has developed into a pandemic causing major disruptions and hundreds of thousands of deaths in wide parts of the world. As of July 3, 2020, neither vaccines nor approved drugs for effective treatment are available. In this article, we showcase how to individuate drug targets and potentially repurposable drugs in silico using CoVex, a recently presented systems medicine platform for COVID-19 drug repurposing. Starting from initial hypotheses, CoVex leverages network algorithms to individuate host proteins involved in COVID-19 disease mechanisms, as well as existing drugs targeting these potential drug targets. Our analysis reveals GLA, PLAT, and GGCX as potential drug targets, and urokinase, argatroban, dabigatran etexilate, betrixaban, ximelagatran, and anisindione as potentially repurposable drugs.
Affiliation(s)
- Julian Matschinske
  - Chair of Experimental Bioinformatics, Technical University of Munich, Freising, Germany
- Marisol Salgado-Albarrán
  - Chair of Experimental Bioinformatics, Technical University of Munich, Freising, Germany
  - Natural Sciences Department, Universidad Autónoma Metropolitana-Cuajimalpa, Mexico City, Mexico
- Sepideh Sadegh
  - Chair of Experimental Bioinformatics, Technical University of Munich, Freising, Germany
- Dario Bongiovanni
  - Department of Internal Medicine I, School of Medicine, University Hospital Rechts der Isar, Technical University of Munich, Munich, Germany
  - German Center for Cardiovascular Research (DZHK), Partner Site Munich Heart Alliance, Munich, Germany
  - Department of Cardiovascular Medicine, Humanitas Clinical and Research Center IRCCS, Humanitas University, Rozzano, Milan, Italy
- Jan Baumbach
  - Chair of Experimental Bioinformatics, Technical University of Munich, Freising, Germany
- David B. Blumenthal
  - Chair of Experimental Bioinformatics, Technical University of Munich, Freising, Germany
|
13
|
Sadegh S, Matschinske J, Blumenthal DB, Galindez G, Kacprowski T, List M, Nasirigerdeh R, Oubounyt M, Pichlmair A, Rose TD, Salgado-Albarrán M, Späth J, Stukalov A, Wenke NK, Yuan K, Pauling JK, Baumbach J. Exploring the SARS-CoV-2 virus-host-drug interactome for drug repurposing. Nat Commun 2020; 11:3518. [PMID: 32665542 PMCID: PMC7360763 DOI: 10.1038/s41467-020-17189-2] [Citation(s) in RCA: 113] [Impact Index Per Article: 28.3] [Received: 04/23/2020] [Accepted: 06/12/2020] [Indexed: 11/09/2022] Open
Abstract
Coronavirus Disease-2019 (COVID-19) is an infectious disease caused by the SARS-CoV-2 virus. Various studies exist about the molecular mechanisms of viral infection. However, such information is spread across many publications, and it is very time-consuming to integrate and exploit. We develop CoVex, an interactive online platform for SARS-CoV-2 host interactome exploration and drug (target) identification. CoVex integrates virus-human protein interactions, human protein-protein interactions, and drug-target interactions. It allows visual exploration of the virus-host interactome and implements systems medicine algorithms for network-based prediction of drug candidates. Thus, CoVex is a resource to understand molecular mechanisms of pathogenicity and to prioritize candidate therapeutics. We investigate recent hypotheses on a systems biology level to explore mechanistic virus life cycle drivers, and to extract drug repurposing candidates. CoVex renders COVID-19 drug research systems-medicine-ready by giving the scientific community direct access to network medicine algorithms. It is available at https://exbio.wzw.tum.de/covex/.
Affiliation(s)
- Sepideh Sadegh
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
- Julian Matschinske
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
- David B Blumenthal
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
- Gihanna Galindez
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
- Tim Kacprowski
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
- Markus List
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
- Reza Nasirigerdeh
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
- Mhaned Oubounyt
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
- Andreas Pichlmair
  - Institute of Virology, TUM School of Medicine, Technical University of Munich, München, Germany
- Tim Daniel Rose
  - LipiTUM, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
- Marisol Salgado-Albarrán
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
  - Natural Sciences Department, Universidad Autónoma Metropolitana-Cuajimalpa (UAM-C), 05300, Mexico City, Mexico
- Julian Späth
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
- Alexey Stukalov
  - Institute of Virology, TUM School of Medicine, Technical University of Munich, München, Germany
- Nina K Wenke
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
- Kevin Yuan
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
- Josch K Pauling
  - LipiTUM, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
- Jan Baumbach
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, München, Germany
  - Computational Biomedicine Lab, Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
|