1
el Bouhaddani S, Höllerhage M, Uh HW, Moebius C, Bickle M, Höglinger G, Houwing-Duistermaat J. Statistical integration of multi-omics and drug screening data from cell lines. PLoS Comput Biol 2024; 20:e1011809. [PMID: 38295113 PMCID: PMC10878536 DOI: 10.1371/journal.pcbi.1011809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 02/20/2024] [Accepted: 01/08/2024] [Indexed: 02/02/2024] Open
Abstract
Data integration methods are used to obtain a unified summary of multiple datasets. For multi-modal data, we propose a computational workflow to jointly analyze datasets from cell lines. The workflow comprises a novel probabilistic data integration method, named POPLS-DA, for multi-omics data. The workflow is motivated by a study on synucleinopathies where transcriptomics, proteomics, and drug screening data are measured in affected LUHMES cell lines and controls. The aim is to highlight potentially druggable pathways and genes involved in synucleinopathies. First, POPLS-DA is used to prioritize genes and proteins that best distinguish cases and controls. For these genes, an integrated interaction network is constructed where the drug screen data is incorporated to highlight druggable genes and pathways in the network. Finally, functional enrichment analyses are performed to identify clusters of synaptic and lysosome-related genes and proteins targeted by the protective drugs. POPLS-DA is compared to other single- and multi-omics approaches. We found that HSPA5, a member of the heat shock protein 70 family, was one of the most targeted genes by the validated drugs, in particular by AT1-blockers. HSPA5 and AT1-blockers have been previously linked to α-synuclein pathology and Parkinson's disease, showing the relevance of our findings. Our computational workflow identified new directions for therapeutic targets for synucleinopathies. POPLS-DA provided a larger interpretable gene set than other single- and multi-omic approaches. An implementation based on R and markdown is freely available online.
Affiliation(s)
- Hae-Won Uh
  - Dept. Data science & Biostatistics, UMC Utrecht, Utrecht, Netherlands
- Claudia Moebius
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Marc Bickle
  - Roche Institute for Translational Bioengineering, Basel, Switzerland
- Günter Höglinger
  - Department of Neurology, Hannover Medical School, Hannover, Germany
  - Department of Neurology, Ludwig-Maximilians-Universität, Munich, Germany
  - German Center for Neurodegenerative Diseases, Munich, Germany
  - Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
- Jeanine Houwing-Duistermaat
  - Dept. Data science & Biostatistics, UMC Utrecht, Utrecht, Netherlands
  - Dept. of Mathematics, Radboud University, Nijmegen, Netherlands
2
Meijer C, Uh HW, el Bouhaddani S. Digital Twins in Healthcare: Methodological Challenges and Opportunities. J Pers Med 2023; 13:1522. [PMID: 37888133 PMCID: PMC10608065 DOI: 10.3390/jpm13101522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 10/14/2023] [Accepted: 10/15/2023] [Indexed: 10/28/2023] Open
Abstract
One of the most promising advancements in healthcare is the application of digital twin technology, offering valuable applications in monitoring, diagnosis, and development of treatment strategies tailored to individual patients. Furthermore, digital twins could also be helpful in finding novel treatment targets and predicting the effects of drugs and other chemical substances in development. In this review article, we consider digital twins as virtual counterparts of real human patients. The primary aim of this narrative review is to give an in-depth look into the various data sources and methodologies that contribute to the construction of digital twins across several healthcare domains. Each data source, including blood glucose levels, heart MRI and CT scans, cardiac electrophysiology, written reports, and multi-omics data, comes with different challenges regarding standardization, integration, and interpretation. We showcase how various datasets and methods are used to overcome these obstacles and generate a digital twin. While digital twin technology has seen significant progress, there are still hurdles in the way to achieving a fully comprehensive patient digital twin. Developments in non-invasive and high-throughput data collection, as well as advancements in modeling and computational power will be crucial to improve digital twin systems. We discuss a few critical developments in light of the current state of digital twin technology. Despite challenges, digital twin research holds great promise for personalized patient care and has the potential to shape the future of healthcare innovation.
Affiliation(s)
- Said el Bouhaddani
  - Department Data Science & Biostatistics, Julius Center, UMC Utrecht, 3584 CX Utrecht, The Netherlands (H.-W.U.)
3
Gill SK, Karwath A, Uh HW, Cardoso VR, Gu Z, Barsky A, Slater L, Acharjee A, Duan J, Dall'Olio L, el Bouhaddani S, Chernbumroong S, Stanbury M, Haynes S, Asselbergs FW, Grobbee DE, Eijkemans MJC, Gkoutos GV, Kotecha D. Artificial intelligence to enhance clinical value across the spectrum of cardiovascular healthcare. Eur Heart J 2023; 44:713-725. [PMID: 36629285 PMCID: PMC9976986 DOI: 10.1093/eurheartj/ehac758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 11/22/2022] [Accepted: 12/05/2022] [Indexed: 01/12/2023] Open
Abstract
Artificial intelligence (AI) is increasingly being utilized in healthcare. This article provides clinicians and researchers with a step-wise foundation for high-value AI that can be applied to a variety of different data modalities. The aim is to improve the transparency and application of AI methods, with the potential to benefit patients in routine cardiovascular care. Following a clear research hypothesis, an AI-based workflow begins with data selection and pre-processing prior to analysis, with the type of data (structured, semi-structured, or unstructured) determining what type of pre-processing steps and machine-learning algorithms are required. Algorithmic and data validation should be performed to ensure the robustness of the chosen methodology, followed by an objective evaluation of performance. Seven case studies are provided to highlight the wide variety of data modalities and clinical questions that can benefit from modern AI techniques, with a focus on applying them to cardiovascular disease management. Despite the growing use of AI, further education for healthcare workers, researchers, and the public is needed to aid understanding of how AI works and to close the existing gap in knowledge. In addition, issues regarding data access, sharing, and security must be addressed to ensure full engagement by patients and the public. The application of AI within healthcare provides an opportunity for clinicians to deliver a more personalized approach to medical care by accounting for confounders, interactions, and the rising prevalence of multi-morbidity.
Affiliation(s)
- Simrat K Gill
  - Institute of Cardiovascular Sciences, University of Birmingham, Vincent Drive, B15 2TT Birmingham, UK
  - Health Data Research UK Midlands, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Andreas Karwath
  - Health Data Research UK Midlands, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Institute of Cancer and Genomic Sciences, University of Birmingham, Vincent Drive, B15 2TT Birmingham, UK
- Hae-Won Uh
  - Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht, The Netherlands
- Victor Roth Cardoso
  - Institute of Cardiovascular Sciences, University of Birmingham, Vincent Drive, B15 2TT Birmingham, UK
  - Health Data Research UK Midlands, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Institute of Cancer and Genomic Sciences, University of Birmingham, Vincent Drive, B15 2TT Birmingham, UK
- Zhujie Gu
  - Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht, The Netherlands
- Andrey Barsky
  - Health Data Research UK Midlands, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Institute of Cancer and Genomic Sciences, University of Birmingham, Vincent Drive, B15 2TT Birmingham, UK
- Luke Slater
  - Health Data Research UK Midlands, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Institute of Cancer and Genomic Sciences, University of Birmingham, Vincent Drive, B15 2TT Birmingham, UK
- Animesh Acharjee
  - Health Data Research UK Midlands, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Institute of Cancer and Genomic Sciences, University of Birmingham, Vincent Drive, B15 2TT Birmingham, UK
- Jinming Duan
  - School of Computer Science, University of Birmingham, Birmingham, UK
  - Alan Turing Institute, London, UK
- Lorenzo Dall'Olio
  - Department of Physics and Astronomy, University of Bologna, Bologna, Italy
- Said el Bouhaddani
  - Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht, The Netherlands
- Saisakul Chernbumroong
  - Health Data Research UK Midlands, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Institute of Cancer and Genomic Sciences, University of Birmingham, Vincent Drive, B15 2TT Birmingham, UK
- Folkert W Asselbergs
  - Amsterdam University Medical Center, Department of Cardiology, University of Amsterdam, Amsterdam, The Netherlands
  - Health Data Research UK and Institute of Health Informatics, University College London, London, UK
- Diederick E Grobbee
  - Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht, The Netherlands
- Marinus J C Eijkemans
  - Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht, The Netherlands
- Georgios V Gkoutos
  - Health Data Research UK Midlands, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Institute of Cancer and Genomic Sciences, University of Birmingham, Vincent Drive, B15 2TT Birmingham, UK
- Dipak Kotecha
  - Institute of Cardiovascular Sciences, University of Birmingham, Vincent Drive, B15 2TT Birmingham, UK
  - Health Data Research UK Midlands, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Department of Cardiology, Division Heart and Lungs, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
4
el Bouhaddani S, Uh H, Jongbloed G, Houwing‐Duistermaat J. Statistical integration of heterogeneous omics data: Probabilistic two‐way partial least squares (PO2PLS). J R Stat Soc Ser C Appl Stat 2022. [DOI: 10.1111/rssc.12583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- Said el Bouhaddani
  - Department of Data Science and Biostatistics, UMC Utrecht, Utrecht, The Netherlands
- Hae‐Won Uh
  - Department of Data Science and Biostatistics, UMC Utrecht, Utrecht, The Netherlands
- Geurt Jongbloed
  - Delft Institute of Applied Mathematics, TU Delft, Delft, The Netherlands
- Jeanine Houwing‐Duistermaat
  - Department of Data Science and Biostatistics, UMC Utrecht, Utrecht, The Netherlands
  - Department of Statistics, University of Leeds, Leeds, UK
  - Department of Statistical Sciences, University of Bologna, Bologna, Italy
5
Rahimi E, Shahisavandi M, Royo AC, Azizi M, el Bouhaddani S, Sigari N, Sturkenboom M, Ahmadizar F. The risk profile of patients with COVID-19 as predictors of lung lesions severity and mortality—Development and validation of a prediction model. Front Microbiol 2022; 13:893750. [PMID: 35958125 PMCID: PMC9361066 DOI: 10.3389/fmicb.2022.893750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Accepted: 06/30/2022] [Indexed: 11/23/2022] Open
Abstract
Objective: We developed and validated a prediction model based on individuals' risk profiles to predict the severity of lung involvement and death in patients hospitalized with coronavirus disease 2019 (COVID-19) infection.
Methods: In this retrospective study, we studied hospitalized COVID-19 patients with data on chest CT scans performed during hospital stay (February 2020-April 2021) in a training dataset (TD) (n = 2,251) and an external validation dataset (eVD) (n = 993). We used the most relevant demographical, clinical, and laboratory variables (n = 25) as potential predictors of COVID-19-related outcomes. The primary and secondary endpoints were the severity of lung involvement quantified as mild (≤25%), moderate (26–50%), severe (>50%), and in-hospital death, respectively. We applied random forest (RF) classifier, a machine learning technique, and multivariable logistic regression analysis to study our objectives.
Results: In the TD and the eVD, respectively, the mean [standard deviation (SD)] age was 57.9 (18.0) and 52.4 (17.6) years; patients with severe lung involvement [n (%): 185 (8.2) and 116 (11.7)] were significantly older [mean (SD) age: 64.2 (16.9), and 56.2 (18.9)] than the other two groups (mild and moderate). The mortality rate was higher in patients with severe (64.9 and 38.8%) compared to moderate (5.5 and 12.4%) and mild (2.3 and 7.1%) lung involvement. The RF analysis showed age, C reactive protein (CRP) levels, and duration of hospitalizations as the three most important predictors of lung involvement severity at the time of the first CT examination. Multivariable logistic regression analysis showed a significant strong association between the extent of the severity of lung involvement (continuous variable) and death; adjusted odds ratio (OR): 9.3; 95% CI: 7.1–12.1 in the TD and 2.6 (1.8–3.5) in the eVD.
Conclusion: In hospitalized patients with COVID-19, the severity of lung involvement is a strong predictor of death. Age, CRP levels, and duration of hospitalizations are the most important predictors of severe lung involvement. A simple prediction model based on available clinical and imaging data provides a validated tool that predicts the severity of lung involvement and death probability among hospitalized patients with COVID-19.
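As a rough illustration of the quantities reported in the abstract above, the following Python sketch maps a lung-involvement percentage to the three severity categories and recovers the log-odds coefficient and approximate standard error implied by a reported odds ratio with its 95% confidence interval. The function names are hypothetical and not taken from the article. The sketch assumes a symmetric Wald interval on the log scale, and the treatment of non-integer percentages between the stated cut-offs is also an assumption.

```python
import math


def severity_category(percent_involved: float) -> str:
    """Map percent lung involvement to the categories used in the abstract.

    Cut-offs follow the abstract: mild (<=25%), moderate (26-50%), severe (>50%).
    Assumption: non-integer values between 25 and 26 are treated as moderate.
    """
    if not 0 <= percent_involved <= 100:
        raise ValueError("percent_involved must be between 0 and 100")
    if percent_involved <= 25:
        return "mild"
    if percent_involved <= 50:
        return "moderate"
    return "severe"


def log_odds_from_ci(odds_ratio: float, ci_low: float, ci_high: float,
                     z: float = 1.96) -> tuple[float, float]:
    """Return (beta, se) on the log-odds scale.

    Assumption: the reported interval is a symmetric Wald interval on the
    log scale, i.e. exp(beta +/- z * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# Training-dataset estimate reported in the abstract: OR 9.3 (95% CI 7.1-12.1).
beta_td, se_td = log_odds_from_ci(9.3, 7.1, 12.1)
print(f"beta = {beta_td:.2f}, se = {se_td:.3f}")
```

Because the published confidence limits are rounded, the recovered standard error is only approximate.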
Affiliation(s)
- Ezat Rahimi
  - Clinical Research Unit, Department of Internal Medicine, Kowsar Hospital, Kurdistan University of Medical Sciences, Sanandaj, Iran
- Mina Shahisavandi
  - Epilepsy Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Albert Cid Royo
  - Department of Datascience and Biostatistics, University Medical Center Utrecht, Utrecht, Netherlands
- Mohammad Azizi
  - School of Medicine, Kurdistan University of Medical Sciences, Sanandaj, Iran
- Said el Bouhaddani
  - Department of Datascience and Biostatistics, University Medical Center Utrecht, Utrecht, Netherlands
- Naseh Sigari
  - Lung Diseases and Allergy Research Center, Research Institute for Health Development, Kurdistan University of Medical Sciences, Sanandaj, Iran
- Miriam Sturkenboom
  - Department of Datascience and Biostatistics, University Medical Center Utrecht, Utrecht, Netherlands
- Fariba Ahmadizar
  - Department of Datascience and Biostatistics, University Medical Center Utrecht, Utrecht, Netherlands
  - *Correspondence: Fariba Ahmadizar
6
Bouhaddani SE, Uh HW, Jongbloed G, Hayward C, Klarić L, Kiełbasa SM, Houwing-Duistermaat J. Integrating omics datasets with the OmicsPLS package. BMC Bioinformatics 2018; 19:371. [PMID: 30309317 PMCID: PMC6182835 DOI: 10.1186/s12859-018-2371-3] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 04/09/2018] [Accepted: 09/11/2018] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND: With the exponential growth in available biomedical data, there is a need for data integration methods that can extract information about relationships between the data sets. However, these data sets might have very different characteristics. For interpretable results, data-specific variation needs to be quantified. For this task, Two-way Orthogonal Partial Least Squares (O2PLS) has been proposed. To facilitate application and development of the methodology, free and open-source software is required. However, this is not the case with O2PLS.
RESULTS: We introduce OmicsPLS, an open-source implementation of the O2PLS method in R. It can handle both low- and high-dimensional datasets efficiently. Generic methods for inspecting and visualizing results are implemented. Both a standard and a faster alternative cross-validation method are available to determine the number of components. A simulation study shows good performance of OmicsPLS compared to alternatives, in terms of accuracy and CPU runtime. We demonstrate OmicsPLS by integrating genetic and glycomic data.
CONCLUSIONS: We propose the OmicsPLS R package: a free and open-source implementation of O2PLS for statistical data integration. OmicsPLS is available at https://cran.r-project.org/package=OmicsPLS and can be installed in R via install.packages("OmicsPLS").
Affiliation(s)
- Said el Bouhaddani
  - Dept. of Biomedical Data Sciences, LUMC, Albinusdreef 2, Leiden, 2300 RC The Netherlands
  - Delft Institute of Applied Mathematics, EEMCS, TU Delft, Van Mourik Broekmanweg 6, Delft, 2628 XE The Netherlands
- Hae-Won Uh
  - Department of Biostatistics and Research Support, UMC Utrecht, div. Julius Centre, Huispost Str. 6.131, Utrecht, 3508 GA The Netherlands
- Geurt Jongbloed
  - Delft Institute of Applied Mathematics, EEMCS, TU Delft, Van Mourik Broekmanweg 6, Delft, 2628 XE The Netherlands
- Caroline Hayward
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, EH4 2XU Scotland
- Lucija Klarić
  - Genos Glycobiology Laboratory, Zagreb, 10000 Croatia
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, EH4 2XU Scotland
  - Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, EH8 9DX Scotland
- Szymon M. Kiełbasa
  - Dept. of Biomedical Data Sciences, LUMC, Albinusdreef 2, Leiden, 2300 RC The Netherlands
7
el Bouhaddani S, Uh HW, Hayward C, Jongbloed G, Houwing-Duistermaat J. Probabilistic partial least squares model: Identifiability, estimation and application. J Multivar Anal 2018. [DOI: 10.1016/j.jmva.2018.05.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]