1
Drożdż A, Duggan B, Ruddock MW, Reid CN, Kurth MJ, Watt J, Irvine A, Lamont J, Fitzgerald P, O’Rourke D, Curry D, Evans M, Boyd R, Sousa J. Stratifying risk of disease in haematuria patients using machine learning techniques to improve diagnostics. Front Oncol 2024; 14:1401071. [PMID: 38779086 PMCID: PMC11109371 DOI: 10.3389/fonc.2024.1401071] [Received: 03/14/2024] [Accepted: 04/22/2024] [Indexed: 05/25/2024] Open
Abstract
Background Detailed and invasive clinical investigations are required to identify the causes of haematuria. A highly unbalanced patient population (predominantly male) and a wide range of potential causes make correctly classifying patients and identifying patient-specific biomarkers a major challenge. Studies have shown that multi-marker analysis with advanced analytical methods can improve diagnosis, even in unbalanced datasets. Here, we applied several machine learning algorithms to classify patients from the haematuria patient cohort (HaBio) by analysing multiple biomarkers and to identify the most relevant ones.
Materials and methods We applied several classification and feature selection methods (k-means clustering, decision trees, random forest with the LIME explainer, and the CACTUS algorithm) to stratify patients into two groups: healthy (with no clear cause of haematuria) or sick (with an identified cause of haematuria, e.g., bladder cancer or infection). The classification performance of the models was compared. Biomarkers identified as important by the algorithms were also analysed in relation to their involvement in pathological processes.
Results A high imbalance in the datasets significantly affected classification by random forest and decision trees, leading to overestimation of the sick class and low model performance. The CACTUS algorithm was more robust to the imbalance. CACTUS obtained a balanced accuracy of 0.747 for both genders, 0.718 for females and 0.803 for males. For the whole dataset, microalbumin, male gender, and tPSA emerged as the most informative biomarkers. For males, age, microalbumin, tPSA, cystatin C, BTA, HAD and S100A4 were the most significant biomarkers, while for females they were microalbumin, IL-8, pERK, and CXCL16.
Conclusions The CACTUS algorithm demonstrated improved performance compared with other methods such as decision trees and random forest. Additionally, we identified the most relevant biomarkers for each specific patient group, which could be considered in the future as novel biomarkers for diagnosis. Our results have the potential to inform future research and provide new personalised diagnostic approaches tailored directly to the needs of individuals.
Affiliation(s)
- Anna Drożdż
  - Personal Health Data Science Group, Sano – Centre for Computational Personalised Medicine - International Research Foundation, Krakow, Poland
- Brian Duggan
  - South Eastern Health and Social Care Trust, Ulster Hospital Dundonald, Belfast, United Kingdom
- Mark W. Ruddock
  - Clinical Studies Group, Randox Laboratories Ltd., Co. Antrim, United Kingdom
- Cherith N. Reid
  - Clinical Studies Group, Randox Laboratories Ltd., Co. Antrim, United Kingdom
- Mary Jo Kurth
  - Clinical Studies Group, Randox Laboratories Ltd., Co. Antrim, United Kingdom
- Joanne Watt
  - Clinical Studies Group, Randox Laboratories Ltd., Co. Antrim, United Kingdom
- Allister Irvine
  - Clinical Studies Group, Randox Laboratories Ltd., Co. Antrim, United Kingdom
- John Lamont
  - Clinical Studies Group, Randox Laboratories Ltd., Co. Antrim, United Kingdom
- Peter Fitzgerald
  - Clinical Studies Group, Randox Laboratories Ltd., Co. Antrim, United Kingdom
- Declan O’Rourke
  - Belfast Health and Social Care Trust, Belfast City Hospital, Belfast, United Kingdom
- David Curry
  - Belfast Health and Social Care Trust, Belfast City Hospital, Belfast, United Kingdom
- Mark Evans
  - Belfast Health and Social Care Trust, Belfast City Hospital, Belfast, United Kingdom
- Ruth Boyd
  - Northern Ireland Clinical Trials Network, Belfast City Hospital, Belfast, United Kingdom
- Jose Sousa
  - Personal Health Data Science Group, Sano – Centre for Computational Personalised Medicine - International Research Foundation, Krakow, Poland
  - Centre for Public Health, Institute of Clinical Sciences, Queen’s University, Belfast, United Kingdom
2
Luo L, Vart P, Kieneker LM, van der Vegt B, Bakker SJL, Gruppen EG, Casteleijn NF, de Boer RA, Suthahar N, de Bock GH, Aboumsallem JP, Gansevoort RT. Mediators of the association between albuminuria and incident cancer: the PREVEND study. Clin Kidney J 2024; 17:sfad295. [PMID: 38213496 PMCID: PMC10783233 DOI: 10.1093/ckj/sfad295] [Received: 11/28/2023] [Indexed: 01/13/2024] Open
Affiliation(s)
- Li Luo
  - Department of Nephrology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Priya Vart
  - Department of Clinical Pharmacy and Pharmacology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Lyanne M Kieneker
  - Department of Nephrology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Bert van der Vegt
  - Department of Pathology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Stephan J L Bakker
  - Department of Nephrology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Eke G Gruppen
  - Department of Nephrology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Niek F Casteleijn
  - Department of Urology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Rudolf A de Boer
  - Erasmus MC, Cardiovascular Institute, Thorax Center, Department of Cardiology, Rotterdam, The Netherlands
- Navin Suthahar
  - Erasmus MC, Cardiovascular Institute, Thorax Center, Department of Cardiology, Rotterdam, The Netherlands
- Geertruida H de Bock
  - Department of Epidemiology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Joseph Pierre Aboumsallem
  - Erasmus MC, Cardiovascular Institute, Thorax Center, Department of Cardiology, Rotterdam, The Netherlands
- Ron T Gansevoort
  - Department of Nephrology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands