1
Farinella R, Felici A, Peduzzi G, Testoni SGG, Costello E, Aretini P, Blazquez-Encinas R, Oz E, Pastore A, Tacelli M, Otlu B, Campa D, Gentiluomo M. From classical approaches to artificial intelligence, old and new tools for PDAC risk stratification and prediction. Semin Cancer Biol 2025; 112:71-92. [PMID: 40147701 DOI: 10.1016/j.semcancer.2025.03.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2024] [Revised: 03/08/2025] [Accepted: 03/19/2025] [Indexed: 03/29/2025]
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is recognized as one of the most lethal malignancies, characterized by late-stage diagnosis and limited therapeutic options. Risk stratification has traditionally been performed using epidemiological studies and genetic analyses, through which key risk factors, including smoking, diabetes, chronic pancreatitis, and inherited predispositions, have been identified. However, the multifactorial nature of PDAC has often been insufficiently addressed by these methods, leading to limited precision in individualized risk assessments. Advances in artificial intelligence (AI) have been proposed as a transformative approach, allowing the integration of diverse datasets, spanning genetic, clinical, lifestyle, and imaging data, into dynamic models capable of uncovering novel interactions and risk profiles. In this review, the evolution of PDAC risk stratification is explored, with classical epidemiological frameworks compared to AI-driven methodologies. Genetic insights, including genome-wide association studies and polygenic risk scores, are discussed, alongside AI models such as machine learning, radiomics, and deep learning. Strengths and limitations of these approaches are evaluated, and challenges in clinical translation, such as data scarcity, model interpretability, and external validation, are addressed. Finally, future directions are proposed for combining classical and AI-driven methodologies to develop scalable, personalized predictive tools for PDAC, with the goal of improving early detection and patient outcomes.
Affiliation(s)
- Sabrina Gloria Giulia Testoni
- Division of Gastroenterology and Gastrointestinal Endoscopy, IRCCS Policlinico San Donato, Vita-Salute San Raffaele University, Milan, Italy
- Eithne Costello
- Liverpool Experimental Cancer Medicine Centre, University of Liverpool, Liverpool, United Kingdom
- Paolo Aretini
- Fondazione Pisana per la Scienza, San Giuliano Terme, Italy
- Ricardo Blazquez-Encinas
- Department of Cell Biology, Physiology and Immunology, University of Cordoba / Maimonides Biomedical Research Institute of Cordoba (IMIBIC), Cordoba, Spain
- Elif Oz
- Department of Biostatistics and Bioinformatics, Institute of Health Sciences, Acibadem Mehmet Ali Aydinlar University, Istanbul, Turkey
- Aldo Pastore
- Fondazione Pisana per la Scienza, San Giuliano Terme, Italy
- Matteo Tacelli
- Pancreas Translational & Clinical Research Center, Pancreato-Biliary Endoscopy and Endosonography Division, San Raffaele Scientific Institute IRCCS, Milan, Italy
- Burçak Otlu
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
- Daniele Campa
- Department of Biology, University of Pisa, Pisa, Italy
2
Murray K, Oldfield L, Stefanova I, Gentiluomo M, Aretini P, O'Sullivan R, Greenhalf W, Paiella S, Aoki MN, Pastore A, Birch-Ford J, Rao BH, Uysal-Onganer P, Walsh CM, Hanna GB, Narang J, Sharma P, Campa D, Rizzato C, Turtoi A, Sever EA, Felici A, Sucularli C, Peduzzi G, Öz E, Sezerman OU, Van der Meer R, Thompson N, Costello E. Biomarkers, omics and artificial intelligence for early detection of pancreatic cancer. Semin Cancer Biol 2025; 111:76-88. [PMID: 39986585 DOI: 10.1016/j.semcancer.2025.02.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2024] [Revised: 02/13/2025] [Accepted: 02/17/2025] [Indexed: 02/24/2025]
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is frequently diagnosed in its late stages when treatment options are limited. Unlike other common cancers, there are no population-wide screening programmes for PDAC. Thus, early disease detection, although urgently needed, remains elusive. Individuals in certain high-risk groups are, however, offered screening or surveillance. Here we explore advances in understanding high-risk groups for PDAC and efforts to implement biomarker-driven detection of PDAC in these groups. We review current approaches to early detection biomarker development and the use of artificial intelligence as applied to electronic health records (EHRs) and social media. Finally, we address the cost-effectiveness of applying biomarker strategies for early detection of PDAC.
Affiliation(s)
- Kate Murray
- Liverpool Experimental Cancer Medicine Centre, University of Liverpool, Liverpool, United Kingdom
- Lucy Oldfield
- Liverpool Experimental Cancer Medicine Centre, University of Liverpool, Liverpool, United Kingdom
- Irena Stefanova
- Liverpool Experimental Cancer Medicine Centre, University of Liverpool, Liverpool, United Kingdom
- Rachel O'Sullivan
- Liverpool Experimental Cancer Medicine Centre, University of Liverpool, Liverpool, United Kingdom
- William Greenhalf
- Liverpool Experimental Cancer Medicine Centre, University of Liverpool, Liverpool, United Kingdom
- Salvatore Paiella
- Pancreatic Surgery Unit, Department of Surgery, Dentistry, Paediatrics and Gynaecology, University of Verona, Italy
- Mateus N Aoki
- Laboratory for Applied Science and Technology in Health, Carlos Chagas Institute, Oswaldo Cruz Foundation (Fiocruz), Brazil
- Aldo Pastore
- Fondazione Pisana per la Scienza, Scuola Normale Superiore di Pisa, Italy
- James Birch-Ford
- Liverpool Experimental Cancer Medicine Centre, University of Liverpool, Liverpool, United Kingdom
- Bhavana Hemantha Rao
- Biomedical Centre, Faculty of Medicine in Pilsen, Charles University, Czech Republic
- Pinar Uysal-Onganer
- School of Life Sciences, Cancer Mechanisms and Biomarkers Group, The University of Westminster, United Kingdom
- Caoimhe M Walsh
- Department of Surgery and Cancer, Imperial College London, United Kingdom
- George B Hanna
- Department of Surgery and Cancer, Imperial College London, United Kingdom
- Andrei Turtoi
- Tumor Microenvironment and Resistance to Treatment Lab, Institut de Recherche en Cancérologie de Montpellier, INSERM U1194, Université de Montpellier, France
- Elif Arik Sever
- Institute of Health Sciences, Acibadem Mehmet Ali Aydinlar University, Turkiye
- Elif Öz
- Department of Biostatistics and Bioinformatics, Acibadem Mehmet Ali Aydinlar University, Turkiye
- Osman Uğur Sezerman
- Department of Biostatistics and Bioinformatics, Acibadem Mehmet Ali Aydinlar University, Turkiye
- Eithne Costello
- Liverpool Experimental Cancer Medicine Centre, University of Liverpool, Liverpool, United Kingdom.
3
Peduzzi G, Felici A, Pellungrini R, Campa D. Explainable machine learning identifies a polygenic risk score as a key predictor of pancreatic cancer risk in the UK Biobank. Dig Liver Dis 2025; 57:915-922. [PMID: 39632152 DOI: 10.1016/j.dld.2024.11.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2024] [Revised: 11/11/2024] [Accepted: 11/12/2024] [Indexed: 12/07/2024]
Abstract
BACKGROUND Predicting the risk of developing pancreatic ductal adenocarcinoma (PDAC) is of paramount importance, given its high mortality rate. Current PDAC risk prediction models rely on a limited number of variables, do not include genetics, and have a modest accuracy. AIM This study aimed to develop an interpretable PDAC risk prediction model, based on machine learning (ML). METHODS Five ML models (Adaptive Boosting, eXtreme Gradient Boosting, CatBoost, Deep Forest and Random Forest) built on 56 exposome variables and a polygenic risk score (PRS) were tested in 654 PDAC cases and 1,308 controls of the UK Biobank. Additionally, SHapley Additive exPlanation (SHAP) and Global model Interpretation via the Recursive Partitioning (Girp) were employed to explain the models. RESULTS All models provided similar performance, but based on recall the best was CatBoost (77.10 %). SHAP highlighted age and the PRS as primary contributors across all models. Girp developed rules to discern cases from controls, identifying age, PRS, and pancreatitis in most of the rules. CONCLUSION The predictive models tested have exhibited good performance, indicating their potential application in the clinical field in the near future, with the PRS playing a key role in identifying high-risk individuals as demonstrated by the explainers.
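The SHAP explainer named in this abstract builds on Shapley values, which attribute a prediction to individual features by averaging each feature's marginal contribution over all feature orderings. The following is a minimal sketch of that idea in plain Python, not the authors' pipeline: the `toy_risk` scoring function, its coefficients, and the feature values are hypothetical and chosen only for illustration, and the exact enumeration used here is feasible only for a handful of features.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    features = list(instance)
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)       # start from the reference individual
        previous = model(current)
        for f in order:
            current[f] = instance[f]   # switch one feature to the explained individual's value
            updated = model(current)
            totals[f] += updated - previous
            previous = updated
    return {f: totals[f] / len(orderings) for f in features}


def toy_risk(x):
    # Hypothetical risk score with one interaction term (not taken from the study).
    return (0.03 * x["age"] + 0.5 * x["prs"] + 0.8 * x["pancreatitis"]
            + 0.2 * x["prs"] * x["pancreatitis"])


instance = {"age": 70, "prs": 1.5, "pancreatitis": 1}   # hypothetical individual
baseline = {"age": 55, "prs": 0.0, "pancreatitis": 0}   # hypothetical reference
phi = shapley_values(toy_risk, instance, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the score of the explained individual and the score of the reference, which is the property that lets per-feature contributions be read as shares of a single prediction.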
Affiliation(s)
- Giulia Peduzzi
- Department of Biology, University of Pisa, Via Luca Ghini, 13 - 56126, Pisa, Italy.
- Alessio Felici
- Department of Biology, University of Pisa, Via Luca Ghini, 13 - 56126, Pisa, Italy.
- Roberto Pellungrini
- Classe di scienze, Scuola Normale Superiore, Piazza dei Cavalieri, 7 - 56126, Pisa, Italy.
- Daniele Campa
- Department of Biology, University of Pisa, Via Luca Ghini, 13 - 56126, Pisa, Italy.
4
Maurer J, Rübner M, Kuo CC, Klein B, Franzen J, Wittenborn J, Kupec T, Najjari L, Fasching P, Stickeler E. Random forest algorithm identifies miRNA signatures for breast cancer detection and classification from patient urine samples. Ther Adv Med Oncol 2024; 16:17588359241299563. [PMID: 39678737 PMCID: PMC11645719 DOI: 10.1177/17588359241299563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2024] [Accepted: 10/28/2024] [Indexed: 12/17/2024]
Abstract
Background and objectives Breast cancer is the most common cancer in women, with one in eight women suffering from this disease in her lifetime. The implementation of centrally organized mammography screening for women between 50 and 69 years of age was a major step in the direction of early detection. However, the participation rate reaches approximately 50% of the eligible women, one reason being the painful compression of the breast, cited as a major issue for not participating in this very important program. Therefore, focusing current research on less painful and less invasive techniques for the detection of breast cancer is highly clinically relevant. Liquid biopsies offer this option by detection of distinct molecules such as microRNAs (miRNAs) or circulating tumor DNA (ctDNA) or disseminated tumor cells. Design and methods Here, we present the first proof-of-concept approach for sequencing miRNAs in female urine to detect breast cancer and, subsequently, intrinsic subtype-specific miRNA patterns and implement in this regard a novel random forest algorithm. To this end, we performed miRNA sequencing on 82 urine samples, 32 samples from breast cancer patients (9× luminal A, 8× luminal B, 9× triple-negative, and 6× HER2) and 50 healthy control samples. Results and conclusion Using a random forest algorithm, we identified a signature of 275 miRNAs that allows the detection of invasive breast cancer in urine. Furthermore, we identified distinct miRNA expression patterns for the major intrinsic subtypes of breast cancer, specifically luminal A, luminal B, HER2-enriched, and triple-negative breast cancer. This experimental approach specifically validates miRNA sequencing as a technique for breast cancer detection in urine samples and opens the door to a new, easy, and painless procedure for different breast cancer-related medical procedures such as screening but also treatment monitoring.
Affiliation(s)
- Jochen Maurer
- Clinic for Gynecology and Obstetrics, University Hospital RWTH Aachen, Aachen, Germany
- Center for Integrated Oncology (CIO), Aachen, Bonn, Cologne, Düsseldorf (ABCD), Pauwelsstraße 30, D 52074 Aachen, Germany
- Matthias Rübner
- Department of Gynecology and Obstetrics, Erlangen University Hospital, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander University Erlangen-Nuremberg, Erlangen, Germany
- Chao-Chung Kuo
- Genomics Facility, Interdisciplinary Center for Clinical Research (IZKF), RWTH Aachen University, Aachen, Germany
- Birgit Klein
- Clinic for Gynecology and Obstetrics, University Hospital RWTH Aachen, Aachen, Germany
- Julia Franzen
- Genomics Facility, Interdisciplinary Center for Clinical Research (IZKF), RWTH Aachen University, Aachen, Germany
- Julia Wittenborn
- Clinic for Gynecology and Obstetrics, University Hospital RWTH Aachen, Aachen, Germany
- Center for Integrated Oncology (CIO), Aachen, Bonn, Cologne, Düsseldorf (ABCD), Germany
- Tomas Kupec
- Clinic for Gynecology and Obstetrics, University Hospital RWTH Aachen, Aachen, Germany
- Center for Integrated Oncology (CIO), Aachen, Bonn, Cologne, Düsseldorf (ABCD), Germany
- Laila Najjari
- Clinic for Gynecology and Obstetrics, University Hospital RWTH Aachen, Aachen, Germany
- Center for Integrated Oncology (CIO), Aachen, Bonn, Cologne, Düsseldorf (ABCD), Germany
- Peter Fasching
- Department of Gynecology and Obstetrics, Erlangen University Hospital, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander University Erlangen-Nuremberg, Erlangen, Germany
- Elmar Stickeler
- Clinic for Gynecology and Obstetrics, University Hospital RWTH Aachen, Aachen, Germany
- Center for Integrated Oncology (CIO), Aachen, Bonn, Cologne, Düsseldorf (ABCD), Germany
5
Isakov O, Riesel D, Leshchinsky M, Shaham G, Reis BY, Keret D, Levi Z, Brener B, Balicer R, Dagan N, Hayek S. Development and Validation of a Colorectal Cancer Prediction Model: A Nationwide Cohort-Based Study. Dig Dis Sci 2024; 69:2611-2620. [PMID: 38662163 PMCID: PMC11258054 DOI: 10.1007/s10620-024-08427-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 04/05/2024] [Indexed: 04/26/2024]
Abstract
BACKGROUND Early diagnosis of colorectal cancer (CRC) is critical to increasing survival rates. Computerized risk prediction models hold great promise for identifying individuals at high risk for CRC. In order to utilize such models effectively in a population-wide screening setting, development and validation should be based on cohorts that are similar to the target population. AIM Establish a risk prediction model for CRC diagnosis based on electronic health records (EHR) from subjects eligible for CRC screening. METHODS A retrospective cohort study utilizing the EHR data of Clalit Health Services (CHS). The study includes CHS members aged 50-74 who were eligible for CRC screening from January 2013 to January 2019. The model was trained to predict receiving a CRC diagnosis within 2 years of the index date. Approximately 20,000 EHR demographic and clinical features were considered. RESULTS The study includes 2935 subjects with CRC diagnosis, and 1,133,457 subjects without CRC diagnosis. Incidence values of CRC among subjects in the top 1% risk scores were higher than baseline (2.3% vs 0.3%; lift 8.38; P value < 0.001). Cumulative event probabilities increased with higher model scores. Model-based risk stratification among subjects with a positive FOBT identified subjects with more than twice the risk for CRC compared to FOBT alone. CONCLUSIONS We developed an individualized risk prediction model for CRC that can be utilized as a complementary decision support tool for healthcare providers to precisely identify subjects at high risk for CRC and refer them for confirmatory testing.
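Lift, as reported in this abstract, compares the event rate among the highest-scoring subjects with the event rate in the whole cohort. The sketch below shows one common way to compute it; the function name `lift_at_top`, the toy scores, and the toy labels are hypothetical, and the only study figures used are the case and non-case counts quoted above. Because the published percentages are rounded, dividing 2.3% by 0.3% will not reproduce the reported lift of 8.38.

```python
def lift_at_top(scores, labels, fraction):
    """Event rate in the top `fraction` of scores divided by the overall event rate."""
    n = len(scores)
    k = max(1, int(n * fraction))                      # size of the top-scoring group
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(labels) / n
    return top_rate / base_rate


# Toy cohort of 10 subjects (hypothetical): 1 = event, 0 = no event.
toy_scores = [0.9, 0.8, 0.1, 0.7, 0.2, 0.3, 0.4, 0.05, 0.15, 0.25]
toy_labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
toy_lift = lift_at_top(toy_scores, toy_labels, fraction=0.2)
print(f"toy lift in top 20%: {toy_lift:.2f}")

# Cohort-wide incidence from the counts quoted in the abstract.
cases, non_cases = 2935, 1_133_457
baseline_percent = 100 * cases / (cases + non_cases)
print(f"baseline incidence: {baseline_percent:.2f}%")
```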
Affiliation(s)
- Ofer Isakov
- Innovation Division, Clalit Research Institute, Clalit Health Services, Tel Aviv, Israel
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute, Boston, MA, USA
- Dan Riesel
- Innovation Division, Clalit Research Institute, Clalit Health Services, Tel Aviv, Israel
- Michael Leshchinsky
- Innovation Division, Clalit Research Institute, Clalit Health Services, Tel Aviv, Israel
- Galit Shaham
- Innovation Division, Clalit Research Institute, Clalit Health Services, Tel Aviv, Israel
- Ben Y Reis
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute, Boston, MA, USA
- Predictive Medicine Group, Boston Children's Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Dan Keret
- Gastroenterology and Hepatology Department, Clalit Health Services, Jerusalem, Israel
- Zohar Levi
- Department of Gastroenterology, Beilinson Medical Center, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Baruch Brener
- Institute of Oncology, Davidoff Cancer Center, Rabin Medical Center, Beilinson Campus, Petah Tikva, Israel
- Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Ran Balicer
- Innovation Division, Clalit Research Institute, Clalit Health Services, Tel Aviv, Israel
- The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute, Boston, MA, USA
- School of Public Health, Faculty of Health Sciences, Ben Gurion University of the Negev, Be'er Sheva, Israel
- Noa Dagan
- Innovation Division, Clalit Research Institute, Clalit Health Services, Tel Aviv, Israel
- The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute, Boston, MA, USA
- Software and Information Systems Engineering, Ben Gurion University of the Negev, Be'er Sheva, Israel
- Samah Hayek
- Innovation Division, Clalit Research Institute, Clalit Health Services, Tel Aviv, Israel.
- Department of Epidemiology and Preventive Medicine, School of Public Health, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel.
6
Afrash MR, Mirbagheri E, Mashoufi M, Kazemi-Arpanahi H. Optimizing prognostic factors of five-year survival in gastric cancer patients using feature selection techniques with machine learning algorithms: a comparative study. BMC Med Inform Decis Mak 2023; 23:54. [PMID: 37024885 PMCID: PMC10080884 DOI: 10.1186/s12911-023-02154-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/05/2022] [Accepted: 03/15/2023] [Indexed: 04/08/2023]
Abstract
BACKGROUND Gastric cancer is the most common malignant tumor worldwide and a leading cause of cancer deaths. This neoplasm has a poor prognosis and heterogeneous outcomes. Survivability prediction may help select the best treatment plan based on an individual's prognosis. Numerous clinical and pathological features are generally used in predicting gastric cancer survival, and their influence on the survival of this cancer has not been fully elucidated. Moreover, the five-year survivability prognosis performances of feature selection methods with machine learning (ML) classifiers for gastric cancer have not been fully benchmarked. Therefore, we adopted several well-known feature selection methods and ML classifiers together to determine the best-paired feature selection-classifier for this purpose. METHODS This was a retrospective study on a dataset of 974 patients diagnosed with gastric cancer in the Ayatollah Talleghani Hospital, Abadan, Iran. First, four feature selection algorithms, including Relief, Boruta, least absolute shrinkage and selection operator (LASSO), and minimum redundancy maximum relevance (mRMR) were used to select a set of relevant features that are very informative for five-year survival prediction in gastric cancer patients. Then, each feature set was fed to three classifiers: XG Boost (XGB), hist gradient boosting (HGB), and support vector machine (SVM) to develop predictive models. Finally, paired feature selection-classifier methods were evaluated to select the best-paired method using the area under the curve (AUC), accuracy, sensitivity, specificity, and f1-score metrics. RESULTS The LASSO feature selection algorithm combined with the XG Boost classifier achieved an accuracy of 89.10%, a specificity of 87.15%, a sensitivity of 89.42%, an AUC of 89.37%, and an f1-score of 90.8%. Tumor stage, history of other cancers, lymphatic invasion, tumor site, type of treatment, body weight, histological type, and addiction were identified as the most significant factors affecting gastric cancer survival. CONCLUSIONS This study proved the worth of the paired feature selection-classifier to identify the best path that could augment the five-year survival prediction in gastric cancer patients. Our results were better than those of previous studies, both in terms of the time required to form the models and the performance measurement criteria of the algorithms. These findings may be very promising and can, therefore, inform clinical decision-making and shed light on future studies.
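The accuracy, sensitivity, specificity, and F1-score reported in this abstract all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `classification_metrics` and the counts passed to it are hypothetical and are not taken from the study, whose confusion matrix is not given here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # recall: share of true positives found
    specificity = tn / (tn + fp)            # share of true negatives found
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for 100 patients (illustration only).
metrics = classification_metrics(tp=45, fp=10, tn=40, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and specificity are each computed within one true class, so neither depends on the class balance of the evaluation set, whereas accuracy and F1 do.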
Affiliation(s)
- Mohammad Reza Afrash
- Department of Artificial Intelligence, Smart University of Medical Sciences, Tehran, Iran
- Esmat Mirbagheri
- Department of Health Information Management, School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Iran
- Mehrnaz Mashoufi
- Department of Health Information Management, Ardabil University of Medical Sciences, Ardabil, Iran
- Hadi Kazemi-Arpanahi
- Department of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran.
7
Loh HW, Ooi CP, Seoni S, Barua PD, Molinari F, Acharya UR. Application of explainable artificial intelligence for healthcare: A systematic review of the last decade (2011-2022). Comput Methods Programs Biomed 2022; 226:107161. [PMID: 36228495 DOI: 10.1016/j.cmpb.2022.107161] [Citation(s) in RCA: 155] [Impact Index Per Article: 51.7] [Received: 08/29/2022] [Revised: 09/16/2022] [Accepted: 09/25/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND AND OBJECTIVES Artificial intelligence (AI) has branched out to various applications in healthcare, such as health services management, predictive medicine, clinical decision-making, and patient data and diagnostics. Although AI models have achieved human-like performance, their use is still limited because they are seen as a black box. This lack of trust remains the main reason for their low use in practice, especially in healthcare. Hence, explainable artificial intelligence (XAI) has been introduced as a technique that can provide confidence in the model's prediction by explaining how the prediction is derived, thereby encouraging the use of AI systems in healthcare. The primary goal of this review is to identify areas of healthcare that require more attention from the XAI research community. METHODS Multiple journal databases were thoroughly searched using PRISMA guidelines 2020. Studies that do not appear in Q1 journals, which are highly credible, were excluded. RESULTS In this review, we surveyed 99 Q1 articles covering the following XAI techniques: SHAP, LIME, GradCAM, LRP, Fuzzy classifier, EBM, CBR, rule-based systems, and others. CONCLUSION We discovered that detecting abnormalities in 1D biosignals and identifying key text in clinical notes are areas that require more attention from the XAI research community. We hope this review will encourage the development of a holistic cloud system for a smart city.
Affiliation(s)
- Hui Wen Loh
- School of Science and Technology, Singapore University of Social Sciences, Singapore
- Chui Ping Ooi
- School of Science and Technology, Singapore University of Social Sciences, Singapore
- Silvia Seoni
- Department of Electronics and Telecommunications, Biolab, Politecnico di Torino, Torino 10129, Italy
- Prabal Datta Barua
- Faculty of Engineering and Information Technology, University of Technology Sydney, Australia; School of Business (Information Systems), Faculty of Business, Education, Law & Arts, University of Southern Queensland, Australia
- Filippo Molinari
- Department of Electronics and Telecommunications, Biolab, Politecnico di Torino, Torino 10129, Italy
- U Rajendra Acharya
- School of Science and Technology, Singapore University of Social Sciences, Singapore; School of Business (Information Systems), Faculty of Business, Education, Law & Arts, University of Southern Queensland, Australia; School of Engineering, Ngee Ann Polytechnic, Singapore; Department of Bioinformatics and Medical Engineering, Asia University, Taiwan; Research Organization for Advanced Science and Technology (IROAST), Kumamoto University, Kumamoto, Japan.
8
Afrash MR, Shanbehzadeh M, Kazemi-Arpanahi H. Design and Development of an Intelligent System for Predicting 5-Year Survival in Gastric Cancer. Clin Med Insights Oncol 2022; 16:11795549221116833. [PMID: 36035639 PMCID: PMC9403452 DOI: 10.1177/11795549221116833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2022] [Accepted: 07/13/2022] [Indexed: 11/17/2022]
Abstract
Background Gastric cancer remains one of the leading causes of worldwide cancer-specific deaths. Accurately predicting the survival likelihood of gastric cancer patients can inform caregivers to boost patient prognostication and choose the best possible treatment path. This study intends to develop an intelligent system based on machine learning (ML) algorithms for predicting the 5-year survival status in gastric cancer patients. Methods A data set comprising the retrospectively collected records of 974 gastric cancer patients was used. First, the most important predictors were identified using the Boruta feature selection algorithm. Five classifiers, including J48 decision tree (DT), support vector machine (SVM) with radial basis function (RBF) kernel, bootstrap aggregating (Bagging), hist gradient boosting (HGB), and adaptive boosting (AdaBoost), were trained for predicting gastric cancer survival. The performance of the used techniques was evaluated with specificity, sensitivity, likelihood ratio, and total accuracy. Finally, the system was developed according to the best model. Results The stage, position, and size of tumor were selected as the 3 top predictors for gastric cancer survival. Among the 6 selected ML algorithms, the HGB classifier with the mean accuracy, mean specificity, mean sensitivity, mean area under the curve, and mean F1-score of 88.37%, 86.24%, 89.72%, 88.11%, and 89.91%, respectively, gained the best performance. Conclusions The ML models can accurately predict the 5-year survival and potentially act as a customized recommender for decision-making in gastric cancer patients. The developed system in our study can improve the quality of treatment, patient safety, and survival rates; it may guide prescribing more personalized medicine.
Affiliation(s)
- Mohammad Reza Afrash
- Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mostafa Shanbehzadeh
- Department of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran
- Hadi Kazemi-Arpanahi
- Department of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran
- Student Research Committee, Abadan University of Medical Sciences, Abadan, Iran
9
Gopukumar D, Ghoshal A, Zhao H. A Machine Learning Approach for Predicting Readmission Charges Billed by Hospitals. JMIR Med Inform 2022; 10:e37578. [PMID: 35896038 PMCID: PMC9472041 DOI: 10.2196/37578] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/25/2022] [Revised: 05/02/2022] [Accepted: 07/26/2022] [Indexed: 11/29/2022]
Abstract
Background The Centers for Medicare and Medicaid Services projects that health care costs will continue to grow over the next few years. Rising readmission costs contribute significantly to increasing health care costs. Multiple areas of health care, including readmissions, have benefited from the application of various machine learning algorithms in several ways. Objective We aimed to identify suitable models for predicting readmission charges billed by hospitals. Our literature review revealed that this application of machine learning is underexplored. We used various predictive methods, ranging from glass-box models (such as regularization techniques) to black-box models (such as deep learning–based models). Methods We defined readmissions as readmission with the same major diagnostic category (RSDC) and all-cause readmission category (RADC). For these readmission categories, 576,701 and 1,091,580 individuals, respectively, were identified from the Nationwide Readmission Database of the Healthcare Cost and Utilization Project by the Agency for Healthcare Research and Quality for 2013. Linear regression, lasso regression, elastic net, ridge regression, eXtreme gradient boosting (XGBoost), and a deep learning model based on multilayer perceptron (MLP) were the 6 machine learning algorithms we tested for RSDC and RADC through 10-fold cross-validation. Results Our preliminary analysis using a data-driven approach revealed that within RADC, the subsequent readmission charge billed per patient was higher than the previous charge for 541,090 individuals, and this number was 319,233 for RSDC. The top 3 major diagnostic categories (MDCs) for such instances were the same for RADC and RSDC. The average readmission charge billed was higher than the previous charge for 21 of the MDCs in the case of RSDC, whereas it was only for 13 of the MDCs in RADC. We recommend XGBoost and the deep learning model based on MLP for predicting readmission charges. The following performance metrics were obtained for XGBoost: (1) RADC (mean absolute percentage error [MAPE]=3.121%; root mean squared error [RMSE]=0.414; mean absolute error [MAE]=0.317; root relative squared error [RRSE]=0.410; relative absolute error [RAE]=0.399; normalized RMSE [NRMSE]=0.040; mean absolute deviation [MAD]=0.031) and (2) RSDC (MAPE=3.171%; RMSE=0.421; MAE=0.321; RRSE=0.407; RAE=0.393; NRMSE=0.041; MAD=0.031). The performance metrics obtained for the MLP-based deep neural network are as follows: (1) RADC (MAPE=3.103%; RMSE=0.413; MAE=0.316; RRSE=0.410; RAE=0.397; NRMSE=0.040; MAD=0.031) and (2) RSDC (MAPE=3.202%; RMSE=0.427; MAE=0.326; RRSE=0.413; RAE=0.399; NRMSE=0.041; MAD=0.032). Repeated measures ANOVA revealed that the mean RMSE differed significantly across models with P<.001. Post hoc tests using the Bonferroni correction method indicated that the mean RMSE of the deep learning/XGBoost models was statistically significantly (P<.001) lower than that of all other models, namely linear regression/elastic net/lasso/ridge regression. Conclusions Models built using XGBoost and MLP are suitable for predicting readmission charges billed by hospitals. The MDCs allow models to accurately predict hospital readmission charges.
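Several of the error measures listed in this abstract have widely used textbook definitions, sketched below in plain Python. This is an illustration under assumptions rather than the authors' evaluation code: the function name `regression_errors` and the toy values are hypothetical, and NRMSE and MAD are omitted because their definitions vary between sources and the abstract does not say which were used.

```python
import math


def regression_errors(actual, predicted):
    """Common regression error measures, using their usual textbook definitions."""
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = [abs(p - a) for a, p in zip(actual, predicted)]
    sq_err = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return {
        "MAE": sum(abs_err) / n,
        "RMSE": math.sqrt(sum(sq_err) / n),
        # MAPE is undefined when any actual value is zero.
        "MAPE": 100 * sum(e / abs(a) for e, a in zip(abs_err, actual)) / n,
        # RAE and RRSE compare the model with always predicting the mean of `actual`.
        "RAE": sum(abs_err) / sum(abs(a - mean_actual) for a in actual),
        "RRSE": math.sqrt(sum(sq_err) / sum((a - mean_actual) ** 2 for a in actual)),
    }


# Toy values (hypothetical, not from the study).
errors = regression_errors(actual=[10, 12, 14, 16], predicted=[11, 11, 15, 15])
for name, value in errors.items():
    print(f"{name}: {value:.4f}")
```

RAE and RRSE values below 1 indicate that the model's errors are smaller than those of a predictor that always returns the mean of the observed values.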
Affiliation(s)
- Deepika Gopukumar
- Department of Health and Clinical Outcomes Research, School of Medicine, Saint Louis University, SALUS Center, 3545 Lafayette Ave., 4th floor, Room 409 B, St. Louis, US
- Abhijeet Ghoshal
- Department of Business Administration, Gies College of Business, University of Illinois Urbana-Champaign, Champaign, US
- Huimin Zhao
- Sheldon B. Lubar College of Business, University of Wisconsin-Milwaukee, Milwaukee, US