1
Farinella R, Felici A, Peduzzi G, Testoni SGG, Costello E, Aretini P, Blazquez-Encinas R, Oz E, Pastore A, Tacelli M, Otlu B, Campa D, Gentiluomo M. From classical approaches to artificial intelligence, old and new tools for PDAC risk stratification and prediction. Semin Cancer Biol 2025; 112:71-92. [PMID: 40147701 DOI: 10.1016/j.semcancer.2025.03.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2024] [Revised: 03/08/2025] [Accepted: 03/19/2025] [Indexed: 03/29/2025]
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is recognized as one of the most lethal malignancies, characterized by late-stage diagnosis and limited therapeutic options. Risk stratification has traditionally been performed using epidemiological studies and genetic analyses, through which key risk factors, including smoking, diabetes, chronic pancreatitis, and inherited predispositions, have been identified. However, the multifactorial nature of PDAC has often been insufficiently addressed by these methods, leading to limited precision in individualized risk assessments. Advances in artificial intelligence (AI) have been proposed as a transformative approach, allowing the integration of diverse datasets, spanning genetic, clinical, lifestyle, and imaging data, into dynamic models capable of uncovering novel interactions and risk profiles. In this review, the evolution of PDAC risk stratification is explored, with classical epidemiological frameworks compared to AI-driven methodologies. Genetic insights, including genome-wide association studies and polygenic risk scores, are discussed, alongside AI models such as machine learning, radiomics, and deep learning. Strengths and limitations of these approaches are evaluated, and challenges in clinical translation, such as data scarcity, model interpretability, and external validation, are addressed. Finally, future directions are proposed for combining classical and AI-driven methodologies to develop scalable, personalized predictive tools for PDAC, with the goal of improving early detection and patient outcomes.
Affiliation(s)
- Sabrina Gloria Giulia Testoni
- Division of Gastroenterology and Gastrointestinal Endoscopy, IRCCS Policlinico San Donato, Vita-Salute San Raffaele University, Milan, Italy
- Eithne Costello
- Liverpool Experimental Cancer Medicine Centre, University of Liverpool, Liverpool, United Kingdom
- Paolo Aretini
- Fondazione Pisana per la Scienza, San Giuliano Terme, Italy
- Ricardo Blazquez-Encinas
- Department of Cell Biology, Physiology and Immunology, University of Cordoba / Maimonides Biomedical Research Institute of Cordoba (IMIBIC), Cordoba, Spain
- Elif Oz
- Department of Biostatistics and Bioinformatics, Institute of Health Sciences, Acibadem Mehmet Ali Aydinlar University, Istanbul, Turkey
- Aldo Pastore
- Fondazione Pisana per la Scienza, San Giuliano Terme, Italy
- Matteo Tacelli
- Pancreas Translational & Clinical Research Center, Pancreato-Biliary Endoscopy and Endosonography Division, San Raffaele Scientific Institute IRCCS, Milan, Italy
- Burçak Otlu
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
- Daniele Campa
- Department of Biology, University of Pisa, Pisa, Italy
2
Felici A, Peduzzi G, Pellungrini R, Campa D. Artificial intelligence to predict cancer risk, are we there yet? A comprehensive review across cancer types. Eur J Cancer 2025; 222:115440. [PMID: 40273730 DOI: 10.1016/j.ejca.2025.115440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2025] [Accepted: 03/25/2025] [Indexed: 04/26/2025]
Abstract
Cancer remains the second leading cause of death worldwide, representing a substantial challenge to global health. Although traditional risk prediction models have played a crucial role in the epidemiology of several cancer types, they have limitations, especially in their ability to process complex and multidimensional data. In contrast, artificial intelligence (AI) approaches represent a promising solution to overcome this limitation. AI techniques have the potential to identify complex patterns and relationships in data that traditional methods might overlook, making them especially useful for handling the large and heterogeneous datasets analysed in cancer research. This review first examines the current state of the art of AI techniques, highlighting their differences and suitability for various data types. It then offers a comprehensive analysis of the literature, focusing on the application of AI approaches in nineteen cancer types (bladder cancer, breast cancer, cervical cancer, colorectal cancer, endometrial cancer, esophageal cancer, gastric cancer, gynaecological cancers, head and neck cancer, haematological cancers, kidney cancer, liver cancer, lung cancer, melanoma, ovarian cancer, pancreatic cancer, prostate cancer, thyroid cancer and overall cancer), evaluating the models, metrics, and exposure variables used. Finally, the review discusses the application of AI in clinical practice, along with an assessment of its potential limitations and future directions.
Affiliation(s)
- Alessio Felici
- Department of Biology, University of Pisa, Via Luca Ghini, 13, Pisa 56126, Italy
- Giulia Peduzzi
- Department of Biology, University of Pisa, Via Luca Ghini, 13, Pisa 56126, Italy
- Roberto Pellungrini
- Classe di scienze, Scuola Normale Superiore, Piazza dei Cavalieri, 7, Pisa 56126, Italy
- Daniele Campa
- Department of Biology, University of Pisa, Via Luca Ghini, 13, Pisa 56126, Italy.
3
Zhu W, Chen L, Aphinyanaphongs Y, Kastrinos F, Simeone DM, Pochapin M, Stender C, Razavian N, Gonda TA. Identification of patients at risk for pancreatic cancer in a 3-year timeframe based on machine learning algorithms. Sci Rep 2025; 15:11697. [PMID: 40188106 PMCID: PMC11972345 DOI: 10.1038/s41598-025-89607-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2024] [Accepted: 02/06/2025] [Indexed: 04/07/2025] [Open Access]
Abstract
Early detection of pancreatic cancer (PC) remains challenging largely due to the low population incidence and few known risk factors. However, screening in at-risk populations and detection of early cancer have the potential to significantly alter survival. In this study, we aim to develop a predictive model to identify patients at risk of developing new-onset PC within a 2.5- to 3-year time frame. We used the Electronic Health Records (EHR) of a large medical system from 2000 to 2021 (N = 537,410). The EHR data analyzed in this work consist of patients' demographic information, diagnosis records, and lab values, which are used to identify patients who were diagnosed with pancreatic cancer and the risk factors used in the machine learning algorithm for prediction. We identified 73 risk factors for pancreatic cancer with a Phenome-wide Association Study (PheWAS) on a matched case-control cohort. Based on them, we built a large-scale machine learning algorithm based on EHR. A temporally stratified validation based on patients not included in any stage of the training of the model was performed. This model showed an AUROC of 0.742 [0.727, 0.757], which was similar in both the general population and a subset of the population that had prior cross-sectional imaging. The rate of diagnosis of pancreatic cancer among those in the top 1 percentile of the risk score was sixfold higher than in the general population. Our model leverages data extracted from a 6-month window of time in the electronic health record to identify patients at nearly sixfold higher than baseline risk of developing pancreatic cancer 2.5-3 years from evaluation. This approach offers an opportunity to define an enriched population entirely based on static data, where current screening may be recommended.
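The "top 1 percentile" enrichment quoted in this abstract is the ratio between the case rate among the highest-scoring patients and the case rate in the whole cohort. The following is a minimal Python sketch of that calculation, not the authors' code: the function name `top_fraction_enrichment` and the toy cohort (10,000 synthetic patients, 100 cases) are assumptions made up for illustration and do not come from the study.

```python
def top_fraction_enrichment(scores, labels, top_fraction=0.01):
    """Case rate in the top-scoring fraction divided by the overall case rate.

    scores: risk scores (higher means higher predicted risk)
    labels: 1 for a case, 0 for a non-case, aligned with scores
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and aligned")
    total_cases = sum(labels)
    if total_cases == 0:
        raise ValueError("at least one case is required")
    n_total = len(scores)
    n_top = max(1, round(n_total * top_fraction))
    # Rank patients from highest to lowest score and keep the top slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_cases = sum(label for _, label in ranked[:n_top])
    # (top_cases / n_top) / (total_cases / n_total), written with one division.
    return (top_cases * n_total) / (n_top * total_cases)


# Synthetic toy cohort (hypothetical, not study data): scores decrease with index.
n_patients = 10_000
scores = [1.0 - i / n_patients for i in range(n_patients)]
labels = [0] * n_patients
for i in range(6):            # 6 cases fall inside the top 100 scores
    labels[i] = 1
for k in range(2, 96):        # 94 further cases spread over the rest
    labels[100 * k] = 1

enrichment = top_fraction_enrichment(scores, labels, top_fraction=0.01)
print(enrichment)  # 6.0
```

In this toy cohort the baseline case rate is 1% and the rate in the top 1% of scores is 6%, so the enrichment is sixfold; the numbers were chosen only to mirror the order of magnitude reported above.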
Affiliation(s)
- Weicheng Zhu
- Center for Data Science, New York University, New York, NY, USA
- Long Chen
- Center for Data Science, New York University, New York, NY, USA
- Yindalon Aphinyanaphongs
- Department of Population Health, New York University Grossman School of Medicine, 227 East 30th Street, 6th Floor, New York, NY, 10016, USA
- Fay Kastrinos
- Department of Medicine, Division of Digestive and Liver Diseases, Columbia University Irving Medical Center, New York, NY, USA
- Diane M Simeone
- Moores Cancer Center, UC San Diego Health, San Diego, CA, USA
- Mark Pochapin
- Division of Gastroenterology and Hepatology, Department of Medicine, New York University, 240 East 38th Street, 23rd Floor, New York, NY, 10016, USA
- Cody Stender
- Department of Surgery, New York University, New York, NY, USA
- Narges Razavian
- Department of Population Health, New York University Grossman School of Medicine, 227 East 30th Street, 6th Floor, New York, NY, 10016, USA.
- Tamas A Gonda
- Division of Gastroenterology and Hepatology, Department of Medicine, New York University, 240 East 38th Street, 23rd Floor, New York, NY, 10016, USA.
4
Peduzzi G, Felici A, Pellungrini R, Campa D. Explainable machine learning identifies a polygenic risk score as a key predictor of pancreatic cancer risk in the UK Biobank. Dig Liver Dis 2025; 57:915-922. [PMID: 39632152 DOI: 10.1016/j.dld.2024.11.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2024] [Revised: 11/11/2024] [Accepted: 11/12/2024] [Indexed: 12/07/2024]
Abstract
BACKGROUND Predicting the risk of developing pancreatic ductal adenocarcinoma (PDAC) is of paramount importance, given its high mortality rate. Current PDAC risk prediction models rely on a limited number of variables, do not include genetics, and have modest accuracy. AIM This study aimed to develop an interpretable PDAC risk prediction model based on machine learning (ML). METHODS Five ML models (Adaptive Boosting, eXtreme Gradient Boosting, CatBoost, Deep Forest and Random Forest) built on 56 exposome variables and a polygenic risk score (PRS) were tested in 654 PDAC cases and 1,308 controls of the UK Biobank. Additionally, SHapley Additive exPlanation (SHAP) and Global model Interpretation via the Recursive Partitioning (Girp) were employed to explain the models. RESULTS All models provided similar performance, but based on recall the best was CatBoost (77.10%). SHAP highlighted age and the PRS as primary contributors across all models. Girp developed rules to discern cases from controls, identifying age, PRS, and pancreatitis in most of the rules. CONCLUSION The predictive models tested exhibited good performance, indicating their potential application in the clinical field in the near future, with the PRS playing a key role in identifying high-risk individuals, as demonstrated by the explainers.
Affiliation(s)
- Giulia Peduzzi
- Department of Biology, University of Pisa, Via Luca Ghini, 13 - 56126, Pisa, Italy.
- Alessio Felici
- Department of Biology, University of Pisa, Via Luca Ghini, 13 - 56126, Pisa, Italy.
- Roberto Pellungrini
- Classe di scienze, Scuola Normale Superiore, Piazza dei Cavalieri, 7 - 56126, Pisa, Italy.
- Daniele Campa
- Department of Biology, University of Pisa, Via Luca Ghini, 13 - 56126, Pisa, Italy.
5
He J, Rasmy L, Zhi D, Tao C. Advancing Pancreatic Cancer Prediction with a Next Visit Token Prediction Head on Top of Med-BERT. Cancers (Basel) 2025; 17:516. [PMID: 39941883 PMCID: PMC11816036 DOI: 10.3390/cancers17030516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2025] [Revised: 01/27/2025] [Accepted: 01/30/2025] [Indexed: 02/16/2025] [Open Access]
Abstract
BACKGROUND Electronic Health Records (EHRs) encompass valuable data essential for disease prediction. The application of artificial intelligence (AI), particularly deep learning, significantly enhances disease prediction by analyzing extensive EHR datasets to identify hidden patterns, facilitating early detection. Recently, numerous foundation models pretrained on extensive data have demonstrated efficacy in disease prediction using EHRs. However, there remain some unanswered questions on how to best utilize such models, especially with very small fine-tuning cohorts. METHODS We utilized Med-BERT, an EHR-specific foundation model, and reformulated the disease binary prediction task into a token prediction task and a next visit mask token prediction task to align with Med-BERT's pretraining task format in order to improve the accuracy of pancreatic cancer (PaCa) prediction in both few-shot and fully supervised settings. RESULTS The reformulation of the task into a token prediction task, referred to as Med-BERT-Sum, demonstrated slightly superior performance in both few-shot scenarios and larger data samples. Furthermore, reformulating the prediction task as a Next Visit Mask Token Prediction task (Med-BERT-Mask) significantly outperformed the conventional Binary Classification (BC) prediction task (Med-BERT-BC) by 3% to 7% in few-shot scenarios with data sizes ranging from 10 to 500 samples. These findings highlight that aligning the downstream task with Med-BERT's pretraining objectives substantially enhances the model's predictive capabilities, thereby improving its effectiveness in predicting both rare and common diseases. CONCLUSIONS Reformatting disease prediction tasks to align with the pretraining of foundation models enhances prediction accuracy, leading to earlier detection and timely intervention. This approach improves treatment effectiveness, survival rates, and overall patient outcomes for PaCa and potentially other cancers.
Affiliation(s)
- Jianping He
- McWilliams School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
- Laila Rasmy
- McWilliams School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
- Degui Zhi
- McWilliams School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
- Cui Tao
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL 32224, USA
6
Steinfeldt J, Wild B, Buergel T, Pietzner M, Upmeier Zu Belzen J, Vauvelle A, Hegselmann S, Denaxas S, Hemingway H, Langenberg C, Landmesser U, Deanfield J, Eils R. Medical history predicts phenome-wide disease onset and enables the rapid response to emerging health threats. Nat Commun 2025; 16:585. [PMID: 39794311 PMCID: PMC11724087 DOI: 10.1038/s41467-025-55879-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2024] [Accepted: 01/02/2025] [Indexed: 01/13/2025] [Open Access]
Abstract
The COVID-19 pandemic exposed a global deficiency of systematic, data-driven guidance to identify high-risk individuals. Here, we illustrate the utility of routinely recorded medical history to predict the risk for 1741 diseases across clinical specialties and support the rapid response to emerging health threats such as COVID-19. We developed a neural network to learn from health records of 502,489 UK Biobank participants. Importantly, we observed discriminative improvements over basic demographic predictors for 1546 (88.8%) endpoints. After transferring the unmodified risk models to the All of Us cohort, we replicated these improvements for 1115 (78.9%) of 1414 investigated endpoints, demonstrating generalizability across healthcare systems and historically underrepresented groups. Ultimately, we showed how this approach could have been used to identify individuals vulnerable to severe COVID-19. Our study demonstrates the potential of medical history to support guidance for emerging pandemics by systematically estimating risk for thousands of diseases at once at minimal cost.
Affiliation(s)
- Jakob Steinfeldt
- Department of Cardiology, Angiology and Intensive Care Medicine, Deutsches Herzzentrum der Charité (DHZC), Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Klinik/Centrum, Berlin, Germany
- Computational Medicine, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- Friede Springer Cardiovascular Prevention Center@Charite, Charite - University Medicine Berlin, Berlin, Germany
- Institute of Cardiovascular Sciences, University College London, London, UK
- Benjamin Wild
- Institute of Cardiovascular Sciences, University College London, London, UK
- Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- Thore Buergel
- Institute of Cardiovascular Sciences, University College London, London, UK
- Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- Maik Pietzner
- Computational Medicine, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
- Precision Health University Research Institute, Queen Mary University of London and Barts NHS Trust, London, UK
- Julius Upmeier Zu Belzen
- Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- Andre Vauvelle
- Institute of Health Informatics, University College London, London, UK
- Stefan Hegselmann
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Massachusetts, USA
- Pattern Recognition and Image Analysis Lab, University of Münster, Münster, Germany
- Spiros Denaxas
- Institute of Health Informatics, University College London, London, UK
- British Heart Foundation Data Science Centre, London, UK
- Health Data Research UK, London, UK
- National Institute for Health Research, Biomedical Research Centre at University College London Hospitals, London, UK
- Harry Hemingway
- Institute of Health Informatics, University College London, London, UK
- Health Data Research UK, London, UK
- National Institute for Health Research, Biomedical Research Centre at University College London Hospitals, London, UK
- Claudia Langenberg
- Computational Medicine, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
- Precision Health University Research Institute, Queen Mary University of London and Barts NHS Trust, London, UK
- Ulf Landmesser
- Department of Cardiology, Angiology and Intensive Care Medicine, Deutsches Herzzentrum der Charité (DHZC), Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Klinik/Centrum, Berlin, Germany
- Friede Springer Cardiovascular Prevention Center@Charite, Charite - University Medicine Berlin, Berlin, Germany
- Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, Berlin, Germany
- John Deanfield
- Institute of Cardiovascular Sciences, University College London, London, UK
- Roland Eils
- Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany.
- Health Data Science Unit, Heidelberg University Hospital and BioQuant, Heidelberg, Germany.
7
Haue AD, Hjaltelin JX, Holm PC, Placido D, Brunak S. Artificial intelligence-aided data mining of medical records for cancer detection and screening. Lancet Oncol 2024; 25:e694-e703. [PMID: 39637906 DOI: 10.1016/s1470-2045(24)00277-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 05/08/2024] [Accepted: 05/10/2024] [Indexed: 12/07/2024]
Abstract
The application of artificial intelligence methods to electronic patient records paves the way for large-scale analysis of multimodal data. Such population-wide data describing deep phenotypes composed of thousands of features are now being leveraged to create data-driven algorithms, which in turn has led to improved methods for early cancer detection and screening. Remaining challenges include establishment of infrastructures for prospective testing of such methods, ways to assess biases given the data, and gathering of sufficiently large and diverse datasets that reflect disease heterogeneities across populations. This Review provides an overview of artificial intelligence methods designed to detect cancer early, including key aspects of concern (eg, the problem of data drift, in which the underlying health-care data change over time), ethical aspects, and discrepancies between access to cancer screening in high-income countries versus low-income and middle-income countries.
Affiliation(s)
- Amalie Dahl Haue
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; Copenhagen University Hospital Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Jessica Xin Hjaltelin
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Peter Christoffer Holm
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Davide Placido
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; Copenhagen University Hospital Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Søren Brunak
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; Copenhagen University Hospital Rigshospitalet, University of Copenhagen, Copenhagen, Denmark.
8
Mishra AK, Chong B, Arunachalam SP, Oberg AL, Majumder S. Machine Learning Models for Pancreatic Cancer Risk Prediction Using Electronic Health Record Data: A Systematic Review and Assessment. Am J Gastroenterol 2024; 119:1466-1482. [PMID: 38752654 PMCID: PMC11296923 DOI: 10.14309/ajg.0000000000002870] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/01/2023] [Accepted: 05/06/2024] [Indexed: 06/20/2024]
Abstract
INTRODUCTION Accurate risk prediction can facilitate screening and early detection of pancreatic cancer (PC). We conducted a systematic review to critically evaluate the effectiveness of machine learning (ML) and artificial intelligence (AI) techniques applied to electronic health records (EHR) for PC risk prediction. METHODS Ovid MEDLINE(R), Ovid EMBASE, Ovid Cochrane Central Register of Controlled Trials, Ovid Cochrane Database of Systematic Reviews, Scopus, and Web of Science were searched for articles that utilized ML/AI techniques to predict PC, published between January 1, 2012, and February 1, 2024. Study selection and data extraction were conducted by 2 independent reviewers. Critical appraisal and data extraction were performed using the CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies (CHARMS). Risk of bias and applicability were examined using the Prediction model Risk Of Bias Assessment Tool (PROBAST). RESULTS Thirty studies including 169,149 PC cases were identified. Logistic regression was the most frequent modeling method. Twenty studies utilized a curated set of known PC risk predictors or those identified by clinical experts. ML model discrimination performance (C-index) ranged from 0.57 to 1.0. Missing data were underreported, and most studies did not implement explainable-AI techniques or report exclusion time intervals. DISCUSSION AI/ML models for PC risk prediction using known risk factors perform reasonably well and may have near-term applications in identifying cohorts for targeted PC screening if validated in real-world data sets. The combined use of structured and unstructured EHR data using emerging AI models while incorporating explainable-AI techniques has the potential to identify novel PC risk factors, and this approach merits further study.
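For a binary outcome, the C-index reported in this abstract is the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case, with ties counted as one half; 0.5 corresponds to chance and 1.0 to perfect discrimination. The following is a minimal Python sketch of that pairwise definition. The function name `c_index` and the six-patient toy data are assumptions for illustration only and are not taken from any of the reviewed studies.

```python
from itertools import product


def c_index(scores, labels):
    """Concordance index for a binary outcome (equivalent to the AUROC).

    Counts, over all case/non-case pairs, how often the case has the
    higher score; a tied pair contributes 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score, control_score in product(cases, controls):
        if case_score > control_score:
            concordant += 1.0
        elif case_score == control_score:
            concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical toy data: two cases, four non-cases.
toy_scores = [0.9, 0.4, 0.6, 0.3, 0.4, 0.1]
toy_labels = [1, 1, 0, 0, 0, 0]
print(c_index(toy_scores, toy_labels))  # 0.8125
```

The pairwise loop is quadratic in cohort size, which is fine for illustration; rank-based formulas give the same value more efficiently on large cohorts.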
Affiliation(s)
- Anup Kumar Mishra
- Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Bradford Chong
- Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Ann L. Oberg
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
- Shounak Majumder
- Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
9
Mukund A, Afridi MA, Karolak A, Park MA, Permuth JB, Rasool G. Pancreatic Ductal Adenocarcinoma (PDAC): A Review of Recent Advancements Enabled by Artificial Intelligence. Cancers (Basel) 2024; 16:2240. [PMID: 38927945 PMCID: PMC11201559 DOI: 10.3390/cancers16122240] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 05/07/2024] [Revised: 06/03/2024] [Accepted: 06/12/2024] [Indexed: 06/28/2024] [Open Access]
Abstract
Pancreatic Ductal Adenocarcinoma (PDAC) remains one of the most formidable challenges in oncology, characterized by its late detection and poor prognosis. Artificial intelligence (AI) and machine learning (ML) are emerging as pivotal tools in revolutionizing PDAC care across various dimensions. Consequently, many studies have focused on using AI to improve the standard of PDAC care. This review article attempts to consolidate the literature from the past five years to identify high-impact, novel, and meaningful studies focusing on their transformative potential in PDAC management. Our analysis spans a broad spectrum of applications, including but not limited to patient risk stratification, early detection, and prediction of treatment outcomes, thereby highlighting AI's potential role in enhancing the quality and precision of PDAC care. By categorizing the literature into discrete sections reflective of a patient's journey from screening and diagnosis through treatment and survivorship, this review offers a comprehensive examination of AI-driven methodologies in addressing the multifaceted challenges of PDAC. Each study is summarized by explaining the dataset, ML model, evaluation metrics, and impact the study has on improving PDAC-related outcomes. We also discuss prevailing obstacles and limitations inherent in the application of AI within the PDAC context, offering insightful perspectives on potential future directions and innovations.
Affiliation(s)
- Ashwin Mukund
- Department of Machine Learning, Moffitt Cancer Center and Research Institute, 12902 USF Magnolia Drive, Tampa, FL 33612, USA
- Muhammad Ali Afridi
- School of Electrical Engineering and Computer Science (SEECS), National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan
- Aleksandra Karolak
- Department of Machine Learning, Moffitt Cancer Center and Research Institute, 12902 USF Magnolia Drive, Tampa, FL 33612, USA
- Margaret A. Park
- Departments of Cancer Epidemiology and Gastrointestinal Oncology, Moffitt Cancer Center and Research Institute, 12902 USF Magnolia Drive, Tampa, FL 33612, USA
- Jennifer B. Permuth
- Departments of Cancer Epidemiology and Gastrointestinal Oncology, Moffitt Cancer Center and Research Institute, 12902 USF Magnolia Drive, Tampa, FL 33612, USA
- Ghulam Rasool
- Department of Machine Learning, Moffitt Cancer Center and Research Institute, 12902 USF Magnolia Drive, Tampa, FL 33612, USA
10
Sarwal D, Wang L, Gandhi S, Sagheb Hossein Pour E, Janssens LP, Delgado AM, Doering KA, Mishra AK, Greenwood JD, Liu H, Majumder S. Identification of pancreatic cancer risk factors from clinical notes using natural language processing. Pancreatology 2024; 24:572-578. [PMID: 38693040 DOI: 10.1016/j.pan.2024.03.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Revised: 03/20/2024] [Accepted: 03/23/2024] [Indexed: 05/03/2024]
Abstract
OBJECTIVES Screening for pancreatic ductal adenocarcinoma (PDAC) is considered in high-risk individuals (HRIs) with established PDAC risk factors, such as family history and germline mutations in PDAC susceptibility genes. Accurate assessment of risk factor status is provider knowledge-dependent and requires extensive manual chart review by experts. Natural Language Processing (NLP) has shown promise in automated data extraction from the electronic health record (EHR). We aimed to use NLP for automated extraction of PDAC risk factors from unstructured clinical notes in the EHR. METHODS We first developed rule-based NLP algorithms to extract PDAC risk factors at the document-level, using an annotated corpus of 2091 clinical notes. Next, we further improved the NLP algorithms using a cohort of 1138 patients through patient-level training, validation, and testing, with comparison against a pre-specified reference standard. To minimize false-negative results we prioritized algorithm recall. RESULTS In the test set (n = 807), the NLP algorithms achieved a recall of 0.933, precision of 0.790, and F1-score of 0.856 for family history of PDAC. For germline genetic mutations, the algorithm had a high recall of 0.851, while precision and F1-score were lower at 0.350 and 0.496 respectively. Most false positives for germline mutations resulted from erroneous recognition of tissue mutations. CONCLUSIONS Rule-based NLP algorithms applied to unstructured clinical notes are highly sensitive for automated identification of PDAC risk factors. Further validation in a large primary-care patient population is warranted to assess real-world utility in identifying HRIs for pancreatic cancer screening.
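The F1-scores reported in this abstract follow from the reported precision and recall, since F1 is their harmonic mean: F1 = 2PR / (P + R). The following is a small Python check using only the figures quoted above; the function name `f1_from_precision_recall` is ours and is not taken from the study.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (returns 0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract for the test set (n = 807).
family_history_f1 = f1_from_precision_recall(precision=0.790, recall=0.933)
germline_f1 = f1_from_precision_recall(precision=0.350, recall=0.851)

print(round(family_history_f1, 3))  # 0.856
print(round(germline_f1, 3))        # 0.496
```

Because the harmonic mean is pulled toward the smaller of the two values, the low precision for germline mutations (0.350) drags the F1-score down to 0.496 despite a recall of 0.851.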
Affiliation(s)
- Dhruv Sarwal
- Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Liwei Wang
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, USA
- Sonal Gandhi
- Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Laurens P Janssens
- Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Adriana M Delgado
- Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Karen A Doering
- Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Anup Kumar Mishra
- Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Jason D Greenwood
- Department of Family Medicine, Mayo Clinic, Rochester, MN, USA; Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
- Hongfang Liu
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, USA
- Shounak Majumder
- Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA.
11
|
Wang H, Shen B, Jia P, Li H, Bai X, Li Y, Xu K, Hu P, Ding L, Xu N, Xia X, Fang Y, Chen H, Zhang Y, Yue S. Guiding post-pancreaticoduodenectomy interventions for pancreatic cancer patients utilizing decision tree models. Front Oncol 2024; 14:1399297. [PMID: 38873261 PMCID: PMC11169653 DOI: 10.3389/fonc.2024.1399297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 04/29/2024] [Indexed: 06/15/2024]
Abstract
Background Pancreatic ductal adenocarcinoma (PDAC) is frequently diagnosed in advanced stages, necessitating pancreaticoduodenectomy (PD) as a primary therapeutic approach. However, PD surgery can lead to complex complications. Thus, understanding the factors influencing postoperative complications documented in electronic medical records and their impact on survival rates is crucial for improving overall patient outcomes. Methods A total of 749 patients were divided into two groups: 598 (79.84%) chose robotic pancreaticoduodenectomy (RPD) and 151 (20.16%) chose laparoscopic pancreaticoduodenectomy (LPD). We used correlation analysis, survival analysis, and decision tree models to identify similarities and differences in postoperative complications and prognostic survival. Results Pancreatic cancer, known for its aggressiveness, often requires pancreaticoduodenectomy as an effective treatment. In predictive models, both BMI and surgery duration weigh heavily. Lower BMI correlates with longer survival, while patients with heart disease and diabetes have lower survival rates. Complications such as delayed gastric emptying (DGE), pancreatic fistula, and infection are closely linked after surgery, prompting conjectures about their causal mechanisms. Interestingly, we found no significant correlation between nasogastric tube removal timing and DGE, suggesting that the tube can be removed promptly after decompression. Conclusion This study aimed to explore predictive factors for postoperative complications and survival in PD patients. Effective predictive models enable early identification of high-risk individuals, allowing timely interventions. Higher BMI, heart disease, or diabetes significantly reduce survival rates in pancreatic cancer patients post-PD. Additionally, there is no significant correlation between DGE incidence and postoperative extubation time, necessitating further investigation into its interaction with pancreatic fistula and infection.
Affiliation(s)
- Haixin Wang
  - Department of Cadre Medical, The First Medical Centre, Chinese People's Liberation Army (PLA) General Hospital, Beijing, China
- Bo Shen
  - Department of Respiratory and Critical Care Medicine, The Eighth Medical Centre, Chinese People's Liberation Army (PLA) General Hospital, Beijing, China
- Peiheng Jia
  - Academy of Military Medical Science, Beijing, China
- Hao Li
  - Academy of Military Medical Science, Beijing, China
- Xuemei Bai
  - Academy of Military Medical Science, Beijing, China
- Yaru Li
  - Academy of Military Medical Science, Beijing, China
- Kang Xu
  - School of Software, Shandong University, Jinan, China
- Pengzhen Hu
  - Academy of Military Medical Science, Beijing, China
  - Northwestern Polytechnical University School of Life Sciences, Xi'an, China
- Li Ding
  - Department of Cadre Medical, The First Medical Centre, Chinese People's Liberation Army (PLA) General Hospital, Beijing, China
- Na Xu
  - Department of Cadre Medical, The First Medical Centre, Chinese People's Liberation Army (PLA) General Hospital, Beijing, China
- Xiaoxiao Xia
  - Department of Cadre Medical, The First Medical Centre, Chinese People's Liberation Army (PLA) General Hospital, Beijing, China
- Yong Fang
  - College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao, China
- Hebing Chen
  - Academy of Military Medical Science, Beijing, China
- Yan Zhang
  - Department of Cadre Medical, The First Medical Centre, Chinese People's Liberation Army (PLA) General Hospital, Beijing, China
- Shutong Yue
  - College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao, China
|
12
|
Steinfeldt J, Wild B, Buergel T, Pietzner M, Upmeier Zu Belzen J, Vauvelle A, Hegselmann S, Denaxas S, Hemingway H, Langenberg C, Landmesser U, Deanfield J, Eils R. Medical history predicts phenome-wide disease onset and enables the rapid response to emerging health threats. Nat Commun 2024; 15:4257. [PMID: 38763986 PMCID: PMC11102902 DOI: 10.1038/s41467-024-48568-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 05/03/2024] [Indexed: 05/21/2024]
Abstract
The COVID-19 pandemic exposed a global deficiency of systematic, data-driven guidance to identify high-risk individuals. Here, we illustrate the utility of routinely recorded medical history to predict the risk for 1883 diseases across clinical specialties and support the rapid response to emerging health threats such as COVID-19. We developed a neural network to learn from the health records of 502,460 UK Biobank participants. Importantly, we observed discriminative improvements over basic demographic predictors for 1774 (94.3%) endpoints. After transferring the unmodified risk models to the All of Us cohort, we replicated these improvements for 1347 (89.8%) of 1500 investigated endpoints, demonstrating generalizability across healthcare systems and historically underrepresented groups. Ultimately, we showed how this approach could have been used to identify individuals vulnerable to severe COVID-19. Our study demonstrates the potential of medical history to support guidance for emerging pandemics by systematically estimating risk for thousands of diseases at once at minimal cost.
Affiliation(s)
- Jakob Steinfeldt
  - Department of Cardiology, Angiology and Intensive Care Medicine, Deutsches Herzzentrum der Charité (DHZC), Berlin, Germany
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Klinik/Centrum, Charitéplatz 1, 10117, Berlin, Germany
  - Computational Medicine, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
  - Friede Springer Cardiovascular Prevention Center@Charite, Charite - University Medicine Berlin, Berlin, Germany
  - Institute of Cardiovascular Sciences, University College London, London, UK
- Benjamin Wild
  - Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- Thore Buergel
  - Institute of Cardiovascular Sciences, University College London, London, UK
  - Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- Maik Pietzner
  - Computational Medicine, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
  - MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
  - Precision Health University Research Institute, Queen Mary University of London and Barts NHS Trust, London, UK
- Julius Upmeier Zu Belzen
  - Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- Andre Vauvelle
  - Institute of Health Informatics, University College London, London, UK
- Stefan Hegselmann
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Massachusetts, USA
  - Pattern Recognition and Image Analysis Lab, University of Münster, Münster, Germany
- Spiros Denaxas
  - Institute of Health Informatics, University College London, London, UK
  - British Heart Foundation Data Science Centre, London, UK
  - Health Data Research UK, London, UK
  - National Institute for Health Research, Biomedical Research Centre at University College London Hospitals National Institute for Health Research, Biomedical Research Centre, London, UK
- Harry Hemingway
  - Institute of Health Informatics, University College London, London, UK
  - Health Data Research UK, London, UK
  - National Institute for Health Research, Biomedical Research Centre at University College London Hospitals National Institute for Health Research, Biomedical Research Centre, London, UK
- Claudia Langenberg
  - Computational Medicine, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
  - MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
  - Precision Health University Research Institute, Queen Mary University of London and Barts NHS Trust, London, UK
- Ulf Landmesser
  - Department of Cardiology, Angiology and Intensive Care Medicine, Deutsches Herzzentrum der Charité (DHZC), Berlin, Germany
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Klinik/Centrum, Charitéplatz 1, 10117, Berlin, Germany
  - Friede Springer Cardiovascular Prevention Center@Charite, Charite - University Medicine Berlin, Berlin, Germany
  - Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
  - DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, Berlin, Berlin, Germany
- John Deanfield
  - Institute of Cardiovascular Sciences, University College London, London, UK
- Roland Eils
  - Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
  - Health Data Science Unit, Heidelberg University Hospital and BioQuant, Heidelberg, Germany
|
13
|
Daher H, Punchayil SA, Ismail AAE, Fernandes RR, Jacob J, Algazzar MH, Mansour M. Advancements in Pancreatic Cancer Detection: Integrating Biomarkers, Imaging Technologies, and Machine Learning for Early Diagnosis. Cureus 2024; 16:e56583. [PMID: 38646386 PMCID: PMC11031195 DOI: 10.7759/cureus.56583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/20/2024] [Indexed: 04/23/2024]
Abstract
Artificial intelligence (AI) has come to play a pivotal role in revolutionizing medical practices, particularly in the field of pancreatic cancer detection and management. As a leading cause of cancer-related deaths, pancreatic cancer warrants innovative approaches due to its typically advanced stage at diagnosis and dismal survival rates. Present detection methods, constrained by limitations in accuracy and efficiency, underscore the necessity for novel solutions. AI-driven methodologies present promising avenues for enhancing early detection and prognosis forecasting. Through the analysis of imaging data, biomarker profiles, and clinical information, AI algorithms excel in discerning subtle abnormalities indicative of pancreatic cancer with remarkable precision. Moreover, machine learning (ML) algorithms facilitate the amalgamation of diverse data sources to optimize patient care. However, despite its huge potential, the implementation of AI in pancreatic cancer detection faces various challenges. Issues such as the scarcity of comprehensive datasets, biases in algorithm development, and concerns regarding data privacy and security necessitate thorough scrutiny. While AI offers immense promise in transforming pancreatic cancer detection and management, ongoing research and collaborative efforts are indispensable in overcoming technical hurdles and ethical dilemmas. This review delves into the evolution of AI, its application in pancreatic cancer detection, and the challenges and ethical considerations inherent in its integration.
Affiliation(s)
- Hisham Daher
  - Internal Medicine, University of Debrecen, Debrecen, HUN
- Sneha A Punchayil
  - Internal Medicine, University Hospital of North Tees, Stockton-on-Tees, GBR
- Joel Jacob
  - General Medicine, Diana Princess of Wales Hospital, Grimsby, GBR
- Mohammad Mansour
  - General Medicine, University of Debrecen, Debrecen, HUN
  - General Medicine, Jordan University Hospital, Amman, JOR
|
14
|
Rawlani P, Ghosh NK, Kumar A. Role of artificial intelligence in the characterization of indeterminate pancreatic head mass and its usefulness in preoperative diagnosis. Artif Intell Gastroenterol 2023; 4:48-63. [DOI: 10.35712/aig.v4.i3.48] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 09/11/2023] [Accepted: 10/08/2023] [Indexed: 12/07/2023]
Abstract
Artificial intelligence (AI) has been used in various fields of day-to-day life, and its role in medicine is immense. Understanding of oncology has improved with the introduction of AI, which helps in diagnosis, treatment planning, management, prognosis, and follow-up. It also helps to identify high-risk groups who can be subjected to timely screening for early detection of malignant conditions. This is especially important in pancreatic cancer, as it is one of the major causes of cancer-related deaths worldwide and there are no specific early features (clinical or radiological) for diagnosis. Even with improved imaging modalities (computed tomography, magnetic resonance imaging, endoscopic ultrasound), clinicians are often challenged by lesions that are difficult to diagnose by human assessment alone. AI has been used in various other branches of medicine to differentiate such indeterminate lesions, including those of the thyroid gland, breast, lungs, liver, adrenal gland, and kidney. In the case of pancreatic cancer, the role of AI has been explored and work is still ongoing. This review article focuses on how AI can be used to diagnose pancreatic cancer early or differentiate it from benign pancreatic lesions, so that management can be planned at an earlier stage.
Affiliation(s)
- Palash Rawlani
  - Department of Surgical Gastroenterology, Sanjay Gandhi Post Graduate Institute of Medical Sciences, Lucknow 226014, Uttar Pradesh, India
- Nalini Kanta Ghosh
  - Department of Surgical Gastroenterology, Sanjay Gandhi Post Graduate Institute of Medical Sciences, Lucknow 226014, Uttar Pradesh, India
- Ashok Kumar
  - Department of Surgical Gastroenterology, Sanjay Gandhi Post Graduate Institute of Medical Sciences, Lucknow 226014, Uttar Pradesh, India
|
15
|
Jia K, Kundrot S, Palchuk MB, Warnick J, Haapala K, Kaplan ID, Rinard M, Appelbaum L. A pancreatic cancer risk prediction model (Prism) developed and validated on large-scale US clinical data. EBioMedicine 2023; 98:104888. [PMID: 38007948 PMCID: PMC10755107 DOI: 10.1016/j.ebiom.2023.104888] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 05/11/2023] [Revised: 11/03/2023] [Accepted: 11/10/2023] [Indexed: 11/28/2023]
Abstract
BACKGROUND Pancreatic Duct Adenocarcinoma (PDAC) screening can enable early-stage disease detection and long-term survival. Current guidelines use inherited predisposition, with about 10% of PDAC cases eligible for screening. Using Electronic Health Record (EHR) data from a multi-institutional federated network, we developed and validated a PDAC RISk Model (Prism) for the general US population to extend early PDAC detection. METHODS Neural Network (PrismNN) and Logistic Regression (PrismLR) were developed using EHR data from 55 US Health Care Organisations (HCOs) to predict PDAC risk 6-18 months before diagnosis for patients 40 years or older. Model performance was assessed using Area Under the Curve (AUC) and calibration plots. Models were internal-externally validated by geographic location, race, and time. Simulated model deployment evaluated Standardised Incidence Ratio (SIR) and other metrics. FINDINGS With 35,387 PDAC cases, 1,500,081 controls, and 87 features per patient, PrismNN obtained a test AUC of 0.826 (95% CI: 0.824-0.828) (PrismLR: 0.800 (95% CI: 0.798-0.802)). PrismNN's average internal-external validation AUCs were 0.740 for locations, 0.828 for races, and 0.789 (95% CI: 0.762-0.816) for time. At SIR = 5.10 (exceeding the current screening inclusion threshold) in simulated model deployment, PrismNN sensitivity was 35.9% (specificity 95.3%). INTERPRETATION Prism models demonstrated good accuracy and generalizability across diverse populations. PrismNN could find 3.5 times more cases at comparable risk than current screening guidelines. The small number of features provided a basis for model interpretation. Integration with the federated network provided data from a large, heterogeneous patient population and a pathway to future clinical deployment. FUNDING Prevent Cancer Foundation, TriNetX, Boeing, DARPA, NSF, and Aarno Labs.
Affiliation(s)
- Kai Jia
  - Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Irving D Kaplan
  - Beth Israel Deaconess Medical Center, Boston, MA, 02215, USA
- Martin Rinard
  - Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Limor Appelbaum
  - Beth Israel Deaconess Medical Center, Boston, MA, 02215, USA
|
16
|
Ke TM, Lophatananon A, Muir KR. An Integrative Pancreatic Cancer Risk Prediction Model in the UK Biobank. Biomedicines 2023; 11:3206. [PMID: 38137427 PMCID: PMC10740416 DOI: 10.3390/biomedicines11123206] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/07/2023] [Revised: 11/20/2023] [Accepted: 11/26/2023] [Indexed: 12/24/2023]
Abstract
Pancreatic cancer (PaCa) is a lethal cancer with an increasing incidence, highlighting the need for early prevention strategies. There is a lack of a comprehensive PaCa predictive model derived from large prospective cohorts. Therefore, we have developed an integrated PaCa risk prediction model using data from the UK Biobank, incorporating lifestyle-related, genetic-related, and medical history-related variables for application in healthcare settings. We used a machine learning-based random forest approach and a traditional multivariable logistic regression method to develop a PaCa predictive model for different purposes. Additionally, we employed dynamic nomograms to visualize the probability of PaCa risk in the prediction model. The top five influential features in the random forest model were age, polygenic risk score (PRS), pancreatitis, diabetes (DM), and smoking. The significant risk variables in the logistic regression model included male gender (OR = 1.17), age (OR = 1.10), non-O blood type (OR = 1.29), higher PRS (Q5 vs. Q1, OR = 2.03), smoking (OR = 1.82), alcohol consumption (OR = 1.27), pancreatitis (OR = 3.99), DM (OR = 2.57), and gallbladder-related disease (OR = 2.07). The area under the receiver operating characteristic curve (AUC) of the logistic regression model is 0.78. Internal validation and calibration performed well in both models. Our integrative PaCa risk prediction model with the PRS effectively stratifies individuals at future risk of PaCa, aiding targeted prevention efforts and supporting community-based cancer prevention initiatives.
Affiliation(s)
- Kenneth R. Muir
  - Division of Population Health, Health Services Research and Primary Care, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester M13 9PT, UK; (T.-M.K.); (A.L.)
|
17
|
Chen S, Phuc PT, Nguyen P, Burton W, Lin S, Lin W, Lu CY, Hsu M, Cheng C, Hsu JC. A novel prediction model of the risk of pancreatic cancer among diabetes patients using multiple clinical data and machine learning. Cancer Med 2023; 12:19987-19999. [PMID: 37737056 PMCID: PMC10587954 DOI: 10.1002/cam4.6547] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/24/2023] [Revised: 08/14/2023] [Accepted: 09/06/2023] [Indexed: 09/23/2023]
Abstract
INTRODUCTION Pancreatic cancer is associated with poor prognosis. Considering the increased global incidence of diabetes cases and that individuals with diabetes are considered a high-risk subpopulation for pancreatic cancer, it is critical to detect the risk of pancreatic cancer within populations of persons living with diabetes. This study aimed to develop a novel prediction model for pancreatic cancer risk among patients with diabetes, using a real-world database containing clinical features and employing numerous artificial intelligence algorithms. METHODS This retrospective observational study analyzed data on patients with Type 2 diabetes from a multisite Taiwanese EMR database between 2009 and 2019. Predictors were selected in accordance with the literature review and clinical perspectives. The prediction models were constructed using machine learning algorithms such as logistic regression, linear discriminant analysis, gradient boosting machine, and random forest. RESULTS The cohort consisted of 66,384 patients. The Linear Discriminant Analysis (LDA) model generated the highest AUROC of 0.9073, followed by the Voting Ensemble and Gradient Boosting machine models. LDA, the best model, exhibited an accuracy of 84.03%, a sensitivity of 0.8611, and a specificity of 0.8403. The most significant predictors identified for pancreatic cancer risk were glucose, glycated hemoglobin, hyperlipidemia comorbidity, antidiabetic drug use, and lipid-modifying drug use. CONCLUSION This study successfully developed a highly accurate 4-year risk model for pancreatic cancer in patients with diabetes using real-world clinical data and multiple machine-learning algorithms. Potentially, our predictors offer an opportunity to identify pancreatic cancer early and thus increase prevention and intervention windows to impact survival in diabetic patients.
Affiliation(s)
- Shih‐Min Chen
  - School of Pharmacy, Taipei Medical University, Taipei, Taiwan
- Phan Thanh Phuc
  - International Ph.D. Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan
- Phung‐Anh Nguyen
  - Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
  - Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
  - Research Center of Health Care Industry Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
- Whitney Burton
  - International Ph.D. Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan
- Weei‐Chin Lin
  - Section of Hematology/Oncology, Department of Medicine and Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, USA
- Christine Y. Lu
  - Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, USA
  - Kolling Institute, Faculty of Medicine and Health, The University of Sydney and the Northern Sydney Local Health District, Sydney, New South Wales, Australia
  - School of Pharmacy, Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia
- Min‐Huei Hsu
  - Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
  - Graduate Institute of Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
- Chi‐Tsun Cheng
  - Research Center of Health Care Industry Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
- Jason C. Hsu
  - International Ph.D. Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan
  - Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
  - Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei, Taiwan
  - Research Center of Health Care Industry Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
|
18
|
Matchaba S, Fellague-Chebra R, Purushottam P, Johns A. Early Diagnosis of Pancreatic Cancer via Machine Learning Analysis of a National Electronic Medical Record Database. JCO Clin Cancer Inform 2023; 7:e2300076. [PMID: 37816199 DOI: 10.1200/cci.23.00076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Revised: 07/24/2023] [Accepted: 08/22/2023] [Indexed: 10/12/2023]
Abstract
PURPOSE Pancreatic cancer (PaC) is often diagnosed at advanced stages, resulting in one of the lowest survival rates among patients with cancer. The purpose of this study was to investigate whether machine learning (ML) models can predict with high sensitivity and specificity an increased risk for PaC ahead of clinical diagnosis. METHODS The Optum deidentified electronic health record (EHR) data set was used to extract 1-year data for each patient and to sample for PaC diagnosis, the number of interactions with the health care system, and unique demographic and clinical features. Data for patients with a PaC diagnosis were collected between 1 and 2 years before the diagnosis. Standard binary classification ML models were used on training and testing data sets. Data analyses were performed using the scikit-learn package version 1.0.1. RESULTS The data set consisted of 18,987 patient EHRs collected between December 31, 2007, and December 31, 2017. EHRs with 10 unique features and at least three health care interactions were used for model training (N = 15,189; n = 8,438 [56%] with PaC) and testing (N = 3,798; n = 2,127 [56%] with PaC). The ensemble model achieved an AUC of 0.89, a sensitivity of 85.61%, and a specificity of 76.18% on the testing data set and produced superior results compared with other binary classifiers. Increasing unique health care interactions to nine failed to improve the AUC score. When the testing data set was enlarged to 5,696 patients, the ensemble model achieved an AUC of 0.92 and a specificity of 93.21%, but the sensitivity was compromised. CONCLUSION The ensemble model exceeded the state-of-the-art level of performance for prediction of PaC ahead of clinical diagnosis with a minimal clinically guided input, providing a potential strategy for selection of high-risk patients for further screening.
Affiliation(s)
- Siyabonga Matchaba
  - Health Economics and Evidence Development, Novartis Oncology, East Hanover, NJ
  - Mendel, San Jose, CA
- Adam Johns
  - Health Economics and Evidence Development, Novartis Oncology, East Hanover, NJ
|
19
|
Bojesen AB, Mortensen FV, Kirkegård J. Real-Time Identification of Pancreatic Cancer Cases Using Artificial Intelligence Developed on Danish Nationwide Registry Data. JCO Clin Cancer Inform 2023; 7:e2300084. [PMID: 37812754 DOI: 10.1200/cci.23.00084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 07/18/2023] [Accepted: 08/29/2023] [Indexed: 10/11/2023]
Abstract
PURPOSE Pancreatic cancer is expected to be the second leading cause of cancer-related deaths worldwide within a few years. Most patients are not diagnosed in time for curative-intent treatment. Accelerating the time of diagnosis is a key component of reducing pancreatic cancer mortality. We developed and tested a dynamic algorithm aimed at proactively identifying patients with a substantially elevated risk of having undiagnosed pancreatic cancer. METHODS Machine learning methodology was applied to a live stream of nationwide Danish registry data. A hybrid case-control and prospective cohort design relying on incidence density sampling was used. Three models with minimal tuning were tested. All performance evaluation metrics were based on out-of-sample, out-of-time data in a monthly walk-forward strategy to avoid any temporal biases or inflation of performance metrics. The outcome was a diagnosis of pancreatic cancer. RESULTS Subgroups identified had a 10.1% risk of being diagnosed with pancreatic cancer within 1 year, corresponding to a number needed to screen of 9.9. When considering competing, potentially computed tomography-detectable GI cancers, this number is reduced to 5.7. The time of diagnosis can be accelerated by up to 142 days. CONCLUSION Currently available nationwide live data and computational resources are sufficient for real-time identification of individuals with at least 10.1% risk of having undiagnosed pancreatic cancer and 17.7% risk of any GI cancer in the Danish population. For prospective identification of high-risk patients, the area under the curve is not a useful indication of the positive predictive values achieved. Viable design solutions are demonstrated, which address the main shortfalls of the existing cancer prediction efforts in relation to temporal biases, leaks, and performance metric inflation. Efficacy evaluations with resection rates and mortality as end points are needed.
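The relationship between the reported risks and the numbers needed to screen can be illustrated with a short Python sketch. It assumes the simplest definition, in which the number needed to screen is the reciprocal of the absolute risk in the flagged subgroup. The helper name `number_needed_to_screen` is hypothetical, and the study's exact calculation may differ.

```python
def number_needed_to_screen(absolute_risk: float) -> float:
    """Reciprocal of the absolute risk (assumed definition, see lead-in)."""
    if not 0 < absolute_risk <= 1:
        raise ValueError("absolute_risk must be in (0, 1]")
    return 1.0 / absolute_risk


# Risks as reported in the abstract.
nns_pancreatic = number_needed_to_screen(0.101)  # 10.1% pancreatic cancer risk
nns_any_gi = number_needed_to_screen(0.177)      # 17.7% risk of any GI cancer

print(f"Pancreatic cancer: NNS = {nns_pancreatic:.2f}")
print(f"Any GI cancer:     NNS = {nns_any_gi:.2f}")
```

Under this assumption, the 10.1% risk reproduces the reported value of 9.9. The 17.7% risk gives roughly 5.65, close to the reported 5.7. The small gap presumably reflects rounding of the published percentages.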
Affiliation(s)
- Anders Bo Bojesen
  - Department of Surgery, HPB Section, Aarhus University Hospital, Aarhus, Denmark
- Frank Viborg Mortensen
  - Department of Surgery, HPB Section, Aarhus University Hospital, Aarhus, Denmark
  - Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Jakob Kirkegård
  - Department of Surgery, HPB Section, Aarhus University Hospital, Aarhus, Denmark
  - Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
|
20
|
Dite GS, Spaeth E, Wong CK, Murphy NM, Allman R. Predicting 10-Year Risk of Pancreatic Cancer Using a Combined Genetic and Clinical Model. GASTRO HEP ADVANCES 2023; 2:979-989. [PMID: 39130772 PMCID: PMC11308393 DOI: 10.1016/j.gastha.2023.05.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Accepted: 05/12/2023] [Indexed: 08/13/2024]
Abstract
Background and Aims Pancreatic cancer has the poorest 5-year survival rate of any major solid tumor, but when diagnosed at an early stage, survival rates improve. Population screening is impractical because pancreatic cancer is rare with a lifetime risk of 1.7%, but accurate risk stratification in the general population could enable health care providers to focus early detection strategies on at-risk individuals. Here, we validate a combined risk prediction model that integrates a polygenic risk score and a clinical risk model. Methods Using the UK Biobank, we conducted a prospective cohort study assessing 10-year pancreatic cancer risks based on a polygenic risk score, a clinical risk score, and a combined risk score. We assessed the association, discrimination, calibration, cumulative hazards, and standardized incidence ratios compared to population incidence rates for the risk scores. We also conducted net reclassification analyses. Results While all of the risk scores discriminated well between affected and unaffected participants, the combined risk score - with a Harrell's C-index of 0.714 (95% confidence interval [CI] = 0.698, 0.730) - discriminated better than both the polygenic risk score (P = .001) and the clinical risk score (P = .02). In terms of calibration, there was no problem with dispersion for the combined risk score (β = 0.952, 95% CI = 0.865-1.039, P = .3) and overall there was a small overestimation of risk (α = -0.089, 95% CI = -0.156 to -0.021, P = .009). Participants in the top decile of 10-year risk were at 1.413 (95% CI = 1.242-1.607) times population risk. Conclusion The combined risk score was able to identify individuals at substantially increased risk of pancreatic cancer, for whom targeted screening could be useful.
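For readers unfamiliar with the discrimination metric used above, the following is a minimal pure-Python sketch of Harrell's C-index on toy data. It is an illustration of the general definition only, not the study's implementation. The function name `harrell_c_index` and the toy values are hypothetical, and tied event times are simply treated as non-comparable.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float],
                    events: Sequence[int],
                    risks: Sequence[float]) -> float:
    """Fraction of comparable pairs in which the participant who had the
    event earlier also had the higher predicted risk (risk ties count 0.5)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is anchored on an observed event
        for j in range(n):
            # Comparable only if i's event precedes j's follow-up time.
            if i != j and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up time (years), event indicator (1 = diagnosed), risk score.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]

c_index = harrell_c_index(toy_times, toy_events, toy_risks)
print(f"Harrell's C-index on toy data: {c_index:.2f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives a sense of scale for the reported 0.714.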
Affiliation(s)
- Erika Spaeth
- Phenogen Sciences Inc, Charlotte, North Carolina
- Chi Kuen Wong
- Genetic Technologies Limited, Fitzroy, Victoria, Australia
- Richard Allman
- Genetic Technologies Limited, Fitzroy, Victoria, Australia

21
Placido D, Yuan B, Hjaltelin JX, Zheng C, Haue AD, Chmura PJ, Yuan C, Kim J, Umeton R, Antell G, Chowdhury A, Franz A, Brais L, Andrews E, Marks DS, Regev A, Ayandeh S, Brophy MT, Do NV, Kraft P, Wolpin BM, Rosenthal MH, Fillmore NR, Brunak S, Sander C. A deep learning algorithm to predict risk of pancreatic cancer from disease trajectories. Nat Med 2023; 29:1113-1122. [PMID: 37156936 PMCID: PMC10202814 DOI: 10.1038/s41591-023-02332-5] [Citation(s) in RCA: 106] [Impact Index Per Article: 53.0] [Received: 03/10/2022] [Accepted: 03/31/2023] [Indexed: 05/10/2023]
Abstract
Pancreatic cancer is an aggressive disease that typically presents late with poor outcomes, indicating a pronounced need for early detection. In this study, we applied artificial intelligence methods to clinical data from 6 million patients (24,000 pancreatic cancer cases) in Denmark (Danish National Patient Registry (DNPR)) and from 3 million patients (3,900 cases) in the United States (US Veterans Affairs (US-VA)). We trained machine learning models on the sequence of disease codes in clinical histories and tested prediction of cancer occurrence within incremental time windows (CancerRiskNet). For cancer occurrence within 36 months, the performance of the best DNPR model has area under the receiver operating characteristic (AUROC) curve = 0.88 and decreases to AUROC (3m) = 0.83 when disease events within 3 months before cancer diagnosis are excluded from training, with an estimated relative risk of 59 for 1,000 highest-risk patients older than age 50 years. Cross-application of the Danish model to US-VA data had lower performance (AUROC = 0.71), and retraining was needed to improve performance (AUROC = 0.78, AUROC (3m) = 0.76). These results improve the ability to design realistic surveillance programs for patients at elevated risk, potentially benefiting lifespan and quality of life by early detection of this aggressive cancer.
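The two performance measures quoted in this abstract, AUROC and the relative risk among the highest-risk patients, can be illustrated with a short sketch. The function names and toy data are hypothetical, and the baseline used for the relative risk (incidence in the whole cohort) is an assumption made for illustration; the abstract does not specify how the study defined its reference group.

```python
def auroc(labels, scores):
    """Probability that a randomly chosen case outranks a randomly chosen
    non-case (ties count one half); equals the area under the ROC curve."""
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def relative_risk_top_n(labels, scores, n):
    """Incidence among the n highest-scoring patients divided by the
    incidence in the whole cohort (an assumed baseline for this sketch)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_incidence = sum(y for _, y in ranked[:n]) / n
    overall_incidence = sum(labels) / len(labels)
    return top_incidence / overall_incidence


# Hypothetical toy data: 1 = later diagnosed, 0 = not diagnosed.
toy_labels = [1, 0, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1, 0.4, 0.05]
toy_auroc = auroc(toy_labels, toy_scores)
toy_rr = relative_risk_top_n(toy_labels, toy_scores, n=2)
```

The pairwise formulation is quadratic in cohort size and is only meant to make the definition concrete; registry-scale data would need a rank-based implementation.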
Affiliation(s)
- Davide Placido
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Bo Yuan
- Harvard Medical School, Boston, MA, USA
- Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of MIT and Harvard, Boston, MA, USA
- Jessica X Hjaltelin
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Chunlei Zheng
- VA Boston Healthcare System, Boston, MA, USA
- Boston University School of Medicine, Boston, MA, USA
- Amalie D Haue
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Piotr J Chmura
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Chen Yuan
- Harvard Medical School, Boston, MA, USA
- Dana-Farber Cancer Institute, Boston, MA, USA
- Jihye Kim
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Renato Umeton
- Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Massachusetts Institute of Technology, Cambridge, MA, USA
- Weill Cornell Medicine, New York City, NY, USA
- Alexandra Franz
- Harvard Medical School, Boston, MA, USA
- Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of MIT and Harvard, Boston, MA, USA
- Aviv Regev
- Broad Institute of MIT and Harvard, Boston, MA, USA
- Genentech, Inc., South San Francisco, CA, USA
- Mary T Brophy
- VA Boston Healthcare System, Boston, MA, USA
- Boston University School of Medicine, Boston, MA, USA
- Nhan V Do
- VA Boston Healthcare System, Boston, MA, USA
- Boston University School of Medicine, Boston, MA, USA
- Peter Kraft
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Brian M Wolpin
- Harvard Medical School, Boston, MA, USA
- Dana-Farber Cancer Institute, Boston, MA, USA
- Brigham and Women's Hospital, Boston, MA, USA
- Michael H Rosenthal
- Harvard Medical School, Boston, MA, USA
- Dana-Farber Cancer Institute, Boston, MA, USA
- Brigham and Women's Hospital, Boston, MA, USA
- Nathanael R Fillmore
- Harvard Medical School, Boston, MA, USA
- Dana-Farber Cancer Institute, Boston, MA, USA
- VA Boston Healthcare System, Boston, MA, USA
- Boston University School of Medicine, Boston, MA, USA
- Søren Brunak
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Chris Sander
- Harvard Medical School, Boston, MA, USA
- Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of MIT and Harvard, Boston, MA, USA

22
Zhang Y, Wang QL, Yuan C, Lee AA, Babic A, Ng K, Perez K, Nowak JA, Lagergren J, Stampfer MJ, Giovannucci EL, Sander C, Rosenthal MH, Kraft P, Wolpin BM. Pancreatic cancer is associated with medication changes prior to clinical diagnosis. Nat Commun 2023; 14:2437. [PMID: 37117188 PMCID: PMC10147931 DOI: 10.1038/s41467-023-38088-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/16/2022] [Accepted: 04/11/2023] [Indexed: 04/30/2023] [Open Access]
Abstract
Patients with pancreatic ductal adenocarcinoma (PDAC) commonly develop symptoms and signs in the 1-2 years before diagnosis that can result in changes to medications. We investigate recent medication changes and PDAC diagnosis in the Nurses' Health Study (NHS; females) and the Health Professionals Follow-up Study (HPFS; males), including up to 148,973 U.S. participants followed for 2,994,057 person-years and 991 incident PDAC cases. Here we show that recent initiation of antidiabetic (NHS) or anticoagulant (NHS, HPFS) medications and cessation of antihypertensive medications (NHS, HPFS) are associated with pancreatic cancer diagnosis in the next 2 years. Two-year PDAC risk increases as the number of relevant medication changes increases (P-trend < 1 × 10⁻⁵), with participants who recently start antidiabetic and stop antihypertensive medications having a multivariable-adjusted hazard ratio of 4.86 (95% CI, 1.74-13.6). These changes are not associated with diagnosis of other digestive system cancers. Recent medication changes should be considered as candidate features in multi-factor risk models for PDAC, though they are not causally implicated in development of PDAC.
Affiliation(s)
- Yin Zhang
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
- Department of Nutrition, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Qiao-Li Wang
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
- Department of Clinical Science, Intervention and Technology, Karolinska Institutet, Stockholm, Sweden
- Chen Yuan
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
- Alice A Lee
- Division of Gastroenterology, Hepatology and Endoscopy, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Ana Babic
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
- Kimmie Ng
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
- Kimberly Perez
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
- Jonathan A Nowak
- Program in MPE Molecular Pathological Epidemiology, Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Jesper Lagergren
- Upper Gastrointestinal Surgery, Department of Molecular Medicine and Surgery, Karolinska Institutet, Karolinska University Hospital, Stockholm, Sweden
- School of Cancer and Pharmaceutical Sciences, King's College London, London, UK
- Meir J Stampfer
- Department of Nutrition, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Edward L Giovannucci
- Department of Nutrition, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Chris Sander
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Michael H Rosenthal
- Department of Radiology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Peter Kraft
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Brian M Wolpin
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA

23
Santos R, Coleman HG, Cairnduff V, Kunzmann AT. Clinical Prediction Models for Pancreatic Cancer in General and At-Risk Populations: A Systematic Review. Am J Gastroenterol 2023; 118:26-40. [PMID: 36148840 DOI: 10.14309/ajg.0000000000002022] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 04/12/2022] [Accepted: 09/16/2022] [Indexed: 01/12/2023]
Abstract
INTRODUCTION Identifying high-risk individuals using a risk prediction model could be a crucial first stage of screening pathways to improve the early detection of pancreatic cancer. A systematic review was conducted to critically evaluate the published primary literature on the development or validation of clinical risk prediction models for pancreatic cancer. METHODS MEDLINE, Embase, and Web of Science were searched for relevant articles from the inception of each database up to November 2021. Study selection and data extraction were conducted by 2 independent reviewers. The Prediction model Risk Of Bias Assessment Tool (PROBAST) was applied to assess risk of bias. RESULTS In total, 33 studies were included, describing 38 risk prediction models. Excluding studies with an overlapping population, the included studies comprised 15,848,100 participants, of whom 58,313 were diagnosed with pancreatic cancer. Eight studies externally validated their model, and 13 performed internal validation. The studies described risk prediction models for pancreatic cancer in the general population (n = 14), patients with diabetes (n = 8), and individuals with gastrointestinal (and other) symptoms (symptoms included abdominal pain, unexplained weight loss, jaundice, and change in bowel habits and indigestion; n = 11). The most commonly used clinical risk factors in the models were cigarette smoking (n = 27), age (n = 25), diabetes history (n = 22), chronic pancreatitis (n = 18), and body mass index (n = 14). In the 25 studies that assessed model performance, C-statistics ranged from 0.61 to 0.98. Of the 33 studies included, 6 were rated as being at a low risk of bias based on PROBAST. DISCUSSION Many clinical risk prediction models for pancreatic cancer have been developed for different target populations. Although low risk-of-bias studies were identified, these models require external validation and implementation studies to ensure that they will benefit clinical decision making.
Affiliation(s)
- Ralph Santos
- Centre for Public Health, Queen's University Belfast, Belfast, UK
- Helen G Coleman
- Centre for Public Health, Queen's University Belfast, Belfast, UK
- Patrick G. Johnston Centre for Cancer Research, Queen's University Belfast, Belfast, UK

24
Lemanska A, Price CA, Jeffreys N, Byford R, Dambha-Miller H, Fan X, Hinton W, Otter S, Rice R, Stunt A, Whyte MB, Faithfull S, de Lusignan S. BMI and HbA1c are metabolic markers for pancreatic cancer: Matched case-control study using a UK primary care database. PLoS One 2022; 17:e0275369. [PMID: 36197912 PMCID: PMC9534412 DOI: 10.1371/journal.pone.0275369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2021] [Accepted: 09/15/2022] [Indexed: 11/07/2022] [Open Access]
Abstract
Background Weight loss, hyperglycaemia and diabetes are known features of pancreatic cancer. We quantified the timing and the amount of changes in body mass index (BMI) and glycated haemoglobin (HbA1c), and their association with pancreatic cancer from five years before diagnosis. Methods A matched case-control study was undertaken within 590 primary care practices in England, United Kingdom. 8,777 patients diagnosed with pancreatic cancer (cases) between 1st January 2007 and 31st August 2020 were matched to 34,979 controls by age, gender and diabetes. Longitudinal trends in BMI and HbA1c were visualised. Odds ratios adjusted for demographic and lifestyle factors (aOR) and 95% confidence intervals (CI) were calculated with conditional logistic regression. Subgroup analyses were undertaken according to the diabetes status. Results Changes in BMI and HbA1c observed for cases on longitudinal plots started one and two years (respectively) before diagnosis. In the year before diagnosis, a 1 kg/m² decrease in BMI between cases and controls was associated with aOR for pancreatic cancer of 1.05 (95% CI 1.05 to 1.06), and a 1 mmol/mol increase in HbA1c was associated with aOR of 1.06 (1.06 to 1.07). ORs remained statistically significant (p < 0.001) for 2 years before pancreatic cancer diagnosis for BMI and 3 years for HbA1c. Subgroup analysis revealed that the decrease in BMI was associated with a higher pancreatic cancer risk for people with diabetes than for people without (aORs 1.08, 1.06 to 1.09 versus 1.04, 1.03 to 1.05), but the increase in HbA1c was associated with a higher risk for people without diabetes than for people with diabetes (aORs 1.09, 1.07 to 1.11 versus 1.04, 1.03 to 1.04). Conclusions The statistically significant changes in weight and glycaemic control started three years before pancreatic cancer diagnosis but varied according to the diabetes status. The information from this study could be used to detect pancreatic cancer earlier than is currently achieved. However, regular BMI and HbA1c measurements are required to facilitate future research and implementation in clinical practice.
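The adjusted odds ratios in this abstract are reported per unit of change (1 kg/m² of BMI, 1 mmol/mol of HbA1c). Because logistic regression is linear on the log-odds scale, a per-unit odds ratio compounds multiplicatively over larger differences. The sketch below shows that arithmetic; the multi-unit differences are hypothetical examples, and extrapolating the rounded per-unit estimates this way is an assumption rather than a result reported by the study.

```python
def scaled_odds_ratio(per_unit_or, units):
    """Odds ratio for a difference of `units`, assuming the log-odds are
    linear in the predictor (the standard logistic-regression assumption)."""
    return per_unit_or ** units


# Hypothetical differences, using the per-unit aORs quoted in the abstract.
bmi_or_3_units = scaled_odds_ratio(1.05, 3)      # BMI lower by 3 kg/m², about 1.16
hba1c_or_10_units = scaled_odds_ratio(1.06, 10)  # HbA1c higher by 10 mmol/mol, about 1.79
```

Rounding error in the published per-unit values also compounds, so such extrapolated figures are indicative only.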
Affiliation(s)
- Agnieszka Lemanska
- Faculty of Health and Medical Sciences, University of Surrey, Guildford, United Kingdom
- Claire A. Price
- Faculty of Health and Medical Sciences, University of Surrey, Guildford, United Kingdom
- Nathan Jeffreys
- Royal Surrey NHS Foundation Trust, Guildford, United Kingdom
- Rachel Byford
- Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, United Kingdom
- Hajira Dambha-Miller
- Primary Care Research Centre, University of Southampton, Southampton, United Kingdom
- Xuejuan Fan
- Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, United Kingdom
- William Hinton
- Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, United Kingdom
- Sophie Otter
- Royal Surrey NHS Foundation Trust, Guildford, United Kingdom
- Rebecca Rice
- Barnardo’s, Barkingside, Ilford, Essex, London, United Kingdom
- Ali Stunt
- Pancreatic Cancer Action, London, United Kingdom
- Martin B. Whyte
- Faculty of Health and Medical Sciences, University of Surrey, Guildford, United Kingdom
- Sara Faithfull
- Faculty of Health and Medical Sciences, University of Surrey, Guildford, United Kingdom
- Simon de Lusignan
- Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, United Kingdom

25
Yin H, Zhang F, Yang X, Meng X, Miao Y, Noor Hussain MS, Yang L, Li Z. Research trends of artificial intelligence in pancreatic cancer: a bibliometric analysis. Front Oncol 2022; 12:973999. [PMID: 35982967 PMCID: PMC9380440 DOI: 10.3389/fonc.2022.973999] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 06/20/2022] [Accepted: 07/13/2022] [Indexed: 01/03/2023] [Open Access]
Abstract
Purpose We evaluated research on artificial intelligence (AI) in pancreatic cancer (PC) through bibliometric analysis and explored the research hotspots and current status from 1997 to 2021. Methods Publications related to AI in PC published during 1997-2021 were retrieved from the Web of Science Core Collection (WoSCC). The Bibliometrix package in R 4.0.3 and VOSviewer were used for the bibliometric analysis. Results A total of 587 publications in this field were retrieved from the WoSCC database. After 2018, the number of publications grew rapidly. The United States and Johns Hopkins University were the most influential country and institution, respectively. A total of 2805 keywords were investigated, 81 of which appeared more than 10 times. Co-occurrence analysis grouped these keywords into five clusters: (1) AI in biology of PC, (2) AI in pathology and radiology of PC, (3) AI in the therapy of PC, (4) AI in risk assessment of PC and (5) AI in endoscopic ultrasonography (EUS) of PC. Trend topics and thematic maps show that the keywords "diagnosis", "survival", "classification", and "management" are the research hotspots in this field. Conclusion Research on AI in pancreatic cancer is still at an early stage. Currently, AI is widely studied in biology, diagnosis, treatment, risk assessment, and EUS of pancreatic cancer. This bibliometric study provides insight into AI in PC research and helps researchers identify new research directions.
Affiliation(s)
- Hua Yin
- Department of Gastroenterology, General Hospital of Ningxia Medical University, Yinchuan, China
- Postgraduate Training Base in Shanghai Gongli Hospital, Ningxia Medical University, Shanghai, China
- Feixiong Zhang
- Department of Gastroenterology, General Hospital of Ningxia Medical University, Yinchuan, China
- Xiaoli Yang
- Department of Gastroenterology, General Hospital of Ningxia Medical University, Yinchuan, China
- Xiangkun Meng
- Department of Gastroenterology, General Hospital of Ningxia Medical University, Yinchuan, China
- Yu Miao
- Department of Gastroenterology, General Hospital of Ningxia Medical University, Yinchuan, China
- Li Yang
- Department of Gastroenterology, General Hospital of Ningxia Medical University, Yinchuan, China
- Zhaoshen Li
- Postgraduate Training Base in Shanghai Gongli Hospital, Ningxia Medical University, Shanghai, China
- Clinical Medical College, Ningxia Medical University, Yinchuan, China
*Correspondence: Zhaoshen Li; Li Yang

26
Park J, Artin MG, Lee KE, Pumpalova YS, Ingram MA, May BL, Park M, Hur C, Tatonetti NP. Deep learning on time series laboratory test results from electronic health records for early detection of pancreatic cancer. J Biomed Inform 2022; 131:104095. [PMID: 35598881 PMCID: PMC10286873 DOI: 10.1016/j.jbi.2022.104095] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/21/2022] [Revised: 04/04/2022] [Accepted: 05/16/2022] [Indexed: 11/26/2022]
Abstract
The multi-modal and unstructured nature of observational data in Electronic Health Records (EHR) is currently a significant obstacle for the application of machine learning towards risk stratification. In this study, we develop a deep learning framework for incorporating longitudinal clinical data from EHR to infer risk for pancreatic cancer (PC). This framework includes a novel training protocol, which enforces an emphasis on early detection by applying an independent Poisson-random mask on proximal-time measurements for each variable. Data fusion for irregular multivariate time-series features is enabled by a "grouped" neural network (GrpNN) architecture, which uses representation learning to generate a dimensionally reduced vector for each measurement set before making a final prediction. These models were evaluated using EHR data from Columbia University Irving Medical Center-New York Presbyterian Hospital. Our framework demonstrated better performance on early detection (AUROC 0.671, 95% CI 0.667-0.675, p < 0.001) at 12 months prior to diagnosis compared to logistic regression, xgboost, and feedforward neural network baselines. We demonstrate that our masking strategy results in greater improvements at distal times prior to diagnosis, and that our GrpNN model improves generalizability by reducing overfitting relative to the feedforward baseline. The results were consistent across reported race. Our proposed algorithm is potentially generalizable to other diseases, including but not limited to cancer, where early detection can improve survival.
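The abstract describes the masking protocol only in one sentence, so the following is a loose sketch of one possible reading, not the authors' implementation: for each variable independently, draw a count k from a Poisson distribution and hide the k measurements closest to the diagnosis date, which forces a model to rely on more distal values. The function names, the data layout (pairs of months-before-diagnosis and value), and the rate parameter `lam` are all assumptions introduced for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return k
        k += 1


def mask_proximal_measurements(series_by_variable, lam, rng):
    """Hypothetical masking step: for each variable, hide the k measurements
    nearest to diagnosis, with k ~ Poisson(lam) drawn independently.

    series_by_variable maps a variable name to a list of
    (months_before_diagnosis, value) pairs in any order.
    """
    masked = {}
    for name, series in series_by_variable.items():
        k = sample_poisson(lam, rng)
        # Most distal measurement first, most proximal last.
        ordered = sorted(series, key=lambda m: m[0], reverse=True)
        keep = max(len(ordered) - k, 0)
        masked[name] = ordered[:keep]
    return masked


# Hypothetical laboratory histories for one patient.
toy_history = {
    "hba1c": [(1, 48), (6, 44), (12, 41), (24, 40)],
    "alt": [(3, 30), (18, 25)],
}
toy_masked = mask_proximal_measurements(toy_history, lam=1.5, rng=random.Random(1))
```

Drawing a fresh mask for every training example, as sketched here, is itself an assumption; the abstract does not say how often or at what granularity the mask is resampled.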
Affiliation(s)
- Jiheum Park
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, United States
- Michael G Artin
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, United States
- Kate E Lee
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, United States
- Yoanna S Pumpalova
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, United States
- Myles A Ingram
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, United States
- Benjamin L May
- Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, United States
- Michael Park
- Applied Info Partners Inc, Worlds Fair Drive, Somerset, NJ, United States; X-Mechanics LLC, Cresskill, NJ, United States
- Chin Hur
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, United States
- Nicholas P Tatonetti
- Department of Biomedical Informatics, Columbia University, New York, NY, United States

27
Huang RJ, Kwon NSE, Tomizawa Y, Choi AY, Hernandez-Boussard T, Hwang JH. A Comparison of Logistic Regression Against Machine Learning Algorithms for Gastric Cancer Risk Prediction Within Real-World Clinical Data Streams. JCO Clin Cancer Inform 2022; 6:e2200039. [PMID: 35763703 DOI: 10.1200/cci.22.00039] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/15/2022] [Open Access]
Abstract
PURPOSE Noncardia gastric cancer (NCGC) is a leading cause of global cancer mortality, and is often diagnosed at advanced stages. Development of NCGC risk models within electronic health records (EHR) may allow for improved cancer prevention. There has been much recent interest in use of machine learning (ML) for cancer prediction, but few studies comparing ML with classical statistical models for NCGC risk prediction. METHODS We trained models using logistic regression (LR) and four commonly used ML algorithms to predict NCGC from age-/sex-matched controls in two EHR systems: Stanford University and the University of Washington (UW). The LR model contained well-established NCGC risk factors (intestinal metaplasia histology, prior Helicobacter pylori infection, race, ethnicity, nativity status, smoking history, anemia), whereas ML models agnostically selected variables from the EHR. Models were developed and internally validated in the Stanford data, and externally validated in the UW data. Hyperparameter tuning of models was achieved using cross-validation. Model performance was compared by accuracy, sensitivity, and specificity. RESULTS In internal validation, LR performed with comparable accuracy (0.732; 95% CI, 0.698 to 0.764), sensitivity (0.697; 95% CI, 0.647 to 0.744), and specificity (0.767; 95% CI, 0.720 to 0.809) to penalized lasso, support vector machine, K-nearest neighbor, and random forest models. In external validation, LR continued to demonstrate high accuracy, sensitivity, and specificity. Although K-nearest neighbor demonstrated higher accuracy and specificity, this was offset by significantly lower sensitivity. No ML model consistently outperformed LR across evaluation criteria. CONCLUSION Drawing data from two independent EHRs, we find LR on the basis of established risk factors demonstrated comparable performance to optimized ML algorithms. 
This study demonstrates that classical models built on robust, hand-chosen predictor variables may not be inferior to data-driven models for NCGC risk prediction.
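The three comparison criteria named in this abstract (accuracy, sensitivity, specificity) all derive from the same confusion matrix. The sketch below shows the standard definitions on hypothetical predictions; the function name and toy labels are illustrative and unrelated to the study's data.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from binary labels (1 = case).

    Assumes at least one case and one control are present, so that the
    denominators below are non-zero.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of controls correctly cleared
    }


# Hypothetical predictions for four cases and six controls.
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_preds = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
toy_metrics = classification_metrics(toy_truth, toy_preds)
```

A model can gain accuracy and specificity while losing sensitivity, as the abstract reports for K-nearest neighbor, because the three quantities weight the four confusion-matrix cells differently.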
Affiliation(s)
- Robert J Huang
- Division of Gastroenterology and Hepatology, Stanford University School of Medicine, Stanford, CA
- Nicole Sung-Eun Kwon
- Division of Gastroenterology and Hepatology, Stanford University School of Medicine, Stanford, CA
- Yutaka Tomizawa
- Division of Gastroenterology, University of Washington, Seattle, WA
- Alyssa Y Choi
- Division of Gastroenterology and Hepatology, University of California Irvine, Irvine, CA
- Joo Ha Hwang
- Division of Gastroenterology and Hepatology, Stanford University School of Medicine, Stanford, CA

28
Chen HY, Ge P, Liu JY, Qu JL, Bao F, Xu CM, Chen HL, Shang D, Zhang GX. Artificial intelligence: Emerging player in the diagnosis and treatment of digestive disease. World J Gastroenterol 2022; 28:2152-2162. [PMID: 35721881 PMCID: PMC9157617 DOI: 10.3748/wjg.v28.i20.2152] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 09/24/2021] [Revised: 11/24/2021] [Accepted: 04/24/2022] [Indexed: 02/06/2023] [Open Access]
Abstract
Given the breakthroughs in key technologies, such as image recognition, deep learning and neural networks, artificial intelligence (AI) continues to be increasingly developed, leading to closer and deeper integration with an increasingly data-, knowledge- and brain labor-intensive medical industry. As society continues to advance and individuals become more aware of their health needs, the problems associated with the aging of the population are receiving increasing attention, and there is an urgent demand for improving medical technology, prolonging human life and enhancing health. Digestive system diseases are the most common clinical diseases and are characterized by complex clinical manifestations and a general lack of obvious symptoms in the early stage. Such diseases are very difficult to diagnose and treat. In recent years, the incidence of diseases of the digestive system has increased. As AI applications in the field of health care continue to be developed, AI has begun playing an important role in the diagnosis and treatment of diseases of the digestive system. In this paper, the application of AI in assisted diagnosis and the application and prospects of AI in malignant and benign digestive system diseases are reviewed.
Affiliation(s)
- Hai-Yang Chen
- Laboratory of Integrative Medicine, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Department of General Surgery, Pancreatic-Biliary Center, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Peng Ge
- Laboratory of Integrative Medicine, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Department of General Surgery, Pancreatic-Biliary Center, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Jia-Yue Liu
- Laboratory of Integrative Medicine, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Department of General Surgery, Pancreatic-Biliary Center, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Jia-Lin Qu
- Laboratory of Integrative Medicine, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Institute (College) of Integrative Medicine, Dalian Medical University, Dalian 116044, Liaoning Province, China
- Fang Bao
- Laboratory of Integrative Medicine, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Department of General Surgery, Pancreatic-Biliary Center, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Cai-Ming Xu
- Laboratory of Integrative Medicine, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Department of General Surgery, Pancreatic-Biliary Center, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Institute (College) of Integrative Medicine, Dalian Medical University, Dalian 116044, Liaoning Province, China
- Hai-Long Chen
- Laboratory of Integrative Medicine, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Department of General Surgery, Pancreatic-Biliary Center, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Institute (College) of Integrative Medicine, Dalian Medical University, Dalian 116044, Liaoning Province, China
- Dong Shang
- Laboratory of Integrative Medicine, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Department of General Surgery, Pancreatic-Biliary Center, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Institute (College) of Integrative Medicine, Dalian Medical University, Dalian 116044, Liaoning Province, China
- Gui-Xin Zhang
- Laboratory of Integrative Medicine, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Department of General Surgery, Pancreatic-Biliary Center, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, Liaoning Province, China
- Institute (College) of Integrative Medicine, Dalian Medical University, Dalian 116044, Liaoning Province, China

29
Appelbaum L, Kaplan ID, Palchuk MB, Kundrot S, Winer-Jones JP, Rinard M. Development and Experience with Cancer Risk Prediction Models Using Federated Databases and Electronic Health Records. Digit Health 2022. [DOI: 10.36255/exon-publications-digital-health-federated-databases] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] [Open Access]
|
30
|
Lin KW, Ang TL, Li JW. Role of artificial intelligence in early detection and screening for pancreatic adenocarcinoma. Artif Intell Med Imaging 2022; 3:21-32. [DOI: 10.35711/aimi.v3.i2.21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Revised: 02/12/2022] [Accepted: 03/17/2022] [Indexed: 02/06/2023]
Abstract
Pancreatic adenocarcinoma remains one of the deadliest malignancies in the world despite treatment advances over the past few decades. Its low survival rates and poor prognosis can be attributed to ambiguity in screening recommendations and late symptom onset, both of which contribute to late presentation. In recent years, artificial intelligence (AI) has emerged as a field to aid clinical decision making. Considerable efforts have been made in the realm of AI to screen for and predict future development of pancreatic ductal adenocarcinoma. This review discusses the use of AI in early detection and screening for pancreatic adenocarcinoma, and factors which may limit its use in a clinical setting.
Affiliation(s)
- Kenneth Weicong Lin
- Department of Gastroenterology and Hepatology, Changi General Hospital, Singapore 529889, Singapore
| | - Tiing Leong Ang
- Department of Gastroenterology and Hepatology, Changi General Hospital, Singapore 529889, Singapore
| | - James Weiquan Li
- Department of Gastroenterology and Hepatology, Changi General Hospital, Singapore 529889, Singapore
| |
|
31
|
Prediction Model for Pancreatic Cancer-A Population-Based Study from NHIRD. Cancers (Basel) 2022; 14:cancers14040882. [PMID: 35205630 PMCID: PMC8870511 DOI: 10.3390/cancers14040882] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/21/2022] [Revised: 02/07/2022] [Accepted: 02/09/2022] [Indexed: 01/06/2023]
Abstract
Simple Summary: Pancreatic cancer has been ranked seventh in the top ten cancer mortality rates for the past three years in Taiwan. It is one of the more difficult cancers to detect early due to the lack of early diagnostic tools. This is a population-based study from NHIRD. A higher performance pancreatic cancer prediction model has been established. This predictive model can improve the awareness of the risk of pancreatic cancer and give patients with pancreatic cancer a simpler tool for early screening in the golden period when the disease can still be eradicated.
Abstract: (1) Background: Cancer has been the leading cause of death in Taiwan for 39 years, and among them, pancreatic cancer has been ranked seventh in the top ten cancer mortality rates for the past three years. While the incidence rate of pancreatic cancer is ranked at the bottom of the top 10 cancers, the survival rate is very low. Pancreatic cancer is one of the more difficult cancers to detect early due to the lack of early diagnostic tools. Early screening is important for the treatment of pancreatic cancer. Only a few studies have designed predictive models for pancreatic cancer. (2) Methods: The Taiwan Health Insurance Database was used in this study, covering over 99% of the population in Taiwan. The subset sample was not significantly different from the original NHIRD sample. A machine learning approach was used to develop a predictive model for pancreatic cancer. Four models, including logistic regression, deep neural networks, ensemble learning, and voting ensemble were used in this study. The ROC curve and a confusion matrix were used to evaluate the accuracy of the pancreatic cancer prediction models. (3) Results: The AUC of the LR model was higher than the other three models in the external testing set for all three of the factor combinations. Sensitivity was best measured by the stacking model for the first factor combination, and specificity was best measured by the DNN model for the second factor combination. The result of the model that used only nine factors (third factor combination) was equal to the other two factor combinations. The AUC of the previous models for the early assessment of pancreatic cancer ranged from approximately 0.57 to 0.71. The AUC of this study was higher than that of previous studies and ranged from 0.71 to 0.76, which provides higher accuracy. (4) Conclusions: This study compared the performances of LR, DNN, stacking, and voting models for pancreatic cancer prediction and constructed a pancreatic cancer prediction model with accuracy higher than that of previous studies. This predictive model will improve awareness of the risk of pancreatic cancer and give patients with pancreatic cancer a simpler tool for early screening in the golden period when the disease can still be eradicated.
|
32
|
Chen X, Fu R, Shao Q, Chen Y, Ye Q, Li S, He X, Zhu J. Application of artificial intelligence to pancreatic adenocarcinoma. Front Oncol 2022; 12:960056. [PMID: 35936738 PMCID: PMC9353734 DOI: 10.3389/fonc.2022.960056] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/02/2022] [Accepted: 06/24/2022] [Indexed: 02/05/2023]
Abstract
BACKGROUND AND OBJECTIVES: Pancreatic cancer (PC) is one of the deadliest cancers worldwide although substantial advancement has been made in its comprehensive treatment. The development of artificial intelligence (AI) technology has allowed its clinical applications to expand remarkably in recent years. Diverse methods and algorithms are employed by AI to extrapolate new data from clinical records to aid in the treatment of PC. In this review, we will summarize AI's use in several aspects of PC diagnosis and therapy, as well as its limits and potential future research avenues. METHODS: We examine the most recent research on the use of AI in PC. The articles are categorized and examined according to the medical task of their algorithm. Two search engines, PubMed and Google Scholar, were used to screen the articles. RESULTS: Overall, 66 papers published in 2001 and after were selected. Of the four medical tasks (risk assessment, diagnosis, treatment, and prognosis prediction), diagnosis was the most frequently researched, and retrospective single-center studies were the most prevalent. We found that the different medical tasks and algorithms included in the reviewed studies caused the performance of their models to vary greatly. Deep learning algorithms, on the other hand, produced excellent results in all of the subdivisions studied. CONCLUSIONS: AI is a promising tool for helping PC patients and may contribute to improved patient outcomes. The integration of humans and AI in clinical medicine is still in its infancy and requires the in-depth cooperation of multidisciplinary personnel.
Affiliation(s)
- Xi Chen
- Department of General Surgery, Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
| | - Ruibiao Fu
- Department of General Surgery, Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
| | - Qian Shao
- Department of Surgical Ward 1, Ningbo Women and Children’s Hospital, Ningbo, China
| | - Yan Chen
- Department of General Surgery, Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
| | - Qinghuang Ye
- Department of General Surgery, Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
| | - Sheng Li
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
| | - Xiongxiong He
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
| | - Jinhui Zhu
- Department of General Surgery, Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, China
- *Correspondence: Jinhui Zhu,
| |
|
33
|
Hayashi H, Uemura N, Matsumura K, Zhao L, Sato H, Shiraishi Y, Yamashita YI, Baba H. Recent advances in artificial intelligence for pancreatic ductal adenocarcinoma. World J Gastroenterol 2021; 27:7480-7496. [PMID: 34887644 PMCID: PMC8613738 DOI: 10.3748/wjg.v27.i43.7480] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 04/03/2021] [Revised: 08/02/2021] [Accepted: 11/15/2021] [Indexed: 02/06/2023]
Abstract
Pancreatic ductal adenocarcinoma (PDAC) remains the most lethal type of cancer. The 5-year survival rate for patients with early-stage diagnosis can be as high as 20%, suggesting that early diagnosis plays a pivotal role in the prognostic improvement of PDAC cases. In the medical field, the broad availability of biomedical data has led to the advent of the "big data" era. To overcome this deadly disease, how to fully exploit big data is a new challenge in the era of precision medicine. Artificial intelligence (AI) is the ability of a machine to learn and display intelligence to solve problems. AI can help to transform big data into clinically actionable insights more efficiently, reduce inevitable errors to improve diagnostic accuracy, and make real-time predictions. AI-based omics analyses will become the next alternative approach to overcome this poor-prognostic disease by discovering biomarkers for early detection, providing molecular/genomic subtyping, offering treatment guidance, and predicting recurrence and survival. Advances in AI may therefore improve PDAC survival outcomes in the near future. The present review mainly focuses on recent advances of AI in PDAC for clinicians. We believe that breakthroughs will soon emerge to fight this deadly disease using AI-navigated precision medicine.
Affiliation(s)
- Hiromitsu Hayashi
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
| | - Norio Uemura
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
| | - Kazuki Matsumura
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
| | - Liu Zhao
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
| | - Hiroki Sato
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
| | - Yuta Shiraishi
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
| | - Yo-ichi Yamashita
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
| | - Hideo Baba
- Department of Gastroenterological Surgery, Graduate School of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
| |
|
34
|
Kenner BJ, Abrams ND, Chari ST, Field BF, Goldberg AE, Hoos WA, Klimstra DS, Rothschild LJ, Srivastava S, Young MR, Go VLW. Early Detection of Pancreatic Cancer: Applying Artificial Intelligence to Electronic Health Records. Pancreas 2021; 50:916-922. [PMID: 34629446 PMCID: PMC8542068 DOI: 10.1097/mpa.0000000000001882] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 07/09/2021] [Accepted: 11/08/2021] [Indexed: 12/12/2022]
Abstract
The potential of artificial intelligence (AI) applied to clinical data from electronic health records (EHRs) to improve early detection for pancreatic and other cancers remains underexplored. The Kenner Family Research Fund, in collaboration with the Cancer Biomarker Research Group at the National Cancer Institute, organized the workshop entitled: "Early Detection of Pancreatic Cancer: Opportunities and Challenges in Utilizing Electronic Health Records (EHR)" in March 2021. The workshop included a select group of panelists with expertise in pancreatic cancer, EHR data mining, and AI-based modeling. This review article reflects the findings from the workshop and assesses the feasibility of AI-based data extraction and modeling applied to EHRs. It highlights the increasing role of data sharing networks and common data models in improving the secondary use of EHR data. Current efforts using EHR data for AI-based modeling to enhance early detection of pancreatic cancer show promise. Specific challenges (biology, limited data, standards, compatibility, legal, quality, AI chasm, incentives) are identified, with mitigation strategies summarized and next steps identified.
Affiliation(s)
| | - Natalie D. Abrams
- Division of Cancer Prevention, National Cancer Institute, Bethesda, MD
| | - Suresh T. Chari
- Department of Gastroenterology, Hepatology and Nutrition, The University of Texas MD Anderson Cancer Center, Houston, TX
| | - David S. Klimstra
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY
| | - Sudhir Srivastava
- Division of Cancer Prevention, National Cancer Institute, Bethesda, MD
| | - Matthew R. Young
- Division of Cancer Prevention, National Cancer Institute, Bethesda, MD
| | - Vay Liang W. Go
- UCLA Center for Excellence in Pancreatic Diseases, University of California, Los Angeles, Los Angeles, CA
| |
|