1
|
Felici A, Peduzzi G, Pellungrini R, Campa D. Artificial intelligence to predict cancer risk, are we there yet? A comprehensive review across cancer types. Eur J Cancer 2025; 222:115440. [PMID: 40273730 DOI: 10.1016/j.ejca.2025.115440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2025] [Accepted: 03/25/2025] [Indexed: 04/26/2025]
Abstract
Cancer remains the second leading cause of death worldwide, representing a substantial challenge to global health. Although traditional risk prediction models have played a crucial role in the epidemiology of several cancer types, they have limitations, especially in their ability to process complex and multidimensional data. In contrast, artificial intelligence (AI) approaches represent a promising solution to overcome this limitation. AI techniques have the potential to identify complex patterns and relationships in data that traditional methods might overlook, making them especially useful for handling the large and heterogeneous datasets analysed in cancer research. This review first examines the current state of the art of AI techniques, highlighting their differences and suitability for various data types. It then offers a comprehensive analysis of the literature, focusing on the application of AI approaches in nineteen cancer types (bladder cancer, breast cancer, cervical cancer, colorectal cancer, endometrial cancer, esophageal cancer, gastric cancer, gynaecological cancers, head and neck cancer, haematological cancers, kidney cancer, liver cancer, lung cancer, melanoma, ovarian cancer, pancreatic cancer, prostate cancer, thyroid cancer and overall cancer), evaluating the models, metrics, and exposure variables used. Finally, the review discusses the application of AI in clinical practice, along with an assessment of its potential limitations and future directions.
Affiliation(s)
- Alessio Felici
- Department of Biology, University of Pisa, Via Luca Ghini, 13, Pisa 56126, Italy
| | - Giulia Peduzzi
- Department of Biology, University of Pisa, Via Luca Ghini, 13, Pisa 56126, Italy
| | - Roberto Pellungrini
- Classe di scienze, Scuola Normale Superiore, Piazza dei Cavalieri, 7, Pisa 56126, Italy
| | - Daniele Campa
- Department of Biology, University of Pisa, Via Luca Ghini, 13, Pisa 56126, Italy.
| |
|
2
|
Rubenstein JH, Burns J, Arasim ME, Evans RR. Validation of Tools Using the Electronic Health Record for Predicting Barrett's Esophagus With High-grade Dysplasia. Clin Gastroenterol Hepatol 2025:S1542-3565(25)00324-6. [PMID: 40315971 DOI: 10.1016/j.cgh.2025.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2025] [Revised: 03/05/2025] [Accepted: 03/18/2025] [Indexed: 05/04/2025]
Affiliation(s)
- Joel H Rubenstein
- Veterans Affairs Center for Clinical Management Research, LTC Charles S. Kettles Veterans Affairs Medical Center, Ann Arbor, Michigan; Barrett's Esophagus Program, Division of Gastroenterology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan.
| | - Jennifer Burns
- Veterans Affairs Center for Clinical Management Research, LTC Charles S. Kettles Veterans Affairs Medical Center, Ann Arbor, Michigan
| | - Maria E Arasim
- Veterans Affairs Center for Clinical Management Research, LTC Charles S. Kettles Veterans Affairs Medical Center, Ann Arbor, Michigan
| | - Richard R Evans
- Veterans Affairs Center for Clinical Management Research, LTC Charles S. Kettles Veterans Affairs Medical Center, Ann Arbor, Michigan
| |
|
3
|
Moglia V, Johnson O, Cook G, de Kamps M, Smith L. Artificial intelligence methods applied to longitudinal data from electronic health records for prediction of cancer: a scoping review. BMC Med Res Methodol 2025; 25:24. [PMID: 39875808 PMCID: PMC11773903 DOI: 10.1186/s12874-025-02473-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Accepted: 01/17/2025] [Indexed: 01/30/2025] Open
Abstract
BACKGROUND: Early detection and diagnosis of cancer are vital to improving outcomes for patients. Artificial intelligence (AI) models have shown promise in the early detection and diagnosis of cancer, but there is limited evidence on methods that fully exploit the longitudinal data stored within electronic health records (EHRs). This review aims to summarise methods currently utilised for prediction of cancer from longitudinal data and provides recommendations on how such models should be developed. METHODS: The review was conducted following PRISMA-ScR guidance. Six databases (MEDLINE, EMBASE, Web of Science, IEEE Xplore, PubMed and SCOPUS) were searched for relevant records published before 2/2/2024. Search terms related to the concepts "artificial intelligence", "prediction", "health records", "longitudinal", and "cancer". Data were extracted relating to several areas of the articles: (1) publication details, (2) study characteristics, (3) input data, (4) model characteristics, (5) reproducibility, and (6) quality assessment using the PROBAST tool. Models were evaluated against a framework for terminology relating to reporting of cancer detection and risk prediction models. RESULTS: Of 653 records screened, 33 were included in the review; 10 predicted risk of cancer, 18 performed either cancer detection or early detection, 4 predicted recurrence, and 1 predicted metastasis. The most common cancers predicted in the studies were colorectal (n = 9) and pancreatic cancer (n = 9). Sixteen studies used feature engineering to represent temporal data, with the most common features representing trends. Eighteen used deep learning models which take a direct sequential input, most commonly recurrent neural networks, but also including convolutional neural networks and transformers. Prediction windows and lead times varied greatly between studies, even for models predicting the same cancer. High risk of bias was found in 90% of the studies. This risk was often introduced due to inappropriate study design (n = 26) and sample size (n = 26). CONCLUSION: This review highlights the breadth of approaches to cancer prediction from longitudinal data. We identify areas where reporting of methods could be improved, particularly regarding where in a patient's trajectory the model is applied. The review shows opportunities for further work, including comparison of these approaches and their applications in other cancers.
Affiliation(s)
- Victoria Moglia
- School of Computing, University of Leeds, Woodhouse Lane, Leeds, LS2 9JT, UK.
| | - Owen Johnson
- School of Computing, University of Leeds, Woodhouse Lane, Leeds, LS2 9JT, UK
| | - Gordon Cook
- Leeds Institute of Clinical Trials Research, University of Leeds, Clarendon Way, Leeds, LS2 9NL, UK
- NIHR Leeds Biomedical Research Centre, Chapeltown Road, Leeds, LS7 4SA, United Kingdom
| | - Marc de Kamps
- School of Computing, University of Leeds, Woodhouse Lane, Leeds, LS2 9JT, UK
| | - Lesley Smith
- Leeds Institute of Clinical Trials Research, University of Leeds, Clarendon Way, Leeds, LS2 9NL, UK
| |
|
4
|
Haue AD, Hjaltelin JX, Holm PC, Placido D, Brunak S. Artificial intelligence-aided data mining of medical records for cancer detection and screening. Lancet Oncol 2024; 25:e694-e703. [PMID: 39637906 DOI: 10.1016/s1470-2045(24)00277-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 05/08/2024] [Accepted: 05/10/2024] [Indexed: 12/07/2024]
Abstract
The application of artificial intelligence methods to electronic patient records paves the way for large-scale analysis of multimodal data. Such population-wide data describing deep phenotypes composed of thousands of features are now being leveraged to create data-driven algorithms, which in turn has led to improved methods for early cancer detection and screening. Remaining challenges include establishment of infrastructures for prospective testing of such methods, ways to assess biases given the data, and gathering of sufficiently large and diverse datasets that reflect disease heterogeneities across populations. This Review provides an overview of artificial intelligence methods designed to detect cancer early, including key aspects of concern (eg, the problem of data drift, when the underlying health-care data change over time), ethical aspects, and discrepancies between access to cancer screening in high-income countries versus low-income and middle-income countries.
Affiliation(s)
- Amalie Dahl Haue
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; Copenhagen University Hospital Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
| | - Jessica Xin Hjaltelin
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Peter Christoffer Holm
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Davide Placido
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; Copenhagen University Hospital Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
| | - Søren Brunak
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark; Copenhagen University Hospital Rigshospitalet, University of Copenhagen, Copenhagen, Denmark.
| |
|
5
|
Laun SE, Kann L, Braun J, Gilbert S, Lunz D, Pierre F, Kalra A, Ma K, Tsai HL, Wang H, Jit S, Cheng Y, Ahmed Y, Wang KK, Leggett CL, Cellini A, Ioffe OB, Zaidi AH, Omstead AN, Jobe B, Korman L, Cornish D, Zellenrath P, Spaander M, Kuipers E, Perpetua L, Greenwald BD, Maddala T, Meltzer SJ. Validation of an Epigenetic Prognostic Assay to Accurately Risk-Stratify Patients with Barrett's Esophagus. Am J Gastroenterol 2024; 120:00000434-990000000-01289. [PMID: 39140473 PMCID: PMC11825890 DOI: 10.14309/ajg.0000000000003030] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/14/2023] [Accepted: 07/20/2024] [Indexed: 08/15/2024]
Abstract
INTRODUCTION: Esophageal adenocarcinoma (EAC) is the second-most lethal cancer in the United States, with Barrett esophagus (BE) being the strongest risk factor. Assessing the future risk of neoplastic progression in patients with BE is difficult; however, high-grade dysplasia (HGD) and early EAC are treatable by endoscopic eradication therapy (EET), with survival rates of 90%. Thus, it would be beneficial to develop a molecular assay to identify high-risk patients, who merit more frequent endoscopic surveillance or EET, as well as low-risk patients, who can avoid EET and undergo less frequent surveillance. METHODS: Deidentified endoscopic biopsies were acquired from 240 patients with BE at 6 centers and confirmed as future progressors or nonprogressors. Tissues were analyzed by a set of methylation-specific biomarker assays. Test performance was assessed in an independent validation set using 4 stratification levels: low risk, low-moderate risk, high-moderate risk, and high risk. RESULTS: Relative to patients in the low-risk group, high-risk patients were 15.2 times more likely to progress within 5 years to HGD or EAC. For patients in the high-risk category, the average risk of progressing to HGD or EAC within 5 years was 21.5%, 4-fold the BE population prevalence within 5 years, whereas low-risk patients had a progression risk of only 1.85%. DISCUSSION: This clinical assay, Esopredict, stratifies future neoplastic progression risk to identify higher-risk patients with BE who can benefit from EET or more frequent surveillance and lower-risk patients who can benefit from reduced surveillance.
Affiliation(s)
| | - Andrew Kalra
- Division of Gastroenterology and Hepatology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Ke Ma
- Division of Gastroenterology and Hepatology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Hua-Ling Tsai
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Division of Biostatistics, Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Hao Wang
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Division of Biostatistics, Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Simran Jit
- Division of Gastroenterology and Hepatology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Yulan Cheng
- Division of Gastroenterology and Hepatology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Yousra Ahmed
- Division of Gastroenterology and Hepatology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Kenneth K. Wang
- Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, Minnesota, USA
| | - Cadman L. Leggett
- Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, Minnesota, USA
| | - Ashley Cellini
- Department of Pathology, University of Maryland School of Medicine, Baltimore, Maryland, USA
| | - Olga B. Ioffe
- Department of Pathology, University of Maryland School of Medicine, Baltimore, Maryland, USA
| | - Ali H. Zaidi
- Esophageal Institute, Allegheny Health Network Cancer Institute, Allegheny Health Network, Pittsburgh, Pennsylvania, USA
| | - Ashten N. Omstead
- Esophageal Institute, Allegheny Health Network Cancer Institute, Allegheny Health Network, Pittsburgh, Pennsylvania, USA
| | - Blair Jobe
- Department of Surgery, Esophageal Institute, Allegheny Health Network, Pittsburgh, Pennsylvania, USA
- Department of Surgery, Drexel University, Philadelphia, Pennsylvania, USA
| | - Louis Korman
- Capital Digestive Care, Chevy Chase, Maryland, USA
| | - Drew Cornish
- Capital Digestive Care, Chevy Chase, Maryland, USA
| | - Pauline Zellenrath
- Department of Gastroenterology & Hepatology, Erasmus MC University Medical Center, Rotterdam, the Netherlands
| | - Manon Spaander
- Department of Gastroenterology & Hepatology, Erasmus MC University Medical Center, Rotterdam, the Netherlands
| | - Ernst Kuipers
- Department of Gastroenterology & Hepatology, Erasmus MC University Medical Center, Rotterdam, the Netherlands
| | - Lorrie Perpetua
- Research Tissue Biorepository Core Facility, University of Connecticut, Storrs, Connecticut, USA
| | - Bruce D. Greenwald
- Division of Gastroenterology and Hepatology, University of Maryland School of Medicine, Baltimore, Maryland, USA
| | - Stephen J. Meltzer
- Division of Gastroenterology and Hepatology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| |
|