1
Wu W, Laville A, Deutsch E, Sun R. Deep learning for malignant lymph node segmentation and detection: a review. Front Immunol 2025; 16:1526518. [PMID: 40356919 PMCID: PMC12066500 DOI: 10.3389/fimmu.2025.1526518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2024] [Accepted: 03/17/2025] [Indexed: 05/15/2025]
Abstract
Radiation therapy remains a cornerstone in the treatment of cancer, with the delineation of Organs at Risk (OARs), tumors, and malignant lymph nodes playing a critical role in the planning process. However, the manual segmentation of these anatomical structures is both time-consuming and costly, with inter-observer and intra-observer variability often leading to delineation errors. In recent years, deep learning-based automatic segmentation has gained increasing attention, leading to a proliferation of scholarly works on OAR and tumor segmentation algorithms utilizing deep learning techniques. Nevertheless, similar comprehensive reviews focusing solely on malignant lymph nodes are scarce. This paper provides an in-depth review of the advancements in deep learning for malignant lymph node segmentation and detection. After a brief overview of deep learning methodologies, the review examines specific models and their outcomes for malignant lymph node segmentation and detection across five clinical sites: head and neck, upper extremity, chest, abdomen, and pelvis. The discussion section extensively covers the current challenges and future trends in this field, analyzing how they might impact clinical applications. This review aims to bridge the gap in literature by providing a focused overview on deep learning applications in the context of malignant lymph node challenges, offering insights into their potential to enhance the precision and efficiency of cancer treatment planning.
Affiliation(s)
- Eric Deutsch
- Unité Mixte de Recherche (UMR) 1030, Gustave Roussy, Department of Radiation Oncology, Université Paris-Saclay, Villejuif, France
- Roger Sun
- Unité Mixte de Recherche (UMR) 1030, Gustave Roussy, Department of Radiation Oncology, Université Paris-Saclay, Villejuif, France
2
Liao W, Luo X, Li L, Xu J, He Y, Huang H, Zhang S. Automatic cervical lymph nodes detection and segmentation in heterogeneous computed tomography images using deep transfer learning. Sci Rep 2025; 15:4250. [PMID: 39905029 PMCID: PMC11794882 DOI: 10.1038/s41598-024-84804-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2024] [Accepted: 12/27/2024] [Indexed: 02/06/2025]
Abstract
To develop a deep learning model using transfer learning for automatic detection and segmentation of neck lymph nodes (LNs) in computed tomography (CT) images, the study included 11,013 annotated LNs with a short-axis diameter ≥ 3 mm from 626 head and neck cancer patients across four hospitals. The nnUNet model was used as a baseline, pre-trained on a large-scale head and neck dataset, and then fine-tuned with 4,729 LNs from hospital A for detection and segmentation. Validation was conducted on an internal testing cohort (ITC A) and three external testing cohorts (ETCs B, C, and D), with 1,684 and 4,600 LNs, respectively. Detection was evaluated via sensitivity, positive predictive value (PPV), and false positive rate per case (FP/vol), while segmentation was assessed using the Dice similarity coefficient (DSC) and Hausdorff distance (HD95). For detection, the sensitivity, PPV, and FP/vol in ITC A were 54.6%, 69.0%, and 3.4, respectively. In ETCs, the sensitivity ranged from 45.7% at 3.9 FP/vol to 63.5% at 5.8 FP/vol. Segmentation achieved a mean DSC of 0.72 in ITC A and 0.72 to 0.74 in ETCs, as well as a mean HD95 of 3.78 mm in ITC A and 2.73 mm to 2.85 mm in ETCs. No significant sensitivity difference was found between contrast-enhanced and unenhanced CT images (p = 0.502) or repeated CT images (p = 0.815) during adaptive radiotherapy. The model's segmentation accuracy was comparable to that of experienced oncologists. The model shows promise in automatically detecting and segmenting neck LNs in CT images, potentially reducing oncologists' segmentation workload.
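The detection and overlap metrics named in this abstract can be illustrated with a minimal sketch. It assumes lesion-level counts pooled over scans and masks represented as sets of voxel indices; the `CaseCounts` container and both function names are hypothetical, and the abstract does not state how the authors matched predictions to reference nodes or pooled their counts.

```python
from dataclasses import dataclass


@dataclass
class CaseCounts:
    """Per-scan detection counts (hypothetical container, not from the paper)."""
    true_positives: int   # reference LNs matched by a prediction
    false_positives: int  # predictions with no matching reference LN
    false_negatives: int  # reference LNs that were missed


def detection_metrics(cases):
    """Return (sensitivity, PPV, false positives per volume) pooled over scans."""
    tp = sum(c.true_positives for c in cases)
    fp = sum(c.false_positives for c in cases)
    fn = sum(c.false_negatives for c in cases)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    fp_per_volume = fp / len(cases) if cases else 0.0
    return sensitivity, ppv, fp_per_volume


def dice(pred_voxels, ref_voxels):
    """Dice similarity coefficient between two sets of voxel indices."""
    pred, ref = set(pred_voxels), set(ref_voxels)
    if not pred and not ref:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(pred & ref) / (len(pred) + len(ref))
```

Pooling counts before dividing weights every lymph node equally; averaging per-scan ratios instead would weight every patient equally, and the two can differ noticeably when node counts vary between scans.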
Affiliation(s)
- Wenjun Liao
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, Sichuan Cancer Center, Cancer Hospital Affiliate to School of Medicine, University of Electronic Science and Technology of China, Chengdu, 610041, China
- Xiangde Luo
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Lu Li
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, Sichuan Cancer Center, Cancer Hospital Affiliate to School of Medicine, University of Electronic Science and Technology of China, Chengdu, 610041, China
- Jinfeng Xu
- Department of Radiation Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China
- Yuan He
- Department of Radiation Oncology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 23000, Anhui, China
- Hui Huang
- Cancer Center, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, 610072, China
- Shichuan Zhang
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, Sichuan Cancer Center, Cancer Hospital Affiliate to School of Medicine, University of Electronic Science and Technology of China, Chengdu, 610041, China.
3
Al Hasan MM, Ghazimoghadam S, Tunlayadechanont P, Mostafiz MT, Gupta M, Roy A, Peters K, Hochhegger B, Mancuso A, Asadizanjani N, Forghani R. Automated Segmentation of Lymph Nodes on Neck CT Scans Using Deep Learning. Journal of Imaging Informatics in Medicine 2024; 37:2955-2966. [PMID: 38937342 PMCID: PMC11612088 DOI: 10.1007/s10278-024-01114-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Revised: 04/01/2024] [Accepted: 04/03/2024] [Indexed: 06/29/2024]
Abstract
Early and accurate detection of cervical lymph nodes is essential for the optimal management and staging of patients with head and neck malignancies. Pilot studies have demonstrated the potential for radiomic and artificial intelligence (AI) approaches in increasing diagnostic accuracy for the detection and classification of lymph nodes, but implementation of many of these approaches in real-world clinical settings would necessitate an automated lymph node segmentation pipeline as a first step. In this study, we aim to develop a non-invasive deep learning (DL) algorithm for detecting and automatically segmenting cervical lymph nodes in 25,119 CT slices from 221 normal neck contrast-enhanced CT scans from patients without head and neck cancer. We focused on the most challenging task of segmentation of small lymph nodes, evaluated multiple architectures, and employed U-Net and our adapted spatial context network to detect and segment small lymph nodes measuring 5-10 mm. The developed algorithm achieved a Dice score of 0.8084, indicating its effectiveness in detecting and segmenting cervical lymph nodes despite their small size. A segmentation framework successful in this task could represent an essential initial block for future algorithms aiming to evaluate small objects such as lymph nodes in different body parts, including small lymph nodes looking normal to the naked human eye but harboring early nodal metastases.
Affiliation(s)
- Md Mahfuz Al Hasan
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, 1600 SW Archer Road, Gainesville, FL, 32610-0374, USA
- Department of Electrical and Computer Engineering, University of Florida College of Medicine, Gainesville, FL, USA
- Saba Ghazimoghadam
- Augmented Intelligence and Precision Health Laboratory, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Padcha Tunlayadechanont
- Augmented Intelligence and Precision Health Laboratory, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Department of Diagnostic and Therapeutic Radiology and Research, Faculty of Medicine Ramathibodi Hospital, Ratchathewi, Bangkok, Thailand
- Mohammed Tahsin Mostafiz
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, 1600 SW Archer Road, Gainesville, FL, 32610-0374, USA
- Department of Electrical and Computer Engineering, University of Florida College of Medicine, Gainesville, FL, USA
- Manas Gupta
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, 1600 SW Archer Road, Gainesville, FL, 32610-0374, USA
- Antika Roy
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, 1600 SW Archer Road, Gainesville, FL, 32610-0374, USA
- Department of Electrical and Computer Engineering, University of Florida College of Medicine, Gainesville, FL, USA
- Keith Peters
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, 1600 SW Archer Road, Gainesville, FL, 32610-0374, USA
- Department of Radiology, University of Florida College of Medicine, Gainesville, FL, USA
- Bruno Hochhegger
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, 1600 SW Archer Road, Gainesville, FL, 32610-0374, USA
- Department of Radiology, University of Florida College of Medicine, Gainesville, FL, USA
- Anthony Mancuso
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, 1600 SW Archer Road, Gainesville, FL, 32610-0374, USA
- Department of Radiology, University of Florida College of Medicine, Gainesville, FL, USA
- Navid Asadizanjani
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, 1600 SW Archer Road, Gainesville, FL, 32610-0374, USA
- Department of Electrical and Computer Engineering, University of Florida College of Medicine, Gainesville, FL, USA
- Reza Forghani
- Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, 1600 SW Archer Road, Gainesville, FL, 32610-0374, USA.
- Department of Radiology, University of Florida College of Medicine, Gainesville, FL, USA.
- Division of Medical Physics, University of Florida College of Medicine, Gainesville, FL, USA.
- Department of Neurology, Division of Movement Disorders, University of Florida College of Medicine, Gainesville, FL, USA.
- Augmented Intelligence and Precision Health Laboratory, Research Institute of the McGill University Health Centre, Montreal, QC, Canada.
4
Nam Y, Kim SY, Kim KA, Kwon E, Lee YH, Jang J, Lee MK, Kim J, Choi Y. Development and Validation of Deep Learning-Based Automated Detection of Cervical Lymphadenopathy in Patients with Lymphoma for Treatment Response Assessment: A Bi-institutional Feasibility Study. Journal of Imaging Informatics in Medicine 2024; 37:734-743. [PMID: 38316667 PMCID: PMC11031526 DOI: 10.1007/s10278-024-00966-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 11/11/2023] [Accepted: 11/13/2023] [Indexed: 02/07/2024]
Abstract
The purpose is to train and evaluate a deep learning (DL) model for the accurate detection and segmentation of abnormal cervical lymph nodes (LN) on head and neck contrast-enhanced CT scans in patients diagnosed with lymphoma and to evaluate the clinical utility of the DL model in response assessment. This retrospective study included patients who underwent CT for abnormal cervical LN and lymphoma assessment between January 2021 and July 2022. Patients were grouped into the development (n = 76), internal test 1 (n = 27), internal test 2 (n = 87), and external test (n = 26) cohorts. A 3D SegResNet model was trained on the CT images. The volume change rates of cervical LN across longitudinal CT scans were compared among patients with different treatment outcomes (stable, response, and progression). Dice similarity coefficient (DSC) and the Bland-Altman plot were used to assess the model's segmentation performance and reliability, respectively. No significant differences in baseline clinical characteristics were found across cohorts (age, P = 0.55; sex, P = 0.13; diagnoses, P = 0.06). The mean DSC was 0.39 ± 0.2 with a precision and recall of 60.9% and 57.0%, respectively. Most LN volumes were within the limits of agreement on the Bland-Altman plot. The volume change rates among the three groups differed significantly (progression (n = 74), 342.2%; response (n = 8), -79.2%; stable (n = 5), -8.1%; all P < 0.01). Our proposed DL segmentation model showed modest performance in quantifying the cervical LN burden on CT in patients with lymphoma. Longitudinal changes in cervical LN volume, as predicted by the DL model, were useful for treatment response assessment.
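Two quantities in this abstract lend themselves to a short illustration: the volume change rate between longitudinal scans and the Bland-Altman limits of agreement. The sketch below assumes the change rate is the percentage change relative to the baseline volume and that the limits are the conventional mean difference ± 1.96 sample standard deviations; the abstract does not spell out either formula, and the function names are hypothetical.

```python
from statistics import mean, stdev


def volume_change_rate(baseline_ml, followup_ml):
    """Percentage change in nodal volume from baseline to follow-up."""
    if baseline_ml <= 0:
        raise ValueError("baseline volume must be positive")
    return 100.0 * (followup_ml - baseline_ml) / baseline_ml


def bland_altman_limits(predicted, reference):
    """Return (bias, lower, upper) 95% limits of agreement for paired volumes."""
    diffs = [p - r for p, r in zip(predicted, reference)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD; needs at least two pairs
    return bias, bias - spread, bias + spread
```

Under this definition a node that shrinks from 10 mL to 2 mL has a change rate of -80%, which is on the scale of the response-group figure reported above.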
Affiliation(s)
- Yoonho Nam
- Division of Biomedical Engineering, Hankuk University of Foreign Studies, Yongin-Si, Gyeonggi-do, Republic of Korea
- Su-Youn Kim
- Division of Biomedical Engineering, Hankuk University of Foreign Studies, Yongin-Si, Gyeonggi-do, Republic of Korea
- Kyu-Ah Kim
- Division of Biomedical Engineering, Hankuk University of Foreign Studies, Yongin-Si, Gyeonggi-do, Republic of Korea
- Euna Kwon
- Division of Biomedical Engineering, Hankuk University of Foreign Studies, Yongin-Si, Gyeonggi-do, Republic of Korea
- Yoo Hyun Lee
- College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
- Jinhee Jang
- Department of Radiology, College of Medicine, Seoul St. Mary's Hospital, The Catholic University of Korea, Seoul, Republic of Korea
- Min Kyoung Lee
- Department of Radiology, College of Medicine, Yeouido St. Mary's Hospital, The Catholic University of Korea, Seoul, Republic of Korea
- Jiwoong Kim
- Department of Mathematics and Statistics, University of South Florida, Tampa, FL, USA
- Yangsean Choi
- Department of Radiology, College of Medicine, Seoul St. Mary's Hospital, The Catholic University of Korea, Seoul, Republic of Korea.
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Centre, 43 Olympic-Ro 88, Songpa-Gu, Seoul, 05505, Republic of Korea.
5
Alabi RO, Sjöblom A, Carpén T, Elmusrati M, Leivo I, Almangush A, Mäkitie AA. Application of artificial intelligence for overall survival risk stratification in oropharyngeal carcinoma: A validation of ProgTOOL. Int J Med Inform 2023; 175:105064. [PMID: 37094545 DOI: 10.1016/j.ijmedinf.2023.105064] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/01/2022] [Revised: 03/31/2023] [Accepted: 04/03/2023] [Indexed: 04/26/2023]
Abstract
BACKGROUND In recent years, there has been a surge in machine learning-based models for diagnosis and prognostication of outcomes in oncology. However, there are concerns relating to the model's reproducibility and generalizability to a separate patient cohort (i.e., external validation). OBJECTIVES This study primarily provides a validation study for a recently introduced and publicly available machine learning (ML) web-based prognostic tool (ProgTOOL) for overall survival risk stratification of oropharyngeal squamous cell carcinoma (OPSCC). Additionally, we reviewed the published studies that have utilized ML for outcome prognostication in OPSCC to examine how many of these models were externally validated; the type of external validation, the characteristics of the external dataset, and the diagnostic performance characteristics on the internal validation (IV) and external validation (EV) datasets were extracted and compared. METHODS We used a total of 163 OPSCC patients obtained from the Helsinki University Hospital to externally validate the ProgTOOL for generalizability. In addition, PubMed, OvidMedline, Scopus, and Web of Science databases were systematically searched according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. RESULTS The ProgTOOL produced a predictive performance of 86.5% balanced accuracy, Matthews correlation coefficient of 0.78, Net Benefit (0.7) and Brier score (0.06) for overall survival stratification of OPSCC patients as either low-chance or high-chance. In addition, out of a total of 31 studies found to have used ML for the prognostication of outcomes in OPSCC, only seven (22.6%) reported a form of EV. Three studies (42.9%) each used either temporal EV or geographical EV while only one study (14.2%) used expert as a form of EV. Most of the studies reported a reduction in performance when externally validated. CONCLUSION The performance of the model in this validation study indicates that it may be generalized, therefore, bringing recommendations of the model for clinical evaluation closer to reality. However, the number of externally validated ML-based models for OPSCC is still relatively small. This significantly limits the transfer of these models for clinical evaluation and subsequently reduces the likelihood of the use of these models in daily clinical practice. As a gold standard, we recommend the use of geographical EV and validation studies to reveal biases and overfitting of these models. These recommendations are poised to facilitate the implementation of these models in clinical practice.
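For readers less familiar with the performance measures quoted in the RESULTS, the following sketch shows their standard textbook definitions for a binary classifier. It is not the ProgTOOL code; the function names are hypothetical, and Net Benefit is omitted because it depends on a decision threshold that the abstract does not report.

```python
import math


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 when a marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)
```

Balanced accuracy and the Matthews correlation coefficient both correct for class imbalance, which matters when one survival class is much rarer than the other; a lower Brier score indicates better-calibrated probabilities.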
Affiliation(s)
- Rasheed Omobolaji Alabi
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland; Department of Industrial Digitalization, School of Technology and Innovations, University of Vaasa, Vaasa, Finland.
- Anni Sjöblom
- Department of Pathology, University of Helsinki, Helsinki, Finland
- Timo Carpén
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland; Department of Pathology, University of Helsinki, Helsinki, Finland; Department of Otorhinolaryngology - Head and Neck Surgery, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Mohammed Elmusrati
- Department of Industrial Digitalization, School of Technology and Innovations, University of Vaasa, Vaasa, Finland
- Ilmo Leivo
- University of Turku, Institute of Biomedicine, Pathology, Turku, Finland
- Alhadi Almangush
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland; Department of Pathology, University of Helsinki, Helsinki, Finland; University of Turku, Institute of Biomedicine, Pathology, Turku, Finland; Faculty of Dentistry, Misurata University, Misurata, Libya
- Antti A Mäkitie
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland; Department of Otorhinolaryngology - Head and Neck Surgery, University of Helsinki and Helsinki University Hospital, Helsinki, Finland; Division of Ear, Nose and Throat Diseases, Department of Clinical Sciences, Intervention and Technology, Karolinska Institute and Karolinska University Hospital, Stockholm, Sweden
6
Wahid KA, Lin D, Sahin O, Cislo M, Nelms BE, He R, Naser MA, Duke S, Sherer MV, Christodouleas JP, Mohamed ASR, Murphy JD, Fuller CD, Gillespie EF. Large scale crowdsourced radiotherapy segmentations across a variety of cancer anatomic sites. Sci Data 2023; 10:161. [PMID: 36949088 PMCID: PMC10033824 DOI: 10.1038/s41597-023-02062-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/14/2022] [Accepted: 03/10/2023] [Indexed: 03/24/2023]
Abstract
Clinician-generated segmentation of tumor and healthy tissue regions of interest (ROIs) on medical images is crucial for radiotherapy. However, interobserver segmentation variability has long been considered a significant detriment to the implementation of high-quality and consistent radiotherapy dose delivery. This has prompted the increasing development of automated segmentation approaches. However, extant segmentation datasets typically only provide segmentations generated by a limited number of annotators with varying, and often unspecified, levels of expertise. In this data descriptor, numerous clinician annotators manually generated segmentations for ROIs on computed tomography images across a variety of cancer sites (breast, sarcoma, head and neck, gynecologic, gastrointestinal; one patient per cancer site) for the Contouring Collaborative for Consensus in Radiation Oncology challenge. In total, over 200 annotators (experts and non-experts) contributed using a standardized annotation platform (ProKnow). Subsequently, we converted Digital Imaging and Communications in Medicine data into Neuroimaging Informatics Technology Initiative format with standardized nomenclature for ease of use. In addition, we generated consensus segmentations for experts and non-experts using the Simultaneous Truth and Performance Level Estimation method. These standardized, structured, and easily accessible data are a valuable resource for systematically studying variability in segmentation applications.
Affiliation(s)
- Kareem A Wahid
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Diana Lin
- Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Onur Sahin
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Michael Cislo
- Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Renjie He
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Mohammed A Naser
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Simon Duke
- Department of Radiation Oncology, Cambridge University Hospitals, Cambridge, UK
- Michael V Sherer
- Department of Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, CA, USA
- John P Christodouleas
- Department of Radiation Oncology, The University of Pennsylvania Cancer Center, Philadelphia, PA, USA
- Elekta, Atlanta, GA, USA
- Abdallah S R Mohamed
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- James D Murphy
- Department of Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, CA, USA
- Clifton D Fuller
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA.
- Erin F Gillespie
- Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
- Fred Hutchinson Cancer Center, Seattle, WA, USA.
7
Huiskes M, Astreinidou E, Kong W, Breedveld S, Heijmen B, Rasch C. Dosimetric impact of adaptive proton therapy in head and neck cancer - A review. Clin Transl Radiat Oncol 2023; 39:100598. [PMID: 36860581 PMCID: PMC9969246 DOI: 10.1016/j.ctro.2023.100598] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/12/2022] [Revised: 02/10/2023] [Accepted: 02/12/2023] [Indexed: 02/18/2023]
Abstract
Background Intensity Modulated Proton Therapy (IMPT) in head and neck cancer (HNC) is susceptible to anatomical changes and patient set-up inaccuracies during the radiotherapy course, which can cause discrepancies between planned and delivered dose. The discrepancies can be counteracted by adaptive replanning strategies. This article reviews the observed dosimetric impact of adaptive proton therapy (APT) and the timing to perform a plan adaptation in IMPT in HNC. Methods A literature search of articles published in PubMed/MEDLINE, EMBASE and Web of Science from January 2010 to March 2022 was performed. Among a total of 59 records assessed for possible eligibility, ten articles were included in this review. Results Included studies reported on target coverage deterioration in IMPT plans during the RT course, which was recovered with the application of an APT approach. All APT plans showed an average improved target coverage for the high- and low-dose targets as compared to the accumulated dose on the planned plans. Dose improvements up to 2.5 Gy (3.5 %) and up to 4.0 Gy (7.1 %) in the D98 of the high- and low-dose targets were observed with APT. Doses to the organs at risk (OARs) remained equal or decreased slightly after APT was applied. In the included studies, APT was largely performed once, which resulted in the largest target coverage improvement, but any additional APT improved the target coverage further. There are no data showing the most appropriate timing for APT. Conclusion APT during IMPT for HNC patients improves target coverage. The largest improvement in target coverage was found with a single adaptive intervention, and a second or more frequent APT application improved the target coverage further. Doses to the OARs remained equal or decreased slightly after applying APT. The optimal timing for APT is yet to be determined.
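The D98 values quoted in the Results refer to the dose received by at least 98% of the target volume. A minimal sketch of that dose-volume quantity follows, assuming equally sized voxels and a flat list of per-voxel doses in Gy; the function name is hypothetical, and clinical treatment planning systems typically interpolate a cumulative dose-volume histogram rather than rank raw voxels.

```python
import math


def dose_at_volume(voxel_doses_gy, volume_percent):
    """Minimum dose received by the hottest `volume_percent` of the voxels."""
    if not voxel_doses_gy:
        raise ValueError("no voxel doses supplied")
    if not 0 < volume_percent <= 100:
        raise ValueError("volume_percent must be in (0, 100]")
    ranked = sorted(voxel_doses_gy, reverse=True)  # hottest voxel first
    count = math.ceil(len(ranked) * volume_percent / 100)
    return ranked[count - 1]
```

Calling `dose_at_volume(doses, 98)` gives D98, so a rise in this value after replanning means that the coldest part of the target is receiving more dose.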
Affiliation(s)
- Merle Huiskes
- Department of Radiation Oncology, Leiden University Medical Center, Leiden, the Netherlands
- Eleftheria Astreinidou
- Department of Radiation Oncology, Leiden University Medical Center, Leiden, the Netherlands
- Wens Kong
- Department of Radiotherapy, Erasmus MC Cancer Institute, University Medical Center Rotterdam, the Netherlands
- Sebastiaan Breedveld
- Department of Radiotherapy, Erasmus MC Cancer Institute, University Medical Center Rotterdam, the Netherlands
- Ben Heijmen
- Department of Radiotherapy, Erasmus MC Cancer Institute, University Medical Center Rotterdam, the Netherlands
- Coen Rasch
- Department of Radiation Oncology, Leiden University Medical Center, Leiden, the Netherlands
- HollandPTC, Delft, the Netherlands
8
Sahlsten J, Wahid KA, Glerean E, Jaskari J, Naser MA, He R, Kann BH, Mäkitie A, Fuller CD, Kaski K. Segmentation stability of human head and neck cancer medical images for radiotherapy applications under de-identification conditions: Benchmarking data sharing and artificial intelligence use-cases. Front Oncol 2023; 13:1120392. [PMID: 36925936 PMCID: PMC10011442 DOI: 10.3389/fonc.2023.1120392] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/09/2022] [Accepted: 02/13/2023] [Indexed: 03/08/2023]
Abstract
Background Demand for head and neck cancer (HNC) radiotherapy data in algorithmic development has prompted increased image dataset sharing. Medical images must comply with data protection requirements so that re-use is enabled without disclosing patient identifiers. Defacing, i.e., the removal of facial features from images, is often considered a reasonable compromise between data protection and re-usability for neuroimaging data. While defacing tools have been developed by the neuroimaging community, their acceptability for radiotherapy applications has not been explored. Therefore, this study systematically investigated the impact of available defacing algorithms on HNC organs at risk (OARs). Methods A publicly available dataset of magnetic resonance imaging scans for 55 HNC patients with eight segmented OARs (bilateral submandibular glands, parotid glands, level II neck lymph nodes, level III neck lymph nodes) was utilized. Eight publicly available defacing algorithms were investigated: afni_refacer, DeepDefacer, defacer, fsl_deface, mask_face, mri_deface, pydeface, and quickshear. Using a subset of scans where defacing succeeded (N=29), a 5-fold cross-validation 3D U-net based OAR auto-segmentation model was utilized to perform two main experiments: (1) comparing original and defaced data for training when evaluated on original data; (2) using original data for training and comparing the model evaluation on original and defaced data. Models were primarily assessed using the Dice similarity coefficient (DSC). Results Most defacing methods were unable to produce any usable images for evaluation, while mask_face, fsl_deface, and pydeface were unable to remove the face for 29%, 18%, and 24% of subjects, respectively. When using the original data for evaluation, the composite OAR DSC was statistically higher (p ≤ 0.05) for the model trained with the original data with a DSC of 0.760 compared to the mask_face, fsl_deface, and pydeface models with DSCs of 0.742, 0.736, and 0.449, respectively. Moreover, the model trained with original data had decreased performance (p ≤ 0.05) when evaluated on the defaced data with DSCs of 0.673, 0.693, and 0.406 for mask_face, fsl_deface, and pydeface, respectively. Conclusion Defacing algorithms may have a significant impact on HNC OAR auto-segmentation model training and testing. This work highlights the need for further development of HNC-specific image anonymization methods.
Affiliation(s)
- Jaakko Sahlsten
- Department of Computer Science, Aalto University School of Science, Espoo, Finland
- Kareem A. Wahid
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
- Enrico Glerean
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Joel Jaskari
- Department of Computer Science, Aalto University School of Science, Espoo, Finland
- Mohamed A. Naser
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
- Renjie He
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
- Benjamin H. Kann
- Artificial Intelligence in Medicine Program, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, United States
- Antti Mäkitie
- Department of Otorhinolaryngology, Head and Neck Surgery, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
| | - Clifton D. Fuller
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
| | - Kimmo Kaski
- Department of Computer Science, Aalto University School of Science, Espoo, Finland
| |
|
9
|
Sahlsten J, Jaskari J, Wahid KA, Ahmed S, Glerean E, He R, Kann BH, Mäkitie A, Fuller CD, Naser MA, Kaski K. Application of simultaneous uncertainty quantification for image segmentation with probabilistic deep learning: Performance benchmarking of oropharyngeal cancer target delineation as a use-case. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.02.20.23286188. [PMID: 36865296 PMCID: PMC9980236 DOI: 10.1101/2023.02.20.23286188] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/26/2023]
Abstract
Background: Oropharyngeal cancer (OPC) is a widespread disease, with radiotherapy being a core treatment modality. Manual segmentation of the primary gross tumor volume (GTVp) is currently employed for OPC radiotherapy planning, but is subject to significant interobserver variability. Deep learning (DL) approaches have shown promise in automating GTVp segmentation, but comparative (auto)confidence metrics of these models' predictions have not been well explored. Quantifying instance-specific DL model uncertainty is crucial to improving clinician trust and facilitating broad clinical implementation. Therefore, in this study, probabilistic DL models for GTVp auto-segmentation were developed using large-scale PET/CT datasets, and various uncertainty auto-estimation methods were systematically investigated and benchmarked.
Methods: We utilized the publicly available 2021 HECKTOR Challenge training dataset with 224 co-registered PET/CT scans of OPC patients with corresponding GTVp segmentations as a development set. A separate set of 67 co-registered PET/CT scans of OPC patients with corresponding GTVp segmentations was used for external validation. Two approximate Bayesian deep learning methods, the MC Dropout Ensemble and Deep Ensemble, both with five submodels, were evaluated for GTVp segmentation and uncertainty performance. The segmentation performance was evaluated using the volumetric Dice similarity coefficient (DSC), mean surface distance (MSD), and Hausdorff distance at 95% (95HD). The uncertainty was evaluated using four measures from the literature: coefficient of variation (CV), structure expected entropy, structure predictive entropy, and structure mutual information, and additionally with our novel Dice-risk measure. The utility of uncertainty information was evaluated with the accuracy of uncertainty-based segmentation performance prediction using the Accuracy vs Uncertainty (AvU) metric, and by examining the linear correlation between uncertainty estimates and DSC. In addition, batch-based and instance-based referral processes were examined, in which patients with high uncertainty were rejected from the set. In the batch referral process, the area under the referral curve with DSC (R-DSC AUC) was used for evaluation, whereas in the instance referral process, the DSC at various uncertainty thresholds was examined.
Results: Both models behaved similarly in terms of segmentation performance and uncertainty estimation. Specifically, the MC Dropout Ensemble had 0.776 DSC, 1.703 mm MSD, and 5.385 mm 95HD. The Deep Ensemble had 0.767 DSC, 1.717 mm MSD, and 5.477 mm 95HD. The uncertainty measure with the highest DSC correlation was structure predictive entropy, with correlation coefficients of 0.699 and 0.692 for the MC Dropout Ensemble and the Deep Ensemble, respectively. The highest AvU value was 0.866 for both models. The best-performing uncertainty measure for both models was the CV, which had an R-DSC AUC of 0.783 and 0.782 for the MC Dropout Ensemble and Deep Ensemble, respectively. When patients were referred based on uncertainty thresholds derived from a validation DSC of 0.85, the DSC averaged over all uncertainty measures improved relative to the full dataset by 4.7% and 5.0%, while referring 21.8% and 22% of patients, for the MC Dropout Ensemble and Deep Ensemble, respectively.
Conclusion: We found that many of the investigated methods provide overall similar but distinct utility in terms of predicting segmentation quality and referral performance. These findings are a critical first step towards more widespread implementation of uncertainty quantification in OPC GTVp segmentation.
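The entropy-based measures and the threshold-based referral described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not define how the "structure" versions aggregate over voxels, so the sketch below simply averages voxel-wise values over all voxels, computes the coefficient of variation from per-submodel predicted volumes at an assumed 0.5 probability cut-off, and uses hypothetical function names (`ensemble_uncertainty`, `refer_by_threshold`). The Dice-risk measure is not sketched because the abstract does not define it.

```python
import math
from statistics import mean, pstdev


def binary_entropy(p):
    """Entropy in nats of a Bernoulli variable with foreground probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def ensemble_uncertainty(member_probs):
    """Summarize ensemble uncertainty for one scan.

    member_probs: list of M lists (one per submodel), each holding the
    per-voxel foreground probability for the same voxels.
    """
    n_voxels = len(member_probs[0])
    predictive, expected = [], []
    for v in range(n_voxels):
        probs = [member[v] for member in member_probs]
        # Predictive entropy: entropy of the ensemble-mean probability.
        predictive.append(binary_entropy(mean(probs)))
        # Expected entropy: mean of the individual submodel entropies.
        expected.append(mean([binary_entropy(p) for p in probs]))
    predictive_entropy = mean(predictive)
    expected_entropy = mean(expected)
    # Assumption: CV is taken over predicted volumes (voxels with p >= 0.5).
    volumes = [sum(1 for p in member if p >= 0.5) for member in member_probs]
    mean_volume = mean(volumes)
    cv = pstdev(volumes) / mean_volume if mean_volume > 0 else 0.0
    return {
        "predictive_entropy": predictive_entropy,
        "expected_entropy": expected_entropy,
        # Mutual information: the disagreement between submodels.
        "mutual_information": predictive_entropy - expected_entropy,
        "coefficient_of_variation": cv,
    }


def refer_by_threshold(dscs, uncertainties, threshold):
    """Instance referral: keep cases whose uncertainty is at most the threshold.

    Returns (mean DSC of retained cases, fraction of cases referred).
    """
    retained = [d for d, u in zip(dscs, uncertainties) if u <= threshold]
    referred_fraction = 1.0 - len(retained) / len(dscs)
    mean_dsc = mean(retained) if retained else float("nan")
    return mean_dsc, referred_fraction


# Toy example with two submodels and three voxels (made-up probabilities).
print(ensemble_uncertainty([[0.9, 0.2, 0.6], [0.8, 0.1, 0.4]]))
```

In this formulation the mutual information is zero when all submodels agree exactly and grows with their disagreement, which is why it is often read as the model-related part of the total (predictive) uncertainty.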
Affiliation(s)
- Jaakko Sahlsten: Department of Computer Science, Aalto University School of Science, Espoo, Finland
- Joel Jaskari: Department of Computer Science, Aalto University School of Science, Espoo, Finland
- Kareem A Wahid: Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Sara Ahmed: Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Enrico Glerean: Department of Neuroscience and Biomedical Engineering, Aalto University School of Science, Espoo, Finland
- Renjie He: Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Benjamin H Kann: Artificial Intelligence in Medicine Program, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Antti Mäkitie: Department of Otorhinolaryngology, Head and Neck Surgery, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Clifton D Fuller: Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Mohamed A Naser: Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Kimmo Kaski: Department of Computer Science, Aalto University School of Science, Espoo, Finland
|