1
Haidar A, Field M, Batumalai V, Cloak K, Al Mouiee D, Chlap P, Huang X, Chin V, Aly F, Carolan M, Sykes J, Vinod SK, Delaney GP, Holloway L. Standardising Breast Radiotherapy Structure Naming Conventions: A Machine Learning Approach. Cancers (Basel) 2023; 15:cancers15030564. [PMID: 36765523 PMCID: PMC9913464 DOI: 10.3390/cancers15030564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 01/01/2023] [Accepted: 01/11/2023] [Indexed: 01/18/2023]
Abstract
Progress in using big data in health systems requires standardised nomenclature to enable data pooling and analysis. In many radiotherapy planning systems and their data archives, the nomenclature of target volume (TV) and organ-at-risk (OAR) structures has not been standardised. Machine learning (ML) has been used to standardise volume nomenclature in retrospective datasets, but only subsets of the structures have been targeted. In this paper, we propose a new approach for standardising the nomenclature of all structures using multi-modal artificial neural networks. A cohort of 1613 breast cancer patients treated with radiotherapy was identified from Liverpool & Macarthur Cancer Therapy Centres, NSW, Australia. Four types of volume characteristics were generated to represent each target and OAR volume: textual features, geometric features, dosimetry features, and imaging data. Five datasets were created from the original cohort: the first four represented different subsets of volumes, and the last represented the whole list of volumes. For each dataset, 15 combinations of features were generated to investigate the effect of using different characteristics on standardisation performance. The best model achieved 99.416% classification accuracy on the hold-out sample when used to standardise all the nomenclature in a breast cancer radiotherapy plan into 21 classes. Our results show that ML-based automation methods can be used to standardise naming conventions in a radiotherapy plan, and that including multiple modalities helps to better represent each volume.
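The count of 15 feature combinations per dataset matches the number of non-empty subsets of four feature types (2^4 - 1 = 15). The sketch below enumerates those subsets. It is an illustration only: the abstract does not state that the 15 sets were built exactly this way, and the names `FEATURE_TYPES` and `feature_combinations` are hypothetical.

```python
from itertools import combinations

# The four feature types named in the abstract (identifiers are illustrative).
FEATURE_TYPES = ("textual", "geometric", "dosimetry", "imaging")


def feature_combinations(types):
    """Return every non-empty subset of the given feature types."""
    return [
        combo
        for size in range(1, len(types) + 1)
        for combo in combinations(types, size)
    ]


combos = feature_combinations(FEATURE_TYPES)
# 4 singles + 6 pairs + 4 triples + 1 full set = 15 combinations
print(len(combos))
```

Under this reading, each dataset would be evaluated once per subset, from a single feature type up to all four together.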
Affiliation(s)
- Ali Haidar (corresponding author)
  - Ingham Institute for Applied Medical Research, Liverpool, NSW 2170, Australia
  - Liverpool and Macarthur Cancer Therapy Centres, Liverpool, NSW 2170, Australia
  - South Western Sydney Clinical School, University of New South Wales, Liverpool, NSW 2170, Australia
- Matthew Field
  - Ingham Institute for Applied Medical Research, Liverpool, NSW 2170, Australia
  - Liverpool and Macarthur Cancer Therapy Centres, Liverpool, NSW 2170, Australia
  - South Western Sydney Clinical School, University of New South Wales, Liverpool, NSW 2170, Australia
- Vikneswary Batumalai
  - South Western Sydney Clinical School, University of New South Wales, Liverpool, NSW 2170, Australia
  - GenesisCare, Alexandria, NSW 2015, Australia
- Kirrily Cloak
  - Ingham Institute for Applied Medical Research, Liverpool, NSW 2170, Australia
  - Liverpool and Macarthur Cancer Therapy Centres, Liverpool, NSW 2170, Australia
  - South Western Sydney Clinical School, University of New South Wales, Liverpool, NSW 2170, Australia
- Daniel Al Mouiee
  - Ingham Institute for Applied Medical Research, Liverpool, NSW 2170, Australia
  - Liverpool and Macarthur Cancer Therapy Centres, Liverpool, NSW 2170, Australia
  - South Western Sydney Clinical School, University of New South Wales, Liverpool, NSW 2170, Australia
- Phillip Chlap
  - Ingham Institute for Applied Medical Research, Liverpool, NSW 2170, Australia
  - Liverpool and Macarthur Cancer Therapy Centres, Liverpool, NSW 2170, Australia
  - South Western Sydney Clinical School, University of New South Wales, Liverpool, NSW 2170, Australia
- Xiaoshui Huang
  - Ingham Institute for Applied Medical Research, Liverpool, NSW 2170, Australia
  - Liverpool and Macarthur Cancer Therapy Centres, Liverpool, NSW 2170, Australia
  - University of Sydney, Camperdown, NSW 2006, Australia
- Vicky Chin
  - Ingham Institute for Applied Medical Research, Liverpool, NSW 2170, Australia
  - Liverpool and Macarthur Cancer Therapy Centres, Liverpool, NSW 2170, Australia
  - South Western Sydney Clinical School, University of New South Wales, Liverpool, NSW 2170, Australia
- Farhannah Aly
  - Ingham Institute for Applied Medical Research, Liverpool, NSW 2170, Australia
  - Liverpool and Macarthur Cancer Therapy Centres, Liverpool, NSW 2170, Australia
  - South Western Sydney Clinical School, University of New South Wales, Liverpool, NSW 2170, Australia
- Martin Carolan
  - Illawarra Cancer Care Center, Wollongong, NSW 2522, Australia
  - University of Wollongong, Wollongong, NSW 2522, Australia
- Jonathan Sykes
  - University of Sydney, Camperdown, NSW 2006, Australia
  - Blacktown Hospital, Blacktown, NSW 2148, Australia
- Shalini K. Vinod
  - Ingham Institute for Applied Medical Research, Liverpool, NSW 2170, Australia
  - Liverpool and Macarthur Cancer Therapy Centres, Liverpool, NSW 2170, Australia
  - South Western Sydney Clinical School, University of New South Wales, Liverpool, NSW 2170, Australia
- Geoffrey P. Delaney
  - Ingham Institute for Applied Medical Research, Liverpool, NSW 2170, Australia
  - Liverpool and Macarthur Cancer Therapy Centres, Liverpool, NSW 2170, Australia
  - South Western Sydney Clinical School, University of New South Wales, Liverpool, NSW 2170, Australia
- Lois Holloway
  - Ingham Institute for Applied Medical Research, Liverpool, NSW 2170, Australia
  - Liverpool and Macarthur Cancer Therapy Centres, Liverpool, NSW 2170, Australia
  - South Western Sydney Clinical School, University of New South Wales, Liverpool, NSW 2170, Australia
  - University of Sydney, Camperdown, NSW 2006, Australia
2
Syed K, Sleeman WC, Hagan M, Palta J, Kapoor R, Ghosh P. Multi-View Data Integration Methods for Radiotherapy Structure Name Standardization. Cancers (Basel) 2021; 13:cancers13081796. [PMID: 33918716 PMCID: PMC8070367 DOI: 10.3390/cancers13081796] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/19/2021] [Revised: 03/28/2021] [Accepted: 04/05/2021] [Indexed: 11/24/2022]
Abstract
Simple Summary: Structure names associated with radiotherapy treatments need standardization to develop data pipelines enabling personalized treatment plans. Automatic classification of structure names based on the currently available TG-263 nomenclature can help with data aggregation from both retrospective and future data sources. The aim of our proposed machine learning-based data integration methods is to achieve highly accurate structure name classification to automate the data aggregation process. Our multi-view models can overcome the challenges of integrating different data types associated with radiotherapy structures, such as the physician-given text labels and geometric or image data. The models exhibited high accuracy when tested on multi-center and multi-institutional lung and prostate cancer patient data and outperformed the models built on any single data type. This highlights the importance of combining different types of data in building generalizable models for structure name standardization.
Abstract: Standardization of radiotherapy structure names is essential for developing data-driven personalized radiotherapy treatment plans. Different types of data are associated with radiotherapy structures, such as the physician-given text labels, geometric (image) data, and Dose-Volume Histograms (DVH). Prior work on structure name standardization used just one type of data. We present novel approaches to integrate complementary types (views) of structure data to build better-performing machine learning models. We present two methods, namely (a) intermediate integration and (b) late integration, to combine physician-given textual structure name features and geometric information of structures. The dataset consisted of 709 prostate cancer and 752 lung cancer patients across 40 radiotherapy centers administered by the U.S. Veterans Health Administration (VA) and the Department of Radiation Oncology, Virginia Commonwealth University (VCU). We used randomly selected data from 30 centers for training and 10 centers for testing. We also used the VCU data for testing. We observed that the intermediate integration approach outperformed the models with a single view of the dataset, while late integration showed comparable performance with single-view results. Thus, we demonstrate that combining different views (types of data) helps build better models for structure name standardization to enable big data analytics in radiation oncology.
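The difference between the two integration strategies can be sketched in a few lines. This is one common reading of the terms, not the authors' implementation: intermediate integration is shown as fusing per-view representations before a single classifier, and late integration as averaging the class probabilities of separately trained single-view models. All function names and toy values are hypothetical.

```python
def intermediate_integration(text_repr, geom_repr):
    """Fuse the two views by concatenating each structure's representations.

    The fused rows would then be fed to one downstream classifier.
    """
    return [t + g for t, g in zip(text_repr, geom_repr)]


def late_integration(text_probs, geom_probs):
    """Average per-view class probabilities and pick the most likely class."""
    labels = []
    for tp, gp in zip(text_probs, geom_probs):
        fused = [(a + b) / 2.0 for a, b in zip(tp, gp)]
        labels.append(max(range(len(fused)), key=fused.__getitem__))
    return labels


# Toy data for 3 structures: hypothetical 4-dim text and 2-dim geometric vectors.
text_repr = [[1.0, 0.0, 0.0, 1.0]] * 3
geom_repr = [[0.5, 0.2]] * 3
fused_features = intermediate_integration(text_repr, geom_repr)

# Hypothetical class probabilities from two single-view models (2 classes).
text_probs = [[0.9, 0.1], [0.4, 0.6], [0.5, 0.5]]
geom_probs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
late_labels = late_integration(text_probs, geom_probs)
```

In this sketch the intermediate variant lets a single model learn interactions between views, whereas the late variant only combines final decisions.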
Affiliation(s)
- Khajamoinuddin Syed (corresponding author)
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- William C. Sleeman
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
  - Department of Radiation Oncology, Virginia Commonwealth University, Richmond, VA 23298, USA
- Michael Hagan
  - Department of Radiation Oncology, Virginia Commonwealth University, Richmond, VA 23298, USA
  - National Radiation Oncology Program, Department of Veteran Affairs, Richmond, VA 23249, USA
- Jatinder Palta
  - Department of Radiation Oncology, Virginia Commonwealth University, Richmond, VA 23298, USA
  - National Radiation Oncology Program, Department of Veteran Affairs, Richmond, VA 23249, USA
- Rishabh Kapoor
  - Department of Radiation Oncology, Virginia Commonwealth University, Richmond, VA 23298, USA
  - National Radiation Oncology Program, Department of Veteran Affairs, Richmond, VA 23249, USA
- Preetam Ghosh
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
3
Syed K, Sleeman IV W, Ivey K, Hagan M, Palta J, Kapoor R, Ghosh P. Integrated Natural Language Processing and Machine Learning Models for Standardizing Radiotherapy Structure Names. Healthcare (Basel) 2020; 8:healthcare8020120. [PMID: 32365973 PMCID: PMC7348919 DOI: 10.3390/healthcare8020120] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 02/26/2020] [Revised: 04/18/2020] [Accepted: 04/24/2020] [Indexed: 01/16/2023]
Abstract
The lack of standardized structure names in radiotherapy (RT) data limits interoperability, data sharing, and the ability to perform big data analysis. To standardize radiotherapy structure names, we developed an integrated natural language processing (NLP) and machine learning (ML) based system that can map the physician-given structure names to American Association of Physicists in Medicine (AAPM) Task Group 263 (TG-263) standard names. The dataset consists of 794 prostate and 754 lung cancer patients across the 40 different radiation therapy centers managed by the Veterans Health Administration (VA). Additionally, data from the Radiation Oncology department at Virginia Commonwealth University (VCU) was collected to serve as a test set. Domain experts identified nine prostate and ten lung organ-at-risk (OAR) structures as anatomically significant and manually labeled them according to the TG-263 standards; the remaining structures were labeled as Non_OAR. We experimented with six different classification algorithms and three feature vector methods, and the final model was built with the fastText algorithm. Multiple validation techniques were used to assess the robustness of the proposed methodology. The macro-averaged F1 score was used as the main evaluation metric. The model achieved an F1 score of 0.97 on prostate structures and 0.99 on lung structures from the VA dataset. The model also performed well on the test (VCU) dataset, achieving an F1 score of 0.93 on prostate structures and 0.95 on lung structures. In this work, we demonstrate that NLP and ML based approaches can be used to standardize the physician-given RT structure names with high fidelity. This standardization can help with big data analytics in the radiation therapy domain using population-derived datasets, including standardization of the treatment planning process, clinical decision support systems, treatment quality improvement programs, and hypothesis-driven clinical research.
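For readers unfamiliar with the main evaluation metric, the following is a minimal pure-Python sketch of a macro-averaged F1 score: the per-class F1 is computed for each label and the results are averaged with equal weight. The function name `macro_f1` and the toy labels are illustrative, except `Non_OAR`, which is the catch-all label named in the abstract; this is not the authors' evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with hypothetical structure labels.
y_true = ["Bladder", "Bladder", "Rectum", "Non_OAR"]
y_pred = ["Bladder", "Rectum", "Rectum", "Non_OAR"]
score = macro_f1(y_true, y_pred)  # (2/3 + 2/3 + 1) / 3 = 7/9
```

Because every class contributes equally, a rare structure that is often mislabeled lowers the macro average as much as a common one would.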
Affiliation(s)
- Khajamoinuddin Syed (corresponding author)
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- William Sleeman IV
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
  - Department of Radiation Oncology, Virginia Commonwealth University, Richmond, VA 23298, USA
- Kevin Ivey
  - Department of Computer Science, University of Virginia, Charlottesville, VA 22904, USA
- Michael Hagan
  - Department of Radiation Oncology, Virginia Commonwealth University, Richmond, VA 23298, USA
  - National Radiation Oncology Program, Department of Veteran Affairs, Richmond, VA 23249, USA
- Jatinder Palta
  - Department of Radiation Oncology, Virginia Commonwealth University, Richmond, VA 23298, USA
  - National Radiation Oncology Program, Department of Veteran Affairs, Richmond, VA 23249, USA
- Rishabh Kapoor
  - Department of Radiation Oncology, Virginia Commonwealth University, Richmond, VA 23298, USA
  - National Radiation Oncology Program, Department of Veteran Affairs, Richmond, VA 23249, USA
- Preetam Ghosh
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
4
Schuler T, Kipritidis J, Eade T, Hruby G, Kneebone A, Perez M, Grimberg K, Richardson K, Evill S, Evans B, Gallego B. Big Data Readiness in Radiation Oncology: An Efficient Approach for Relabeling Radiation Therapy Structures With Their TG-263 Standard Name in Real-World Data Sets. Adv Radiat Oncol 2018; 4:191-200. [PMID: 30706028 PMCID: PMC6349627 DOI: 10.1016/j.adro.2018.09.013] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 05/17/2018] [Accepted: 09/28/2018] [Indexed: 12/17/2022]
Abstract
Purpose: To prepare for big data analyses on radiation therapy data, we developed Stature, a tool-supported approach for standardization of structure names in existing radiation therapy plans. We applied the widely endorsed nomenclature standard TG-263 as the mapping target and quantified the structure name inconsistency in 2 real-world data sets.
Methods and Materials: The clinically relevant structures in the radiation therapy plans were identified by reference to randomized controlled trials. The Stature approach was used by clinicians to identify the synonyms for each relevant structure, which was then mapped to the corresponding TG-263 name. We applied Stature to standardize the structure names for 654 patients with prostate cancer (PCa) and 224 patients with head and neck squamous cell carcinoma (HNSCC) who received curative radiation therapy at our institution between 2007 and 2017. The accuracy of the Stature process was manually validated in a random sample from each cohort. For the HNSCC cohort we measured the resource requirements for Stature, and for the PCa cohort we demonstrated its impact on an example clinical analytics scenario.
Results: All but 1 synonym group (“Hydrogel”) were mapped to the corresponding TG-263 name, resulting in a TG-263 relabel rate of 99% (8837 of 8925 structures). For the PCa cohort, Stature matched a total of 5969 structures. Of these, 5682 structures were exact matches (ie, following local naming convention), 284 were matched via a synonym, and 3 required manual matching. The original radiation therapy structure names therefore had a naming inconsistency rate of 4.81%. For the HNSCC cohort, Stature mapped a total of 2956 structures (2638 exact, 304 synonym, 14 manual; 10.76% inconsistency rate) and required 7.5 clinician hours. The clinician hours required were one-fifth of those that would be required for manual relabeling. The accuracy of Stature was 99.97% (PCa) and 99.61% (HNSCC).
Conclusions: The Stature approach was highly accurate and had significant resource efficiencies compared with manual curation.
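The reported rates can be checked from the counts in the abstract. The sketch below assumes the naming inconsistency rate is the share of structures that were not exact matches, that is (synonym + manual) / total; the abstract does not spell out the formula, but this definition reproduces both reported percentages. The function name is hypothetical.

```python
def inconsistency_rate(synonym, manual, total):
    """Percentage of structures not matched exactly (assumed definition)."""
    return 100.0 * (synonym + manual) / total


# Counts taken from the abstract.
pca_rate = inconsistency_rate(synonym=284, manual=3, total=5969)
hnscc_rate = inconsistency_rate(synonym=304, manual=14, total=2956)

# TG-263 relabel rate across both cohorts (5969 + 2956 = 8925 structures).
relabel_rate = 100.0 * 8837 / (5969 + 2956)

print(round(pca_rate, 2), round(hnscc_rate, 2), round(relabel_rate, 2))
```

The per-cohort counts are also internally consistent: 5682 + 284 + 3 = 5969 and 2638 + 304 + 14 = 2956.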
Affiliation(s)
- Thilo Schuler
  - Department of Radiation Oncology, Northern Sydney Cancer Centre, Royal North Shore Hospital, Sydney, Australia
  - Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- John Kipritidis
  - Department of Radiation Oncology, Northern Sydney Cancer Centre, Royal North Shore Hospital, Sydney, Australia
- Thomas Eade
  - Department of Radiation Oncology, Northern Sydney Cancer Centre, Royal North Shore Hospital, Sydney, Australia
  - Northern Clinical School, University of Sydney, Sydney, Australia
- George Hruby
  - Department of Radiation Oncology, Northern Sydney Cancer Centre, Royal North Shore Hospital, Sydney, Australia
  - Northern Clinical School, University of Sydney, Sydney, Australia
- Andrew Kneebone
  - Department of Radiation Oncology, Northern Sydney Cancer Centre, Royal North Shore Hospital, Sydney, Australia
  - Northern Clinical School, University of Sydney, Sydney, Australia
- Mario Perez
  - Department of Radiation Oncology, Northern Sydney Cancer Centre, Royal North Shore Hospital, Sydney, Australia
- Kylie Grimberg
  - Department of Radiation Oncology, Northern Sydney Cancer Centre, Royal North Shore Hospital, Sydney, Australia
- Kylie Richardson
  - Department of Radiation Oncology, Northern Sydney Cancer Centre, Royal North Shore Hospital, Sydney, Australia
- Sally Evill
  - Department of Radiation Oncology, Northern Sydney Cancer Centre, Royal North Shore Hospital, Sydney, Australia
- Brooke Evans
  - Department of Radiation Oncology, Northern Sydney Cancer Centre, Royal North Shore Hospital, Sydney, Australia
- Blanca Gallego
  - Australian Institute of Health Innovation, Macquarie University, Sydney, Australia