1
|
Adeoye J, Hui L, Su YX. Data-centric artificial intelligence in oncology: a systematic review assessing data quality in machine learning models for head and neck cancer. Journal of Big Data 2023; 10:28. [DOI: 10.1186/s40537-023-00703-w] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/21/2022] [Accepted: 02/23/2023] [Indexed: 01/03/2025]
Abstract
Machine learning models have been increasingly considered to model head and neck cancer outcomes for improved screening, diagnosis, treatment, and prognostication of the disease. As the concept of data-centric artificial intelligence is still incipient in healthcare systems, little is known about the data quality of the models proposed for clinical utility. This is important as it supports the generalizability of the models and data standardization. Therefore, this study overviews the quality of structured and unstructured data used for machine learning model construction in head and neck cancer. Relevant studies reporting on the use of machine learning models based on structured and unstructured custom datasets between January 2016 and June 2022 were sourced from the PubMed, EMBASE, Scopus, and Web of Science electronic databases. The Prediction model Risk of Bias Assessment Tool (PROBAST) was used to assess the quality of individual studies before comprehensive data quality parameters were assessed according to the type of dataset used for model construction. A total of 159 studies were included in the review; 106 utilized structured datasets while 53 utilized unstructured datasets. Data quality assessments were deliberately performed for 14.2% of structured datasets and 11.3% of unstructured datasets before model construction. Class imbalance and data fairness were the most common limitations in data quality for both types of datasets, while outlier detection and lack of representative outcome classes were common in structured and unstructured datasets respectively. Furthermore, this review found that class imbalance reduced the discriminatory performance of models based on structured datasets, while higher image resolution and good class overlap resulted in better model performance using unstructured datasets during internal validation. Overall, data quality was infrequently assessed before the construction of ML models in head and neck cancer, irrespective of the use of structured or unstructured datasets. To improve model generalizability, the assessments discussed in this study should be introduced during model construction to achieve data-centric intelligent systems for head and neck cancer management.
|
2
|
Adeoye J, Zheng LW, Thomson P, Choi SW, Su YX. Explainable ensemble learning model improves identification of candidates for oral cancer screening. Oral Oncol 2023; 136:106278. [PMID: 36525782 DOI: 10.1016/j.oraloncology.2022.106278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Revised: 11/26/2022] [Accepted: 12/06/2022] [Indexed: 12/15/2022]
Abstract
OBJECTIVES Artificial intelligence could enhance the use of disparate risk factors (crude method) for better stratification of patients to be screened for oral cancer. This study aims to construct a meta-classifier that considers diverse risk factors to identify patients at risk of oral cancer and other suspicious oral diseases for targeted screening. MATERIALS AND METHODS A retrospective dataset from a community oral cancer screening program was used to construct and train the novel voting meta-classifier. Comprehensive risk factor information from this dataset was used as input features for eleven supervised learning algorithms, which served as base learners and provided predicted probabilities that were weighted and aggregated by the meta-classifier. The training dataset was augmented using SMOTE-ENN. Additionally, Shapley additive explanations (SHAP) values were generated to implement the explainability of the model and display the important risk factors. RESULTS Our meta-classifier had an internal validation recall, specificity, and AUROC of 0.83, 0.86, and 0.85 for identifying the risk of oral cancer and 0.92, 0.60, and 0.76 for identifying suspicious oral mucosal disease respectively. Upon external validation, the meta-classifier had a significantly higher AUROC than the crude/current method used for identifying the risk of oral cancer (0.78 vs 0.46; p = 0.001). Also, the meta-classifier had better recall than the crude method for predicting the risk of suspicious oral mucosal diseases (0.78 vs 0.47). CONCLUSION Overall, these findings show that our approach optimizes the use of risk factors in identifying patients for oral screening, which suggests potential clinical application.
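As an illustration only, and not the authors' implementation, the weighted aggregation of base-learner probabilities described in this abstract can be sketched in Python. The function name `weighted_soft_vote`, the weights, and the example probabilities are hypothetical; the abstract does not report the weights or the exact aggregation rule, so a normalized weighted average is assumed here.

```python
from typing import List, Sequence


def weighted_soft_vote(
    probabilities: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Combine base-learner probabilities into one score per patient.

    probabilities[i][j] is base learner i's predicted probability of the
    positive class for patient j. weights[i] is the (assumed) weight given
    to base learner i. The result is a normalized weighted average.
    """
    if len(probabilities) != len(weights):
        raise ValueError("one weight is required per base learner")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_patients = len(probabilities[0])
    if any(len(p) != n_patients for p in probabilities):
        raise ValueError("all learners must score the same patients")
    return [
        sum(w * p[j] for w, p in zip(weights, probabilities)) / total
        for j in range(n_patients)
    ]


# Hypothetical example: two base learners, two patients.
scores = weighted_soft_vote([[0.2, 0.8], [0.6, 0.4]], weights=[1.0, 3.0])
flagged = [s >= 0.5 for s in scores]  # 0.5 is an assumed decision threshold
```

A patient would then be referred for screening when the aggregated score crosses a chosen threshold; the 0.5 cut-off above is a placeholder, not a value from the study.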
Affiliation(s)
- John Adeoye
- Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong, China
| | - Li-Wu Zheng
- Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong, China
| | - Peter Thomson
- College of Medicine and Dentistry, James Cook University, Cairns, Queensland, Australia
| | - Siu-Wai Choi
- Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong, China
| | - Yu-Xiong Su
- Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong, China.
| |
|
3
|
Adeoye J, Hui L, Koohi-Moghadam M, Tan JY, Choi SW, Thomson P. Comparison of time-to-event machine learning models in predicting oral cavity cancer prognosis. Int J Med Inform 2021; 157:104635. [PMID: 34800847 DOI: 10.1016/j.ijmedinf.2021.104635] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 03/29/2021] [Revised: 10/28/2021] [Accepted: 10/29/2021] [Indexed: 12/13/2022]
Abstract
BACKGROUND Applying machine learning to predicting oral cavity cancer prognosis is important in selecting candidates for aggressive treatment following diagnosis. However, models proposed so far have only considered cancer survival as discrete rather than dynamic outcomes. OBJECTIVES To compare the model performance of different machine learning-based algorithms that incorporate time-to-event data. These algorithms included DeepSurv, DeepHit, neural net-extended time-dependent Cox model (Cox-Time), and random survival forest (RSF). MATERIALS AND METHODS A retrospective cohort of 313 oral cavity cancer patients was obtained from electronic health records. Models were trained on patient data following preprocessing. Predictors were based on demographic, clinicopathologic, and treatment information of the cases. Outcomes were disease-specific and overall survival. Multivariable analyses were conducted to select significant prognostic features associated with tumor prognosis. Two models were generated per algorithm, based on all prognostic features and on the significant prognostic features following statistical analysis. Concordance index (c-index) and integrated Brier scores were used as performance evaluators, and model stability was assessed using intraclass correlation coefficients (ICC) calculated from these measures obtained from the cross-validation folds. RESULTS While all models were satisfactory, better discriminatory performance and calibration were observed for disease-specific than overall survival (mean c-index: 0.85 vs 0.74; mean integrated Brier score: 0.12 vs 0.17). DeepSurv performed best in terms of discrimination for both outcomes (c-indices: 0.76-0.89) while RSF produced better calibrated survival estimates (integrated Brier score: 0.06-0.09). Model stability of the algorithms varied with the outcomes, as Cox-Time had the best intraclass correlation coefficient (mean ICC: 1.00) for disease-specific survival while DeepSurv was most stable for overall survival prediction (mean ICC: 0.99). CONCLUSIONS Machine learning algorithms based on time-to-event outcomes are successful in predicting oral cavity cancer prognosis, with DeepSurv and RSF producing the best discriminative performance and calibration respectively.
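For readers unfamiliar with the concordance index used as the discrimination measure in this abstract, a minimal Python sketch of Harrell's c-index follows. It is an illustration under simplifying assumptions (tied survival times are not treated as comparable, and the data are invented), not the evaluation code used in the study; the function name `concordance_index` is hypothetical.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risks: Sequence[float],
) -> float:
    """Harrell's c-index for right-censored survival data.

    times[i]  : follow-up time of patient i
    events[i] : True if the event was observed, False if censored
    risks[i]  : predicted risk score (higher means earlier expected event)

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. The pair is concordant
    when the model gives patient i the higher risk; tied risks count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Invented example: three patients, the last one censored at time 6.
c = concordance_index(
    times=[2.0, 4.0, 6.0],
    events=[True, True, False],
    risks=[0.9, 0.5, 0.1],
)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk, which is the scale on which the c-indices reported above should be read.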
Affiliation(s)
- John Adeoye
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong, China; Oral Cancer Research Group, Faculty of Dentistry, University of Hong Kong, Hong Kong, China.
| | - Liuling Hui
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong, China
| | - Mohamad Koohi-Moghadam
- Applied Oral Sciences and Community Dental Care, Faculty of Dentistry, The University of Hong Kong, Hong Kong, China
| | - Jia Yan Tan
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong, China; Oral Cancer Research Group, Faculty of Dentistry, University of Hong Kong, Hong Kong, China
| | - Siu-Wai Choi
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong, China; Oral Cancer Research Group, Faculty of Dentistry, University of Hong Kong, Hong Kong, China
| | - Peter Thomson
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong, China; Oral Cancer Research Group, Faculty of Dentistry, University of Hong Kong, Hong Kong, China; College of Medicine and Dentistry, James Cook University, Queensland, Australia.
| |
|
4
|
Adeoye J, Tan JY, Ip CM, Choi SW, Thomson P. "Fact or fiction?": Oral cavity cancer in nonsmoking, nonalcohol drinking patients as a distinct entity-Scoping review. Head Neck 2021; 43:3662-3680. [PMID: 34313348 DOI: 10.1002/hed.26824] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/06/2021] [Revised: 07/11/2021] [Accepted: 07/19/2021] [Indexed: 12/20/2022]
Abstract
Oral cavity cancer is often described as a lifestyle-related malignancy due to its strong associations with habitual factors, including tobacco use, heavy alcohol consumption, and betel nut chewing. However, patients with no genetically predisposing conditions who do not indulge in these risk habits are still being encountered, albeit less commonly. The aim of this review is to summarize contemporaneous reports on these nonsmoking, nonalcohol drinking (NSND) patients. We performed database searching to identify relevant studies from January 1, 2000 to March 31, 2021. Twenty-six articles from 20 studies were included in this study. We found that these individuals were mostly females in their eighth decade with tumors involving the tongue and gingivobuccal mucosa. This review also observed that these patients were likely diagnosed with early stage tumors with overexpression of programmed death-ligand 1 (PD-L1) and increased intensity of tumor infiltrating lymphocytes. Treatment response and disease-specific prognosis were largely comparable between NSND and smoking/drinking patients.
Affiliation(s)
- John Adeoye
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
| | - Jia Yan Tan
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
| | - Cheuk Man Ip
- Department of Anesthesia, Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Siu-Wai Choi
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
| | - Peter Thomson
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
| |
|
5
|
Adeoye J, Tan JY, Choi SW, Thomson P. Prediction models applying machine learning to oral cavity cancer outcomes: A systematic review. Int J Med Inform 2021; 154:104557. [PMID: 34455119 DOI: 10.1016/j.ijmedinf.2021.104557] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 04/25/2021] [Revised: 07/26/2021] [Accepted: 07/27/2021] [Indexed: 12/17/2022]
Abstract
OBJECTIVES Machine learning platforms are now being introduced into modern oncological practice for classification and prediction of patient outcomes. To determine the current status of the application of these learning models as adjunctive decision-making tools in oral cavity cancer management, this systematic review aims to summarize the accuracy of machine learning-based models for disease outcomes. METHODS Electronic databases including PubMed, Scopus, EMBASE, Cochrane Library, LILACS, SciELO, PsycINFO, and Web of Science were searched up until December 21, 2020. Pertinent articles detailing the development and accuracy of machine learning prediction models for oral cavity cancer outcomes were selected in a two-stage process. Quality assessment was conducted using the Quality in Prognosis Studies (QUIPS) tool and results of base studies were qualitatively synthesized by all authors. Outcomes of interest were malignant transformation of precancer lesions, cervical lymph node metastasis, treatment response, and prognosis of oral cavity cancer. RESULTS Twenty-seven articles out of 950 citations identified from electronic and manual searching were included in this study. Five studies had low bias concerns on the QUIPS tool. Prediction of malignant transformation, cervical lymph node metastasis, treatment response, and prognosis were reported in three, six, eight, and eleven articles respectively. Accuracy of these learning models on the internal or external validation sets ranged from 0.85-0.97 for malignant transformation prediction, 0.78-0.91 for cervical lymph node metastasis prediction, 0.64-1.00 for treatment response prediction, and 0.71-0.99 for prognosis prediction. In general, most trained algorithms predicting these outcomes performed better than alternate methods of prediction. We also found that models including molecular markers in training data had better accuracy estimates for malignant transformation, treatment response, and prognosis prediction. CONCLUSION Machine learning algorithms have satisfactory to excellent accuracy for predicting three of the four oral cavity cancer outcomes, i.e., malignant transformation, nodal metastasis, and prognosis. However, considering the training approach of many available classifiers, these models may not be streamlined enough for clinical application currently.
Affiliation(s)
- John Adeoye
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong Special Administrative Region
| | - Jia Yan Tan
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong Special Administrative Region.
| | - Siu-Wai Choi
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong Special Administrative Region.
| | - Peter Thomson
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong Special Administrative Region
| |
|
6
|
Adeoye J, Hui L, Tan JY, Koohi-Moghadam M, Choi SW, Thomson P. Prognostic value of non-smoking, non-alcohol drinking status in oral cavity cancer. Clin Oral Investig 2021; 25:6909-6918. [PMID: 33991259 DOI: 10.1007/s00784-021-03981-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/19/2021] [Accepted: 05/10/2021] [Indexed: 11/12/2022]
Abstract
OBJECTIVES To compare the treatment response and prognosis of oral cavity cancer between non-smoking and non-alcohol-drinking (NSND) patients and smoking and alcohol-drinking (SD) patients. METHODS A total of 313 consecutively treated patients from 2000 to 2019 were included. Demographic, clinicopathologic, treatment, and prognosis information was obtained. Relapse-free survival (RFS), disease-specific survival (DSS), and overall survival (OS) were compared between NSND and SD groups using Kaplan-Meier plots, the log-rank test, and multivariate Cox regression analysis. RESULTS Sample prevalence of NSND patients was 54.6%. These patients were predominantly females in their eighth decade with lower prevalence of floor of the mouth cancers compared to SD patients (1.8% vs 14.8%). No difference in RFS and DSS between the two groups was found following multivariable analysis; however, NSND patients had better OS (HR 0.47, 95% CI 0.29-0.75; p = 0.002). Extracapsular extension was associated with significantly poorer OS, DSS, and RFS in this oral cavity cancer cohort. CONCLUSION Treatment response and disease-specific prognosis are comparable between NSND and SD patients with oral cavity cancer. However, NSND patients have better OS. CLINICAL RELEVANCE This study shows that oral cavity cancer in NSND patients is neither less nor more aggressive than in SD patients. Although better survival is expected for NSND than SD patients, this is likely due to the reduced incidence of other chronic diseases in the NSND group.
Affiliation(s)
- John Adeoye
- Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong SAR, China; Oral Cancer Research Group, Faculty of Dentistry, University of Hong Kong, Hong Kong SAR, China
| | - Liuling Hui
- Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong SAR, China
| | - Jia Yan Tan
- Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong SAR, China; Oral Cancer Research Group, Faculty of Dentistry, University of Hong Kong, Hong Kong SAR, China
| | - Mohamad Koohi-Moghadam
- Applied Oral Sciences and Community Dental Care, University of Hong Kong, Hong Kong SAR, China
| | - Siu-Wai Choi
- Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong SAR, China; Oral Cancer Research Group, Faculty of Dentistry, University of Hong Kong, Hong Kong SAR, China
| | - Peter Thomson
- Oral and Maxillofacial Surgery, Faculty of Dentistry, University of Hong Kong, Hong Kong SAR, China; Oral Cancer Research Group, Faculty of Dentistry, University of Hong Kong, Hong Kong SAR, China
| |
|
7
|
Adeoye J, Thomson P. A call for an established oral cancer classification by etiology and revision of related terminology. Oral Dis 2021; 28:840-842. [PMID: 33512047 DOI: 10.1111/odi.13784] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/26/2020] [Accepted: 01/24/2021] [Indexed: 12/28/2022]
Affiliation(s)
- John Adeoye
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong, Hong Kong SAR
| | - Peter Thomson
- Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong, Hong Kong SAR
| |
|
8
|
Wang W, Adeoye J, Thomson P, Choi S. Statistical profiling of oral cancer and the prediction of outcome. J Oral Pathol Med 2020; 50:39-46. [DOI: 10.1111/jop.13110] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/18/2020] [Accepted: 08/20/2020] [Indexed: 12/24/2022]
Affiliation(s)
- Weilan Wang
- Oral & Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Pokfulam, Hong Kong
| | - John Adeoye
- Oral & Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Pokfulam, Hong Kong
| | - Peter Thomson
- Oral & Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Pokfulam, Hong Kong
| | - Siu‐Wai Choi
- Oral & Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Pokfulam, Hong Kong
| |
|