1
|
In Reply: Transradial Flow-Diverting Stent Placement Through an Arteria Lusoria: 2-Dimensional Operative Video. Oper Neurosurg (Hagerstown) 2023; 25:e118. [PMID: 37195057 DOI: 10.1227/ons.0000000000000772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 03/23/2023] [Indexed: 05/18/2023] Open
|
2
|
Transradial Flow-Diverting Stent Placement Through an Arteria Lusoria: 2-Dimensional Operative Video. Oper Neurosurg (Hagerstown) 2023; 24:e438. [PMID: 36723287 DOI: 10.1227/ons.0000000000000635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Accepted: 11/28/2022] [Indexed: 02/02/2023] Open
|
3
|
Machine Learning Models for Classifying High- and Low-Grade Gliomas: A Systematic Review and Quality of Reporting Analysis. Front Oncol 2022; 12:856231. [PMID: 35530302 PMCID: PMC9076130 DOI: 10.3389/fonc.2022.856231] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/16/2022] [Accepted: 03/25/2022] [Indexed: 12/11/2022] Open
Abstract
Objectives To systematically review, assess the reporting quality of, and discuss improvement opportunities for studies describing machine learning (ML) models for glioma grade prediction. Methods This study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Diagnostic Test Accuracy (PRISMA-DTA) statement. A systematic search was performed in September 2020, and repeated in January 2021, on four databases: Embase, Medline, CENTRAL, and Web of Science Core Collection. Publications were screened in Covidence, and reporting quality was measured against the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Statement. Descriptive statistics were calculated using GraphPad Prism 9. Results The search identified 11,727 candidate articles with 1,135 articles undergoing full text review and 85 included in analysis. 67 (79%) articles were published between 2018-2021. The mean prediction accuracy of the best performing model in each study was 0.89 ± 0.09. The most common algorithm for conventional machine learning studies was Support Vector Machine (mean accuracy: 0.90 ± 0.07) and for deep learning studies was Convolutional Neural Network (mean accuracy: 0.91 ± 0.10). Only one study used both a large training dataset (n>200) and external validation (accuracy: 0.72) for their model. The mean adherence rate to TRIPOD was 44.5% ± 11.1%, with poor reporting adherence for model performance (0%), abstracts (0%), and titles (0%). Conclusions The application of ML to glioma grade prediction has grown substantially, with ML model studies reporting high predictive accuracies but lacking essential metrics and characteristics for assessing model performance. Several domains, including generalizability and reproducibility, warrant further attention to enable translation into clinical practice. Systematic Review Registration PROSPERO, identifier CRD42020209938.
|
4
|
Machine Learning in Differentiating Gliomas from Primary CNS Lymphomas: A Systematic Review, Reporting Quality, and Risk of Bias Assessment. AJNR Am J Neuroradiol 2022; 43:526-533. [PMID: 35361577 PMCID: PMC8993193 DOI: 10.3174/ajnr.a7473] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/26/2021] [Accepted: 01/31/2022] [Indexed: 12/12/2022]
Abstract
BACKGROUND Differentiating gliomas and primary CNS lymphoma represents a diagnostic challenge with important therapeutic ramifications. Biopsy is the preferred method of diagnosis, while MR imaging in conjunction with machine learning has shown promising results in differentiating these tumors. PURPOSE Our aim was to evaluate the quality of reporting and risk of bias, assess data bases with which the machine learning classification algorithms were developed, the algorithms themselves, and their performance. DATA SOURCES Ovid EMBASE, Ovid MEDLINE, Cochrane Central Register of Controlled Trials, and the Web of Science Core Collection were searched according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. STUDY SELECTION From 11,727 studies, 23 peer-reviewed studies used machine learning to differentiate primary CNS lymphoma from gliomas in 2276 patients. DATA ANALYSIS Characteristics of data sets and machine learning algorithms were extracted. A meta-analysis on a subset of studies was performed. Reporting quality and risk of bias were assessed using the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) and Prediction Model Study Risk Of Bias Assessment Tool. DATA SYNTHESIS The highest area under the receiver operating characteristic curve (0.961) and accuracy (91.2%) in external validation were achieved by logistic regression and support vector machines models using conventional radiomic features. Meta-analysis of machine learning classifiers using these features yielded a mean area under the receiver operating characteristic curve of 0.944 (95% CI, 0.898-0.99). The median TRIPOD score was 51.7%. The risk of bias was high for 16 studies. LIMITATIONS Exclusion of abstracts decreased the sensitivity in evaluating all published studies. Meta-analysis had high heterogeneity. 
CONCLUSIONS Machine learning-based methods of differentiating primary CNS lymphoma from gliomas have shown great potential, but most studies lack large, balanced data sets and external validation. Assessment of the studies identified multiple deficiencies in reporting quality and risk of bias. These factors reduce the generalizability and reproducibility of the findings.
|
5
|
Trends in Development of Novel Machine Learning Methods for the Identification of Gliomas in Datasets That Include Non-Glioma Images: A Systematic Review. Front Oncol 2021; 11:788819. [PMID: 35004312 PMCID: PMC8733688 DOI: 10.3389/fonc.2021.788819] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/03/2021] [Accepted: 12/07/2021] [Indexed: 12/12/2022] Open
Abstract
Purpose Machine learning has been applied to the diagnostic imaging of gliomas to augment classification, prognostication, segmentation, and treatment planning. A systematic literature review was performed to identify how machine learning has been applied to identify gliomas in datasets which include non-glioma images thereby simulating normal clinical practice. Materials and Methods Four databases were searched by a medical librarian and confirmed by a second librarian for all articles published prior to February 1, 2021: Ovid Embase, Ovid MEDLINE, Cochrane trials (CENTRAL), and Web of Science-Core Collection. The search strategy included both keywords and controlled vocabulary combining the terms for: artificial intelligence, machine learning, deep learning, radiomics, magnetic resonance imaging, glioma, as well as related terms. The review was conducted in stepwise fashion with abstract screening, full text screening, and data extraction. Quality of reporting was assessed using TRIPOD criteria. Results A total of 11,727 candidate articles were identified, of which 12 articles were included in the final analysis. Studies investigated the differentiation of normal from abnormal images in datasets which include gliomas (7 articles) and the differentiation of glioma images from non-glioma or normal images (5 articles). Single institution datasets were most common (5 articles) followed by BRATS (3 articles). The median sample size was 280 patients. Algorithm testing strategies consisted of five-fold cross validation (5 articles), and the use of exclusive sets of images within the same dataset for training and for testing (7 articles). Neural networks were the most common type of algorithm (10 articles). The accuracy of algorithms ranged from 0.75 to 1.00 (median 0.96, 10 articles). Quality of reporting assessment utilizing TRIPOD criteria yielded a mean individual TRIPOD ratio of 0.50 (standard deviation 0.14, range 0.37 to 0.85). 
Conclusion Systematic review investigating the identification of gliomas in datasets which include non-glioma images demonstrated multiple limitations hindering the application of these algorithms to clinical practice. These included limited datasets, a lack of generalizable algorithm training and testing strategies, and poor quality of reporting. The development of more robust and heterogeneous datasets is needed for algorithm development. Future studies would benefit from using external datasets for algorithm testing as well as placing increased attention on quality of reporting standards. Systematic Review Registration www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42020209938, International Prospective Register of Systematic Reviews (PROSPERO 2020 CRD42020209938).
|
6
|
NIMG-67. A SYSTEMATIC REVIEW ON THE DEVELOPMENT OF MACHINE LEARNING MODELS FOR DIFFERENTIATING PCNSL FROM GLIOMAS. Neuro Oncol 2021. [DOI: 10.1093/neuonc/noab196.565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
PURPOSE
Differentiating gliomas and Primary CNS Lymphomas (PCNSL) represents a diagnostic challenge with important therapeutic ramifications. MR imaging combined with Machine Learning (ML) has shown promising results in differentiating tumors non-invasively. The purpose of this systematic review is to evaluate and synthesize the findings on the application of ML in differentiating PCNSL and gliomas.
MATERIALS AND METHODS
A systematic search of literature was performed in October 2020 and February 2021 on Ovid Embase, Ovid MEDLINE, Cochrane trials, and Web of Science – Core Collection. The search strategy included keywords and controlled vocabulary including the terms: gliomas, artificial intelligence, machine learning, and related terms. Publications were reviewed and screened by four different reviewers in accordance with TRIPOD.
RESULTS
The literature search yielded 11,727 studies and 1,135 underwent full-text review. Data was extracted from 16 publications showing that 10 ML and 3 deep learning (DL) algorithms were tested. The analyzed databases had an average size of 118 patients per study. 50% of the publications validated the algorithm in an independent test cohort. The most commonly tested ML and DL algorithms were support vector machines and Convolutional Neural Networks, respectively. In internal (external) datasets, ML algorithms reached an average AUC of 89% (83%); and DL 74% (77%). Preliminary TRIPOD bias analysis yielded an average score of 0.5 (range 0.31-0.62), with most papers showing deficiencies in reporting model specifications, and funding details among other items.
CONCLUSIONS
AI-based methods for differentiating gliomas and PCNSL have been reported and show that ML methods achieve accuracy ≥ 85%. With few studies using DL algorithms, further research into novel DL-based approaches is recommended. Additionally, most studies lack large datasets and external validation, thus increasing the risk of overfitting. Bias analysis of the published studies using TRIPOD identified reporting deficiencies, and close adherence to reporting criteria is recommended.
|
7
|
NIMG-46. SYSTEMATIC LITERATURE REVIEW OF ARTIFICIAL INTELLIGENCE ALGORITHMS USING PRE-THERAPY MR IMAGING FOR GLIOMA MOLECULAR SUBTYPE CLASSIFICATION. Neuro Oncol 2021. [DOI: 10.1093/neuonc/noab196.545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
PURPOSE
Identifying molecular subtypes in gliomas has prognostic and therapeutic value, but has traditionally required invasive neurosurgical tumor resection or biopsy. Recent advances using artificial intelligence (AI) show promise in using pre-therapy imaging for predicting molecular subtype. We performed a systematic review of recent literature on AI methods used to predict molecular subtypes of gliomas.
METHODS
A literature review conforming to PRISMA guidelines was performed for publications prior to February 2021 using 4 databases: Ovid Embase, Ovid MEDLINE, Cochrane trials (CENTRAL), and Web of Science core-collection. Keywords included: artificial intelligence, machine learning, deep learning, radiomics, magnetic resonance imaging, glioma, and glioblastoma. Non-machine learning and non-human studies were excluded. Screening was performed using Covidence software. Bias analysis was done using TRIPOD guidelines.
RESULTS
11,727 abstracts were retrieved. After applying initial screening exclusion criteria, 1,135 full text reviews were performed, with 82 papers remaining for data extraction. 57% used retrospective single center hospital data, 31.6% used TCIA and BRATS, and 11.4% analyzed multicenter hospital data. An average of 146 patients (range 34-462 patients) were included. Algorithms predicting IDH status comprised 51.8% of studies, MGMT 18.1%, and 1p19q 6.0%. Machine learning methods were used in 71.4%, deep learning in 27.4%, and 1.2% directly compared both methods. The most common algorithm for machine learning was support vector machine (43.3%), and for deep learning convolutional neural network (68.4%). Mean prediction accuracy was 76.6%.
CONCLUSION
Machine learning is the predominant method for image-based prediction of glioma molecular subtypes. Major limitations include limited datasets (60.2% with under 150 patients) and thus limited generalizability of findings. We recommend using larger annotated datasets for AI network training and testing in order to create more robust AI algorithms, which will provide better prediction accuracy on real-world clinical datasets and provide tools that can be translated to clinical practice.
|
8
|
NIMG-17. SYSTEMATIC REVIEW OF LITERATURE EVALUATING MACHINE LEARNING ALGORITHMS TO DEVELOP OUTCOME PREDICTION MODELS IN GLIOMA USING MOLECULAR IMAGING WITH AMINO ACID PET. Neuro Oncol 2021. [DOI: 10.1093/neuonc/noab196.517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
PURPOSE
Machine learning (ML) algorithms demonstrate accurate prediction of tumor segmentation, molecular pathology, and outcomes in gliomas using MRI, and the application of ML tools has recently expanded into molecular imaging with PET. We performed a systematic review to evaluate the role and applications of ML in characterization of gliomas with PET.
METHODS
Four databases were searched by a medical school librarian and confirmed by an independent librarian: Ovid Embase, Ovid MEDLINE, Cochrane trials (CENTRAL), and Web of Science-Core Collection. The search strategy used keywords and controlled vocabulary combining the terms for: artificial intelligence, machine learning, deep learning, radiomics, magnetic resonance imaging, glioma, and related terms. All articles were reviewed by at least 2 independent reviewers at abstract screening, full text review, data extraction, and bias analysis using TRIPOD.
RESULTS
An initial 11,727 publications were imported to Covidence for screening. After review, 1135 studies moved to full-text review and 715 articles were included. Twelve publications included PET imaging of gliomas. All publications used single-center databases (3-73 patients) with distribution of tracers being [18F]-FDG (1), [18F]-FET (6), [11C]-MET (3), [18F]-FDOPA (1), and [18F]-AMP (1). All but 2 papers used supervised machine learning algorithms. Number of features ranged from 4-19,284. Nine papers manually extracted semiquantitative features TBRmax, TBRmean, SUV, TTP, in addition to demographics. Study outcomes included prediction of treatment response, survival, molecular subtypes, tumor grade, segmentation, and accuracy of image fusion. Accuracy ranged from 0.64-0.95 with AUC 0.43-0.9.
CONCLUSION
ML can be used on small datasets of PET imaging of brain tumors. While the majority of clinical scans are performed with FDG-PET, machine learning approaches are being applied mostly to amino acid tracers. Extending ML approaches to FDG-PET, which is more common in clinical practice, is recommended. Overall, ML has potential as a useful tool for predicting patient outcomes and improving image postprocessing.
|
9
|
Abstract
PURPOSE
Machine learning (ML) technologies have demonstrated highly accurate prediction of glioma grade, though it is unclear which methods and algorithms are superior. We have conducted a systematic review of the literature in order to identify the ML applications most promising for future research and clinical implementation.
MATERIALS AND METHODS
A literature review, in agreement with PRISMA, was conducted by a university librarian in October 2020 and verified by a second librarian in February 2021 using four databases: Cochrane trials (CENTRAL), Ovid Embase, Ovid MEDLINE, and Web of Science core-collection. Keywords and controlled vocabulary included artificial intelligence, machine learning, deep learning, radiomics, magnetic resonance imaging, glioma, and glioblastoma. Screening of publications was done in Covidence, and TRIPOD was used for bias assessment.
RESULTS
The search identified 11,727 candidate articles with 1,135 articles undergoing full text review. 86 articles published since 1995 met the criteria for our study. 79% of the articles were published between 2018 and 2020. The average glioma prediction accuracy of the highest performing model in each study was 90% (range: 53% to 100%). The most common algorithm used for classical machine learning (cML) studies was Support Vector Machine (SVM) and for deep learning (DL) studies was Convolutional Neural Network (CNN). BRATS and TCIA datasets were used in 47% of the studies, with the average patient number of study datasets being 186 (range: 23 to 662). The average number of features used in machine learning prediction was 55 (range: 2 to 580). cML was the primary machine learning model in 68% of studies, with DL used in 32%.
CONCLUSIONS
Using multimodal sequences in ML methods delivers significantly higher grading accuracies than single sequences. Potential areas of improvement for ML glioma grade prediction studies include increasing sample size, incorporating molecular subtypes, and validating on external datasets.
|
10
|
NIMG-71. IDENTIFYING CLINICALLY APPLICABLE MACHINE LEARNING ALGORITHMS FOR GLIOMA SEGMENTATION USING A SYSTEMATIC LITERATURE REVIEW. Neuro Oncol 2021. [DOI: 10.1093/neuonc/noab196.568] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022] Open
Abstract
PURPOSE
Machine learning (ML) algorithms are now often used for segmentation of gliomas, but which algorithms provide the most accurate method for implementation into clinical practice has not been fully identified. We performed a systematic review of the literature to characterize the methods used for glioma segmentation and their accuracy.
METHODS
In accordance with PRISMA, a literature review was performed on four databases, Ovid Embase, Ovid MEDLINE, Cochrane trials (CENTRAL) and Web of Science core-collection, first in October 2020 and again in February 2021. Keywords and controlled vocabulary included artificial intelligence, machine learning, deep learning, radiomics, magnetic resonance imaging, glioma, and glioblastoma. Publications were screened in Covidence and the bias analysis was done in agreement with TRIPOD.
RESULTS
Sixty-six articles were used for data extraction. BRATS and TCIA datasets were used in 36.6% of all studies, with average number of patients being 141 (range: 1 to 622). ML methods represented 45.3% of studies, with deep learning used in 54.7%; Dice score for the tumor core ranged from 0.72 to 0.95. The most common algorithm used in the machine learning papers was support vector machines (SVM) and for deep learning papers, it was Convolutional Neural Networks (CNN). Preliminary TRIPOD analysis yielded an average score of 12 (range: 7-16) with the majority of papers demonstrating deficiencies in description of the ML algorithm, funding role, data acquisition and measures of model performance.
CONCLUSION
In recent years, many articles have been published on segmentation of gliomas using machine learning, thus establishing this method for tumor segmentation with high accuracy. However, major limitations to the clinical application of ML in glioma segmentation remain: more than one-third of publications use the same datasets, which limits generalizability and increases the likelihood of overfitting, and many publications lack a description of the ML network and standardization in accuracy reporting.
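For readers unfamiliar with the Dice score cited in the results above, the following is a minimal sketch of how the overlap measure is commonly computed for two binary segmentation masks. It is an illustration only and is not taken from any of the reviewed studies; the function name `dice_score`, the flat-list mask representation, and the example masks are assumptions.

```python
def dice_score(pred: list[int], truth: list[int]) -> float:
    """Dice = 2 * |P intersect T| / (|P| + |T|) for two binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    # Count voxels labelled as tumor (1) in both masks.
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical flattened masks: 1 = tumor core voxel, 0 = background.
predicted = [1, 1, 1, 0, 0, 0]
reference = [0, 1, 1, 1, 0, 0]
print(dice_score(predicted, reference))
```

A score of 1 means the predicted and reference masks coincide exactly, and 0 means they do not overlap at all.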
|
11
|
NIMG-38. MEASURING ADHERENCE TO TRIPOD OF ARTIFICIAL INTELLIGENCE PAPERS IN THE GLIOMA SEGMENTATION. Neuro Oncol 2021. [DOI: 10.1093/neuonc/noab196.537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022] Open
Abstract
PURPOSE
Generalizability, reproducibility and objectivity are critical elements that need to be considered when translating machine learning models into clinical practice. While a large body of literature has been published on machine learning methods for segmentation of brain tumors, a systematic evaluation of paper quality and reproducibility has not been done. We investigated the use of “Transparent Reporting of studies on prediction models for Individual Prognosis Or Diagnosis” (TRIPOD) items, among papers published in this relatively new and growing field.
METHODS
In accordance with PRISMA, a literature review was performed on four databases, Ovid Embase, Ovid MEDLINE, Cochrane trials (CENTRAL) and Web of Science core-collection, first in October 2020 and a second time in February 2021. Keywords and controlled vocabulary included artificial intelligence, machine learning, deep learning, radiomics, magnetic resonance imaging, glioma, and glioblastoma. The publications were assessed against the TRIPOD items.
RESULTS
37 publications from our database search were assessed against TRIPOD and yielded an average score of 12.08 with the maximum score being 16 and the minimum score 7. The best scoring item was interpretation (item 19) where all papers scored a point. The lowest scoring items were the title, the abstract, risk groups and the model performance (items number 1, 2, 11 and 16), where no paper scored a point. Less than 1% of the papers discussed the problem of missing data (item 9) and the funding of research (item 22).
CONCLUSION
TRIPOD analysis showed that a majority of the papers do not score high on critical elements that allow reproducibility, translation, and objectivity of research. An average score of 12.08 (40%) indicates that the publications usually achieve a relatively low score. The categories that were consistently poorly described include the ML network description, measuring model performance, title details and inclusion of information into the abstract.
|
12
|
OTHR-12. The development of machine learning algorithms for the differentiation of glioma and brain metastases – a systematic review. Neurooncol Adv 2021. [PMCID: PMC8351249 DOI: 10.1093/noajnl/vdab071.067] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022] Open
Abstract
Purpose Medical staging, surgical planning, and therapeutic decisions are significantly different for brain metastases versus gliomas. Machine learning (ML) algorithms have been developed to differentiate these pathologies. We performed a systematic review to characterize ML methods and to evaluate their accuracy. Methods Studies on the application of machine learning in neuro-oncology were searched in Ovid Embase, Ovid MEDLINE, Cochrane trials (CENTRAL) and Web of Science core-collection. A search strategy was designed in collaboration with a clinical librarian and confirmed by a second librarian. The search strategy comprised controlled vocabulary including artificial intelligence, machine learning, deep learning, magnetic resonance imaging, and glioma. The initial search was performed in October 2020 and then updated in February 2021. Candidate articles were screened in Covidence by at least two reviewers each. A bias analysis was conducted in agreement with TRIPOD, a bias assessment tool similar to CLAIM. Results Twenty-nine articles were used for data extraction. Four articles specified model development for solitary brain metastases. Classical ML (cML) algorithms represented 85% of models used, while deep learning (DL) accounted for 15%. cML algorithms performed with an average accuracy, sensitivity, and specificity of 82%, 78%, and 88%, respectively; DL algorithms achieved 84%, 79%, and 81%. The support vector machine (SVM) algorithm was the most commonly used cML model in the literature and convolutional neural networks (CNN) were standard for DL models. We also found T1, T1 post-gadolinium and T2 sequences were most commonly used for feature extraction. Preliminary TRIPOD analysis yielded an average score of 14.25 (range 8–18). Conclusion ML algorithms that can accurately classify glioma from brain metastases have been developed. SVM and CNN are leading approaches with high accuracy. The lack of standardized algorithm performance reporting is a clear limitation to be addressed in future studies.
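As a reminder of how the accuracy, sensitivity, and specificity figures quoted in this abstract are conventionally defined, here is a minimal sketch based on a binary confusion matrix. The function name `classification_metrics` and the patient counts are hypothetical and chosen only for illustration; they are not data from the review.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        # Share of all cases that were classified correctly.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        # Share of true positives (e.g. gliomas) that were detected.
        "sensitivity": tp / (tp + fn),
        # Share of true negatives (e.g. metastases) that were correctly ruled out.
        "specificity": tn / (tn + fp),
    }


# Hypothetical cohort: 50 gliomas (positive class) and 50 brain metastases.
metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Reporting all three values together, as the abstract does, matters because accuracy alone can look high on an imbalanced dataset even when one class is poorly recognized.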
|
13
|
OTHR-15. Assessment of TRIPOD adherence in articles developing machine learning models for differentiation of glioma from brain metastasis. Neurooncol Adv 2021. [PMCID: PMC8351195 DOI: 10.1093/noajnl/vdab071.070] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022] Open
Abstract
Purpose Machine learning (ML) applications in predictive models in neuro-oncology have become an increasingly investigated subject of research. For their incorporation into clinical practice, rigorous assessment is needed to reduce bias. Several reports have indicated utility of ML applications in differentiation of glioma from brain metastasis. However, a systematic assessment of quality of methodology and reporting in these studies has not been done yet. We examined the adherence of 29 published reports in this field to the TRIPOD statement, which is similar to the CLAIM checklist. Materials and Methods Our systematic review was conducted in accordance with PRISMA guidelines. Ovid Embase, Ovid MEDLINE, Cochrane trials (CENTRAL) and Web of Science core-collection were searched. Keywords included artificial intelligence, machine learning, deep learning, radiomics, magnetic resonance imaging, glioma, and glioblastoma. Assessment of TRIPOD adherence in 29 eligible studies was performed. Individual item performance was assessed by adherence index (ADI), the ratio of mean achieved score to maximum score per TRIPOD item. Results In a preliminary analysis of 8 studies, the average TRIPOD adherence score was 0.48 (14.25/30 items fulfilled) with individual scores ranging from 0.27 (8/30) to 0.60 (18/30). Best overall item performance, with an ADI of 1, was seen in item 3 (Background/Objectives), 16 (Model performance) and 19 (Interpretation). Poorest performance was detected in item 1 (Title) and 2 (Abstract), followed by item 9 (Missing Data) with ADI of 0, 0 and 0.13, respectively. Conclusion Preliminary results underline the lack of reproducibility in ML studies on distinction between glioma and brain metastasis. An average TRIPOD adherence score of 0.48 indicates insufficient quality of reporting and outlines the need for increased utilization of quality scoring systems in study documentation. Systematic evaluation of quality score adherence will allow us to identify common flaws in this field for enabling translation of models into clinical workflow.
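The adherence arithmetic described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `adherence_score` and `adherence_index` are hypothetical, each TRIPOD item is assumed to be scored 0 or 1, and the per-study item scores in the last example are invented to show one way an ADI of about 0.13 could arise across 8 studies.

```python
def adherence_score(items_fulfilled: float, max_items: int = 30) -> float:
    """Fraction of TRIPOD items fulfilled (per study, or a mean across studies)."""
    return items_fulfilled / max_items


def adherence_index(item_scores: list[float], max_score: float = 1.0) -> float:
    """ADI for one TRIPOD item: mean achieved score divided by the maximum score."""
    return (sum(item_scores) / len(item_scores)) / max_score


# Figures reported in the abstract: 14.25, 8, and 18 of 30 items fulfilled.
print(adherence_score(14.25))  # mean across the 8 studies
print(adherence_score(8))      # lowest-scoring study
print(adherence_score(18))     # highest-scoring study

# Hypothetical item-level scores: only one of eight studies addresses the item.
print(adherence_index([1, 0, 0, 0, 0, 0, 0, 0]))
```

The printed values are unrounded; rounded to two decimals they correspond to the 0.48, 0.27, and 0.60 quoted in the results.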
|