1
Kim SE, Nam JW, Kim JI, Kim JK, Ro DH. Enhanced deep learning model enables accurate alignment measurement across diverse institutional imaging protocols. Knee Surg Relat Res 2024; 36:4. PMID: 38217058; PMCID: PMC10785531; DOI: 10.1186/s43019-023-00209-y.
Abstract
BACKGROUND Achieving consistent accuracy in radiographic measurements across different equipment and protocols is challenging. This study evaluates an advanced deep learning (DL) model, building upon a precursor, for its proficiency in generating uniform and precise alignment measurements in full-leg radiographs irrespective of institutional imaging differences. METHODS The enhanced DL model was trained on over 10,000 radiographs. Utilizing a segmented approach, it separately identified and evaluated regions of interest (ROIs) for the hip, knee, and ankle, subsequently integrating these regions. For external validation, 300 datasets from three distinct institutes with varied imaging protocols and equipment were employed. The study measured seven radiologic parameters: hip-knee-ankle angle, lateral distal femoral angle, medial proximal tibial angle, joint line convergence angle, weight-bearing line ratio, joint line obliquity angle, and lateral distal tibial angle. Measurements by the model were compared with an orthopedic specialist's evaluations using inter-observer and intra-observer intraclass correlation coefficients (ICCs). Additionally, the absolute error percentage in alignment measurements was assessed, and the processing duration for radiograph evaluation was recorded. RESULTS The DL model exhibited excellent performance, achieving an inter-observer ICC between 0.936 and 0.997, on par with an orthopedic specialist, and an intra-observer ICC of 1.000. The model's consistency was robust across different institutional imaging protocols. Its accuracy was particularly notable in measuring the hip-knee-ankle angle, with no instances of absolute error exceeding 1.5 degrees. The enhanced model significantly improved processing speed, reducing the time by 30-fold from an initial 10-11 s to 300 ms. 
CONCLUSIONS The enhanced DL model demonstrated its ability for accurate, rapid alignment measurements in full-leg radiographs, regardless of protocol variations, signifying its potential for broad clinical and research applicability.
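The hip-knee-ankle (HKA) angle reported above is, once the three joint centers have been localized, plain coordinate geometry. The sketch below illustrates the measurement itself, not the authors' model; the landmark coordinates are hypothetical.

```python
import math

def hka_angle(hip, knee, ankle):
    """Angle (degrees) between the femoral mechanical axis (hip -> knee)
    and the tibial mechanical axis (knee -> ankle); 180 degrees is neutral
    alignment. Points are (x, y) image coordinates, e.g. landmark output."""
    fx, fy = hip[0] - knee[0], hip[1] - knee[1]      # femoral axis vector
    tx, ty = ankle[0] - knee[0], ankle[1] - knee[1]  # tibial axis vector
    dot = fx * tx + fy * ty
    norm = math.hypot(fx, fy) * math.hypot(tx, ty)
    return math.degrees(math.acos(dot / norm))

# Collinear landmarks give perfectly neutral alignment
print(round(hka_angle((0, 0), (0, 400), (0, 800)), 1))  # → 180.0
```

In a varus or valgus limb the knee landmark drifts off the hip-ankle line and the angle deviates from 180 degrees; the 1.5-degree absolute-error bound in the abstract concerns errors in exactly this quantity.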
Affiliation(s)
- Sung Eun Kim
- Department of Orthopaedic Surgery, Seoul National University College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul, 110-744, Republic of Korea
- Department of Orthopaedic Surgery, Seoul National University Hospital, Seoul, South Korea
- Joong Il Kim
- Department of Orthopaedic Surgery, Kangnam Sacred Heart Hospital, Hallym University College of Medicine, Seoul, South Korea
- Jong-Keun Kim
- Department of Orthopaedic Surgery, Heung-K Hospital, Gyeonggi-do, South Korea
- Du Hyun Ro
- Department of Orthopaedic Surgery, Seoul National University College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul, 110-744, Republic of Korea.
- Department of Orthopaedic Surgery, Seoul National University Hospital, Seoul, South Korea.
- CONNECTEVE Co., Ltd, Seoul, South Korea.
2
Kelly BS, Judge C, Bollard SM, Clifford SM, Healy GM, Aziz A, Mathur P, Islam S, Yeom KW, Lawlor A, Killeen RP. Radiology artificial intelligence: a systematic review and evaluation of methods (RAISE). Eur Radiol 2022; 32:7998-8007. PMID: 35420305; PMCID: PMC9668941; DOI: 10.1007/s00330-022-08784-6.
Abstract
OBJECTIVE There has been a large amount of research in the field of artificial intelligence (AI) as applied to clinical radiology. However, these studies vary in design and quality, and systematic reviews of the entire field are lacking. This systematic review aimed to identify all papers that used deep learning in radiology, to survey the literature, and to evaluate their methods. We aimed to identify the key questions being addressed in the literature and the most effective methods employed. METHODS We followed the PRISMA guidelines and performed a systematic review of studies of AI in radiology published from 2015 to 2019. Our published protocol was prospectively registered. RESULTS Our search yielded 11,083 results. Seven hundred sixty-seven full texts were reviewed, and 535 articles were included. Ninety-eight percent were retrospective cohort studies. The median number of patients included was 460. Most studies involved MRI (37%). Neuroradiology was the most common subspecialty. Eighty-eight percent used supervised learning. The majority of studies undertook a segmentation task (39%). Performance comparison was with a state-of-the-art model in 37%. The most frequently used established architecture was UNet (14%). The median performance for the most utilised evaluation metrics was a Dice of 0.89 (range 0.49-0.99), an AUC of 0.903 (range 0.61-1.00), and an accuracy of 89.4 (range 70.2-100). Of the 77 studies that externally validated their results and allowed for direct comparison, performance on average decreased by 6% at external validation (range: 4% increase to 44% decrease). CONCLUSION This systematic review has surveyed the major advances in AI as applied to clinical radiology. KEY POINTS • While there are many papers reporting expert-level results by using deep learning in radiology, most apply only a narrow range of techniques to a narrow selection of use cases. • The literature is dominated by retrospective cohort studies with limited external validation and high potential for bias. • The recent advent of AI extensions to systematic reporting guidelines and prospective trial registration, along with a focus on external validation and explanations, shows potential for translating the hype surrounding AI from code to clinic.
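The Dice coefficient, the metric behind the review's headline segmentation result (median 0.89), has a compact definition; a minimal stdlib sketch on toy binary masks:

```python
def dice(a, b):
    """Dice similarity coefficient between two binary masks, given as
    equal-length flat sequences of 0/1: 2*|A∩B| / (|A| + |B|)."""
    a, b = list(a), list(b)
    inter = sum(x & y for x, y in zip(a, b))  # overlapping foreground pixels
    total = sum(a) + sum(b)
    return 1.0 if total == 0 else 2.0 * inter / total  # empty masks agree

pred  = [1, 1, 0, 0, 1]  # hypothetical model segmentation
truth = [1, 0, 0, 1, 1]  # hypothetical ground truth
print(dice(pred, truth))  # → 0.6666666666666666
```

Real segmentation masks are 2D or 3D arrays, but Dice is computed on the flattened foreground exactly as above.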
Affiliation(s)
- Brendan S Kelly
- St Vincent's University Hospital, Dublin, Ireland.
- Insight Centre for Data Analytics, UCD, Dublin, Ireland.
- Wellcome Trust - HRB, Irish Clinical Academic Training, Dublin, Ireland.
- School of Medicine, University College Dublin, Dublin, Ireland.
- HRB-Clinical Research Facility, NUI Galway, Galway, Ireland.
- Conor Judge
- Wellcome Trust - HRB, Irish Clinical Academic Training, Dublin, Ireland
- Lucille Packard Children's Hospital at Stanford, Stanford, CA, USA
- Stephanie M Bollard
- Wellcome Trust - HRB, Irish Clinical Academic Training, Dublin, Ireland
- School of Medicine, University College Dublin, Dublin, Ireland
- Awsam Aziz
- School of Medicine, University College Dublin, Dublin, Ireland
- Shah Islam
- Division of Brain Sciences, Imperial College London, GN1 Commonwealth Building, Hammersmith Hospital, Du Cane Road, London, W12 0HS, UK
- Kristen W Yeom
- HRB-Clinical Research Facility, NUI Galway, Galway, Ireland
- Ronan P Killeen
- St Vincent's University Hospital, Dublin, Ireland
- School of Medicine, University College Dublin, Dublin, Ireland
3
Yu AC, Mohajer B, Eng J. External Validation of Deep Learning Algorithms for Radiologic Diagnosis: A Systematic Review. Radiol Artif Intell 2022; 4:e210064. PMID: 35652114; DOI: 10.1148/ryai.210064.
Abstract
Purpose To assess generalizability of published deep learning (DL) algorithms for radiologic diagnosis. Materials and Methods In this systematic review, the PubMed database was searched for peer-reviewed studies of DL algorithms for image-based radiologic diagnosis that included external validation, published from January 1, 2015, through April 1, 2021. Studies using nonimaging features or incorporating non-DL methods for feature extraction or classification were excluded. Two reviewers independently evaluated studies for inclusion, and any discrepancies were resolved by consensus. Internal and external performance measures and pertinent study characteristics were extracted, and relationships among these data were examined using nonparametric statistics. Results Eighty-three studies reporting 86 algorithms were included. The vast majority (70 of 86, 81%) reported at least some decrease in external performance compared with internal performance, with nearly half (42 of 86, 49%) reporting at least a modest decrease (≥0.05 on the unit scale) and nearly a quarter (21 of 86, 24%) reporting a substantial decrease (≥0.10 on the unit scale). No study characteristics were found to be associated with the difference between internal and external performance. Conclusion Among published external validation studies of DL algorithms for image-based radiologic diagnosis, the vast majority demonstrated diminished algorithm performance on the external dataset, with some reporting a substantial performance decrease. Keywords: Meta-Analysis; Computer Applications-Detection/Diagnosis; Neural Networks; Computer Applications-General (Informatics); Epidemiology; Technology Assessment; Diagnosis; Informatics. Supplemental material is available for this article. © RSNA, 2022.
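The review's unit-scale thresholds for the internal-to-external performance drop (≥0.05 modest, ≥0.10 substantial) translate directly into code; the category labels below paraphrase the abstract and the scores are invented:

```python
def generalization_gap(internal, external):
    """Classify the drop from internal to external performance on the
    unit scale (e.g. AUC in [0, 1]), using the review's thresholds:
    >= 0.10 substantial, >= 0.05 modest."""
    drop = internal - external
    if drop >= 0.10:
        return "substantial decrease"
    if drop >= 0.05:
        return "modest decrease"
    if drop > 0:
        return "slight decrease"
    return "no decrease"

# Hypothetical algorithm: internal AUC 0.95, external AUC 0.83
print(generalization_gap(0.95, 0.83))  # → substantial decrease
print(generalization_gap(0.90, 0.87))  # → slight decrease
```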
Affiliation(s)
- Alice C Yu
- Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, 1800 Orleans St, Baltimore, MD 21287
- Bahram Mohajer
- Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, 1800 Orleans St, Baltimore, MD 21287
- John Eng
- Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, 1800 Orleans St, Baltimore, MD 21287
4
Jalal S, Lloyd ME, Khosa F, I-Hsuan Hsu G, Nicolaou S. Exploratory data analysis for pre and post 24/7/365 attending radiologist coverage support in an emergency department: fundamentals of data science. Emerg Radiol 2019; 27:233-251. PMID: 31840209; DOI: 10.1007/s10140-019-01737-5.
Abstract
OBJECTIVE To present a detailed exploratory data analysis for critically investigating the patterns in medical doctor (MD) to disposition time, pre and post 24/7/365 attending radiologist coverage, for patients presenting to an emergency department (ED). MATERIALS AND METHODS Several modeling techniques were presented. To share an understanding of concepts and techniques, we used proportions, medians, and means, the Mann-Whitney U test, Kaplan-Meier (KM) survival analysis, linear and log-linear regression, the log-rank test, the Cox proportional hazards model, Weibull parametric survival models, and tertile analysis. A retrospective chart review was conducted to obtain a data set used to determine the trends in MD to disposition time. Data comprised patients who had visited the emergency department during two distinct time periods and whose imaging studies were read by an attending emergency and trauma radiologist. RESULTS The median provided more insight into the data than the mean. The Mann-Whitney U test was appropriate to evaluate MD to disposition time but provided limited information. KM analysis offered more insight into the data because it did not assume an underlying model, which is why it was appropriate. However, KM had limited ability to handle measured confounders and was unable to describe the magnitude of difference between curves. The Cox proportional hazards semi-parametric model, or a parametric model such as the Weibull, could handle multiple measured confounders and describe the magnitude of difference between two (survival) groups in the data set. However, both methods assumed underlying models that may not apply to a data set such as the one used in this study. Linear regression was unlikely to be appropriate due to the shape of survival time distributions, but log-transforming the outcome could address the distribution issue.
Nearly all the results of the KM subgroup analyses were consistent with the results of the log-transformed linear regression subgroup analyses, and the interpretation of the results was the same for both. CONCLUSION Different statistical procedures may be applied to conduct exploratory subgroup analysis for a data set from a pre and post 24/7/365 attending coverage model. This could guide potential areas of further research comparing trends in MD to disposition time in the ED. Pattern analysis provides evidence for various stakeholders to rethink the discourse about trends in MD to disposition time, pre and post 24/7/365 attending coverage. Graphical illustration: the role of emergency and trauma radiology in an emergency department.
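The Kaplan-Meier estimator the abstract favors for not assuming an underlying model fits in a few lines of stdlib Python; the toy event times below are invented for illustration and are not the study's data:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate. `times` are observed durations
    (e.g. MD-to-disposition time); `events` are 1 if the endpoint was
    observed, 0 if censored. Returns (time, survival) steps at each
    time where at least one event occurred."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    surv, steps = 1.0, []
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths, n = 0, at_risk
        # Ties: everyone sharing this time leaves the risk set together
        while i < len(order) and times[order[i]] == t:
            deaths += events[order[i]]
            at_risk -= 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / n  # product-limit update
            steps.append((t, surv))
    return steps

# 5 hypothetical patients, one censored at t=3
print(kaplan_meier([1, 2, 3, 4, 5], [1, 1, 0, 1, 1]))
```

Censoring only shrinks the risk set, which is exactly the property that lets KM avoid a distributional assumption, at the cost (noted in the abstract) of not quantifying the difference between curves.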
Affiliation(s)
- Sabeena Jalal
- Emergency & Trauma Radiology, Department of Radiology, Vancouver General Hospital, Vancouver, Canada; McGill University, Montréal, Canada.
- Faisal Khosa
- Emergency & Trauma Radiology, Department of Radiology, Vancouver General Hospital, Vancouver, Canada
- Savvas Nicolaou
- Emergency & Trauma Radiology, Department of Radiology, Vancouver General Hospital, Vancouver, Canada
5
García Villar C, Marín León I. [Critical reading of analytical observational studies]. Radiologia 2015; 57 Suppl 2:1-9. PMID: 26123855; DOI: 10.1016/j.rx.2015.04.004.
Abstract
Analytical observational studies provide very important information about real-life clinical practice and the natural history of diseases and can suggest causality. Furthermore, they are very common in scientific journals. The aim of this article is to review the main concepts necessary for the critical reading of articles about radiological studies with observational designs. It reviews the characteristics that case-control and cohort studies must have to ensure high quality. It explains a method of critical reading that involves checking the attributes that should be evaluated in each type of article using a structured list of specific questions. It underlines the main characteristics that confer credibility and confidence on the article evaluated. Readers are provided with tools for the critical analysis of the observational studies published in scientific journals.
Affiliation(s)
- C García Villar
- Unidad Clínica de Diagnóstico por Imagen, Hospital Universitario Puerta del Mar, Cádiz, España
- I Marín León
- Medicina Interna, Hospital Universitario Virgen del Rocío, Sevilla, España; CIBERESP-IBIS, Fundación Enebro.
6
[Introduction to critical reading of articles: study design and biases]. Radiologia 2014; 57 Suppl 1:3-13. PMID: 25458123; DOI: 10.1016/j.rx.2014.08.002.
Abstract
The critical evaluation of an article enables professionals to make good use of new information and therefore has direct repercussions for the benefit of our patients. Before undertaking a detailed critical reading of a chosen article, we need to consider whether the study used the most appropriate design for the question it aimed to answer (i.e., whether the level of evidence is adequate). To do this, we need to know how to classify studies according to their design (descriptive or analytical; prospective or retrospective; cross-sectional or longitudinal), as well as how designs correlate with the levels of evidence. In critical reading, it is also important to know the main systematic errors, or biases, that can affect a study. Biases can appear in any phase of a study; they can affect the sample, the conduct of the study, or the measurement of the results.
7
García Villar C. [The importance of being systematic when criticizing]. Radiologia 2014; 57 Suppl 1:1-2. PMID: 25085465; DOI: 10.1016/j.rx.2014.06.003.
Affiliation(s)
- C García Villar
- Unidad Clínica de Diagnóstico por Imagen, Hospital Universitario Puerta del Mar, Cádiz, España.
8
Haramati LB. Ethical Trials to Determine the Risks and Benefits of Radiation Exposure from Coronary CT Angiography. J Am Coll Radiol 2008; 5:1073-6. DOI: 10.1016/j.jacr.2008.05.004.
9
10
Methodology and Application of Clinical Trials in Radiology: Self-Assessment Module. AJR Am J Roentgenol 2008; 190:S23-8. DOI: 10.2214/ajr.07.7011.
11
12
Costa C, Silva A, Oliveira JL. Current Perspectives on PACS and a Cardiology Case Study. Advanced Computational Intelligence Paradigms in Healthcare-2, 2007. DOI: 10.1007/978-3-540-72375-2_5.
13
Abstract
Bias is a form of systematic error that can affect scientific investigations and distort the measurement process. A biased study loses validity in relation to the degree of the bias. While some study designs are more prone to bias, its presence is universal. It is difficult or even impossible to completely eliminate bias. In the process of attempting to do so, new bias may be introduced or a study may be rendered less generalizable. Therefore, the goals are to minimize bias and for both investigators and readers to comprehend its residual effects, limiting misinterpretation and misuse of data. Numerous forms of bias have been described, and the terminology can be confusing, overlapping, and specific to a medical specialty. Much of the terminology is drawn from the epidemiology literature and may not be common parlance for radiologists. In this review, various types of bias are discussed, with emphasis on the radiology literature, and common study designs in which bias occurs are presented.
Affiliation(s)
- Gregory T Sica
- Harvard Vanguard Medical Associates, Boston, Mass., USA.
14
Abstract
To practice evidence-based radiology, knowledge of how to critically assess the literature is vital. This article outlines how to evaluate the radiology literature by asking these questions: Is it true?, Is it relevant?, and Is it sufficient? Clinical examples are used to explain several important causes of bias, as well as to clarify how these biases can affect the relevance of a particular study to a given clinical situation. Finally, discussion centers on the strength of evidence for both positive and negative findings in a research study.
Affiliation(s)
- C Craig Blackmore
- Department of Radiology, Harborview Medical Center, University of Washington, 325 Ninth Avenue, Box 359728, Seattle, WA 98104, USA.
15
Research Training for Radiology Trainees. J Vasc Interv Radiol 2004. DOI: 10.1016/s1051-0443(04)70024-x.
16
Abstract
Health technology assessment is the systematic and quantitative evaluation of the safety, efficacy, and cost of health care interventions. This article outlines aspects of technology assessment of diagnostic imaging. First, it presents a conceptual framework of a hierarchy of levels of efficacy that should guide thinking about imaging test evaluation. In particular, the framework shows how the question answered by most evaluations of imaging tests, "How well does this test distinguish disease from the nondiseased state?" relates to the fundamental questions for all health technology assessment, "How much does this intervention improve the health of people?" and "What is the cost of that improvement?" Second, it describes decision analysis and cost-effectiveness analysis, which are quantitative modeling techniques usually used to answer the two core questions for imaging. Third, it outlines design and operational considerations that are vital if researchers who are conducting an experimental study are to make a quality contribution to technology assessment, either directly through their findings or as an input into decision analyses. Finally, it includes a separate discussion of screening--that is, the application of diagnostic tests to nonsymptomatic populations--because the requirements for good screening tests are different from those for diagnostic tests of symptomatic patients and because the appropriate evaluation methods also differ.
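Cost-effectiveness analysis, one of the two quantitative modeling techniques the abstract names, reduces in its simplest form to an incremental cost-effectiveness ratio (ICER). The figures below are hypothetical and not from the article:

```python
def icer(cost_new, effect_new, cost_old, effect_old):
    """Incremental cost-effectiveness ratio: extra cost per extra unit
    of health effect (e.g. QALYs) of a new strategy over a comparator."""
    return (cost_new - cost_old) / (effect_new - effect_old)

# Hypothetical: a new imaging strategy costs $1,200 vs $400
# and yields 1.52 vs 1.50 QALYs per patient
print(icer(1200, 1.52, 400, 1.50))  # cost per QALY gained
```

Real decision analyses weight costs and effects over modeled disease pathways, but the ratio above is the quantity ultimately compared against a willingness-to-pay threshold.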
Affiliation(s)
- Jonathan H Sunshine
- Department of Research, American College of Radiology, 1891 Preston White Dr, Reston, VA 20191, USA.