1. Rickard D, Kabir MA, Homaira N. Machine learning-based approaches for distinguishing viral and bacterial pneumonia in paediatrics: A scoping review. Computer Methods and Programs in Biomedicine 2025; 268:108802. [PMID: 40349546 DOI: 10.1016/j.cmpb.2025.108802]
Abstract
BACKGROUND AND OBJECTIVE: Pneumonia is the leading cause of hospitalisation and mortality among children under five, particularly in low-resource settings. Accurate differentiation between viral and bacterial pneumonia is essential for guiding appropriate treatment, yet it remains challenging due to overlapping clinical and radiographic features. Advances in machine learning (ML), particularly deep learning (DL), have shown promise in classifying pneumonia using chest X-ray (CXR) images. This scoping review summarises the evidence on ML techniques for classifying viral and bacterial pneumonia using CXR images in paediatric patients.

METHODS: This scoping review was conducted following the Joanna Briggs Institute methodology and the PRISMA-ScR guidelines. A comprehensive search was performed in PubMed, Embase, and Scopus to identify studies involving children (0-18 years) with pneumonia diagnosed through CXR, using ML models for binary or multiclass classification. Data extraction included ML models, dataset characteristics, and performance metrics.

RESULTS: A total of 35 studies, published between 2018 and 2025, were included in this review. Of these, 31 studies used the publicly available Kermany dataset, raising concerns about overfitting and limited generalisability to broader, real-world clinical populations. Most studies (n=33) used convolutional neural networks (CNNs) for pneumonia classification. While many models demonstrated promising performance, significant variability was observed due to differences in methodologies, dataset sizes, and validation strategies, complicating direct comparisons. For binary classification (viral vs bacterial pneumonia), a median accuracy of 92.3% (range: 80.8% to 97.9%) was reported. For multiclass classification (healthy, viral pneumonia, and bacterial pneumonia), the median accuracy was 91.8% (range: 76.8% to 99.7%).

CONCLUSIONS: Current evidence is constrained by a predominant reliance on a single dataset and variability in methodologies, which limit the generalisability and clinical applicability of findings. To address these limitations, future research should focus on developing diverse and representative datasets while adhering to standardised reporting guidelines. Such efforts are essential to improve the reliability, reproducibility, and translational potential of machine learning models in clinical settings.
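Most of the included studies follow a transfer-learning recipe of roughly this shape. The sketch below is illustrative only, assuming an ImageNet-pretrained ResNet-18 backbone and synthetic tensors in place of the Kermany (or any other) paediatric CXR dataset; it is not taken from any reviewed paper.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models

# Synthetic tensors stand in for preprocessed paediatric CXRs; grayscale films are
# usually replicated to three channels to match ImageNet-pretrained backbones.
images = torch.randn(32, 3, 224, 224)
labels = torch.randint(0, 2, (32,))            # 0 = viral, 1 = bacterial (toy labels)
loader = DataLoader(TensorDataset(images, labels), batch_size=8, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)  # replace the 1000-class head with 2 classes
optimiser = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for x, y in loader:                            # one toy epoch of fine-tuning
    optimiser.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimiser.step()
```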
Affiliation(s)
- Declan Rickard: School of Clinical Medicine, UNSW Sydney, Kensington, NSW, 2052, Australia.
- Muhammad Ashad Kabir: School of Computing, Mathematics and Engineering, Charles Sturt University, Bathurst, NSW, 2795, Australia; Artificial Intelligence and Cyber Futures Institute, Charles Sturt University, Bathurst, NSW, 2795, Australia.
- Nusrat Homaira: School of Clinical Medicine, UNSW Sydney, Kensington, NSW, 2052, Australia; Discipline of Pediatrics and Child Health, UNSW Sydney, Randwick, NSW, 2031, Australia; Respiratory Department, Sydney Children's Hospital, Randwick, NSW, 2031, Australia.
2. Shih YC, Ko CL, Wang SY, Chang CY, Lin SS, Huang CW, Cheng MF, Chen CM, Wu YW. Cross-institutional validation of a polar map-free 3D deep learning model for obstructive coronary artery disease prediction using myocardial perfusion imaging: insights into generalizability and bias. Eur J Nucl Med Mol Imaging 2025:10.1007/s00259-025-07243-w. [PMID: 40198356 DOI: 10.1007/s00259-025-07243-w]
Abstract
PURPOSE: Deep learning (DL) models for predicting obstructive coronary artery disease (CAD) using myocardial perfusion imaging (MPI) have shown potential for enhancing diagnostic accuracy. However, their ability to maintain consistent performance across institutions and demographics remains uncertain. This study aimed to investigate the generalizability and potential biases of an in-house MPI DL model between two hospital-based cohorts.

METHODS: We retrospectively included patients from two medical centers in Taiwan who underwent stress/redistribution thallium-201 MPI followed by invasive coronary angiography within 90 days as the reference standard. A polar map-free 3D DL model trained on 928 MPI images from one center to predict obstructive CAD was tested on internal (933 images) and external (3234 images from the other center) validation sets. Diagnostic performance, assessed using area under receiver operating characteristic curves (AUCs), was compared between the internal and external cohorts, demographic groups, and with the performance of stress total perfusion deficit (TPD).

RESULTS: The model showed significantly lower performance in the external cohort compared to the internal cohort in both patient-based (AUC: 0.713 vs. 0.813) and vessel-based (AUC: 0.733 vs. 0.782) analyses, but still outperformed stress TPD (all p < 0.001). The performance was lower in patients who underwent treadmill stress MPI in the internal cohort and in patients over 70 years old in the external cohort.

CONCLUSIONS: This study demonstrated adequate performance but also limitations in the generalizability of the DL-based MPI model, along with biases related to stress type and patient age. Thorough validation is essential before the clinical implementation of DL MPI models.
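For readers who want to reproduce this kind of internal-versus-external comparison, a hedged sketch follows: synthetic labels and scores stand in for the CAD ground truth and model probabilities, and the percentile bootstrap interval is one common way (not necessarily the authors' stated method) to quantify uncertainty around each AUC.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def bootstrap_auc(y, s, n_boot=1000):
    """Point AUC plus a percentile 95% bootstrap confidence interval."""
    aucs, idx = [], np.arange(len(y))
    for _ in range(n_boot):
        b = rng.choice(idx, size=len(idx), replace=True)
        if len(np.unique(y[b])) < 2:          # need both classes in the resample
            continue
        aucs.append(roc_auc_score(y[b], s[b]))
    return roc_auc_score(y, s), np.percentile(aucs, [2.5, 97.5])

# Synthetic stand-ins sized like the study's validation sets (933 internal, 3234 external).
y_int, s_int = rng.integers(0, 2, 933), rng.random(933)
y_ext, s_ext = rng.integers(0, 2, 3234), rng.random(3234)

for name, (y, s) in {"internal": (y_int, s_int), "external": (y_ext, s_ext)}.items():
    auc, ci = bootstrap_auc(y, s)
    print(f"{name}: AUC={auc:.3f} (95% CI {ci[0]:.3f}-{ci[1]:.3f})")
```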
Affiliation(s)
- Yu-Cheng Shih: Department of Nuclear Medicine, Far Eastern Memorial Hospital, New Taipei City, Taiwan
- Chi-Lun Ko: Department of Biomedical Engineering, National Taiwan University, Taipei, Taiwan; Department of Nuclear Medicine, National Taiwan University Hospital, Taipei, Taiwan; College of Medicine, National Taiwan University, Taipei, Taiwan
- Shan-Ying Wang: Department of Nuclear Medicine, Far Eastern Memorial Hospital, New Taipei City, Taiwan; Electrical and Communication Engineering College, Yuan Ze University, Taoyuan, Taiwan
- Chen-Yu Chang: Department of Biomedical Engineering, National Taiwan University, Taipei, Taiwan
- Shau-Syuan Lin: Department of Biomedical Engineering, National Taiwan University, Taipei, Taiwan
- Cheng-Wen Huang: Department of Biomedical Engineering, National Taiwan University, Taipei, Taiwan
- Mei-Fang Cheng: Department of Nuclear Medicine, National Taiwan University Hospital, Taipei, Taiwan; College of Medicine, National Taiwan University, Taipei, Taiwan
- Chung-Ming Chen: Department of Biomedical Engineering, National Taiwan University, Taipei, Taiwan
- Yen-Wen Wu: Department of Nuclear Medicine, Far Eastern Memorial Hospital, New Taipei City, Taiwan; Division of Cardiology, Cardiovascular Center, Far Eastern Memorial Hospital, No. 21, Sec. 2, Nanya S. Rd., Banqiao Dist, New Taipei City, 220216, Taiwan; School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan; Graduate Institute of Medicine, Yuan Ze University, Taoyuan City, Taiwan.
3. Ganatra HA. Machine Learning in Pediatric Healthcare: Current Trends, Challenges, and Future Directions. J Clin Med 2025; 14:807. [PMID: 39941476 PMCID: PMC11818243 DOI: 10.3390/jcm14030807]
Abstract
Background/Objectives: Artificial intelligence (AI) and machine learning (ML) are transforming healthcare by enabling predictive, diagnostic, and therapeutic advancements. Pediatric healthcare presents unique challenges, including limited data availability, developmental variability, and ethical considerations. This narrative review explores the current trends, applications, challenges, and future directions of ML in pediatric healthcare. Methods: A systematic search of the PubMed database was conducted using the query: ("artificial intelligence" OR "machine learning") AND ("pediatric" OR "paediatric"). Studies were reviewed to identify key themes, methodologies, applications, and challenges. Gaps in the research and ethical considerations were also analyzed to propose future research directions. Results: ML has demonstrated promise in diagnostic support, prognostic modeling, and therapeutic planning for pediatric patients. Applications include the early detection of conditions like sepsis, improved diagnostic imaging, and personalized treatment strategies for chronic conditions such as epilepsy and Crohn's disease. However, challenges such as data limitations, ethical concerns, and lack of model generalizability remain significant barriers. Emerging techniques, including federated learning and explainable AI (XAI), offer potential solutions. Despite these advancements, research gaps persist in data diversity, model interpretability, and ethical frameworks. Conclusions: ML offers transformative potential in pediatric healthcare by addressing diagnostic, prognostic, and therapeutic challenges. While advancements highlight its promise, overcoming barriers such as data limitations, ethical concerns, and model trustworthiness is essential for its broader adoption. Future efforts should focus on enhancing data diversity, developing standardized ethical guidelines, and improving model transparency to ensure equitable and effective implementation in pediatric care.
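The PubMed query quoted in the Methods can be run programmatically. Below is a hedged sketch using Biopython's Entrez utilities; the email address and the retmax value are placeholders rather than details from the review.

```python
from Bio import Entrez

Entrez.email = "your.name@example.org"   # NCBI asks for a contact address; placeholder
query = '("artificial intelligence" OR "machine learning") AND ("pediatric" OR "paediatric")'

handle = Entrez.esearch(db="pubmed", term=query, retmax=50)
record = Entrez.read(handle)
handle.close()

print("Total hits:", record["Count"])
print("First PMIDs:", record["IdList"][:10])
```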
Affiliation(s)
- Hammad A Ganatra: Pediatric Critical Care Medicine, Cleveland Clinic Children's, 9500 Euclid Ave, Cleveland, OH 44195, USA
4. Rajaraman S, Liang Z, Xue Z, Antani S. Addressing Class Imbalance with Latent Diffusion-based Data Augmentation for Improving Disease Classification in Pediatric Chest X-rays. Proceedings. IEEE International Conference on Bioinformatics and Biomedicine 2024; 2024:5059-5066. [PMID: 40134830 PMCID: PMC11936509 DOI: 10.1109/bibm62325.2024.10822172]
Abstract
Deep learning (DL) has transformed medical image classification; however, its efficacy is often limited by significant data imbalance due to far fewer cases (minority class) compared to controls (majority class). It has been shown that synthetic image augmentation techniques can simulate clinical variability, leading to enhanced model performance. We hypothesize that they could also mitigate the challenge of data imbalance, thereby addressing overfitting to the majority class and enhancing generalization. Recently, latent diffusion models (LDMs) have shown promise in synthesizing high-quality medical images. This study evaluates the effectiveness of a text-guided image-to-image LDM in synthesizing disease-positive chest X-rays (CXRs) and augmenting a pediatric CXR dataset to improve classification performance. We first establish baseline performance by fine-tuning an ImageNet-pretrained Inception-V3 model on class-imbalanced data for two tasks-normal vs. pneumonia and normal vs. bronchopneumonia. Next, we fine-tune individual text-guided image-to-image LDMs to generate CXRs showing signs of pneumonia and bronchopneumonia. The Inception-V3 model is retrained on an updated data set that includes these synthesized images as part of augmented training and validation sets. Classification performance is compared using balanced accuracy, sensitivity, specificity, F-score, Matthews correlation coefficient (MCC), Kappa, and Youden's index against the baseline performance. Results show that the augmentation significantly improves Youden's index (p<0.05) and markedly enhances other metrics, indicating that data augmentation using LDM-synthesized images is an effective strategy for addressing class imbalance in medical image classification.
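All of the reported metrics can be derived from a binary confusion matrix. The toy sketch below (invented labels, not the study's data) computes them with scikit-learn where possible and Youden's index by hand.

```python
import numpy as np
from sklearn.metrics import (balanced_accuracy_score, cohen_kappa_score,
                             confusion_matrix, f1_score, matthews_corrcoef)

y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0, 1, 0])   # toy labels (1 = disease-positive)
y_pred = np.array([0, 0, 1, 1, 1, 0, 1, 0, 1, 0])   # toy predictions

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)

print("balanced accuracy:", balanced_accuracy_score(y_true, y_pred))
print("sensitivity:", sensitivity)
print("specificity:", specificity)
print("F-score:", f1_score(y_true, y_pred))
print("MCC:", matthews_corrcoef(y_true, y_pred))
print("kappa:", cohen_kappa_score(y_true, y_pred))
print("Youden's index:", sensitivity + specificity - 1)
```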
Affiliation(s)
- Sivaramakrishnan Rajaraman, Zhaohui Liang, Zhiyun Xue, Sameer Antani: Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
5. Santomartino SM, Zech JR, Hall K, Jeudy J, Parekh V, Yi PH, Weintraub E. Evaluating the Performance and Bias of Natural Language Processing Tools in Labeling Chest Radiograph Reports. Radiology 2024; 313:e232746. [PMID: 39436298 PMCID: PMC11535863 DOI: 10.1148/radiol.232746]
Abstract
Background: Natural language processing (NLP) is commonly used to annotate radiology datasets for training deep learning (DL) models. However, the accuracy and potential biases of these NLP methods have not been thoroughly investigated, particularly across different demographic groups.

Purpose: To evaluate the accuracy and demographic bias of four NLP radiology report labeling tools on two chest radiograph datasets.

Materials and Methods: This retrospective study, performed between April 2022 and April 2024, evaluated chest radiograph report labeling using four NLP tools (CheXpert [rule-based], RadReportAnnotator [RRA; DL-based], OpenAI's GPT-4 [DL-based], cTAKES [hybrid]) on a subset of the Medical Information Mart for Intensive Care (MIMIC) chest radiograph dataset balanced for representation of age, sex, and race and ethnicity (n = 692) and the entire Indiana University (IU) chest radiograph dataset (n = 3665). Three board-certified radiologists annotated the chest radiograph reports for 14 thoracic disease labels. NLP tool performance was evaluated using several metrics, including accuracy and error rate. Bias was evaluated by comparing performance between demographic subgroups using the Pearson χ2 test.

Results: The IU dataset included 3665 patients (mean age, 49.7 years ± 17 [SD]; 1963 female), while the MIMIC dataset included 692 patients (mean age, 54.1 years ± 23.1; 357 female). All four NLP tools demonstrated high accuracy across findings in the IU and MIMIC datasets, as follows: CheXpert (92.6% [47 516 of 51 310], 90.2% [8742 of 9688]), RRA (82.9% [19 746 of 23 829], 92.2% [2870 of 3114]), GPT-4 (94.3% [45 586 of 48 342], 91.6% [6721 of 7336]), and cTAKES (84.7% [43 436 of 51 310], 88.7% [8597 of 9688]). RRA and cTAKES had higher accuracy (P < .001) on the MIMIC dataset, while CheXpert and GPT-4 had higher accuracy on the IU dataset. Differences (P < .001) in error rates were observed across age groups for all NLP tools except RRA on the MIMIC dataset, with the highest error rates for CheXpert, RRA, and cTAKES in patients older than 80 years (mean, 15.8% ± 5.0) and the highest error rate for GPT-4 in patients 60-80 years of age (8.3%).

Conclusion: Although commonly used NLP tools for chest radiograph report annotation are accurate when evaluating reports in aggregate, demographic subanalyses showed significant bias, with poorer performance in older patients. © RSNA, 2024 Supplemental material is available for this article. See also the editorial by Cai in this issue.
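The subgroup bias check described above amounts to a Pearson chi-square test on error/correct counts per demographic group. The sketch below uses invented counts purely to show the mechanics; it does not reproduce the study's data.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows are age groups, columns are [errors, correct labels] for one hypothetical tool.
counts = np.array([
    [ 52, 948],   # <40 years
    [ 71, 929],   # 40-60 years
    [ 83, 917],   # 60-80 years
    [158, 842],   # >80 years
])
chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi2={chi2:.1f}, dof={dof}, p={p:.2g}")
```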
Affiliation(s)
- Samantha M. Santomartino, John R. Zech, Kent Hall, Jean Jeudy, Vishwa Parekh, Paul H. Yi, Elizabeth Weintraub
- From Drexel University College of Medicine, Philadelphia, Pa (S.M.S.); Department of Radiology, Columbia University Irving Medical Center, New York, NY (J.R.Z.); Department of Radiology, Wake Forest University Health Sciences Center, Winston-Salem, NC (K.H.); Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, Baltimore, Md (J.J., V.P.); and Department of Diagnostic Imaging, St. Jude Children's Research Hospital, 262 Danny Thomas Plc, Memphis, TN 38105-3678 (P.H.Y.)
6. Siddiqi R, Javaid S. Deep Learning for Pneumonia Detection in Chest X-ray Images: A Comprehensive Survey. J Imaging 2024; 10:176. [PMID: 39194965 DOI: 10.3390/jimaging10080176]
Abstract
This paper addresses the significant problem of identifying the relevant background and contextual literature related to deep learning (DL) as an evolving technology in order to provide a comprehensive analysis of the application of DL to the specific problem of pneumonia detection via chest X-ray (CXR) imaging, which is the most common and cost-effective imaging technique available worldwide for pneumonia diagnosis. This paper in particular addresses the key period associated with COVID-19, 2020-2023, to explain, analyze, and systematically evaluate the limitations of approaches and determine their relative levels of effectiveness. The context in which DL is applied as both an aid to and an automated substitute for existing expert radiography professionals, who often have limited availability, is elaborated in detail. The rationale for the undertaken research is provided, along with a justification of the resources adopted and their relevance. This explanatory text and the subsequent analyses are intended to provide sufficient detail of the problem being addressed, existing solutions, and the limitations of these, ranging in detail from the specific to the more general. Indeed, our analysis and evaluation agree with the generally held view that the use of transformers, specifically, vision transformers (ViTs), is the most promising technique for obtaining further effective results in the area of pneumonia detection using CXR images. However, ViTs require extensive further research to address several limitations, specifically the following: biased CXR datasets, data and code availability, the ease with which a model can be explained, systematic methods of accurate model comparison, the notion of class imbalance in CXR datasets, and the possibility of adversarial attacks, the latter of which remains an area of fundamental research.
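As a concrete illustration of the ViT-based direction the survey highlights, the hedged sketch below swaps the classification head of an ImageNet-pretrained vision transformer for a two-class pneumonia output; it does not reproduce any specific surveyed model, and the input is a random tensor standing in for a preprocessed CXR.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.vit_b_16(weights=models.ViT_B_16_Weights.DEFAULT)
model.heads.head = nn.Linear(model.heads.head.in_features, 2)   # normal vs pneumonia

x = torch.randn(1, 3, 224, 224)        # stand-in for a CXR resized/replicated to 3x224x224
with torch.no_grad():
    logits = model(x)
print(logits.shape)                     # torch.Size([1, 2])
```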
Affiliation(s)
- Raheel Siddiqi, Sameena Javaid: Computer Science Department, Karachi Campus, Bahria University, Karachi 73500, Pakistan
7. Wu D, Smith D, VanBerlo B, Roshankar A, Lee H, Li B, Ali F, Rahman M, Basmaji J, Tschirhart J, Ford A, VanBerlo B, Durvasula A, Vannelli C, Dave C, Deglint J, Ho J, Chaudhary R, Clausdorff H, Prager R, Millington S, Shah S, Buchanan B, Arntfield R. Improving the Generalizability and Performance of an Ultrasound Deep Learning Model Using Limited Multicenter Data for Lung Sliding Artifact Identification. Diagnostics (Basel) 2024; 14:1081. [PMID: 38893608 PMCID: PMC11172006 DOI: 10.3390/diagnostics14111081]
Abstract
Deep learning (DL) models for medical image classification frequently struggle to generalize to data from outside institutions. Additional clinical data are also rarely collected to comprehensively assess and understand model performance amongst subgroups. Following the development of a single-center model to identify the lung sliding artifact on lung ultrasound (LUS), we pursued a validation strategy using external LUS data. As annotated LUS data are relatively scarce compared to other medical imaging data, we adopted a novel technique to optimize the use of limited external data to improve model generalizability. Externally acquired LUS data from three tertiary care centers, totaling 641 clips from 238 patients, were used to assess the baseline generalizability of our lung sliding model. We then employed our novel Threshold-Aware Accumulative Fine-Tuning (TAAFT) method to fine-tune the baseline model and determine the minimum amount of data required to achieve predefined performance goals. A subgroup analysis was also performed and Grad-CAM++ explanations were examined. The final model was fine-tuned on one-third of the external dataset to achieve 0.917 sensitivity, 0.817 specificity, and 0.920 area under the receiver operator characteristic curve (AUC) on the external validation dataset, exceeding our predefined performance goals. Subgroup analyses identified LUS characteristics that most greatly challenged the model's performance. Grad-CAM++ saliency maps highlighted clinically relevant regions on M-mode images. We report a multicenter study that exploits limited available external data to improve the generalizability and performance of our lung sliding model while identifying poorly performing subgroups to inform future iterative improvements. This approach may contribute to efficiencies for DL researchers working with smaller quantities of external validation data.
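The external-validation step can be summarised as computing sensitivity, specificity, and AUC and checking them against predefined goals. The sketch below uses synthetic predictions and hypothetical target values; it illustrates only this evaluation step and is not the TAAFT procedure itself.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, 641)                              # toy labels for 641 clips
y_score = np.clip(y_true * 0.6 + rng.normal(0.3, 0.25, 641), 0, 1)
y_pred = (y_score >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
achieved = {"sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "AUC": roc_auc_score(y_true, y_score)}
goals = {"sensitivity": 0.90, "specificity": 0.80, "AUC": 0.90}   # hypothetical targets

for metric, target in goals.items():
    status = "met" if achieved[metric] >= target else "not met"
    print(f"{metric}: {achieved[metric]:.3f} (goal {target}) -> {status}")
```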
Affiliation(s)
- Derek Wu: Department of Medicine, Western University, London, ON N6A 5C1, Canada
- Delaney Smith: Faculty of Mathematics, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Blake VanBerlo: Faculty of Mathematics, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Amir Roshankar: Faculty of Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Hoseok Lee: Faculty of Mathematics, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Brian Li: Faculty of Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Faraz Ali: Faculty of Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Marwan Rahman: Faculty of Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- John Basmaji: Division of Critical Care Medicine, Western University, London, ON N6A 5C1, Canada
- Jared Tschirhart: Schulich School of Medicine and Dentistry, Western University, London, ON N6A 5C1, Canada
- Alex Ford: Independent Researcher, London, ON N6A 1L8, Canada
- Bennett VanBerlo: Faculty of Engineering, Western University, London, ON N6A 5C1, Canada
- Ashritha Durvasula: Schulich School of Medicine and Dentistry, Western University, London, ON N6A 5C1, Canada
- Claire Vannelli: Schulich School of Medicine and Dentistry, Western University, London, ON N6A 5C1, Canada
- Chintan Dave: Division of Critical Care Medicine, Western University, London, ON N6A 5C1, Canada
- Jason Deglint: Faculty of Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Jordan Ho: Department of Family Medicine, Western University, London, ON N6A 5C1, Canada
- Rushil Chaudhary: Department of Medicine, Western University, London, ON N6A 5C1, Canada
- Hans Clausdorff: Departamento de Medicina de Urgencia, Pontificia Universidad Católica de Chile, Santiago 8331150, Chile
- Ross Prager: Division of Critical Care Medicine, Western University, London, ON N6A 5C1, Canada
- Scott Millington: Department of Critical Care Medicine, University of Ottawa, Ottawa, ON K1N 6N5, Canada
- Samveg Shah: Department of Medicine, University of Alberta, Edmonton, AB T6G 2R3, Canada
- Brian Buchanan: Department of Critical Care, University of Alberta, Edmonton, AB T6G 2R3, Canada
- Robert Arntfield: Division of Critical Care Medicine, Western University, London, ON N6A 5C1, Canada
8. Rajaraman S, Zamzmi G, Yang F, Liang Z, Xue Z, Antani S. Uncovering the effects of model initialization on deep model generalization: A study with adult and pediatric chest X-ray images. PLOS Digital Health 2024; 3:e0000286. [PMID: 38232121 DOI: 10.1371/journal.pdig.0000286]
Abstract
Model initialization techniques are vital for improving the performance and reliability of deep learning models in medical computer vision applications. While much literature exists on non-medical images, the impacts on medical images, particularly chest X-rays (CXRs), are less understood. Addressing this gap, our study explores three deep model initialization techniques: Cold-start, Warm-start, and Shrink and Perturb start, focusing on adult and pediatric populations. We specifically focus on scenarios with periodically arriving data for training, thereby embracing the real-world scenarios of ongoing data influx and the need for model updates. We evaluate these models for generalizability against external adult and pediatric CXR datasets. We also propose novel ensemble methods: F-score-weighted Sequential Least-Squares Quadratic Programming (F-SLSQP) and Attention-Guided Ensembles with Learnable Fuzzy Softmax to aggregate weight parameters from multiple models to capitalize on their collective knowledge and complementary representations. We perform statistical significance tests with 95% confidence intervals and p-values to analyze model performance. Our evaluations indicate models initialized with ImageNet-pretrained weights demonstrate superior generalizability over randomly initialized counterparts, contradicting some findings for non-medical images. Notably, ImageNet-pretrained models exhibit consistent performance during internal and external testing across different training scenarios. Weight-level ensembles of these models show significantly higher recall (p<0.05) during testing compared to individual models. Thus, our study accentuates the benefits of ImageNet-pretrained weight initialization, especially when used with weight-level ensembles, for creating robust and generalizable deep learning solutions.
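Of the three initialization schemes studied, "Shrink and Perturb" is the least self-explanatory. A minimal sketch follows, assuming the usual recipe of scaling existing weights toward zero and adding small Gaussian noise before retraining on newly arrived data; the shrink factor and noise scale are illustrative choices, not the paper's settings.

```python
import torch
import torch.nn as nn

def shrink_and_perturb(model: nn.Module, shrink: float = 0.4, sigma: float = 0.01) -> nn.Module:
    """Scale existing weights toward zero and add Gaussian noise before further training."""
    with torch.no_grad():
        for p in model.parameters():
            p.mul_(shrink).add_(torch.randn_like(p) * sigma)
    return model

warm_model = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 2))
# ...assume warm_model was already trained on the previously available data...
warm_model = shrink_and_perturb(warm_model)   # ready for fine-tuning on newly arrived data
```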
Affiliation(s)
- Sivaramakrishnan Rajaraman, Ghada Zamzmi, Feng Yang, Zhaohui Liang, Zhiyun Xue, Sameer Antani: Computational Health Research Branch, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
9. Rajaraman S, Yang F, Zamzmi G, Xue Z, Antani S. Can Deep Adult Lung Segmentation Models Generalize to the Pediatric Population? Expert Systems with Applications 2023; 229:120531. [PMID: 37397242 PMCID: PMC10310063 DOI: 10.1016/j.eswa.2023.120531]
Abstract
Lung segmentation in chest X-rays (CXRs) is an important prerequisite for improving the specificity of diagnoses of cardiopulmonary diseases in a clinical decision support system. Current deep learning models for lung segmentation are trained and evaluated on CXR datasets in which the radiographic projections are captured predominantly from the adult population. However, the shape of the lungs is reported to be significantly different across the developmental stages from infancy to adulthood. This might result in age-related data domain shifts that would adversely impact lung segmentation performance when the models trained on the adult population are deployed for pediatric lung segmentation. In this work, our goal is to (i) analyze the generalizability of deep adult lung segmentation models to the pediatric population and (ii) improve performance through a stage-wise, systematic approach consisting of CXR modality-specific weight initializations, stacked ensembles, and an ensemble of stacked ensembles. To evaluate segmentation performance and generalizability, novel evaluation metrics consisting of mean lung contour distance (MLCD) and average hash score (AHS) are proposed in addition to the multi-scale structural similarity index measure (MS-SSIM), the intersection over union (IoU), Dice score, 95% Hausdorff distance (HD95), and average symmetric surface distance (ASSD). Our results showed a significant improvement (p < 0.05) in cross-domain generalization through our approach. This study could serve as a paradigm to analyze the cross-domain generalizability of deep segmentation models for other medical imaging modalities and applications.
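Two of the overlap metrics listed above, Dice and IoU, reduce to simple set arithmetic on binary masks. A toy sketch follows, with random masks standing in for predicted and ground-truth lungs.

```python
import numpy as np

rng = np.random.default_rng(0)
pred = rng.random((256, 256)) > 0.5     # predicted lung mask (toy)
gt = rng.random((256, 256)) > 0.5       # ground-truth lung mask (toy)

intersection = np.logical_and(pred, gt).sum()
union = np.logical_or(pred, gt).sum()
dice = 2 * intersection / (pred.sum() + gt.sum())
iou = intersection / union
print(f"Dice={dice:.3f}, IoU={iou:.3f}")
```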
Affiliation(s)
- Sivaramakrishnan Rajaraman, Feng Yang, Ghada Zamzmi, Zhiyun Xue, Sameer Antani: Computational Health Research Branch, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
10. Krokos G, MacKewn J, Dunn J, Marsden P. A review of PET attenuation correction methods for PET-MR. EJNMMI Phys 2023; 10:52. [PMID: 37695384 PMCID: PMC10495310 DOI: 10.1186/s40658-023-00569-0]
Abstract
Although thirteen years have passed since the installation of the first PET-MR system, these scanners constitute a very small proportion of the total hybrid PET systems installed. This is in stark contrast to the rapid expansion of the PET-CT scanner, which quickly established its importance in patient diagnosis within a similar timeframe. One of the main hurdles is the development of an accurate, reproducible and easy-to-use method for attenuation correction. Quantitative discrepancies in PET images between the manufacturer-provided MR methods and the more established CT- or transmission-based attenuation correction methods have led the scientific community into a continuous effort to develop a robust and accurate alternative. These approaches can be divided into four broad categories: (i) MR-based, (ii) emission-based, (iii) atlas-based, and (iv) machine learning-based attenuation correction, which is rapidly gaining momentum. The first is based on segmenting the MR images into various tissues and allocating a predefined attenuation coefficient for each tissue. Emission-based attenuation correction methods aim to utilise the PET emission data by simultaneously reconstructing the radioactivity distribution and the attenuation image. Atlas-based attenuation correction methods aim to predict a CT or transmission image given an MR image of a new patient, by using databases containing CT or transmission images from the general population. Finally, in machine learning methods, a model that could predict the required image given the acquired MR or non-attenuation-corrected PET image is developed by exploiting the underlying features of the images. Deep learning methods are the dominant approach in this category. Compared to the more traditional machine learning, which uses structured data for building a model, deep learning makes direct use of the acquired images to identify underlying features. This up-to-date review goes through the literature of attenuation correction approaches in PET-MR after categorising them. The various approaches in each category are described and discussed. After exploring each category separately, a general overview is given of the current status and potential future approaches along with a comparison of the four outlined categories.
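A hedged sketch of the first category (segmentation-based MR attenuation correction): each voxel's tissue label is mapped to a predefined 511 keV linear attenuation coefficient. The coefficients below are approximate textbook values, and the random label volume is a placeholder for a real MR segmentation.

```python
import numpy as np

MU_511KEV = {0: 0.0,      # air
             1: 0.0224,   # lung (approximate)
             2: 0.096,    # soft tissue / water
             3: 0.151}    # cortical bone (approximate)

labels = np.random.randint(0, 4, size=(128, 128, 96))             # toy MR tissue segmentation
mu_map = np.vectorize(MU_511KEV.get)(labels).astype(np.float32)   # attenuation map in cm^-1
print(mu_map.shape, float(mu_map.min()), float(mu_map.max()))
```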
Affiliation(s)
- Georgios Krokos, Jane MacKewn, Joel Dunn, Paul Marsden: School of Biomedical Engineering and Imaging Sciences, The PET Centre at St Thomas' Hospital London, King's College London, 1st Floor Lambeth Wing, Westminster Bridge Road, London, SE1 7EH, UK
11. Beheshtian E, Putman K, Santomartino SM, Parekh VS, Yi PH. Generalizability and Bias in a Deep Learning Pediatric Bone Age Prediction Model Using Hand Radiographs. Radiology 2023; 306:e220505. [PMID: 36165796 DOI: 10.1148/radiol.220505]
Abstract
Background: Although deep learning (DL) models have demonstrated expert-level ability for pediatric bone age prediction, they have shown poor generalizability and bias in other use cases.

Purpose: To quantify generalizability and bias in a bone age DL model measured by performance on external versus internal test sets and performance differences between different demographic groups, respectively.

Materials and Methods: The winning DL model of the 2017 RSNA Pediatric Bone Age Challenge was retrospectively evaluated and trained on 12 611 pediatric hand radiographs from two U.S. hospitals. The DL model was tested from September 2021 to December 2021 on an internal validation set and an external test set of pediatric hand radiographs with diverse demographic representation. Images reporting ground-truth bone age were included for study. Mean absolute difference (MAD) between ground-truth bone age and the model prediction bone age was calculated for each set. Generalizability was evaluated by comparing MAD between internal and external evaluation sets with use of t tests. Bias was evaluated by comparing MAD and clinically significant error rate (rate of errors changing the clinical diagnosis) between demographic groups with use of t tests or analysis of variance and χ2 tests, respectively (statistically significant difference defined as P < .05).

Results: The internal validation set had images from 1425 individuals (773 boys), and the external test set had images from 1202 individuals (mean age, 133 months ± 60 [SD]; 614 boys). The bone age model generalized well to the external test set, with no difference in MAD (6.8 months in the validation set vs 6.9 months in the external set; P = .64). Model predictions would have led to clinically significant errors in 194 of 1202 images (16%) in the external test set. The MAD was greater for girls than boys in the internal validation set (P = .01) and in the subcategories of age and Tanner stage in the external test set (P < .001 for both).

Conclusion: A deep learning (DL) bone age model generalized well to an external test set, although clinically significant sex-, age-, and sexual maturity-based biases in DL bone age were identified. © RSNA, 2022 Online supplemental material is available for this article. See also the editorial by Larson in this issue.
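The core generalizability test here is a comparison of mean absolute differences between the internal and external sets. The sketch below uses synthetic per-image errors (scaled so the means land near the reported values) purely to illustrate the mechanics of the unpaired t test.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
abs_err_internal = np.abs(rng.normal(0, 8.5, 1425))   # months; toy values, not study data
abs_err_external = np.abs(rng.normal(0, 8.6, 1202))

mad_int, mad_ext = abs_err_internal.mean(), abs_err_external.mean()
t, p = ttest_ind(abs_err_internal, abs_err_external, equal_var=False)
print(f"MAD internal={mad_int:.1f} mo, external={mad_ext:.1f} mo, p={p:.2f}")
```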
Affiliation(s)
- Elham Beheshtian, Kristin Putman, Samantha M Santomartino, Vishwa S Parekh, Paul H Yi
- From the University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, 670 W Baltimore St, First Floor, Room 1172, Baltimore, MD 21201
12. Chua M, Kim D, Choi J, Lee NG, Deshpande V, Schwab J, Lev MH, Gonzalez RG, Gee MS, Do S. Tackling prediction uncertainty in machine learning for healthcare. Nat Biomed Eng 2022:10.1038/s41551-022-00988-x. [PMID: 36581695 DOI: 10.1038/s41551-022-00988-x]
Abstract
Predictive machine-learning systems often do not convey the degree of confidence in the correctness of their outputs. To prevent unsafe prediction failures from machine-learning models, the users of the systems should be aware of the general accuracy of the model and understand the degree of confidence in each individual prediction. In this Perspective, we convey the need of prediction-uncertainty metrics in healthcare applications, with a focus on radiology. We outline the sources of prediction uncertainty, discuss how to implement prediction-uncertainty metrics in applications that require zero tolerance to errors and in applications that are error-tolerant, and provide a concise framework for understanding prediction uncertainty in healthcare contexts. For machine-learning-enabled automation to substantially impact healthcare, machine-learning models with zero tolerance for false-positive or false-negative errors must be developed intentionally.
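One simple, widely used uncertainty proxy consistent with this Perspective's framing (though not a method the authors prescribe) is the entropy of the softmax output, which can be thresholded to defer low-confidence predictions to a human reader. The threshold below is a hypothetical operating point.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[4.0, -2.0],     # confident prediction
                       [0.2,  0.1]])    # ambiguous prediction
probs = F.softmax(logits, dim=1)
entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)

DEFER_THRESHOLD = 0.5                    # hypothetical operating point (nats)
for p, h in zip(probs, entropy):
    action = "defer to a human reader" if h > DEFER_THRESHOLD else "auto-report"
    print(f"p={[round(v, 3) for v in p.tolist()]}, entropy={h:.3f} -> {action}")
```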
Affiliation(s)
- Michelle Chua: Department of Radiology, Massachusetts General Hospital, Boston, MA, USA
- Doyun Kim: Department of Radiology, Massachusetts General Hospital, Boston, MA, USA
- Jongmun Choi: Department of Radiology, Massachusetts General Hospital, Boston, MA, USA
- Nahyoung G Lee: Department of Ophthalmology, Massachusetts Eye and Ear Infirmary, Boston, MA, USA
- Vikram Deshpande: Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
- Joseph Schwab: Department of Orthopedic Surgery, Massachusetts General Hospital, Boston, MA, USA
- Michael H Lev: Department of Radiology, Massachusetts General Hospital, Boston, MA, USA
- Ramon G Gonzalez: Department of Radiology, Massachusetts General Hospital, Boston, MA, USA
- Michael S Gee: Department of Radiology, Massachusetts General Hospital, Boston, MA, USA
- Synho Do: Department of Radiology, Massachusetts General Hospital, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
13. Can images crowdsourced from the internet be used to train generalizable joint dislocation deep learning algorithms? Skeletal Radiol 2022; 51:2121-2128. [PMID: 35624310 DOI: 10.1007/s00256-022-04077-7]
Abstract
OBJECTIVE: Deep learning has the potential to automatically triage orthopedic emergencies, such as joint dislocations. However, due to the rarity of these injuries, collecting large numbers of images to train algorithms may be infeasible for many centers. We evaluated whether the Internet could be used as a source of images to train convolutional neural networks (CNNs) for joint dislocations that would generalize well to real-world clinical cases.

METHODS: We collected datasets from online radiology repositories of 100 radiographs each (50 dislocated, 50 located) for four joints: native shoulder, elbow, hip, and total hip arthroplasty (THA). We trained a variety of CNN binary classifiers using both on-the-fly and static data augmentation to identify the various joint dislocations. The best-performing classifier for each joint was evaluated on an external test set of 100 corresponding radiographs (50 dislocations) from three hospitals. CNN performance was evaluated using area under the ROC curve (AUROC). To determine areas emphasized by the CNN for decision-making, class activation map (CAM) heatmaps were generated for test images.

RESULTS: The best-performing CNNs for elbow, hip, shoulder, and THA dislocation achieved high AUROCs on both internal and external test sets (internal/external AUC): elbow (1.0/0.998), hip (0.993/0.880), shoulder (1.0/0.993), THA (1.0/0.950). Heatmaps demonstrated appropriate emphasis of joints for both located and dislocated joints.

CONCLUSION: With modest numbers of images, radiographs from the Internet can be used to train clinically generalizable CNNs for joint dislocations. Given the rarity of joint dislocations at many centers, online repositories may be a viable source for CNN-training data.
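A basic class activation map of the kind described can be formed by weighting the final convolutional feature maps with the classifier weights of the predicted class. The hedged sketch below uses an off-the-shelf ResNet-18 and a random tensor in place of the study's radiographs; it is not the authors' implementation.

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
features = {}
model.layer4.register_forward_hook(lambda m, i, o: features.update(maps=o))

x = torch.randn(1, 3, 224, 224)                      # stand-in for a preprocessed radiograph
with torch.no_grad():
    logits = model(x)
cls = logits.argmax(dim=1).item()

weights = model.fc.weight[cls]                                   # (512,)
cam = (weights[:, None, None] * features["maps"][0]).sum(0)      # (7, 7) raw activation map
cam = F.relu(cam)
cam = F.interpolate(cam[None, None], size=(224, 224), mode="bilinear", align_corners=False)[0, 0]
print(cam.shape)   # upsampled heatmap, ready to overlay on the radiograph
```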