1
Abdulai ASB, Storm J, Ehrlich M. "I don't know": An uncertainty-aware machine learning model for predicting patient disposition at emergency department triage. Int J Med Inform 2025; 201:105957. [PMID: 40318497 DOI: 10.1016/j.ijmedinf.2025.105957] [Received: 03/18/2025] [Revised: 04/22/2025] [Accepted: 04/24/2025] [Indexed: 05/07/2025]
Abstract
BACKGROUND Machine learning (ML) models are widely used for predicting patient disposition at emergency department (ED) triage. However, these models generate predictions regardless of the level of uncertainty, potentially leading to overconfident outputs that can compromise clinical decision-making. OBJECTIVE To develop a conformal prediction model for ED triage that provides uncertainty-aware patient disposition predictions. METHODS This retrospective study analyzed 560,486 adult ED visits (March 2014 - July 2017) from one academic and two community hospitals. An extreme gradient boosting (XGBoost) model was trained, validated, and conformalized to introduce a "Don't know" prediction for high-uncertainty cases. The model was tested on a random sample of 56,000 ED cases. RESULTS The standard XGBoost model achieved an AUC of 0.9307 (95% CI: 0.9285 - 0.9329), with sensitivity of 0.72 and specificity of 0.94. With conformal prediction at a lower confidence threshold of 60%, the model indicated "Don't know" in 4.9% of cases while returning sensitivity and specificity values of 0.74 and 0.95, respectively. As confidence thresholds increased, the model returned more "Don't know" predictions and fewer misclassifications. At 90% confidence, the model returned "Don't know" in 34.5% of cases while returning sensitivity and specificity values of 0.88 and 0.99, respectively. This trade-off highlights a balance between model confidence and prediction accuracy. CONCLUSION Incorporating uncertainty-awareness in ML models improves reliability in ED triage. By acknowledging uncertainty, clinicians receive more interpretable insights, reducing the risk of overconfident predictions and enhancing patient safety.
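The "Don't know" mechanism described in this abstract can be illustrated with a generic split-conformal classifier. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify how the XGBoost model was conformalized, the class names `discharge` and `admit` are hypothetical, and the function names `conformal_cutoff` and `predict_or_abstain` are invented for illustration. A prediction is returned only when the conformal prediction set contains exactly one label.

```python
import numpy as np

def conformal_cutoff(cal_probs, cal_labels, confidence):
    """Split-conformal score cutoff from a held-out calibration set."""
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels, dtype=int)
    n = cal_labels.size
    # Nonconformity score: 1 minus the probability given to the true class.
    scores = np.sort(1.0 - cal_probs[np.arange(n), cal_labels])
    # Finite-sample rank: the ceil((n + 1) * confidence)-th smallest score.
    k = int(np.ceil((n + 1) * confidence))
    return np.inf if k > n else scores[k - 1]

def predict_or_abstain(probs, cutoff, labels=("discharge", "admit")):
    """Return one label per row, or "Don't know" when the set is not a singleton."""
    results = []
    for row in np.asarray(probs, dtype=float):
        # A label enters the prediction set when its score is within the cutoff.
        pred_set = [labels[j] for j, p in enumerate(row) if 1.0 - p <= cutoff]
        results.append(pred_set[0] if len(pred_set) == 1 else "Don't know")
    return results

if __name__ == "__main__":
    # Toy calibration data (hypothetical, for illustration only).
    cal_probs = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]
    cal_labels = [0, 1, 0]
    cutoff = conformal_cutoff(cal_probs, cal_labels, confidence=0.5)
    print(predict_or_abstain([[0.95, 0.05], [0.5, 0.5], [0.1, 0.9]], cutoff))
```

In this sketch both an empty set and a two-label set map to "Don't know"; a deployed system might treat those two cases differently.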
Affiliation(s)
- Jean Storm
  - Quality Insights, Inc., 3001 Chesterfield Ave, Charleston, WV 25311, USA.
- Michael Ehrlich
  - Martin Tuchman School of Management, New Jersey Institute of Technology, 323 Dr Martin Luther King Jr Blvd, Newark, NJ 07102, USA.
2
Liu L, Chang C, Wang L, Gu X, Szalkowski G, Xing L. Efficient and accurate commissioning and quality assurance of radiosurgery beam via prior-embedded implicit neural representation learning. Med Phys 2025; 52:3398-3408. [PMID: 39812551 DOI: 10.1002/mp.17617] [Received: 07/29/2024] [Revised: 12/14/2024] [Accepted: 12/25/2024] [Indexed: 01/16/2025] Open
Abstract
BACKGROUND Dosimetric commissioning and quality assurance (QA) for linear accelerators (LINACs) present a significant challenge for clinical physicists due to the high measurement workload and stringent precision standards. This challenge is exacerbated for radiosurgery LINACs because of increased measurement uncertainty and more demanding setup accuracy for small-field beams. Optimizing physicists' effort during beam measurements while ensuring the quality of the measured data is crucial for clinical efficiency and patient safety. PURPOSE To develop a radiosurgery LINAC beam model that embeds prior knowledge of beam data through implicit neural representation (NeRP) learning and to evaluate the model's effectiveness in guiding beam data sampling, predicting complete beam dataset from sparse samples, and verifying detector choice and setup during commissioning and QA. MATERIALS AND METHODS Beam data including lateral profile and tissue-phantom-ratio (TPR), collected from CyberKnife LINACs, were investigated. Multi-layer perceptron (MLP) neural networks were optimized to parameterize a continuous function of the beam data, implicitly defined by the mapping from measurement coordinates to measured dose values. Beam priors were embedded into network weights by first training the network to learn the NeRP of a vendor-provided reference dataset. The prior-embedded network was further fine-tuned with sparse clinical measurements and used to predict unacquired beam data. Prospective and retrospective evaluations of different beam data samples in finetuning the model were performed using the reference beam dataset and clinical testing datasets, respectively. Model prediction accuracy was evaluated over 10 clinical datasets collected from various LINACs with different manufacturing modes and collimation systems. 
Model sensitivity in detecting beam data acquisition errors, including inaccurate detector positioning and inappropriate detector choice, was evaluated using two additional datasets with intentionally introduced erroneous samples. RESULTS Prospective and retrospective evaluations identified consistent beam data samples that are most effective in fine-tuning the model for complete beam data prediction. Despite discrepancies between the clinical beams and the reference beam, fine-tuning the model with a sparse beam profile measured at a single depth or with beam TPR measured at a single collimator size predicted beam data that closely match ground truth water tank measurements. Across the 10 clinical beam datasets, the averaged mean absolute error (MAE) in percentage dose was lower than 0.5% and the averaged 1D Gamma passing rate (1%/0.5 mm for profile and 1%/1 mm for TPR) was higher than 99%. In contrast, between the reference beam dataset and the clinical beam datasets, the MAE was above 1% and the Gamma passing rate was below 95%. Model sensitivity to beam data acquisition errors was demonstrated by significant model prediction changes when fine-tuned with erroneous versus correct beam data samples, as quantified by a Gamma passing rate as low as 18.16% between model predictions. CONCLUSION A model for small-field radiosurgery beams was proposed that embeds prior knowledge of beam properties and predicts the entire beam data from sparse measurements. The model can serve as a valuable tool for clinical physicists to verify the accuracy of beam data acquisition and promises to improve commissioning and QA reliability and efficiency with a substantially reduced number of beam measurements.
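The 1D Gamma passing rate quoted above (for example 1%/0.5 mm) combines a dose-difference tolerance with a distance-to-agreement tolerance. The following is a minimal sketch of that metric, assuming global dose normalization to the maximum of the reference curve; the abstract does not state which normalization or search strategy the authors used, and the function name `gamma_passing_rate_1d` is hypothetical.

```python
import numpy as np

def gamma_passing_rate_1d(x_ref, d_ref, x_eval, d_eval,
                          dose_tol_pct=1.0, dta_mm=0.5):
    """Percentage of reference points whose gamma index is <= 1."""
    x_ref = np.asarray(x_ref, dtype=float)
    d_ref = np.asarray(d_ref, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    d_eval = np.asarray(d_eval, dtype=float)
    # Assumption: dose tolerance is a percentage of the maximum reference dose.
    dose_tol = dose_tol_pct / 100.0 * d_ref.max()
    # Pairwise normalized distance and dose differences, shape (n_ref, n_eval).
    dist = (x_eval[None, :] - x_ref[:, None]) / dta_mm
    dose = (d_eval[None, :] - d_ref[:, None]) / dose_tol
    # Gamma at each reference point is the minimum over all evaluated points.
    gamma = np.sqrt(dist ** 2 + dose ** 2).min(axis=1)
    return 100.0 * np.mean(gamma <= 1.0)

if __name__ == "__main__":
    x = np.linspace(-10.0, 10.0, 201)               # positions in mm
    reference = 100.0 * np.exp(-x ** 2 / 8.0)       # toy Gaussian profile
    shifted = 100.0 * np.exp(-(x - 0.2) ** 2 / 8.0) # 0.2 mm setup shift
    print(gamma_passing_rate_1d(x, reference, x, shifted))
```

The brute-force pairwise search is fine for 1D curves of a few hundred points; denser data would usually call for interpolation and a local search window.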
Affiliation(s)
- Lianli Liu
  - Department of Radiation Oncology, Stanford University, Palo Alto, California, USA
- Cynthia Chang
  - Department of Radiation Oncology, Stanford University, Palo Alto, California, USA
- Lei Wang
  - Department of Radiation Oncology, Stanford University, Palo Alto, California, USA
- Xuejun Gu
  - Department of Radiation Oncology, Stanford University, Palo Alto, California, USA
- Gregory Szalkowski
  - Department of Radiation Oncology, Stanford University, Palo Alto, California, USA
- Lei Xing
  - Department of Radiation Oncology, Stanford University, Palo Alto, California, USA
3
Linli Z, Liang X, Zhang Z, Hu K, Guo S. Enhancing brain age estimation under uncertainty: A spectral-normalized neural gaussian process approach utilizing 2.5D slicing. Neuroimage 2025; 311:121184. [PMID: 40180003 DOI: 10.1016/j.neuroimage.2025.121184] [Received: 08/23/2024] [Revised: 03/19/2025] [Accepted: 04/01/2025] [Indexed: 04/05/2025] Open
Abstract
Brain age gap, the difference between brain age estimated from magnetic resonance imaging and chronological age, has emerged as a pivotal biomarker in the detection of brain abnormalities. While deep learning is accurate in estimating brain age, the absence of uncertainty estimation may pose risks in clinical use. Moreover, current 3D brain age models are intricate, and using 2D slices hinders comprehensive dimensional data integration. Here, we introduced a Spectral-normalized Neural Gaussian Process (SNGP) combined with a 2.5D slice approach for seamless uncertainty integration in a single network at low computational cost, and extra dimensional data integration without added model complexity. Subsequently, we compared different deep learning methods for estimating brain age uncertainty via the Pearson correlation coefficient, a metric that helps circumvent systematic underestimation of uncertainty during training. SNGP shows excellent uncertainty estimation and generalization on a collection of 11 public datasets (N = 6327), with competitive predictive performance (MAE=2.95). In addition, SNGP demonstrates superior generalization performance (MAE=3.47) on an independent validation set (N = 301). Additionally, we conducted five controlled experiments to validate our method. Firstly, uncertainty adjustment in brain age estimation improved the detection of accelerated brain aging in adolescents with ADHD, with a 38% increase in effect size after adjustment. Secondly, the SNGP model exhibited out-of-distribution (OOD) detection capabilities, showing significant differences in uncertainty across Asian and non-Asian datasets. Thirdly, the performance of DenseNet as a backbone for SNGP was slightly better than ResNeXt, attributed to DenseNet's feature reuse capability, with robust generalization on an independent validation set. Fourthly, site effect harmonization led to a decline in model performance, consistent with previous studies. 
Finally, the 2.5D slice approach significantly outperformed 2D methods, improving model performance without increasing network complexity. In conclusion, we present a cost-effective method for estimating brain age with uncertainty, utilizing 2.5D slicing for enhanced performance, showcasing promise for clinical applications.
Affiliation(s)
- Zeqiang Linli
  - School of Mathematics and Statistics, Guangdong University of Foreign Studies, Guangzhou, 510420, PR China; Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, 510420, Guangzhou, PR China; MOE-LCSM, School of Mathematics and Statistics, Hunan Normal University, Changsha, 410006, PR China.
- Xingcheng Liang
  - School of Mathematics and Statistics, Guangdong University of Foreign Studies, Guangzhou, 510420, PR China; Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, 510420, Guangzhou, PR China.
- Zhenhua Zhang
  - School of Mathematics and Statistics, Guangdong University of Foreign Studies, Guangzhou, 510420, PR China; Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, 510420, Guangzhou, PR China.
- Kang Hu
  - School of Information Engineering, Wuhan Business University, Wuhan, 430056, PR China.
- Shuixia Guo
  - MOE-LCSM, School of Mathematics and Statistics, Hunan Normal University, Changsha, 410006, PR China; Key Laboratory of Applied Statistics and Data Science, Hunan Normal University, College of Hunan Province, Changsha, 410006, PR China.
4
Portela A, Banga JR, Matabuena M. Conformal prediction for uncertainty quantification in dynamic biological systems. PLoS Comput Biol 2025; 21:e1013098. [PMID: 40354480 PMCID: PMC12091895 DOI: 10.1371/journal.pcbi.1013098] [Received: 12/13/2024] [Revised: 05/20/2025] [Accepted: 04/28/2025] [Indexed: 05/14/2025] Open
Abstract
Uncertainty quantification (UQ) is the process of systematically determining and characterizing the degree of confidence in computational model predictions. In systems biology, and particularly with dynamic models, UQ is critical due to the nonlinearities and parameter sensitivities that influence the behavior of complex biological systems. Addressing these issues through robust UQ enables a deeper understanding of system dynamics and more reliable extrapolation beyond observed conditions. Many state-of-the-art UQ approaches in this field are grounded in Bayesian statistical methods. While these frameworks naturally incorporate uncertainty quantification, they often require the specification of parameter distributions as priors and may impose parametric assumptions that do not always reflect biological reality. Additionally, Bayesian methods can be computationally expensive, posing significant challenges when dealing with large-scale models and seeking rapid, reliable uncertainty calibration. As an alternative, we propose using conformal prediction methods and introduce two novel algorithms designed for dynamic biological systems. These approaches can provide non-asymptotic guarantees, improving robustness and scalability across various applications, even when the predictive models are misspecified. Through several illustrative scenarios, we demonstrate that these conformal algorithms can serve as powerful complements, or even alternatives, to conventional Bayesian methods, delivering effective uncertainty quantification for predictive tasks in systems biology.
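To make the idea of a non-asymptotic guarantee under model misspecification concrete, here is a sketch of standard split conformal regression applied to a toy dynamic model. It is not one of the two algorithms the authors introduce, which the abstract does not describe. The logistic growth model, its parameter values, and the function names are all assumptions chosen for illustration.

```python
import numpy as np

def split_conformal_halfwidth(y_cal, yhat_cal, alpha=0.1):
    """Half-width q such that [yhat - q, yhat + q] targets 1 - alpha coverage."""
    resid = np.sort(np.abs(np.asarray(y_cal, float) - np.asarray(yhat_cal, float)))
    n = resid.size
    # Finite-sample rank of the calibration residual used as the half-width.
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    return np.inf if k > n else resid[k - 1]

def logistic_growth(t, r, K=10.0, x0=0.5):
    """Closed-form solution of dx/dt = r * x * (1 - x / K)."""
    return K / (1.0 + (K - x0) / x0 * np.exp(-r * t))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t_cal = rng.uniform(0.0, 10.0, size=200)
    t_new = rng.uniform(0.0, 10.0, size=200)
    # Data come from r = 1.0; the working model deliberately uses r = 0.8.
    y_cal = logistic_growth(t_cal, r=1.0) + rng.normal(0.0, 0.3, size=200)
    y_new = logistic_growth(t_new, r=1.0) + rng.normal(0.0, 0.3, size=200)
    q = split_conformal_halfwidth(y_cal, logistic_growth(t_cal, r=0.8), alpha=0.1)
    covered = np.abs(y_new - logistic_growth(t_new, r=0.8)) <= q
    print("half-width:", q, "empirical coverage:", covered.mean())
```

The interval widens to absorb the misspecified growth rate rather than relying on the model being correct, which is the property the abstract highlights. The coverage guarantee assumes calibration and new points are exchangeable.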
Affiliation(s)
- Alberto Portela
  - Computational Biology Lab, MBG-CSIC (Spanish National Research Council), Pontevedra, Galicia, Spain
- Julio R. Banga
  - Computational Biology Lab, MBG-CSIC (Spanish National Research Council), Pontevedra, Galicia, Spain
- Marcos Matabuena
  - Department of Biostatistics, Harvard University, Boston, Massachusetts, United States of America
5
Qiao L, Khalilimeybodi A, Linden-Santangeli NJ, Rangamani P. The Evolution of Systems Biology and Systems Medicine: From Mechanistic Models to Uncertainty Quantification. Annu Rev Biomed Eng 2025; 27:425-447. [PMID: 39971380 DOI: 10.1146/annurev-bioeng-102723-065309] [Indexed: 02/21/2025]
Abstract
Understanding interaction mechanisms within cells, tissues, and organisms is crucial for driving developments across biology and medicine. Mathematical modeling is an essential tool for simulating such biological systems. Building on experiments, mechanistic models are widely used to describe small-scale intracellular networks. The development of sequencing techniques and computational tools has recently enabled multiscale models. Combining such larger scale network modeling with mechanistic modeling provides us with an opportunity to reveal previously unknown disease mechanisms and pharmacological interventions. Here, we review systems biology models from mechanistic models to multiscale models that integrate multiple layers of cellular networks and discuss how they can be used to shed light on disease states and even wellness-related states. Additionally, we introduce several methods that increase the certainty and accuracy of model predictions. Thus, combining mechanistic models with emerging mathematical and computational techniques can provide us with increasingly powerful tools to understand disease states and inspire drug discoveries.
Affiliation(s)
- Lingxia Qiao
  - Department of Pharmacology, University of California San Diego, La Jolla, California, USA
  - Department of Mechanical and Aerospace Engineering, University of California San Diego, La Jolla, California, USA
- Ali Khalilimeybodi
  - Department of Mechanical and Aerospace Engineering, University of California San Diego, La Jolla, California, USA
- Padmini Rangamani
  - Department of Pharmacology, University of California San Diego, La Jolla, California, USA
  - Department of Mechanical and Aerospace Engineering, University of California San Diego, La Jolla, California, USA
6
Peracchio L, Nicora G, Parimbelli E, Buonocore TM, Tavazzi E, Bergamaschi R, Dagliati A, Bellazzi R. RelAI: an automated approach to judge pointwise ML prediction reliability. Int J Med Inform 2025; 197:105857. [PMID: 40037268 DOI: 10.1016/j.ijmedinf.2025.105857] [Received: 07/02/2024] [Revised: 02/05/2025] [Accepted: 02/20/2025] [Indexed: 03/06/2025]
Abstract
OBJECTIVES AI/ML advancements have been significant, yet their deployment in clinical practice faces logistical, regulatory, and trust-related challenges. To promote trust and informed use of ML predictions in real-world scenarios, reliable assessment of individual predictions is essential. We propose RelAI, a tool for pointwise reliability assessment of ML predictions that can support the identification of prediction errors during deployment. MATERIALS AND METHODS RelAI utilizes Autoencoders (AEs) to detect distributional shifts (Density principle) and a proxy model to encode local performance (Local Fit principle). We validated RelAI on a synthetic dataset and a real-world scenario involving Multiple Sclerosis (MS) patient outcomes. RESULTS On a synthetic dataset, RelAI effectively identified unreliable predictions, outperforming alternative approaches. In the MS case study, reliable predictions exhibited higher accuracy and were associated with specific demographic features, such as sex, residence, and eye symptoms. DISCUSSION AND CONCLUSION RelAI can support ML deployment in clinical settings by providing pointwise reliability assessments, ensuring regulatory compliance, and fostering user trust. Its model-agnostic nature and its compatibility with Python-based ML pipelines enhance its potential for widespread adoption.
Affiliation(s)
- Lorenzo Peracchio
  - Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy
- Giovanna Nicora
  - Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy.
- Enea Parimbelli
  - Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy
- Arianna Dagliati
  - Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy
- Riccardo Bellazzi
  - Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy
7
Salvi M, Seoni S, Campagner A, Gertych A, Acharya UR, Molinari F, Cabitza F. Explainability and uncertainty: Two sides of the same coin for enhancing the interpretability of deep learning models in healthcare. Int J Med Inform 2025; 197:105846. [PMID: 39993336 DOI: 10.1016/j.ijmedinf.2025.105846] [Received: 07/14/2024] [Revised: 02/19/2025] [Accepted: 02/19/2025] [Indexed: 02/26/2025]
Abstract
BACKGROUND The increasing use of Deep Learning (DL) in healthcare has highlighted the critical need for improved transparency and interpretability. While Explainable Artificial Intelligence (XAI) methods provide insights into model predictions, reliability cannot be guaranteed by simply relying on explanations. OBJECTIVES This position paper proposes the integration of Uncertainty Quantification (UQ) with XAI methods to improve model reliability and trustworthiness in healthcare applications. METHODS We examine state-of-the-art XAI and UQ techniques, discuss implementation challenges, and suggest solutions to combine UQ with XAI methods. We propose a framework for estimating both aleatoric and epistemic uncertainty in the XAI context, providing illustrative examples of their potential application. RESULTS Our analysis indicates that integrating UQ with XAI could significantly enhance the reliability of DL models in practice. This approach has the potential to reduce interpretation biases and over-reliance, leading to more cautious and conscious use of AI in healthcare.
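The distinction between aleatoric and epistemic uncertainty mentioned above is often made operational with an entropy decomposition over several stochastic predictions (ensemble members or dropout passes). The sketch below shows that common decomposition; it is an assumption that this matches the authors' framework, which the abstract does not detail, and the function names are hypothetical.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) along the last axis."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0)
    return -np.sum(p * np.log(p), axis=-1)

def decompose_uncertainty(member_probs):
    """Split predictive uncertainty for one input.

    member_probs has shape (n_members, n_classes): one probability vector
    per ensemble member or stochastic forward pass.
    """
    member_probs = np.asarray(member_probs, dtype=float)
    total = entropy(member_probs.mean(axis=0))  # entropy of the averaged prediction
    aleatoric = entropy(member_probs).mean()    # average entropy of each member
    epistemic = total - aleatoric               # disagreement between members
    return total, aleatoric, epistemic

if __name__ == "__main__":
    # Members agree the case is ambiguous: uncertainty is mostly aleatoric.
    print(decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]]))
    # Members are confident but contradict each other: mostly epistemic.
    print(decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]]))
```

The epistemic term equals the mutual information between the prediction and the model draw, so it shrinks when the members agree, whatever their individual confidence.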
Affiliation(s)
- Massimo Salvi
  - Biolab, PoliToBIOMed Lab, Department of Electronics and Telecommunications, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Turin, Italy.
- Silvia Seoni
  - Biolab, PoliToBIOMed Lab, Department of Electronics and Telecommunications, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Turin, Italy
- Arkadiusz Gertych
  - Faculty of Biomedical Engineering, Silesian University of Technology, Zabrze, Poland; Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA, United States; Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
- U Rajendra Acharya
  - School of Mathematics, Physics and Computing, University of Southern Queensland, Springfield, Australia; Centre for Health Research, University of Southern Queensland, Springfield, QLD 4300, Australia
- Filippo Molinari
  - Biolab, PoliToBIOMed Lab, Department of Electronics and Telecommunications, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Turin, Italy
- Federico Cabitza
  - IRCCS Ospedale Galeazzi - Sant'Ambrogio, Milan, Italy; Department of Computer Science, Systems and Communication, University of Milano-Bicocca, Milan, Italy
8
Lester C, Rowell B, Zheng Y, Co Z, Marshall V, Kim JY, Chen Q, Kontar R, Yang XJ. Effect of Uncertainty-Aware AI Models on Pharmacists' Reaction Time and Decision-Making in a Web-Based Mock Medication Verification Task: Randomized Controlled Trial. JMIR Med Inform 2025; 13:e64902. [PMID: 40249341 PMCID: PMC12023801 DOI: 10.2196/64902] [Received: 07/30/2024] [Revised: 02/18/2025] [Accepted: 03/07/2025] [Indexed: 04/19/2025] Open
Abstract
Background Artificial intelligence (AI)-based clinical decision support systems are increasingly used in health care. Uncertainty-aware AI presents the model's confidence in its decision alongside its prediction, whereas black-box AI only provides a prediction. Little is known about how this type of AI affects health care providers' work performance and reaction time. Objective This study aimed to determine the effects of black-box and uncertainty-aware AI advice on pharmacist decision-making and reaction time. Methods Recruitment emails were sent to pharmacists through professional listservs describing a web-based, crossover, randomized controlled trial. Participants were randomized to the black-box AI or uncertainty-aware AI condition in a 1:1 manner. Participants completed 100 mock verification tasks with AI help and 100 without AI help. The order of no help and AI help was randomized. Participants were exposed to correct and incorrect prescription fills, where the correct decision was to "accept" or "reject," respectively. AI help provided correct (79%) or incorrect (21%) advice. Reaction times, participant decisions, AI advice, and AI help type were recorded for each verification. Likelihood ratio tests compared means across the three categories of AI type for each level of AI correctness. Results A total of 30 participants provided complete datasets. An equal number of participants were in each AI condition. Participants' decision-making performance and reaction times differed across the 3 conditions. Accurate AI recommendations resulted in the rejection of the incorrect drug 96.1% and 91.8% of the time for uncertainty-aware AI and black-box AI respectively, compared with 81.2% without AI help. Correctly dispensed medications were accepted at rates of 99.2% with black-box help, 94.1% with uncertainty-aware AI help, and 94.6% without AI help. 
Uncertainty-aware AI protected against bad AI advice to approve an incorrectly filled medication compared with black-box AI (83.3% vs 76.7%). When the AI recommended rejecting a correctly filled medication, pharmacists without AI help had a higher rate of correctly accepting the medication (94.6%) compared with uncertainty-aware AI help (86.2%) and black-box AI help (81.2%). Uncertainty-aware AI resulted in shorter reaction times than black-box AI and no AI help except in the scenario where "AI rejects the correct drug." Black-box AI did not lead to reduced reaction times compared with pharmacists acting alone. Conclusions Pharmacists' performance and reaction times varied by AI type and AI accuracy. Overall, uncertainty-aware AI resulted in faster decision-making and acted as a safeguard against bad AI advice to approve a misfilled medication. Conversely, black-box AI had the longest reaction times, and user performance degraded in the presence of bad AI advice. However, uncertainty-aware AI could result in unnecessary double-checks, but it is preferred over false negative advice, where patients receive the wrong medication. These results highlight the importance of well-designed AI that addresses users' needs, enhances performance, and avoids overreliance on AI.
Affiliation(s)
- Corey Lester
  - Department of Clinical Pharmacy, College of Pharmacy, University of Michigan, 428 Church Street, Ann Arbor, MI, 48109, United States, 1 734-647-8849
- Brigid Rowell
  - Department of Clinical Pharmacy, College of Pharmacy, University of Michigan, 428 Church Street, Ann Arbor, MI, 48109, United States, 1 734-647-8849
- Yifan Zheng
  - Department of Clinical Pharmacy, College of Pharmacy, University of Michigan, 428 Church Street, Ann Arbor, MI, 48109, United States, 1 734-647-8849
- Zoe Co
  - Department of Clinical Pharmacy, College of Pharmacy, University of Michigan, 428 Church Street, Ann Arbor, MI, 48109, United States, 1 734-647-8849
  - Department of Learning Health Sciences, University of Michigan School of Medicine, Ann Arbor, MI, United States
- Vincent Marshall
  - Department of Clinical Pharmacy, College of Pharmacy, University of Michigan, 428 Church Street, Ann Arbor, MI, 48109, United States, 1 734-647-8849
- Jin Yong Kim
  - Department of Industrial and Operations Engineering, College of Engineering, University of Michigan, Ann Arbor, MI, United States
- Qiyuan Chen
  - Department of Industrial and Operations Engineering, College of Engineering, University of Michigan, Ann Arbor, MI, United States
- Raed Kontar
  - Department of Industrial and Operations Engineering, College of Engineering, University of Michigan, Ann Arbor, MI, United States
- X Jessie Yang
  - Department of Industrial and Operations Engineering, College of Engineering, University of Michigan, Ann Arbor, MI, United States
9
Ahmad T, Guida A, Stewart S, Barrett N, Jiang X, Vincer M, Afifi J. Can deep learning classify cerebral ultrasound images for the detection of brain injury in very preterm infants? Eur Radiol 2025; 35:1948-1958. [PMID: 39212671 DOI: 10.1007/s00330-024-11028-4] [Received: 03/26/2024] [Revised: 06/02/2024] [Accepted: 08/03/2024] [Indexed: 09/04/2024]
Abstract
OBJECTIVES Cerebral ultrasound (CUS) is the main imaging screening tool in preterm infants. The aim of this work is to develop deep learning (DL) models that classify normal vs abnormal CUS to serve as a computer-aided detection tool providing timely interpretation of the scans. METHODS A population-based cohort of very preterm infants (22 weeks 0 days to 30 weeks 6 days of gestation) born between 2004 and 2016 in Nova Scotia, Canada, was studied. A set of nine sequential CUS images per infant was retrieved at three specific coronal landmarks at three pre-identified times (first week, sixth week, and term age). A radiologist manually labeled each image as normal or abnormal. The dataset was split into training/development/test subsets (80:10:10). Different convolutional neural networks were tested, with filtering of the most uncertain predictions. The model's performance was assessed using precision/recall and the area under the receiver operating characteristic curve. RESULTS Sequential CUS images were retrieved for 538/665 babies (81% of the cohort). Four thousand one hundred eighty images were used to develop and test the model. Model performance was initially modest but, through different machine learning strategies, was boosted to good levels, averaging 0.86 ROC AUC (95% CI: 0.82, 0.90) and 0.87 PR AUC (95% CI: 0.84, 0.90) (model uncertainty estimation filters using normalized entropy threshold = 0.5). CONCLUSION This study offers proof of the feasibility of applying DL to CUS. This basic diagnostic model showed good discriminative ability to classify normal versus abnormal CUS. It serves as a computer-aided detection (CAD) tool and as a framework for constructing a prognostic model. CLINICAL RELEVANCE STATEMENT This DL model can serve as a computer-aided detection tool to classify CUS of very preterm babies as either normal or abnormal. This model will also be used as a framework to develop a prognostic model. KEY POINTS Binary computer-aided detection models of CUS are applicable for classifying ultrasound images in very preterm babies. 
This model acts as a step towards developing a model for predicting neurodevelopmental outcomes in very preterm babies. This model serves as a tool for interpretation of CUS in this patient population with a heightened risk of brain injury.
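The uncertainty filter reported in this abstract uses a normalized entropy threshold of 0.5. A minimal sketch of such a filter follows, assuming that predictions whose normalized entropy exceeds the threshold are set aside; the abstract does not say exactly how the filter was applied, and the function names are hypothetical.

```python
import numpy as np

def normalized_entropy(probs, eps=1e-12):
    """Entropy of each row divided by log(n_classes), so values lie in [0, 1]."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0)
    h = -np.sum(probs * np.log(probs), axis=1)
    return h / np.log(probs.shape[1])

def keep_confident(probs, threshold=0.5):
    """Boolean mask: True where the prediction passes the uncertainty filter."""
    return normalized_entropy(probs) <= threshold

if __name__ == "__main__":
    # Columns: P(normal), P(abnormal); toy values for illustration.
    probs = np.array([[0.50, 0.50], [0.99, 0.01], [0.80, 0.20]])
    print(normalized_entropy(probs))
    print(keep_confident(probs))
```

For a binary classifier a threshold of 0.5 is fairly strict: a prediction of 0.80 versus 0.20 has a normalized entropy of about 0.72 and would be filtered out.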
Affiliation(s)
- Tahani Ahmad
  - Department of Pediatric Radiology, IWK Health, Halifax, NS, Canada.
  - Department of Diagnostic Imaging, Dalhousie University, Halifax, NS, Canada.
- Alessandro Guida
  - Department of Diagnostic Imaging, Dalhousie University, Halifax, NS, Canada
- Samuel Stewart
  - Department of Community Health and Epidemiology, Dalhousie University, Halifax, NS, Canada
- Noah Barrett
  - Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada
- Xiang Jiang
  - Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada
- Michael Vincer
  - Department of Pediatrics, Dalhousie University, Halifax, NS, Canada
  - Division of Neonatal-Perinatal Medicine, IWK Health, Halifax, NS, Canada
- Jehier Afifi
  - Department of Pediatrics, Dalhousie University, Halifax, NS, Canada
  - Division of Neonatal-Perinatal Medicine, IWK Health, Halifax, NS, Canada
10
Kelly A, Jensen EK, Grua EM, Mathiasen K, Van de Ven P. An Interpretable Model With Probabilistic Integrated Scoring for Mental Health Treatment Prediction: Design Study. JMIR Med Inform 2025; 13:e64617. [PMID: 40138679 PMCID: PMC11982765 DOI: 10.2196/64617] [Received: 07/22/2024] [Revised: 02/08/2025] [Accepted: 02/16/2025] [Indexed: 03/29/2025] Open
Abstract
BACKGROUND Machine learning (ML) systems in health care have the potential to enhance decision-making but often fail to address critical issues such as prediction explainability, confidence, and robustness in a context-based and easily interpretable manner. OBJECTIVE This study aimed to design and evaluate an ML model for a future decision support system for clinical psychopathological treatment assessments. The novel ML model is inherently interpretable and transparent. It aims to enhance clinical explainability and trust through a transparent, hierarchical model structure that progresses from questions to scores to classification predictions. The model confidence and robustness were addressed by applying Monte Carlo dropout, a probabilistic method that reveals model uncertainty and confidence. METHODS A model for clinical psychopathological treatment assessments was developed, incorporating a novel ML model structure. The model aimed at enhancing the graphical interpretation of the model outputs and addressing issues of prediction explainability, confidence, and robustness. The proposed ML model was trained and validated using patient questionnaire answers and demographics from a web-based treatment service in Denmark (N=1088). RESULTS The balanced accuracy score on the test set was 0.79. The precision was ≥0.71 for all 4 prediction classes (depression, panic, social phobia, and specific phobia). The area under the curve for the 4 classes was 0.93, 0.92, 0.91, and 0.98, respectively. CONCLUSIONS We have demonstrated a mental health treatment ML model that supported a graphical interpretation of prediction class probability distributions. Their spread and overlap can inform clinicians of competing treatment possibilities for patients and uncertainty in treatment predictions. With the ML model achieving 79% balanced accuracy, we expect that the model will be clinically useful in both screening new patients and informing clinical interviews.
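Monte Carlo dropout, named in this abstract, keeps dropout active at prediction time and repeats the forward pass to obtain a distribution over class probabilities. The sketch below uses a tiny NumPy network with arbitrary weights purely for illustration. It does not reproduce the authors' hierarchical model, and the layer sizes, dropout rate, and function names are assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mc_dropout_predict(x, W1, b1, W2, b2, n_samples=200, drop_rate=0.5, seed=0):
    """Mean and standard deviation of class probabilities over dropout passes."""
    rng = np.random.default_rng(seed)
    hidden = np.maximum(0.0, x @ W1 + b1)  # ReLU hidden layer
    samples = []
    for _ in range(n_samples):
        # A fresh dropout mask per pass; inverted dropout keeps the expected scale.
        mask = rng.random(hidden.shape) >= drop_rate
        dropped = hidden * mask / (1.0 - drop_rate)
        samples.append(softmax(dropped @ W2 + b2))
    samples = np.stack(samples)            # (n_samples, n_inputs, n_classes)
    return samples.mean(axis=0), samples.std(axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Hypothetical sizes: 5 input features, 16 hidden units, 4 classes.
    W1, b1 = rng.normal(size=(5, 16)), np.zeros(16)
    W2, b2 = rng.normal(size=(16, 4)), np.zeros(4)
    x = rng.normal(size=(3, 5))
    mean, std = mc_dropout_predict(x, W1, b1, W2, b2)
    print(mean.round(3))
    print(std.round(3))
```

The per-class spread is what a clinician-facing plot of overlapping class probability distributions would visualize: wide, overlapping distributions signal competing predictions.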
Affiliation(s)
- Anthony Kelly
  - Department of Electronic and Computer Engineering, University of Limerick, Limerick, Ireland
  - Health Research Institute, University of Limerick, Limerick, Ireland
- Eoin Martino Grua
  - Department of Electronic and Computer Engineering, University of Limerick, Limerick, Ireland
- Kim Mathiasen
  - Department of Psychology and Behavioural Sciences, Aarhus University, Aarhus, Denmark
  - Department of Clinical Research, University of Southern Denmark, Odense, Denmark
- Pepijn Van de Ven
  - Department of Electronic and Computer Engineering, University of Limerick, Limerick, Ireland
  - Health Research Institute, University of Limerick, Limerick, Ireland
|
11
|
Werthen-Brabants L, Dhaene T, Deschrijver D. The role of trustworthy and reliable AI for multiple sclerosis. Front Digit Health 2025; 7:1507159. [PMID: 40196398 PMCID: PMC11973328 DOI: 10.3389/fdgth.2025.1507159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2024] [Accepted: 03/12/2025] [Indexed: 04/09/2025] Open
Abstract
This paper investigates the importance of Trustworthy Machine Learning (ML) in the context of Multiple Sclerosis (MS) research and care. Due to the complex and individual nature of MS, the need for reliable and trustworthy ML models is essential. In this paper, key aspects of trustworthy ML, such as out-of-distribution generalization, explainability, uncertainty quantification and calibration are explored, highlighting their significance for healthcare applications. Challenges in integrating these ML tools into clinical workflows are addressed, discussing the difficulties in interpreting AI outputs, data diversity, and the need for comprehensive, quality data. It calls for collaborative efforts among researchers, clinicians, and policymakers to develop ML solutions that are technically sound, clinically relevant, and patient-centric.
|
12
|
Friesacher HR, Engkvist O, Mervin L, Moreau Y, Arany A. Achieving well-informed decision-making in drug discovery: a comprehensive calibration study using neural network-based structure-activity models. J Cheminform 2025; 17:29. [PMID: 40045403 PMCID: PMC11881400 DOI: 10.1186/s13321-025-00964-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2024] [Accepted: 01/26/2025] [Indexed: 03/09/2025] Open
Abstract
In the drug discovery process, where experiments can be costly and time-consuming, computational models that predict drug-target interactions are valuable tools to accelerate the development of new therapeutic agents. Estimating the uncertainty inherent in these neural network predictions provides valuable information that facilitates optimal decision-making when risk assessment is crucial. However, such models can be poorly calibrated, which results in unreliable uncertainty estimates that do not reflect the true predictive uncertainty. In this study, we compare different metrics, including accuracy and calibration scores, used for model hyperparameter tuning to investigate which model selection strategy achieves well-calibrated models. Furthermore, we propose to use a computationally efficient Bayesian uncertainty estimation method named HMC Bayesian Last Layer (HBLL), which generates Hamiltonian Monte Carlo (HMC) trajectories to obtain samples for the parameters of a Bayesian logistic regression fitted to the hidden layer of the baseline neural network. We report that this approach improves model calibration and achieves the performance of common uncertainty quantification methods by combining the benefits of uncertainty estimation and probability calibration methods. Finally, we show that combining a post hoc calibration method with well-performing uncertainty quantification approaches can boost model accuracy and calibration.
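The abstract refers to calibration scores without naming one. As an illustration only, the sketch below computes a binned expected calibration error (ECE) for a binary classifier; choosing ECE, the equal-width binning, and the name `expected_calibration_error` are assumptions, not details from the study.

```python
import numpy as np

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Binned ECE for binary labels and predicted positive-class probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges give bin indices 0 .. n_bins - 1; p == 1.0 lands in the last bin.
    idx = np.clip(np.digitize(y_prob, edges[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            # Gap between observed positive rate and mean predicted probability.
            gap = abs(y_true[in_bin].mean() - y_prob[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the fraction of samples in the bin
    return ece

ece_overconfident = expected_calibration_error([1, 1, 0, 0], [1.0, 1.0, 1.0, 1.0])
ece_perfect = expected_calibration_error([0, 0, 0, 0], [0.0, 0.0, 0.0, 0.0])
```

A score near zero means predicted probabilities match observed frequencies; the overconfident toy case above yields a gap of 0.5.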
Affiliation(s)
- Hannah Rosa Friesacher
  - Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Leuven, 3000, Belgium
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca Gothenburg, Gothenburg, 431 83, Sweden
- Ola Engkvist
  - Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, 412 96, Sweden
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca Gothenburg, Gothenburg, 431 83, Sweden
- Lewis Mervin
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca Cambridge, Cambridge, CB2 0AA, UK
- Yves Moreau
  - Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Leuven, 3000, Belgium
- Adam Arany
  - Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Leuven, 3000, Belgium
|
13
|
Ye K, Tang H, Dai S, Fortel I, Thompson PM, Mackin RS, Leow A, Huang H, Zhan L. BPEN: Brain Posterior Evidential Network for trustworthy brain imaging analysis. Neural Netw 2025; 183:106943. [PMID: 39657531 PMCID: PMC11750605 DOI: 10.1016/j.neunet.2024.106943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2024] [Revised: 10/22/2024] [Accepted: 11/17/2024] [Indexed: 12/12/2024]
Abstract
The application of deep learning techniques to analyze brain functional magnetic resonance imaging (fMRI) data has led to significant advancements in identifying prospective biomarkers associated with various clinical phenotypes and neurological conditions. Despite these achievements, the aspect of prediction uncertainty has been relatively underexplored in brain fMRI data analysis. Accurate uncertainty estimation is essential for trustworthy learning, given the challenges associated with brain fMRI data acquisition and the potential diagnostic implications for patients. To address this gap, we introduce a novel posterior evidential network, named the Brain Posterior Evidential Network (BPEN), designed to capture both aleatoric and epistemic uncertainty in the analysis of brain fMRI data. We conducted comprehensive experiments using data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and ADNI-depression (ADNI-D) cohorts, focusing on predictions for mild cognitive impairment (MCI) and depression across various diagnostic groups. Our experiments not only unequivocally demonstrate the superior predictive performance of our BPEN model compared to existing state-of-the-art methods but also underscore the importance of uncertainty estimation in predictive models.
Affiliation(s)
- Kai Ye
  - Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, 15260, PA, USA
- Haoteng Tang
  - Department of Computer Science, University of Texas Rio Grande Valley, Edinburg, 78539, TX, USA
- Siyuan Dai
  - Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, 15260, PA, USA
- Igor Fortel
  - Department of Biomedical Engineering, University of Illinois at Chicago, Chicago, 60607, IL, USA
- Paul M Thompson
  - Keck School of Medicine, University of Southern California, Los Angeles, 90089, CA, USA
- R Scott Mackin
  - Department of Psychiatry and Behavioral Sciences, University of California San Francisco, San Francisco, 94143, CA, USA
- Alex Leow
  - Department of Biomedical Engineering, University of Illinois at Chicago, Chicago, 60607, IL, USA; Department of Psychiatry, University of Illinois at Chicago, Chicago, 60607, IL, USA; Department of Computer Science, University of Illinois at Chicago, Chicago, 60607, IL, USA
- Heng Huang
  - Department of Computer Science, University of Maryland, College Park, 20742, MD, USA
- Liang Zhan
  - Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, 15260, PA, USA
|
14
|
Van Vogt E, Gordon AC, Diaz-Ordaz K, Cro S. Application of causal forests to randomised controlled trial data to identify heterogeneous treatment effects: a case study. BMC Med Res Methodol 2025; 25:50. [PMID: 39987431 PMCID: PMC11846376 DOI: 10.1186/s12874-025-02489-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2023] [Accepted: 02/03/2025] [Indexed: 02/24/2025] Open
Abstract
BACKGROUND Classical approaches to subgroup analysis in randomised controlled trials (RCTs) to identify heterogeneous treatment effects (HTEs) involve testing the interaction between each pre-specified possible treatment effect modifier and the treatment effect. However, individual significant interactions may not always yield clinically actionable subgroups, particularly for continuous covariates. Non-parametric causal machine learning approaches are flexible alternatives for estimating HTEs across many possible treatment effect modifiers in a single analysis. METHODS We conducted a secondary analysis of the VANISH RCT, which compared the early use of vasopressin with norepinephrine on renal failure-free survival for patients with septic shock at 28 days. We used classical (separate tests for interaction with Bonferroni correction), data-adaptive (hierarchical lasso regression), and non-parametric causal machine learning (causal forest) methods to analyse HTEs for the primary outcome of being alive at 28 days. Causal forests comprise honest causal trees, which use sample splitting to determine tree splits and estimate treatment effects separately. The modal initial (root) splits of the causal forest were extracted, and the mean value was used as a threshold to partition the population into subgroups with different treatment effects. RESULTS All three models found evidence of HTE with serum potassium levels. Univariable logistic regression OR 0.435 (95% CI [0.270, 0.683], p = 0.0004), hierarchical lasso logistic regression standardised OR: 0.604 (95% CI 0.259, 0.701), lambda = 0.0049. Hierarchical lasso kept the interaction between the treatment and serum potassium, sodium level, minimum temperature, platelet count and presence of ischemic heart disease. The causal forest approach found some evidence of HTE (p = 0.124). When extracting root splits, the modal split was on serum potassium (mean applied threshold of 4.68 mmol/L). When dividing the patient population into subgroups based on the mean initial root threshold, risk differences in being alive at 28 days were 0.069 (95% CI [-0.032, 0.169]) and -0.257 (95% CI [-0.368, -0.146]) with serum potassium ≤ 4.68 and > 4.68 mmol/L, respectively. CONCLUSIONS The causal forest agreed with the data-adaptive and classical method of subgroup analysis in identifying HTE by serum potassium. Whilst classical and data-adaptive methods may identify sources of HTE, they do not immediately suggest subgroup splits which are clinically actionable. The extraction of root splits in causal forests is a novel approach to obtaining data-derived subgroups, to be further investigated.
Affiliation(s)
- Suzie Cro
  - Imperial College London, London, UK
|
15
|
Lindenmeyer A, Blattmann M, Franke S, Neumuth T, Schneider D. Towards Trustworthy AI in Healthcare: Epistemic Uncertainty Estimation for Clinical Decision Support. J Pers Med 2025; 15:58. [PMID: 39997335 PMCID: PMC11856777 DOI: 10.3390/jpm15020058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2024] [Revised: 01/20/2025] [Accepted: 01/23/2025] [Indexed: 02/26/2025] Open
Abstract
Introduction: Widespread adoption of AI for medical decision-making is still hindered due to ethical and safety-related concerns. For AI-based decision support systems in healthcare settings, it is paramount to be reliable and trustworthy. Common deep learning approaches, however, have the tendency towards overconfidence when faced with unfamiliar or changing conditions. Inappropriate extrapolation beyond well-supported scenarios may have dire consequences, highlighting the importance of the reliable estimation of local knowledge uncertainty and its communication to the end user. Materials and Methods: While neural network ensembles (ENNs) have been heralded as a potential solution to these issues for many years, deep learning methods, specifically modeling the amount of knowledge, promise more principled and reliable behavior. This study compares their reliability in clinical applications. We centered our analysis on experiments with low-dimensional toy datasets and the exemplary case study of mortality prediction for intensive care unit hospitalizations using Electronic Health Records (EHRs) from the MIMIC3 study. For predictions on the EHR time series, Encoder-Only Transformer models were employed. Knowledge uncertainty estimation is achieved with both ensemble and Spectral Normalized Neural Gaussian Process (SNGP) variants of the common Transformer model. We designed two datasets to test their reliability in detecting token level and more subtle discrepancies both for toy datasets and an EHR dataset. Results: While both SNGP and ENN model variants achieve similar prediction performance (AUROC: ≈0.85, AUPRC: ≈0.52 for in-hospital mortality prediction from a selected MIMIC3 benchmark), the former demonstrates improved capabilities to quantify knowledge uncertainty for individual samples/patients. 
Discussion/Conclusions: Methods including a knowledge model, such as SNGP, offer superior uncertainty estimation compared to traditional stochastic deep learning, leading to more trustworthy and safe clinical decision support.
Affiliation(s)
- Adrian Lindenmeyer
  - Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI) Dresden/Leipzig, Leipzig University, Humboldtstraße 25, 04105 Leipzig, Germany
  - Innovation Center Computer Assisted Surgery (ICCAS), Leipzig University, Semmelweisstrasse 14, 04103 Leipzig, Germany
- Malte Blattmann
  - Innovation Center Computer Assisted Surgery (ICCAS), Leipzig University, Semmelweisstrasse 14, 04103 Leipzig, Germany
- Stefan Franke
  - Innovation Center Computer Assisted Surgery (ICCAS), Leipzig University, Semmelweisstrasse 14, 04103 Leipzig, Germany
- Thomas Neumuth
  - Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI) Dresden/Leipzig, Leipzig University, Humboldtstraße 25, 04105 Leipzig, Germany
  - Innovation Center Computer Assisted Surgery (ICCAS), Leipzig University, Semmelweisstrasse 14, 04103 Leipzig, Germany
- Daniel Schneider
  - Innovation Center Computer Assisted Surgery (ICCAS), Leipzig University, Semmelweisstrasse 14, 04103 Leipzig, Germany
|
16
|
Molchanova N, Raina V, Malinin A, Rosa FL, Depeursinge A, Gales M, Granziera C, Müller H, Graziani M, Cuadra MB. Structural-based uncertainty in deep learning across anatomical scales: Analysis in white matter lesion segmentation. Comput Biol Med 2025; 184:109336. [PMID: 39546878 DOI: 10.1016/j.compbiomed.2024.109336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2023] [Revised: 10/23/2024] [Accepted: 10/23/2024] [Indexed: 11/17/2024]
Abstract
This paper explores uncertainty quantification (UQ) as an indicator of the trustworthiness of automated deep-learning (DL) tools in the context of white matter lesion (WML) segmentation from magnetic resonance imaging (MRI) scans of multiple sclerosis (MS) patients. Our study focuses on two principal aspects of uncertainty in structured output segmentation tasks. First, we postulate that a reliable uncertainty measure should assign high uncertainty values to predictions that are likely to be incorrect. Second, we investigate the merit of quantifying uncertainty at different anatomical scales (voxel, lesion, or patient). We hypothesize that uncertainty at each scale is related to specific types of errors. Our study aims to confirm this relationship by conducting separate analyses for in-domain and out-of-domain settings. Our primary methodological contributions are (i) the development of novel measures for quantifying uncertainty at lesion and patient scales, derived from structural prediction discrepancies, and (ii) the extension of an error retention curve analysis framework to facilitate the evaluation of UQ performance at both lesion and patient scales. The results from a multi-centric MRI dataset of 444 patients demonstrate that our proposed measures more effectively capture model errors at the lesion and patient scales compared to measures that average voxel-scale uncertainty values. We provide the UQ protocols code at https://github.com/Medical-Image-Analysis-Laboratory/MS_WML_uncs.
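The error retention curve mentioned above can be illustrated in its generic, per-sample form. The sketch below is not the authors' lesion- or patient-scale extension (see their repository for that); it assumes one error value and one uncertainty value per prediction, and the function names are hypothetical.

```python
import numpy as np

def error_retention_curve(errors, uncertainties):
    """Mean error when only the k most certain predictions are retained
    and the remaining ones are replaced by the ground truth (error 0)."""
    errors = np.asarray(errors, dtype=float)
    order = np.argsort(uncertainties)            # most certain first
    cum = np.concatenate([[0.0], np.cumsum(errors[order])])
    n = len(errors)
    retention = np.arange(n + 1) / n             # fraction of predictions retained
    return retention, cum / n

def area_under_curve(x, y):
    """Trapezoidal area; lower is better for an error retention curve."""
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))

errors = [0, 0, 1, 1]
r_good, c_good = error_retention_curve(errors, [0.1, 0.2, 0.8, 0.9])  # uncertainty tracks error
r_bad, c_bad = error_retention_curve(errors, [0.9, 0.8, 0.2, 0.1])    # uncertainty inverted
auc_good = area_under_curve(r_good, c_good)
auc_bad = area_under_curve(r_bad, c_bad)
```

An uncertainty measure that ranks wrong predictions as most uncertain keeps the curve low until the last retained samples, which is the first property the abstract postulates.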
Affiliation(s)
- Nataliia Molchanova
  - Radiology Department, University of Lausanne and Lausanne University Hospital, Lausanne, Switzerland; MedGIFT, Institute of Informatics, School of Management, HES-SO Valais-Wallis University of Applied Sciences and Arts Western Switzerland, Sierre, Switzerland; CIBM Center for Biomedical Imaging, Lausanne, Switzerland
- Vatsal Raina
  - ALTA Institute, University of Cambridge, Cambridge, United Kingdom
- Francesco La Rosa
  - Icahn School of Medicine at Mount Sinai, New York City, United States of America
- Adrien Depeursinge
  - Radiology Department, University of Lausanne and Lausanne University Hospital, Lausanne, Switzerland; MedGIFT, Institute of Informatics, School of Management, HES-SO Valais-Wallis University of Applied Sciences and Arts Western Switzerland, Sierre, Switzerland
- Mark Gales
  - ALTA Institute, University of Cambridge, Cambridge, United Kingdom
- Cristina Granziera
  - Translational Imaging in Neurology (ThINK) Basel, Department of Medicine and Biomedical Engineering, University Hospital Basel and University of Basel, Basel, Switzerland; Department of Neurology, University Hospital Basel, Basel, Switzerland; Research Center for Clinical Neuroimmunology and Neuroscience Basel (RC2NB), University Hospital Basel and University of Basel, Basel, Switzerland
- Henning Müller
  - MedGIFT, Institute of Informatics, School of Management, HES-SO Valais-Wallis University of Applied Sciences and Arts Western Switzerland, Sierre, Switzerland; Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
- Mara Graziani
  - MedGIFT, Institute of Informatics, School of Management, HES-SO Valais-Wallis University of Applied Sciences and Arts Western Switzerland, Sierre, Switzerland
- Meritxell Bach Cuadra
  - Radiology Department, University of Lausanne and Lausanne University Hospital, Lausanne, Switzerland; CIBM Center for Biomedical Imaging, Lausanne, Switzerland
|
17
|
Wahid KA, Kaffey ZY, Farris DP, Humbert-Vidan L, Moreno AC, Rasmussen M, Ren J, Naser MA, Netherton TJ, Korreman S, Balakrishnan G, Fuller CD, Fuentes D, Dohopolski MJ. Artificial intelligence uncertainty quantification in radiotherapy applications - A scoping review. Radiother Oncol 2024; 201:110542. [PMID: 39299574 PMCID: PMC11648575 DOI: 10.1016/j.radonc.2024.110542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2024] [Revised: 08/18/2024] [Accepted: 09/09/2024] [Indexed: 09/22/2024]
Abstract
BACKGROUND/PURPOSE The use of artificial intelligence (AI) in radiotherapy (RT) is expanding rapidly. However, there exists a notable lack of clinician trust in AI models, underscoring the need for effective uncertainty quantification (UQ) methods. The purpose of this study was to scope existing literature related to UQ in RT, identify areas of improvement, and determine future directions. METHODS We followed the PRISMA-ScR scoping review reporting guidelines. We utilized the population (human cancer patients), concept (utilization of AI UQ), context (radiotherapy applications) framework to structure our search and screening process. We conducted a systematic search spanning seven databases, supplemented by manual curation, up to January 2024. Our search yielded a total of 8980 articles for initial review. Manuscript screening and data extraction were performed in Covidence. Data extraction categories included general study characteristics, RT characteristics, AI characteristics, and UQ characteristics. RESULTS We identified 56 articles published from 2015 to 2024. 10 domains of RT applications were represented; most studies evaluated auto-contouring (50 %), followed by image-synthesis (13 %), and multiple applications simultaneously (11 %). 12 disease sites were represented, with head and neck cancer being the most common disease site independent of application space (32 %). Imaging data was used in 91 % of studies, while only 13 % incorporated RT dose information. Most studies focused on failure detection as the main application of UQ (60 %), with Monte Carlo dropout being the most commonly implemented UQ method (32 %) followed by ensembling (16 %). 55 % of studies did not share code or datasets. CONCLUSION Our review revealed a lack of diversity in UQ for RT applications beyond auto-contouring. Moreover, we identified a clear need to study additional UQ methods, such as conformal prediction. 
Our results may incentivize the development of guidelines for reporting and implementation of UQ in RT.
Affiliation(s)
- Kareem A Wahid
  - Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA; Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Zaphanlene Y Kaffey
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- David P Farris
  - Research Medical Library, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Laia Humbert-Vidan
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Amy C Moreno
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jintao Ren
  - Department of Oncology, Aarhus University Hospital, Denmark
- Mohamed A Naser
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Tucker J Netherton
  - Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Stine Korreman
  - Department of Oncology, Aarhus University Hospital, Denmark
- Clifton D Fuller
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- David Fuentes
  - Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Michael J Dohopolski
  - Department of Radiation Oncology, The University of Texas Southwestern Medical Center, Dallas, TX, USA
|
18
|
Peeters D, Alves N, Venkadesh KV, Dinnessen R, Saghir Z, Scholten ET, Schaefer-Prokop C, Vliegenthart R, Prokop M, Jacobs C. Enhancing a deep learning model for pulmonary nodule malignancy risk estimation in chest CT with uncertainty estimation. Eur Radiol 2024; 34:6639-6651. [PMID: 38536463 PMCID: PMC11399205 DOI: 10.1007/s00330-024-10714-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2023] [Revised: 02/22/2024] [Accepted: 02/27/2024] [Indexed: 09/15/2024]
Abstract
OBJECTIVE To investigate the effect of uncertainty estimation on the performance of a Deep Learning (DL) algorithm for estimating malignancy risk of pulmonary nodules. METHODS AND MATERIALS In this retrospective study, we integrated an uncertainty estimation method into a previously developed DL algorithm for nodule malignancy risk estimation. Uncertainty thresholds were developed using CT data from the Danish Lung Cancer Screening Trial (DLCST), containing 883 nodules (65 malignant) collected between 2004 and 2010. We used thresholds on the 90th and 95th percentiles of the uncertainty score distribution to categorize nodules into certain and uncertain groups. External validation was performed on clinical CT data from a tertiary academic center containing 374 nodules (207 malignant) collected between 2004 and 2012. DL performance was measured using area under the ROC curve (AUC) for the full set of nodules, for the certain cases and for the uncertain cases. Additionally, nodule characteristics were compared to identify trends for inducing uncertainty. RESULTS The DL algorithm performed significantly worse in the uncertain group compared to the certain group of DLCST (AUC 0.62 (95% CI: 0.49, 0.76) vs 0.93 (95% CI: 0.88, 0.97); p < .001) and the clinical dataset (AUC 0.62 (95% CI: 0.50, 0.73) vs 0.90 (95% CI: 0.86, 0.94); p < .001). The uncertain group included larger benign nodules as well as more part-solid and non-solid nodules than the certain group. CONCLUSION The integrated uncertainty estimation showed excellent performance for identifying uncertain cases in which the DL-based nodule malignancy risk estimation algorithm had significantly worse performance. CLINICAL RELEVANCE STATEMENT Deep Learning algorithms often lack the ability to gauge and communicate uncertainty. For safe clinical implementation, uncertainty estimation is of pivotal importance to identify cases where the deep learning algorithm harbors doubt in its prediction. 
KEY POINTS • Deep learning (DL) algorithms often lack uncertainty estimation, which could potentially reduce the risk of errors and improve safety during clinical adoption of the DL algorithm. • Uncertainty estimation identifies pulmonary nodules in which the discriminative performance of the DL algorithm is significantly worse. • Uncertainty estimation can further enhance the benefits of the DL algorithm and improve its safety and trustworthiness.
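The thresholding step described in the methods (a percentile of the uncertainty score distribution on a development set, then applied to new cases) can be sketched as follows. The function name, the toy scores, and the convention that scores above the threshold count as uncertain are assumptions for illustration.

```python
import numpy as np

def split_by_uncertainty(dev_scores, new_scores, percentile=90):
    """Derive a threshold on development-set scores and flag new cases above it."""
    threshold = np.percentile(dev_scores, percentile)
    uncertain = np.asarray(new_scores) > threshold  # True = route to the uncertain group
    return threshold, uncertain

# Hypothetical scores: 0..100 on the development set, two new cases.
dev_scores = np.arange(101)
threshold, uncertain = split_by_uncertainty(dev_scores, [10, 95], percentile=90)
```

Performance metrics such as AUC can then be computed separately on the certain and uncertain groups, as the study reports.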
Affiliation(s)
- Dré Peeters
  - Diagnostic Imaging Analysis Group, Medical Imaging Department, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA, Nijmegen, the Netherlands
- Natália Alves
  - Diagnostic Imaging Analysis Group, Medical Imaging Department, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA, Nijmegen, the Netherlands
- Kiran V Venkadesh
  - Diagnostic Imaging Analysis Group, Medical Imaging Department, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA, Nijmegen, the Netherlands
- Renate Dinnessen
  - Diagnostic Imaging Analysis Group, Medical Imaging Department, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA, Nijmegen, the Netherlands
- Zaigham Saghir
  - Department of Medicine, Section of Pulmonary Medicine, Herlev-Gentofte Hospital, Hellerup, Denmark
  - Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
- Ernst T Scholten
  - Diagnostic Imaging Analysis Group, Medical Imaging Department, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA, Nijmegen, the Netherlands
- Cornelia Schaefer-Prokop
  - Diagnostic Imaging Analysis Group, Medical Imaging Department, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA, Nijmegen, the Netherlands
  - Radiology Department, Meander Medical Center, Maatweg 3, 3813 TZ, Amersfoort, The Netherlands
- Rozemarijn Vliegenthart
  - Department of Radiology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, 9700RB, Groningen, The Netherlands
- Mathias Prokop
  - Diagnostic Imaging Analysis Group, Medical Imaging Department, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA, Nijmegen, the Netherlands
  - Department of Radiology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, 9700RB, Groningen, The Netherlands
- Colin Jacobs
  - Diagnostic Imaging Analysis Group, Medical Imaging Department, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA, Nijmegen, the Netherlands
|
19
|
Yao N, Tian Y, Neves DGD, Zhao C, Mesquita CT, Martins WDA, Dos Santos AASMD, Li Y, Han C, Zhu F, Dai N, Zhou W. Incremental Value of Radiomics Features of Epicardial Adipose Tissue for Detecting the Severity of COVID-19 Infection. KARDIOLOGIIA 2024; 64:96-104. [PMID: 39392272 DOI: 10.18087/cardio.2024.9.n2685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/22/2024] [Accepted: 06/30/2024] [Indexed: 10/12/2024]
Abstract
INTRODUCTION Epicardial adipose tissue (EAT) is known for its pro-inflammatory properties and association with Coronavirus Disease 2019 (COVID-19) severity. However, existing detection methods for COVID-19 severity assessment often lack consideration of organs and tissues other than the lungs, which limits the accuracy and reliability of these predictive models. MATERIAL AND METHODS The retrospective study included data from 515 COVID-19 patients (Cohort 1, n=415; Cohort 2, n=100) from two centers (Shanghai Public Health Center and Brazil Niteroi Hospital) between January 2020 and July 2020. Firstly, a three-stage EAT segmentation method was proposed by combining object detection and segmentation networks. Lung and EAT radiomics features were then extracted, and feature selection was performed. Finally, a hybrid model, based on seven machine learning models, was built for detecting COVID-19 severity. The hybrid model's performance and uncertainty were evaluated in both internal and external validation cohorts. RESULTS For EAT extraction, the Dice similarity coefficients (DSC) of the two centers were 0.972 (±0.011) and 0.968 (±0.005), respectively. For severity detection, the area under the receiver operating characteristic curve (AUC), net reclassification improvement (NRI), and integrated discrimination improvement (IDI) of the hybrid model increased by 0.09 (p<0.001), 19.3 % (p<0.05), and 18.0 % (p<0.05) in the internal validation cohort, and by 0.06 (p<0.001), 18.0 % (p<0.05) and 18.0 % (p<0.05) in the external validation cohort, respectively. Uncertainty and radiomics features analysis confirmed the interpretability of increased certainty in case prediction after inclusion of EAT features. CONCLUSION This study proposed a novel three-stage EAT extraction method. We demonstrated that adding EAT radiomics features to a COVID-19 severity detection model results in increased accuracy and reduced uncertainty. 
The value of these features was also confirmed through feature importance ranking and visualization.
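The Dice similarity coefficient (DSC) reported for EAT extraction is defined as twice the overlap of two masks divided by the sum of their sizes. A minimal sketch for binary masks follows; the function name and the handling of two empty masks are illustrative choices, not taken from the study.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return 2.0 * np.logical_and(pred, target).sum() / denom

dsc = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])  # overlap 1, sizes 2 and 1
dsc_empty = dice_coefficient([0, 0], [0, 0])
```

Values close to 1, such as the 0.972 and 0.968 reported above, indicate near-complete overlap between predicted and reference segmentations.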
Affiliation(s)
- Ni Yao
  - Zhengzhou University of Light Industry, School of Computer Science and Technology, Zhengzhou
- Yanhui Tian
  - Zhengzhou University of Light Industry, School of Computer Science and Technology, Zhengzhou
- Daniel Gama das Neves
  - Universidade Federal Fluminense, Department of Radiology; DASA Complexo Hospitalar de Niterói
- Chen Zhao
  - Michigan Technological University, Department of Applied Computing, Houghton
- Yanting Li
  - Zhengzhou University of Light Industry, School of Computer Science and Technology, Zhengzhou
- Chuang Han
  - Zhengzhou University of Light Industry, School of Computer Science and Technology, Zhengzhou
- Fubao Zhu
  - Zhengzhou University of Light Industry, School of Computer Science and Technology, Zhengzhou
- Neng Dai
  - Zhongshan Hospital, Fudan University, Shanghai Institute of Cardiovascular Diseases, Department of Cardiology; National Clinical Research Center for Interventional Medicine
- Weihua Zhou
  - Michigan Technological University, Department of Applied Computing, Houghton; Center for Biocomputing and Digital Health, Institute of Computing and Cybersystems, and Health Research Institute, Michigan Technological University, Houghton
|
20
|
Guo X, Xiang Y, Yang Y, Ye C, Yu Y, Ma T. Accelerating denoising diffusion probabilistic model via truncated inverse processes for medical image segmentation. Comput Biol Med 2024; 180:108933. [PMID: 39096612 DOI: 10.1016/j.compbiomed.2024.108933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 06/07/2024] [Accepted: 07/19/2024] [Indexed: 08/05/2024]
Abstract
Medical image segmentation demands high accuracy and the ability to assess segmentation uncertainty for informed clinical decision-making. Denoising diffusion probabilistic models (DDPMs), with their advances in image generation, can treat segmentation as a conditional generation task, providing accurate segmentation and uncertainty estimation. However, current DDPMs used in medical image segmentation suffer from low inference efficiency and prediction errors caused by excessive noise at the end of the forward process. To address this issue, we propose an accelerated denoising diffusion probabilistic model via truncated inverse processes (ADDPM) that is specifically designed for medical image segmentation. The inverse process of ADDPM starts from a non-Gaussian distribution and terminates early once a prediction with relatively low noise is obtained after multiple iterations of denoising. We employ a separate powerful segmentation network to obtain a pre-segmentation and construct the non-Gaussian distribution of the segmentation based on the forward diffusion rule. By further adopting a separate denoising network, the final segmentation can be obtained with just one denoising step from the low-noise predictions. ADDPM greatly reduces the number of denoising steps to approximately one-tenth of that in vanilla DDPMs. Our experiments on four segmentation tasks demonstrate that ADDPM outperforms both vanilla DDPMs and existing representative DDPM acceleration methods. Moreover, ADDPM can be easily integrated with existing advanced segmentation models to improve segmentation performance and provide uncertainty estimation. Implementation code: https://github.com/Guoxt/ADDPM.
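The "forward diffusion rule" mentioned in the abstract can be illustrated with the standard DDPM closed form q(x_t | x_0) = N(sqrt(ᾱ_t) x_0, (1 − ᾱ_t) I). The sketch below is not the authors' implementation (see their repository for that). It only shows, under assumed settings, how noising a pre-segmentation up to a truncated step leaves far more signal than running the forward process to its end. The linear beta schedule, `T = 1000`, the truncation step `t_trunc = 100`, and the toy square mask are all hypothetical choices made for illustration.

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) using the standard DDPM closed form."""
    alpha_bar = np.cumprod(1.0 - betas)   # cumulative signal retention
    abar_t = alpha_bar[t]
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * noise
    return x_t, abar_t

rng = np.random.default_rng(0)

# Assumed (hypothetical) schedule: linear betas over T steps.
T = 1000
betas = np.linspace(1e-4, 0.02, T)

# Toy "pre-segmentation": a square mask rescaled to [-1, 1].
pre_seg = np.zeros((64, 64))
pre_seg[16:48, 16:48] = 1.0
pre_seg = 2.0 * pre_seg - 1.0

# Truncated start: noise the pre-segmentation only up to step t_trunc,
# instead of starting the reverse process from pure Gaussian noise at T.
t_trunc = 100
x_trunc, abar_trunc = forward_diffuse(pre_seg, t_trunc, betas, rng)
x_full, abar_full = forward_diffuse(pre_seg, T - 1, betas, rng)

print(f"signal retained at t={t_trunc}: {abar_trunc:.3f}")
print(f"signal retained at t={T - 1}: {abar_full:.2e}")
```

Starting the reverse process from `x_trunc` rather than from `x_full` is what makes a shortened denoising chain plausible in this toy setting, because most of the mask structure is still present.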
Affiliation(s)
- Xutao Guo
  - School of Electronics and Information Engineering, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong, China; The Peng Cheng Laboratory, Shenzhen, Guangdong, China
- Yang Xiang
  - The Peng Cheng Laboratory, Shenzhen, Guangdong, China
- Yanwu Yang
  - School of Electronics and Information Engineering, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong, China; The Peng Cheng Laboratory, Shenzhen, Guangdong, China
- Chenfei Ye
  - The International Research Institute for Artificial Intelligence, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong, China
- Yue Yu
  - The Peng Cheng Laboratory, Shenzhen, Guangdong, China
- Ting Ma
  - School of Electronics and Information Engineering, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong, China; The Peng Cheng Laboratory, Shenzhen, Guangdong, China; Guangdong Provincial Key Laboratory of Aerospace Communication and Networking Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong, China
|
21
|
Kang S, Kang Y, Tan S. Exploring and Exploiting Multi-Modality Uncertainty for Tumor Segmentation on PET/CT. IEEE J Biomed Health Inform 2024; 28:5435-5446. [PMID: 38776203 DOI: 10.1109/jbhi.2024.3397332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2024]
Abstract
Despite the success of deep learning methods in multi-modality segmentation tasks, they typically produce a deterministic output, neglecting the underlying uncertainty. The absence of uncertainty could lead to over-confident predictions with catastrophic consequences, particularly in safety-critical clinical applications. Recently, uncertainty estimation has attracted increasing attention, offering a measure of confidence associated with machine decisions. Nonetheless, existing uncertainty estimation approaches primarily focus on single-modality networks, leaving the uncertainty of multi-modality networks a largely under-explored domain. In this study, we present the first exploration of multi-modality uncertainties in the context of tumor segmentation on PET/CT. Concretely, we assessed four well-established uncertainty estimation approaches across various dimensions, including segmentation performance, uncertainty quality, comparison to single-modality uncertainties, and correlation to the contradictory information between modalities. Through qualitative and quantitative analyses, we gained valuable insights into what benefits multi-modality uncertainties provide, what information multi-modality uncertainties capture, and how multi-modality uncertainties correlate to information from single modalities. Drawing from these insights, we introduced a novel uncertainty-driven loss, which incentivized the network to effectively utilize the complementary information between modalities. The proposed approach outperformed the backbone network by 4.53 and 2.92 percentage points in Dice score on two PET/CT datasets while achieving lower uncertainties. This study not only advanced the comprehension of multi-modality uncertainties but also revealed the potential benefit of incorporating them into the segmentation network.
|
22
|
Edfeldt K, Edwards AM, Engkvist O, Günther J, Hartley M, Hulcoop DG, Leach AR, Marsden BD, Menge A, Misquitta L, Müller S, Owen DR, Schütt KT, Skelton N, Steffen A, Tropsha A, Vernet E, Wang Y, Wellnitz J, Willson TM, Clevert DA, Haibe-Kains B, Schiavone LH, Schapira M. A data science roadmap for open science organizations engaged in early-stage drug discovery. Nat Commun 2024; 15:5640. [PMID: 38965235 PMCID: PMC11224410 DOI: 10.1038/s41467-024-49777-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Accepted: 06/12/2024] [Indexed: 07/06/2024] Open
Abstract
The Structural Genomics Consortium is an international open science research organization with a focus on accelerating early-stage drug discovery, namely hit discovery and optimization. We, as many others, believe that artificial intelligence (AI) is poised to be a main accelerator in the field. The question is then how to best benefit from recent advances in AI and how to generate, format and disseminate data to enable future breakthroughs in AI-guided drug discovery. We present here the recommendations of a working group composed of experts from both the public and private sectors. Robust data management requires precise ontologies and standardized vocabulary while a centralized database architecture across laboratories facilitates data integration into high-value datasets. Lab automation and opening electronic lab notebooks to data mining push the boundaries of data sharing and data modeling. Important considerations for building robust machine-learning models include transparent and reproducible data processing, choosing the most relevant data representation, defining the right training and test sets, and estimating prediction uncertainty. Beyond data-sharing, cloud-based computing can be harnessed to build and disseminate machine-learning models. Important vectors of acceleration for hit and chemical probe discovery will be (1) the real-time integration of experimental data generation and modeling workflows within design-make-test-analyze (DMTA) cycles openly, and at scale and (2) the adoption of a mindset where data scientists and experimentalists work as a unified team, and where data science is incorporated into the experimental design.
Affiliation(s)
- Kristina Edfeldt
  - Structural Genomics Consortium, Department of Medicine, Karolinska University Hospital and Karolinska Institutet, Stockholm, Sweden
- Aled M Edwards
  - Structural Genomics Consortium, University of Toronto, Toronto, ON, Canada
- Ola Engkvist
  - Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden & Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Judith Günther
  - Bayer AG Research and Development, Computational Molecular Design, Berlin, Germany
- Matthew Hartley
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
- David G Hulcoop
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Andrew R Leach
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
- Brian D Marsden
  - Centre for Medicines Discovery, NDM, University of Oxford, Oxford, UK
- Amelie Menge
  - Institute of Pharmaceutical Chemistry, Johann Wolfgang Goethe University, Frankfurt am Main, 60438, Germany & Structural Genomics Consortium (SGC), Buchmann Institute for Life Sciences, Johann Wolfgang Goethe University, Frankfurt am Main, Germany
- Leonie Misquitta
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Susanne Müller
  - Institute of Pharmaceutical Chemistry, Johann Wolfgang Goethe University, Frankfurt am Main, 60438, Germany & Structural Genomics Consortium (SGC), Buchmann Institute for Life Sciences, Johann Wolfgang Goethe University, Frankfurt am Main, Germany
- Dafydd R Owen
  - Pfizer Worldwide Research, Development & Medical, Cambridge, MA, USA
- Kristof T Schütt
  - Pfizer, Worldwide Research, Development and Medical, Machine Learning & Computational Sciences, Berlin, Germany
- Nicholas Skelton
  - Department of Discovery Chemistry, Genentech, Inc., South San Francisco, CA, USA
- Andreas Steffen
  - Pfizer, Worldwide Research, Development and Medical, Machine Learning & Computational Sciences, Berlin, Germany
- Alexander Tropsha
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina, USA
- Erik Vernet
  - Digital Science & Innovation, Novo Nordisk A/S, Maaloev, Denmark
- Yanli Wang
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- James Wellnitz
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina, USA
- Timothy M Willson
  - Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Djork-Arné Clevert
  - Pfizer, Worldwide Research, Development and Medical, Machine Learning & Computational Sciences, Berlin, Germany
- Benjamin Haibe-Kains
  - Structural Genomics Consortium, University of Toronto, Toronto, ON, Canada
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON, Canada
- Matthieu Schapira
  - Structural Genomics Consortium, University of Toronto, Toronto, ON, Canada
  - Department of Pharmacology & Toxicology, University of Toronto, Toronto, ON, Canada
|
23
|
Vo K, El-Khamy M, Choi Y. PPG-to-ECG Signal Translation for Continuous Atrial Fibrillation Detection via Attention-based Deep State-Space Modeling. Annu Int Conf IEEE Eng Med Biol Soc 2024; 2024:1-7. [PMID: 40039489 DOI: 10.1109/embc53108.2024.10781630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2025]
Abstract
Photoplethysmography (PPG) is a cost-effective and non-invasive technique that utilizes optical methods to measure cardiac physiology. PPG has become increasingly popular in health monitoring and is used in various commercial and clinical wearable devices. Compared to electrocardiography (ECG), PPG does not provide substantial clinical diagnostic value, despite the strong correlation between the two. Here, we propose a subject-independent attention-based deep state-space model (ADSSM) to translate PPG signals to corresponding ECG waveforms. The model is not only robust to noise but also data-efficient by incorporating probabilistic prior knowledge. To evaluate our approach, 55 subjects' data from the MIMIC-III database were used in their original form, and then modified with noise, mimicking real-world scenarios. Our approach was proven effective as evidenced by the PR-AUC of 0.986 achieved when inputting the translated ECG signals into an existing atrial fibrillation (AFib) detector. ADSSM enables the integration of ECG's extensive knowledge base and PPG's continuous measurement for early diagnosis of cardiovascular disease.
|
24
|
Beil M, Moreno R, Fronczek J, Kogan Y, Moreno RPJ, Flaatten H, Guidet B, de Lange D, Leaver S, Nachshon A, van Heerden PV, Joskowicz L, Sviri S, Jung C, Szczeklik W. Prognosticating the outcome of intensive care in older patients-a narrative review. Ann Intensive Care 2024; 14:97. [PMID: 38907141 PMCID: PMC11192712 DOI: 10.1186/s13613-024-01330-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 06/10/2024] [Indexed: 06/23/2024] Open
Abstract
Prognosis determines major decisions regarding treatment for critically ill patients. Statistical models have been developed to predict the probability of survival and other outcomes of intensive care. Although they were trained on the characteristics of large patient cohorts, they often do not represent very old patients (age ≥ 80 years) appropriately. Moreover, the heterogeneity within this particular group impairs the utility of statistical predictions for informing decision-making in very old individuals. In addition to these methodological problems, the diversity of cultural attitudes, available resources as well as variations of legal and professional norms limit the generalisability of prediction models, especially in patients with complex multi-morbidity and pre-existing functional impairments. Thus, current approaches to prognosticating outcomes in very old patients are imperfect and can generate substantial uncertainty about optimal trajectories of critical care in the individual. This article presents the state of the art and new approaches to predicting outcomes of intensive care for these patients. Special emphasis has been given to the integration of predictions into the decision-making for individual patients. This requires quantification of prognostic uncertainty and a careful alignment of decisions with the preferences of patients, who might prioritise functional outcomes over survival. Since the performance of outcome predictions for the individual patient may improve over time, time-limited trials in intensive care may be an appropriate way to increase the confidence in decisions about life-sustaining treatment.
Affiliation(s)
- Michael Beil
  - Department of Medical Intensive Care, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- Rui Moreno
  - Unidade Local de Saúde São José, Hospital de São José, Lisbon, Portugal
  - Centro Clínico Académico de Lisboa, Lisbon, Portugal
  - Faculdade de Ciências da Saúde, Universidade da Beira Interior, Covilhã, Portugal
- Jakub Fronczek
  - Center for Intensive Care and Perioperative Medicine, Jagiellonian University Medical College, Krakow, Poland
- Yuri Kogan
  - Institute for Medical Biomathematics, Bene Ataroth, Israel
- Hans Flaatten
  - Department of Research and Development, Haukeland University Hospital, Bergen, Norway
- Bertrand Guidet
  - INSERM, Institut Pierre Louis d'Epidémiologie Et de Santé Publique, AP-HP, Hôpital Saint Antoine, Sorbonne Université, Service MIR, Paris, France
- Dylan de Lange
  - Department of Intensive Care Medicine, University Medical Center, University Utrecht, Utrecht, The Netherlands
- Susannah Leaver
  - General Intensive Care, St George's University Hospitals NHS Foundation Trust, London, UK
- Akiva Nachshon
  - General Intensive Care Unit, Department of Anaesthesiology, Critical Care and Pain Medicine, Faculty of Medicine, Hebrew University and Hadassah University Medical Center, Jerusalem, Israel
- Peter Vernon van Heerden
  - General Intensive Care Unit, Department of Anaesthesiology, Critical Care and Pain Medicine, Faculty of Medicine, Hebrew University and Hadassah University Medical Center, Jerusalem, Israel
- Leo Joskowicz
  - School of Computer Science and Engineering and Center for Computational Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
- Sigal Sviri
  - Department of Medical Intensive Care, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- Christian Jung
  - Department of Cardiology, Pulmonology and Vascular Medicine, Faculty of Medicine, Heinrich-Heine-University, University Duesseldorf, Moorenstraße 5, 40225, Düsseldorf, Germany
- Wojciech Szczeklik
  - Center for Intensive Care and Perioperative Medicine, Jagiellonian University Medical College, Krakow, Poland
|
25
|
Fan Z, Yu J, Zhang X, Chen Y, Sun S, Zhang Y, Chen M, Xiao F, Wu W, Li X, Zheng M, Luo X, Wang D. Reducing overconfident errors in molecular property classification using Posterior Network. Patterns (N Y) 2024; 5:100991. [PMID: 39005492 PMCID: PMC11240180 DOI: 10.1016/j.patter.2024.100991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 12/20/2023] [Accepted: 04/15/2024] [Indexed: 07/16/2024]
Abstract
Deep-learning-based classification models are increasingly used for predicting molecular properties in drug development. However, traditional classification models using the Softmax function often give overconfident mispredictions for out-of-distribution samples, highlighting a critical lack of accurate uncertainty estimation. Such limitations can result in substantial costs and should be avoided during drug development. Inspired by advances in evidential deep learning and Posterior Network, we replaced the Softmax function with a normalizing flow to enhance the uncertainty estimation ability of the model in molecular property classification. The proposed strategy was evaluated across diverse scenarios, including simulated experiments based on a synthetic dataset, ADMET predictions, and ligand-based virtual screening. The results demonstrate that compared with the vanilla model, the proposed strategy effectively alleviates the problem of giving overconfident but incorrect predictions. Our findings support the promising application of evidential deep learning in drug development and offer a valuable framework for further research.
Affiliation(s)
- Zhehuan Fan
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China
- Jie Yu
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China
- Xiang Zhang
  - School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Yijie Chen
  - School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Shihui Sun
  - School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Yuanyuan Zhang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China
- Mingan Chen
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - School of Physical Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Lingang Laboratory, Shanghai 200031, China
- Fu Xiao
  - School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Wenyong Wu
  - Lingang Laboratory, Shanghai 200031, China
- Xutong Li
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China
- Mingyue Zheng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China
  - School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Xiaomin Luo
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China
  - School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
|
26
|
Wahid KA, Kaffey ZY, Farris DP, Humbert-Vidan L, Moreno AC, Rasmussen M, Ren J, Naser MA, Netherton TJ, Korreman S, Balakrishnan G, Fuller CD, Fuentes D, Dohopolski MJ. Artificial Intelligence Uncertainty Quantification in Radiotherapy Applications - A Scoping Review. medRxiv 2024:2024.05.13.24307226. [PMID: 38798581 PMCID: PMC11118597 DOI: 10.1101/2024.05.13.24307226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
BACKGROUND/PURPOSE The use of artificial intelligence (AI) in radiotherapy (RT) is expanding rapidly. However, there exists a notable lack of clinician trust in AI models, underscoring the need for effective uncertainty quantification (UQ) methods. The purpose of this study was to scope existing literature related to UQ in RT, identify areas of improvement, and determine future directions. METHODS We followed the PRISMA-ScR scoping review reporting guidelines. We utilized the population (human cancer patients), concept (utilization of AI UQ), context (radiotherapy applications) framework to structure our search and screening process. We conducted a systematic search spanning seven databases, supplemented by manual curation, up to January 2024. Our search yielded a total of 8980 articles for initial review. Manuscript screening and data extraction were performed in Covidence. Data extraction categories included general study characteristics, RT characteristics, AI characteristics, and UQ characteristics. RESULTS We identified 56 articles published from 2015 to 2024. Ten domains of RT applications were represented; most studies evaluated auto-contouring (50%), followed by image-synthesis (13%), and multiple applications simultaneously (11%). Twelve disease sites were represented, with head and neck cancer being the most common disease site independent of application space (32%). Imaging data was used in 91% of studies, while only 13% incorporated RT dose information. Most studies focused on failure detection as the main application of UQ (60%), with Monte Carlo dropout being the most commonly implemented UQ method (32%) followed by ensembling (16%). 55% of studies did not share code or datasets. CONCLUSION Our review revealed a lack of diversity in UQ for RT applications beyond auto-contouring. Moreover, there was a clear need to study additional UQ methods, such as conformal prediction. Our results may incentivize the development of guidelines for reporting and implementation of UQ in RT.
Affiliation(s)
- Kareem A. Wahid
  - Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Zaphanlene Y. Kaffey
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- David P. Farris
  - Research Medical Library, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Laia Humbert-Vidan
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Amy C. Moreno
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Jintao Ren
  - Department of Oncology, Aarhus University Hospital, Denmark
- Mohamed A. Naser
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Tucker J. Netherton
  - Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Stine Korreman
  - Department of Oncology, Aarhus University Hospital, Denmark
- Clifton D. Fuller
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- David Fuentes
  - Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Michael J. Dohopolski
  - Department of Radiation Oncology, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
|
27
|
Kantarakias KD, Papadakis G. Uncertainty quantification of time-average quantities of chaotic systems using sensitivity-enhanced polynomial chaos expansion. Phys Rev E 2024; 109:044208. [PMID: 38755938 DOI: 10.1103/physreve.109.044208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2023] [Accepted: 03/19/2024] [Indexed: 05/18/2024]
Abstract
We consider the effect of multiple stochastic parameters on the time-average quantities of chaotic systems. We employ the recently proposed sensitivity-enhanced generalized polynomial chaos expansion, se-gPC, to quantify this effect efficiently. se-gPC is an extension of the gPC expansion, enriched with the sensitivity of the time-averaged quantities with respect to the stochastic variables. To compute these sensitivities, the adjoint of the shadowing operator is derived in the frequency domain. Coupling the adjoint operator with gPC provides an efficient uncertainty quantification algorithm, which, in its simplest form, has a computational cost that is independent of the number of random variables. The method is applied to the Kuramoto-Sivashinsky equation and is found to produce results that match very well with Monte Carlo simulations. The proposed method is significantly more efficient than sparse-grid approaches, such as Smolyak quadrature. These properties make the method suitable for application to other dynamical systems with many stochastic parameters.
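For readers unfamiliar with gPC, the following sketch shows the plain (not sensitivity-enhanced) expansion for a single standard-normal parameter. It does not implement se-gPC, the adjoint shadowing operator, or any chaotic system. The test function `f`, the expansion order, and the quadrature size are hypothetical choices made only to show how the mean and variance fall out of the expansion coefficients.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

def gpc_coefficients(f, order, n_quad=40):
    """Project f(xi), xi ~ N(0, 1), onto probabilists' Hermite polynomials."""
    nodes, weights = He.hermegauss(n_quad)
    weights = weights / np.sqrt(2.0 * np.pi)   # weights now sum to 1
    f_vals = f(nodes)
    coeffs = []
    for k in range(order + 1):
        basis_k = He.hermeval(nodes, [0.0] * k + [1.0])   # He_k at the nodes
        # E[f * He_k] / E[He_k^2], with E[He_k^2] = k!
        coeffs.append(np.sum(weights * f_vals * basis_k) / math.factorial(k))
    return np.array(coeffs)

def gpc_mean_var(coeffs):
    """Mean is c_0; variance is the sum of c_k^2 * k! over k >= 1."""
    norms = np.array([math.factorial(k) for k in range(len(coeffs))], dtype=float)
    return coeffs[0], np.sum(coeffs[1:] ** 2 * norms[1:])

# Hypothetical quantity of interest with a known closed-form mean and variance.
f = lambda xi: np.exp(0.3 * xi)
coeffs = gpc_coefficients(f, order=6)
gpc_mean, gpc_var = gpc_mean_var(coeffs)

# Monte Carlo reference, as a sanity check on the expansion.
rng = np.random.default_rng(0)
samples = f(rng.standard_normal(200_000))
print(f"gPC mean={gpc_mean:.5f} var={gpc_var:.5f}")
print(f"MC  mean={samples.mean():.5f} var={samples.var():.5f}")
```

With one random variable the quadrature is cheap. The cost of tensor or sparse grids grows with the number of parameters, which is the scaling problem the abstract says se-gPC is designed to avoid.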
Affiliation(s)
- George Papadakis
  - Department of Aeronautics, Imperial College London, London SW7 2AZ, United Kingdom
|
28
|
Dolezal JM, Kochanny S, Dyer E, Ramesh S, Srisuwananukorn A, Sacco M, Howard FM, Li A, Mohan P, Pearson AT. Slideflow: deep learning for digital histopathology with real-time whole-slide visualization. BMC Bioinformatics 2024; 25:134. [PMID: 38539070 PMCID: PMC10967068 DOI: 10.1186/s12859-024-05758-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 03/20/2024] [Indexed: 05/04/2024] Open
Abstract
Deep learning methods have emerged as powerful tools for analyzing histopathological images, but current methods are often specialized for specific domains and software environments, and few open-source options exist for deploying models in an interactive interface. Experimenting with different deep learning approaches typically requires switching software libraries and reprocessing data, reducing the feasibility and practicality of experimenting with new architectures. We developed a flexible deep learning library for histopathology called Slideflow, a package which supports a broad array of deep learning methods for digital pathology and includes a fast whole-slide interface for deploying trained models. Slideflow includes unique tools for whole-slide image data processing, efficient stain normalization and augmentation, weakly-supervised whole-slide classification, uncertainty quantification, feature generation, feature space analysis, and explainability. Whole-slide image processing is highly optimized, enabling whole-slide tile extraction at 40x magnification in 2.5 s per slide. The framework-agnostic data processing pipeline enables rapid experimentation with new methods built with either Tensorflow or PyTorch, and the graphical user interface supports real-time visualization of slides, predictions, heatmaps, and feature space characteristics on a variety of hardware devices, including ARM-based devices such as the Raspberry Pi.
Affiliation(s)
- James M Dolezal
  - Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, USA
- Sara Kochanny
  - Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, USA
- Emma Dyer
  - Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, USA
- Siddhi Ramesh
  - Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, USA
- Andrew Srisuwananukorn
  - Division of Hematology, Department of Internal Medicine, The Ohio State University Comprehensive Cancer Center, Columbus, OH, USA
- Matteo Sacco
  - Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, USA
- Frederick M Howard
  - Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, USA
- Anran Li
  - Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, USA
- Prajval Mohan
  - Department of Computer Science, University of Chicago, Chicago, IL, USA
- Alexander T Pearson
  - Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, USA
| |
29
Richer E, Solano MM, Cheriet F, Lesk MR, Costantino S. Denoising OCT videos based on temporal redundancy. Sci Rep 2024; 14:6605. [PMID: 38503804 PMCID: PMC10951312 DOI: 10.1038/s41598-024-56935-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 03/12/2024] [Indexed: 03/21/2024] Open
Abstract
The identification of eye diseases and their progression often relies on a clear visualization of the anatomy and on different metrics extracted from Optical Coherence Tomography (OCT) B-scans. However, speckle noise hinders the quality of rapid OCT imaging, hampering the extraction and reliability of biomarkers that require time series. By synchronizing the acquisition of OCT images with the timing of the cardiac pulse, we transform a low-quality OCT video into a clear version by phase-wrapping each frame to the heart pulsation and averaging frames that correspond to the same instant in the cardiac cycle. Here, we compare the performance of our one-cycle denoising strategy with a deep-learning architecture, Noise2Noise, as well as classical denoising methods such as BM3D and Non-Local Means (NLM). We systematically analyze different image quality descriptors as well as region-specific metrics to assess the denoising performance based on the anatomy of the eye. The one-cycle method achieves the highest denoising performance, increases image quality and preserves the high-resolution structures within the eye tissues. The proposed workflow can be readily implemented in a clinical setting.
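The one-cycle averaging described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `one_cycle_average`, the `beat_times` input (detected pulse onsets), and the number of phase bins are all hypothetical, and frames are plain nested lists so the example has no dependencies.

```python
def one_cycle_average(frames, timestamps, beat_times, n_bins=4):
    """Average frames that fall in the same phase bin of the cardiac cycle.

    frames:     list of equally sized 2D lists (rows of pixel values)
    timestamps: acquisition time of each frame, in seconds
    beat_times: sorted pulse-onset times in seconds (hypothetical input)
    n_bins:     number of phase bins per cardiac cycle (assumed value)
    """
    sums = [None] * n_bins
    counts = [0] * n_bins
    for frame, t in zip(frames, timestamps):
        # Locate the cardiac cycle [start, end) that contains this frame.
        cycle = next(
            ((s, e) for s, e in zip(beat_times, beat_times[1:]) if s <= t < e),
            None,
        )
        if cycle is None:
            continue  # frame lies outside any complete cycle
        start, end = cycle
        phase = (t - start) / (end - start)  # wrapped phase in [0, 1)
        b = min(int(phase * n_bins), n_bins - 1)
        if sums[b] is None:
            sums[b] = [[0.0] * len(frame[0]) for _ in frame]
        for r, row in enumerate(frame):
            for c, value in enumerate(row):
                sums[b][r][c] += value
        counts[b] += 1
    # One averaged frame per phase bin; None where no frame was observed.
    return [
        [[v / counts[b] for v in row] for row in sums[b]] if counts[b] else None
        for b in range(n_bins)
    ]
```

Frames acquired at the same phase of different heartbeats land in the same bin, so averaging them suppresses uncorrelated speckle while leaving pulsation-driven motion intact, which is the intuition the abstract gives for the method.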
Affiliation(s)
- Emmanuelle Richer
  - Department of Computer Engineering and Software Engineering, École Polytechnique de Montréal, Montreal, QC, H3T 1J4, Canada
  - Maisonneuve-Rosemont Hospital Research Center, Montreal, QC, H1T 2M4, Canada
- Marissé Masís Solano
  - Maisonneuve-Rosemont Hospital Research Center, Montreal, QC, H1T 2M4, Canada
  - Department of Ophthalmology, Université de Montréal, Montreal, QC, H3T 1P1, Canada
- Farida Cheriet
  - Department of Computer Engineering and Software Engineering, École Polytechnique de Montréal, Montreal, QC, H3T 1J4, Canada
- Mark R Lesk
  - Maisonneuve-Rosemont Hospital Research Center, Montreal, QC, H1T 2M4, Canada
  - Department of Ophthalmology, Université de Montréal, Montreal, QC, H3T 1P1, Canada
- Santiago Costantino
  - Maisonneuve-Rosemont Hospital Research Center, Montreal, QC, H1T 2M4, Canada
  - Department of Ophthalmology, Université de Montréal, Montreal, QC, H3T 1P1, Canada
30
Martín Vicario C, Rodríguez Salas D, Maier A, Hock S, Kuramatsu J, Kallmuenzer B, Thamm F, Taubmann O, Ditt H, Schwab S, Dörfler A, Muehlen I. Uncertainty-aware deep learning for trustworthy prediction of long-term outcome after endovascular thrombectomy. Sci Rep 2024; 14:5544. [PMID: 38448445 PMCID: PMC10917742 DOI: 10.1038/s41598-024-55761-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 02/27/2024] [Indexed: 03/08/2024] Open
Abstract
Acute ischemic stroke (AIS) is a leading global cause of mortality and morbidity. Improving long-term outcome predictions after thrombectomy can enhance treatment quality by supporting clinical decision-making. With the advent of interpretable deep learning methods in recent years, it is now possible to develop trustworthy, high-performing prediction models. This study introduces an uncertainty-aware, graph deep learning model that predicts endovascular thrombectomy outcomes using clinical features and imaging biomarkers. The model targets long-term functional outcomes, defined by the three-month modified Rankin Score (mRS), and mortality rates. A sample of 220 AIS patients in the anterior circulation who underwent endovascular thrombectomy (EVT) was included, with 81 (37%) demonstrating good outcomes (mRS ≤ 2). The performance of the different algorithms evaluated was comparable, with the maximum validation area under the curve (AUC) reaching 0.87 using graph convolutional networks (GCN) for mRS prediction and 0.86 using fully connected networks (FCN) for mortality prediction. Moderate performance was obtained at admission (AUC of 0.76 using GCN), which improved to 0.84 post-thrombectomy and to 0.89 a day after stroke. Reliable uncertainty prediction of the model could be demonstrated.
Affiliation(s)
- Celia Martín Vicario
  - Department of Neuroradiology, Friedrich-Alexander University of Erlangen-Nuremberg, University Hospital Erlangen, Erlangen, Germany
  - Pattern Recognition Lab, Friedrich Alexander University, Erlangen, Germany
- Dalia Rodríguez Salas
  - Department of Neuroradiology, Friedrich-Alexander University of Erlangen-Nuremberg, University Hospital Erlangen, Erlangen, Germany
  - Pattern Recognition Lab, Friedrich Alexander University, Erlangen, Germany
- Andreas Maier
  - Pattern Recognition Lab, Friedrich Alexander University, Erlangen, Germany
- Stefan Hock
  - Department of Neuroradiology, Friedrich-Alexander University of Erlangen-Nuremberg, University Hospital Erlangen, Erlangen, Germany
- Joji Kuramatsu
  - Department of Neurology, Friedrich-Alexander University of Erlangen-Nuremberg, University Hospital Erlangen, Erlangen, Germany
- Bernd Kallmuenzer
  - Department of Neurology, Friedrich-Alexander University of Erlangen-Nuremberg, University Hospital Erlangen, Erlangen, Germany
- Stefan Schwab
  - Department of Neurology, Friedrich-Alexander University of Erlangen-Nuremberg, University Hospital Erlangen, Erlangen, Germany
- Arnd Dörfler
  - Department of Neuroradiology, Friedrich-Alexander University of Erlangen-Nuremberg, University Hospital Erlangen, Erlangen, Germany
- Iris Muehlen
  - Department of Neuroradiology, Friedrich-Alexander University of Erlangen-Nuremberg, University Hospital Erlangen, Erlangen, Germany
31
Heremans ERM, Seedat N, Buyse B, Testelmans D, van der Schaar M, De Vos M. U-PASS: An uncertainty-guided deep learning pipeline for automated sleep staging. Comput Biol Med 2024; 171:108205. [PMID: 38401452 DOI: 10.1016/j.compbiomed.2024.108205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 02/16/2024] [Accepted: 02/20/2024] [Indexed: 02/26/2024]
Abstract
With the increasing prevalence of machine learning in critical fields like healthcare, ensuring the safety and reliability of these systems is crucial. Estimating uncertainty plays a vital role in enhancing reliability by identifying areas of high and low confidence and reducing the risk of errors. This study introduces U-PASS, a specialized human-centered machine learning pipeline tailored for clinical applications, which effectively communicates uncertainty to clinical experts and collaborates with them to improve predictions. U-PASS incorporates uncertainty estimation at every stage of the process, including data acquisition, training, and model deployment. Training is divided into a supervised pre-training step and a semi-supervised recording-wise finetuning step. We apply U-PASS to the challenging task of sleep staging and demonstrate that it systematically improves performance at every stage. By optimizing the training dataset, actively seeking feedback from domain experts for informative samples, and deferring the most uncertain samples to experts, U-PASS achieves an impressive expert-level accuracy of 85% on a challenging clinical dataset of elderly sleep apnea patients. This represents a significant improvement over the starting point at 75% accuracy. The largest improvement gain is due to the deferral of uncertain epochs to a sleep expert. U-PASS presents a promising AI approach to incorporating uncertainty estimation in machine learning pipelines, improving their reliability and unlocking their potential in clinical settings.
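The deferral step mentioned in this abstract (handing the most uncertain epochs to an expert) can be illustrated with a short sketch. Everything here is an assumption made for illustration: the function name `defer_most_uncertain`, the fixed `defer_fraction`, and the use of a single scalar uncertainty per epoch are hypothetical and are not taken from U-PASS itself. The sketch only reports accuracy on the epochs the model keeps; how deferred epochs are scored is left open.

```python
def defer_most_uncertain(predictions, uncertainties, labels, defer_fraction):
    """Defer the most uncertain samples and score the remainder.

    predictions:    predicted class per sample (e.g. sleep stage per epoch)
    uncertainties:  one uncertainty score per sample, higher = less certain
    labels:         reference labels, used only to compute accuracy
    defer_fraction: share of samples handed to a human expert (assumed knob)
    Returns (sorted indices of deferred samples, accuracy on kept samples).
    """
    n = len(predictions)
    n_defer = int(round(defer_fraction * n))
    # Rank samples from most to least uncertain.
    order = sorted(range(n), key=lambda i: uncertainties[i], reverse=True)
    deferred = set(order[:n_defer])
    kept = [i for i in range(n) if i not in deferred]
    correct = sum(predictions[i] == labels[i] for i in kept)
    accuracy = correct / len(kept) if kept else float("nan")
    return sorted(deferred), accuracy
```

If the uncertainty score is informative, errors concentrate among the deferred samples, so accuracy on the retained samples rises as `defer_fraction` grows, at the cost of more expert workload.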
Affiliation(s)
- Elisabeth R M Heremans
  - KU Leuven, Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
- Bertien Buyse
  - UZ Leuven, Department of Pneumology, Herestraat 49, B-3000 Leuven, Belgium
- Dries Testelmans
  - UZ Leuven, Department of Pneumology, Herestraat 49, B-3000 Leuven, Belgium
- Maarten De Vos
  - KU Leuven, Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
32
Lv Q, Liu Y, Sun Y, Wu M. Insight into deep learning for glioma IDH medical image analysis: A systematic review. Medicine (Baltimore) 2024; 103:e37150. [PMID: 38363910 PMCID: PMC10869095 DOI: 10.1097/md.0000000000037150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 01/11/2024] [Indexed: 02/18/2024] Open
Abstract
BACKGROUND Deep learning techniques show enormous potential for medical image analysis, particularly in digital pathology. Concurrently, molecular markers have gained increasing significance over the past decade in the context of glioma patients, providing novel insights into diagnosis and more personalized treatment options. Deep learning combined with imaging and molecular analysis enables more accurate prognostication of patients, more accurate treatment plan proposals, and accurate biomarker (IDH) prediction for gliomas. This systematic review examines the development of deep learning techniques for IDH prediction using histopathology images, spanning the period from 2019 to 2023. METHOD The study adhered to the PRISMA reporting requirements, and databases including PubMed, Google Scholar, Google Search, and preprint repositories (such as arXiv) were systematically queried for pertinent literature spanning the period from 2019 to the 30th of 2023. Search phrases related to deep learning, digital pathology, glioma, and IDH were used in combination. RESULTS Fifteen papers meeting the inclusion criteria were included in the analysis. These criteria specifically encompassed studies utilizing deep learning for the analysis of hematoxylin and eosin images to determine the IDH status in patients with gliomas. CONCLUSIONS When predicting IDH status, classifiers built on digital pathology images demonstrate exceptional performance. Predictive performance improves when an appropriate deep learning model is used. However, external validation is necessary to demonstrate their robustness and generalizability. Larger sample sizes and multicenter samples are necessary for more comprehensive research to evaluate performance and confirm clinical advantages.
Affiliation(s)
- Qingqing Lv
  - Hunan Cancer Hospital, The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha 410008, Hunan, China
  - The Key Laboratory of Carcinogenesis of the Chinese Ministry of Health, The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, 410078, Hunan, China
- Yihao Liu
  - Hunan Cancer Hospital, The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha 410008, Hunan, China
  - The Key Laboratory of Carcinogenesis of the Chinese Ministry of Health, The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, 410078, Hunan, China
- Yingnan Sun
  - Hunan Cancer Hospital, The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha 410008, Hunan, China
- Minghua Wu
  - Hunan Cancer Hospital, The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha 410008, Hunan, China
  - The Key Laboratory of Carcinogenesis of the Chinese Ministry of Health, The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, 410078, Hunan, China
33
Tran AT, Zeevi T, Haider SP, Abou Karam G, Berson ER, Tharmaseelan H, Qureshi AI, Sanelli PC, Werring DJ, Malhotra A, Petersen NH, de Havenon A, Falcone GJ, Sheth KN, Payabvash S. Uncertainty-aware deep-learning model for prediction of supratentorial hematoma expansion from admission non-contrast head computed tomography scan. NPJ Digit Med 2024; 7:26. [PMID: 38321131 PMCID: PMC10847454 DOI: 10.1038/s41746-024-01007-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 01/10/2024] [Indexed: 02/08/2024] Open
Abstract
Hematoma expansion (HE) is a modifiable risk factor and a potential treatment target in patients with intracerebral hemorrhage (ICH). We aimed to train and validate deep-learning models for high-confidence prediction of supratentorial ICH expansion, based on admission non-contrast head Computed Tomography (CT). Applying Monte Carlo dropout and entropy of deep-learning model predictions, we estimated the model uncertainty and identified patients at high risk of HE with high confidence. Using the receiver operating characteristics area under the curve (AUC), we compared the deep-learning model prediction performance with multivariable models based on visual markers of HE determined by expert reviewers. We randomly split a multicentric dataset of patients (4-to-1) into training/cross-validation (n = 634) versus test (n = 159) cohorts. We trained and tested separate models for prediction of ≥6 mL and ≥3 mL ICH expansion. The deep-learning models achieved an AUC = 0.81 for high-confidence prediction of HE≥6 mL and AUC = 0.80 for prediction of HE≥3 mL, which were higher than those of the visual marker models (AUC = 0.69 for HE≥6 mL, p = 0.036; AUC = 0.68 for HE≥3 mL, p = 0.043). Our results show that fully automated deep-learning models can identify patients at risk of supratentorial ICH expansion based on admission non-contrast head CT, with high confidence, and more accurately than benchmark visual markers.
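As a rough illustration of how Monte Carlo dropout and entropy can be combined to flag high-confidence predictions, the sketch below takes the positive-class probabilities from several stochastic forward passes and computes the binary entropy of their mean. The function names, the 0.5-bit threshold, and the choice of entropy of the mean prediction are assumptions made for this example; the abstract does not specify the authors' exact formulation.

```python
import math


def predictive_entropy(probs):
    """Return (mean probability, binary entropy in bits) over T stochastic passes.

    probs: positive-class probabilities from T forward passes with dropout
           left active at inference time (here simply a list of floats).
    """
    p = sum(probs) / len(probs)
    if p <= 0.0 or p >= 1.0:
        return p, 0.0  # a fully certain prediction carries zero entropy
    h = -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))
    return p, h


def high_confidence(probs, max_entropy=0.5):
    """Flag a prediction as high-confidence when its entropy is low.

    max_entropy is a hypothetical threshold chosen for illustration only.
    """
    p, h = predictive_entropy(probs)
    return {"mean_prob": p, "entropy": h, "confident": h <= max_entropy}
```

Passes that disagree, or that agree on a probability near 0.5, yield entropy close to 1 bit and are treated as uncertain; passes that agree on a probability near 0 or 1 yield low entropy and are kept as high-confidence predictions.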
Grants
- U24 NS107136 NINDS NIH HHS
- UL1 TR001863 NCATS NIH HHS
- K76 AG059992 NIA NIH HHS
- P30 AG021342 NIA NIH HHS
- R03 NS112859 NINDS NIH HHS
- U24 NS107215 NINDS NIH HHS
- U01 NS106513 NINDS NIH HHS
- 2020097 Doris Duke Charitable Foundation
- T35 HL007649 NHLBI NIH HHS
- K23 NS110980 NINDS NIH HHS
- K23 NS118056 NINDS NIH HHS
- R01 NR018335 NINR NIH HHS
- Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)
- Doris Duke Charitable Foundation (DDCF)
- Doris Duke Charitable Foundation (2020097), American Society of Neuroradiology, and National Institutes of Health (K23NS118056).
- National Institutes of Health (K76AG059992, R03NS112859, and P30AG021342), the American Heart Association (18IDDG34280056), the Yale Pepper Scholar Award, and the Neurocritical Care Society Research Fellowship
- National Institutes of Health (U24NS107136, U24NS107215, R01NR018335, and U01NS106513) and the American Heart Association (18TPA34170180 and 17CSA33550004) and a Hyperfine Research Inc research grant.
Affiliation(s)
- Anh T Tran
  - Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
- Tal Zeevi
  - Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
- Stefan P Haider
  - Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
  - Department of Otorhinolaryngology, University Hospital of Ludwig Maximilians Universität München, Munich, Germany
- Gaby Abou Karam
  - Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
- Elisa R Berson
  - Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
- Hishan Tharmaseelan
  - Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
- Adnan I Qureshi
  - Stroke Institute and Department of Neurology, University of Missouri, Columbia, MO, USA
- Pina C Sanelli
  - Department of Radiology, Northwell Health, Manhasset, NY, USA
- David J Werring
  - Stroke Research Centre, University College London, Queen Square Institute of Neurology, London, UK
- Ajay Malhotra
  - Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
- Nils H Petersen
  - Department of Neurology, Yale School of Medicine, New Haven, CT, USA
- Adam de Havenon
  - Department of Neurology, Yale School of Medicine, New Haven, CT, USA
- Guido J Falcone
  - Department of Neurology, Yale School of Medicine, New Haven, CT, USA
- Kevin N Sheth
  - Department of Neurology, Yale School of Medicine, New Haven, CT, USA
- Seyedmehdi Payabvash
  - Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
34
Yue G, Zhuo G, Yan W, Zhou T, Tang C, Yang P, Wang T. Boundary uncertainty aware network for automated polyp segmentation. Neural Netw 2024; 170:390-404. [PMID: 38029720 DOI: 10.1016/j.neunet.2023.11.050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 07/15/2023] [Accepted: 11/22/2023] [Indexed: 12/01/2023]
Abstract
Recently, leveraging deep neural networks for automated colorectal polyp segmentation has emerged as a hot topic because it avoids the limitations of visual inspection, e.g., overwork and subjectivity. However, most existing methods do not pay enough attention to the uncertain areas of colonoscopy images and often provide unsatisfactory segmentation performance. In this paper, we propose a novel boundary uncertainty aware network (BUNet) for precise and robust colorectal polyp segmentation. Specifically, considering that polyps vary greatly in size and shape, we first adopt a pyramid vision transformer encoder to learn multi-scale feature representations. Then, a simple yet effective boundary exploration module (BEM) is proposed to explore boundary cues from the low-level features. To make the network focus on the ambiguous area where the prediction score is biased toward neither the foreground nor the background, we further introduce a boundary uncertainty aware module (BUM) that explores error-prone regions from the high-level features with the assistance of boundary cues provided by the BEM. Through top-down hybrid deep supervision, our BUNet implements coarse-to-fine polyp segmentation and finally localizes polyp regions precisely. Extensive experiments on five public datasets show that BUNet is superior to thirteen competing methods in terms of both effectiveness and generalization ability.
Affiliation(s)
- Guanghui Yue
  - National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Marshall Laboratory of Biomedical Engineering, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen 518060, China
- Guibin Zhuo
  - National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Marshall Laboratory of Biomedical Engineering, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen 518060, China
- Weiqing Yan
  - School of Computer and Control Engineering, Yantai University, Yantai 264005, China
- Tianwei Zhou
  - College of Management, Shenzhen University, Shenzhen 518060, China
- Chang Tang
  - School of Computer Science, China University of Geosciences, Wuhan 430074, China
- Peng Yang
  - National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Marshall Laboratory of Biomedical Engineering, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen 518060, China
- Tianfu Wang
  - National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Marshall Laboratory of Biomedical Engineering, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen 518060, China
35
Chanda T, Hauser K, Hobelsberger S, Bucher TC, Garcia CN, Wies C, Kittler H, Tschandl P, Navarrete-Dechent C, Podlipnik S, Chousakos E, Crnaric I, Majstorovic J, Alhajwan L, Foreman T, Peternel S, Sarap S, Özdemir İ, Barnhill RL, Llamas-Velasco M, Poch G, Korsing S, Sondermann W, Gellrich FF, Heppt MV, Erdmann M, Haferkamp S, Drexler K, Goebeler M, Schilling B, Utikal JS, Ghoreschi K, Fröhling S, Krieghoff-Henning E, Brinker TJ. Dermatologist-like explainable AI enhances trust and confidence in diagnosing melanoma. Nat Commun 2024; 15:524. [PMID: 38225244 PMCID: PMC10789736 DOI: 10.1038/s41467-023-43095-4] [Citation(s) in RCA: 29] [Impact Index Per Article: 29.0] [Received: 03/24/2023] [Accepted: 10/31/2023] [Indexed: 01/17/2024] Open
Abstract
Artificial intelligence (AI) systems have been shown to help dermatologists diagnose melanoma more accurately; however, they lack transparency, hindering user acceptance. Explainable AI (XAI) methods can help to increase transparency, yet often lack precise, domain-specific explanations. Moreover, the impact of XAI methods on dermatologists' decisions has not yet been evaluated. Building upon previous research, we introduce an XAI system that provides precise and domain-specific explanations alongside its differential diagnoses of melanomas and nevi. Through a three-phase study, we assess its impact on dermatologists' diagnostic accuracy, diagnostic confidence, and trust in the XAI support. Our results show strong alignment between XAI and dermatologist explanations. We also show that dermatologists' confidence in their diagnoses and their trust in the support system significantly increase with XAI compared to conventional AI. This study highlights dermatologists' willingness to adopt such XAI systems, promoting future use in the clinic.
Affiliation(s)
- Tirtha Chanda
  - Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Katja Hauser
  - Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Sarah Hobelsberger
  - Department of Dermatology, University Hospital, Technical University Dresden, Dresden, Germany
- Tabea-Clara Bucher
  - Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Carina Nogueira Garcia
  - Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Christoph Wies
  - Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Medical Faculty of University Heidelberg, Heidelberg, Germany
- Harald Kittler
  - Department of Dermatology, Medical University of Vienna, Vienna, Austria
- Philipp Tschandl
  - Department of Dermatology, Medical University of Vienna, Vienna, Austria
- Cristian Navarrete-Dechent
  - Department of Dermatology, Escuela de Medicina, Pontificia Universidad Católica de Chile, Santiago, Chile
- Sebastian Podlipnik
  - Dermatology Department, Hospital Clínic of Barcelona, University of Barcelona, IDIBAPS, Barcelona, Spain
- Emmanouil Chousakos
  - 1st Department of Pathology, Medical School, National & Kapodistrian University of Athens, Athens, Greece
- Iva Crnaric
  - Department of Dermatovenereology, Sestre milosrdnice University Hospital Center, Zagreb, Croatia
- Linda Alhajwan
  - Department of Dermatology, Dubai London Clinic, Dubai, United Arab Emirates
- Sandra Peternel
  - Department of Dermatovenereology, Clinical Hospital Center Rijeka, Faculty of Medicine, University of Rijeka, Rijeka, Croatia
- İrem Özdemir
  - Department of Dermatology, Faculty of Medicine, Gazi University, Ankara, Turkey
- Raymond L Barnhill
  - Department of Translational Research, Institut Curie, Unit of Formation and Research of Medicine University of Paris, Paris, France
- Gabriela Poch
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Department of Dermatology, Venereology and Allergology, Berlin, Germany
- Sören Korsing
  - Department of Dermatology, University Hospital Essen, University Duisburg-Essen, Essen, Germany
- Wiebke Sondermann
  - Department of Dermatology, Uniklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Markus V Heppt
  - Department of Dermatology, Uniklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Michael Erdmann
  - Department of Dermatology, Uniklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Sebastian Haferkamp
  - Department of Dermatology, University Hospital Regensburg, Regensburg, Germany
- Konstantin Drexler
  - Department of Dermatology, University Hospital Regensburg, Regensburg, Germany
- Matthias Goebeler
  - Department of Dermatology, Venereology and Allergology, University Hospital Würzburg, Würzburg, Germany
- Bastian Schilling
  - Department of Dermatology, Venereology and Allergology, University Hospital Würzburg, Würzburg, Germany
- Jochen S Utikal
  - Department of Dermatology, Venereology and Allergology, University Medical Center Mannheim, Ruprecht-Karl University of Heidelberg, Mannheim, Germany
- Kamran Ghoreschi
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Department of Dermatology, Venereology and Allergology, Berlin, Germany
- Stefan Fröhling
  - Division of Translational Medical Oncology, National Center for Tumor Diseases (NCT) Heidelberg and German Cancer Research Center (DKFZ), Heidelberg, Germany
- Eva Krieghoff-Henning
  - Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Titus J Brinker
  - Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
36
Tomczak A, Ilic S, Marquardt G, Engel T, Navab N, Albarqouni S. Digital Staining of White Blood Cells With Confidence Estimation. IEEE Transactions on Medical Imaging 2023; 42:3895-3906. [PMID: 37698963 DOI: 10.1109/tmi.2023.3314695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/14/2023]
Abstract
Chemical staining of blood smears is one of the crucial components of blood analysis. It is an expensive, lengthy and sensitive process, often prone to producing slight variations in colour and visible structures due to a lack of unified protocols across laboratories. Even though current developments in deep generative modeling offer an opportunity to replace the chemical process with a digital one, there are specific safety-ensuring requirements due to the severe consequences of mistakes in a medical setting. Therefore, a digital staining system would profit from an additional confidence estimation quantifying the quality of the digitally stained white blood cell. To this aim, during the staining generation, we disentangle the latent space of the Generative Adversarial Network, obtaining separate representations of the white blood cell and the staining. We estimate the generated image's confidence of white blood cell structure and staining quality by corrupting these representations with noise and quantifying the information retained between multiple outputs. We show that confidence estimated in this way correlates with image quality measured in terms of LPIPS values calculated for the generated and ground truth stained images. We validate our method by performing digital staining of images captured with a Differential Interference Contrast microscope on a dataset composed of white blood cells of 24 patients. The high absolute value of the correlation between our confidence score and LPIPS demonstrates the effectiveness of our method, opening the possibility of predicting the quality of generated output and ensuring trustworthiness in medical safety-critical setups.
37
Brahma S, Kolbitsch C, Martin J, Schaeffter T, Kofler A. Data-efficient Bayesian learning for radial dynamic MR reconstruction. Med Phys 2023; 50:6955-6977. [PMID: 37367947 DOI: 10.1002/mp.16543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 04/07/2023] [Accepted: 05/20/2023] [Indexed: 06/28/2023] Open
Abstract
BACKGROUND Cardiac MRI has become the gold-standard imaging technique for assessing cardiovascular morphology and function. In spite of this, its slow data acquisition process presents imaging challenges due to the motion from heartbeats, respiration, and blood flow. In recent studies, deep learning (DL) algorithms have shown promising results for the task of image reconstruction. However, there have been instances where they have introduced artifacts that may be misinterpreted as pathologies or may obscure the detection of pathologies. Therefore, it is important to obtain a metric, such as the uncertainty of the network output, that identifies such artifacts. However, this can be quite challenging for large-scale image reconstruction problems such as dynamic multi-coil non-Cartesian MRI. PURPOSE To efficiently quantify uncertainties of a physics-informed DL-based image reconstruction method for a large-scale accelerated 2D multi-coil dynamic radial MRI reconstruction problem, and demonstrate the benefits of physics-informed DL over model-agnostic DL in reducing uncertainties while at the same time improving image quality. METHODS We extended a recently proposed physics-informed 2D U-Net that learns spatio-temporal slices (named XT-YT U-Net), and employed it for the task of uncertainty quantification (UQ) by using Monte Carlo dropout and a Gaussian negative log-likelihood loss function. Our data comprised 2D dynamic MR images acquired with a radial balanced steady-state free precession sequence. The XT-YT U-Net, which allows for training with a limited amount of data, was trained and validated on a dataset of 15 healthy volunteers, and further tested on data from four patients. An extensive comparison between physics-informed and model-agnostic neural networks (NNs) concerning the obtained image quality and uncertainty estimates was performed. Further, we employed calibration plots to assess the quality of the UQ. 
RESULTS The inclusion of the MR-physics model of data acquisition as a building block in the NN architecture led to higher image quality (NRMSE: -33 ± 8.2%, PSNR: 6.3 ± 1.3%, and SSIM: 1.9 ± 0.96%), lower uncertainties (-46 ± 8.7%), and, based on the calibration plots, an improved UQ compared to its model-agnostic counterpart. Furthermore, the UQ information can be used to differentiate between anatomical structures (e.g., coronary arteries, ventricle boundaries) and artifacts. CONCLUSIONS Using an XT-YT U-Net, we were able to quantify uncertainties of a physics-informed NN for a high-dimensional and computationally demanding 2D multi-coil dynamic MR imaging problem. In addition to improving the image quality, embedding the acquisition model in the network architecture decreased the reconstruction uncertainties as well as quantitatively improved the UQ. The UQ provides additional information to assess the performance of different network approaches.
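The abstract above names two ingredients, Monte Carlo dropout and a Gaussian negative log-likelihood loss. The following is a minimal sketch of how these are commonly combined for a single pixel, not the authors' implementation. The function names `gaussian_nll` and `mc_dropout_summary` are hypothetical, and the split into an epistemic part (spread of the predicted means across stochastic passes) and an aleatoric part (average predicted variance) is a widely used convention that the abstract does not confirm.

```python
import math
from statistics import fmean, pvariance


def gaussian_nll(target, mean, log_var):
    """Gaussian negative log-likelihood of one value.

    The network is assumed to predict a mean and a log-variance;
    predicting the log keeps the variance positive.
    """
    return 0.5 * (math.log(2 * math.pi) + log_var
                  + (target - mean) ** 2 / math.exp(log_var))


def mc_dropout_summary(means, log_vars):
    """Summarise T stochastic forward passes (dropout kept active) for one pixel.

    Returns (prediction, epistemic, aleatoric, total) under the assumed
    convention: epistemic = variance of the means across passes,
    aleatoric = average of the predicted variances.
    """
    prediction = fmean(means)
    epistemic = pvariance(means)
    aleatoric = fmean([math.exp(lv) for lv in log_vars])
    return prediction, epistemic, aleatoric, epistemic + aleatoric


if __name__ == "__main__":
    # Two hypothetical passes for one pixel, each predicting unit variance.
    print(mc_dropout_summary([1.0, 3.0], [0.0, 0.0]))
```

In practice the same summary would be computed for every pixel of the reconstructed image, giving an uncertainty map alongside the reconstruction.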
Affiliation(s)
- Sherine Brahma
  - Physikalisch-Technische Bundesanstalt (PTB), Braunschweig and Berlin, Germany
  - Department of Radiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Christoph Kolbitsch
  - Physikalisch-Technische Bundesanstalt (PTB), Braunschweig and Berlin, Germany
  - School of Imaging Sciences and Biomedical Engineering, King's College London, London, UK
- Joerg Martin
  - Physikalisch-Technische Bundesanstalt (PTB), Braunschweig and Berlin, Germany
- Tobias Schaeffter
  - Physikalisch-Technische Bundesanstalt (PTB), Braunschweig and Berlin, Germany
  - School of Imaging Sciences and Biomedical Engineering, King's College London, London, UK
  - Department of Medical Engineering, Technical University of Berlin, Berlin, Germany
- Andreas Kofler
  - Physikalisch-Technische Bundesanstalt (PTB), Braunschweig and Berlin, Germany
38
Yan X, Su Y, Ma W. Ensemble Multi-Quantiles: Adaptively Flexible Distribution Prediction for Uncertainty Quantification. IEEE Trans Pattern Anal Mach Intell 2023; 45:13068-13082. [PMID: 37339037 DOI: 10.1109/tpami.2023.3288028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/22/2023]
Abstract
We propose a novel, succinct, and effective approach for distribution prediction to quantify uncertainty in machine learning. It incorporates adaptively flexible distribution prediction of [Formula: see text] in regression tasks. The quantiles of this conditional distribution, at probability levels spread over the interval (0,1), are boosted by additive models that we designed with intuition and interpretability in mind. We seek an adaptive balance between structural integrity and flexibility for [Formula: see text]: a Gaussian assumption lacks the flexibility that real data require, whereas highly flexible approaches (e.g., estimating the quantiles separately without a distribution structure) inevitably have drawbacks and may not generalize well. The proposed ensemble multi-quantiles approach, called EMQ, is entirely data-driven and can gradually depart from the Gaussian and discover the optimal conditional distribution during boosting. On extensive regression tasks from UCI datasets, we show that EMQ achieves state-of-the-art performance compared to many recent uncertainty quantification methods. Visualization results further illustrate the necessity and the merits of such an ensemble model.
39
Jiang Y, Wang C, Zhou S. Artificial intelligence-based risk stratification, accurate diagnosis and treatment prediction in gynecologic oncology. Semin Cancer Biol 2023; 96:82-99. [PMID: 37783319 DOI: 10.1016/j.semcancer.2023.09.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2022] [Revised: 08/27/2023] [Accepted: 09/25/2023] [Indexed: 10/04/2023]
Abstract
As a data-driven science, artificial intelligence (AI) has paved a promising path toward an evolving health system teeming with thrilling opportunities for precision oncology. Notwithstanding the tremendous success of oncological AI in such fields as lung carcinoma, breast tumor and brain malignancy, less attention has been devoted to investigating the influence of AI on gynecologic oncology. This review sheds light on the ever-increasing contribution of state-of-the-art AI techniques to the refined risk stratification and whole-course management of patients with gynecologic tumors, in particular cervical, ovarian and endometrial cancer, centering on information and features extracted from clinical data (electronic health records), cancer imaging including radiological imaging, colposcopic images, cytological and histopathological digital images, and molecular profiling (genomics, transcriptomics, metabolomics and so forth). However, there are still noteworthy challenges beyond performance validation. Thus, this work further describes the limitations and challenges faced in the real-world implementation of AI models, as well as potential solutions to address these issues.
Affiliation(s)
- Yuting Jiang
  - Department of Obstetrics and Gynecology, Key Laboratory of Birth Defects and Related Diseases of Women and Children of MOE and State Key Laboratory of Biotherapy, West China Second Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan 610041, China
  - Department of Pulmonary and Critical Care Medicine, State Key Laboratory of Respiratory Health and Multimorbidity, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Chengdi Wang
  - Department of Obstetrics and Gynecology, Key Laboratory of Birth Defects and Related Diseases of Women and Children of MOE and State Key Laboratory of Biotherapy, West China Second Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan 610041, China
  - Department of Pulmonary and Critical Care Medicine, State Key Laboratory of Respiratory Health and Multimorbidity, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Shengtao Zhou
  - Department of Obstetrics and Gynecology, Key Laboratory of Birth Defects and Related Diseases of Women and Children of MOE and State Key Laboratory of Biotherapy, West China Second Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan 610041, China
  - Department of Pulmonary and Critical Care Medicine, State Key Laboratory of Respiratory Health and Multimorbidity, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
40
Seoni S, Jahmunah V, Salvi M, Barua PD, Molinari F, Acharya UR. Application of uncertainty quantification to artificial intelligence in healthcare: A review of last decade (2013-2023). Comput Biol Med 2023; 165:107441. [PMID: 37683529 DOI: 10.1016/j.compbiomed.2023.107441] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 08/04/2023] [Revised: 08/27/2023] [Accepted: 08/29/2023] [Indexed: 09/10/2023]
Abstract
Uncertainty estimation in healthcare involves quantifying and understanding the inherent uncertainty or variability associated with medical predictions, diagnoses, and treatment outcomes. In this era of Artificial Intelligence (AI) models, uncertainty estimation becomes vital to ensure safe decision-making in the medical field. Therefore, this review focuses on the application of uncertainty techniques to machine and deep learning models in healthcare. A systematic literature review was conducted using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Our analysis revealed that Bayesian methods were the predominant technique for uncertainty quantification in machine learning models, with Fuzzy systems being the second most used approach. Regarding deep learning models, Bayesian methods emerged as the most prevalent approach, finding application in nearly all aspects of medical imaging. Most of the studies reported in this paper focused on medical images, highlighting the prevalent application of uncertainty quantification techniques using deep learning models compared to machine learning models. Interestingly, we observed a scarcity of studies applying uncertainty quantification to physiological signals. Thus, future research on uncertainty quantification should prioritize investigating the application of these techniques to physiological signals. Overall, our review highlights the significance of integrating uncertainty techniques in healthcare applications of machine learning and deep learning models. This can provide valuable insights and practical solutions to manage uncertainty in real-world medical data, ultimately improving the accuracy and reliability of medical diagnoses and treatment recommendations.
Affiliation(s)
- Silvia Seoni
  - Biolab, PolitoBIOMedLab, Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
- Massimo Salvi
  - Biolab, PolitoBIOMedLab, Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
- Prabal Datta Barua
  - School of Business (Information System), University of Southern Queensland, Toowoomba, QLD, 4350, Australia
  - Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, 2007, Australia
- Filippo Molinari
  - Biolab, PolitoBIOMedLab, Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
- U Rajendra Acharya
  - School of Mathematics, Physics and Computing, University of Southern Queensland, Springfield, Australia
41
Mehrtens HA, Kurz A, Bucher TC, Brinker TJ. Benchmarking common uncertainty estimation methods with histopathological images under domain shift and label noise. Med Image Anal 2023; 89:102914. [PMID: 37544085 DOI: 10.1016/j.media.2023.102914] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/22/2022] [Revised: 05/17/2023] [Accepted: 07/25/2023] [Indexed: 08/08/2023]
Abstract
In the past years, deep learning has seen an increase in usage in the domain of histopathological applications. However, while these approaches have shown great potential, in high-risk environments deep learning models need to be able to judge their uncertainty and be able to reject inputs when there is a significant chance of misclassification. In this work, we conduct a rigorous evaluation of the most commonly used uncertainty and robustness methods for the classification of Whole Slide Images, with a focus on the task of selective classification, where the model should reject the classification in situations in which it is uncertain. We conduct our experiments on tile-level under the aspects of domain shift and label noise, as well as on slide-level. In our experiments, we compare Deep Ensembles, Monte-Carlo Dropout, Stochastic Variational Inference, Test-Time Data Augmentation as well as ensembles of the latter approaches. We observe that ensembles of methods generally lead to better uncertainty estimates as well as an increased robustness towards domain shifts and label noise, while contrary to results from classical computer vision benchmarks no systematic gain of the other methods can be shown. Across methods, a rejection of the most uncertain samples reliably leads to a significant increase in classification accuracy on both in-distribution as well as out-of-distribution data. Furthermore, we conduct experiments comparing these methods under varying conditions of label noise. Lastly, we publish our code framework to facilitate further research on uncertainty estimation on histopathological data.
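Selective classification, as described in the abstract above, can be illustrated with a small sketch: rank samples by confidence, reject the least confident ones, and measure accuracy on what remains. This is an assumed, generic formulation rather than the benchmark's published code. The function name `selective_accuracy`, the `coverage` parameter, and the toy values are hypothetical.

```python
def selective_accuracy(confidences, correct, coverage):
    """Accuracy on the `coverage` fraction of samples with the highest confidence.

    confidences: one confidence score per sample (higher = more certain)
    correct:     one boolean per sample, True if the prediction was right
    coverage:    fraction of samples the model is allowed to answer, in (0, 1]
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    # Most confident samples first; the tail is the rejected set.
    order = sorted(range(len(confidences)),
                   key=lambda i: confidences[i], reverse=True)
    n_keep = max(1, round(coverage * len(order)))
    kept = order[:n_keep]
    return sum(bool(correct[i]) for i in kept) / n_keep


if __name__ == "__main__":
    conf = [0.9, 0.6, 0.8, 0.55]          # toy confidence scores
    hit = [True, False, True, True]       # toy correctness flags
    for cov in (1.0, 0.75, 0.5):
        print(cov, selective_accuracy(conf, hit, cov))
```

Sweeping `coverage` from 1.0 downwards traces an accuracy-versus-rejection curve; a useful uncertainty estimate should make accuracy rise as more samples are rejected.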
Affiliation(s)
- Hendrik A Mehrtens
  - Division of Digital Biomarkers for Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Alexander Kurz
  - Division of Digital Biomarkers for Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Tabea-Clara Bucher
  - Division of Digital Biomarkers for Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Titus J Brinker
  - Division of Digital Biomarkers for Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
42
Jafari M, Shoeibi A, Khodatars M, Bagherzadeh S, Shalbaf A, García DL, Gorriz JM, Acharya UR. Emotion recognition in EEG signals using deep learning methods: A review. Comput Biol Med 2023; 165:107450. [PMID: 37708717 DOI: 10.1016/j.compbiomed.2023.107450] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 04/07/2023] [Revised: 08/03/2023] [Accepted: 09/01/2023] [Indexed: 09/16/2023]
Abstract
Emotions are a critical aspect of daily life and serve a crucial role in human decision-making, planning, reasoning, and other mental states. As a result, they are considered a significant factor in human interactions. Human emotions can be identified through various sources, such as facial expressions, speech, behavior (gesture/position), or physiological signals. The use of physiological signals can enhance the objectivity and reliability of emotion detection. Compared with peripheral physiological signals, electroencephalogram (EEG) recordings are directly generated by the central nervous system and are closely related to human emotions. EEG signals have great spatial resolution that facilitates the evaluation of brain functions, making them a popular modality in emotion recognition studies. Emotion recognition using EEG signals presents several challenges, including signal variability due to electrode positioning, individual differences in signal morphology, and lack of a universal standard for EEG signal processing. Moreover, identifying the appropriate features for emotion recognition from EEG data requires further research. Finally, there is a need to develop more robust artificial intelligence (AI) methods, including conventional machine learning (ML) and deep learning (DL) methods, to handle the complex and diverse EEG signals associated with emotional states. This paper examines the application of DL techniques in emotion recognition from EEG signals and provides a detailed discussion of relevant articles. The paper explores the significant challenges in emotion recognition using EEG signals, highlights the potential of DL techniques in addressing these challenges, and suggests the scope for future research in emotion recognition using DL techniques. The paper concludes with a summary of its findings.
Affiliation(s)
- Mahboobeh Jafari
  - Data Science and Computational Intelligence Institute, University of Granada, Spain
- Afshin Shoeibi
  - Data Science and Computational Intelligence Institute, University of Granada, Spain
- Marjane Khodatars
  - Data Science and Computational Intelligence Institute, University of Granada, Spain
- Sara Bagherzadeh
  - Department of Biomedical Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Ahmad Shalbaf
  - Department of Biomedical Engineering and Medical Physics, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- David López García
  - Data Science and Computational Intelligence Institute, University of Granada, Spain
- Juan M Gorriz
  - Data Science and Computational Intelligence Institute, University of Granada, Spain
  - Department of Psychiatry, University of Cambridge, UK
- U Rajendra Acharya
  - School of Mathematics, Physics and Computing, University of Southern Queensland, Springfield, Australia
43
Chen W, Li C, Chen D, Luo X. A knowledge-based learning framework for self-supervised pre-training towards enhanced recognition of biomedical microscopy images. Neural Netw 2023; 167:810-826. [PMID: 37738716 DOI: 10.1016/j.neunet.2023.09.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2023] [Revised: 07/05/2023] [Accepted: 09/01/2023] [Indexed: 09/24/2023]
Abstract
Self-supervised pre-training has become the preferred choice for establishing reliable neural networks for automated recognition of massive biomedical microscopy images, which are routinely annotation-free, without semantics, and without guarantee of quality. This paradigm is still in its infancy and is limited by two closely related open issues: (1) how to learn robust representations in an unsupervised manner from unlabeled biomedical microscopy images with low sample diversity, and (2) how to obtain the most significant representations demanded by a high-quality segmentation. To address these issues, this study proposes a knowledge-based learning framework (TOWER) towards enhanced recognition of biomedical microscopy images, which works in three phases by synergizing contrastive learning and generative learning methods: (1) Sample Space Diversification: Reconstructive proxy tasks have been enabled to embed a priori knowledge with context highlighted to diversify the expanded sample space; (2) Enhanced Representation Learning: Informative noise-contrastive estimation loss regularizes the encoder to enhance representation learning of annotation-free images; (3) Correlated Optimization: Optimization operations in pre-training the encoder and the decoder have been correlated via image restoration from proxy tasks, targeting the need for semantic segmentation. Experiments have been conducted on public datasets of biomedical microscopy images against state-of-the-art counterparts (e.g., SimCLR and BYOL), and the results demonstrate that TOWER statistically outperforms all compared self-supervised methods, achieving a Dice improvement of 1.38 percentage points over SimCLR. TOWER also has potential in multi-modality medical image analysis and enables label-efficient semi-supervised learning, e.g., reducing the annotation cost by up to 99% in pathological classification.
Affiliation(s)
- Wei Chen
  - School of Computer, National University of Defense Technology, Changsha 410073, China
- Chen Li
  - School of Computer, National University of Defense Technology, Changsha 410073, China
- Dan Chen
  - School of Computer Science, Wuhan University, Wuhan 430072, China
- Xin Luo
  - School of Computer, National University of Defense Technology, Changsha 410073, China
44
Pepe A, Egger J, Codari M, Willemink MJ, Gsaxner C, Li J, Roth PM, Schmalstieg D, Mistelbauer G, Fleischmann D. Automated cross-sectional view selection in CT angiography of aortic dissections with uncertainty awareness and retrospective clinical annotations. Comput Biol Med 2023; 165:107365. [PMID: 37647783 DOI: 10.1016/j.compbiomed.2023.107365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 07/20/2023] [Accepted: 08/12/2023] [Indexed: 09/01/2023]
Abstract
Surveillance imaging of patients with chronic aortic diseases, such as aneurysms and dissections, relies on obtaining and comparing cross-sectional diameter measurements along the aorta at predefined aortic landmarks, over time. The orientation of the cross-sectional measuring planes at each landmark is currently defined manually by highly trained operators. Centerline-based approaches are unreliable in patients with chronic aortic dissection, because of the asymmetric flow channels, differences in contrast opacification, and presence of mural thrombus, making centerline computations or measurements difficult to generate and reproduce. In this work, we present three alternative approaches - INS, MCDS, MCDbS - based on convolutional neural networks and uncertainty quantification methods to predict the orientation (ϕ,θ) of such cross-sectional planes. For the monitoring of chronic aortic dissections, we show how a dataset of 162 CTA volumes with overall 3273 imperfect manual annotations routinely collected in a clinic can be efficiently used to accomplish this task, despite the presence of non-negligible interoperator variabilities in terms of mean absolute error (MAE) and 95% limits of agreement (LOA). We show how, despite the large limits of agreement in the training data, the trained model provides faster and more reproducible results than either an expert user or a centerline method. The remaining disagreement lies within the variability produced by three independent expert annotators and matches the current state of the art, providing a similar error, but in a fraction of the time.
Affiliation(s)
- Antonio Pepe
  - Graz University of Technology, Institute of Computer Graphics and Vision, Inffeldgasse 16/II, 8010 Graz, Austria
  - Stanford University, School of Medicine, 3D and Quantitative Imaging Lab, 300 Pasteur Drive Stanford, CA 94305, USA
  - Computer Algorithms for Médicine (Café) Laboratory, Graz, Austria
- Jan Egger
  - Computer Algorithms for Médicine (Café) Laboratory, Graz, Austria
  - University Medicine Essen, Institute for AI in Medicine (IKIM), Girardetstraße 2, 45131 Essen, Germany
- Marina Codari
  - Stanford University, School of Medicine, 3D and Quantitative Imaging Lab, 300 Pasteur Drive Stanford, CA 94305, USA
- Martin J Willemink
  - Stanford University, School of Medicine, 3D and Quantitative Imaging Lab, 300 Pasteur Drive Stanford, CA 94305, USA
- Christina Gsaxner
  - Graz University of Technology, Institute of Computer Graphics and Vision, Inffeldgasse 16/II, 8010 Graz, Austria
  - Computer Algorithms for Médicine (Café) Laboratory, Graz, Austria
- Jianning Li
  - Computer Algorithms for Médicine (Café) Laboratory, Graz, Austria
  - University Medicine Essen, Institute for AI in Medicine (IKIM), Girardetstraße 2, 45131 Essen, Germany
- Peter M Roth
  - Graz University of Technology, Institute of Computer Graphics and Vision, Inffeldgasse 16/II, 8010 Graz, Austria
- Dieter Schmalstieg
  - Graz University of Technology, Institute of Computer Graphics and Vision, Inffeldgasse 16/II, 8010 Graz, Austria
- Gabriel Mistelbauer
  - Stanford University, School of Medicine, 3D and Quantitative Imaging Lab, 300 Pasteur Drive Stanford, CA 94305, USA
- Dominik Fleischmann
  - Stanford University, School of Medicine, 3D and Quantitative Imaging Lab, 300 Pasteur Drive Stanford, CA 94305, USA
45
Nashwan AJ. A New Era in Cardiometabolic Management: Unlocking the Potential of Artificial Intelligence for Improved Patient Outcomes. Endocr Pract 2023; 29:743-745. [PMID: 37328103 DOI: 10.1016/j.eprac.2023.06.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/04/2023] [Revised: 06/03/2023] [Accepted: 06/07/2023] [Indexed: 06/18/2023]
Affiliation(s)
- Abdulqadir J Nashwan
  - Nursing Department, Hamad Medical Corporation, Doha, Qatar
  - Department of Public Health, College of Health Sciences, QU Health, Qatar University, Doha, Qatar
46
Zhou T, Zhu S. Uncertainty quantification and attention-aware fusion guided multi-modal MR brain tumor segmentation. Comput Biol Med 2023; 163:107142. [PMID: 37331100 DOI: 10.1016/j.compbiomed.2023.107142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Revised: 05/17/2023] [Accepted: 06/05/2023] [Indexed: 06/20/2023]
Abstract
Brain tumor is one of the most aggressive cancers in the world, and accurate brain tumor segmentation plays a critical role in clinical diagnosis and treatment planning. Although deep learning models have presented remarkable success in medical segmentation, they can only obtain the segmentation map without capturing the segmentation uncertainty. To achieve accurate and safe clinical results, it is necessary to produce extra uncertainty maps to assist the subsequent segmentation revision. To this end, we propose to exploit uncertainty quantification in the deep learning model and apply it to multi-modal brain tumor segmentation. In addition, we develop an effective attention-aware multi-modal fusion method to learn the complementary feature information from the multiple MR modalities. First, a multi-encoder-based 3D U-Net is proposed to obtain the initial segmentation results. Then, an estimated Bayesian model is presented to measure the uncertainty of the initial segmentation results. Finally, the obtained uncertainty maps are integrated into a deep learning-based segmentation network, serving as an additional constraint to further refine the segmentation results. The proposed network is evaluated on the publicly available BraTS 2018 and BraTS 2019 datasets. The experimental results demonstrate that the proposed method outperforms the previous state-of-the-art methods on Dice score, Hausdorff distance and Sensitivity metrics. Furthermore, the proposed components could be easily applied to other network architectures and other computer vision fields.
Affiliation(s)
- Tongxue Zhou
  - School of Information Science and Technology, Hangzhou Normal University, Hangzhou 311121, China
- Shan Zhu
  - School of Life and Environmental Science, Hangzhou Normal University, Hangzhou, 311121, China
47
Park J, Lee K, Park N, You SC, Ko J. Self-Attention LSTM-FCN model for arrhythmia classification and uncertainty assessment. Artif Intell Med 2023; 142:102570. [PMID: 37316094 DOI: 10.1016/j.artmed.2023.102570] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/20/2022] [Revised: 04/09/2023] [Accepted: 04/27/2023] [Indexed: 06/16/2023]
Abstract
This paper presents ArrhyMon, a self-attention-based LSTM-FCN model for arrhythmia classification from ECG signal inputs. ArrhyMon aims to detect and classify six different types of arrhythmia apart from normal ECG patterns. To the best of our knowledge, ArrhyMon is the first end-to-end classification model that successfully targets the classification of six detailed arrhythmia types and, compared to previous work, does not require additional preprocessing and/or feature extraction operations separate from the classification model. ArrhyMon's deep learning model is designed to capture and exploit both global and local features embedded in ECG sequences by integrating fully convolutional network (FCN) layers and a self-attention-based long short-term memory (LSTM) architecture. Moreover, to enhance its practicality, ArrhyMon incorporates a deep ensemble-based uncertainty model that generates a confidence-level measure for each classification result. We evaluate ArrhyMon's effectiveness using three publicly available arrhythmia datasets (i.e., MIT-BIH, Physionet Cardiology Challenge 2017 and 2020/2021) to show that ArrhyMon achieves state-of-the-art classification performance (average accuracy 99.63%), and that its confidence measures correlate closely with subjective diagnoses made by practitioners.
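A deep-ensemble confidence measure of the kind mentioned in the abstract above is often obtained by averaging the class probabilities of several independently trained models. The sketch below shows that generic recipe under stated assumptions; it is not ArrhyMon's published uncertainty model. The function name `ensemble_confidence` and the use of predictive entropy as a secondary uncertainty score are hypothetical choices.

```python
import math


def ensemble_confidence(member_probs):
    """Combine class probabilities from several ensemble members.

    member_probs: list of per-model probability lists, all of equal length.
    Returns (label, confidence, entropy), where confidence is the averaged
    probability of the winning class and entropy is the predictive entropy
    (in nats) of the averaged distribution.
    """
    if not member_probs:
        raise ValueError("need at least one ensemble member")
    n_models = len(member_probs)
    n_classes = len(member_probs[0])
    # Average the members' probabilities class by class.
    mean = [sum(p[c] for p in member_probs) / n_models
            for c in range(n_classes)]
    label = max(range(n_classes), key=mean.__getitem__)
    entropy = -sum(p * math.log(p) for p in mean if p > 0.0)
    return label, mean[label], entropy


if __name__ == "__main__":
    # Two hypothetical ensemble members scoring a two-class problem.
    print(ensemble_confidence([[0.9, 0.1], [0.7, 0.3]]))
```

When the members disagree, the averaged distribution flattens, so the confidence drops and the entropy rises, which is what makes the ensemble output usable as a per-prediction reliability signal.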
Affiliation(s)
- JaeYeon Park
  - School of Integrated Technology, College of Computing, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Kichang Lee
  - School of Integrated Technology, College of Computing, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Noseong Park
  - Department of Artificial Intelligence, College of Computing, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Seng Chan You
  - Department of Biomedical Systems Informatics, College of Medicine, Yonsei University, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
- JeongGil Ko
  - School of Integrated Technology, College of Computing, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
  - Department of Biomedical Systems Informatics, College of Medicine, Yonsei University, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
48
Liu X, Prince JL, Xing F, Zhuo J, Reese T, Stone M, El Fakhri G, Woo J. Attentive continuous generative self-training for unsupervised domain adaptive medical image translation. Med Image Anal 2023; 88:102851. [PMID: 37329854 PMCID: PMC10527936 DOI: 10.1016/j.media.2023.102851] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/23/2022] [Revised: 03/28/2023] [Accepted: 05/23/2023] [Indexed: 06/19/2023]
Abstract
Self-training is an important class of unsupervised domain adaptation (UDA) approaches that are used to mitigate the problem of domain shift, when applying knowledge learned from a labeled source domain to unlabeled and heterogeneous target domains. While self-training-based UDA has shown considerable promise on discriminative tasks, including classification and segmentation, through reliable pseudo-label filtering based on the maximum softmax probability, there is a paucity of prior work on self-training-based UDA for generative tasks, including image modality translation. To fill this gap, in this work, we seek to develop a generative self-training (GST) framework for domain adaptive image translation with continuous value prediction and regression objectives. Specifically, we quantify both aleatoric and epistemic uncertainties within our GST using variational Bayes learning to measure the reliability of synthesized data. We also introduce a self-attention scheme that de-emphasizes the background region to prevent it from dominating the training process. The adaptation is then carried out by an alternating optimization scheme with target domain supervision that focuses attention on the regions with reliable pseudo-labels. We evaluated our framework on two cross-scanner/center, inter-subject translation tasks, including tagged-to-cine magnetic resonance (MR) image translation and T1-weighted MR-to-fractional anisotropy translation. Extensive validations with unpaired target domain data showed that our GST yielded superior synthesis performance in comparison to adversarial training UDA methods.
Collapse
Affiliation(s)
- Xiaofeng Liu
  - Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA
- Jerry L Prince
  - Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Fangxu Xing
  - Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA
- Jiachen Zhuo
  - Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
- Timothy Reese
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Maureen Stone
  - Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
- Georges El Fakhri
  - Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA
- Jonghye Woo
  - Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA
|
49
|
Asif A, Rajpoot K, Graham S, Snead D, Minhas F, Rajpoot N. Unleashing the potential of AI for pathology: challenges and recommendations. J Pathol 2023; 260:564-577. [PMID: 37550878 PMCID: PMC10952719 DOI: 10.1002/path.6168] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/16/2023] [Revised: 06/21/2023] [Accepted: 06/22/2023] [Indexed: 08/09/2023]
Abstract
Computational pathology is currently witnessing a surge in the development of AI techniques, offering promise for achieving breakthroughs and significantly impacting the practices of pathology and oncology. These AI methods bring with them the potential to revolutionize diagnostic pipelines as well as treatment planning and overall patient care. Numerous peer-reviewed studies reporting remarkable performance across diverse tasks serve as a testimony to the potential of AI in the field. However, widespread adoption of these methods in clinical and pre-clinical settings still remains a challenge. In this review article, we present a detailed analysis of the major obstacles encountered during the development of effective models and their deployment in practice. We aim to provide readers with an overview of the latest developments, assist them with insights into identifying some specific challenges that may require resolution, and suggest recommendations and potential future research directions. © 2023 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
Affiliation(s)
- Amina Asif
  - Tissue Image Analytics Centre, Department of Computer Science, University of Warwick, Coventry, UK
- Kashif Rajpoot
  - School of Computer Science, University of Birmingham, Birmingham, UK
- Simon Graham
  - Histofy Ltd, Birmingham Business Park, Birmingham, UK
- David Snead
  - Histofy Ltd, Birmingham Business Park, Birmingham, UK
  - Department of Pathology, University Hospitals Coventry & Warwickshire NHS Trust, Coventry, UK
- Fayyaz Minhas
  - Tissue Image Analytics Centre, Department of Computer Science, University of Warwick, Coventry, UK
  - Cancer Research Centre, University of Warwick, Coventry, UK
- Nasir Rajpoot
  - Tissue Image Analytics Centre, Department of Computer Science, University of Warwick, Coventry, UK
  - Histofy Ltd, Birmingham Business Park, Birmingham, UK
  - Cancer Research Centre, University of Warwick, Coventry, UK
  - The Alan Turing Institute, London, UK
|
50
|
Zaitseva E, Levashenko V, Rabcan J, Kvassay M. A New Fuzzy-Based Classification Method for Use in Smart/Precision Medicine. Bioengineering (Basel) 2023; 10:838. [PMID: 37508865 PMCID: PMC10376790 DOI: 10.3390/bioengineering10070838] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/17/2023] [Revised: 07/08/2023] [Accepted: 07/13/2023] [Indexed: 07/30/2023]
Abstract
The development of information technology has had a significant impact on various areas of human activity, including medicine. It has led to the emergence of Industry 4.0, which, in turn, led to the development of the concept of Medicine 4.0. Medicine 4.0, or smart medicine, can be considered a structural association of areas such as AI-based medicine, telemedicine, and precision medicine. Each of these areas has its own characteristic data, along with the specifics of their processing and analysis. Nevertheless, at present, all these types of data must be processed simultaneously in order to provide the most complete picture of the health of each individual patient. In this paper, after a brief analysis of the topic of medical data, a new classification method is proposed that allows the processing of the maximum number of data types. The specificity of this method is its use of a fuzzy classifier. The effectiveness of this method is confirmed by an analysis of the results from the classification of various types of data for medical applications and health problems. As an illustration of the proposed method, a fuzzy decision tree has been used as the fuzzy classifier. The proposed method, based on a fuzzy classifier, achieves the best classification accuracy in comparison with crisp classifiers.
Affiliation(s)
- Elena Zaitseva
  - Department of Informatics, Faculty of Management Science and Informatics, University of Zilina, 01026 Zilina, Slovakia
- Vitaly Levashenko
  - Department of Informatics, Faculty of Management Science and Informatics, University of Zilina, 01026 Zilina, Slovakia
- Jan Rabcan
  - Department of Informatics, Faculty of Management Science and Informatics, University of Zilina, 01026 Zilina, Slovakia
- Miroslav Kvassay
  - Department of Informatics, Faculty of Management Science and Informatics, University of Zilina, 01026 Zilina, Slovakia
|