1
Irfan M, Haq IU, Malik KM, Muhammad K. One-shot learning for generalization in medical image classification across modalities. Comput Med Imaging Graph 2025; 122:102507. [PMID: 40049026] [DOI: 10.1016/j.compmedimag.2025.102507]
Abstract
Generalizability is one of the biggest challenges hindering the advancement of medical sensing technologies across multiple imaging modalities. This issue is further exacerbated when the imaging data is limited in scope or of poor quality. To tackle this, we propose a generalized and robust, lightweight one-shot learning method for medical image classification across various imaging modalities, including X-ray, microscopic, and CT scans. Our model introduces a collaborative one-shot training (COST) approach, incorporating both meta-learning and metric-learning. This approach allows for effective training on only one image per class. To ensure generalization with fewer epochs, we employ gradient generalization at dense and fully connected layers, utilizing a lightweight Siamese network with triplet loss and shared parameters. The proposed model was evaluated on 12 medical image datasets from MedMNIST2D, achieving an average accuracy of 91.5% and an area under the curve (AUC) of 0.89, outperforming state-of-the-art models such as ResNet-50 and AutoML by over 10% on certain datasets. Further, on the OCTMNIST dataset, our model achieved an AUC of 0.91 compared to ResNet-50's 0.77. Ablation studies further validate the superiority of our approach, with the COST method showing significant improvement in convergence speed and accuracy when compared to traditional one-shot learning setups. Additionally, our model's lightweight architecture requires only 0.15 million parameters, making it well-suited for deployment on resource-constrained devices.
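A minimal sketch of the weight-shared (Siamese) encoder with triplet loss described above; the encoder architecture, margin, and embedding size are illustrative assumptions, not the authors' COST configuration.

```python
# Minimal sketch of a weight-shared (Siamese) encoder trained with triplet loss.
# The encoder size, margin, and embedding dimension are illustrative assumptions,
# not the authors' exact COST setup.
import torch
import torch.nn as nn

class SmallEncoder(nn.Module):
    """Lightweight CNN that maps a 1-channel image to an embedding vector."""
    def __init__(self, embed_dim: int = 64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, embed_dim)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

encoder = SmallEncoder()                      # shared parameters for all three branches
triplet_loss = nn.TripletMarginLoss(margin=1.0)

# anchor/positive share a class, negative comes from a different class
anchor, positive, negative = (torch.randn(8, 1, 28, 28) for _ in range(3))
loss = triplet_loss(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()
```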
Affiliation(s)
- Muhammad Irfan: SMILES LAB, College of Innovation & Technology, University of Michigan-Flint, Flint, MI 48502, USA
- Ijaz Ul Haq: SMILES LAB, College of Innovation & Technology, University of Michigan-Flint, Flint, MI 48502, USA
- Khalid Mahmood Malik: SMILES LAB, College of Innovation & Technology, University of Michigan-Flint, Flint, MI 48502, USA
- Khan Muhammad: VIS2KNOW Lab, Department of Applied Artificial Intelligence, School of Convergence, College of Computing and Informatics, Sungkyunkwan University, Seoul 03063, South Korea
2
Wu ML, Peng YF. Multi-modality multiorgan image segmentation using continual learning with enhanced hard attention to the task. Med Phys 2025. [PMID: 40268717] [DOI: 10.1002/mp.17842]
Abstract
BACKGROUND Enabling a deep neural network (DNN) to learn multiple tasks using the concept of continual learning potentially better mimics human brain functions. However, current continual learning studies for medical image segmentation are mostly limited to single-modality images at identical anatomical locations. PURPOSE To propose and evaluate a continual learning method termed eHAT (enhanced hard attention to the task) for performing multi-modality, multiorgan segmentation tasks using a DNN. METHODS Four public datasets covering the lumbar spine, heart, and brain acquired by magnetic resonance imaging (MRI) and computed tomography (CT) were included to segment the vertebral bodies, the right ventricle, and brain tumors, respectively. Three-task (spine CT, heart MRI, and brain MRI) and four-task (spine CT, heart MRI, brain MRI, and spine MRI) models were tested for eHAT, with the three-task results compared with state-of-the-art continual learning methods. The effectiveness of multitask performance was measured using the forgetting rate, defined as the average difference in Dice coefficients and Hausdorff distances between multiple-task and single-task models. The ability to transfer knowledge to different tasks was evaluated using backward transfer (BWT). RESULTS The forgetting rates were -2.51% to -0.60% for the three-task eHAT models with varying task orders, substantially better than the -18.13% to -3.59% using original hard attention to the task (HAT), while those in four-task models were -2.54% to -1.59%. In addition, four-task U-net models with eHAT using only half the number of channels (1/4 parameters) yielded nearly equal performance with or without regularization. A retrospective model comparison showed that eHAT with fixed or automatic regularization had significantly superior BWT (-3% to 0%) compared to HAT (-22% to -4%). CONCLUSION We demonstrate for the first time that eHAT effectively achieves continual learning of multi-modality, multiorgan segmentation tasks using a single DNN, with improved forgetting rates compared with HAT.
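Backward transfer (BWT), reported above, can be computed from a matrix of per-task scores; below is a short sketch under the common continual-learning definition (final score on a task minus the score obtained right after that task was learned). Note that the paper's forgetting rate instead compares the final multitask model against separately trained single-task models.

```python
# Sketch of backward transfer (BWT) from a matrix of per-task Dice scores.
# R[j, i] = Dice on task i evaluated after training through task j.
import numpy as np

def backward_transfer(R: np.ndarray) -> float:
    """BWT = mean over earlier tasks of (final score - score right after learning that task)."""
    T = R.shape[0]
    return float(np.mean([R[T - 1, i] - R[i, i] for i in range(T - 1)]))

# Example: three tasks (spine CT, heart MRI, brain MRI), Dice in [0, 1].
R = np.array([
    [0.92, 0.00, 0.00],
    [0.90, 0.88, 0.00],
    [0.91, 0.87, 0.85],
])
print(f"BWT = {backward_transfer(R):+.3f}")   # small negative values mean little forgetting
```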
Affiliation(s)
- Ming-Long Wu: Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan; Institute of Medical Informatics, National Cheng Kung University, Tainan, Taiwan
- Yi-Fan Peng: Institute of Medical Informatics, National Cheng Kung University, Tainan, Taiwan
3
Woerner S, Jaques A, Baumgartner CF. A comprehensive and easy-to-use multi-domain multi-task medical imaging meta-dataset. Sci Data 2025; 12:666. [PMID: 40253434] [PMCID: PMC12009356] [DOI: 10.1038/s41597-025-04866-4]
Abstract
While the field of medical image analysis has undergone a transformative shift with the integration of machine learning techniques, the main challenge of these techniques is often the scarcity of large, diverse, and well-annotated datasets. Medical images vary in format, size, and other parameters and therefore require extensive preprocessing and standardization for use in machine learning. Addressing these challenges, we introduce the Medical Imaging Meta-Dataset (MedIMeta), a novel multi-domain, multi-task meta-dataset. MedIMeta contains 19 medical imaging datasets spanning 10 different domains and encompassing 54 distinct medical tasks, all of which are standardized to the same format and readily usable in PyTorch or other ML frameworks. We perform a technical validation of MedIMeta, demonstrating its utility through fully supervised and cross-domain few-shot learning baselines.
Affiliation(s)
- Stefano Woerner: Cluster of Excellence "Machine Learning: New Perspectives for Science", University of Tübingen, Tübingen, Germany
- Arthur Jaques: Cluster of Excellence "Machine Learning: New Perspectives for Science", University of Tübingen, Tübingen, Germany
- Christian F Baumgartner: Cluster of Excellence "Machine Learning: New Perspectives for Science", University of Tübingen, Tübingen, Germany; Faculty of Health Sciences and Medicine, University of Lucerne, Lucerne, Switzerland
4
Zhang J, Lai Z, Kong H, Yang J. Learning the Optimal Discriminant SVM With Feature Extraction. IEEE Trans Pattern Anal Mach Intell 2025; 47:2897-2911. [PMID: 40030888] [DOI: 10.1109/tpami.2025.3529711]
Abstract
Subspace learning and the Support Vector Machine (SVM) are two critical techniques in pattern recognition, playing pivotal roles in feature extraction and classification. However, how to learn the optimal subspace such that the SVM classifier performs best is still a challenging problem due to the difficulty in optimization, computation, and algorithm convergence. To address these problems, this paper develops a novel method named Optimal Discriminant Support Vector Machine (ODSVM), which integrates support vector classification with discriminative subspace learning in a seamless framework. As a result, the most discriminative subspace and the corresponding optimal SVM are obtained simultaneously to pursue the best classification performance. An efficient optimization framework is designed for binary and multi-class ODSVM. Moreover, a fast sequential minimal optimization (SMO) algorithm with pruning is proposed to accelerate the computation in multi-class ODSVM. Unlike other related methods, ODSVM has a strong theoretical guarantee of global convergence, highlighting its superiority and stability. Numerical experiments are conducted on thirteen datasets, and the results demonstrate that ODSVM outperforms existing methods with statistical significance.
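For contrast with the joint optimization in ODSVM, here is a sketch of the conventional decoupled pipeline that learns a subspace first and fits an SVM afterwards; the dataset and hyperparameters are arbitrary, and this is the baseline the paper argues against, not ODSVM itself.

```python
# Two-step baseline: learn a subspace first (PCA here), then fit an SVM in that subspace.
# ODSVM instead optimizes the projection and the classifier jointly; this only illustrates
# the decoupled pipeline for comparison.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

pipeline = make_pipeline(StandardScaler(), PCA(n_components=30), SVC(kernel="linear", C=1.0))
pipeline.fit(X_tr, y_tr)
print(f"decoupled PCA+SVM accuracy: {pipeline.score(X_te, y_te):.3f}")
```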
5
Wei H, Yang Y, Sun S, Feng M, Wang R, Han X. LMTTM-VMI: Linked Memory Token Turing Machine for 3D volumetric medical image classification. Comput Methods Programs Biomed 2025; 262:108640. [PMID: 39951959] [DOI: 10.1016/j.cmpb.2025.108640]
Abstract
Biomedical imaging is vital for the diagnosis and treatment of various medical conditions, yet the effective integration of deep learning technologies into this field presents challenges. Traditional methods often struggle to efficiently capture the spatial characteristics and intricate structural features of 3D volumetric medical images, limiting memory utilization and model adaptability. To address this, we introduce a Linked Memory Token Turing Machine (LMTTM), which utilizes external linked memory to efficiently process spatial dependencies and structural complexities within 3D volumetric medical images, aiding in accurate diagnoses. LMTTM can efficiently record the features of 3D volumetric medical images in an external linked memory module, enhancing complex image classification through improved feature accumulation and reasoning capabilities. Our experiments on six 3D volumetric medical image datasets from MedMNIST v2 demonstrate that our proposed LMTTM model achieves an average accuracy (ACC) of 82.4%, attaining state-of-the-art (SOTA) performance. Moreover, ablation studies confirmed that the Linked Memory outperforms its predecessor, TTM's original Memory, by up to 5.7%, highlighting LMTTM's effectiveness in 3D volumetric medical image classification and its potential to assist healthcare professionals in diagnosis and treatment planning. The code is released at https://github.com/hongkai-wei/LMTTM-VMI.
Affiliation(s)
- Hongkai Wei: School of Information Engineering, Chang'an University, Xi'an, 710064 Shaanxi, China
- Yang Yang: School of Information Engineering, Chang'an University, Xi'an, 710064 Shaanxi, China
- Shijie Sun: School of Information Engineering, Chang'an University, Xi'an, 710064 Shaanxi, China
- Mingtao Feng: School of Computer Science and Technology, Xidian University, Xi'an, 710126 Shaanxi, China
- Rong Wang: School of Information Engineering, Chang'an University, Xi'an, 710064 Shaanxi, China
- Xianfeng Han: College of Computer & Information Science, Southwest University, 400715 Chongqing, China
6
Doerrich S, Di Salvo F, Brockmann J, Ledig C. Rethinking model prototyping through the MedMNIST+ dataset collection. Sci Rep 2025; 15:7669. [PMID: 40044786] [PMCID: PMC11883007] [DOI: 10.1038/s41598-025-92156-9]
Abstract
The integration of deep learning-based systems in clinical practice is often impeded by challenges rooted in limited and heterogeneous medical datasets. In addition, the field has increasingly prioritized marginal performance gains on a few, narrowly scoped benchmarks over clinical applicability, slowing down meaningful algorithmic progress. This trend often results in excessive fine-tuning of existing methods on selected datasets rather than fostering clinically relevant innovations. In response, this work introduces a comprehensive benchmark for the MedMNIST+ dataset collection, designed to diversify the evaluation landscape across several imaging modalities, anatomical regions, classification tasks, and sample sizes. We systematically reassess commonly used Convolutional Neural Network (CNN) and Vision Transformer (ViT) architectures across distinct medical datasets, training methodologies, and input resolutions to validate and refine existing assumptions about model effectiveness and development. Our findings suggest that computationally efficient training schemes and modern foundation models offer viable alternatives to costly end-to-end training. Additionally, we observe that higher image resolutions do not consistently improve performance beyond a certain threshold. This highlights the potential benefits of using lower resolutions, particularly in prototyping stages, to reduce computational demands without sacrificing accuracy. Notably, our analysis reaffirms the competitiveness of CNNs compared to ViTs, emphasizing the importance of comprehending the intrinsic capabilities of different architectures. Finally, by establishing a standardized evaluation framework, we aim to enhance transparency, reproducibility, and comparability within the MedMNIST+ dataset collection as well as future research. Code is available at https://github.com/sdoerrich97/rethinking-model-prototyping-MedMNISTPlus.
Affiliation(s)
- Julius Brockmann: University of Bamberg, xAILab Bamberg, Bamberg, 96047, Germany; Ludwig Maximilian University of Munich, Munich, 80539, Germany
- Christian Ledig: University of Bamberg, xAILab Bamberg, Bamberg, 96047, Germany
7
Ghilea R, Rekik I. Replica tree-based federated learning using limited data. Neural Netw 2025; 186:107281. [PMID: 40015035] [DOI: 10.1016/j.neunet.2025.107281]
Abstract
Learning from limited data has been extensively studied in machine learning, considering that deep neural networks achieve optimal performance when trained using a large number of samples. Although various strategies have been proposed for centralized training, the topic of federated learning with small datasets remains largely unexplored. Moreover, in realistic scenarios, such as settings where medical institutions are involved, the number of participating clients is also constrained. In this work, we propose a novel federated learning framework, named RepTreeFL. At the core of the solution is the concept of a replica, where we replicate each participating client by copying its model architecture and perturbing its local data distribution. Our approach enables learning from limited data and a small number of clients by aggregating a larger number of models with diverse data distributions. Furthermore, we leverage the hierarchical structure of the client network (both original and virtual), alongside the model diversity across replicas, and introduce a diversity-based tree aggregation, where replicas are combined in a tree-like manner and the aggregation weights are dynamically updated based on the model discrepancy. We evaluated our method on two tasks and two types of data, graph generation and image classification (binary and multi-class), with both homogeneous and heterogeneous model architectures. Experimental results demonstrate the effectiveness of RepTreeFL and its advantage over competing methods in settings where both data and clients are limited.
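A toy sketch of discrepancy-aware model aggregation in the spirit described above; the weighting rule used here (softmax over negative mean pairwise parameter distance) is an illustrative assumption, not RepTreeFL's tree aggregation or its exact weight update.

```python
# Toy sketch: aggregate client/replica models with weights derived from model discrepancy.
# The weighting rule below is only an illustrative choice, not the paper's rule.
import copy
import torch

def flat_params(state_dict):
    return torch.cat([p.flatten().float() for p in state_dict.values()])

def aggregate(state_dicts, temperature: float = 1.0):
    vecs = torch.stack([flat_params(sd) for sd in state_dicts])
    dists = torch.cdist(vecs, vecs)                  # pairwise L2 discrepancies between models
    weights = torch.softmax(-dists.mean(dim=1) / temperature, dim=0)
    merged = copy.deepcopy(state_dicts[0])
    for key in merged:
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged, weights
```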
Affiliation(s)
- Ramona Ghilea: BASIRA Lab, Imperial-X (I-X) and Department of Computing, Imperial College London, London, UK
- Islem Rekik: BASIRA Lab, Imperial-X (I-X) and Department of Computing, Imperial College London, London, UK
8
Albuquerque C, Henriques R, Castelli M. Deep learning-based object detection algorithms in medical imaging: Systematic review. Heliyon 2025; 11:e41137. [PMID: 39758372] [PMCID: PMC11699422] [DOI: 10.1016/j.heliyon.2024.e41137]
Abstract
Over the past decade, Deep Learning (DL) techniques have demonstrated remarkable advancements across various domains, driving their widespread adoption. Particularly in medical image analysis, DL has received particular attention for tasks like image segmentation, object detection, and classification. This paper provides an overview of DL-based object recognition in medical images, exploring recent methods and emphasizing different imaging techniques and anatomical applications. Utilizing a meticulous quantitative and qualitative analysis following PRISMA guidelines, we examined publications based on citation rates to explore the utilization of DL-based object detectors across imaging modalities and anatomical domains. Our findings reveal a consistent rise in the utilization of DL-based object detection models, indicating unexploited potential in medical image analysis. Predominantly within the Medicine and Computer Science domains, research in this area is most active in the US, China, and Japan. Notably, DL-based object detection methods have attracted significant interest across diverse medical imaging modalities and anatomical domains. These methods have been applied across a range of modalities, including CR scans, pathology images, and endoscopic imaging, showcasing their adaptability. Moreover, diverse anatomical applications, particularly in digital pathology and microscopy, have been explored. The analysis underscores the presence of varied datasets, often with significant discrepancies in size, with a notable percentage being labeled as private or internal, and with prospective studies in this field remaining scarce. Our review of existing trends in DL-based object detection in medical images offers insights for future research directions. The continuous evolution of DL algorithms highlighted in the literature underscores the dynamic nature of this field, emphasizing the need for ongoing research and tailored optimization for specific applications.
9
Lei W, Xu W, Li K, Zhang X, Zhang S. MedLSAM: Localize and segment anything model for 3D CT images. Med Image Anal 2025; 99:103370. [PMID: 39447436] [DOI: 10.1016/j.media.2024.103370]
Abstract
Recent advancements in foundation models have shown significant potential in medical image analysis. However, there is still a gap in models specifically designed for medical image localization. To address this, we introduce MedLAM, a 3D medical foundation localization model that accurately identifies any anatomical part within the body using only a few template scans. MedLAM employs two self-supervision tasks: unified anatomical mapping (UAM) and multi-scale similarity (MSS) across a comprehensive dataset of 14,012 CT scans. Furthermore, we developed MedLSAM by integrating MedLAM with the Segment Anything Model (SAM). This innovative framework requires extreme point annotations across three directions on several templates to enable MedLAM to locate the target anatomical structure in the image, with SAM performing the segmentation. It significantly reduces the amount of manual annotation required by SAM in 3D medical imaging scenarios. We conducted extensive experiments on two 3D datasets covering 38 distinct organs. Our findings are twofold: (1) MedLAM can directly localize anatomical structures using just a few template scans, achieving performance comparable to fully supervised models; (2) MedLSAM closely matches the performance of SAM and its specialized medical adaptations with manual prompts, while minimizing the need for extensive point annotations across the entire dataset. Moreover, MedLAM has the potential to be seamlessly integrated with future 3D SAM models, paving the way for enhanced segmentation performance. Our code is public at https://github.com/openmedlab/MedLSAM.
Affiliation(s)
- Wenhui Lei: School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai AI Lab, Shanghai, China
- Wei Xu: School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China; West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China
- Kang Li: Shanghai AI Lab, Shanghai, China; West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China
- Xiaofan Zhang: School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai AI Lab, Shanghai, China
- Shaoting Zhang: School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai AI Lab, Shanghai, China
10
Singh D, Marathe A, Roy S, Walambe R, Kotecha K. Explainable rotation-invariant self-supervised representation learning. MethodsX 2024; 13:102959. [PMID: 39329154] [PMCID: PMC11426157] [DOI: 10.1016/j.mex.2024.102959]
Abstract
This paper describes a method that can perform robust detection and classification on out-of-distribution rotated images in the medical domain. In real-world medical imaging tools, noise due to rotation of the body part is frequently observed. This noise reduces the accuracy of AI-based classification and prediction models. Hence, it is important to develop models that are rotation invariant. To that end, the proposed method, RISC (rotation invariant self-supervised vision framework), addresses this issue of rotational corruption. We present state-of-the-art rotation-invariant classification results and provide explainability for the performance in the domain. The evaluation of the proposed method is carried out on real-world adversarial examples in medical imagery: OrganAMNIST, RetinaMNIST, and PneumoniaMNIST. It is observed that RISC outperforms the rotation-affected benchmark methods by obtaining accuracy boosts of 22%, 17%, and 2% on the OrganAMNIST, PneumoniaMNIST, and RetinaMNIST rotated baselines, respectively. Further, explainability results are demonstrated. This methods paper describes:
- a representation learning approach that can perform robust detection and classification on out-of-distribution rotated images in the medical domain;
- a method that incorporates self-supervised rotation invariance for correcting rotational corruptions;
- GradCAM-based explainability for the rotational SSL pretext task and the downstream classification outcomes on the three benchmark datasets.
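A sketch of the standard rotation-prediction pretext task that rotation-aware self-supervision builds on; the backbone and image sizes are placeholders, and RISC's exact pretext and invariance losses differ.

```python
# Sketch of the standard rotation-prediction pretext task: rotate each image by
# 0/90/180/270 degrees and train a classifier to predict the rotation.
import torch
import torch.nn as nn

def make_rotation_batch(images: torch.Tensor):
    """Return rotated copies of `images` plus the rotation label (0..3) for each copy."""
    rotated, labels = [], []
    for k in range(4):                                   # k * 90 degrees
        rotated.append(torch.rot90(images, k, dims=(2, 3)))
        labels.append(torch.full((images.size(0),), k, dtype=torch.long))
    return torch.cat(rotated), torch.cat(labels)

backbone = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 4))
x = torch.randn(8, 1, 28, 28)                            # e.g., a PneumoniaMNIST-sized batch
inputs, targets = make_rotation_batch(x)
loss = nn.CrossEntropyLoss()(backbone(inputs), targets)
loss.backward()
```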
Affiliation(s)
- Devansh Singh: Symbiosis Centre for Applied Artificial Intelligence, Symbiosis Institute of Technology, India
- Aboli Marathe: Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, USA
- Sidharth Roy: Department of Computer and Information Science, University of Pennsylvania, USA
- Rahee Walambe: Symbiosis Centre for Applied Artificial Intelligence, Symbiosis Institute of Technology, India
- Ketan Kotecha: Symbiosis Centre for Applied Artificial Intelligence, Symbiosis Institute of Technology, India
11
Fei M, McMillan AB. Technical Note: Neural Network Architectures for Self-Supervised Body Part Regression Models with Automated Localized Segmentation Application. J Imaging Inform Med 2024. [PMID: 39538050] [DOI: 10.1007/s10278-024-01319-z]
Abstract
The advancement of medical image deep learning necessitates tools that can accurately identify body regions from whole-body scans to serve as an essential pre-processing step for downstream tasks. Typically, these deep learning models rely on labeled data and supervised learning, which is labor-intensive. However, the emergence of self-supervised learning is revolutionizing the field by eliminating the need for labels. The purpose of this study was to compare neural network architectures of self-supervised models that produce a body part regression (BPR) slice score to aid in the development of anatomically localized segmentation models. VGG, ResNet, DenseNet, ConvNext, and EfficientNet BPR models were implemented in the MONAI/PyTorch framework. Landmark organs were correlated to slice scores, and the mean absolute error (MAE) was calculated between the predicted slice and the actual slice of various organ landmarks. Four localized DynUNet segmentation models (thorax, upper abdomen, lower abdomen, and pelvis) were developed using the BPR slice scores. The Dice similarity coefficient (DSC) was compared between the localized and baseline segmentation models. The best performing BPR model was the EfficientNet architecture with an overall MAE of 3.18, compared to the VGG baseline model with an MAE of 6.29. The localized segmentation model significantly outperformed the baseline in 16 out of 20 organs with a DSC of 0.88. Modern architectures such as EfficientNet deliver a large performance increase over the VGG baseline in localizing anatomical structures in CT via the BPR task. Utilizing BPR slice scores is shown to be effective in anatomically localized segmentation tasks with improved performance.
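One plausible way to turn BPR slice scores into a landmark-slice prediction and score it with MAE is sketched below; the score values, reference scores, and landmark names are hypothetical placeholders, not the paper's data or exact evaluation protocol.

```python
# Sketch: map each axial slice to a BPR score, pick the slice whose score is closest to a
# landmark's reference score, and report the mean absolute error (in slices) against the
# annotated landmark slice. All values here are toy placeholders.
import numpy as np

def predicted_landmark_slice(slice_scores: np.ndarray, reference_score: float) -> int:
    """Index of the slice whose BPR score is nearest to the landmark's reference score."""
    return int(np.argmin(np.abs(slice_scores - reference_score)))

slice_scores = np.linspace(0.0, 100.0, num=300)          # toy monotonic scores for one CT volume
landmarks = {"liver_dome": (62.0, 184), "aortic_arch": (78.5, 236)}  # (reference score, true slice)

errors = [abs(predicted_landmark_slice(slice_scores, s) - true_idx)
          for s, true_idx in landmarks.values()]
print(f"MAE = {np.mean(errors):.2f} slices")
```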
Affiliation(s)
- Michael Fei: Creighton University School of Medicine, Phoenix, AZ, USA
12
Chen Y, Lu W, Qin X, Wang J, Xie X. MetaFed: Federated Learning Among Federations With Cyclic Knowledge Distillation for Personalized Healthcare. IEEE Trans Neural Netw Learn Syst 2024; 35:16671-16682. [PMID: 37506019] [DOI: 10.1109/tnnls.2023.3297103]
Abstract
Federated learning (FL) has attracted increasing attention for building models without accessing raw user data, especially in healthcare. In real applications, different federations can seldom work together due to reasons such as data heterogeneity and distrust of, or the absence of, a central server. In this article, we propose a novel framework called MetaFed to facilitate trustworthy FL between different federations. MetaFed obtains a personalized model for each federation without a central server via the proposed cyclic knowledge distillation. Specifically, MetaFed treats each federation as a meta distribution and aggregates knowledge of each federation in a cyclic manner. The training is split into two parts: common knowledge accumulation and personalization. Comprehensive experiments on seven benchmarks demonstrate that MetaFed without a server achieves better accuracy compared with state-of-the-art methods [e.g., 10%+ accuracy improvement compared with the baseline for the physical activity monitoring dataset (PAMAP2)] with lower communication costs. More importantly, MetaFed shows remarkable performance in real-healthcare-related applications.
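A sketch of a single knowledge-distillation step of the kind cyclic schemes such as MetaFed chain across federations; the temperature, loss weighting, and the split into common-knowledge and personalization phases are simplified assumptions.

```python
# Sketch of one knowledge-distillation step: the current federation's model (student) is
# regularized toward the soft predictions of the previous federation's model (teacher).
import torch
import torch.nn.functional as F

def distillation_step(student, teacher, x, y, optimizer, T: float = 2.0, alpha: float = 0.5):
    teacher.eval()
    with torch.no_grad():
        teacher_logits = teacher(x)
    student_logits = student(x)
    hard_loss = F.cross_entropy(student_logits, y)
    soft_loss = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                         F.softmax(teacher_logits / T, dim=1),
                         reduction="batchmean") * (T * T)
    loss = alpha * hard_loss + (1.0 - alpha) * soft_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```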
13
Jia J, Yu B, Mody P, Ninaber MK, Schouffoer AA, de Vries-Bouwstra JK, Kroft LJM, Staring M, Stoel BC. Using 3D point cloud and graph-based neural networks to improve the estimation of pulmonary function tests from chest CT. Comput Biol Med 2024; 182:109192. [PMID: 39341113] [DOI: 10.1016/j.compbiomed.2024.109192]
Abstract
Pulmonary function tests (PFTs) are important clinical metrics to measure the severity of interstitial lung disease in systemic sclerosis patients. However, PFTs cannot always be performed by spirometry if there is a risk of disease transmission or other contraindications. In addition, it is unclear how lung function is affected by changes in lung vessels. Therefore, convolutional neural networks (CNNs) were previously proposed to estimate PFTs from chest CT scans (CNN-CT) and extracted vessels (CNN-Vessel). Due to GPU memory constraints, however, these networks used down-sampled images, which caused a loss of information on small vessels. Previous literature has indicated that detailed vessel information from CT scans can be helpful for PFT estimation. Therefore, this paper proposes to use a point cloud neural network (PNN-Vessel) and a graph neural network (GNN-Vessel) to estimate PFTs from point cloud and graph-based representations of pulmonary vessel centerlines, respectively. After that, we combine the different networks and perform multiple-variable stepwise regression analysis to explore whether vessel-based networks can contribute to PFT estimation, in addition to CNN-CT. Results showed that both PNN-Vessel and GNN-Vessel outperformed CNN-Vessel, by 14% and 4%, respectively, when averaged across the intra-class correlation coefficient (ICC) scores of four PFT metrics. In addition, compared to CNN-Vessel, PNN-Vessel used 30% of the training time (1.1 h) and 7% of the parameters (2.1 M), and GNN-Vessel used only 7% of the training time (0.25 h) and 0.7% of the parameters (0.2 M). We combined CNN-CT, PNN-Vessel, and GNN-Vessel with the weights obtained from multiple-variable regression, which achieved the best PFT estimation accuracy (ICC of 0.748, 0.742, 0.836, and 0.835 for the four PFT measures, respectively). The results verified that more detailed vessel information could provide further explanation for PFT estimation from anatomical imaging.
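The final combination step (weights obtained by regression on the different networks' outputs) can be illustrated with ordinary least squares on synthetic validation predictions; this shows only the basic idea of learned combination weights, not the paper's stepwise procedure or its data.

```python
# Sketch: combine three networks' PFT predictions with weights fitted by least squares
# on a validation set. All numbers below are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(0)
truth = rng.normal(3.5, 0.8, size=50)                    # e.g., one PFT measure on validation cases
preds = np.stack([truth + rng.normal(0, s, size=50) for s in (0.30, 0.25, 0.40)], axis=1)

X = np.column_stack([preds, np.ones(len(truth))])        # three model outputs + intercept
weights, *_ = np.linalg.lstsq(X, truth, rcond=None)
combined = X @ weights
print("weights:", np.round(weights, 3))
print("combined MAE:", np.round(np.mean(np.abs(combined - truth)), 3))
```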
Affiliation(s)
- Jingnan Jia: Division of Image Processing, Department of Radiology, Leiden University Medical Center, PO Box 9600, 2300 RC, Leiden, The Netherlands
- Bo Yu: Division of Image Processing, Department of Radiology, Leiden University Medical Center, PO Box 9600, 2300 RC, Leiden, The Netherlands; School of Artificial Intelligence, Jilin University, 130015, Changchun, China; Engineering Research Center of Knowledge-Driven Human-Machine Intelligence, Ministry of Education, China
- Prerak Mody: Division of Image Processing, Department of Radiology, Leiden University Medical Center, PO Box 9600, 2300 RC, Leiden, The Netherlands
- Maarten K Ninaber: Department of Pulmonology, Leiden University Medical Center, PO Box 9600, 2300 RC, Leiden, The Netherlands
- Anne A Schouffoer: Department of Rheumatology, Leiden University Medical Center, PO Box 9600, 2300 RC, Leiden, The Netherlands
- Jeska K de Vries-Bouwstra: Department of Rheumatology, Leiden University Medical Center, PO Box 9600, 2300 RC, Leiden, The Netherlands
- Lucia J M Kroft: Department of Radiology, Leiden University Medical Center, PO Box 9600, 2300 RC, Leiden, The Netherlands
- Marius Staring: Division of Image Processing, Department of Radiology, Leiden University Medical Center, PO Box 9600, 2300 RC, Leiden, The Netherlands
- Berend C Stoel: Division of Image Processing, Department of Radiology, Leiden University Medical Center, PO Box 9600, 2300 RC, Leiden, The Netherlands
14
Zhang H, Chung ACS. Depth-Aware Networks for Multi-Organ Lesion Detection in Chest CT Scans. Bioengineering (Basel) 2024; 11:998. [PMID: 39451374] [PMCID: PMC11503988] [DOI: 10.3390/bioengineering11100998]
Abstract
The capability of computed tomography (CT) scans to detect lesions has increased remarkably in the past decades. In this paper, we propose a multi-organ lesion detection (MOLD) approach to better address real-life chest-related clinical needs. MOLD is a challenging task, especially within a large, high-resolution image volume, due to various types of background information interference and large differences in lesion sizes. Furthermore, the appearance similarity between lesions and other normal tissues demands more discriminative features. In order to overcome these challenges, we introduce depth-aware (DA) and skipped-layer hierarchical training (SHT) mechanisms with the novel Dense 3D context enhanced (Dense 3DCE) lesion detection model. The novel Dense 3DCE framework considers the shallow, medium, and deep-level features together comprehensively. In addition, equipped with our SHT scheme, the backpropagation process can now be supervised under precise control, while the DA scheme can effectively incorporate depth domain knowledge into the scheme. Extensive experiments have been carried out on the publicly available, widely used DeepLesion dataset, and the results prove the effectiveness of our DA-SHT Dense 3DCE network in the MOLD task.
Affiliation(s)
- Han Zhang: Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China
15
Liu W, Zhang B, Liu T, Jiang J, Liu Y. Artificial Intelligence in Pancreatic Image Analysis: A Review. Sensors (Basel) 2024; 24:4749. [PMID: 39066145] [PMCID: PMC11280964] [DOI: 10.3390/s24144749]
Abstract
Pancreatic cancer is a highly lethal disease with a poor prognosis. Its early diagnosis and accurate treatment mainly rely on medical imaging, so accurate medical image analysis is especially vital for pancreatic cancer patients. However, medical image analysis of pancreatic cancer is facing challenges due to ambiguous symptoms, high misdiagnosis rates, and significant financial costs. Artificial intelligence (AI) offers a promising solution by relieving medical personnel's workload, improving clinical decision-making, and reducing patient costs. This study focuses on AI applications such as segmentation, classification, object detection, and prognosis prediction across five types of medical imaging: CT, MRI, EUS, PET, and pathological images, as well as integrating these imaging modalities to boost diagnostic accuracy and treatment efficiency. In addition, this study discusses current hot topics and future directions aimed at overcoming the challenges in AI-enabled automated pancreatic cancer diagnosis algorithms.
Affiliation(s)
- Weixuan Liu: Sydney Smart Technology College, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China
- Bairui Zhang: Sydney Smart Technology College, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China
- Tao Liu: School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China
- Juntao Jiang: College of Control Science and Engineering, Zhejiang University, Hangzhou 310058, China
- Yong Liu: College of Control Science and Engineering, Zhejiang University, Hangzhou 310058, China
16
Abhishek K, Brown CJ, Hamarneh G. Multi-sample ζ-mixup: richer, more realistic synthetic samples from a p-series interpolant. J Big Data 2024; 11:43. [PMID: 38528850] [PMCID: PMC10960781] [DOI: 10.1186/s40537-024-00898-6]
Abstract
Modern deep learning training procedures rely on model regularization techniques such as data augmentation methods, which generate training samples that increase the diversity of data and richness of label information. A popular recent method, mixup, uses convex combinations of pairs of original samples to generate new samples. However, as we show in our experiments, mixup can produce undesirable synthetic samples, where the data is sampled off the manifold and can contain incorrect labels. We propose ζ-mixup, a generalization of mixup with provably and demonstrably desirable properties that allows convex combinations of T ≥ 2 samples, leading to more realistic and diverse outputs that incorporate information from T original samples by using a p-series interpolant. We show that, compared to mixup, ζ-mixup better preserves the intrinsic dimensionality of the original datasets, which is a desirable property for training generalizable models. Furthermore, we show that our implementation of ζ-mixup is faster than mixup, and extensive evaluation on controlled synthetic and 26 diverse real-world natural and medical image classification datasets shows that ζ-mixup outperforms mixup, CutMix, and traditional data augmentation techniques. The code will be released at https://github.com/kakumarabhishek/zeta-mixup.
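A sketch of the core idea described above: mixing T ≥ 2 samples with normalized p-series weights so that one sample dominates and the output stays near the data manifold. The choice of gamma and any constraints on it are simplified assumptions here; see the paper and released code for the exact formulation.

```python
# Sketch of ζ-mixup-style mixing: form convex combinations of all T samples in a batch
# using weights from a normalized p-series applied to a random permutation per output.
import numpy as np

def zeta_mixup_like(x: np.ndarray, y_onehot: np.ndarray, gamma: float = 2.8, seed: int = 0):
    """Mix all T samples in the batch into T new samples using p-series weights."""
    rng = np.random.default_rng(seed)
    T = x.shape[0]
    base = np.arange(1, T + 1, dtype=float) ** (-gamma)   # p-series 1^-g, 2^-g, ...
    weights = base / base.sum()                            # convex weights, heavily skewed to rank 1
    x_new, y_new = np.empty_like(x), np.empty_like(y_onehot)
    for i in range(T):
        order = rng.permutation(T)                         # a different dominant sample each time
        x_new[i] = np.tensordot(weights, x[order], axes=1)
        y_new[i] = weights @ y_onehot[order]
    return x_new, y_new

images = np.random.rand(8, 28, 28).astype(np.float32)
labels = np.eye(3, dtype=np.float32)[np.random.randint(0, 3, size=8)]
mixed_x, mixed_y = zeta_mixup_like(images, labels)
```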
Affiliation(s)
- Kumar Abhishek: School of Computing Science, Simon Fraser University, 8888 University Drive, Burnaby, V5A 1S6 Canada
- Colin J Brown: Engineering, Hinge Health, 455 Market Street, Suite 700, San Francisco, 94105 USA
- Ghassan Hamarneh: School of Computing Science, Simon Fraser University, 8888 University Drive, Burnaby, V5A 1S6 Canada
17
Suzuki H, Kawata Y, Aokage K, Matsumoto Y, Sugiura T, Tanabe N, Nakano Y, Tsuchida T, Kusumoto M, Marumo K, Kaneko M, Niki N. Aorta and main pulmonary artery segmentation using stacked U-Net and localization on non-contrast-enhanced computed tomography images. Med Phys 2024; 51:1232-1243. [PMID: 37519027] [DOI: 10.1002/mp.16654]
Abstract
BACKGROUND The contact between the aorta, main pulmonary artery (MPA), main pulmonary vein, vena cava (VC), and esophagus affects segmentation of the aorta and MPA in non-contrast-enhanced computed tomography (NCE-CT) images. PURPOSE A two-stage stacked U-Net and localization of the aorta and MPA were developed for the segmentation of the aorta and MPA in NCE-CT images. METHODS Normal-dose NCE-CT images of 24 subjects with chronic thromboembolic pulmonary hypertension (CTEPH) and low-dose NCE-CT images of 100 subjects without CTEPH were used in this study. The aorta is in contact with the ascending aorta (AA) and MPA, the AA with the VC, the aortic arch (AR) with the VC and esophagus, and the descending aorta (DA) with the esophagus. These contact surfaces were manually annotated. The contact surfaces were quantified using the contact surface ratio (CSR). Segmentation of the aorta and MPA in NCE-CT images was performed by localization of the aorta and MPA and a two-stage stacked U-Net. Localization was performed by extracting and processing the trachea and main bronchus. The first stage of the stacked U-Net consisted of a 2D U-Net, 2D U-Net with a pre-trained VGG-16 encoder, and 2D attention U-Net. The second stage consisted of a 3D U-Net with four input channels: the CT volume and three segmentation results of the first stage. The model was trained and tested using 10-fold cross-validation. Segmentation of the entire volume was evaluated using the Dice similarity coefficient (DSC). Segmentation of the contact area was also assessed using the mean surface distance (MSD). The statistical analysis of the evaluation underwent a multi-comparison correction. CTEPH and non-CTEPH cases were classified based on the vessel diameters measured from the segmented MPA. RESULTS For the noncontact surfaces of AA, the MSD of stacked U-Net was 0.31 ± 0.10 mm (p < 0.05) and 0.32 ± 0.13 mm (p < 0.05) for non-CTEPH and CTEPH cases, respectively. For contact surfaces with a CSR of 0.4 or greater in AA, the MSD was 0.52 ± 0.23 mm (p < 0.05), and 0.68 ± 0.29 mm (p > 0.05) for non-CTEPH and CTEPH cases, respectively. MSDs were lower than those of 2D and 3D U-Nets for contact and noncontact surfaces; moreover, MSDs increased slightly with larger CSRs. However, the stacked U-Net achieved MSDs of approximately 1 pixel for a wide contact surface. The area under the receiver operating characteristic curve for CTEPH and non-CTEPH classification using the right main pulmonary artery (RMPA) diameter was 0.97 (95% confidence interval [CI]: 0.94-1.00). CONCLUSIONS Segmentation of the aorta and MPA on NCE-CT images were affected by vascular and esophageal contact. The application of stacked U-Net and localization techniques for non-CTEPH and CTEPH cases mitigated the impact of contact, suggesting its potential for diagnosing CTEPH.
Affiliation(s)
- Hidenobu Suzuki: Faculty of Science and Technology, Tokushima University, Tokushima, Japan
- Yoshiki Kawata: Institute of Post-LED Photonics, Tokushima University, Tokushima, Japan
- Keiju Aokage: Department of Thoracic Surgery, National Cancer Center Hospital East, Chiba, Japan
- Yuji Matsumoto: Department of Endoscopy, Respiratory Endoscopy Division, National Cancer Center Hospital, Tokyo, Japan
- Toshihiko Sugiura: Department of Respirology, Chiba University Graduate School of Medicine, Chiba, Japan
- Nobuhiro Tanabe: Department of Respirology, Chiba University Graduate School of Medicine, Chiba, Japan
- Yasutaka Nakano: Division of Respiratory Medicine, Department of Internal Medicine, Shiga University of Medical Science, Shiga, Japan
- Takaaki Tsuchida: Department of Endoscopy, Respiratory Endoscopy Division, National Cancer Center Hospital, Tokyo, Japan
- Masahiko Kusumoto: Division of Diagnostic Radiology, National Cancer Center Hospital, Tokyo, Japan
- Noboru Niki: Faculty of Science and Technology, Tokushima University, Tokushima, Japan
18
Zhu M, Liao J, Liu J, Yuan Y. FedOSS: Federated Open Set Recognition via Inter-Client Discrepancy and Collaboration. IEEE Trans Med Imaging 2024; 43:190-202. [PMID: 37428659] [DOI: 10.1109/tmi.2023.3294014]
Abstract
Open set recognition (OSR) aims to accurately classify known diseases and recognize unseen diseases as the unknown class in medical scenarios. However, in existing OSR approaches, gathering data from distributed sites to construct large-scale centralized training datasets usually leads to high privacy and security risks, which could be alleviated elegantly via the popular cross-site training paradigm, federated learning (FL). To this end, we present the first effort to formulate federated open set recognition (FedOSR) and propose a novel Federated Open Set Synthesis (FedOSS) framework to address the core challenge of FedOSR: the unavailability of unknown samples for all anticipated clients during the training phase. The proposed FedOSS framework mainly leverages two modules, i.e., Discrete Unknown Sample Synthesis (DUSS) and Federated Open Space Sampling (FOSS), to generate virtual unknown samples for learning decision boundaries between known and unknown classes. Specifically, DUSS exploits inter-client knowledge inconsistency to recognize known samples near decision boundaries and then pushes them beyond decision boundaries to synthesize discrete virtual unknown samples. FOSS unites these generated unknown samples from different clients to estimate the class-conditional distributions of open data space near decision boundaries and further samples open data, thereby improving the diversity of virtual unknown samples. Additionally, we conduct comprehensive ablation experiments to verify the effectiveness of DUSS and FOSS. FedOSS shows superior performance on public medical datasets in comparison with state-of-the-art approaches. The source code is available at https://github.com/CityU-AIM-Group/FedOSS.
19
Chen H, Wang R, Wang X, Li J, Fang Q, Li H, Bai J, Peng Q, Meng D, Wang L. Unsupervised Local Discrimination for Medical Images. IEEE Trans Pattern Anal Mach Intell 2023; 45:15912-15929. [PMID: 37494162] [DOI: 10.1109/tpami.2023.3299038]
Abstract
Contrastive learning, which aims to capture general representations from unlabeled images to initialize medical analysis models, has been proven effective in alleviating the high demand for expensive annotations. Current methods mainly focus on instance-wise comparisons to learn global discriminative features, while neglecting the local details needed to distinguish tiny anatomical structures, lesions, and tissues. To address this challenge, in this paper, we propose a general unsupervised representation learning framework, named local discrimination (LD), to learn local discriminative features for medical images by closely embedding semantically similar pixels and identifying regions of similar structures across different images. Specifically, this model is equipped with an embedding module for pixel-wise embedding and a clustering module for generating segmentation. These two modules are unified by optimizing our novel region discrimination loss function in a mutually beneficial mechanism, which enables our model to reflect structure information as well as measure pixel-wise and region-wise similarity. Furthermore, based on LD, we propose a center-sensitive one-shot landmark localization algorithm and a shape-guided cross-modality segmentation model to foster the generalizability of our model. When transferred to downstream tasks, the representation learned by our method shows better generalization, outperforming representations from 18 state-of-the-art (SOTA) methods and winning on 9 of the 12 downstream tasks. Especially for the challenging lesion segmentation tasks, the proposed method achieves significantly better performance.
20
Kazemimoghadam M, Yang Z, Chen M, Ma L, Lu W, Gu X. Leveraging global binary masks for structure segmentation in medical images. Phys Med Biol 2023; 68. [PMID: 37607564] [PMCID: PMC10511220] [DOI: 10.1088/1361-6560/acf2e2]
Abstract
Deep learning (DL) models for medical image segmentation are highly influenced by intensity variations of input images and lack generalization due to primarily utilizing pixels' intensity information for inference. Acquiring sufficient training data is another challenge limiting models' applications. Here, we proposed to leverage the consistency of organs' anatomical position and shape information in medical images. We introduced a framework leveraging recurring anatomical patterns through global binary masks for organ segmentation. Two scenarios were studied: (1) global binary masks were the only input for the U-Net based model, forcing the model to encode exclusively organs' position and shape information for rough segmentation or localization. (2) Global binary masks were incorporated as an additional channel providing position/shape clues to mitigate training data scarcity. Two datasets of brain and heart computed tomography (CT) images with their ground truth were split into (26:10:10) and (12:3:5) for training, validation, and test, respectively. The two scenarios were evaluated using the full training split as well as reduced subsets of training data. In scenario (1), training exclusively on global binary masks led to Dice scores of 0.77 ± 0.06 and 0.85 ± 0.04 for the brain and heart structures, respectively. Average Euclidean distances of 3.12 ± 1.43 mm and 2.5 ± 0.93 mm were obtained relative to the center of mass of the ground truth for the brain and heart structures, respectively. The outcomes indicated that global binary masks encode a surprising degree of position and shape information. In scenario (2), incorporating global binary masks led to significantly higher accuracy relative to the model trained on only CT images in small subsets of training data; the performance improved by 4.3%-125.3% and 1.3%-48.1% for 1-8 training cases of the brain and heart datasets, respectively. The findings imply the advantages of utilizing global binary masks for building models that are robust to image intensity variations as well as an effective approach to boost performance when access to labeled training data is highly limited.
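A minimal sketch of scenario (2), concatenating a global binary mask as a second input channel; the tiny network below is a stand-in for the authors' U-Net and only shows how the extra channel is wired in.

```python
# Minimal sketch: stack the CT slice and a global binary mask of the organ's typical
# position as a 2-channel input to a segmentation network with in_channels=2.
import torch
import torch.nn as nn

ct_slice = torch.randn(1, 1, 128, 128)                 # intensity image
global_mask = torch.zeros(1, 1, 128, 128)              # recurring position/shape prior
global_mask[..., 40:90, 35:95] = 1.0                   # e.g., rough heart region

x = torch.cat([ct_slice, global_mask], dim=1)          # 2-channel input

seg_net = nn.Sequential(                               # stand-in for a U-Net with in_channels=2
    nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 1),
)
logits = seg_net(x)                                    # per-pixel logits, same spatial size
```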
Affiliation(s)
- Mahdieh Kazemimoghadam: Department of Radiation Oncology, the University of Texas Southwestern Medical Center, Dallas TX, 75390 USA
- Zi Yang: Department of Radiation Oncology, the University of Texas Southwestern Medical Center, Dallas TX, 75390 USA
- Mingli Chen: Department of Radiation Oncology, the University of Texas Southwestern Medical Center, Dallas TX, 75390 USA
- Lin Ma: Department of Radiation Oncology, the University of Texas Southwestern Medical Center, Dallas TX, 75390 USA
- Weiguo Lu: Department of Radiation Oncology, the University of Texas Southwestern Medical Center, Dallas TX, 75390 USA
- Xuejun Gu: Department of Radiation Oncology, the University of Texas Southwestern Medical Center, Dallas TX, 75390 USA; Department of Radiation Oncology, Stanford University, Stanford, CA 94305
21
Huang Y, Jiao J, Yu J, Zheng Y, Wang Y. Si-MSPDNet: A multiscale Siamese network with parallel partial decoders for the 3-D measurement of spines in 3D ultrasonic images. Comput Med Imaging Graph 2023; 108:102262. [PMID: 37385048] [DOI: 10.1016/j.compmedimag.2023.102262]
Abstract
Early screening and frequent monitoring effectively decrease the risk of severe scoliosis, but radiation exposure is a consequence of traditional radiograph examinations. Additionally, traditional X-ray images on the coronal or sagittal plane have difficulty providing three-dimensional (3-D) information on spinal deformities. The Scolioscan system provides an innovative 3-D spine imaging approach via ultrasonic scanning, and its feasibility has been demonstrated in numerous studies. In this paper, to further examine the potential of spinal ultrasonic data for describing 3-D spinal deformities, we propose a novel deep-learning tracker named Si-MSPDNet for extracting widely employed landmarks (spinous process (SP)) in ultrasonic images of spines and establish a 3-D spinal profile to measure 3-D spinal deformities. Si-MSPDNet has a Siamese architecture. First, we employ two efficient two-stage encoders to extract features from the uncropped ultrasonic image and the patch centered on the SP cut from the image. Then, a fusion block is designed to strengthen the communication between encoded features and further refine them from channel and spatial perspectives. The SP is a very small target in ultrasonic images, so its representation is weak in the highest-level feature maps. To overcome this, we ignore the highest-level feature maps and introduce parallel partial decoders to localize the SP. The correlation evaluation in the traditional Siamese network is also expanded to multiple scales to enhance cooperation. Furthermore, we propose a binary guided mask based on vertebral anatomical prior knowledge, which can further improve the performance of our tracker by highlighting the potential region with SP. The binary-guided mask is also utilized for fully automatic initialization in tracking. We collected spinal ultrasonic data and corresponding radiographs on the coronal and sagittal planes from 150 patients to evaluate the tracking precision of Si-MSPDNet and the performance of the generated 3-D spinal profile. Experimental results revealed that our tracker achieved a tracking success rate of 100% and a mean IoU of 0.882, outperforming some commonly used tracking and real-time detection models. Furthermore, a high correlation existed on both the coronal and sagittal planes between our projected spinal curve and that extracted from the spinal annotation in X-ray images. The correlation between the tracking results of the SP and their ground truths on other projected planes was also satisfactory. More importantly, the difference in mean curvatures was slight on all projected planes between tracking results and ground truths. Thus, this study effectively demonstrates the promising potential of our 3-D spinal profile extraction method for the 3-D measurement of spinal deformities using 3-D ultrasound data.
Affiliation(s)
- Yi Huang: Biomedical Engineering Center, Fudan University, Shanghai 200433, China
- Jing Jiao: Biomedical Engineering Center, Fudan University, Shanghai 200433, China
- Jinhua Yu: Biomedical Engineering Center, Fudan University, Shanghai 200433, China; Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Fudan University, 200433, China
- Yongping Zheng: Department of Biomedical Engineering, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region of China; Research Institute for Smart Ageing, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region of China
- Yuanyuan Wang: Biomedical Engineering Center, Fudan University, Shanghai 200433, China; Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Fudan University, 200433, China
22
Zhao D, Wang W, Tang T, Zhang YY, Yu C. Current progress in artificial intelligence-assisted medical image analysis for chronic kidney disease: A literature review. Comput Struct Biotechnol J 2023; 21:3315-3326. [PMID: 37333860] [PMCID: PMC10275698] [DOI: 10.1016/j.csbj.2023.05.029]
Abstract
Chronic kidney disease (CKD) causes irreversible damage to kidney structure and function. Arising from various etiologies, risk factors for CKD include hypertension and diabetes. With a progressively increasing global prevalence, CKD is an important public health problem worldwide. Medical imaging has become an important diagnostic tool for CKD through the non-invasive identification of macroscopic renal structural abnormalities. Artificial intelligence (AI)-assisted medical imaging techniques aid clinicians in the analysis of characteristics that cannot be easily discriminated by the naked eye, providing valuable information for the identification and management of CKD. Recent studies have demonstrated the effectiveness of AI-assisted medical image analysis as a clinical support tool using radiomics- and deep learning-based AI algorithms for improving the early detection, pathological assessment, and prognostic evaluation of various forms of CKD, including autosomal dominant polycystic kidney disease. Herein, we provide an overview of the potential roles of AI-assisted medical image analysis for the diagnosis and management of CKD.
Collapse
Affiliation(s)
- Dan Zhao
- Department of Nephrology, Tongji Hospital, School of Medicine, Tongji University, Shanghai 200065, China
| | - Wei Wang
- Department of Radiology, Tongji Hospital, School of Medicine, Tongji University, Shanghai 200065, China
| | - Tian Tang
- Department of Nephrology, Tongji Hospital, School of Medicine, Tongji University, Shanghai 200065, China
| | - Ying-Ying Zhang
- Department of Nephrology, Tongji Hospital, School of Medicine, Tongji University, Shanghai 200065, China
| | - Chen Yu
- Department of Nephrology, Tongji Hospital, School of Medicine, Tongji University, Shanghai 200065, China
| |
Collapse
|
23
|
Yang J, Shi R, Wei D, Liu Z, Zhao L, Ke B, Pfister H, Ni B. MedMNIST v2 - A large-scale lightweight benchmark for 2D and 3D biomedical image classification. Sci Data 2023; 10:41. [PMID: 36658144 PMCID: PMC9852451 DOI: 10.1038/s41597-022-01721-8] [Citation(s) in RCA: 72] [Impact Index Per Article: 36.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2021] [Accepted: 09/26/2022] [Indexed: 01/20/2023] Open
Abstract
We introduce MedMNIST v2, a large-scale MNIST-like dataset collection of standardized biomedical images, including 12 datasets for 2D and 6 datasets for 3D. All images are pre-processed into a small size of 28 × 28 (2D) or 28 × 28 × 28 (3D) with the corresponding classification labels so that no background knowledge is required for users. Covering primary data modalities in biomedical images, MedMNIST v2 is designed to perform classification on lightweight 2D and 3D images with various dataset scales (from 100 to 100,000) and diverse tasks (binary/multi-class, ordinal regression, and multi-label). The resulting dataset, consisting of 708,069 2D images and 9,998 3D images in total, could support numerous research/educational purposes in biomedical image analysis, computer vision, and machine learning. We benchmark several baseline methods on MedMNIST v2, including 2D/3D neural networks and open-source/commercial AutoML tools. The data and code are publicly available at https://medmnist.com/ .
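For readers who want a concrete starting point, the sketch below shows a minimal PyTorch baseline for MedMNIST-style 28 × 28 classification. The network, the single-channel/9-class setting, and the random tensor standing in for a dataset are illustrative assumptions, not part of the benchmark itself; real inputs would come from the MedMNIST loaders.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Tiny classifier for 28x28 MedMNIST-style images."""
    def __init__(self, n_classes: int, in_channels: int = 1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 7 * 7, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

# Random tensors stand in for one of the 2D datasets (an assumed
# single-channel, 9-class task).
x = torch.randn(8, 1, 28, 28)
print(SmallCNN(n_classes=9)(x).shape)  # torch.Size([8, 9])
```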
Collapse
Affiliation(s)
- Jiancheng Yang
- Shanghai Jiao Tong University, Shanghai, China
| | - Rui Shi
- Shanghai Jiao Tong University, Shanghai, China
| | - Donglai Wei
- Boston College, Chestnut Hill, MA, USA
| | - Zequan Liu
- RWTH Aachen University, Aachen, Germany
| | - Lin Zhao
- Department of Endocrinology and Metabolism, Fudan Institute of Metabolic Diseases, Zhongshan Hospital, Fudan University, Shanghai, China
| | - Bilian Ke
- Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | | | - Bingbing Ni
- Shanghai Jiao Tong University, Shanghai, China
| |
Collapse
|
24
|
Iyer S, Blair A, White C, Dawes L, Moses D, Sowmya A. Vertebral compression fracture detection using imitation learning, patch based convolutional neural networks and majority voting. INFORMATICS IN MEDICINE UNLOCKED 2023. [DOI: 10.1016/j.imu.2023.101238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/09/2023] Open
|
25
|
Udupa JK, Liu T, Jin C, Zhao L, Odhner D, Tong Y, Agrawal V, Pednekar G, Nag S, Kotia T, Goodman M, Wileyto EP, Mihailidis D, Lukens JN, Berman AT, Stambaugh J, Lim T, Chowdary R, Jalluri D, Jabbour SK, Kim S, Reyhan M, Robinson CG, Thorstad WL, Choi JI, Press R, Simone CB, Camaratta J, Owens S, Torigian DA. Combining natural and artificial intelligence for robust automatic anatomy segmentation: Application in neck and thorax auto-contouring. Med Phys 2022; 49:7118-7149. [PMID: 35833287 PMCID: PMC10087050 DOI: 10.1002/mp.15854] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2022] [Revised: 06/20/2022] [Accepted: 06/30/2022] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Automatic segmentation of 3D objects in computed tomography (CT) is challenging. Current methods, based mainly on artificial intelligence (AI) and end-to-end deep learning (DL) networks, are weak in garnering high-level anatomic information, which leads to compromised efficiency and robustness. This can be overcome by incorporating natural intelligence (NI) into AI methods via computational models of human anatomic knowledge. PURPOSE We formulate a hybrid intelligence (HI) approach that integrates the complementary strengths of NI and AI for organ segmentation in CT images and illustrate performance in the application of radiation therapy (RT) planning via multisite clinical evaluation. METHODS The system employs five modules: (i) body region recognition, which automatically trims a given image to a precisely defined target body region; (ii) NI-based automatic anatomy recognition object recognition (AAR-R), which performs object recognition in the trimmed image without DL and outputs a localized fuzzy model for each object; (iii) DL-based recognition (DL-R), which refines the coarse recognition results of AAR-R and outputs a stack of 2D bounding boxes (BBs) for each object; (iv) model morphing (MM), which deforms the AAR-R fuzzy model of each object guided by the BBs output by DL-R; and (v) DL-based delineation (DL-D), which employs the object containment information provided by MM to delineate each object. NI from (ii), AI from (i), (iii), and (v), and their combination from (iv) facilitate the HI system. RESULTS The HI system was tested on 26 organs in neck and thorax body regions on CT images obtained prospectively from 464 patients in a study involving four RT centers. Data sets from one separate independent institution involving 125 patients were employed in training/model building for each of the two body regions, whereas 104 and 110 data sets from the 4 RT centers were utilized for testing on neck and thorax, respectively. In the testing data sets, 83% of the images had limitations such as streak artifacts, poor contrast, shape distortion, pathology, or implants. The contours output by the HI system were compared to contours drawn in clinical practice at the four RT centers by utilizing an independently established ground-truth set of contours as reference. Three sets of measures were employed: accuracy via Dice coefficient (DC) and Hausdorff boundary distance (HD), subjective clinical acceptability via a blinded reader study, and efficiency by measuring human time saved in contouring by the HI system. Overall, the HI system achieved a mean DC of 0.78 and 0.87 and a mean HD of 2.22 and 4.53 mm for neck and thorax, respectively. It significantly outperformed clinical contouring in accuracy and saved an overall 70% of the human time spent on clinical contouring, whereas acceptability scores varied significantly from site to site for both auto-contours and clinically drawn contours. CONCLUSIONS The HI system behaves like an expert human in the robustness of its contouring but is vastly more efficient. It appears to draw on NI where image information alone does not suffice, first for correct localization of the object and then for precise delineation of its boundary.
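The accuracy measures used above (Dice coefficient and Hausdorff distance) can be computed from binary masks as in the following sketch. It uses all foreground voxels rather than extracted surfaces, which is a simplification of how boundary distances are usually reported, and the voxel spacing is an assumed parameter.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_coefficient(a: np.ndarray, b: np.ndarray) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def hausdorff_distance(a: np.ndarray, b: np.ndarray, spacing=(1.0, 1.0, 1.0)) -> float:
    """Symmetric Hausdorff distance between two non-empty masks, in physical units.
    All foreground voxels are used here; surface extraction is the usual refinement."""
    pa = np.argwhere(a) * np.asarray(spacing)
    pb = np.argwhere(b) * np.asarray(spacing)
    return max(directed_hausdorff(pa, pb)[0], directed_hausdorff(pb, pa)[0])

pred = np.zeros((40, 40, 40), bool); pred[10:30, 10:30, 10:30] = True
ref = np.zeros((40, 40, 40), bool); ref[12:30, 10:30, 10:30] = True
print(dice_coefficient(pred, ref), hausdorff_distance(pred, ref))  # ~0.947, 2.0
```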
Collapse
Affiliation(s)
- Jayaram K. Udupa
- Medical Image Processing Group, Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Tiange Liu
- Medical Image Processing Group, Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- School of Information Science and Engineering, Yanshan University, Qinhuangdao, China
| | - Chao Jin
- Medical Image Processing Group, Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Liming Zhao
- Medical Image Processing Group, Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Dewey Odhner
- Medical Image Processing Group, Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Yubing Tong
- Medical Image Processing Group, Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Vibhu Agrawal
- Medical Image Processing Group, Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Gargi Pednekar
- Quantitative Radiology Solutions, Philadelphia, Pennsylvania, USA
| | - Sanghita Nag
- Quantitative Radiology Solutions, Philadelphia, Pennsylvania, USA
| | - Tarun Kotia
- Quantitative Radiology Solutions, Philadelphia, Pennsylvania, USA
| | | | - E. Paul Wileyto
- Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Dimitris Mihailidis
- Department of Radiation Oncology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - John Nicholas Lukens
- Department of Radiation Oncology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Abigail T. Berman
- Department of Radiation Oncology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Joann Stambaugh
- Department of Radiation Oncology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Tristan Lim
- Department of Radiation Oncology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Rupa Chowdary
- Department of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Dheeraj Jalluri
- Department of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Salma K. Jabbour
- Department of Radiation Oncology, Rutgers University, New Brunswick, New Jersey, USA
| | - Sung Kim
- Department of Radiation Oncology, Rutgers University, New Brunswick, New Jersey, USA
| | - Meral Reyhan
- Department of Radiation Oncology, Rutgers University, New Brunswick, New Jersey, USA
| | | | - Wade L. Thorstad
- Department of Radiation Oncology, Washington University, St. Louis, Missouri, USA
| | | | | | | | - Joe Camaratta
- Quantitative Radiology Solutions, Philadelphia, Pennsylvania, USA
| | - Steve Owens
- Quantitative Radiology Solutions, Philadelphia, Pennsylvania, USA
| | - Drew A. Torigian
- Medical Image Processing Group, Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| |
Collapse
|
26
|
Lang Y, Lian C, Xiao D, Deng H, Thung KH, Yuan P, Gateno J, Kuang T, Alfi DM, Wang L, Shen D, Xia JJ, Yap PT. Localization of Craniomaxillofacial Landmarks on CBCT Images Using 3D Mask R-CNN and Local Dependency Learning. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:2856-2866. [PMID: 35544487 PMCID: PMC9673501 DOI: 10.1109/tmi.2022.3174513] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/09/2023]
Abstract
Cephalometric analysis relies on accurate detection of craniomaxillofacial (CMF) landmarks from cone-beam computed tomography (CBCT) images. However, due to the complexity of CMF bony structures, it is difficult to localize landmarks efficiently and accurately. In this paper, we propose a deep learning framework to tackle this challenge by jointly digitizing 105 CMF landmarks on CBCT images. By explicitly learning the local geometrical relationships between the landmarks, our approach extends Mask R-CNN for end-to-end prediction of landmark locations. Specifically, we first apply a detection network on a down-sampled 3D image to leverage global contextual information to predict the approximate locations of the landmarks. We subsequently leverage local information provided by higher-resolution image patches to refine the landmark locations. On patients with varying non-syndromic jaw deformities, our method achieves an average detection accuracy of 1.38 ± 0.95 mm, outperforming a related state-of-the-art method.
Collapse
|
27
|
Zhang Y, Hu N, Li Z, Ji X, Liu S, Sha Y, Song X, Zhang J, Hu L, Li W. Lumbar spine localisation method based on feature fusion. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY 2022. [DOI: 10.1049/cit2.12137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022] Open
Affiliation(s)
- Yonghong Zhang
- Robotics Institute School of Mechanical Engineering and Automation Beihang University Beijing China
- Beijing Zhuzheng Robot Co., LTD Beijing China
| | - Ning Hu
- Department of Mechanical, Aerospace and Biomedical Engineering University of Tennessee Knoxville Tennessee USA
| | - Zhuofu Li
- Department of Orthopaedics Peking University Third Hospital Beijing China
- Engineering Research Center of Bone and Joint Precision Medicine Ministry of Education Beijing China
- Beijing Key Laboratory of Spinal Disease Research Beijing China
| | - Xuquan Ji
- Beijing Zhuzheng Robot Co., LTD Beijing China
- School of Biological Science and Medical Engineering Beihang University Beijing China
| | - Shanshan Liu
- Department of Orthopaedics Peking University Third Hospital Beijing China
- Engineering Research Center of Bone and Joint Precision Medicine Ministry of Education Beijing China
- Beijing Key Laboratory of Spinal Disease Research Beijing China
| | - Youyang Sha
- Department of Computer Science University of Warwick Coventry UK
| | - Xiongkang Song
- Robotics Institute School of Mechanical Engineering and Automation Beihang University Beijing China
- Beijing Zhuzheng Robot Co., LTD Beijing China
| | - Jian Zhang
- Robotics Institute School of Mechanical Engineering and Automation Beihang University Beijing China
- Beijing Zhuzheng Robot Co., LTD Beijing China
| | - Lei Hu
- Robotics Institute School of Mechanical Engineering and Automation Beihang University Beijing China
- Beijing Zhuzheng Robot Co., LTD Beijing China
| | - Weishi Li
- Department of Orthopaedics Peking University Third Hospital Beijing China
- Engineering Research Center of Bone and Joint Precision Medicine Ministry of Education Beijing China
- Beijing Key Laboratory of Spinal Disease Research Beijing China
| |
Collapse
|
28
|
Ahmed S, Dera D, Hassan SU, Bouaynaya N, Rasool G. Failure Detection in Deep Neural Networks for Medical Imaging. FRONTIERS IN MEDICAL TECHNOLOGY 2022; 4:919046. [PMID: 35958121 PMCID: PMC9359318 DOI: 10.3389/fmedt.2022.919046] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2022] [Accepted: 06/20/2022] [Indexed: 11/13/2022] Open
Abstract
Deep neural networks (DNNs) have started to find their role in the modern healthcare system. DNNs are being developed for diagnosis, prognosis, treatment planning, and outcome prediction for various diseases. With the increasing number of applications of DNNs in modern healthcare, their trustworthiness and reliability are becoming increasingly important. An essential aspect of trustworthiness is detecting the performance degradation and failure of deployed DNNs in medical settings. The softmax output values produced by DNNs are not a calibrated measure of model confidence. Softmax probability numbers are generally higher than the actual model confidence. The model confidence-accuracy gap further increases for wrong predictions and noisy inputs. We employ recently proposed Bayesian deep neural networks (BDNNs) to learn uncertainty in the model parameters. These models simultaneously output the predictions and a measure of confidence in the predictions. By testing these models under various noisy conditions, we show that the (learned) predictive confidence is well calibrated. We use these reliable confidence values for monitoring performance degradation and failure detection in DNNs. We propose two different failure detection methods. In the first method, we define a fixed threshold value based on the behavior of the predictive confidence with changing signal-to-noise ratio (SNR) of the test dataset. The second method learns the threshold value with a neural network. The proposed failure detection mechanisms seamlessly abstain from making decisions when the confidence of the BDNN is below the defined threshold and hold the decision for manual review. Resultantly, the accuracy of the models improves on the unseen test samples. We tested our proposed approach on three medical imaging datasets: PathMNIST, DermaMNIST, and OrganAMNIST, under different levels and types of noise. An increase in the noise of the test images increases the number of abstained samples. BDNNs are inherently robust and show more than 10% accuracy improvement with the proposed failure detection methods. The increased number of abstained samples or an abrupt increase in the predictive variance indicates model performance degradation or possible failure. Our work has the potential to improve the trustworthiness of DNNs and enhance user confidence in the model predictions.
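The fixed-threshold abstention idea can be sketched as follows. Here Monte Carlo dropout stands in for the variational Bayesian DNNs used in the paper, and the threshold value is an assumed input rather than the one derived or learned in the study.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def predict_with_abstention(model: nn.Module, x: torch.Tensor,
                            threshold: float = 0.8, n_samples: int = 20):
    """Sample several stochastic forward passes (Monte Carlo dropout), average the
    softmax outputs, and abstain -- defer to manual review -- whenever the
    resulting confidence falls below a fixed threshold."""
    model.train()  # keep dropout layers active while sampling
    with torch.no_grad():
        probs = torch.stack([F.softmax(model(x), dim=1) for _ in range(n_samples)])
    mean_probs = probs.mean(dim=0)                 # predictive distribution
    confidence, prediction = mean_probs.max(dim=1)
    return prediction, confidence, confidence < threshold  # abstention mask

# Toy model with dropout so the sampled passes actually differ.
net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(),
                    nn.Dropout(0.5), nn.Linear(64, 9))
pred, conf, abstain = predict_with_abstention(net, torch.randn(4, 1, 28, 28))
print(pred, conf, abstain)
```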
Collapse
Affiliation(s)
- Sabeen Ahmed
- Department of Electrical and Computer Engineering, Rowan University, Glassboro, NJ, United States
- *Correspondence: Sabeen Ahmed
| | - Dimah Dera
- University of Texas Rio Grande Valley, Brownsville, TX, United States
| | | | - Nidhal Bouaynaya
- Department of Electrical and Computer Engineering, Rowan University, Glassboro, NJ, United States
| | - Ghulam Rasool
- Machine Learning Department, Moffitt Cancer Center, Tampa, FL, United States
| |
Collapse
|
29
|
Navarro F, Sasahara G, Shit S, Sekuboyina A, Ezhov I, Peeken JC, Combs SE, Menze BH. A Unified 3D Framework for Organs-at-Risk Localization and Segmentation for Radiation Therapy Planning. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2022; 2022:1544-1547. [PMID: 36086554 DOI: 10.1109/embc48229.2022.9871680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Automatic localization and segmentation of organs-at-risk (OAR) in CT are essential pre-processing steps in medical image analysis tasks, such as radiation therapy planning. For instance, the segmentation of OAR surrounding tumors enables the maximization of radiation to the tumor area without compromising the healthy tissues. However, the current medical workflow requires manual delineation of OAR, which is prone to errors and is annotator-dependent. In this work, we aim to introduce a unified 3D pipeline for OAR localization-segmentation rather than novel localization or segmentation architectures. To the best of our knowledge, our proposed framework fully enables the exploitation of 3D context information inherent in medical imaging. In the first step, a 3D multi-variate regression network predicts organs' centroids and bounding boxes. Secondly, 3D organ-specific segmentation networks are leveraged to generate a multi-organ segmentation map. Our method achieved an overall Dice score of 0.9260 ± 0.18% on the VISCERAL dataset containing CT scans with varying fields of view and multiple organs.
Collapse
|
30
|
Jin C, Udupa JK, Zhao L, Tong Y, Odhner D, Pednekar G, Nag S, Lewis S, Poole N, Mannikeri S, Govindasamy S, Singh A, Camaratta J, Owens S, Torigian DA. Object recognition in medical images via anatomy-guided deep learning. Med Image Anal 2022; 81:102527. [DOI: 10.1016/j.media.2022.102527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2021] [Revised: 03/31/2022] [Accepted: 06/24/2022] [Indexed: 11/25/2022]
|
31
|
Few-Shot Learning with Collateral Location Coding and Single-Key Global Spatial Attention for Medical Image Classification. ELECTRONICS 2022. [DOI: 10.3390/electronics11091510] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
Humans are born with the ability to learn quickly by discerning objects from a few samples, to acquire new skills in a short period of time, and to make decisions based on limited prior experience and knowledge. Existing deep learning models for medical image classification often rely on a large number of labeled training samples, and a comparably fast learning ability has yet to be developed in deep neural networks. In addition, retraining a deep model when it encounters previously unseen classes requires a large amount of time and computing resources. However, for healthcare applications, enabling a model to generalize to new clinical scenarios is of great importance. Existing image classification methods cannot explicitly use pixel location information, making them insensitive to cues related only to location. They also rely on local convolution and cannot properly utilize global information, which is essential for image classification. To alleviate these problems, we propose collateral location coding, which helps the network explicitly exploit the location of each pixel so that location-only cues become easier to recognize, and a single-key global spatial attention that allows the pixels at each location to perceive global spatial information in a low-cost way. Experimental results on three medical image benchmark datasets demonstrate that our proposed algorithm outperforms the state-of-the-art approaches in both effectiveness and generalization ability.
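The location-coding idea of explicitly exposing pixel coordinates to the network can be illustrated with a CoordConv-style sketch; the exact collateral location coding and single-key attention of the paper are not reproduced here.

```python
import torch

def add_location_coding(x: torch.Tensor) -> torch.Tensor:
    """Append normalized (row, column) coordinate channels to an image batch so that
    later convolutions can see absolute pixel location (a CoordConv-style coding)."""
    b, _, h, w = x.shape
    rows = torch.linspace(-1, 1, h, device=x.device).view(1, 1, h, 1).expand(b, 1, h, w)
    cols = torch.linspace(-1, 1, w, device=x.device).view(1, 1, 1, w).expand(b, 1, h, w)
    return torch.cat([x, rows, cols], dim=1)

imgs = torch.randn(4, 3, 224, 224)
print(add_location_coding(imgs).shape)  # torch.Size([4, 5, 224, 224])
```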
Collapse
|
32
|
Iyer S, Blair A, Dawes L, Moses D, White C, Sowmya A. Supervised and semi-supervised 3D organ localisation in CT images combining reinforcement learning with imitation learning. Biomed Phys Eng Express 2022; 8. [PMID: 35385835 DOI: 10.1088/2057-1976/ac64c5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2022] [Accepted: 04/06/2022] [Indexed: 11/12/2022]
Abstract
Computer-aided diagnosis often requires analysis of a region of interest (ROI) within a radiology scan, where the ROI may be an organ or a suborgan. Although deep learning algorithms have the ability to outperform other methods, they rely on the availability of a large amount of annotated data. Motivated by the need to address this limitation, an approach to localisation and detection of multiple organs based on supervised and semi-supervised learning is presented here. It draws upon previous work by the authors on localising the thoracic and lumbar spine region in CT images. The method generates six bounding boxes of organs of interest, which are then fused into a single bounding box. The results of experiments on localisation of the spleen and the left and right kidneys in CT images using supervised and semi-supervised learning (SSL) demonstrate the ability to address data limitations with a much smaller dataset and fewer annotations, compared to other state-of-the-art methods. The SSL performance was evaluated using three different mixes of labelled and unlabelled data (30:70, 35:65, and 40:60) for the lumbar spine, spleen, and left and right kidneys, respectively. The results indicate that SSL provides a workable alternative, especially in medical imaging, where annotated data are difficult to obtain.
Collapse
Affiliation(s)
- Sankaran Iyer
- School of Computer Science and Engineering, The University of New South Wales, Australia
| | - Alan Blair
- School of Computer Science and Engineering, The University of New South Wales, Australia
| | - Laughlin Dawes
- Department of Medical Imaging, Prince of Wales Hospital, NSW, Australia
| | - Daniel Moses
- Department of Medical Imaging, Prince of Wales Hospital, NSW, Australia
| | - Christopher White
- Department of Endocrinology and Metabolism, Prince of Wales Hospital, NSW, Australia
| | - Arcot Sowmya
- School of Computer Science and Engineering, The University of New South Wales, Australia
| |
Collapse
|
33
|
Chen X, Li Y, Yao L, Adeli E, Zhang Y, Wang X. Generative Adversarial U-Net for Domain-free Few-shot Medical Diagnosis. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.03.022] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
|
34
|
Luu MH, Walsum TV, Mai HS, Franklin D, Nguyen TTT, Le TM, Moelker A, Le VK, Vu DL, Le NH, Tran QL, Chu DT, Trung NL. Automatic scan range for dose-reduced multiphase CT imaging of the liver utilizing CNNs and Gaussian models. Med Image Anal 2022; 78:102422. [PMID: 35339951 DOI: 10.1016/j.media.2022.102422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2021] [Revised: 12/27/2021] [Accepted: 03/11/2022] [Indexed: 12/24/2022]
Abstract
Multiphase CT scanning of the liver is performed for several clinical applications; however, radiation exposure from CT scanning poses a nontrivial cancer risk to the patients. The radiation dose may be reduced by determining the scan range of subsequent scans from the location of the target of interest in the first scan phase. The purpose of this study is to present and assess an automatic method for determining the scan range for multiphase CT scans. Our strategy is to first apply a CNN-based method for detecting the liver in 2D slices, and to use a liver range search algorithm for detecting the liver range in the scout volume. The target liver scan range for subsequent scans can be obtained by adding safety margins, obtained from Gaussian liver motion models, to the scan range determined from the scout. Experiments were performed on 657 multiphase CT volumes obtained from multiple hospitals. The experiments show that the proposed liver detection method detects the liver in 223 of 224 3D volumes, on average within one second, with a mean intersection over union, wall distance and centroid distance of 85.5%, 5.7 mm and 9.7 mm, respectively. In addition, the proposed liver detection method is comparable in accuracy to the best state-of-the-art 3D liver detectors while requiring less processing time. Furthermore, we apply the liver scan range generation method to liver CT images acquired during radiofrequency ablation (RFA) and Y-90 transarterial radioembolization (selective internal radiation therapy) interventions of 46 patients from two hospitals. The results show that automatic scan range generation significantly reduces the effective radiation dose, by an average of 14.5% (2.56 mSv), compared with manual selection by the radiographer for Y-90 transarterial radioembolization, while no statistically significant difference was found for the CT images from the intra-RFA intervention (p = 0.81). Finally, three radiologists assessed both the original and the range-reduced images to evaluate the effect of the range reduction method on their clinical decisions. We conclude that the automatic liver scan range generation method reduces excess radiation compared with manual selection, with high accuracy and without compromising clinical decisions.
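The margin logic can be sketched as follows: the liver span detected in the scout is padded by a safety margin derived from a Gaussian motion model. The motion statistics and coverage factor below are illustrative placeholders, not values from the study.

```python
def scan_range_with_margin(liver_zmin_mm: float, liver_zmax_mm: float,
                           motion_mean_mm: float = 0.0,
                           motion_std_mm: float = 5.0, k: float = 2.0):
    """Pad the liver span found in the scout scan with a safety margin taken from a
    Gaussian liver-motion model; mean + k*std with k = 2 covers roughly 97.7% of
    one-sided shifts under that model."""
    margin = motion_mean_mm + k * motion_std_mm
    return liver_zmin_mm - margin, liver_zmax_mm + margin

print(scan_range_with_margin(-120.0, 60.0))  # (-130.0, 70.0)
```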
Collapse
Affiliation(s)
- Manh Ha Luu
- AVITECH, University of Engineering and Technology, VNU, Hanoi, Vietnam; Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands; FET, University of Engineering and Technology, VNU, Hanoi, Vietnam.
| | - Theo van Walsum
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands
| | - Hong Son Mai
- Department of Nuclear Medicine, Hospital 108, Hanoi, Vietnam
| | - Daniel Franklin
- School of Electrical and Data Engineering, University of Technology Sydney, Sydney, Australia
| | | | - Thi My Le
- Department of Radiology and Nuclear Medicine, Vinmec Hospital, Hanoi, Vietnam
| | - Adriaan Moelker
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands
| | - Van Khang Le
- Radiology Center, Bach Mai Hospital, Hanoi, Vietnam
| | - Dang Luu Vu
- Radiology Center, Bach Mai Hospital, Hanoi, Vietnam
| | - Ngoc Ha Le
- Department of Nuclear Medicine, Hospital 108, Hanoi, Vietnam
| | - Quoc Long Tran
- FIT, University of Engineering and Technology, VNU, Hanoi, Vietnam
| | - Duc Trinh Chu
- FET, University of Engineering and Technology, VNU, Hanoi, Vietnam
| | - Nguyen Linh Trung
- AVITECH, University of Engineering and Technology, VNU, Hanoi, Vietnam
| |
Collapse
|
35
|
Xiong X, Smith BJ, Graves SA, Sunderland JJ, Graham MM, Gross BA, Buatti JM, Beichel RR. Quantification of uptake in pelvis F-18 FLT PET-CT images using a 3D localization and segmentation CNN. Med Phys 2022; 49:1585-1598. [PMID: 34982836 PMCID: PMC9447843 DOI: 10.1002/mp.15440] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2021] [Revised: 12/01/2021] [Accepted: 12/02/2021] [Indexed: 11/12/2022] Open
Abstract
PURPOSE The purpose of this work was to develop and validate a deep convolutional neural network (CNN) approach for the automated pelvis segmentation in computed tomography (CT) scans to enable the quantification of active pelvic bone marrow by means of Fluorothymidine F-18 (FLT) tracer uptake measurement in positron emission tomography (PET) scans. This quantification is a critical step in calculating bone marrow dose for radiopharmaceutical therapy clinical applications as well as external beam radiation doses. METHODS An approach for the combined localization and segmentation of the pelvis in CT volumes of varying sizes, ranging from full-body to pelvis CT scans, was developed that utilizes a novel CNN architecture in combination with a random sampling strategy. The method was validated on 34 planning CT scans and 106 full-body FLT PET-CT scans using a cross-validation strategy. Specifically, two different training and CNN application options were studied, quantitatively assessed, and statistically compared. RESULTS The proposed method was able to successfully locate and segment the pelvis in all test cases. On all data sets, an average Dice coefficient of 0.9396 ± 0.0182 or better was achieved. The relative tracer uptake measurement error ranged between 0.065% and 0.204%. The proposed approach is time-efficient and shows a reduction in runtime of up to 95% compared to a standard U-Net-based approach without a localization component. CONCLUSIONS The proposed method enables the efficient calculation of FLT uptake in the pelvis. Thus, it represents a valuable tool to facilitate bone marrow preserving adaptive radiation therapy and radiopharmaceutical dose calculation. Furthermore, the method can be adapted to process other bone structures as well as organs.
Collapse
Affiliation(s)
- Xiaofan Xiong
- Department of Biomedical Engineering, The University of Iowa, Iowa City, IA 52242
| | - Brian J. Smith
- Department of Biostatistics, The University of Iowa, Iowa City, IA 52242
| | - Stephen A. Graves
- Department of Radiology, The University of Iowa, Iowa City, IA 52242
| | | | - Michael M. Graham
- Department of Radiology, The University of Iowa, Iowa City, IA 52242
| | - Brandie A. Gross
- Department of Radiation Oncology, University of Iowa Hospitals and Clinics, Iowa City, IA 52242
| | - John M. Buatti
- Department of Radiation Oncology, University of Iowa Hospitals and Clinics, Iowa City, IA 52242
| | - Reinhard R. Beichel
- Department of Electrical and Computer Engineering, The University of Iowa, Iowa City, IA 52242
| |
Collapse
|
36
|
An Algorithm for Automatic Rib Fracture Recognition Combined with nnU-Net and DenseNet. EVIDENCE-BASED COMPLEMENTARY AND ALTERNATIVE MEDICINE 2022; 2022:5841451. [PMID: 35251210 PMCID: PMC8896936 DOI: 10.1155/2022/5841451] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/18/2021] [Accepted: 01/31/2022] [Indexed: 11/29/2022]
Abstract
Rib fracture is the most common injury in thoracic trauma. Most patients have multiple fracture regions of different types, so accurate and rapid identification of all of them is crucial for treatment. In this study, a two-stage rib fracture recognition model based on nnU-Net is proposed. In the first stage, a deep learning segmentation model is trained to generate candidate rib fracture regions; in the second stage, a deep learning classification model classifies each segmented candidate region as a fracture or not. The results show that the proposed two-stage deep learning model improves the accuracy of rib fracture recognition and reduces the false-positive and false-negative rates of rib fracture detection, thereby better assisting doctors in identifying fracture regions.
Collapse
|
37
|
Chen X, Lian C, Deng HH, Kuang T, Lin HY, Xiao D, Gateno J, Shen D, Xia JJ, Yap PT. Fast and Accurate Craniomaxillofacial Landmark Detection via 3D Faster R-CNN. IEEE TRANSACTIONS ON MEDICAL IMAGING 2021; 40:3867-3878. [PMID: 34310293 PMCID: PMC8686670 DOI: 10.1109/tmi.2021.3099509] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/09/2023]
Abstract
Automatic craniomaxillofacial (CMF) landmark localization from cone-beam computed tomography (CBCT) images is challenging, considering that 1) the number of landmarks in the images may change due to varying deformities and traumatic defects, and 2) the CBCT images used in clinical practice are typically large. In this paper, we propose a two-stage, coarse-to-fine deep learning method to tackle these challenges with both speed and accuracy in mind. Specifically, we first use a 3D faster R-CNN to roughly locate landmarks in down-sampled CBCT images that have varying numbers of landmarks. By converting the landmark point detection problem to a generic object detection problem, our 3D faster R-CNN is formulated to detect virtual, fixed-size objects in small boxes with centers indicating the approximate locations of the landmarks. Based on the rough landmark locations, we then crop 3D patches from the high-resolution images and send them to a multi-scale UNet for the regression of heatmaps, from which the refined landmark locations are finally derived. We evaluated the proposed approach by detecting up to 18 landmarks on a real clinical dataset of CMF CBCT images with various conditions. Experiments show that our approach achieves state-of-the-art accuracy of 0.89 ± 0.64mm in an average time of 26.2 seconds per volume.
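The final refinement step, turning a patch-level heatmap into a world-space landmark coordinate, can be sketched as follows; the plain argmax (rather than a soft-argmax) and the patch origin and spacing values are illustrative simplifications, not the authors' exact formulation.

```python
import numpy as np

def heatmap_to_landmark(heatmap: np.ndarray, patch_origin, spacing=(1.0, 1.0, 1.0)):
    """Map the peak of a single-landmark 3D heatmap, predicted on a cropped patch,
    back to world coordinates via the patch origin and voxel spacing."""
    idx = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return np.asarray(idx, dtype=float) * np.asarray(spacing) + np.asarray(patch_origin, dtype=float)

hm = np.zeros((32, 32, 32)); hm[10, 20, 5] = 1.0
print(heatmap_to_landmark(hm, patch_origin=(100.0, 80.0, 40.0), spacing=(0.4, 0.4, 0.4)))
# [104.  88.  42.]
```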
Collapse
|
38
|
|
39
|
Hussain MA, Hamarneh G, Garbi R. Cascaded Regression Neural Nets for Kidney Localization and Segmentation-free Volume Estimation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2021; 40:1555-1567. [PMID: 33606626 DOI: 10.1109/tmi.2021.3060465] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Kidney volume is an essential biomarker for a number of kidney disease diagnoses, for example, chronic kidney disease. Existing total kidney volume estimation methods often rely on an intermediate kidney segmentation step. On the other hand, automatic kidney localization in volumetric medical images is a critical step that often precedes subsequent data processing and analysis. Most current approaches perform kidney localization via an intermediate classification or regression step. This paper proposes an integrated deep learning approach for (i) kidney localization in computed tomography scans and (ii) segmentation-free renal volume estimation. Our localization method uses a selection-convolutional neural network that approximates the kidney inferior-superior span along the axial direction. Cross-sectional (2D) slices from the estimated span are subsequently used in a combined sagittal-axial Mask-RCNN that detects the organ bounding boxes on the axial and sagittal slices, the combination of which produces a final 3D organ bounding box. Furthermore, we use a fully convolutional network to estimate the kidney volume that skips the segmentation procedure. We also present a mathematical expression to approximate the 'volume error' metric from the 'Sørensen-Dice coefficient.' We accessed 100 patients' CT scans from the Vancouver General Hospital records and obtained 210 patients' CT scans from the 2019 Kidney Tumor Segmentation Challenge database to validate our method. Our method produces a kidney boundary wall localization error of ~2.4mm and a mean volume estimation error of ~5%.
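A simple relationship (not necessarily the approximation used in the paper) links overlap to volume agreement: since |A ∩ B| ≤ min(|A|, |B|), Dice = 2|A ∩ B| / (|A| + |B|) implies |V_A − V_B| / (V_A + V_B) ≤ 1 − Dice, so a high Dice score caps the achievable relative volume discrepancy between a predicted and a reference kidney.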
Collapse
|
40
|
Fu Y, Lei Y, Wang T, Curran WJ, Liu T, Yang X. A review of deep learning based methods for medical image multi-organ segmentation. Phys Med 2021; 85:107-122. [PMID: 33992856 PMCID: PMC8217246 DOI: 10.1016/j.ejmp.2021.05.003] [Citation(s) in RCA: 89] [Impact Index Per Article: 22.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/27/2020] [Revised: 03/12/2021] [Accepted: 05/03/2021] [Indexed: 12/12/2022] Open
Abstract
Deep learning has revolutionized image processing and achieved state-of-the-art performance in many medical image segmentation tasks. Many deep learning-based methods have been published to segment different parts of the body for different medical applications. It is necessary to summarize the current state of development for deep learning in the field of medical image segmentation. In this paper, we aim to provide a comprehensive review with a focus on multi-organ image segmentation, which is crucial for radiotherapy where the tumor and organs-at-risk need to be contoured for treatment planning. We grouped the surveyed methods into two broad categories: 'pixel-wise classification' and 'end-to-end segmentation'. Each category was divided into subgroups according to their network design. For each type, we listed the surveyed works, highlighted important contributions and identified specific challenges. Following the detailed review, we discussed the achievements, shortcomings and future potentials of each category. To enable direct comparison, we listed the performance of the surveyed works that used thoracic and head-and-neck benchmark datasets.
Collapse
Affiliation(s)
- Yabo Fu
- Department of Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA, USA
| | - Yang Lei
- Department of Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA, USA
| | - Tonghe Wang
- Department of Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA, USA
| | - Walter J Curran
- Department of Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA, USA
| | - Tian Liu
- Department of Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA, USA
| | - Xiaofeng Yang
- Department of Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA, USA.
| |
Collapse
|
41
|
Tang Y, Gao R, Han S, Chen Y, Gao D, Nath V, Bermudez C, Savona MR, Bao S, Lyu I, Huo Y, Landman BA. Body Part Regression With Self-Supervision. IEEE TRANSACTIONS ON MEDICAL IMAGING 2021; 40:1499-1507. [PMID: 33560981 PMCID: PMC10243464 DOI: 10.1109/tmi.2021.3058281] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]
Abstract
Body part regression is a promising new technique that enables content navigation through self-supervised learning. Using this technique, the global quantitative spatial location for each axial view slice is obtained from computed tomography (CT). However, it is challenging to define a unified global coordinate system for body CT scans due to the large variabilities in image resolution, contrasts, sequences, and patient anatomy. Therefore, the widely used supervised learning approach cannot be easily deployed. To address these concerns, we propose an annotation-free method named blind-unsupervised-supervision network (BUSN). The contributions of this work are fourfold: (1) 1,030 multi-center CT scans are used to develop BUSN without any manual annotation; (2) the proposed BUSN corrects the predictions from unsupervised learning and uses the corrected results as new supervision; (3) to improve the consistency of predictions, we propose a novel neighbor message passing (NMP) scheme that is integrated with BUSN as a statistical-learning-based correction; and (4) we introduce a new pre-processing pipeline that includes BUSN and is validated on 3D multi-organ segmentation. The proposed method is trained on 1,030 whole-body CT scans (230,650 slices) from five datasets, with an additional independent external validation cohort of 100 scans. In body part regression, the proposed BUSN achieved a significantly higher median R-squared score (0.9089) than the state-of-the-art unsupervised method (0.7153). When BUSN is introduced as a preprocessing stage for volumetric segmentation, the proposed pipeline increases the total mean Dice score of 3D abdominal multi-organ segmentation from 0.7991 to 0.8145.
Collapse
Affiliation(s)
- Yucheng Tang
- Department of Electrical Engineering, Vanderbilt University
| | - Riqiang Gao
- Department of Electrical Engineering and Computer Science, Vanderbilt University
| | | | | | - Dashan Gao
- 12 Sigma Technologies, San Diego, CA 92130, USA
| | - Vishwesh Nath
- Department of Electrical Engineering and Computer Science, Vanderbilt University
| | | | - Michael R. Savona
- Department of Medicine and Program in Cancer Biology, Vanderbilt University Medical Center, Nashville, TN 37235, USA
| | - Shunxing Bao
- Department of Electrical Engineering and Computer Science, Vanderbilt University
| | - Ilwoo Lyu
- Department of Electrical Engineering and Computer Science, Vanderbilt University
| | - Yuankai Huo
- Department of Electrical Engineering, Vanderbilt University
| | - Bennett A. Landman
- Department of Electrical Engineering and Computer Science, Vanderbilt University
| |
Collapse
|
42
|
Santhosh Reddy D, Rajalakshmi P, Mateen M. A deep learning based approach for classification of abdominal organs using ultrasound images. Biocybern Biomed Eng 2021. [DOI: 10.1016/j.bbe.2021.05.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
43
|
Li H, Wang S, Tang J, Wu J, Liu Y. Computed Tomography- (CT-) Based Virtual Surgery Planning for Spinal Intervertebral Foraminal Assisted Clinical Treatment. JOURNAL OF HEALTHCARE ENGINEERING 2021; 2021:5521916. [PMID: 33747415 PMCID: PMC7960066 DOI: 10.1155/2021/5521916] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/29/2021] [Revised: 02/20/2021] [Accepted: 03/01/2021] [Indexed: 11/26/2022]
Abstract
With the development of minimally invasive spine concepts and the introduction of new minimally invasive instruments, minimally invasive spine technology, represented by foraminoscopy, has flourished, and percutaneous foraminoscopy has become one of the most reliable minimally invasive procedures for the treatment of lumbar disc herniation. Percutaneous foraminoscopy is a safe and effective minimally invasive spinal endoscopic surgical technique. It fully protects the paravertebral muscles and soft tissues as well as the posterior column structure of the spine and provides precise treatment of the target nucleus pulposus tissue, with the advantages of less surgical trauma, fewer postoperative complications, and rapid postoperative recovery; it is therefore widely promoted and used in clinical practice. In this paper, the location, morphology, structure, alignment, and adjacency relationships can be viewed by performing coronal and oblique CT reconstructions along the attachment of the yellow ligaments and by applying 3D reconstruction and post-processing techniques to the CT scans. This allows clinicians to observe the laminoplasty and the stenosis of the vertebral canal in a more intuitive and holistic manner. This has clinical significance for visualizing the sublaminar spine and for supporting the physician's judgment of the disease and choice of surgery.
Collapse
Affiliation(s)
- Hao Li
- Department of Orthopaedics, Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu 221000, China
| | - Song Wang
- Department of Orthopaedics, Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu 221000, China
| | - Jinlong Tang
- Department of Orthopaedics, Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu 221000, China
| | - Jibin Wu
- Department of Orthopaedics, Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu 221000, China
| | - Yong Liu
- Department of Orthopaedics, Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu 221000, China
| |
Collapse
|
44
|
Tang Y, Gao R, Lee HH, Xu Z, Savoie BV, Bao S, Huo Y, Fogo AB, Harris R, de Caestecker MP, Spraggins J, Landman BA. Renal Cortex, Medulla and Pelvicaliceal System Segmentation on Arterial Phase CT Images with Random Patch-based Networks. PROCEEDINGS OF SPIE--THE INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING 2021; 11596:115961D. [PMID: 34531632 PMCID: PMC8442958 DOI: 10.1117/12.2581101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]
Abstract
Renal segmentation on contrast-enhanced computed tomography (CT) provides distinct spatial context and morphology. Current approaches to renal segmentation are highly dependent on manual effort, which is time-consuming and tedious. Hence, an automatic framework for segmenting the renal cortex, medulla and pelvicalyceal system is important for quantitative assessment of renal morphometry. Recent innovations in deep methods have driven performance toward levels for which clinical translation is appealing. However, the segmentation of renal structures can be challenging due to the limited field-of-view (FOV) and variability among patients. In this paper, we propose a method to automatically label the renal cortex, the medulla and pelvicalyceal system. First, we retrieved 45 clinically acquired, de-identified arterial-phase CT scans (45 patients, 90 kidneys) without diagnosis codes (ICD-9) indicating kidney abnormalities. Second, an interpreter manually segmented the pelvis, medulla and cortex slice-by-slice for all retrieved subjects under expert supervision. Finally, we proposed a patch-based deep neural network to automatically segment the renal structures. Compared with an automatic baseline (3D U-Net) and a conventional hierarchical method (3D U-Net Hierarchy), our proposed method achieves a mean Dice score of 0.7968 across the three classes, versus 0.6749 for 3D U-Net and 0.7482 for 3D U-Net Hierarchy (p < 0.001, paired t-tests between our method and 3D U-Net Hierarchy). In summary, the proposed algorithm provides a precise and efficient method for labeling renal structures.
Collapse
Affiliation(s)
- Yucheng Tang
- Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, USA 37212
| | - Riqiang Gao
- Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, USA 37212
| | - Ho Hin Lee
- Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, USA 37212
| | - Zhoubing Xu
- Siemens Healthineers, Princeton, NJ, USA 08540
| | - Brent V Savoie
- Radiology, Vanderbilt University Medical Center, Nashville, TN, USA 37235
| | - Shunxing Bao
- Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, USA 37212
| | - Yuankai Huo
- Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, USA 37212
| | - Agnes B Fogo
- Department of Pathology, Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, TN USA 37232
- Departments of Medicine and Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA 37232
| | - Raymond Harris
- Division of Nephrology and Hypertension, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN USA 37232
| | - Mark P de Caestecker
- Division of Nephrology and Hypertension, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN USA 37232
| | - Jeffrey Spraggins
- Department of Biochemistry, Vanderbilt University, Nashville, TN, USA 37232
| | - Bennett A Landman
- Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, USA 37212
- Radiology, Vanderbilt University Medical Center, Nashville, TN, USA 37235
| |
Collapse
|
45
|
Ghatwary N, Zolgharni M, Janan F, Ye X. Learning Spatiotemporal Features for Esophageal Abnormality Detection From Endoscopic Videos. IEEE J Biomed Health Inform 2021; 25:131-142. [PMID: 32750901 DOI: 10.1109/jbhi.2020.2995193] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Abstract
Esophageal cancer is a disease with a high mortality rate. Early detection of esophageal abnormalities (i.e., precancerous and early cancerous lesions) can improve patient survival. Deep learning-based methods have recently been proposed for detecting selected types of esophageal abnormality from endoscopic images. However, no methods in the literature cover detection from endoscopic videos, detection in challenging frames, or detection of more than one type of esophageal abnormality. In this paper, we present an efficient method to automatically detect different types of esophageal abnormalities from endoscopic videos. We propose a novel 3D Sequential DenseConvLstm network that extracts spatiotemporal features from the input video. Our network incorporates a 3D Convolutional Neural Network (3D CNN) and Convolutional LSTM (ConvLSTM) to efficiently learn short- and long-term spatiotemporal features. The generated feature map is utilized by a region proposal network and ROI pooling layer to produce a bounding box that detects abnormality regions in each frame throughout the video. Finally, we investigate a post-processing method named Frame Search Conditional Random Field (FS-CRF) that improves the overall performance of the model by recovering missing regions in neighboring frames within the same clip. We extensively validate our model on an endoscopic video dataset that includes a variety of esophageal abnormalities. Our model achieved high performance across different evaluation metrics: 93.7% recall, 92.7% precision, and 93.2% F-measure. Moreover, as no results have been reported in the literature for esophageal abnormality detection from endoscopic videos, we validated the robustness of our model on a publicly available colonoscopy video dataset, achieving polyp detection performance of 81.18% recall, 96.45% precision and 88.16% F-measure, compared with state-of-the-art results of 78.84% recall, 90.51% precision and 84.27% F-measure on the same dataset. This demonstrates that the proposed method can be adapted to different gastrointestinal endoscopic video applications with promising performance.
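The reported detection metrics follow the standard definitions, which the small sketch below computes from raw counts; the counts passed in are illustrative values chosen only to land near the reported precision and recall, not figures from the paper.

```python
def detection_metrics(tp: int, fp: int, fn: int):
    """Precision = TP/(TP+FP), recall = TP/(TP+FN), F-measure = harmonic mean."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Illustrative counts only.
print(detection_metrics(tp=932, fp=73, fn=63))  # ≈ (0.927, 0.937, 0.932)
```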
Collapse
|
46
|
Statistical deformation reconstruction using multi-organ shape features for pancreatic cancer localization. Med Image Anal 2021; 67:101829. [DOI: 10.1016/j.media.2020.101829] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2020] [Revised: 08/12/2020] [Accepted: 09/12/2020] [Indexed: 11/20/2022]
|
47
|
Jin L, Yang J, Kuang K, Ni B, Gao Y, Sun Y, Gao P, Ma W, Tan M, Kang H, Chen J, Li M. Deep-learning-assisted detection and segmentation of rib fractures from CT scans: Development and validation of FracNet. EBioMedicine 2020; 62:103106. [PMID: 33186809 PMCID: PMC7670192 DOI: 10.1016/j.ebiom.2020.103106] [Citation(s) in RCA: 74] [Impact Index Per Article: 14.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2020] [Revised: 10/17/2020] [Accepted: 10/19/2020] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND Diagnosis of rib fractures plays an important role in identifying trauma severity. However, quickly and precisely identifying rib fractures in a large number of CT images from an increasing number of patients is a demanding task that also depends on the radiologist's expertise. We aim at a clinically applicable automatic system for rib fracture detection and segmentation from CT scans. METHODS A total of 7,473 traumatic rib fractures from 900 patients at a single center were enrolled into our dataset, named the RibFrac Dataset, and annotated with a human-in-the-loop labeling procedure. We developed a deep learning model, named FracNet, to detect and segment rib fractures. Patients were randomly split into training (720), tuning (60) and test (120) cohorts. Free-Response ROC (FROC) analysis was used to evaluate the sensitivity and false positives of the detection performance, and Intersection-over-Union (IoU) and Dice Coefficient (Dice) were used to evaluate the segmentation performance of predicted rib fractures. Observer studies, including an independent human-only study and a human-computer collaboration study, were used to benchmark FracNet against human performance and evaluate its clinical applicability. An annotated subset of the RibFrac Dataset, including 420 patients for training, 60 for tuning and 120 for testing, along with our code for model training and evaluation, was made open to the research community to facilitate both clinical and engineering research. FINDINGS Our method achieved a detection sensitivity of 92.9% with 5.27 false positives per scan and a segmentation Dice of 71.5% on the test cohort. Human experts produced far fewer false positives per scan but underperformed the deep neural network in detection sensitivity and took longer to reach a diagnosis. With human-computer collaboration, human experts achieved higher detection sensitivities than with human-only or computer-only diagnosis. INTERPRETATION The proposed FracNet increased the detection sensitivity of rib fractures while significantly decreasing the clinical time consumed, establishing a clinically applicable method to assist the radiologist in clinical practice. FUNDING A full list of funding bodies that contributed to this study can be found in the Acknowledgements section. The funding sources played no role in the study design; collection, analysis, and interpretation of data; writing of the report; or decision to submit the article for publication.
Collapse
Affiliation(s)
- Liang Jin
- Radiology Department, Huadong Hospital, affiliated to Fudan University, Shanghai, China
| | - Jiancheng Yang
- Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, P.R. China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, P.R. China; Dianei Technology, Shanghai, P.R. China
| | | | - Bingbing Ni
- Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, P.R. China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, P.R. China; Huawei Hisilicon, Shanghai, P.R. China
| | - Yiyi Gao
- Radiology Department, Huadong Hospital, affiliated to Fudan University, Shanghai, China
| | - Yingli Sun
- Radiology Department, Huadong Hospital, affiliated to Fudan University, Shanghai, China
| | - Pan Gao
- Radiology Department, Huadong Hospital, affiliated to Fudan University, Shanghai, China
| | - Weiling Ma
- Radiology Department, Huadong Hospital, affiliated to Fudan University, Shanghai, China
| | - Mingyu Tan
- Radiology Department, Huadong Hospital, affiliated to Fudan University, Shanghai, China
| | - Hui Kang
- Dianei Technology, Shanghai, P.R. China
| | | | - Ming Li
- Radiology Department, Huadong Hospital, affiliated to Fudan University, Shanghai, China; Institute of Functional and Molecular Medical Imaging, Fudan University, Shanghai, China.
| |
Collapse
|
48
|
Polsinelli M, Cinque L, Placidi G. A light CNN for detecting COVID-19 from CT scans of the chest. Pattern Recognit Lett 2020; 140:95-100. [PMID: 33041409 PMCID: PMC7532353 DOI: 10.1016/j.patrec.2020.10.001] [Citation(s) in RCA: 107] [Impact Index Per Article: 21.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2020] [Revised: 07/22/2020] [Accepted: 10/02/2020] [Indexed: 01/06/2023]
Abstract
Computed Tomography (CT) imaging of the chest is a valid diagnostic tool for detecting COVID-19 promptly and controlling the spread of the disease. In this work we propose a light Convolutional Neural Network (CNN) design, based on the SqueezeNet model, for efficiently discriminating COVID-19 CT images from other community-acquired pneumonia and/or healthy CT images. The architecture achieves an accuracy of 85.03%, with an improvement of about 3.2% on the first dataset arrangement and about 2.1% on the second. Although modest, this gain can be important in medical diagnosis and, in particular, in the COVID-19 scenario. The average classification time on a high-end workstation, 1.25 s, is also very competitive with that of more complex CNN designs, 13.41 s, which require pre-processing. The proposed CNN can be executed on a medium-end laptop without GPU acceleration in 7.81 s, which is impossible for methods requiring GPU acceleration. The performance of the method can be further improved with efficient pre-processing strategies for which GPU acceleration is not necessary.
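For readers unfamiliar with SqueezeNet-based classifiers, the sketch below shows one plausible way to set up a light CT classifier from torchvision's SqueezeNet 1.1 by swapping the classifier head; the internal Fire-module modifications described by the authors are not reproduced here, so this is an assumed setup rather than their architecture.

    # Illustrative sketch only: a SqueezeNet-style light classifier for CT slices,
    # built from torchvision's SqueezeNet 1.1 with the classifier head replaced.
    import torch
    import torch.nn as nn
    from torchvision import models

    def build_light_ct_classifier(num_classes: int = 2) -> nn.Module:
        net = models.squeezenet1_1(weights=None)  # train from scratch on CT data
        # SqueezeNet's classifier head is a 1x1 conv; swap it for the desired class count.
        net.classifier[1] = nn.Conv2d(512, num_classes, kernel_size=1)
        net.num_classes = num_classes
        return net

    model = build_light_ct_classifier(num_classes=2)  # e.g., COVID-19 vs. other pneumonia/healthy
    dummy_ct_slice = torch.randn(1, 3, 224, 224)      # CT slice replicated to 3 channels
    logits = model(dummy_ct_slice)
    print(logits.shape)                               # torch.Size([1, 2])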
Collapse
Affiliation(s)
- Matteo Polsinelli
- A2VI Lab, Dept. of Life, Health and Environmental Sciences, University of L'Aquila, Via Vetoio, L'Aquila 67100, Italy
| | - Luigi Cinque
- Dept. Computer Science, Via Salaria, Sapienza University, Rome, Italy
| | - Giuseppe Placidi
- A2VI Lab, Dept. of Life, Health and Environmental Sciences, University of L'Aquila, Via Vetoio, L'Aquila 67100, Italy
| |
Collapse
|
49
|
Holistic multitask regression network for multiapplication shape regression segmentation. Med Image Anal 2020; 65:101783. [DOI: 10.1016/j.media.2020.101783] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2019] [Revised: 05/31/2020] [Accepted: 07/09/2020] [Indexed: 11/23/2022]
|
50
|
Yang X, Wang X, Wang Y, Dou H, Li S, Wen H, Lin Y, Heng PA, Ni D. Hybrid attention for automatic segmentation of whole fetal head in prenatal ultrasound volumes. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2020; 194:105519. [PMID: 32447146 DOI: 10.1016/j.cmpb.2020.105519] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/07/2019] [Revised: 04/05/2020] [Accepted: 04/23/2020] [Indexed: 06/11/2023]
Abstract
BACKGROUND AND OBJECTIVE Biometric measurements of the fetal head are important indicators for maternal and fetal health monitoring during pregnancy. 3D ultrasound (US) has unique advantages over 2D scans in covering the whole fetal head and may improve diagnosis. However, automatically segmenting the whole fetal head in US volumes remains an emerging and unsolved problem. The challenges that automated solutions must tackle include poor image quality, boundary ambiguity, long-span occlusion, and appearance variability across different fetal poses and gestational ages. In this paper, we propose the first fully automated solution for segmenting the whole fetal head in US volumes. METHODS The segmentation task is first formulated as an end-to-end volumetric mapping under an encoder-decoder deep architecture. We then combine the segmentor with a proposed hybrid attention scheme (HAS) to select discriminative features and suppress non-informative volumetric features in a composite and hierarchical way. With little computational overhead, HAS proves effective in addressing boundary ambiguity and deficiency. To enhance spatial consistency in segmentation, we further organize multiple segmentors in a cascaded fashion to refine the results by revisiting the context in their predecessors' predictions. RESULTS Validated on a large dataset collected from 100 healthy volunteers, our method presents superior segmentation performance (Dice Similarity Coefficient (DSC), 96.05%) and remarkable agreement with experts (-1.6±19.5 mL). On another 156 volumes collected from 52 volunteers, we achieve high reproducibility (mean standard deviation 11.524 mL) against scan variations. CONCLUSION This is the first investigation of whole fetal head segmentation in 3D US. Our method promises to be a feasible solution for assisting volumetric US-based prenatal studies.
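The hybrid attention idea of selecting discriminative features while suppressing non-informative ones can be sketched as follows; the module below is an assumption for illustration (the gate design and composition order are not taken from the paper's HAS) and simply combines a channel gate and a spatial gate over 3D feature maps.

    # Minimal sketch, not the paper's HAS module: composing a channel gate and a
    # spatial gate ("hybrid" attention) over 3D ultrasound feature maps.
    import torch
    import torch.nn as nn

    class HybridAttention3D(nn.Module):
        def __init__(self, channels: int, reduction: int = 8):
            super().__init__()
            # Channel gate: squeeze spatially, then re-weight each channel.
            self.channel_gate = nn.Sequential(
                nn.AdaptiveAvgPool3d(1),
                nn.Conv3d(channels, channels // reduction, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv3d(channels // reduction, channels, kernel_size=1),
                nn.Sigmoid(),
            )
            # Spatial gate: collapse channels into a voxel-wise weight map.
            self.spatial_gate = nn.Sequential(
                nn.Conv3d(channels, 1, kernel_size=1),
                nn.Sigmoid(),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            x = x * self.channel_gate(x)   # suppress non-informative channels
            x = x * self.spatial_gate(x)   # emphasize discriminative voxels
            return x

    feat = torch.randn(1, 32, 16, 24, 24)  # (batch, channels, depth, height, width)
    print(HybridAttention3D(32)(feat).shape)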
Collapse
Affiliation(s)
- Xin Yang
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China.
| | - Xu Wang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China
| | - Yi Wang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China
| | - Haoran Dou
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China
| | - Shengli Li
- Department of Ultrasound, Affiliated Shenzhen Maternal and Child Healthcare Hospital of Nanfang Medical University, Shenzhen, China
| | - Huaxuan Wen
- Department of Ultrasound, Affiliated Shenzhen Maternal and Child Healthcare Hospital of Nanfang Medical University, Shenzhen, China
| | - Yi Lin
- Department of Ultrasound, Affiliated Shenzhen Maternal and Child Healthcare Hospital of Nanfang Medical University, Shenzhen, China
| | - Pheng-Ann Heng
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
| | - Dong Ni
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China.
| |
Collapse
|