1. Karimijafarbigloo S, Azad R, Kazerouni A, Merhof D. MedScale-Former: Self-guided multiscale transformer for medical image segmentation. Med Image Anal 2025; 103:103554. [PMID: 40209553] [DOI: 10.1016/j.media.2025.103554]
Abstract
Accurate medical image segmentation is crucial for enabling automated clinical decision procedures. However, existing supervised deep learning methods for medical image segmentation face significant challenges due to their reliance on extensive labeled training data. To address this limitation, our novel approach introduces a dual-branch transformer network operating on two scales, strategically encoding global contextual dependencies while preserving local information. To promote self-supervised learning, our method leverages semantic dependencies between the two scales, generating a supervisory signal for inter-scale consistency. It also incorporates a spatial stability loss within each scale, fostering self-supervised content clustering. While the intra-scale and inter-scale consistency losses enhance feature uniformity within clusters, we introduce a cross-entropy loss on top of the clustering score map to model cluster distributions and refine decision boundaries. Furthermore, to account for pixel-level similarities between organ or lesion subregions, we propose a selective kernel regional attention module as a plug-and-play component. This module captures and outlines organ or lesion regions, modestly sharpening the definition of object boundaries. Our experimental results on skin lesion, lung organ, and multiple myeloma plasma cell segmentation tasks demonstrate the superior performance of our method compared to state-of-the-art approaches.
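As a rough illustration of the inter-scale supervisory signal described in this abstract, the sketch below compares a pooled version of a fine-scale feature map against the coarse branch. The 1-D setting and the names `downsample` and `inter_scale_loss` are illustrative assumptions, not the authors' implementation.

```python
# Toy sketch of an inter-scale consistency loss (assumed names, 1-D features).

def downsample(feats, factor=2):
    """Average-pool a 1-D list of feature values by `factor`."""
    return [sum(feats[i:i + factor]) / factor
            for i in range(0, len(feats), factor)]

def mse(a, b):
    """Mean squared error between two equal-length feature lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def inter_scale_loss(fine_feats, coarse_feats):
    """Penalize disagreement between the coarse branch and a pooled
    version of the fine branch -- the self-supervisory signal."""
    return mse(downsample(fine_feats), coarse_feats)
```

In a real network the pooled maps would be high-dimensional tensors, but the supervisory principle (agreement across scales requires no labels) is the same.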
Affiliation(s)
- Reza Azad
- Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
- Amirhossein Kazerouni
- School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
- Dorit Merhof
- Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany; Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany
2. Lee JH, Oh SJ, Kim K, Lim CY, Choi SH, Chung MJ. Improved unsupervised 3D lung lesion detection and localization by fusing global and local features: Validation in 3D low-dose computed tomography. Med Image Anal 2025; 103:103559. [PMID: 40198972] [DOI: 10.1016/j.media.2025.103559]
Abstract
Unsupervised anomaly detection (UAD) is crucial in low-dose computed tomography (LDCT). Recent AI methods that leverage global features have enabled effective UAD with minimal training data from normal patients. However, without local features, this approach is vulnerable to missing deep lesions within the lungs: conventional global features achieve high specificity but often limited sensitivity. A UAD model with high sensitivity is essential to prevent false negatives, especially when screening for diseases with high mortality rates. We developed a new LDCT UAD model that leverages local features, achieving a 17.5% sensitivity improvement over global methods. Furthermore, by integrating this approach with conventional global-based techniques, we consolidated the advantages of each model, the high sensitivity of the local model and the high specificity of the global model, into a single, unified trained model (17.6% and 33.5% improvement, respectively). Without the need for additional training, this fixed model is expected to achieve significant diagnostic efficacy in various LDCT applications where both high sensitivity and specificity are essential. Code is available at https://github.com/kskim-phd/Fusion-UADL.
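The fusion idea in this abstract (combining a high-specificity global score with a high-sensitivity local score) can be sketched as score-level fusion. The min-max normalization and element-wise max rule below are illustrative assumptions, not the released Fusion-UADL code.

```python
# Hedged sketch of global/local anomaly-score fusion (assumed fusion rule).

def minmax(scores):
    """Rescale a list of anomaly scores to [0, 1]."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]

def fuse(global_scores, local_scores):
    """Element-wise maximum of the two normalized anomaly scores, so a
    lesion flagged by either branch raises the fused score."""
    g, l = minmax(global_scores), minmax(local_scores)
    return [max(a, b) for a, b in zip(g, l)]
```

A max rule preserves the local branch's sensitivity; a thresholded fused score then inherits the global branch's specificity on cases both branches consider normal.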
Affiliation(s)
- Ju Hwan Lee
- Medical AI Research Center, Research Institute for Future Medicine, Samsung Medical Center, Seoul, Republic of Korea; Department of Health Sciences and Technology, SAIHST, Sungkyunkwan University, Seoul, Republic of Korea
- Seong Je Oh
- Institute of Radiation Medicine, Seoul National University Medical Research Center, Seoul, Republic of Korea
- Kyungsu Kim
- School of Transdisciplinary Innovations, Artificial Intelligence Institute, Interdisciplinary Program in Bioengineering, and Interdisciplinary Program in Artificial Intelligence, Seoul, Republic of Korea; Department of Biomedical Science, Medical Research Center, SNUH Institute of Convergence Medicine with Innovative Technology, SNUH Institute of Healthcare AI Research, Seoul, Republic of Korea
- Chae Yeon Lim
- Medical AI Research Center, Research Institute for Future Medicine, Samsung Medical Center, Seoul, Republic of Korea; Department of Medical Device Management and Research, SAIHST, Sungkyunkwan University, Seoul, Republic of Korea
- Seung Hong Choi
- School of Transdisciplinary Innovations, Interdisciplinary Program in Bioengineering, and Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, Republic of Korea; Department of Radiology, Department of Biomedical Science and Medical Research Center, Seoul National University, Seoul, Republic of Korea
- Myung Jin Chung
- Medical AI Research Center, Research Institute for Future Medicine, Samsung Medical Center, Seoul, Republic of Korea; Department of Data Convergence and Future Medicine, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea; Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
3. Pouget E, Dedieu V, Magnin ML, Biard M, Lienemann G, Garcier JM, Magnin B. Response surface methodology for predicting optimal conditions in very low-dose chest CT imaging. Phys Med 2025; 131:104916. [PMID: 39923359] [DOI: 10.1016/j.ejmp.2025.104916]
Abstract
OBJECTIVES: Dose reduction techniques, such as new reconstruction algorithms and automated exposure control systems, vary with manufacturer and scanner model, complicating optimization and standardization procedures. We investigated the feasibility of using design of experiments for CT protocol optimization.
MATERIALS & METHODS: A Doehlert matrix was used to define the experiments to carry out. Measurements were conducted on a 128-slice CT scanner using an anthropomorphic chest phantom with a 5 mm diameter lesion of -800 HU. CT images were reconstructed using iterative (ASIR-V) and deep learning-based reconstruction at low (DLIR-L) and high (DLIR-H) strengths. Lesion detectability was assessed using two self-supervised learning-based model observers and six human observers. Second-order polynomial functions were established to model the combined effect of the noise index (NI) and the percentage of ASIR-V on dose and on model observer performance. Agreement between model and human observers was analyzed using correlation coefficients and Bland-Altman analysis.
RESULTS: The optimal conditions predicted by this method were NI = 64, ASIR-V = 60%, and DLIR-H reconstruction. They agreed well with the experimental results obtained by the average human observer, as shown by the Bland-Altman plot with a mean difference of -0.01 ± 3.16. Compared to 60% ASIR-V, these results suggest an approximately 64% dose reduction potential for DLIR-H without compromising lesion detection.
CONCLUSION: The proposed method can predict the optimal conditions that ensure diagnostic quality of low-dose chest CT examinations while minimizing the number of experiments to carry out.
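Response surface methodology fits a second-order polynomial in the factors (here NI and %ASIR-V) and then searches that surface for the optimum. The sketch below uses made-up placeholder coefficients on a convex toy surface, not the fitted values from this study.

```python
# Hedged sketch of the response-surface step: a second-order polynomial in
# two factors, minimized over a grid. Coefficients are placeholders.

def response(ni, asir, coef):
    """Second-order polynomial d(NI, %ASIR-V): intercept, linear,
    interaction, and quadratic terms, as in response surface methodology."""
    b0, b1, b2, b12, b11, b22 = coef
    return (b0 + b1 * ni + b2 * asir + b12 * ni * asir
            + b11 * ni ** 2 + b22 * asir ** 2)

def best_setting(coef, ni_grid, asir_grid):
    """Grid-search the (NI, %ASIR-V) pair minimizing the modeled response."""
    return min(((ni, a) for ni in ni_grid for a in asir_grid),
               key=lambda p: response(p[0], p[1], coef))
```

The Doehlert matrix in the study decides *which* experiments to run so that these coefficients can be fitted with few measurements; the search step above is the same regardless of how the coefficients were obtained.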
Affiliation(s)
- Eléonore Pouget
- Department of Medical Physics, Jean Perrin Comprehensive Cancer Center, F-63000 Clermont-Ferrand, France; Clermont-Ferrand University, UMR 1240 INSERM IMoST, 58 rue Montalembert, F-63000 Clermont-Ferrand, France
- Véronique Dedieu
- Department of Medical Physics, Jean Perrin Comprehensive Cancer Center, F-63000 Clermont-Ferrand, France; Clermont-Ferrand University, UMR 1240 INSERM IMoST, 58 rue Montalembert, F-63000 Clermont-Ferrand, France
- Marie Biard
- CHU Estaing, Service de radiologie, F-63000 Clermont-Ferrand, France
- Jean-Marc Garcier
- CHU Estaing, Service de radiologie, F-63000 Clermont-Ferrand, France; Institut Pascal, UMR 6602 CNRS, Université Clermont Auvergne, Clermont-Ferrand, France; DI2AM, DRCI, Clermont University Hospital, Clermont-Ferrand, France
- Benoît Magnin
- CHU Estaing, Service de radiologie, F-63000 Clermont-Ferrand, France; Institut Pascal, UMR 6602 CNRS, Université Clermont Auvergne, Clermont-Ferrand, France; DI2AM, DRCI, Clermont University Hospital, Clermont-Ferrand, France
4. Zhu Y, Cai X, Wang X, Chen X, Fu Z, Yao Y. BSDA: Bayesian Random Semantic Data Augmentation for Medical Image Classification. Sensors (Basel) 2024; 24:7511. [PMID: 39686050] [DOI: 10.3390/s24237511]
Abstract
Data augmentation is a crucial regularization technique for deep neural networks, particularly in medical imaging tasks with limited data. Deep learning models are highly effective at linearizing features, which makes it possible to alter feature semantics by shifting latent-space representations, an approach known as semantic data augmentation (SDA). SDA shifts features in a specified direction, with current methods sampling the amount of shift from a Gaussian distribution or from the sample variance. However, excessive shifting can change the data label, which may negatively impact model performance. To address this issue, we propose a computationally efficient method called Bayesian Random Semantic Data Augmentation (BSDA), which can be seamlessly integrated as a plug-and-play component into any neural network. Our experiments demonstrate that BSDA outperforms competitive methods and is suitable for both 2D and 3D medical image datasets, as well as most medical imaging modalities. Additionally, BSDA is compatible with mainstream neural network models and enhances baseline performance. The code is available online.
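A toy version of the semantic-shift idea behind SDA is sketched below: move a feature vector along a semantic direction by a randomly sampled, bounded magnitude. Capping the shift is a crude stand-in for BSDA's Bayesian control over how far features may move before the label would change; it is not the paper's exact sampler.

```python
# Hedged toy of semantic data augmentation with a bounded random shift.
import random

def semantic_shift(feat, direction, max_shift, rng):
    """Shift `feat` along `direction` by |N(0, 1)| capped at `max_shift`,
    so the augmented feature cannot drift arbitrarily far (a stand-in for
    label-preserving shift control)."""
    mag = min(abs(rng.gauss(0.0, 1.0)), max_shift)
    return [f + mag * d for f, d in zip(feat, direction)]
```

In training, such shifts are applied in feature space (after the backbone), which is why SDA is cheap compared to pixel-space augmentation.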
Affiliation(s)
- Yaoyao Zhu
- Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu 610213, China
- The School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 101408, China
- Xiuding Cai
- Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu 610213, China
- The School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 101408, China
- Xueyao Wang
- Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu 610213, China
- The School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 101408, China
- Xiaoqing Chen
- Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu 610213, China
- The School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 101408, China
- Zhongliang Fu
- Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu 610213, China
- The School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 101408, China
- Yu Yao
- Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu 610213, China
- The School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 101408, China
5. Ozkan E, Boix X. Multi-domain improves classification in out-of-distribution and data-limited scenarios for medical image analysis. Sci Rep 2024; 14:24412. [PMID: 39420026] [PMCID: PMC11487066] [DOI: 10.1038/s41598-024-73561-y]
Abstract
Current machine learning methods for medical image analysis primarily develop models tailored to specific tasks, using data from their target domain. These specialized models tend to be data-hungry and often generalize poorly to out-of-distribution samples. In this work, we show that models incorporating multiple domains, which we call multi-domain models, significantly alleviate these limitations, and we compare their performance to that of specialized models. To do so, we incorporate diverse medical image domains, including different imaging modalities such as X-ray, MRI, CT, and ultrasound, as well as different viewpoints such as axial, coronal, and sagittal views. Our findings underscore the superior generalization of multi-domain models, particularly in scenarios with limited data availability and out-of-distribution samples, both frequently encountered in healthcare applications. Integrating diverse data allows multi-domain models to share information across domains, substantially improving overall outcomes: for organ recognition, for example, a multi-domain model improves accuracy by up to 8% over conventional specialized models.
Affiliation(s)
- Ece Ozkan
- Department of Brain and Cognitive Sciences, MIT, Cambridge, 02139, USA.
- Department of Computer Science, ETH Zurich, 8092, Zurich, Switzerland.
- Xavier Boix
- Fujitsu Research of America, Inc., Sunnyvale, 94085, USA
6. Piffer S, Ubaldi L, Tangaro S, Retico A, Talamonti C. Tackling the small data problem in medical image classification with artificial intelligence: a systematic review. Prog Biomed Eng (Bristol) 2024; 6:032001. [PMID: 39655846] [DOI: 10.1088/2516-1091/ad525b]
Abstract
Although medical imaging has seen growing interest in AI research, training models requires large amounts of data. In this domain, available datasets are limited, as collecting new data is often infeasible or resource-intensive. Researchers therefore face small datasets and must apply techniques to combat overfitting. A total of 147 peer-reviewed articles published in English up to 31 July 2022 were retrieved from PubMed and assessed by two independent reviewers. We followed the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines for paper selection, and 77 studies were deemed eligible for the scope of this review. Adherence to reporting standards was assessed using the TRIPOD statement (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis). To address the small-data issue, transfer learning, basic data augmentation, and generative adversarial networks were applied in 75%, 69%, and 14% of cases, respectively. More than 60% of the authors performed binary classification, given the data scarcity and the difficulty of the tasks. Concerning generalizability, only four studies explicitly stated that an external validation of the developed model was carried out. Full access to all datasets and code was severely limited (unavailable in more than 80% of studies), and adherence to reporting standards was suboptimal (<50% adherence for 13 of 37 TRIPOD items). The goal of this review is to provide a comprehensive survey of recent advances in dealing with small medical image sample sizes. Greater transparency, improved quality in publications, and adherence to existing reporting standards are also advocated.
Affiliation(s)
- Stefano Piffer
- Department of Experimental and Clinical Biomedical Sciences, University of Florence, Florence, Italy
- National Institute for Nuclear Physics (INFN), Florence Division, Florence, Italy
- Leonardo Ubaldi
- Department of Experimental and Clinical Biomedical Sciences, University of Florence, Florence, Italy
- National Institute for Nuclear Physics (INFN), Florence Division, Florence, Italy
- Sabina Tangaro
- Department of Soil, Plant and Food Sciences, University of Bari Aldo Moro, Bari, Italy
- INFN, Bari Division, Bari, Italy
- Cinzia Talamonti
- Department of Experimental and Clinical Biomedical Sciences, University of Florence, Florence, Italy
- National Institute for Nuclear Physics (INFN), Florence Division, Florence, Italy
7. Chen T, Bai Y, Mao H, Liu S, Xu K, Xiong Z, Ma S, Yang F, Zhao Y. Cross-modality transfer learning with knowledge infusion for diabetic retinopathy grading. Front Med (Lausanne) 2024; 11:1400137. [PMID: 38808141] [PMCID: PMC11130363] [DOI: 10.3389/fmed.2024.1400137]
Abstract
Background: Ultra-wide-field (UWF) fundus photography is an emerging retinal imaging technique offering a broader field of view, enhancing its utility in screening and diagnosing various eye diseases, notably diabetic retinopathy (DR). However, computer-aided diagnosis of DR from UWF images faces two major challenges. First, labeled UWF data are scarce, making diagnostic models difficult to train given the high cost of manual annotation of medical images. Second, existing models lack prior knowledge to guide the learning process, which limits their performance.
Purpose: By leveraging extensively annotated datasets in the field, namely large-scale, high-quality color fundus image datasets annotated at image level or pixel level, we aim to transfer knowledge from these datasets to our target domain through unsupervised domain adaptation.
Methods: We present a robust model for assessing DR severity by leveraging unsupervised lesion-aware domain adaptation in UWF images. To harness the detailed annotations in publicly available color fundus image datasets, we integrate an adversarial lesion map generator that supplements the grading model with auxiliary lesion information, inspired by the clinical practice of evaluating DR severity by identifying and quantifying associated lesions.
Results: We conducted both quantitative and qualitative evaluations of the proposed method. Among six representative DR grading methods, our approach achieved an accuracy of 68.18% and a precision of 67.43%. Extensive ablation studies validated the effectiveness of each component of the proposed method.
Conclusion: Our method not only improves the accuracy of DR grading but also enhances the interpretability of the results, providing clinicians with a reliable DR grading scheme.
Affiliation(s)
- Tao Chen
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Yanmiao Bai
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Haiting Mao
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Shouyue Liu
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Keyi Xu
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Zhouwei Xiong
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Shaodong Ma
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Fang Yang
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
- Yitian Zhao
- Cixi Biomedical Research Institute, Wenzhou Medical University, Ningbo, China
- Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China
8. Pouget E, Dedieu V. Applying Self-Supervised Learning to Image Quality Assessment in Chest CT Imaging. Bioengineering (Basel) 2024; 11:335. [PMID: 38671757] [PMCID: PMC11048026] [DOI: 10.3390/bioengineering11040335]
Abstract
Many new reconstruction techniques have been deployed to allow low-dose CT examinations. Such reconstruction techniques exhibit nonlinear properties, which strengthen the need for a task-based measure of image quality. The Hotelling observer (HO) is the optimal linear observer and provides a lower bound of the Bayesian ideal observer detection performance. However, its computational complexity impedes its widespread practical usage. To address this issue, we proposed a self-supervised learning (SSL)-based model observer to provide accurate estimates of HO performance in very low-dose chest CT images. Our approach involved a two-stage model combining a convolutional denoising auto-encoder (CDAE) for feature extraction and dimensionality reduction and a support vector machine for classification. To evaluate this approach, we conducted signal detection tasks employing chest CT images with different noise structures generated by computer-based simulations. We compared this approach with two supervised learning-based methods: a single-layer neural network (SLNN) and a convolutional neural network (CNN). The results showed that the CDAE-based model was able to achieve similar detection performance to the HO. In addition, it outperformed both SLNN and CNN when a reduced number of training images was considered. The proposed approach holds promise for optimizing low-dose CT protocols across scanner platforms.
Affiliation(s)
- Eléonore Pouget
- Department of Medical Physics, Jean Perrin Comprehensive Cancer Center, F-63000 Clermont-Ferrand, France
- UMR 1240 INSERM IMoST, University of Clermont-Ferrand, F-63000 Clermont-Ferrand, France
- Véronique Dedieu
- Department of Medical Physics, Jean Perrin Comprehensive Cancer Center, F-63000 Clermont-Ferrand, France
- UMR 1240 INSERM IMoST, University of Clermont-Ferrand, F-63000 Clermont-Ferrand, France
9. Chen H, Wang R, Wang X, Li J, Fang Q, Li H, Bai J, Peng Q, Meng D, Wang L. Unsupervised Local Discrimination for Medical Images. IEEE Trans Pattern Anal Mach Intell 2023; 45:15912-15929. [PMID: 37494162] [DOI: 10.1109/tpami.2023.3299038]
Abstract
Contrastive learning, which aims to capture general representations from unlabeled images to initialize medical analysis models, has proven effective in alleviating the high demand for expensive annotations. Current methods mainly focus on instance-wise comparisons to learn global discriminative features, however, neglecting the local details needed to distinguish tiny anatomical structures, lesions, and tissues. To address this challenge, we propose a general unsupervised representation learning framework, named local discrimination (LD), that learns local discriminative features for medical images by closely embedding semantically similar pixels and identifying regions of similar structure across different images. The model is equipped with an embedding module for pixel-wise embedding and a clustering module for generating segmentations. These two modules are unified by optimizing our novel region discrimination loss in a mutually beneficial mechanism, which enables the model to reflect structural information and to measure pixel-wise and region-wise similarity. Furthermore, based on LD, we propose a center-sensitive one-shot landmark localization algorithm and a shape-guided cross-modality segmentation model to foster the generalizability of our approach. When transferred to downstream tasks, the representation learned by our method generalizes better than representations from 18 state-of-the-art (SOTA) methods, winning 9 of 12 downstream tasks. On the challenging lesion segmentation tasks in particular, the proposed method achieves significantly better performance.
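The pixel-embedding comparison at the heart of local discrimination can be illustrated with cosine similarity between per-pixel embedding vectors: semantically similar pixels should have nearby embeddings, across images as well as within one. This pure-Python toy is an illustration of that matching step, not the paper's network or loss.

```python
# Hedged toy of pixel-embedding similarity for local discrimination.
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def most_similar_pixel(query, embeddings):
    """Index of the pixel embedding closest to `query` in cosine terms,
    i.e. the cross-image region-matching step sketched above."""
    return max(range(len(embeddings)),
               key=lambda i: cosine(query, embeddings[i]))
```

A contrastive loss would then pull such matched pairs together and push dissimilar pixels apart.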
10. Karimijafarbigloo S, Azad R, Kazerouni A, Velichko Y, Bagci U, Merhof D. Self-supervised Semantic Segmentation: Consistency over Transformation. IEEE Int Conf Comput Vis Workshops (ICCVW) 2023; 2023:2646-2655. [PMID: 38298808] [PMCID: PMC10829429] [DOI: 10.1109/iccvw60793.2023.00280]
Abstract
Accurate medical image segmentation is of utmost importance for enabling automated clinical decision procedures. However, prevailing supervised deep learning approaches for medical image segmentation encounter significant challenges due to their heavy dependence on extensive labeled training data. To tackle this issue, we propose a novel self-supervised algorithm, S³-Net, which integrates a robust framework based on the proposed Inception Large Kernel Attention (I-LKA) modules. This architectural enhancement makes it possible to comprehensively capture contextual information while preserving local intricacies, thereby enabling precise semantic segmentation. Furthermore, considering that lesions in medical images often exhibit deformations, we leverage deformable convolution as an integral component to effectively capture and delineate lesion deformations for superior object boundary definition. Additionally, our self-supervised strategy emphasizes the acquisition of invariance to affine transformations, which is commonly encountered in medical scenarios. This emphasis on robustness with respect to geometric distortions significantly enhances the model's ability to accurately model and handle such distortions. To enforce spatial consistency and promote the grouping of spatially connected image pixels with similar feature representations, we introduce a spatial consistency loss term. This aids the network in effectively capturing the relationships among neighboring pixels and enhancing the overall segmentation quality. The S³-Net approach iteratively learns pixel-level feature representations for image content clustering in an end-to-end manner. Our experimental results on skin lesion and lung organ segmentation tasks show the superior performance of our method compared to the SOTA approaches.
Affiliation(s)
- Reza Azad
- Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Germany
- Yury Velichko
- Department of Radiology, Northwestern University, Chicago, USA
- Ulas Bagci
- Department of Radiology, Northwestern University, Chicago, USA
- Dorit Merhof
- Faculty of Informatics and Data Science, University of Regensburg, Germany
- Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany
11. Xu Z, Dai Y, Liu F, Chen W, Liu Y, Shi L, Liu S, Zhou Y. Swin MAE: Masked autoencoders for small datasets. Comput Biol Med 2023; 161:107037. [PMID: 37230020] [DOI: 10.1016/j.compbiomed.2023.107037]
Abstract
The development of deep learning models for medical image analysis is largely limited by the lack of large, well-annotated datasets. Unsupervised learning requires no labels and is therefore well suited to medical image analysis, but most unsupervised methods need large datasets. To make unsupervised learning applicable to small datasets, we proposed Swin MAE, a masked autoencoder with a Swin Transformer backbone. Even on a dataset of only a few thousand medical images, Swin MAE can learn useful semantic features purely from images without any pre-trained models, and its transfer-learning results on downstream tasks equal or slightly exceed those of a supervised Swin Transformer trained on ImageNet. Compared to MAE, Swin MAE improved downstream-task performance roughly two-fold on BTCV and five-fold on our parotid dataset. The code is publicly available at https://github.com/Zian-Xu/Swin-MAE.
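The core pre-training step of a masked autoencoder is to hide a large fraction of image patches and feed only the visible remainder to the encoder. The sketch below shows that masking step; the 75% ratio follows common MAE practice and is an assumption here, not necessarily Swin MAE's exact setting.

```python
# Hedged sketch of MAE-style random patch masking.
import random

def mask_patches(num_patches, mask_ratio, rng):
    """Return (visible_ids, masked_ids): a random split of patch indices,
    with a `mask_ratio` fraction hidden from the encoder."""
    ids = list(range(num_patches))
    rng.shuffle(ids)
    n_mask = int(num_patches * mask_ratio)
    return sorted(ids[n_mask:]), sorted(ids[:n_mask])
```

The decoder is then trained to reconstruct the masked patches from the visible ones, which is what lets the model learn semantics from images alone, even on small datasets.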
Affiliation(s)
- Zi'an Xu
- Northeastern University, Shenyang, China
- Yin Dai
- Northeastern University, Shenyang, China
- Fayu Liu
- China Medical University, Shenyang, China
- Yue Liu
- Northeastern University, Shenyang, China
- Lifu Shi
- Liaoning Jiayin Medical Technology Co., China
- Sheng Liu
- China Medical University, Shenyang, China
12. Khademi S, Heidarian S, Afshar P, Enshaei N, Naderkhani F, Rafiee MJ, Oikonomou A, Shafiee A, Babaki Fard F, Plataniotis KN, Mohammadi A. Robust framework for COVID-19 identification from a multicenter dataset of chest CT scans. PLoS One 2023; 18:e0282121. [PMID: 36862633] [PMCID: PMC9980818] [DOI: 10.1371/journal.pone.0282121]
Abstract
The main objective of this study is to develop a robust deep learning-based framework to distinguish COVID-19, Community-Acquired Pneumonia (CAP), and Normal cases based on volumetric chest CT scans, which are acquired in different imaging centers using different scanners and technical settings. We demonstrated that while our proposed model is trained on a relatively small dataset acquired from only one imaging center using a specific scanning protocol, it performs well on heterogeneous test sets obtained by multiple scanners using different technical parameters. We also showed that the model can be updated via an unsupervised approach to cope with the data shift between the train and test sets and enhance the robustness of the model upon receiving a new external dataset from a different center. More specifically, we extracted the subset of the test images for which the model generated a confident prediction and used the extracted subset along with the training set to retrain and update the benchmark model (the model trained on the initial train set). Finally, we adopted an ensemble architecture to aggregate the predictions from multiple versions of the model. For initial training and development purposes, an in-house dataset of 171 COVID-19, 60 CAP, and 76 Normal cases was used, which contained volumetric CT scans acquired from one imaging center using a single scanning protocol and standard radiation dose. To evaluate the model, we collected four different test sets retrospectively to investigate the effects of the shifts in the data characteristics on the model's performance. Among the test cases, there were CT scans with similar characteristics as the train set as well as noisy low-dose and ultra-low-dose CT scans. In addition, some test CT scans were obtained from patients with a history of cardiovascular diseases or surgeries. This dataset is referred to as the "SPGC-COVID" dataset. 
The entire test dataset used in this study contains 51 COVID-19, 28 CAP, and 51 Normal cases. Experimental results indicate that our proposed framework performs well on all test sets, achieving a total accuracy of 96.15% (95%CI: [91.25-98.74]), a COVID-19 sensitivity of 96.08% (95%CI: [86.54-99.5]), a CAP sensitivity of 92.86% (95%CI: [76.50-99.19]), and a Normal sensitivity of 98.04% (95%CI: [89.55-99.95]), where the confidence intervals are obtained at a significance level of 0.05. The obtained AUC values (one class vs. others) are 0.993 (95%CI: [0.977-1]), 0.989 (95%CI: [0.962-1]), and 0.990 (95%CI: [0.971-1]) for the COVID-19, CAP, and Normal classes, respectively. The experimental results also demonstrate the capability of the proposed unsupervised enhancement approach to improve the performance and robustness of the model when evaluated on varied external test sets.
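The confidence-based selection step used to update the model, extracting the test samples with a confident prediction, can be illustrated with a small sketch. The helper name `select_confident` and the 0.9 threshold are hypothetical, not values taken from the paper.

```python
import numpy as np

def select_confident(probs, threshold=0.9):
    """Return the indices and hard pseudo-labels of samples whose highest
    class probability reaches `threshold`; those samples can then be added
    to the training set to retrain the benchmark model."""
    confidence = probs.max(axis=1)
    idx = np.where(confidence >= threshold)[0]
    return idx, probs[idx].argmax(axis=1)
```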
Affiliation(s)
- Sadaf Khademi
- Concordia Institute for Information Systems Engineering, Concordia University, Montreal, Canada
- Shahin Heidarian
- Department of Electrical and Computer Engineering, Concordia University, Montreal, QC, Canada
- Parnian Afshar
- Concordia Institute for Information Systems Engineering, Concordia University, Montreal, Canada
- Nastaran Enshaei
- Concordia Institute for Information Systems Engineering, Concordia University, Montreal, Canada
- Farnoosh Naderkhani
- Concordia Institute for Information Systems Engineering, Concordia University, Montreal, Canada
- Moezedin Javad Rafiee
- Department of Medicine and Diagnostic Radiology, McGill University, Montreal, QC, Canada
- Anastasia Oikonomou
- Department of Medical Imaging, Sunnybrook Health Sciences Center, Toronto, Canada
- Akbar Shafiee
- Department of Cardiovascular Research, Tehran Heart Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran
- Arash Mohammadi
- Concordia Institute for Information Systems Engineering, Concordia University, Montreal, Canada
- * E-mail:
13
Ma W, Li X, Zou L, Fan C, Wu M. Symmetrical awareness network for cross-site ultrasound thyroid nodule segmentation. Front Public Health 2023; 11:1055815. [PMID: 36969643 PMCID: PMC10031019 DOI: 10.3389/fpubh.2023.1055815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2022] [Accepted: 02/17/2023] [Indexed: 03/29/2023] Open
Abstract
Recent years have seen remarkable progress in learning-based methods for ultrasound thyroid nodule segmentation. However, with very limited annotations, multi-site training data from different domains keep the task challenging. Due to domain shift, existing methods cannot generalize well to out-of-set data, which limits the practical application of deep learning in medical imaging. In this work, we propose an effective domain adaptation framework consisting of a bidirectional image translation module and two symmetrical image segmentation modules. The framework improves the generalization ability of deep neural networks in medical image segmentation. The image translation module performs mutual conversion between the source and target domains, while the symmetrical image segmentation modules perform segmentation tasks in both domains. Besides, we utilize an adversarial constraint to further bridge the domain gap in feature space, and a consistency loss to make training more stable and efficient. Experiments on a multi-site ultrasound thyroid nodule dataset achieve 96.22% for PA and 87.06% for DSC on average, demonstrating that our method is competitive with state-of-the-art segmentation methods in cross-domain generalization ability.
Affiliation(s)
- Wenxuan Ma
- Electronic Information School, Wuhan University, Wuhan, China
- Xiaopeng Li
- Electronic Information School, Wuhan University, Wuhan, China
- Lian Zou
- Electronic Information School, Wuhan University, Wuhan, China
- Cien Fan
- Electronic Information School, Wuhan University, Wuhan, China
- *Correspondence: Cien Fan
- Meng Wu
- Department of Ultrasound, Zhongnan Hospital of Wuhan University, Wuhan, China
14
Garcea F, Serra A, Lamberti F, Morra L. Data augmentation for medical imaging: A systematic literature review. Comput Biol Med 2023; 152:106391. [PMID: 36549032 DOI: 10.1016/j.compbiomed.2022.106391] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2022] [Revised: 11/22/2022] [Accepted: 11/29/2022] [Indexed: 12/13/2022]
Abstract
Recent advances in Deep Learning have largely benefited from larger and more diverse training sets. However, collecting large datasets for medical imaging is still a challenge due to privacy concerns and labeling costs. Data augmentation makes it possible to greatly expand the amount and variety of data available for training without actually collecting new samples. Data augmentation techniques range from simple yet surprisingly effective transformations such as cropping, padding, and flipping, to complex generative models. Depending on the nature of the input and the visual task, different data augmentation strategies are likely to perform differently. For this reason, it is conceivable that medical imaging requires specific augmentation strategies that generate plausible data samples and enable effective regularization of deep neural networks. Data augmentation can also be used to augment specific classes that are underrepresented in the training set, e.g., to generate artificial lesions. The goal of this systematic literature review is to investigate which data augmentation strategies are used in the medical domain and how they affect the performance of clinical tasks such as classification, segmentation, and lesion detection. To this end, a comprehensive analysis of more than 300 articles published in recent years (2018-2022) was conducted. The results highlight the effectiveness of data augmentation across organs, modalities, tasks, and dataset sizes, and suggest potential avenues for future research.
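The "simple yet surprisingly effective" transformations the review mentions (cropping, padding, flipping) can be sketched in a few lines of NumPy. The function name `augment`, the pad width, and the flip probability are illustrative choices, not taken from the review.

```python
import numpy as np

def augment(image, pad=2, rng=None):
    """Apply a random horizontal flip plus a random pad-and-crop to a
    (H, W) image array, preserving its shape."""
    rng = np.random.default_rng(rng)
    if rng.random() < 0.5:
        image = image[:, ::-1]  # horizontal flip
    # reflect-pad, then crop a window of the original size at a random offset
    padded = np.pad(image, pad, mode="reflect")
    y, x = rng.integers(0, 2 * pad + 1, size=2)
    return padded[y:y + image.shape[0], x:x + image.shape[1]]
```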
Affiliation(s)
- Fabio Garcea
- Dipartimento di Automatica e Informatica, Politecnico di Torino, C.so Duca degli Abruzzi, 24, Torino, 10129, Italy
- Alessio Serra
- Dipartimento di Automatica e Informatica, Politecnico di Torino, C.so Duca degli Abruzzi, 24, Torino, 10129, Italy
- Fabrizio Lamberti
- Dipartimento di Automatica e Informatica, Politecnico di Torino, C.so Duca degli Abruzzi, 24, Torino, 10129, Italy
- Lia Morra
- Dipartimento di Automatica e Informatica, Politecnico di Torino, C.so Duca degli Abruzzi, 24, Torino, 10129, Italy
15
Abstract
Machine learning techniques used in computer-aided medical image analysis usually suffer from the domain shift problem caused by different distributions between source/reference data and target data. As a promising solution, domain adaptation has attracted considerable attention in recent years. The aim of this paper is to survey the recent advances of domain adaptation methods in medical image analysis. We first present the motivation of introducing domain adaptation techniques to tackle domain heterogeneity issues for medical image analysis. Then we provide a review of recent domain adaptation models in various medical image analysis tasks. We categorize the existing methods into shallow and deep models, and each of them is further divided into supervised, semi-supervised and unsupervised methods. We also provide a brief summary of the benchmark medical image datasets that support current domain adaptation research. This survey will enable researchers to gain a better understanding of the current status, challenges and future directions of this energetic research field.
16
Nolde JM, Lugo-Gavidia LM, Carnagarin R, Azzam O, Kiuchi MG, Mian A, Schlaich MP. K-means panning - Developing a new standard in automated MSNA signal recognition with a weakly supervised learning approach. Comput Biol Med 2022; 140:105087. [PMID: 34864300 DOI: 10.1016/j.compbiomed.2021.105087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2021] [Revised: 11/15/2021] [Accepted: 11/25/2021] [Indexed: 11/03/2022]
Abstract
BACKGROUND: Accessibility of labelled datasets is often a key limitation for the application of machine learning in clinical research. A novel semi-automated weak-labelling approach based on unsupervised clustering was developed to classify a large dataset of microneurography signals and subsequently used to train a neural network to reproduce the labelling process.
METHODS: Clusters of microneurography signals were created with k-means and then labelled in terms of the validity of the signals contained in each cluster. Only purely positive or negative clusters were labelled, whereas clusters with mixed content were passed on to the next iteration of the algorithm to undergo another cycle of unsupervised clustering and labelling. After several iterations of this process, only pure labelled clusters remained, which were used to train a deep neural network.
RESULTS: Overall, 334,548 individual signal peaks from the integrated data were extracted, and more than 99.99% of the data was labelled in six iterations of this novel application of weak labelling with the help of a domain expert. A deep neural network trained on this dataset achieved consistent accuracies above 95%.
DISCUSSION: Data extraction and the novel iterative approach of labelling unsupervised clusters enabled creation of a large labelled dataset, combining unsupervised learning and expert ratings of signal peaks on a cluster basis in a time-effective manner. Further research is needed to validate the methodology and to employ it on other types of physiologic data, for which it may enable efficient generation of large labelled datasets.
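The iterative cluster-and-label loop described in the abstract can be sketched as a toy implementation: cluster the remaining pool, ask an expert callback to label only the pure clusters, and re-cluster the mixed ones. This is a minimal sketch with a naive Lloyd-style k-means and a stand-in `expert_label` callback; all names and defaults are assumptions, not the authors' code.

```python
import numpy as np

def kmeans_panning(data, expert_label, k=4, max_iter=6, rng=0):
    """Iteratively cluster unlabelled signals; `expert_label(cluster)`
    stands in for the human rater, returning 0/1 for a purely
    negative/positive cluster or None for a mixed one."""
    rng = np.random.default_rng(rng)
    labels = {}                      # original index -> expert label
    pool = np.arange(len(data))      # indices still unlabelled
    for _ in range(max_iter):
        if len(pool) < k:
            break
        # one naive Lloyd k-means pass over the current pool
        centers = data[rng.choice(pool, k, replace=False)]
        for _ in range(10):
            assign = np.argmin(
                ((data[pool, None] - centers[None]) ** 2).sum(-1), axis=1)
            centers = np.array([
                data[pool[assign == c]].mean(0) if np.any(assign == c)
                else centers[c] for c in range(k)])
        next_pool = []
        for c in range(k):
            members = pool[assign == c]
            verdict = expert_label(data[members])
            if verdict is None:
                next_pool.extend(members)  # mixed cluster: pan again
            else:
                labels.update({int(i): verdict for i in members})
        pool = np.array(next_pool, dtype=int)
    return labels, pool
```

Because the expert only labels pure clusters, every assigned label is correct by construction; the returned `pool` holds whatever could not be purified within `max_iter` rounds.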
Affiliation(s)
- Janis M Nolde
- Dobney Hypertension Centre, Medical School - Royal Perth Hospital Unit / Royal Perth Hospital Medical Research Foundation, University of Western Australia, Perth, Australia
- Leslie Marisol Lugo-Gavidia
- Dobney Hypertension Centre, Medical School - Royal Perth Hospital Unit / Royal Perth Hospital Medical Research Foundation, University of Western Australia, Perth, Australia
- Revathy Carnagarin
- Dobney Hypertension Centre, Medical School - Royal Perth Hospital Unit / Royal Perth Hospital Medical Research Foundation, University of Western Australia, Perth, Australia
- Omar Azzam
- Dobney Hypertension Centre, Medical School - Royal Perth Hospital Unit / Royal Perth Hospital Medical Research Foundation, University of Western Australia, Perth, Australia
- Márcio Galindo Kiuchi
- Dobney Hypertension Centre, Medical School - Royal Perth Hospital Unit / Royal Perth Hospital Medical Research Foundation, University of Western Australia, Perth, Australia
- Ajmal Mian
- School of Computer Science and Software Engineering, The University of Western Australia, Perth, Australia
- Markus P Schlaich
- Dobney Hypertension Centre, Medical School - Royal Perth Hospital Unit / Royal Perth Hospital Medical Research Foundation, University of Western Australia, Perth, Australia; Department of Cardiology and Nephrology, Royal Perth Hospital, Perth, Australia; Department of Nephrology, Royal Perth Hospital, Perth, Australia; Neurovascular Hypertension & Kidney Disease Laboratory, Baker Heart and Diabetes Institute, Melbourne, Australia
17
Nguyen HT, Bao Tran T, Luong HH, Nguyen Huynh TK. Decoders configurations based on Unet family and feature pyramid network for COVID-19 segmentation on CT images. PeerJ Comput Sci 2021; 7:e719. [PMID: 34616895 PMCID: PMC8459784 DOI: 10.7717/peerj-cs.719] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2021] [Accepted: 08/26/2021] [Indexed: 06/13/2023]
Abstract
The Coronavirus Disease 2019 (COVID-19) pandemic has been ferociously destroying global health and economies. According to the World Health Organisation (WHO), as of May 2021, more than one hundred million infected cases and 3.2 million deaths had been reported in over 200 countries. Unfortunately, the numbers are still on the rise. Therefore, scientists are making a significant effort to research accurate, efficient diagnoses. Several studies advocating artificial intelligence have proposed COVID diagnosis methods on lung images with high accuracy. Furthermore, some affected areas in lung images can be detected accurately by segmentation methods. This work considers state-of-the-art convolutional neural network architectures, combined with the Unet family and Feature Pyramid Network (FPN), for COVID segmentation tasks on computed tomography (CT) samples from the Italian Society of Medical and Interventional Radiology dataset. The experiments show that the Unet-family decoders reached the best results (a mean Intersection over Union (mIoU) of 0.9234, a Dice score of 0.9032, and a recall of 0.9349) with a combination of SE ResNeXt and Unet++. Decoders from the Unet family obtained better COVID segmentation performance than the Feature Pyramid Network. Furthermore, the proposed method outperforms recent state-of-the-art segmentation approaches such as the SegNet-based network, ADID-UNET, and A-SegNet + FTL. It is therefore expected to provide good segmentation visualizations of medical images.
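The two overlap metrics reported above, Dice score and IoU, can be computed for binary segmentation masks as follows; `dice_and_iou` is an illustrative helper, not code from the paper.

```python
import numpy as np

def dice_and_iou(pred, target, eps=1e-7):
    """Dice score and Intersection over Union for two binary masks;
    `eps` guards against division by zero on empty masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = (2 * inter + eps) / (pred.sum() + target.sum() + eps)
    iou = (inter + eps) / (union + eps)
    return float(dice), float(iou)
```

Note that the two metrics are monotonically related (dice = 2 * iou / (1 + iou)), which is why papers often report both for comparability with prior work.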
Affiliation(s)
- Hai Thanh Nguyen
- College of Information and Communication Technology, Can Tho University, Can Tho, Vietnam
- Toan Bao Tran
- Center of Software Engineering, Duy Tan University, Da Nang, Vietnam
- Institute of Research and Development, Duy Tan University, Da Nang, Vietnam
18

19
Guan H, Liu Y, Yang E, Yap PT, Shen D, Liu M. Multi-site MRI harmonization via attention-guided deep domain adaptation for brain disorder identification. Med Image Anal 2021; 71:102076. [PMID: 33930828 PMCID: PMC8184627 DOI: 10.1016/j.media.2021.102076] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2020] [Revised: 12/21/2020] [Accepted: 04/03/2021] [Indexed: 01/18/2023]
Abstract
Structural magnetic resonance imaging (MRI) has shown great clinical and practical values in computer-aided brain disorder identification. Multi-site MRI data increase sample size and statistical power, but are susceptible to inter-site heterogeneity caused by different scanners, scanning protocols, and subject cohorts. Multi-site MRI harmonization (MMH) helps alleviate the inter-site difference for subsequent analysis. Some MMH methods performed at imaging level or feature extraction level are concise but lack robustness and flexibility to some extent. Even though several machine/deep learning-based methods have been proposed for MMH, some of them require a portion of labeled data in the to-be-analyzed target domain or ignore the potential contributions of different brain regions to the identification of brain disorders. In this work, we propose an attention-guided deep domain adaptation (AD2A) framework for MMH and apply it to automated brain disorder identification with multi-site MRIs. The proposed framework does not need any category label information of target data, and can also automatically identify discriminative regions in whole-brain MR images. Specifically, the proposed AD2A is composed of three key modules: (1) an MRI feature encoding module to extract representations of input MRIs, (2) an attention discovery module to automatically locate discriminative dementia-related regions in each whole-brain MRI scan, and (3) a domain transfer module trained with adversarial learning for knowledge transfer between the source and target domains. Experiments have been performed on 2572 subjects from four benchmark datasets with T1-weighted structural MRIs, with results demonstrating the effectiveness of the proposed method in both tasks of brain disorder identification and disease progression prediction.
Affiliation(s)
- Hao Guan
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Yunbi Liu
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China
- Erkun Yang
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Pew-Thian Yap
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Dinggang Shen
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Mingxia Liu
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA