1
Usama M, Nyman E, Näslund U, Grönlund C. A domain adaptation model for carotid ultrasound: Image harmonization, noise reduction, and impact on cardiovascular risk markers. Comput Biol Med 2025; 190:110030. [PMID: 40179806] [DOI: 10.1016/j.compbiomed.2025.110030]
Abstract
Deep learning has been used extensively for medical image analysis applications, assuming the training and test data adhere to the same probability distributions. However, a common challenge arises when dealing with medical images generated by different systems, or even the same system with varying parameter settings. Such images often contain diverse textures and noise patterns, violating the assumption. Consequently, models trained on data from one machine or setting usually struggle to perform effectively on data from another. To address this issue in ultrasound images, we propose a generative adversarial network (GAN)-based model. We formulate image harmonization and denoising as an image-to-image translation task, in which we adapt the texture pattern and reduce noise in carotid ultrasound images while keeping the image content (the anatomy) unchanged. Performance was evaluated using feature-distribution and pixel-space similarity metrics. In addition, blood-to-tissue contrast and the influence on a computed risk marker (grey scale median, GSM) were evaluated. The results showed that domain adaptation was achieved in both tasks (histogram correlation 0.920 (0.043) and 0.844 (0.062)) compared to no adaptation (0.890 (0.077) and 0.707 (0.098)), and that the anatomy of the images was retained (structural similarity index measure for the arterial wall: 0.71 (0.09) and 0.80 (0.08)). The image noise level (contrast) did not change in the image harmonization task (-34.1 (3.8) vs -35.2 (4.1) dB) but improved in the noise reduction task (-23.5 (3.2) vs -46.7 (18.1) dB). To validate the proposed model, we compared its results with CycleGAN, the current state-of-the-art model; our model outperformed CycleGAN in both tasks. Finally, the risk marker GSM changed significantly in the noise reduction task but not in the image harmonization task. We conclude that domain translation models are powerful tools for improving ultrasound images while retaining the underlying anatomy, but downstream calculations of risk markers may be affected.
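To make the evaluation side of this work concrete, here is a minimal sketch (illustrative only, not the authors' code) of a grey scale median over a region of interest and the histogram correlation used above to compare feature distributions; the 0/190 blood-adventitia normalization follows the convention common in carotid plaque studies, and all function names are assumptions.

```python
import numpy as np

def grey_scale_median(image, roi_mask, blood_ref, adventitia_ref):
    """GSM of an ROI after the conventional linear normalization that
    maps the blood reference to 0 and the adventitia reference to 190."""
    norm = (image.astype(float) - blood_ref) * 190.0 / (adventitia_ref - blood_ref)
    norm = np.clip(norm, 0, 255)
    return float(np.median(norm[roi_mask]))

def histogram_correlation(img_a, img_b, bins=256):
    """Pearson correlation between the grey-level histograms of two images."""
    h_a, _ = np.histogram(img_a, bins=bins, range=(0, 255), density=True)
    h_b, _ = np.histogram(img_b, bins=bins, range=(0, 255), density=True)
    return float(np.corrcoef(h_a, h_b)[0, 1])
```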
Affiliation(s)
- Mohd Usama: Department of Diagnostics and Intervention, Biomedical Engineering and Radiation Physics, Umeå University, Umeå, Sweden
- Emma Nyman: Department of Public Health and Clinical Medicine, Umeå University, Umeå, Sweden
- Ulf Näslund: Department of Public Health and Clinical Medicine, Umeå University, Umeå, Sweden
- Christer Grönlund: Department of Diagnostics and Intervention, Biomedical Engineering and Radiation Physics, Umeå University, Umeå, Sweden
2
Yang Y, Fu H, Aviles-Rivero AI, Xing Z, Zhu L. DiffMIC-v2: Medical Image Classification via Improved Diffusion Network. IEEE Trans Med Imaging 2025; 44:2244-2255. [PMID: 40031019] [DOI: 10.1109/tmi.2025.3530399]
Abstract
Recently, denoising diffusion models have achieved outstanding success in generative image modeling and attracted significant attention in the computer vision community. Although a substantial amount of diffusion-based research has focused on generative tasks, few studies apply diffusion models to medical diagnosis. In this paper, we propose a diffusion-based network (named DiffMIC-v2) to address general medical image classification by eliminating unexpected noise and perturbations in image representations. To achieve this goal, we first devise an improved dual-conditional guidance strategy that conditions each diffusion step at multiple granularities to enhance step-wise regional attention. Furthermore, we design a novel heterologous diffusion process that achieves efficient visual representation learning in the latent space. We evaluate the effectiveness of DiffMIC-v2 on four medical classification tasks with different image modalities: thoracic disease classification on chest X-rays, placental maturity grading on ultrasound images, skin lesion classification using dermatoscopic images, and diabetic retinopathy grading using fundus images. Experimental results demonstrate that DiffMIC-v2 outperforms state-of-the-art methods by a significant margin, indicating the universality and effectiveness of the proposed model on multi-class and multi-label classification tasks. DiffMIC-v2 needs fewer iterations than our previous DiffMIC to obtain accurate estimations, and also achieves greater runtime efficiency with superior results. The code will be publicly available at https://github.com/scott-yjyang/DiffMICv2.
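As background for the diffusion machinery this line of work builds on, the sketch below shows the standard DDPM forward (noising) step; the linear beta schedule is the common default, not a value taken from DiffMIC-v2.

```python
import torch

# Linear beta schedule, the usual DDPM default (an assumption, not DiffMIC-v2's).
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def q_sample(x0, t, alphas_cumprod):
    """Forward diffusion q(x_t | x_0): mix the clean input with Gaussian
    noise according to the cumulative schedule at step t."""
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)        # broadcast over (B, C, H, W)
    eps = torch.randn_like(x0)
    return a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps, eps

x0 = torch.randn(4, 3, 64, 64)                          # dummy image batch
x_t, eps = q_sample(x0, torch.randint(0, T, (4,)), alphas_cumprod)
```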
3
Zhou W, Yang X, Ji J, Yi Y. C2MAL: cascaded network-guided class-balanced multi-prototype auxiliary learning for source-free domain adaptive medical image segmentation. Med Biol Eng Comput 2025. [PMID: 39831950] [DOI: 10.1007/s11517-025-03287-0]
Abstract
Source-free domain adaptation (SFDA) has become crucial in medical image analysis, enabling the adaptation of source models across diverse datasets without labeled target-domain images. Self-training, a popular SFDA approach, iteratively refines self-generated pseudo-labels using unlabeled target-domain data to adapt a model pre-trained on the source domain. However, it often suffers from model instability due to the accumulation of incorrect pseudo-labels and foreground-background class imbalance. This paper presents a pioneering SFDA framework, named cascaded network-guided class-balanced multi-prototype auxiliary learning (C2MAL), to enhance model stability. First, we introduce the cascaded translation-segmentation network (CTS-Net), which employs iterative learning between translation and segmentation networks to generate accurate pseudo-labels. CTS-Net uses a translation network to synthesize target-like images from unreliable predictions on the initial target-domain images; the synthesized results refine segmentation-network training, ensuring semantic alignment and minimizing visual disparities. Subsequently, reliable pseudo-labels guide the class-balanced multi-prototype auxiliary learning network (CMAL-Net) for effective model adaptation. CMAL-Net incorporates a new multi-prototype auxiliary learning strategy with a memory network to complement source-domain data. We propose a class-balanced calibration loss and a multi-prototype-guided symmetric cross-entropy loss to tackle the class-imbalance issue and enhance model adaptability to the target domain. Extensive experiments on four benchmark fundus image datasets validate the superiority of C2MAL over state-of-the-art methods, especially in scenarios with significant domain shifts. Our code is available at https://github.com/yxk-art/C2MAL.
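Self-training of the kind described here stands or falls with the reliability of its pseudo-labels. A minimal sketch of the generic confidence-thresholding step (illustrative only; C2MAL's actual pseudo-label refinement is more elaborate):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def make_pseudo_labels(model, images, threshold=0.9, ignore_index=255):
    """Confidence-thresholded pseudo-labels for self-training: pixels whose
    maximum softmax probability is below `threshold` are set to
    `ignore_index` so they are excluded from the segmentation loss."""
    probs = F.softmax(model(images), dim=1)   # (B, C, H, W)
    conf, labels = probs.max(dim=1)           # both (B, H, W)
    labels[conf < threshold] = ignore_index
    return labels

# Usage in the adaptation loop:
#   pseudo = make_pseudo_labels(model, target_images)
#   loss = F.cross_entropy(model(target_images), pseudo, ignore_index=255)
```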
Affiliation(s)
- Wei Zhou: College of Computer Science, Shenyang Aerospace University, Shenyang, 110136, China
- Xuekun Yang: College of Computer Science, Shenyang Aerospace University, Shenyang, 110136, China
- Jianhang Ji: Faculty of Data Science, City University of Macau, Macau, China
- Yugen Yi: School of Software, Jiangxi Normal University, Nanchang, 330022, China
4
Tang Y, Lyu T, Jin H, Du Q, Wang J, Li Y, Li M, Chen Y, Zheng J. Domain adaptive noise reduction with iterative knowledge transfer and style generalization learning. Med Image Anal 2024; 98:103327. [PMID: 39191093] [DOI: 10.1016/j.media.2024.103327]
Abstract
Low-dose computed tomography (LDCT) denoising faces significant challenges in practical imaging scenarios. Supervised methods encounter difficulties in real-world settings because no paired data are available for training; moreover, when applied to datasets with different noise patterns, they may lose performance owing to the domain gap. Conversely, unsupervised methods do not require paired data and can be trained directly on real-world data, but they often perform worse than supervised methods. Addressing this issue requires leveraging the strengths of both. In this paper, we propose a novel domain adaptive noise reduction framework (DANRF), which integrates knowledge transfer and style generalization learning to tackle the domain-gap problem. Specifically, an iterative knowledge transfer method with knowledge distillation is used to train the target model with unlabeled target data and a source model pre-trained on paired simulation data. Meanwhile, we introduce the mean-teacher mechanism to update the source model, enabling it to adapt to the target domain. Furthermore, an iterative style generalization learning process is designed to enrich the style diversity of the training dataset. We evaluate the approach through experiments on multi-source datasets. The results demonstrate the feasibility and effectiveness of the proposed DANRF model in multi-source LDCT image processing tasks. Given its hybrid nature, combining the advantages of supervised and unsupervised learning, and its ability to bridge domain gaps, the approach is well suited for improving practical low-dose CT imaging in clinical settings. Code for our proposed approach is publicly available at https://github.com/tyfeiii/DANRF.
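The mean-teacher mechanism mentioned above has a compact standard form. A generic PyTorch sketch of the exponential-moving-average update (not the DANRF code):

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, momentum=0.999):
    """Mean-teacher update: after every optimizer step on the student,
    the teacher's weights become an exponential moving average of the
    student's, giving a smoother, more stable pseudo-target model."""
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(momentum).add_(s_p, alpha=1.0 - momentum)
    for t_b, s_b in zip(teacher.buffers(), student.buffers()):
        t_b.copy_(s_b)   # keep BatchNorm statistics in sync
```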
Affiliation(s)
- Yufei Tang: School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Tianling Lyu: Research Center of Augmented Intelligence, Zhejiang Lab, Hangzhou, 310000, China
- Haoyang Jin: School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Qiang Du: School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Jiping Wang: School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Yunxiang Li: Nanovision Technology Co., Ltd., Beiqing Road, Haidian District, Beijing, 100094, China
- Ming Li: School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Yang Chen: Laboratory of Image Science and Technology, School of Computer Science and Engineering, Southeast University, Nanjing, 210096, China
- Jian Zheng: School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China; Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China; Shandong Laboratory of Advanced Biomaterials and Medical Devices in Weihai, Weihai, 264200, China
5
Zheng B, Zhang R, Diao S, Zhu J, Yuan Y, Cai J, Shao L, Li S, Qin W. Dual domain distribution disruption with semantics preservation: Unsupervised domain adaptation for medical image segmentation. Med Image Anal 2024; 97:103275. [PMID: 39032395] [DOI: 10.1016/j.media.2024.103275]
Abstract
Recent unsupervised domain adaptation (UDA) methods in medical image segmentation commonly utilize generative adversarial networks (GANs) for domain translation. However, the translated images often deviate from the ideal target distribution due to the inherent instability of GANs, leading to visual inconsistency and incorrect style, and causing the segmentation model to lock into systematic errors. To address this problem, we propose a novel UDA framework known as Dual Domain Distribution Disruption with Semantics Preservation (DDSP). Departing from the GAN-based idea of generating images that conform to the target-domain distribution, we make the model domain-agnostic and focus it on anatomical structure by using semantic information as a constraint, guiding the model to adapt to images with disrupted distributions in both the source and target domains. Furthermore, we introduce inter-channel similarity feature alignment based on domain-invariant structural prior information, which helps the shared pixel-wise classifier achieve robust performance on target-domain features by aligning source and target features across channels. Our method significantly outperforms existing state-of-the-art UDA methods on three public datasets (a heart dataset, a brain dataset, and a prostate dataset). The code is available at https://github.com/MIXAILAB/DDSPSeg.
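The inter-channel similarity alignment can be illustrated generically: compute a channel-by-channel cosine-similarity matrix per domain and penalize their distance. The L1 penalty and names below are illustrative choices, not the DDSP implementation:

```python
import torch
import torch.nn.functional as F

def channel_similarity(feat):
    """Inter-channel similarity: cosine similarity between every pair of
    channels of a (B, C, H, W) feature map, giving a (B, C, C) matrix."""
    b, c, h, w = feat.shape
    flat = F.normalize(feat.reshape(b, c, h * w), dim=2)
    return flat @ flat.transpose(1, 2)

def channel_alignment_loss(feat_src, feat_tgt):
    """L1 distance between source and target channel-similarity matrices;
    minimizing it pushes both domains toward the same channel structure."""
    return (channel_similarity(feat_src) - channel_similarity(feat_tgt)).abs().mean()
```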
Affiliation(s)
- Boyun Zheng: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
- Ranran Zhang: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Songhui Diao: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
- Jingke Zhu: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
- Yixuan Yuan: Department of Electronic Engineering, The Chinese University of Hong Kong, 999077, Hong Kong, China
- Jing Cai: Department of Health Technology and Informatics, The Hong Kong Polytechnic University, 999077, Hong Kong, China
- Liang Shao: Department of Cardiology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang 330013, China
- Shuo Li: Department of Biomedical Engineering, Department of Computer and Data Science, Case Western Reserve University, Cleveland, United States
- Wenjian Qin: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
6
Chen L, Bian Y, Zeng J, Meng Q, Zhu W, Shi F, Shao C, Chen X, Xiang D. Style Consistency Unsupervised Domain Adaptation Medical Image Segmentation. IEEE Trans Image Process 2024; 33:4882-4895. [PMID: 39236126] [DOI: 10.1109/tip.2024.3451934]
Abstract
Unsupervised domain adaptation for medical image segmentation aims to segment unlabeled target-domain images using labeled source-domain images. However, different medical imaging modalities lead to large domain shifts, so that models well trained on one imaging modality often fail to segment images from another. In this paper, to mitigate the domain shift between the source and target domains, a style consistency unsupervised domain adaptation segmentation method is proposed. First, a local phase-enhanced style fusion method is designed to mitigate domain shift and produce locally enhanced organs of interest. Second, a phase consistency discriminator is constructed to distinguish the phase consistency of domain-invariant features between the source and target domains, enhancing the disentanglement of the domain-invariant and style encoders and the removal of domain-specific features from the domain-invariant encoder. Third, a style consistency estimation method is proposed to obtain inconsistency maps from intermediate synthesized target-domain images with different styles, in order to measure difficult regions, mitigate the domain shift between synthesized and real target-domain images, and improve the integrity of the organs of interest. Fourth, style consistency entropy is defined for target-domain images to further improve organ integrity by concentrating on the inconsistent regions. Comprehensive experiments on an in-house dataset and a publicly available dataset demonstrate the superiority of our framework over state-of-the-art methods.
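The entropy-based ingredients here build on pixel-wise prediction entropy; a minimal sketch of the generic normalized entropy map (not the paper's exact style consistency entropy):

```python
import math
import torch
import torch.nn.functional as F

def prediction_entropy(logits, eps=1e-8):
    """Pixel-wise Shannon entropy of the softmax prediction, normalized by
    log(C) so the map lies in [0, 1]; high values flag uncertain regions."""
    probs = F.softmax(logits, dim=1)                  # (B, C, H, W)
    ent = -(probs * (probs + eps).log()).sum(dim=1)   # (B, H, W)
    return ent / math.log(logits.shape[1])
```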
7
Luo Y, Yang Q, Liu Z, Shi Z, Huang W, Zheng G, Cheng J. Target-Guided Diffusion Models for Unpaired Cross-Modality Medical Image Translation. IEEE J Biomed Health Inform 2024; 28:4062-4071. [PMID: 38662561] [DOI: 10.1109/jbhi.2024.3393870]
Abstract
In clinical settings, certain medical image modalities are often not acquired due to considerations such as cost and radiation. Unpaired cross-modality translation techniques, which train on unpaired data and synthesize the target modality under the guidance of the acquired source modality, are therefore of great interest. Previous methods for synthesizing target medical images establish a one-shot mapping through generative adversarial networks (GANs). As promising alternatives to GANs, diffusion models have recently attracted wide interest in generative tasks. In this paper, we propose a target-guided diffusion model (TGDM) for unpaired cross-modality medical image translation. For training, to encourage the diffusion model to learn more visual concepts, we apply a perception-prioritized weighting scheme (P2W) to the training objectives. For sampling, a pre-trained classifier is adopted in the reverse process to suppress modality-specific remnants from the source data. Experiments on both brain MRI-CT and prostate MRI-US datasets demonstrate that the proposed method achieves visually realistic results that mimic vivid anatomical sections of the target organ. In addition, we conducted a subjective assessment of the synthesized samples to further validate the clinical value of TGDM.
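The sampling-time classifier follows the standard classifier-guidance recipe. A sketch of the gradient term, assuming a noise-aware classifier with signature `classifier(x_t, t)` (an assumption; TGDM's interface may differ):

```python
import torch

def classifier_guidance(classifier, x_t, t, y, scale=1.0):
    """Gradient of log p(y | x_t) w.r.t. the noisy sample x_t; in classifier
    guidance this term is added (scaled) to the reverse-diffusion mean so
    sampling drifts toward the desired class/modality."""
    x_in = x_t.detach().requires_grad_(True)
    log_probs = torch.log_softmax(classifier(x_in, t), dim=-1)
    selected = log_probs[torch.arange(len(y)), y].sum()
    return scale * torch.autograd.grad(selected, x_in)[0]

# Reverse step (schematic): mean = mean + sigma**2 * classifier_guidance(...)
```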
8
Kumari S, Singh P. Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives. Comput Biol Med 2024; 170:107912. [PMID: 38219643] [DOI: 10.1016/j.compbiomed.2023.107912]
Abstract
Deep learning has demonstrated remarkable performance across various tasks in medical imaging. However, these approaches primarily focus on supervised learning, assuming that the training and testing data are drawn from the same distribution. Unfortunately, this assumption may not always hold true in practice. To address these issues, unsupervised domain adaptation (UDA) techniques have been developed to transfer knowledge from a labeled domain to a related but unlabeled domain. In recent years, significant advancements have been made in UDA, resulting in a wide range of methodologies, including feature alignment, image translation, self-supervision, and disentangled representation methods, among others. In this paper, we provide a comprehensive literature review of recent deep UDA approaches in medical imaging from a technical perspective. Specifically, we categorize current UDA research in medical imaging into six groups and further divide them into finer subcategories based on the different tasks they perform. We also discuss the respective datasets used in the studies to assess the divergence between the different domains. Finally, we discuss emerging areas and provide insights and discussions on future research directions to conclude this survey.
Affiliation(s)
- Suruchi Kumari: Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India
- Pravendra Singh: Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India
9
Huang J, Chen K, Ren Y, Sun J, Wang Y, Tao T, Pu X. CDDnet: Cross-domain denoising network for low-dose CT image via local and global information alignment. Comput Biol Med 2023; 163:107219. [PMID: 37422942] [DOI: 10.1016/j.compbiomed.2023.107219]
Abstract
The domain shift problem has emerged as a challenge in cross-domain low-dose CT (LDCT) image denoising task, where the acquisition of a sufficient number of medical images from multiple sources may be constrained by privacy concerns. In this study, we propose a novel cross-domain denoising network (CDDnet) that incorporates both local and global information of CT images. To address the local component, a local information alignment module has been proposed to regularize the similarity between extracted target and source features from selected patches. To align the general information of the semantic structure from a global perspective, an autoencoder is adopted to learn the latent correlation between the source label and the estimated target label generated by the pre-trained denoiser. Experimental results demonstrate that our proposed CDDnet effectively alleviates the domain shift problem, outperforming other deep learning-based and domain adaptation-based methods under cross-domain scenarios.
Affiliation(s)
- Jiaxin Huang: School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Kecheng Chen: Department of Electrical Engineering, City University of Hong Kong, 999077, Hong Kong Special Administrative Region of China
- Yazhou Ren: School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China; Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, 518110, China
- Jiayu Sun: West China Hospital, Sichuan University, Chengdu, 610044, China
- Yanmei Wang: Institute of Traditional Chinese Medicine, Sichuan College of Traditional Chinese Medicine (Sichuan Second Hospital of TCM), Chengdu, 610075, China
- Tao Tao: Institute of Traditional Chinese Medicine, Sichuan College of Traditional Chinese Medicine (Sichuan Second Hospital of TCM), Chengdu, 610075, China
- Xiaorong Pu: School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China; Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, 518110, China; NHC Key Laboratory of Nuclear Technology Medical Transformation, Mianyang Central Hospital, Mianyang, 621000, China
10
Wunderlich A, Sklar J. Data-driven modeling of noise time series with convolutional generative adversarial networks. Mach Learn Sci Technol 2023; 4. [PMID: 37693073] [PMCID: PMC10484071] [DOI: 10.1088/2632-2153/acee44]
Abstract
Random noise arising from physical processes is an inherent characteristic of measurements and a limiting factor for most signal processing and data analysis tasks. Given the recent interest in generative adversarial networks (GANs) for data-driven modeling, it is important to determine to what extent GANs can faithfully reproduce noise in target data sets. In this paper, we present an empirical investigation that aims to shed light on this issue for time series. Namely, we assess two general-purpose GANs for time series that are based on the popular deep convolutional GAN architecture, a direct time-series model and an image-based model that uses a short-time Fourier transform data representation. The GAN models are trained and quantitatively evaluated using distributions of simulated noise time series with known ground-truth parameters. Target time series distributions include a broad range of noise types commonly encountered in physical measurements, electronics, and communication systems: band-limited thermal noise, power law noise, shot noise, and impulsive noise. We find that GANs are capable of learning many noise types, although they predictably struggle when the GAN architecture is not well suited to some aspects of the noise, e.g. impulsive time-series with extreme outliers. Our findings provide insights into the capabilities and potential limitations of current approaches to time-series GANs and highlight areas for further research. In addition, our battery of tests provides a useful benchmark to aid the development of deep generative models for time series.
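For readers who want ground-truth targets of the same kind, power-law noise is straightforward to simulate by shaping white Gaussian noise in the frequency domain. A minimal NumPy sketch (not the paper's generator):

```python
import numpy as np

def power_law_noise(n_samples, exponent, rng=None):
    """Gaussian time series with power spectral density S(f) ~ 1/f**exponent
    (0 = white, 1 = pink, 2 = brown): shape white noise in the frequency
    domain, then transform back to a real-valued signal."""
    rng = np.random.default_rng(rng)
    freqs = np.fft.rfftfreq(n_samples)
    scale = np.ones_like(freqs)
    scale[1:] = freqs[1:] ** (-exponent / 2.0)   # amplitude ~ f^(-exponent/2)
    scale[0] = 0.0                               # drop the DC component
    spectrum = scale * (rng.standard_normal(freqs.size)
                        + 1j * rng.standard_normal(freqs.size))
    x = np.fft.irfft(spectrum, n=n_samples)
    return x / x.std()

pink = power_law_noise(4096, exponent=1.0, rng=0)   # pink (1/f) noise
```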
Affiliation(s)
- Adam Wunderlich: Communications Technology Laboratory, National Institute of Standards and Technology, Boulder, CO 80305, United States of America
- Jack Sklar: Communications Technology Laboratory, National Institute of Standards and Technology, Boulder, CO 80305, United States of America
11
Li M, Wang J, Chen Y, Tang Y, Wu Z, Qi Y, Jiang H, Zheng J, Tsui BMW. Low-Dose CT Image Synthesis for Domain Adaptation Imaging Using a Generative Adversarial Network With Noise Encoding Transfer Learning. IEEE Trans Med Imaging 2023; 42:2616-2630. [PMID: 37030685] [DOI: 10.1109/tmi.2023.3261822]
Abstract
Deep learning (DL) based image processing methods have been successfully applied to low-dose X-ray images under the assumption that the feature distribution of the training data matches that of the test data. However, low-dose computed tomography (LDCT) images from different commercial scanners may contain different amounts and types of image noise, violating this assumption. Moreover, the feature distributions of LDCT images from simulation and from clinical CT examinations can differ substantially. Network models trained with simulated data or with LDCT images from one specific scanner may therefore not work well for another CT scanner or image processing task. To solve this domain adaptation problem, we propose a novel generative adversarial network (GAN) with noise encoding transfer learning (NETL), or GAN-NETL, to generate a paired dataset with a different noise style. Specifically, we propose a noise encoding operator and incorporate it into the generator to extract a noise style. Meanwhile, with a transfer learning (TL) approach, the operator transforms the noise type of the source domain to that of the target domain for realistic noise generation. One public and two private datasets are used to evaluate the proposed method. Experimental results demonstrate the feasibility and effectiveness of the proposed GAN-NETL model for LDCT image synthesis. In addition, an image denoising study using the synthesized clinical LDCT data verified the merit of the proposed synthesis in improving the performance of DL-based LDCT processing methods.
12
Dou B, Zhu Z, Merkurjev E, Ke L, Chen L, Jiang J, Zhu Y, Liu J, Zhang B, Wei GW. Machine Learning Methods for Small Data Challenges in Molecular Science. Chem Rev 2023; 123:8736-8780. [PMID: 37384816] [PMCID: PMC10999174] [DOI: 10.1021/acs.chemrev.3c00189]
Abstract
Small data are often used in scientific and engineering research due to the presence of various constraints, such as time, cost, ethics, privacy, security, and technical limitations in data acquisition. However, while big data have been the focus of the past decade, small data and their challenges have received little attention, even though they are technically more severe in machine learning (ML) and deep learning (DL) studies. Overall, the small data challenge is often compounded by issues such as data diversity, imputation, noise, imbalance, and high dimensionality. Fortunately, the current big data era is characterized by technological breakthroughs in ML, DL, and artificial intelligence (AI), which enable data-driven scientific discovery, and many advanced ML and DL technologies developed for big data have inadvertently provided solutions for small data problems. As a result, significant progress has been made in ML and DL for small data challenges in the past decade. In this review, we summarize and analyze several emerging potential solutions to small data challenges in molecular science, including the chemical and biological sciences. We review both basic machine learning algorithms, such as linear regression, logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), kernel learning (KL), random forest (RF), and gradient boosting trees (GBT), and more advanced techniques, including artificial neural network (ANN), convolutional neural network (CNN), U-Net, graph neural network (GNN), generative adversarial network (GAN), long short-term memory (LSTM), autoencoder, transformer, transfer learning, active learning, graph-based semi-supervised learning, combining deep learning with traditional machine learning, and physical model-based data augmentation. We also briefly discuss the latest advances in these methods. Finally, we conclude the survey with a discussion of promising trends in small data challenges in molecular science.
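For the basic algorithms listed above, a small-data comparison is conventionally run under cross-validation. A minimal scikit-learn sketch, where the 80-sample subset is an artificial stand-in for a small dataset:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.utils import shuffle

X, y = load_breast_cancer(return_X_y=True)
X, y = shuffle(X, y, random_state=0)
X, y = X[:80], y[:80]   # keep only 80 samples to mimic a small-data regime

models = {
    "LR":  make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "RF":  RandomForestClassifier(n_estimators=200, random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)   # CV is essential on small data
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```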
Affiliation(s)
- Bozheng Dou: Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Zailiang Zhu: Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Ekaterina Merkurjev: Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Lu Ke: Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Long Chen: Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Jian Jiang: Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China; Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Yueying Zhu: Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Jie Liu: Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Bengong Zhang: Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Guo-Wei Wei: Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States; Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States; Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
13
Keaton MR, Zaveri RJ, Doretto G. CellTranspose: Few-shot Domain Adaptation for Cellular Instance Segmentation. Proc IEEE Winter Conf Appl Comput Vis (WACV) 2023:455-466. [PMID: 38170053] [PMCID: PMC10760785] [DOI: 10.1109/wacv56688.2023.00053]
Abstract
Automated cellular instance segmentation has been used to accelerate biological research for the past two decades, and recent advancements have produced higher-quality results with less effort from the biologist. Most current endeavors focus on cutting the researcher out of the picture entirely by generating highly generalized models. However, these models invariably fail when faced with novel data distributed differently than the data used for training. Rather than approaching the problem with methods that presume the availability of large amounts of target data and computing power for retraining, in this work we address the even greater challenge of designing an approach that requires minimal amounts of new annotated data as well as training time. We do so by designing specialized contrastive losses that leverage the few annotated samples very efficiently. A large set of results shows that 3 to 5 annotations lead to models whose accuracy: 1) significantly mitigates covariate-shift effects; 2) matches or surpasses other adaptation methods; 3) even approaches methods that have been fully retrained on the target distribution. The adaptation training takes only a few minutes, paving a path towards a balance between model performance, computing requirements, and expert-level annotation needs.
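A contrastive loss of the general kind described can be sketched as the standard supervised contrastive objective; this is a generic formulation, not CellTranspose's specialized losses:

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Standard supervised contrastive loss over L2-normalized embeddings:
    same-label pairs are pulled together, all others pushed apart. With only
    3-5 annotated target samples, each annotation still contributes to many
    positive pairs, which is what makes the few-shot setting workable."""
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / temperature                           # (N, N)
    pos = (labels[:, None] == labels[None, :]).float()
    pos.fill_diagonal_(0)                                   # no self-pairs
    logits = sim - sim.max(dim=1, keepdim=True).values.detach()
    exp = logits.exp() * (1 - torch.eye(len(z), device=z.device))
    log_prob = logits - exp.sum(dim=1, keepdim=True).log()
    n_pos = pos.sum(dim=1).clamp(min=1)
    return -((pos * log_prob).sum(dim=1) / n_pos).mean()
```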
14
Kugelman J, Alonso-Caneiro D, Read SA, Collins MJ. A review of generative adversarial network applications in optical coherence tomography image analysis. J Optom 2022; 15 Suppl 1:S1-S11. [PMID: 36241526] [PMCID: PMC9732473] [DOI: 10.1016/j.optom.2022.09.004]
Abstract
Optical coherence tomography (OCT) has revolutionized ophthalmic clinical practice and research as a result of the high-resolution images the method can capture in a fast, non-invasive manner. Although clinicians can interpret OCT images qualitatively, the ability to quantitatively and automatically analyse these images represents a key goal for eye care, providing clinicians with immediate and relevant metrics to inform best clinical practice. The range of applications and methods to analyse OCT images is rich and rapidly expanding. With the advent of deep learning methods, the field has seen significant progress, with state-of-the-art performance on several OCT image analysis tasks. Generative adversarial networks (GANs) represent a subfield of deep learning that enables a range of novel applications not possible with most other deep learning methods, with the potential to provide more accurate and robust analyses. In this review, we survey the progress in this field and its clinical impact, and discuss potential future developments of GAN applications in OCT image processing.
Affiliation(s)
- Jason Kugelman: Queensland University of Technology (QUT), Contact Lens and Visual Optics Laboratory, Centre for Vision and Eye Research, School of Optometry and Vision Science, Kelvin Grove, QLD 4059, Australia
- David Alonso-Caneiro: Queensland University of Technology (QUT), Contact Lens and Visual Optics Laboratory, Centre for Vision and Eye Research, School of Optometry and Vision Science, Kelvin Grove, QLD 4059, Australia
- Scott A Read: Queensland University of Technology (QUT), Contact Lens and Visual Optics Laboratory, Centre for Vision and Eye Research, School of Optometry and Vision Science, Kelvin Grove, QLD 4059, Australia
- Michael J Collins: Queensland University of Technology (QUT), Contact Lens and Visual Optics Laboratory, Centre for Vision and Eye Research, School of Optometry and Vision Science, Kelvin Grove, QLD 4059, Australia
15
Efficient tooth gingival margin line reconstruction via adversarial learning. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103954]
16
Ali H, Umander J, Rohlén R, Röhrle O, Grönlund C. Modelling intra-muscular contraction dynamics using in silico to in vivo domain translation. Biomed Eng Online 2022; 21:46. [PMID: 35804415] [PMCID: PMC9270806] [DOI: 10.1186/s12938-022-01016-4]
Abstract
BACKGROUND
Advances in sports medicine, rehabilitation applications and diagnostics of neuromuscular disorders are based on the analysis of skeletal muscle contractions. Recently, medical imaging techniques have transformed the study of muscle contractions by allowing identification of individual motor units' activity within the whole studied muscle. However, appropriate image-based simulation models, which would assist the continued development of these new imaging methods, are missing. This is mainly due to a lack of models that describe the complex interaction between tissues within a muscle and its surroundings, e.g., muscle fibres, fascia, vasculature, bone, skin, and subcutaneous fat. Herein, we propose a new approach to overcome this limitation.
METHODS
In this work, we propose to use deep learning to model the authentic intra-muscular skeletal muscle contraction pattern using domain-to-domain translation between in silico (simulated) and in vivo (experimental) image sequences of skeletal muscle contraction dynamics. For this purpose, 3D cycle generative adversarial network (cycleGAN) models were evaluated on several hyperparameter settings and modifications. The results show that there were large differences between the spatial features of in silico and in vivo data, and that a model could be trained to generate authentic spatio-temporal features similar to those obtained from in vivo experimental data. In addition, we used difference maps between the input and output of the trained model generator to study the translated characteristics of in vivo data.
RESULTS
This work provides a model to generate authentic intra-muscular skeletal muscle contraction dynamics that could be used to gain further and much-needed physiological and pathological insights, and to assess and overcome limitations within the newly developed research field of neuromuscular imaging.
Affiliation(s)
- Hazrat Ali: Department of Electrical and Computer Engineering, COMSATS University Islamabad, Abbottabad Campus, Abbottabad, Pakistan; Department of Radiation Sciences, Umeå University, Umeå, Sweden
- Robin Rohlén: Department of Radiation Sciences, Umeå University, Umeå, Sweden
- Oliver Röhrle: Stuttgart Center for Simulation Technology (SC SimTech), University of Stuttgart, Stuttgart, Germany; Institute for Modelling and Simulation of Biomechanical Systems, Chair for Computational Biophysics and Biorobotics, University of Stuttgart, Stuttgart, Germany
17
Vo T, Khan N. Edge-preserving Image Synthesis for Unsupervised Domain Adaptation in Medical Image Segmentation. Annu Int Conf IEEE Eng Med Biol Soc 2022:3753-3757. [PMID: 36085629] [DOI: 10.1109/embc48229.2022.9871402]
Abstract
Domain adaptation is a technique to address the lack of massive amounts of labeled data in different application domains. Unsupervised domain adaptation adapts a model to an unseen target dataset using only labeled source data and unlabeled target-domain data. Though many image-space domain adaptation methods have been proposed to capture pixel-level domain shift, such techniques may fail to maintain high-level semantic information for the segmentation task. For biomedical images, fine details such as blood vessels can be lost during the image transformation between domains. In this work, we propose a model that adapts between domains using a cycle-consistent loss while maintaining edge details of the original images by enforcing an edge-based loss during the adaptation process. We demonstrate the effectiveness of our algorithm by comparing it to other approaches on two eye fundus vessel segmentation datasets. We achieve a 3.1% increase in Dice score over the state of the art and ~7.02% over a vanilla CycleGAN implementation. Clinical relevance: The proposed adaptation scheme can provide better performance on unseen data for semantic segmentation, which is widely applied in computer-aided diagnosis. Such robust performance can reduce the reliance on the large amounts of labeled data that are a common bottleneck in the medical domain.
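The combination of a cycle-consistent loss with an edge-based loss can be sketched generically; the Sobel-based edge term below is one plausible instantiation and may differ from the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

# Sobel kernels for single-channel images, shaped (out_ch, in_ch, kH, kW).
KX = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
KY = KX.transpose(2, 3)

def sobel_edges(img):
    """Edge-magnitude map of a (B, 1, H, W) image via Sobel filtering."""
    gx = F.conv2d(img, KX.to(img.device), padding=1)
    gy = F.conv2d(img, KY.to(img.device), padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

def adaptation_losses(real_a, g_ab, g_ba, lambda_edge=1.0):
    """Cycle-consistency (L1) plus an edge-preservation term penalizing
    vessel/boundary changes between the input and its translation."""
    fake_b = g_ab(real_a)
    cycle = (g_ba(fake_b) - real_a).abs().mean()
    edge = (sobel_edges(fake_b) - sobel_edges(real_a)).abs().mean()
    return cycle + lambda_edge * edge
```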
18
Artificial Intelligence-Based Prediction of Oroantral Communication after Tooth Extraction Utilizing Preoperative Panoramic Radiography. Diagnostics (Basel) 2022; 12:1406. [PMID: 35741216] [PMCID: PMC9221677] [DOI: 10.3390/diagnostics12061406]
Abstract
Oroantral communication (OAC) is a common complication after extraction of upper molars. Thorough preoperative panoramic radiography analysis might help predict OAC following tooth extraction. In this exploratory study, we evaluated n = 300 consecutive cases (100 OAC and 200 controls) and trained five deep learning models (VGG16, InceptionV3, MobileNetV2, EfficientNet, and ResNet50) to predict OAC versus non-OAC (binary classification task) from the input images. Further, four oral and maxillofacial experts evaluated the respective panoramic radiographs, and performance metrics (accuracy, area under the curve (AUC), precision, recall, F1-score, and receiver operating characteristic curve) were determined for all diagnostic approaches. Cohen's kappa was used to evaluate the agreement between expert evaluations. The deep learning algorithms reached high specificity (highest specificity 100% for InceptionV3) but low sensitivity (highest sensitivity 42.86% for MobileNetV2). The AUCs for VGG16, InceptionV3, MobileNetV2, EfficientNet, and ResNet50 were 0.53, 0.60, 0.67, 0.51, and 0.56, respectively. Experts 1-4 reached AUCs of 0.550, 0.629, 0.500, and 0.579, respectively. The specificity of the expert evaluations ranged from 51.74% to 95.02%, whereas sensitivity ranged from 14.14% to 59.60%. Cohen's kappa revealed poor agreement among the oral and maxillofacial experts (Cohen's kappa: 0.1285). Overall, the present data indicate that OAC cannot be sufficiently predicted from preoperative panoramic radiography: the false-negative rate, i.e., the rate of positive cases (OAC) missed by the deep learning algorithms, ranged from 57.14% to 95.24%. Surgeons should not rely solely on panoramic radiography when evaluating the probability of OAC occurrence, and clinical testing for OAC is warranted after each upper-molar tooth extraction.
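The study's two headline metrics, chance-corrected inter-rater agreement and discrimination, are one-liners in scikit-learn. A minimal sketch on hypothetical ratings (the numbers are illustrative, not study data):

```python
from sklearn.metrics import cohen_kappa_score, roc_auc_score

# Hypothetical binary OAC ratings from two raters, plus model scores.
rater_1 = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
rater_2 = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
y_true  = [1, 0, 0, 1, 0, 1, 0, 1, 1, 0]
y_score = [0.8, 0.2, 0.4, 0.3, 0.1, 0.9, 0.3, 0.4, 0.6, 0.2]

kappa = cohen_kappa_score(rater_1, rater_2)     # chance-corrected agreement
auc = roc_auc_score(y_true, y_score)            # threshold-free discrimination
print(f"kappa = {kappa:.3f}, AUC = {auc:.3f}")  # kappa < 0.2 reads as poor
```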
19
Xu L, Yang C, Zhang F, Cheng X, Wei Y, Fan S, Liu M, He X, Deng J, Xie T, Wang X, Liu M, Song B. Deep Learning Using CT Images to Grade Clear Cell Renal Cell Carcinoma: Development and Validation of a Prediction Model. Cancers (Basel) 2022; 14:2574. [PMID: 35681555] [PMCID: PMC9179576] [DOI: 10.3390/cancers14112574]
Simple Summary
Clear cell renal cell carcinoma (ccRCC) pathologic grade identification is essential both for monitoring patients' conditions and for constructing individualized subsequent treatment strategies. However, biopsies are typically used to obtain the pathological grade, entailing tremendous physical and mental suffering as well as a heavy economic burden, not to mention an increased risk of complications. Our study explores a new way to provide grade assessment of ccRCC on the basis of the individual's appearance on CT images. A deep learning (DL) method that includes self-supervised learning is constructed to identify patients with high-grade ccRCC. We confirmed that our grading network can accurately differentiate between different grades of CT scans of ccRCC patients using a cohort of 706 patients from West China Hospital. The promising diagnostic performance indicates that our DL framework is an effective, non-invasive and labor-saving method for decoding CT images, offering a valuable means for ccRCC grade stratification and individualized patient treatment.
Abstract
This retrospective study aimed to develop and validate deep-learning-based models for grading clear cell renal cell carcinoma (ccRCC) patients. A cohort of 706 patients with pathologically verified ccRCC was used. A temporal split was applied to verify our models: the first 83.9% of cases (years 2010-2017) for development and the last 16.1% (years 2018-2019) for validation (development cohort: n = 592; validation cohort: n = 114). We demonstrate a deep learning (DL) framework initialized by a self-supervised pre-training method, developed with the addition of a mixed loss strategy and sample reweighting, to identify patients with high-grade ccRCC. Four types of DL networks were developed separately and further combined with different weights for better prediction. The single DL model achieved an area under the curve (AUC) of up to 0.864 in the validation cohort, while the ensembled model yielded the best predictive performance with an AUC of 0.882. These findings confirm that our DL approach performs favorably or comparably to biopsy-based grade assessment of ccRCC while being non-invasive and labor-saving.
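The weighted combination of several networks reduces to a convex average of their predicted class probabilities. A minimal sketch (weights are placeholders, not the study's tuned values):

```python
import numpy as np

def weighted_ensemble(prob_list, weights):
    """Convex combination of per-model class-probability arrays, each of
    shape (n_samples, n_classes); weights are normalized to sum to one."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * p for wi, p in zip(w, prob_list))

# e.g. four networks with weights tuned on a held-out split:
# probs = weighted_ensemble([p1, p2, p3, p4], [0.3, 0.3, 0.2, 0.2])
```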
Affiliation(s)
- Lifeng Xu: The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou 324000, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
- Chun Yang: Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China; University of Electronic Science and Technology of China, Chengdu 610000, China
- Feng Zhang: The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, Quzhou 324000, China
- Xuan Cheng: Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China; University of Electronic Science and Technology of China, Chengdu 610000, China
- Yi Wei: West China Hospital, Sichuan University, Chengdu 610000, China
- Shixiao Fan: Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China; University of Electronic Science and Technology of China, Chengdu 610000, China
- Minghui Liu: Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China; University of Electronic Science and Technology of China, Chengdu 610000, China
- Xiaopeng He: West China Hospital, Sichuan University, Chengdu 610000, China; Affiliated Hospital of Southwest Medical University, Luzhou 646000, China (corresponding author)
- Jiali Deng: Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China; University of Electronic Science and Technology of China, Chengdu 610000, China
- Tianshu Xie: Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China; University of Electronic Science and Technology of China, Chengdu 610000, China
- Xiaomin Wang: Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China; University of Electronic Science and Technology of China, Chengdu 610000, China
- Ming Liu: Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China; University of Electronic Science and Technology of China, Chengdu 610000, China
- Bin Song: West China Hospital, Sichuan University, Chengdu 610000, China (corresponding author)
20
Kim HE, Cosa-Linan A, Santhanam N, Jannesari M, Maros ME, Ganslandt T. Transfer learning for medical image classification: a literature review. BMC Med Imaging 2022; 22:69. [PMID: 35418051] [PMCID: PMC9007400] [DOI: 10.1186/s12880-022-00793-7]
Abstract
BACKGROUND
Transfer learning (TL) with convolutional neural networks aims to improve performance on a new task by leveraging the knowledge of similar tasks learned in advance. It has made a major contribution to medical image analysis, as it overcomes the data scarcity problem and saves time and hardware resources. However, transfer learning has been arbitrarily configured in the majority of studies. This review attempts to provide guidance for selecting a model and TL approach for the medical image classification task.
METHODS
425 peer-reviewed articles were retrieved from two databases, PubMed and Web of Science, published in English, up until December 31, 2020. Articles were assessed by two independent reviewers, with the aid of a third reviewer in the case of discrepancies. We followed the PRISMA guidelines for paper selection, and 121 studies were regarded as eligible for the scope of this review. We investigated articles that focused on selecting backbone models and TL approaches, including feature extractor, feature extractor hybrid, fine-tuning, and fine-tuning from scratch.
RESULTS
The majority of studies (n = 57) empirically evaluated multiple models, followed by deep models (n = 33) and shallow models (n = 24). Inception, one of the deep models, was the most employed in the literature (n = 26). With respect to TL, the majority of studies (n = 46) empirically benchmarked multiple approaches to identify the optimal configuration. The remaining studies applied only a single approach, of which feature extractor (n = 38) and fine-tuning from scratch (n = 27) were the two most favored. Only a few studies applied feature extractor hybrid (n = 7) and fine-tuning (n = 3) with pretrained models.
CONCLUSION
The investigated studies demonstrated the efficacy of transfer learning despite data scarcity. We encourage data scientists and practitioners to use deep models (e.g. ResNet or Inception) as feature extractors, which can save computational costs and time without degrading predictive power.
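The feature-extractor configuration recommended in the conclusion amounts to freezing a pretrained backbone and training only a new classification head. A minimal PyTorch/torchvision sketch (the binary head and learning rate are placeholders):

```python
import torch
import torch.nn as nn
from torchvision import models

# Pretrained ResNet-50 as a frozen feature extractor: only the new head is
# trained, which is cheap and effective when target data are scarce.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
for p in backbone.parameters():
    p.requires_grad = False                          # freeze the backbone
backbone.fc = nn.Linear(backbone.fc.in_features, 2)  # trainable binary head

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
```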
Affiliation(s)
- Hee E Kim: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD-BW), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167, Mannheim, Germany
- Alejandro Cosa-Linan: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD-BW), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167, Mannheim, Germany
- Nandhini Santhanam: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD-BW), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167, Mannheim, Germany
- Mahboubeh Jannesari: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD-BW), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167, Mannheim, Germany
- Mate E Maros: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD-BW), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167, Mannheim, Germany
- Thomas Ganslandt: Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health (CPD-BW), Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167, Mannheim, Germany; Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Wetterkreuz 15, 91058, Erlangen, Germany
21
Chen X, Li Y, Yao L, Adeli E, Zhang Y, Wang X. Generative Adversarial U-Net for Domain-free Few-shot Medical Diagnosis. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.03.022] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
22
Chen J, Zhang Z, Xie X, Li Y, Xu T, Ma K, Zheng Y. Beyond Mutual Information: Generative Adversarial Network for Domain Adaptation Using Information Bottleneck Constraint. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:595-607. [PMID: 34606453 DOI: 10.1109/tmi.2021.3117996] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Medical images from multiple centres often suffer from the domain shift problem, which causes deep learning models trained on one domain to generalize poorly to another. One potential solution to the problem is the generative adversarial network (GAN), which has the capacity to translate images between different domains. Nevertheless, existing GAN-based approaches are prone to fail at preserving image-objects in image-to-image (I2I) translation, which reduces their practicality for domain adaptation tasks. In this regard, a novel GAN (namely IB-GAN) is proposed to preserve image-objects during cross-domain I2I adaptation. Specifically, we integrate the information bottleneck constraint into the typical cycle-consistency-based GAN to discard superfluous information (e.g., domain information) and maintain the consistency of disentangled content features for image-object preservation. The proposed IB-GAN is evaluated on three tasks: polyp segmentation using colonoscopic images, segmentation of the optic disc and cup in fundus images, and whole-heart segmentation using multi-modal volumes. We show that the proposed IB-GAN can generate realistic translated images and remarkably boost the generalization of widely used segmentation networks (e.g., U-Net).
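The abstract describes the key ingredient of IB-GAN as an information bottleneck constraint added to a cycle-consistency GAN, without giving the exact formulation. The sketch below assumes a common instantiation in which the content encoder outputs a Gaussian posterior (mu, logvar) and the bottleneck term is a KL penalty toward a standard normal prior, combined with LSGAN adversarial and L1 cycle losses; the paper's actual loss may differ.

```python
# Schematic sketch of combining cycle consistency with an information
# bottleneck (IB) penalty on content features, in the spirit of IB-GAN.
# The KL-to-standard-normal form of the IB term and the weights are assumptions.
import torch
import torch.nn.functional as F

def ib_gan_generator_loss(x_a, x_aba, mu, logvar, d_fake_logits,
                          lambda_cyc=10.0, lambda_ib=0.01):
    # Adversarial term: fool the target-domain discriminator (LSGAN form).
    adv = F.mse_loss(d_fake_logits, torch.ones_like(d_fake_logits))
    # Cycle consistency: A -> B -> A should reconstruct the input.
    cyc = F.l1_loss(x_aba, x_a)
    # IB penalty: compress the content code z ~ N(mu, exp(logvar)) toward a
    # standard normal prior, discarding superfluous (e.g., domain) information.
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return adv + lambda_cyc * cyc + lambda_ib * kl
```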
23
Guan H, Liu M. Domain Adaptation for Medical Image Analysis: A Survey. IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING 2022; 69:1173-1185. [DOI: 10.1109/tbme.2021.3117407]
Abstract
Machine learning techniques used in computer-aided medical image analysis usually suffer from the domain shift problem caused by different distributions between source/reference data and target data. As a promising solution, domain adaptation has attracted considerable attention in recent years. The aim of this paper is to survey recent advances in domain adaptation methods for medical image analysis. We first present the motivation for introducing domain adaptation techniques to tackle domain heterogeneity issues in medical image analysis. Then we provide a review of recent domain adaptation models in various medical image analysis tasks. We categorize the existing methods into shallow and deep models, and each of them is further divided into supervised, semi-supervised, and unsupervised methods. We also provide a brief summary of the benchmark medical image datasets that support current domain adaptation research. This survey will enable researchers to gain a better understanding of the current status, challenges, and future directions of this active research field.
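As a concrete taste of the unsupervised deep models this survey categorizes, the sketch below implements one classic distribution-alignment ingredient, a Gaussian-kernel maximum mean discrepancy (MMD) penalty between source and target features. It is an illustrative example of the family, not a method taken from the survey itself, and the kernel bandwidth is an assumption.

```python
# Minimal sketch of a maximum mean discrepancy (MMD) penalty that aligns
# source and target feature distributions; a single Gaussian kernel with an
# assumed bandwidth is used for brevity.
import torch

def gaussian_mmd(feat_src, feat_tgt, sigma=1.0):
    def kernel(x, y):
        dist = torch.cdist(x, y) ** 2
        return torch.exp(-dist / (2 * sigma ** 2))
    k_ss = kernel(feat_src, feat_src).mean()
    k_tt = kernel(feat_tgt, feat_tgt).mean()
    k_st = kernel(feat_src, feat_tgt).mean()
    return k_ss + k_tt - 2 * k_st  # approaches 0 when the distributions match

# Typical use: total_loss = task_loss + lambda_mmd * gaussian_mmd(f_src, f_tgt)
```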
24
AFA: adversarial frequency alignment for domain generalized lung nodule detection. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-06928-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
25
Liu Z, Ni S, Yang C, Sun W, Huang D, Su H, Shu J, Qin N. Axillary lymph node metastasis prediction by contrast-enhanced computed tomography images for breast cancer patients based on deep learning. Comput Biol Med 2021; 136:104715. [PMID: 34388460 DOI: 10.1016/j.compbiomed.2021.104715] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2021] [Revised: 07/09/2021] [Accepted: 07/27/2021] [Indexed: 12/09/2022]
Abstract
When doctors use contrast-enhanced computed tomography (CECT) images to predict the metastasis of axillary lymph nodes (ALN) in breast cancer patients, prediction performance can be degraded by subjective factors such as experience, psychological state, and degree of fatigue. This study aims to exploit efficient deep learning schemes to predict ALN metastasis automatically from CECT images. A new construction, the deformable sampling module (DSM), was designed as a plug-and-play sampling module in the proposed deformable attention VGG19 (DA-VGG19). A dataset of 800 labeled samples drawn from 800 CECT images of 401 breast cancer patients retrospectively enrolled over the preceding three years was adopted to train, validate, and test the deep convolutional neural network models. The performance of the proposed model was analyzed in detail by comparing accuracy, positive predictive value, negative predictive value, sensitivity, and specificity. The best-performing DA-VGG19 model achieved an accuracy of 0.9088, which is higher than that of other classification neural networks. As such, the proposed intelligent diagnosis algorithm can provide doctors with daily diagnostic assistance and advice and reduce their workload. The source code mentioned in this article will be released later.
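The abstract presents the DSM as a plug-and-play sampling module without spelling out its internals. The sketch below shows one plausible reading, assuming the module predicts per-location sampling offsets that deform a 3x3 convolution grid via torchvision's deform_conv2d; the actual DSM design may differ.

```python
# Hedged sketch of a "deformable sampling" style plug-and-play block: a small
# conv predicts per-location offsets that deform a 3x3 convolution's sampling
# grid (via torchvision's deform_conv2d). The paper's DSM may differ.
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DeformableSampling(nn.Module):
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.pad = kernel_size // 2
        # 2 offsets (dy, dx) per kernel tap per output location.
        self.offset_pred = nn.Conv2d(channels, 2 * kernel_size * kernel_size,
                                     kernel_size, padding=self.pad)
        self.weight = nn.Parameter(
            torch.randn(channels, channels, kernel_size, kernel_size) * 0.01)

    def forward(self, x):
        offsets = self.offset_pred(x)
        return deform_conv2d(x, offsets, self.weight,
                             padding=(self.pad, self.pad))

# e.g., inserted after a VGG19 stage: feats = DeformableSampling(512)(feats)
```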
Affiliation(s)
- Ziyi Liu: Institute of Systems Science and Technology, School of Electrical Engineering, Southwest Jiaotong University, Chengdu, 611756, China
- Sijie Ni: Institute of Systems Science and Technology, School of Electrical Engineering, Southwest Jiaotong University, Chengdu, 611756, China
- Chunmei Yang: Department of Radiology, The Affiliated Hospital of Southwest Medical University, Luzhou, 646000, China
- Weihao Sun: Institute of Systems Science and Technology, School of Electrical Engineering, Southwest Jiaotong University, Chengdu, 611756, China
- Deqing Huang: Institute of Systems Science and Technology, School of Electrical Engineering, Southwest Jiaotong University, Chengdu, 611756, China
- Hu Su: Institute of Systems Science and Technology, School of Electrical Engineering, Southwest Jiaotong University, Chengdu, 611756, China
- Jian Shu: Department of Radiology, The Affiliated Hospital of Southwest Medical University, Luzhou, 646000, China.
- Na Qin: Institute of Systems Science and Technology, School of Electrical Engineering, Southwest Jiaotong University, Chengdu, 611756, China.
26
Guan H, Liu Y, Yang E, Yap PT, Shen D, Liu M. Multi-site MRI harmonization via attention-guided deep domain adaptation for brain disorder identification. Med Image Anal 2021; 71:102076. [PMID: 33930828 PMCID: PMC8184627 DOI: 10.1016/j.media.2021.102076] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2020] [Revised: 12/21/2020] [Accepted: 04/03/2021] [Indexed: 01/18/2023]
Abstract
Structural magnetic resonance imaging (MRI) has shown great clinical and practical value in computer-aided brain disorder identification. Multi-site MRI data increase sample size and statistical power, but are susceptible to inter-site heterogeneity caused by different scanners, scanning protocols, and subject cohorts. Multi-site MRI harmonization (MMH) helps alleviate inter-site differences for subsequent analysis. Some MMH methods, performed at the imaging or feature extraction level, are concise but lack robustness and flexibility. Although several machine/deep learning-based methods have been proposed for MMH, some require a portion of labeled data in the to-be-analyzed target domain or ignore the potential contributions of different brain regions to the identification of brain disorders. In this work, we propose an attention-guided deep domain adaptation (AD2A) framework for MMH and apply it to automated brain disorder identification with multi-site MRIs. The proposed framework does not need any category label information for the target data and can automatically identify discriminative regions in whole-brain MR images. Specifically, the proposed AD2A is composed of three key modules: (1) an MRI feature encoding module to extract representations of input MRIs, (2) an attention discovery module to automatically locate discriminative dementia-related regions in each whole-brain MRI scan, and (3) a domain transfer module trained with adversarial learning for knowledge transfer between the source and target domains. Experiments were performed on 2572 subjects from four benchmark datasets with T1-weighted structural MRIs, with results demonstrating the effectiveness of the proposed method in both brain disorder identification and disease progression prediction.
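Of the three AD2A modules, the domain transfer module is the most generic, and the sketch below illustrates how such an adversarial component is commonly wired: attention-weighted pooled features pass through a gradient-reversal layer into a domain classifier, so the encoder is trained toward domain-invariant representations. The gradient-reversal mechanism and layer sizes are assumptions, not details taken from the paper.

```python
# Sketch of an adversarial domain-transfer head: an attention map reweights
# encoder features, and a domain classifier trained through gradient reversal
# pushes the encoder toward domain-invariant features. Sizes are illustrative.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, alpha):
        ctx.alpha = alpha
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.alpha * grad_out, None  # flip gradients for the encoder

class DomainTransferHead(nn.Module):
    def __init__(self, feat_dim=128, alpha=1.0):
        super().__init__()
        self.alpha = alpha
        self.domain_clf = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 2))

    def forward(self, feats, attention):
        # feats: (N, feat_dim, H, W); attention: (N, 1, H, W) in [0, 1],
        # e.g., from the attention discovery module.
        pooled = (feats * attention).flatten(2).mean(-1)   # weighted pooling
        reversed_feats = GradReverse.apply(pooled, self.alpha)
        return self.domain_clf(reversed_feats)             # source/target logits
```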
Affiliation(s)
- Hao Guan: Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Yunbi Liu: Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China
- Erkun Yang: Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Pew-Thian Yap: Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Dinggang Shen: Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Mingxia Liu: Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
27
Li Y, Zhou D, Liu TT, Shen XZ. Application of deep learning in image recognition and diagnosis of gastric cancer. Artif Intell Gastrointest Endosc 2021; 2:12-24. [DOI: 10.37126/aige.v2.i2.12] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/15/2021] [Revised: 03/30/2021] [Accepted: 04/20/2021] [Indexed: 02/06/2023] Open
Abstract
In recent years, artificial intelligence has been extensively applied to the diagnosis of gastric cancer based on medical imaging, and deep learning in particular, as one of the mainstream approaches in image processing, has made remarkable progress. In this paper, we provide a comprehensive literature survey using four electronic databases: PubMed, EMBASE, Web of Science, and Cochrane. The literature search was performed up to November 2020. This article summarizes the existing image recognition algorithms, reviews the available datasets used in gastric cancer diagnosis and the current trends in applications of deep learning theory to image recognition of gastric cancer, and covers the theory of deep learning for endoscopic image recognition. We further evaluate the advantages and disadvantages of the current algorithms, summarize the characteristics of the existing image datasets, and, in combination with the latest progress in deep learning theory, propose suggestions for the application of optimization algorithms. Based on existing research and applications, the labels, quantity, size, resolution, and other aspects of the image datasets are also discussed. The future development of this field is analyzed from two perspectives, algorithm optimization and data support, with the aim of improving diagnostic accuracy and reducing the risk of misdiagnosis.
Affiliation(s)
- Yu Li: Department of Gastroenterology and Hepatology, Zhongshan Hospital Affiliated to Fudan University, Shanghai 200032, China
- Da Zhou: Department of Gastroenterology and Hepatology, Zhongshan Hospital Affiliated to Fudan University, Shanghai 200032, China
- Tao-Tao Liu: Department of Gastroenterology and Hepatology, Zhongshan Hospital Affiliated to Fudan University, Shanghai 200032, China
- Xi-Zhong Shen: Department of Gastroenterology and Hepatology, Zhongshan Hospital Affiliated to Fudan University, Shanghai 200032, China
28
Zhou Y, Yu K, Wang M, Ma Y, Peng Y, Chen Z, Zhu W, Shi F, Chen X. Speckle Noise Reduction for OCT Images based on Image Style Transfer and Conditional GAN. IEEE J Biomed Health Inform 2021; 26:139-150. [PMID: 33882009 DOI: 10.1109/jbhi.2021.3074852] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Raw optical coherence tomography (OCT) images are typically of low quality because speckle noise blurs retinal structures, severely compromising visual quality and degrading the performance of subsequent image analysis tasks. In our previous study, we developed a conditional generative adversarial network (cGAN) for speckle noise removal in OCT images collected by several commercial OCT scanners, which we collectively refer to as scanner T. In this paper, we improve the cGAN model and apply it to our in-house OCT scanner (scanner B) for speckle noise suppression. The proposed model consists of two steps: 1) we train a cycle-consistent GAN (CycleGAN) to learn style transfer between two OCT image datasets collected by different scanners, the purpose being to leverage the ground truth dataset created in our previous study; 2) we train a mini-cGAN model based on the PatchGAN mechanism with the ground truth dataset to suppress speckle noise in OCT images. After training, we first apply the CycleGAN model to convert raw images collected by scanner B to match the style of the images from scanner T, and subsequently use the mini-cGAN model to suppress speckle noise in the style-transferred images. We evaluate the proposed method on a dataset collected by scanner B. Experimental results show that the improved model outperforms our previous method and other state-of-the-art models in speckle noise removal, retinal structure preservation, and contrast enhancement.
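The two-step inference pipeline described above is simple to express in code. The sketch below assumes both trained generators are available as callables; names such as style_transfer_g and denoiser_g are hypothetical placeholders, not identifiers from the paper.

```python
# Sketch of the two-stage inference: a CycleGAN generator first maps scanner-B
# images into scanner-T style, then a cGAN denoiser trained on scanner-T ground
# truth suppresses speckle. Both networks are placeholder callables here.
import torch

@torch.no_grad()
def denoise_scanner_b(image_b, style_transfer_g, denoiser_g):
    """image_b: (N, 1, H, W) raw OCT tensor from the in-house scanner B."""
    image_t_style = style_transfer_g(image_b)   # CycleGAN: B -> T style transfer
    return denoiser_g(image_t_style)            # mini-cGAN speckle suppression
```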
29
Pandey P, Prathosh AP, Kyatham V, Mishra D, Dastidar TR. Target-Independent Domain Adaptation for WBC Classification Using Generative Latent Search. IEEE TRANSACTIONS ON MEDICAL IMAGING 2020; 39:3979-3991. [PMID: 32746144 DOI: 10.1109/tmi.2020.3009029] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Automating the classification of camera-obtained microscopic images of white blood cells (WBCs) and related cell subtypes has assumed importance since it aids the laborious manual process of review and diagnosis. Several state-of-the-art (SOTA) methods developed using deep convolutional neural networks suffer from the problem of domain shift: severe performance degradation when they are tested on data (target) obtained in a setting different from that of the training (source). The change in the target data might be caused by factors such as differences in camera/microscope types, lenses, lighting conditions, etc. This problem can potentially be solved using unsupervised domain adaptation (UDA) techniques, although standard algorithms presuppose the existence of a sufficient amount of unlabelled target data, which is not always the case with medical images. In this paper, we propose a method for UDA that does not require target data. Given a test image from the target domain, we obtain its 'closest clone' from the source data, which is used as a proxy in the classifier. We prove the existence of such a clone, given that an infinite number of data points can be sampled from the source distribution. We propose a method in which a latent-variable generative model based on variational inference is used to simultaneously sample and find the 'closest clone' from the source distribution through an optimization procedure in the latent space. We demonstrate the efficacy of the proposed method over several SOTA UDA methods for WBC classification on datasets captured using different imaging modalities under multiple settings.
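The 'closest clone' idea can be sketched as latent-space optimization: with a generative decoder trained only on source data, search for the latent code whose decoding best matches the target test image, then hand the decoded proxy to the source-trained classifier. The loss, step count, and initialization below are illustrative assumptions; the paper's variational search procedure is more elaborate.

```python
# Sketch of 'closest-clone' search: optimize a latent code so the source-trained
# decoder reproduces the target test image, yielding an on-manifold proxy.
import torch
import torch.nn.functional as F

def closest_clone(target_img, decoder, latent_dim=64, steps=200, lr=0.05):
    """target_img: (1, C, H, W); decoder: source-domain generative model."""
    z = torch.zeros(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        recon = decoder(z)                        # stays on the source manifold
        loss = F.mse_loss(recon, target_img)      # match the target image
        loss.backward()
        opt.step()
    return decoder(z).detach()                    # proxy fed to the classifier
```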
30
Ling H, Guo ZY, Tan LL, Guan RC, Chen JB, Song CL. Machine learning in diagnosis of coronary artery disease. Chin Med J (Engl) 2020; 134:401-403. [PMID: 33252376 PMCID: PMC7909316 DOI: 10.1097/cm9.0000000000001202] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2020] [Indexed: 01/24/2023] Open
Affiliation(s)
- Hao Ling: Department of Cardiology, the Second Hospital of Jilin University, Changchun, Jilin 130012, China
- Zi-Yuan Guo: Department of Cardiology, the Second Hospital of Jilin University, Changchun, Jilin 130012, China
- Lin-Lin Tan: Department of Cardiology, the Second Hospital of Jilin University, Changchun, Jilin 130012, China
- Ren-Chu Guan: Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, Jilin 130012, China
- Jing-Bo Chen: Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, Jilin 130012, China
- Chun-Li Song: Department of Cardiology, the Second Hospital of Jilin University, Changchun, Jilin 130012, China
31
Luo Y, Chen K, Liu L, Liu J, Mao J, Ke G, Sun M. Dehaze of Cataractous Retinal Images Using an Unpaired Generative Adversarial Network. IEEE J Biomed Health Inform 2020; 24:3374-3383. [PMID: 32750919 DOI: 10.1109/jbhi.2020.2999077] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Cataracts are the leading cause of visual impairment worldwide. Examination of the retina through cataracts using a fundus camera is challenging and error-prone due to degraded image quality. We sought to develop an algorithm to dehaze such images to support diagnosis by either ophthalmologists or computer-aided diagnosis systems. Based on the generative adversarial network (GAN) concept, we designed two neural networks: CataractSimGAN and CataractDehazeNet. CataractSimGAN was intended for the synthesis of cataract-like images from unpaired clear retinal images and cataract images. CataractDehazeNet was trained using pairs of synthesized cataract-like images and the corresponding clear images through supervised learning. With the two networks trained independently, the number of hyper-parameters was reduced, leading to better performance. We collected 400 retinal images without cataracts and 400 hazy images from cataract patients as the training dataset. Fifty cataract images and the corresponding clear images from the same patients after surgery comprised the test dataset. The clear images after surgery were used as references to evaluate the performance of our method. CataractDehazeNet was able to substantially enhance the degraded images from cataract patients and to visualize blood vessels and the optic disc, while actively suppressing the artifacts common in applications of similar methods. Thus, we developed an algorithm to improve the quality of retinal images acquired from cataract patients, achieving high structural similarity and fidelity between the processed images and images from the same patients after cataract surgery.
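The division of labor between the two networks suggests a simple training loop for the second stage: CataractSimGAN manufactures hazy versions of clear images, which then supervise CataractDehazeNet directly. The sketch below assumes an L1 reconstruction objective for brevity; the actual training also uses adversarial terms, and all component names are placeholders.

```python
# Sketch of the second-stage supervised training: synthetic cataract-like haze
# from the unpaired GAN provides (hazy, clear) pairs for the dehazing network.
import torch
import torch.nn.functional as F

def dehaze_training_step(clear_batch, sim_gan, dehaze_net, optimizer):
    with torch.no_grad():
        hazy_batch = sim_gan(clear_batch)     # CataractSimGAN: clear -> hazy
    optimizer.zero_grad()
    restored = dehaze_net(hazy_batch)         # CataractDehazeNet
    loss = F.l1_loss(restored, clear_batch)   # supervised by the known clear image
    loss.backward()
    optimizer.step()
    return loss.item()
```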
32
Uncertainty-aware domain alignment for anatomical structure segmentation. Med Image Anal 2020; 64:101732. [PMID: 32580058 DOI: 10.1016/j.media.2020.101732] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2019] [Revised: 05/13/2020] [Accepted: 05/22/2020] [Indexed: 11/20/2022]
Abstract
Automatic and accurate segmentation of anatomical structures in medical images is crucial for detecting various potential diseases. However, the segmentation performance of established deep neural networks may degrade on different modalities or devices owing to significant differences across domains, a problem known as domain shift. In this work, we propose an uncertainty-aware domain alignment framework to address the domain shift problem in the unsupervised domain adaptation (UDA) task. Specifically, we design an Uncertainty Estimation and Segmentation Module (UESM) to obtain the uncertainty map estimation. Then, a novel uncertainty-aware cross entropy (UCE) loss is proposed to leverage the uncertainty information to boost segmentation performance on highly uncertain regions. To further improve performance in the UDA task, an uncertainty-aware self-training (UST) strategy is developed to choose the optimal target samples under uncertainty guidance. In addition, an Uncertainty Feature Recalibration Module (UFRM) is applied to enforce the framework to minimize the cross-domain discrepancy. The proposed framework is evaluated on a private cross-device optical coherence tomography (OCT) dataset and a public cross-modality cardiac dataset released by MMWHS 2017. Extensive experiments indicate that the proposed UESM is both efficient and effective for uncertainty estimation in the UDA task, achieving state-of-the-art performance on both cross-modality and cross-device datasets.
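The abstract does not give the exact form of the UCE loss; a plausible reading is a per-pixel cross entropy reweighted by the estimated uncertainty map so that highly uncertain regions contribute more to the objective. The sketch below implements that assumption, with gamma as an illustrative weighting coefficient; the paper's formulation may differ.

```python
# Sketch of an uncertainty-weighted cross entropy in the spirit of the UCE loss:
# per-pixel CE is scaled by an uncertainty map so uncertain regions weigh more.
import torch
import torch.nn.functional as F

def uncertainty_weighted_ce(logits, labels, uncertainty, gamma=1.0):
    """logits: (N, C, H, W); labels: (N, H, W); uncertainty: (N, H, W) in [0, 1]."""
    ce = F.cross_entropy(logits, labels, reduction="none")  # per-pixel CE
    weights = 1.0 + gamma * uncertainty                     # boost uncertain pixels
    return (weights * ce).mean()
```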