1. Liu W, Pang J, Zhang B, Wang J, Liu B, Tao D. See Degraded Objects: A Physics-Guided Approach for Object Detection in Adverse Environments. IEEE Trans Image Process 2025; 34:2198-2212. [PMID: 40138227] [DOI: 10.1109/TIP.2025.3551533]
Abstract
In adverse environments, detectors often fail to detect degraded objects because they are almost invisible and their features are weakened by the environment. Common approaches apply image enhancement to support detection, but they inevitably introduce noise invisible to humans that negatively impacts the detector. In this work, we propose a physics-guided approach for object detection in adverse environments, which offers a straightforward solution that injects physical priors into the detector, enabling it to detect poorly visible objects. The physical priors, derived from the imaging mechanism and image properties, include an environment prior and a frequency prior. The environment prior is generated from a physical model, e.g., the atmospheric model, and reflects the density of environmental noise. The frequency prior is based on the observation that the amplitude spectrum can highlight object regions against the background. The two priors are complementary in principle. Furthermore, we present a physics-guided loss that incorporates a novel weight term, estimated by applying a membership function to the physical priors, which captures the extent of degradation. By backpropagating the physics-guided loss, physics knowledge is injected into the detector to aid in locating degraded objects. We conduct experiments in synthetic foggy environments, real foggy environments, and a real underwater scenario. The results demonstrate that our method is effective and achieves state-of-the-art performance. The code is available at https://github.com/PangJian123/See-Degraded-Objects.
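The weighting idea can be illustrated with a minimal numpy sketch. The linear membership function, its breakpoints, the `1 + w` weighting form, and the function names below are all illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def membership_weight(prior, low=0.2, high=0.8):
    # Linear membership function mapping a physical prior map (e.g.
    # environmental noise density in [0, 1]) to a degradation weight in [0, 1].
    return np.clip((prior - low) / (high - low), 0.0, 1.0)

def physics_guided_loss(pred, target, prior):
    # Per-pixel L1 error re-weighted so heavily degraded regions contribute
    # more gradient when backpropagated through the detector.
    w = membership_weight(prior)
    return float(np.mean((1.0 + w) * np.abs(pred - target)))
```

With this form, a region whose prior indicates heavy degradation contributes up to twice the gradient of a clean region.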
2. Ma ZP, Zhu YM, Zhang XD, Zhao YX, Zheng W, Yuan SR, Li GY, Zhang TL. Investigating the Use of Generative Adversarial Networks-Based Deep Learning for Reducing Motion Artifacts in Cardiac Magnetic Resonance. J Multidiscip Healthc 2025; 18:787-799. [PMID: 39963324] [PMCID: PMC11830935] [DOI: 10.2147/JMDH.S492163]
Abstract
Objective To evaluate the effectiveness of deep learning technology based on generative adversarial networks (GANs) in reducing motion artifacts in cardiac magnetic resonance (CMR) cine sequences. Methods The training and testing datasets consisted of 2000 and 200 pairs of clear and blurry images, respectively, acquired through simulated motion artifacts in CMR cine sequences. These datasets were used to establish and train a deep learning GAN model. To assess the efficacy of the deep learning network in mitigating motion artifacts, 100 images with simulated motion artifacts and 37 images with real-world motion artifacts encountered in clinical practice were selected. Image quality pre- and post-optimization was assessed using metrics including the peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), Tenengrad focus measure, and a 5-point Likert scale. Results After GAN optimization, notable improvements were observed in the PSNR, SSIM, and focus measure metrics for the 100 images with simulated artifacts. These metrics increased from initial values of 23.85±2.85, 0.71±0.08, and 4.56±0.67, respectively, to 27.91±1.74, 0.83±0.05, and 7.74±0.39 post-optimization. Additionally, the subjective assessment scores significantly improved from 2.44±1.08 to 4.44±0.66 (P<0.001). For the 37 images with real-world artifacts, the Tenengrad focus measure showed a significant enhancement, rising from 6.06±0.91 to 10.13±0.48 after artifact removal. Subjective ratings also increased from 3.03±0.73 to 3.73±0.87 (P<0.001). Conclusion GAN-based deep learning technology effectively reduces motion artifacts in CMR cine images, demonstrating significant potential for clinical application in optimizing CMR motion artifact management.
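The objective metrics used here are straightforward to reproduce. Below is a small numpy sketch of PSNR and the Tenengrad focus measure (mean squared Sobel gradient magnitude); the helper names are ours, and library implementations may differ in padding and normalisation details:

```python
import numpy as np

def tenengrad(img):
    # Tenengrad focus measure: mean squared Sobel gradient magnitude.
    # Higher values indicate a sharper (less motion-blurred) image.
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    def conv2(a, k):
        out = np.zeros(a.shape, dtype=float)
        p = np.pad(a.astype(float), 1, mode="edge")
        for i in range(3):
            for j in range(3):
                out += k[i, j] * p[i:i + a.shape[0], j:j + a.shape[1]]
        return out
    gx, gy = conv2(img, kx), conv2(img, ky)
    return float(np.mean(gx ** 2 + gy ** 2))

def psnr(ref, test, peak=255.0):
    # Peak signal-to-noise ratio in dB (assumes ref != test).
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float(10.0 * np.log10(peak ** 2 / mse))
```

A flat image scores zero on Tenengrad; edges or texture raise it, which is why the measure rises as blur is removed.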
Affiliation(s)
- Ze-Peng Ma
- Department of Radiology, Affiliated Hospital of Hebei University/ Clinical Medical College, Hebei University, Baoding, 071000, People’s Republic of China
- Hebei Key Laboratory of Precise Imaging of Inflammation Tumors, Baoding, Hebei Province, 071000, People’s Republic of China
- Yue-Ming Zhu
- College of Electronic and Information Engineering, Hebei University, Baoding, Hebei Province, 071002, People’s Republic of China
- Xiao-Dan Zhang
- Department of Ultrasound, Affiliated Hospital of Hebei University, Baoding, Hebei Province, 071000, People’s Republic of China
- Yong-Xia Zhao
- Department of Radiology, Affiliated Hospital of Hebei University/Clinical Medical College, Hebei University, Baoding, 071000, People’s Republic of China
- Wei Zheng
- College of Electronic and Information Engineering, Hebei University, Baoding, Hebei Province, 071002, People’s Republic of China
- Shuang-Rui Yuan
- Department of Radiology, Affiliated Hospital of Hebei University/Clinical Medical College, Hebei University, Baoding, 071000, People’s Republic of China
- Gao-Yang Li
- Department of Radiology, Affiliated Hospital of Hebei University/Clinical Medical College, Hebei University, Baoding, 071000, People’s Republic of China
- Tian-Le Zhang
- Department of Radiology, Affiliated Hospital of Hebei University/Clinical Medical College, Hebei University, Baoding, 071000, People’s Republic of China
3. Lin C, Chen Y, Feng S, Huang M. A multibranch and multiscale neural network based on semantic perception for multimodal medical image fusion. Sci Rep 2024; 14:17609. [PMID: 39080442] [PMCID: PMC11289490] [DOI: 10.1038/s41598-024-68183-3]
Abstract
Medical imaging is indispensable for accurate diagnosis and effective treatment, with modalities such as MRI and CT providing diverse yet complementary information. Traditional image fusion methods, while essential for consolidating information from multiple modalities, often suffer from poor image quality and loss of crucial details due to inadequate handling of semantic information and limited feature extraction capabilities. This paper introduces a novel medical image fusion technique, named DUSMIF, that leverages unsupervised image segmentation to enhance the semantic understanding of the fusion process. The method employs a multi-branch, multi-scale deep learning architecture that integrates advanced attention mechanisms to refine feature extraction and fusion. Semantic information extracted by unsupervised segmentation is integrated into the fusion process, which both enhances the semantic relevance of the fused images and improves overall fusion quality. The network extracts and fuses features at multiple scales and across multiple branches, capturing a comprehensive range of image details and contextual information. Multiple attention mechanisms selectively emphasize important features and integrate them effectively across modalities and scales, ensuring that the fused images maintain high quality and detail fidelity. A joint loss function combining content loss, structural similarity loss, and semantic loss guides the network in preserving image brightness and texture and ensures that the fused image closely resembles the source images in both content and structure. The proposed method demonstrates superior performance over existing fusion techniques in both objective assessments and subjective evaluations, confirming its effectiveness in enhancing the diagnostic utility of fused medical images.
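The shape of such a joint loss can be sketched in numpy. `global_ssim` below is a deliberately simplified single-window SSIM (no sliding window), and the semantic term, which requires segmentation outputs, is omitted, so this illustrates only the content and structural terms:

```python
import numpy as np

def global_ssim(x, y, c1=1e-4, c2=9e-4):
    # Single-window SSIM computed over the whole image: compares means,
    # variances, and covariance of the two images.
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def fusion_loss(fused, src, alpha=1.0, beta=1.0):
    # Content term (L1, preserving brightness/texture) plus structural
    # term (1 - SSIM); weights alpha/beta are illustrative.
    content = np.mean(np.abs(fused - src))
    return float(alpha * content + beta * (1.0 - global_ssim(fused, src)))
```

A fused image identical to the source scores zero; divergence in either intensity or structure raises the loss.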
Affiliation(s)
- Cong Lin
- School of Information and Communication Engineering, Hainan University, Haikou, 570228, Hainan, China
- Yinjie Chen
- School of Information and Communication Engineering, Hainan University, Haikou, 570228, Hainan, China
- Siling Feng
- School of Information and Communication Engineering, Hainan University, Haikou, 570228, Hainan, China
- Mengxing Huang
- School of Information and Communication Engineering, Hainan University, Haikou, 570228, Hainan, China
4. Chinnaiyan AM, Alfred Sylam BW. Deep demosaicking convolution neural network and quantum wavelet transform-based image denoising. Network (Bristol, England) 2024:1-25. [PMID: 38989778] [DOI: 10.1080/0954898X.2024.2358950]
Abstract
Demosaicking is a popular scientific area explored by a large number of researchers. Current digital imaging technologies capture colour images with a single monochrome sensor coupled with a Colour Filter Array (CFA), so a demosaicking procedure is required to obtain a full-colour image. Image denoising and image demosaicking are two important image restoration techniques that have grown in popularity in recent years, and finding a suitable strategy for joint image restoration is critical for researchers. Hence, a deep learning (DL)-based image denoising and demosaicking method is developed in this research. The Autoregressive Circle Wave Optimization (ACWO)-based Demosaicking Convolutional Neural Network (DMCNN) is designed for image demosaicking, while the Quantum Wavelet Transform (QWT) is used for image denoising by analysing the abrupt changes in the noisy input image. The transformed image is subjected to a thresholding technique that determines an appropriate threshold range, after which soft thresholding is applied to the resulting wavelet coefficients. The original image is then reconstructed using the Inverse Quantum Wavelet Transform (IQWT). Finally, the denoised and demosaicked images are combined into a fused image using a weighted average. The proposed QWT+DMCNN-ACWO model achieved a Peak Signal-to-Noise Ratio (PSNR) of 49.549 dB, a Second Derivative-like Measure of Enhancement (SDME) of 59.53 dB, a Structural Similarity Index (SSIM) of 0.963, a Figure of Merit (FOM) of 0.890, and a computational time of 0.571.
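The denoising pipeline (forward transform, soft thresholding of detail coefficients, inverse transform, weighted-average fusion) can be illustrated with a classical one-level Haar wavelet standing in for the quantum wavelet transform; the function names, the single-level restriction, and the Haar substitution are all our simplifications:

```python
import numpy as np

def haar_soft_denoise(signal, thresh):
    # One-level Haar DWT (classical stand-in for the QWT), soft
    # thresholding of the detail coefficients, then inverse transform.
    # Assumes an even-length 1-D signal.
    a = (signal[0::2] + signal[1::2]) / np.sqrt(2.0)      # approximation band
    d = (signal[0::2] - signal[1::2]) / np.sqrt(2.0)      # detail band
    d = np.sign(d) * np.maximum(np.abs(d) - thresh, 0.0)  # soft threshold
    out = np.empty_like(signal, dtype=float)
    out[0::2] = (a + d) / np.sqrt(2.0)                    # inverse transform
    out[1::2] = (a - d) / np.sqrt(2.0)
    return out

def fuse(denoised, demosaicked, w=0.5):
    # Weighted-average fusion of the two restoration outputs.
    return w * denoised + (1.0 - w) * demosaicked
```

Smooth regions pass through unchanged (their detail coefficients are zero), while high-frequency noise is shrunk toward zero by the threshold.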
Affiliation(s)
- Anitha Mary Chinnaiyan
- Research Scholar, Department of Computer Science, Nesamony Memorial Christian College, Marthandam, Manonmaniam Sundaranar University, Tirunelveli, India
- Boyed Wesley Alfred Sylam
- Department of PG Computer Science, Nesamony Memorial Christian College, Marthandam, Manonmaniam Sundaranar University, Abishekapatti, Tirunelveli, India
5. Wang Z, Li B, Yu H, Zhang Z, Ran M, Xia W, Yang Z, Lu J, Chen H, Zhou J, Shan H, Zhang Y. Promoting fast MR imaging pipeline by full-stack AI. iScience 2024; 27:108608. [PMID: 38174317] [PMCID: PMC10762466] [DOI: 10.1016/j.isci.2023.108608]
Abstract
Magnetic resonance imaging (MRI) is a widely used imaging modality in clinics for medical disease diagnosis, staging, and follow-up. Deep learning has been extensively used to accelerate k-space data acquisition, enhance MR image reconstruction, and automate tissue segmentation. However, these three tasks are usually treated as independent tasks and optimized for evaluation by radiologists, thus ignoring the strong dependencies among them; this may be suboptimal for downstream intelligent processing. Here, we present a novel paradigm, full-stack learning (FSL), which can simultaneously solve these three tasks by considering the overall imaging process and leverage the strong dependence among them to further improve each task, significantly boosting the efficiency and efficacy of practical MRI workflows. Experimental results obtained on multiple open MR datasets validate the superiority of FSL over existing state-of-the-art methods on each task. FSL has great potential to optimize the practical workflow of MRI for medical diagnosis and radiotherapy.
Affiliation(s)
- Zhiwen Wang
- School of Computer Science, Sichuan University, Chengdu, Sichuan, China
- Bowen Li
- School of Computer Science, Sichuan University, Chengdu, Sichuan, China
- Hui Yu
- School of Computer Science, Sichuan University, Chengdu, Sichuan, China
- Zhongzhou Zhang
- School of Computer Science, Sichuan University, Chengdu, Sichuan, China
- Maosong Ran
- School of Computer Science, Sichuan University, Chengdu, Sichuan, China
- Wenjun Xia
- School of Computer Science, Sichuan University, Chengdu, Sichuan, China
- Ziyuan Yang
- School of Computer Science, Sichuan University, Chengdu, Sichuan, China
- Jingfeng Lu
- School of Cyber Science and Engineering, Sichuan University, Chengdu, Sichuan, China
- Hu Chen
- School of Computer Science, Sichuan University, Chengdu, Sichuan, China
- Jiliu Zhou
- School of Computer Science, Sichuan University, Chengdu, Sichuan, China
- Hongming Shan
- Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai, China
- Yi Zhang
- School of Cyber Science and Engineering, Sichuan University, Chengdu, Sichuan, China
6. Xie Y, Zhang J, Liu L, Wang H, Ye Y, Verjans J, Xia Y. ReFs: A hybrid pre-training paradigm for 3D medical image segmentation. Med Image Anal 2024; 91:103023. [PMID: 37956551] [DOI: 10.1016/j.media.2023.103023]
Abstract
Self-supervised learning (SSL) has achieved remarkable progress in medical image segmentation. The application of an SSL algorithm often follows a two-stage training process: using unlabeled data to perform label-free representation learning and fine-tuning the pre-trained model on the downstream tasks. One issue of this paradigm is that the SSL step is unaware of the downstream task, which may lead to sub-optimal feature representation for a target task. In this paper, we propose a hybrid pre-training paradigm that is driven by both self-supervised and supervised objectives. To achieve this, a supervised reference task is involved in self-supervised learning, aiming to improve the representation quality. Specifically, we employ the off-the-shelf medical image segmentation task as reference, and encourage learning a representation that (1) incurs low prediction loss on both SSL and reference tasks and (2) leads to a similar gradient when updating the feature extractor from either task. In this way, the reference task pilots SSL in the direction beneficial for the downstream segmentation. To this end, we propose a simple but effective gradient matching method to optimize the model towards a consistent direction, thus improving the compatibility of both SSL and supervised reference tasks. We call this hybrid pre-training paradigm reference-guided self-supervised learning (ReFs), and perform it on a large-scale unlabeled dataset and an additional reference dataset. The experimental results demonstrate its effectiveness on seven downstream medical image segmentation benchmarks.
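The gradient matching idea can be sketched directly: penalise disagreement in direction between the SSL gradient and the supervised reference-task gradient. The cosine-based penalty below is an illustrative choice, not necessarily the paper's exact matching objective:

```python
import numpy as np

def grad_match_penalty(g_ssl, g_ref, eps=1e-12):
    # 1 - cosine similarity between the flattened SSL gradient and the
    # supervised reference-task gradient: ~0 when they point the same
    # way, up to 2 when they are directly opposed.
    cos = g_ssl @ g_ref / (np.linalg.norm(g_ssl) * np.linalg.norm(g_ref) + eps)
    return float(1.0 - cos)
```

Adding such a penalty to the pre-training objective steers the feature extractor toward updates that serve both tasks at once.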
Affiliation(s)
- Jianpeng Zhang
- School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
- Hu Wang
- University of Adelaide, Australia
- Yiwen Ye
- School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
- Yong Xia
- School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
7. Qiu Z, Guo H, Hu J, Jiang H, Luo C. Joint Fusion and Detection via Deep Learning in UAV-Borne Multispectral Sensing of Scatterable Landmine. Sensors (Basel) 2023; 23:5693. [PMID: 37420862] [DOI: 10.3390/s23125693]
Abstract
Compared with traditional mine detection methods, UAV-based measures are better suited to the rapid detection of scatterable landmines over large areas, and a multispectral fusion strategy based on a deep learning model is proposed to facilitate mine detection. Using a UAV-borne multispectral cruise platform, we establish a multispectral dataset of scatterable mines that takes the ground vegetation of mine-scattering areas into account. To achieve robust detection of occluded landmines, we first employ an active learning strategy to refine the labeling of the multispectral dataset. We then propose a detection-driven image fusion architecture, using YOLOv5 for the detection part, to improve detection performance while enhancing the quality of the fused image. Specifically, a simple and lightweight fusion network is designed to sufficiently aggregate texture details and semantic information from the source images at a high fusion speed. Moreover, we leverage the detection loss together with a joint-training algorithm to let semantic information flow back into the fusion network dynamically. Extensive qualitative and quantitative experiments demonstrate that the proposed detection-driven fusion (DDF) effectively increases the recall rate, especially for occluded landmines, and verify the feasibility of multispectral data given suitable processing.
Affiliation(s)
- Zhongze Qiu
- School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen 518107, China
- Hangfu Guo
- School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen 518107, China
- Jun Hu
- School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen 518107, China
- Hejun Jiang
- Science and Technology on Near-Surface Detection Laboratory, Wuxi 214035, China
- Chaopeng Luo
- Science and Technology on Near-Surface Detection Laboratory, Wuxi 214035, China
8. Xu YKT, Graves AR, Coste GI, Huganir RL, Bergles DE, Charles AS, Sulam J. Cross-modality supervised image restoration enables nanoscale tracking of synaptic plasticity in living mice. Nat Methods 2023; 20:935-944. [PMID: 37169928] [PMCID: PMC10250193] [DOI: 10.1038/s41592-023-01871-6]
Abstract
Learning is thought to involve changes in glutamate receptors at synapses, submicron structures that mediate communication between neurons in the central nervous system. Due to their small size and high density, synapses are difficult to resolve in vivo, limiting our ability to directly relate receptor dynamics to animal behavior. Here we developed a combination of computational and biological methods to overcome these challenges. First, we trained a deep-learning image-restoration algorithm that combines the advantages of ex vivo super-resolution and in vivo imaging modalities to overcome limitations specific to each optical system. When applied to in vivo images from transgenic mice expressing fluorescently labeled glutamate receptors, this restoration algorithm super-resolved synapses, enabling the tracking of behavior-associated synaptic plasticity with high spatial resolution. This work demonstrates how image enhancement that learns from ex vivo data can be combined with in vivo imaging to improve effective in vivo resolution.
Affiliation(s)
- Yu Kang T Xu
- Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
- Austin R Graves
- Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University School of Engineering, Baltimore, MD, USA
- Center for Imaging Science, Johns Hopkins University, Baltimore, MD, USA
- Gabrielle I Coste
- Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Richard L Huganir
- Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
- Dwight E Bergles
- Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
- Adam S Charles
- Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University School of Engineering, Baltimore, MD, USA
- Center for Imaging Science, Johns Hopkins University, Baltimore, MD, USA
- Jeremias Sulam
- Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University School of Engineering, Baltimore, MD, USA
- Center for Imaging Science, Johns Hopkins University, Baltimore, MD, USA
9. Han M, Shim H, Baek J. Utilization of an attentive map to preserve anatomical features for training convolutional neural-network-based low-dose CT denoiser. Med Phys 2023; 50:2787-2804. [PMID: 36734478] [DOI: 10.1002/mp.16263]
Abstract
BACKGROUND The purpose of a convolutional neural network (CNN)-based denoiser is to increase the diagnostic accuracy of low-dose computed tomography (LDCT) imaging. To increase diagnostic accuracy, there is a need for a method that reflects the features related to diagnosis during the denoising process. PURPOSE To provide a training strategy for LDCT denoisers that relies more on diagnostic task-related features to improve diagnostic accuracy. METHODS An attentive map derived from a lesion classifier (i.e., determining lesion-present or not) is created to represent the extent to which each pixel influences the decision by the lesion classifier. This is used as a weight to emphasize important parts of the image. The proposed training method consists of two steps. In the first one, the initial parameters of the CNN denoiser are trained using LDCT and normal-dose CT image pairs via supervised learning. In the second one, the learned parameters are readjusted using the attentive map to restore the fine details of the image. RESULTS Structural details and the contrast are better preserved in images generated by using the denoiser trained via the proposed method than in those generated by conventional denoisers. The proposed denoiser also yields higher lesion detectability and localization accuracy than conventional denoisers. CONCLUSIONS A denoiser trained using the proposed method preserves the small structures and the contrast in the denoised images better than without it. Specifically, using the attentive map improves the lesion detectability and localization accuracy of the denoiser.
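The role of the attentive map as a per-pixel weight in the fine-tuning step can be sketched in a few lines of numpy. The `1 + attn` weighting form and the function name are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def attentive_restore_loss(denoised, target, attn):
    # Squared error re-weighted by the classifier-derived attentive map
    # (assumed normalised to [0, 1]); in this form, pixels the lesion
    # classifier relies on count up to twice as much as the rest.
    return float(np.mean((1.0 + attn) * (denoised - target) ** 2))
```

With a zero map this reduces to plain MSE (the first training step); a non-zero map shifts the denoiser's capacity toward diagnostically important structures.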
Affiliation(s)
- Minah Han
- Graduate School of Artificial Intelligence, Yonsei University, Seoul, South Korea
- Bareunex Imaging, Inc., Seoul, South Korea
- Hyunjung Shim
- Graduate School of AI, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
- Jongduk Baek
- Graduate School of Artificial Intelligence, Yonsei University, Seoul, South Korea
- Bareunex Imaging, Inc., Seoul, South Korea
10. Sonar Image Garbage Detection via Global Despeckling and Dynamic Attention Graph Optimization. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.01.081]
11. Recognizing the Differentiation Degree of Human Induced Pluripotent Stem Cell-Derived Retinal Pigment Epithelium Cells Using Machine Learning and Deep Learning-Based Approaches. Cells 2023; 12:211. [PMID: 36672144] [PMCID: PMC9856279] [DOI: 10.3390/cells12020211]
Abstract
Induced pluripotent stem cells (iPSCs) can be differentiated into mesenchymal stem cells (iPSC-MSCs), retinal ganglion cells (iPSC-RGCs), and retinal pigment epithelium cells (iPSC-RPEs) to meet the demands of regenerative medicine. Since the production of iPSCs and iPSC-derived cell lineages generally requires massive and time-consuming laboratory work, an artificial intelligence (AI)-assisted approach that can facilitate cell classification and recognize the degree of cell differentiation is in critical demand. In this study, we propose the multi-slice tensor model, a modified convolutional neural network (CNN) designed to classify iPSC-derived cells and evaluate the differentiation efficiency of iPSC-RPEs. We removed the fully connected layers and projected the features using principal component analysis (PCA), and subsequently classified iPSC-RPEs according to their degree of differentiation. With the assistance of a support vector machine (SVM), this model further classified iPSCs, iPSC-MSCs, iPSC-RPEs, and iPSC-RGCs with an accuracy of 97.8%. In addition, the proposed model accurately recognized the differentiation of iPSC-RPEs and showed the potential to identify candidate cells with ideal features while excluding cells with immature or abnormal phenotypes. This rapid screening/classification system may facilitate the translation of iPSC-based technologies into clinical uses, such as cell transplantation therapy.
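The feature-projection step (replacing the removed fully connected layers with PCA) can be sketched with a plain SVD in numpy; the downstream SVM classifier is omitted here, and the function name is ours:

```python
import numpy as np

def pca_project(features, k=2):
    # Centre the flattened CNN feature vectors and project them onto the
    # top-k principal components; rows of Vt from the SVD of the centred
    # data matrix are the principal directions, ordered by variance.
    mu = features.mean(axis=0)
    X = features - mu
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:k].T
```

The low-dimensional projections would then be fed to the SVM for the four-way cell-type classification.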
12. Huang H, Xie Y, Wang G, Zhang L, Zhou W. DLNLF-net: Denoised local and non-local deep features fusion network for malignancy characterization of hepatocellular carcinoma. Comput Methods Programs Biomed 2022; 227:107201. [PMID: 36335751] [DOI: 10.1016/j.cmpb.2022.107201]
Abstract
INTRODUCTION Hepatocellular carcinoma (HCC) is a primary liver cancer with high mortality rate. The degree of HCC malignancy is an important prognostic factor for predicting recurrence and survival after surgical resection or liver transplantation in clinical practice. Currently, deep features obtained from data-driven machine learning algorithms have demonstrated superior performance in characterising lesion features in medical imaging processing. However, previous convolutional neural network (CNN)-based studies on HCC lesion characterisation were based on traditional local deep features. The aim of this study was to propose a denoised local and non-local deep features fusion network (DLNLF-net) for grading HCC. METHODS Gadolinium-diethylenetriaminepentaacetic-acid-enhanced magnetic resonance imaging data of 117 histopathologically proven HCCs were collected from 112 patients with resected HCC between October 2012 and October 2018. The proposed DLNLF-net primarily consists of three modules: feature denoising, non-local feature extraction, and bilinear kernel fusion. First, local feature maps were extracted from the original tumour images using convolution operations, followed by a feature denoising block to generate denoised local features. Simultaneously, a non-local feature extraction block was employed on the local feature maps to generate non-local features. Finally, the two generated features were fused using a bilinear kernel model to output the classification results. The dataset was divided into a training set (77 HCC images) and an independent test set (40 HCC images). Training and independent testing were repeated five times to reduce measurement errors. Accuracy, sensitivity, specificity, and area under the curve (AUC) values in the five repetitive tests were calculated to evaluate the performance of the proposed method. 
RESULTS Denoised local features (AUC 89.19%) and non-local features (AUC 88.28%) showed better performance than local features (AUC 86.21%) and global average pooling features (AUC 87.1%) that were derived from a CNN for malignancy characterisation of HCC. Furthermore, the proposed DLNLF-net yielded superior performance (AUC 94.89%) than a typical 3D CNN (AUC 86.21%), bilinear CNN (AUC 90.46%), recently proposed local and global diffusion method (AUC 93.94%), and convolutional block attention module method (AUC 93.62%) for malignancy characterisation of HCC. CONCLUSION The non-local operation demonstrated a better capability of yielding global representation, and feature denoising based on the non-local operation achieved performance gains for lesion characterisation. The proposed DLNLF-net, which integrates denoised local and non-local deep features, evidently outperforms conventional CNN-based methods in the malignancy characterisation of HCC.
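Bilinear kernel fusion of the two feature streams can be sketched as an outer product followed by the signed square-root and L2 normalisation common in bilinear CNNs; the paper's exact kernel model may differ, and the function name is ours:

```python
import numpy as np

def bilinear_fusion(local_feat, nonlocal_feat):
    # Outer product of the two feature vectors captures all pairwise
    # interactions; signed sqrt and L2 normalisation are the usual
    # post-processing in bilinear pooling.
    b = np.outer(local_feat, nonlocal_feat).ravel()
    b = np.sign(b) * np.sqrt(np.abs(b))
    n = np.linalg.norm(b)
    return b / n if n > 0 else b
```

The fused vector has dimension d_local x d_nonlocal and would feed the final classification layer.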
Affiliation(s)
- Haoyuan Huang
- School of Medical Information Engineering, Guangzhou University of Chinese Medicine, Guangzhou 510006, China
- Yanyan Xie
- School of Medical Information Engineering, Guangzhou University of Chinese Medicine, Guangzhou 510006, China
- Guangyi Wang
- Department of Radiology, Guangdong Provincial People's Hospital, Guangzhou 510080, China
- Lijuan Zhang
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Wu Zhou
- School of Medical Information Engineering, Guangzhou University of Chinese Medicine, Guangzhou 510006, China
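The non-local feature extraction and bilinear kernel fusion steps described in this abstract can be sketched numerically. This is a minimal illustration only, not the DLNLF-net implementation: the identity embeddings, feature sizes, and normalisation choices below are assumptions.

```python
import numpy as np

def non_local_block(x):
    """Simplified non-local operation (embedded-Gaussian form with
    identity embeddings): every position attends to all others.
    x: (N, C) feature map flattened over spatial positions."""
    sim = x @ x.T                            # (N, N) pairwise similarities
    sim -= sim.max(axis=1, keepdims=True)    # numerical stability
    attn = np.exp(sim)
    attn /= attn.sum(axis=1, keepdims=True)  # softmax over positions
    return attn @ x                          # aggregate global context

def bilinear_fusion(a, b):
    """Bilinear kernel fusion of two feature streams a, b: (N, C).
    Outer-product pooling followed by signed sqrt and L2 normalisation,
    as in classic bilinear CNNs."""
    z = (a.T @ b) / a.shape[0]               # (C, C) pooled outer product
    z = z.ravel()
    z = np.sign(z) * np.sqrt(np.abs(z))
    return z / (np.linalg.norm(z) + 1e-12)

rng = np.random.default_rng(0)
local = rng.normal(size=(64, 16))            # stand-in for denoised local features
nonlocal_feat = non_local_block(local)       # stand-in for non-local features
fused = bilinear_fusion(local, nonlocal_feat)
print(fused.shape)                           # (256,)
```

The fused vector would then feed a small classifier head; the complementary streams enter symmetrically through the outer product.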
13
Image denoising in the deep learning era. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10305-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
14
Lee Y, Cho S, Jun D. Video Super-Resolution Method Using Deformable Convolution-Based Alignment Network. SENSORS (BASEL, SWITZERLAND) 2022; 22:s22218476. [PMID: 36366175 PMCID: PMC9656337 DOI: 10.3390/s22218476] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/23/2022] [Revised: 10/24/2022] [Accepted: 11/01/2022] [Indexed: 06/12/2023]
Abstract
With the advancement of sensors, image and video processing have become central to visual sensing applications. Among them, video super-resolution (VSR) aims to reconstruct high-resolution sequences from low-resolution sequences. To exploit consecutive contexts within a low-resolution sequence, VSR learns the spatial and temporal characteristics of multiple frames of the low-resolution sequence. As a convolutional neural network-based VSR method, we propose a deformable convolution-based alignment network (DCAN) to generate high-resolution sequences at quadruple the size of the low-resolution sequences. The proposed method consists of a feature extraction block, two different alignment blocks that use deformable convolution, and an up-sampling block. Experimental results show that the proposed DCAN achieves better peak signal-to-noise ratio and structural similarity index measure than the compared methods, while significantly reducing network complexity in terms of the number of parameters, total memory, and inference time relative to the latest method.
Affiliation(s)
- Yooho Lee
- Department of Computer Engineering, Dong-A University, Busan 49315, Korea
- Sukhee Cho
- Media Intelligence Laboratory, Electronics and Telecommunications Research Institute (ETRI), Daejeon 34129, Korea
- Dongsan Jun
- Department of Computer Engineering, Dong-A University, Busan 49315, Korea
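Deformable-convolution alignment boils down to sampling each convolution tap at a learned fractional offset via bilinear interpolation. The toy sketch below uses random offsets purely for illustration; in DCAN the offsets are predicted by a sub-network, and this is not the paper's implementation.

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Bilinearly sample feat (H, W) at fractional location (y, x),
    clamping coordinates to the image border."""
    H, W = feat.shape
    y = min(max(y, 0.0), H - 1.0)
    x = min(max(x, 0.0), W - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    dy, dx = y - y0, x - x0
    return ((1 - dy) * (1 - dx) * feat[y0, x0] + (1 - dy) * dx * feat[y0, x1]
            + dy * (1 - dx) * feat[y1, x0] + dy * dx * feat[y1, x1])

def deformable_align(ref, offsets, k=3):
    """Gather a k*k deformable neighbourhood for every pixel of `ref`,
    displacing each tap of the regular grid by a fractional offset.
    offsets: (H, W, k*k, 2) per-tap (dy, dx) displacements."""
    H, W = ref.shape
    half = k // 2
    grid = [(a, b) for a in range(-half, half + 1) for b in range(-half, half + 1)]
    out = np.zeros((H, W, k * k))
    for i in range(H):
        for j in range(W):
            for t, (gy, gx) in enumerate(grid):
                dy, dx = offsets[i, j, t]
                out[i, j, t] = bilinear_sample(ref, i + gy + dy, j + gx + dx)
    return out

rng = np.random.default_rng(1)
frame = rng.normal(size=(8, 8))
offs = rng.normal(scale=0.5, size=(8, 8, 9, 2))
aligned = deformable_align(frame, offs)
print(aligned.shape)  # (8, 8, 9)
```

With all offsets zero this reduces to ordinary im2col-style sampling, which is a handy sanity check.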
15
Wang H, Liu Z, Peng D, Cheng Z. Attention-guided joint learning CNN with noise robustness for bearing fault diagnosis and vibration signal denoising. ISA TRANSACTIONS 2022; 128:470-484. [PMID: 34961609 DOI: 10.1016/j.isatra.2021.11.028] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/20/2021] [Revised: 11/15/2021] [Accepted: 11/15/2021] [Indexed: 06/14/2023]
Abstract
Mechanical systems usually operate in harsh environments, and the monitored vibration signal faces substantial noise interference, which poses great challenges to robust fault diagnosis. This paper proposes a novel attention-guided joint learning convolutional neural network (JL-CNN) for mechanical equipment condition monitoring. The fault diagnosis task (FD-Task) and the signal denoising task (SD-Task) are integrated into an end-to-end CNN architecture, achieving good noise robustness through dual-task joint learning. JL-CNN mainly includes a joint feature encoding network and two attention-based encoder networks. This architecture allows the FD-Task and SD-Task to achieve deep cooperation and mutual learning. JL-CNN is evaluated on a wheelset bearing dataset and a motor bearing dataset; the results show that it has excellent fault diagnosis and signal denoising ability and performs well under strong and unknown noise.
Affiliation(s)
- Huan Wang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Zhiliang Liu
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; Institute of Electronic and Information Engineering of UESTC in Guangdong, Dongguan 523808, China
- Dandan Peng
- Department of Mechanical Engineering, KU Leuven, Leuven, Belgium; Dynamics of Mechanical and Mechatronic Systems, Flanders Make, Belgium
- Zhe Cheng
- College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China; Laboratory of Science and Technology on Integrated Logistics Support, NUDT, Changsha 410073, China
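Dual-task joint learning of this kind typically optimises a weighted sum of a denoising reconstruction loss and a classification loss. A hedged sketch follows; the MSE plus cross-entropy combination and the weight `lam` are generic assumptions, not the exact JL-CNN objective.

```python
import numpy as np

def joint_loss(denoised, clean, logits, label, lam=0.5):
    """Joint objective for dual-task learning: MSE for the signal
    denoising task plus cross-entropy for the fault diagnosis task,
    balanced by lam. denoised/clean: (L,) signals; logits: (K,) scores."""
    mse = np.mean((denoised - clean) ** 2)
    z = logits - logits.max()                # numerically stable softmax
    log_prob = z - np.log(np.exp(z).sum())
    ce = -log_prob[label]
    return mse + lam * ce, mse, ce

rng = np.random.default_rng(2)
clean = np.sin(np.linspace(0, 8 * np.pi, 256))        # toy vibration signal
noisy_out = clean + rng.normal(scale=0.05, size=256)  # imperfect denoiser output
total, mse, ce = joint_loss(noisy_out, clean, np.array([2.0, 0.1, -1.0]), 0)
print(round(mse, 4), round(ce, 4))
```

Backpropagating through the shared encoder lets the two terms regularise each other, which is the mechanism behind the noise robustness the abstract reports.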
16
Li H, Yang Z, Hong X, Zhao Z, Chen J, Shi Y, Pan J. DnSwin: Toward real-world denoising via a continuous Wavelet Sliding Transformer. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/31/2022]
17
Huang H, Hu C, Li J, Dong X, Chen H. CoCoCs: co-optimized compressive imaging driven by high-level vision. OPTICS EXPRESS 2022; 30:30894-30910. [PMID: 36242185 DOI: 10.1364/oe.468733] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/26/2022] [Accepted: 07/31/2022] [Indexed: 06/16/2023]
Abstract
Compressive imaging senses optically encoded high-dimensional scene data with far fewer measurements and then performs reconstruction via appropriate algorithms. In this paper, we present a novel noniterative end-to-end deep learning-based framework for compressive imaging, dubbed CoCoCs. In comparison to existing approaches, we extend the pipeline by co-optimizing the recovery algorithm with the optical coding as well as cascaded high-level computer vision tasks to boost the quality of the reconstruction. We demonstrate the proposed framework on two typical compressive imaging systems, i.e., single-pixel imaging and snapshot video compressive imaging. Extensive results, including conventional image quality criteria, mean opinion scores, and accuracy in image classification and motion recognition, confirm that CoCoCs can yield realistic images and videos that are friendly to both human viewing and computer vision. We hope CoCoCs will give impetus to bridging the gap between compressive imagers, computer vision, and human perception.
18
19
Chen Z, Jiang Y, Liu D, Wang Z. CERL: A Unified Optimization Framework for Light Enhancement With Realistic Noise. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; PP:4162-4172. [PMID: 35700251 DOI: 10.1109/tip.2022.3180213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Low-light images captured in the real world are inevitably corrupted by sensor noise. Such noise is spatially variant and highly dependent on the underlying pixel intensity, deviating from the oversimplified assumptions in conventional denoising. Existing light enhancement methods either overlook the important impact of real-world noise during enhancement, or treat noise removal as a separate pre- or post-processing step. We present Coordinated Enhancement for Real-world Low-light Noisy Images (CERL), which seamlessly integrates light enhancement and noise suppression into a unified and physics-grounded optimization framework. For the real low-light noise removal part, we customize a self-supervised denoising model that can easily be adapted without referring to clean ground-truth images. For the light enhancement part, we also improve the design of a state-of-the-art backbone. The two parts are then jointly formulated into one principled plug-and-play optimization. Our approach is compared against state-of-the-art low-light enhancement methods both qualitatively and quantitatively. Besides standard benchmarks, we further collect and test on a new realistic low-light mobile photography dataset (RLMP), whose mobile-captured photos display heavier realistic noise than those taken by high-quality cameras. CERL consistently produces the most visually pleasing and artifact-free results across all experiments. Our RLMP dataset and codes are available at: https://github.com/VITA-Group/CERL.
20
Perceptual adversarial non-residual learning for blind image denoising. Soft comput 2022. [DOI: 10.1007/s00500-022-06853-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
21
Guo L, Zha Z, Ravishankar S, Wen B. Exploiting Non-Local Priors via Self-Convolution for Highly-Efficient Image Restoration. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:1311-1324. [PMID: 35020596 DOI: 10.1109/tip.2022.3140918] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Constructing effective priors is critical to solving ill-posed inverse problems in image processing and computational imaging. Recent works focused on exploiting non-local similarity by grouping similar patches for image modeling, and demonstrated state-of-the-art results in many image restoration applications. However, compared to classic methods based on filtering or sparsity, non-local algorithms are more time-consuming, mainly due to the highly inefficient block matching step, i.e., the distance between every pair of overlapping patches must be computed. In this work, we propose a novel Self-Convolution operator to exploit image non-local properties in a unified framework. We prove that the proposed Self-Convolution based formulation can generalize the commonly-used non-local modeling methods, as well as produce results equivalent to standard methods, but with much cheaper computation. Furthermore, by applying Self-Convolution, we propose an effective multi-modality image restoration scheme, which is much more efficient than conventional block matching for non-local modeling. Experimental results demonstrate that (1) Self-Convolution with a fast Fourier transform implementation can significantly speed up most popular non-local image restoration algorithms, with two-fold to nine-fold faster block matching, and (2) the proposed online multi-modality image restoration scheme outperforms competing methods in both efficiency and effectiveness on RGB-NIR images. The code for this work is publicly available at https://github.com/GuoLanqing/Self-Convolution.
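The speed-up described here comes from replacing per-pair patch comparisons with convolutions, using the expansion ||p - q||^2 = ||p||^2 - 2<p, q> + ||q||^2 so both terms become sliding-window operations. The sketch below illustrates the idea with FFT-based block matching against one reference patch; it is not the paper's Self-Convolution operator itself.

```python
import numpy as np

def patch_distances_fft(img, top, left, k):
    """Squared L2 distance between the k*k reference patch at (top, left)
    and every overlapping k*k patch, computed with FFT convolutions
    instead of an explicit loop over patch pairs."""
    H, W = img.shape
    ref = img[top:top + k, left:left + k]
    S = (H + k - 1, W + k - 1)              # pad to avoid circular wrap
    # cross-correlation <patch_q, ref> for all patch positions q at once
    cross = np.fft.irfft2(np.fft.rfft2(img, s=S)
                          * np.fft.rfft2(ref[::-1, ::-1], s=S), s=S)[k - 1:H, k - 1:W]
    # sliding sum of squares via convolution with an all-ones kernel
    sq = np.fft.irfft2(np.fft.rfft2(img ** 2, s=S)
                       * np.fft.rfft2(np.ones((k, k)), s=S), s=S)[k - 1:H, k - 1:W]
    return sq - 2.0 * cross + np.sum(ref ** 2)

rng = np.random.default_rng(3)
image = rng.normal(size=(32, 32))
d = patch_distances_fft(image, 5, 7, 8)
print(d.shape)  # one distance per valid patch position: (25, 25)
```

The naive loop costs O(N^2 k^2) for N patch positions, while the FFT route is O(N log N) per reference, which is where the reported two-fold to nine-fold block-matching speed-ups come from.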
22
Lei Y, Zhang J, Shan H. Strided Self-Supervised Low-Dose CT Denoising for Lung Nodule Classification. PHENOMICS (CHAM, SWITZERLAND) 2021; 1:257-268. [PMID: 36939784 PMCID: PMC9590543 DOI: 10.1007/s43657-021-00025-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/08/2021] [Revised: 09/04/2021] [Accepted: 09/14/2021] [Indexed: 11/26/2022]
Abstract
Lung nodule classification based on low-dose computed tomography (LDCT) images has attracted major attention thanks to the reduced radiation dose and its potential for early diagnosis of lung cancer from LDCT-based lung cancer screening. However, LDCT images suffer from severe noise, which largely influences the performance of lung nodule classification. Current methods combining denoising and classification tasks typically require the corresponding normal-dose CT (NDCT) images as the supervision for the denoising task, which is impractical in the context of clinical diagnosis using LDCT. To jointly train these two tasks in a unified framework without the NDCT images, this paper introduces a novel self-supervised method, termed strided Noise2Neighbors or SN2N, for blind medical image denoising and lung nodule classification, where the supervision is generated from the noisy input images. More specifically, the proposed SN2N constructs the supervision information from neighboring pixels for LDCT denoising and therefore no longer requires NDCT images. SN2N enables joint training of the LDCT denoising and lung nodule classification tasks by using a self-supervised loss for denoising and a cross-entropy loss for classification. Extensive experimental results on the Mayo LDCT dataset demonstrate that SN2N achieves competitive performance compared with supervised learning methods that use paired NDCT images as supervision. Moreover, our results on the LIDC-IDRI dataset show that joint training of LDCT denoising and lung nodule classification significantly improves the performance of LDCT-based lung nodule classification.
Affiliation(s)
- Yiming Lei
- Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai 200433, China
- Junping Zhang
- Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai 200433, China
- Hongming Shan
- Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China
- Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai 201210, China
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai 201210, China
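The core of constructing supervision "from its neighbors" can be illustrated with stride-2 subsampling: two interleaved sub-images share almost identical clean content but carry independent noise, so one can supervise the other without any clean NDCT image. This is a sketch under that assumption, not the exact SN2N sampler.

```python
import numpy as np

def strided_neighbor_pair(noisy):
    """Split one noisy image into two stride-2 sub-images: even columns
    serve as network input and the adjacent odd columns as the target.
    Their underlying content is nearly identical while their noise is
    independent."""
    return noisy[0::2, 0::2], noisy[0::2, 1::2]

def self_supervised_loss(denoiser, noisy):
    """Self-supervised denoising loss: penalise the denoised sub-image
    against its noisy neighbour (no clean reference needed)."""
    inp, tgt = strided_neighbor_pair(noisy)
    return float(np.mean((denoiser(inp) - tgt) ** 2))

rng = np.random.default_rng(4)
clean = np.outer(np.linspace(0, 1, 64), np.linspace(0, 1, 64))  # smooth scene
noisy = clean + rng.normal(scale=0.1, size=clean.shape)
# with an identity "denoiser" the loss approaches 2*sigma^2: the sum of
# the two sub-images' independent noise variances
loss = self_supervised_loss(lambda x: x, noisy)
print(round(loss, 3))
```

In joint training, this loss would simply be added to the cross-entropy classification loss, so both tasks share the same noisy inputs.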
23
Huang Y, Zhou Z, Sai X, Xu Y, Zou Y. Hierarchical hashing-based multi-source image retrieval method for image denoising. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.108028] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
24
Zhang H, Chen H, Yang G, Zhang L. LR-Net: Low-Rank Spatial-Spectral Network for Hyperspectral Image Denoising. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:8743-8758. [PMID: 34665726 DOI: 10.1109/tip.2021.3120037] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Due to the physical limitations of the imaging devices, hyperspectral images (HSIs) are commonly distorted by a mixture of Gaussian noise, impulse noise, stripes, and dead lines, leading to the decline in the performance of unmixing, classification, and other subsequent applications. In this paper, we propose a novel end-to-end low-rank spatial-spectral network (LR-Net) for the removal of the hybrid noise in HSIs. By integrating the low-rank physical property into a deep convolutional neural network (DCNN), the proposed LR-Net simultaneously enjoys the strong feature representation ability from DCNN and the implicit physical constraint of clean HSIs. Firstly, spatial-spectral atrous blocks (SSABs) are built to exploit spatial-spectral features of HSIs. Secondly, these spatial-spectral features are forwarded to a multi-atrous block (MAB) to aggregate the context in different receptive fields. Thirdly, the contextual features and spatial-spectral features from different levels are concatenated before being fed into a plug-and-play low-rank module (LRM) for feature reconstruction. With the help of the LRM, the workflow of low-rank matrix reconstruction can be streamlined in a differentiable manner. Finally, the low-rank features are utilized to capture the latent semantic relationships of the HSIs to recover clean HSIs. Extensive experiments on both simulated and real-world datasets were conducted. The experimental results show that the LR-Net outperforms other state-of-the-art denoising methods in terms of evaluation metrics and visual assessments. Particularly, through the collaborative integration of DCNNs and the low-rank property, the LR-Net shows strong stability and capacity for generalization.
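The "implicit physical constraint" exploited above is that clean hyperspectral unfoldings are approximately low-rank along the spectral dimension. The classical operation an LRM-style module builds on can be sketched with a truncated SVD; this is illustrative only, since LR-Net performs the reconstruction inside a differentiable learned module.

```python
import numpy as np

def low_rank_project(F, rank):
    """Project a (pixels x bands) feature matrix onto its best rank-r
    approximation via truncated SVD, the classical low-rank
    reconstruction step."""
    U, s, Vt = np.linalg.svd(F, full_matrices=False)
    s[rank:] = 0.0
    return (U * s) @ Vt

rng = np.random.default_rng(5)
# synthetic "clean" HSI unfolding with rank-3 spectral structure
clean = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 31))
noisy = clean + rng.normal(scale=0.3, size=clean.shape)  # hybrid noise stand-in
recon = low_rank_project(noisy, rank=3)
err_noisy = np.linalg.norm(noisy - clean)
err_recon = np.linalg.norm(recon - clean)
print(err_recon < err_noisy)  # projection discards most full-rank noise energy
```

Because noise is spread across all singular directions while the clean signal concentrates in a few, keeping only the leading subspace suppresses the mixture of Gaussian noise, stripes, and dead lines that the abstract lists.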
25
Chen K, Long K, Ren Y, Sun J, Pu X. Lesion-Inspired Denoising Network: Connecting Medical Image Denoising and Lesion Detection. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA 2021. [DOI: 10.1145/3474085.3475480] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]
Affiliation(s)
- Kecheng Chen
- University of Electronic Science and Technology of China, Chengdu, China
- Kun Long
- University of Electronic Science and Technology of China, Chengdu, China
- Yazhou Ren
- University of Electronic Science and Technology of China, Chengdu, China
- Jiayu Sun
- West China Hospital of Sichuan University, Chengdu, China
- Xiaorong Pu
- University of Electronic Science and Technology of China, Chengdu, China
26
Multi-Task Learning-Based Immunofluorescence Classification of Kidney Disease. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2021; 18:ijerph182010798. [PMID: 34682567 PMCID: PMC8535636 DOI: 10.3390/ijerph182010798] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/24/2021] [Revised: 09/24/2021] [Accepted: 09/26/2021] [Indexed: 12/14/2022]
Abstract
Chronic kidney disease is one of the most important causes of mortality worldwide, but a shortage of nephrology pathologists has led to delays or errors in its diagnosis and treatment. Immunofluorescence (IF) images of patients with IgA nephropathy (IgAN), membranous nephropathy (MN), diabetic nephropathy (DN), and lupus nephritis (LN) were obtained from the General Hospital of Chinese PLA. The data were divided into training and test sets. To simulate the inaccurate focus of the fluorescence microscope, the Gaussian method was employed to blur the IF images. We proposed a novel multi-task learning (MTL) method for image quality assessment, de-blurring, and disease classification tasks. A total of 1608 patients' IF images were included: 1289 in the training set and 319 in the test set. For non-blurred IF images, the classification accuracy on the test set was 0.97, with an AUC of 1.000. For blurred IF images, the proposed MTL method had a higher accuracy (0.94 vs. 0.93, p < 0.01) and higher AUC (0.993 vs. 0.986) than the common MTL method. The novel MTL method not only diagnosed four types of kidney disease from blurred IF images but also showed good performance in two auxiliary tasks: image quality assessment and de-blurring.
27
Wang X, Xu M, Zhang J, Jiang L, Li L, He M, Wang N, Liu H, Wang Z. Joint Learning of Multi-level Tasks for Diabetic Retinopathy Grading on Low-resolution Fundus Images. IEEE J Biomed Health Inform 2021; 26:2216-2227. [PMID: 34648460 DOI: 10.1109/jbhi.2021.3119519] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
Diabetic retinopathy (DR) is a leading cause of permanent blindness among working-age people. Automatic DR grading can help ophthalmologists provide timely treatment for patients. However, the existing grading methods are usually trained with high-resolution (HR) fundus images, so grading performance drops considerably on low-resolution (LR) images, which are common in clinical practice. In this paper, we mainly focus on DR grading with LR fundus images. From our analysis of the DR task, we find that: 1) image super-resolution (ISR) can boost the performance of both DR grading and lesion segmentation; 2) the lesion segmentation regions of fundus images are highly consistent with the pathological regions used for DR grading. Based on these findings, we propose a convolutional neural network (CNN)-based method for joint learning of multi-level tasks for DR grading, called DeepMT-DR, which can simultaneously handle the low-level task of ISR, the mid-level task of lesion segmentation, and the high-level task of disease severity classification on LR fundus images. Moreover, a novel task-aware loss is developed to encourage ISR to focus on the pathological regions for its subsequent tasks: lesion segmentation and DR grading. Extensive experimental results show that our DeepMT-DR method significantly outperforms other state-of-the-art methods for DR grading over three datasets. In addition, our method achieves comparable performance in the two auxiliary tasks of ISR and lesion segmentation.
28
Li K, Zhou W, Li H, Anastasio MA. Assessing the Impact of Deep Neural Network-Based Image Denoising on Binary Signal Detection Tasks. IEEE TRANSACTIONS ON MEDICAL IMAGING 2021; 40:2295-2305. [PMID: 33929958 PMCID: PMC8673589 DOI: 10.1109/tmi.2021.3076810] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]
Abstract
A variety of deep neural network (DNN)-based image denoising methods have been proposed for use with medical images. Traditional measures of image quality (IQ) have been employed to optimize and evaluate these methods. However, the objective evaluation of IQ for the DNN-based denoising methods remains largely lacking. In this work, we evaluate the performance of DNN-based denoising methods by use of task-based IQ measures. Specifically, binary signal detection tasks under signal-known-exactly (SKE) with background-known-statistically (BKS) conditions are considered. The performance of the ideal observer (IO) and common linear numerical observers are quantified and detection efficiencies are computed to assess the impact of the denoising operation on task performance. The numerical results indicate that, in the cases considered, the application of a denoising network can result in a loss of task-relevant information in the image. The impact of the depth of the denoising networks on task performance is also assessed. The presented results highlight the need for the objective evaluation of IQ for DNN-based denoising technologies and may suggest future avenues for improving their effectiveness in medical imaging applications.
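Task-based IQ replaces pixel-wise metrics with observer detectability. A sketch of the Hotelling-observer SNR and the detection efficiency of a denoising operator follows; the white covariance and the toy linear "denoiser" are illustrative assumptions, whereas the paper evaluates ideal and linear numerical observers under SKE/BKS conditions.

```python
import numpy as np

def hotelling_snr(signal, cov):
    """Hotelling-observer detectability for a binary signal-known-exactly
    task. signal: mean image difference between signal-present and
    signal-absent classes; cov: background-plus-noise covariance."""
    w = np.linalg.solve(cov, signal)   # Hotelling template K^{-1} (s1 - s0)
    return float(np.sqrt(signal @ w))  # SNR_HO

n = 16
s = np.exp(-((np.arange(n) - n / 2) ** 2) / 8.0)  # small Gaussian signal profile
cov = 0.5 * np.eye(n)                             # white-noise covariance (toy)
d_raw = hotelling_snr(s, cov)

# a linear denoising operator D maps the signal to D s and the
# covariance to D K D^T; detection efficiency compares detectabilities
D = 0.8 * np.eye(n)                               # invertible toy denoiser
d_denoised = hotelling_snr(D @ s, D @ cov @ D.T)
efficiency = (d_denoised / d_raw) ** 2
print(round(efficiency, 6))  # 1.0: an invertible linear operator preserves detectability
```

The sketch makes the abstract's point concrete: simple invertible rescaling cannot remove task-relevant information, so efficiencies below one must come from the non-invertible or nonlinear behaviour of trained denoising networks.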
29
Ye H, Li H, Chen CLP. Adaptive Deep Cascade Broad Learning System and Its Application in Image Denoising. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:4450-4463. [PMID: 32203051 DOI: 10.1109/tcyb.2020.2978500] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
This article proposes a novel regularized deep cascade broad learning system (DCBLS) architecture, which includes one cascaded feature mapping nodes layer and one cascaded enhancement nodes layer. The transformed feature representation is then easily obtained by combining the enhancement nodes and the feature mapping nodes. Once such a representation is established, a final output layer is constructed by solving a simple convex optimization model. Furthermore, a parallelization framework for the new method is designed to make it compatible with large-scale data. An adaptive regularization parameter criterion is also adopted under certain conditions. Moreover, the stability and error estimate of the method are discussed and proved mathematically. The proposed method extracts more of the available information from the raw data than the standard broad learning system and achieves compelling success in image denoising. Experimental results on benchmark datasets, including natural images as well as hyperspectral images, verify the effectiveness and superiority of the proposed method in comparison with state-of-the-art approaches for image denoising.
30
Adversarial Gaussian Denoiser for Multiple-Level Image Denoising. SENSORS 2021; 21:s21092998. [PMID: 33923320 PMCID: PMC8123214 DOI: 10.3390/s21092998] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/23/2021] [Revised: 04/17/2021] [Accepted: 04/23/2021] [Indexed: 12/28/2022]
Abstract
Image denoising is a challenging task that is essential in numerous computer vision and image processing problems. This study proposes and applies a generative adversarial network-based image denoising training architecture to multiple-level Gaussian image denoising tasks. Convolutional neural network-based denoising approaches suffer from a blurriness issue: denoised images lose texture detail. To resolve this issue, we first performed a theoretical study of its cause. Subsequently, we proposed an adversarial Gaussian denoiser network, which uses a generative adversarial network-based adversarial learning process for image denoising tasks. This framework resolves the blurriness problem by encouraging the denoiser network to find the distribution of sharp noise-free images instead of blurry images. Experimental results demonstrate that the proposed framework can effectively resolve the blurriness problem and achieves significantly better denoising performance than state-of-the-art denoising methods.
31
Sun W, Gong D, Shi Q, van den Hengel A, Zhang Y. Learning to Zoom-In via Learning to Zoom-Out: Real-World Super-Resolution by Generating and Adapting Degradation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:2947-2962. [PMID: 33471753 DOI: 10.1109/tip.2021.3049951] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Most learning-based super-resolution (SR) methods aim to recover a high-resolution (HR) image from a given low-resolution (LR) image via learning on LR-HR image pairs. SR methods learned on synthetic data do not perform well in the real world, due to the domain gap between artificially synthesized and real LR images. Some efforts have thus been made to capture real-world image pairs. However, the captured LR-HR image pairs usually suffer from unavoidable misalignment, which hampers the performance of end-to-end learning. Here, focusing on real-world SR, we ask a different question: since misalignment is unavoidable, can we propose a method that does not need LR-HR image pairing and alignment at all and utilizes real images as they are? Hence we propose a framework to learn SR from an arbitrary set of unpaired LR and HR images and see how far such a realistic and "unsupervised" setting can go. To do so, we first train a degradation generation network to generate realistic LR images and, more importantly, to capture their distribution (i.e., learning to zoom out). Instead of assuming the domain gap has been eliminated, we minimize the discrepancy between the generated data and real data while learning a degradation-adaptive SR network (i.e., learning to zoom in). The proposed unpaired method achieves state-of-the-art SR results on real-world images, even on datasets that favour paired-learning methods.
32
A Comprehensive Benchmark Analysis of Single Image Deraining: Current Challenges and Future Perspectives. Int J Comput Vis 2021. [DOI: 10.1007/s11263-020-01416-w] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
33
Mapping Large-Scale Mangroves along the Maritime Silk Road from 1990 to 2015 Using a Novel Deep Learning Model and Landsat Data. REMOTE SENSING 2021. [DOI: 10.3390/rs13020245] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Mangroves are important ecosystems, and their distribution and dynamics can provide an understanding of the processes of ecological change. Mangrove protection is also an important element of the Maritime Silk Road (MSR) Cooperation Project. Large amounts of accessible satellite remote sensing data can provide timely and accurate information on the dynamics of mangroves, offering significant advantages in space, time, and characterization. In view of the capability of deep learning to process massive data, we developed a new deep learning model, Capsules-Unet, which introduces the capsule concept into U-net to extract mangroves with high accuracy by learning the spatial relationships between objects in images. This model significantly reduces the number of network parameters and improves the efficiency of data processing. This study uses Landsat data combined with Capsules-Unet to map the dynamics of mangrove change over 25 years (1990-2015) along the MSR. The results show a loss of mangrove area of 1,356,686 ha (about 21.5%) between 1990 and 2015, with anthropic activities such as agriculture, aquaculture, tourism, urban development, and over-development appearing to be the likely drivers of this decline. This information contributes to the understanding of ecological conditions, variability characteristics, and influencing factors along the MSR.
34
Noise-Aware and Light-Weight VLSI Design of Bilateral Filter for Robust and Fast Image Denoising in Mobile Systems. SENSORS 2020; 20:s20174722. [PMID: 32825616 PMCID: PMC7506639 DOI: 10.3390/s20174722] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/20/2020] [Revised: 08/19/2020] [Accepted: 08/19/2020] [Indexed: 11/17/2022]
Abstract
The range kernel of the bilateral filter unintentionally degrades image quality in real environments because pixel intensity varies randomly due to noise generated in image sensors. Furthermore, the range kernel increases complexity due to the comparisons with neighboring pixels and the multiplications with the corresponding weights. In this paper, we propose a noise-aware range kernel, which estimates noise using an intensity-difference-based image noise model and dynamically adjusts weights according to the estimated noise, in order to alleviate the quality degradation of bilateral filters caused by noise. In addition, to significantly reduce the complexity, an approximation scheme is introduced, which converts the proposed noise-aware range kernel into a binary kernel using a statistical hypothesis test. Finally, a fully parallelized and pipelined very-large-scale integration (VLSI) architecture of a noise-aware bilateral filter (NABF) based on the proposed binary range kernel is presented, which was successfully implemented in a field-programmable gate array (FPGA). The experimental results show that the proposed NABF is more robust to noise than the conventional bilateral filter under various noise conditions. Furthermore, the proposed VLSI design of the NABF achieves 10.5 and 95.7 times higher throughput and uses 63.6-97.5% less internal memory than state-of-the-art bilateral filter designs.
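For reference, the bilateral filter whose range kernel this work modifies combines a spatial Gaussian with an intensity (range) Gaussian; a noise-aware variant widens the range kernel as estimated sensor noise grows, so noisy intensity differences are not mistaken for edges. A plain-Python sketch of the baseline filter, with illustrative parameters rather than the paper's hardware design:

```python
import numpy as np

def bilateral_filter(img, radius=2, sigma_s=1.5, sigma_r=0.1):
    """Bilateral filter: each output pixel is a weighted mean of its
    neighbourhood, weighted by spatial distance (sigma_s) and by
    intensity difference (sigma_r, the range kernel)."""
    H, W = img.shape
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(ys ** 2 + xs ** 2) / (2 * sigma_s ** 2))
    pad = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img)
    for i in range(H):
        for j in range(W):
            win = pad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # range kernel: down-weight pixels whose intensity differs
            rng_w = np.exp(-((win - img[i, j]) ** 2) / (2 * sigma_r ** 2))
            w = spatial * rng_w
            out[i, j] = (w * win).sum() / w.sum()
    return out

gen = np.random.default_rng(7)
step = np.hstack([np.zeros((16, 8)), np.ones((16, 8))])  # ideal edge
noisy = step + gen.normal(scale=0.05, size=step.shape)
den = bilateral_filter(noisy, sigma_r=0.2)
print(np.abs(den - step).mean() < np.abs(noisy - step).mean())
```

Setting sigma_r well above the noise level (here 0.2 versus noise 0.05, but far below the edge height of 1) is exactly the trade-off the proposed noise-aware kernel automates per pixel.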
35
Gu X, Guo Y, Deligianni F, Yang GZ. Coupled Real-Synthetic Domain Adaptation for Real-World Deep Depth Enhancement. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020:1-1. [PMID: 32340948 DOI: 10.1109/tip.2020.2988574] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Advances in depth sensing technologies have allowed simultaneous acquisition of both color and depth data under different environments. However, most depth sensors have lower resolution than that of the associated color channels and such a mismatch can affect applications that require accurate depth recovery. Existing depth enhancement methods use simplistic noise models and cannot generalize well under real-world conditions. In this paper, a coupled real-synthetic domain adaptation method is proposed, which enables domain transfer between high-quality depth simulators and real depth camera information for super-resolution depth recovery. The method first enables the realistic degradation from synthetic images, and then enhances degraded depth data to high quality with a color-guided sub-network. The key advantage of the work is that it generalizes well to real-world datasets without further training or fine-tuning. Detailed quantitative and qualitative results are presented, and it is demonstrated that the proposed method achieves improved performance compared to previous methods fine-tuned on the specific datasets.