1
Malyugina A, Anantrasirichai N, Bull D. Wavelet-Based Topological Loss for Low-Light Image Denoising. SENSORS (BASEL, SWITZERLAND) 2025; 25:2047. [PMID: 40218560 PMCID: PMC11990961 DOI: 10.3390/s25072047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2025] [Revised: 03/11/2025] [Accepted: 03/21/2025] [Indexed: 04/14/2025]
Abstract
Despite significant advances in image denoising, most algorithms rely on supervised learning, with their performance largely dependent on the quality and diversity of training data. It is widely assumed that digital image distortions are caused by spatially invariant Additive White Gaussian Noise (AWGN). However, the analysis of real-world data suggests that this assumption is invalid. Therefore, this paper tackles image corruption by real noise, providing a framework to capture and utilise the underlying structural information of an image along with the spatial information conventionally used for deep learning tasks. We propose a novel denoising loss function that incorporates topological invariants and is informed by textural information extracted from the image wavelet domain. The effectiveness of this proposed method was evaluated by training state-of-the-art denoising models on the BVI-Lowlight dataset, which features a wide range of real noise distortions. Adding a topological term to common loss functions leads to a significant increase in the LPIPS (Learned Perceptual Image Patch Similarity) metric, with the improvement reaching up to 25%. The results indicate that the proposed loss function enables neural networks to learn noise characteristics better. We demonstrate that they can consequently extract the topological features of noise-free images, resulting in enhanced contrast and preserved textural information.
Affiliation(s)
- Alexandra Malyugina
- Visual Information Laboratory, University of Bristol, Bristol BS1 5DD, UK; (N.A.); (D.B.)
2
Anwar S, Barnes N, Petersson L. Attention-Based Real Image Restoration. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:3954-3964. [PMID: 34898442 DOI: 10.1109/tnnls.2021.3131739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Deep convolutional neural networks perform better on images containing spatially invariant degradations, also known as synthetic degradations; however, their performance is limited on real-degraded photographs and requires multiple-stage network modeling. To advance the practicability of restoration algorithms, this article proposes a novel single-stage blind real image restoration network (R2Net) by employing a modular architecture. We use a residual on the residual structure to ease low-frequency information flow and apply feature attention to exploit the channel dependencies. Furthermore, the evaluation in terms of quantitative metrics and visual quality for four restoration tasks, i.e., denoising, super-resolution, raindrop removal, and JPEG compression on 11 real degraded datasets against more than 30 state-of-the-art algorithms, demonstrates the superiority of our R2Net. We also present the comparison on three synthetically generated degraded datasets for denoising to showcase our method's capability on synthetic denoising. The codes, trained models, and results are available on https://github.com/saeed-anwar/R2Net.
3
Gao Y, Cai Z, Xie X, Deng J, Dou Z, Ma X. Sparse representation for restoring images by exploiting topological structure of graph of patches. IET IMAGE PROCESSING 2025; 19. [DOI: 10.1049/ipr2.70004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2024] [Accepted: 01/16/2025] [Indexed: 03/02/2025]
Abstract
Image restoration poses a significant challenge, aiming to accurately recover damaged images by delving into their inherent characteristics. Various models and algorithms have been explored by researchers to address different types of image distortions, including sparse representation, grouped sparse representation, and low‐rank self‐representation. The grouped sparse representation algorithm leverages the prior knowledge of non‐local self‐similarity and imposes sparsity constraints to maintain texture information within images. To further exploit the intrinsic properties of images, this study proposes a novel low‐rank representation‐guided grouped sparse representation image restoration algorithm. This algorithm integrates self‐representation models and trace optimization techniques to effectively preserve the original image structure, thereby enhancing image restoration performance while retaining the original texture and structural information. The proposed method was evaluated on image denoising and deblocking tasks across several datasets, demonstrating promising results.
Affiliation(s)
- Yaxian Gao
- School of Information Engineering Shaanxi Xueqian Normal University Xi'an Shaanxi China
- Zhaoyuan Cai
- School of Computer Science and Technology Xidian University Xi'an Shaanxi China
- Xianghua Xie
- Department of Computer Science Swansea University Swansea UK
- Jingjing Deng
- Department of Computer Science Durham University Durham UK
- Zengfa Dou
- School of Information Engineering Shaanxi Xueqian Normal University Xi'an Shaanxi China
- Xiaoke Ma
- School of Computer Science and Technology Xidian University Xi'an Shaanxi China
4
Chen Z, He X, Zhang T, Xiong S, Ren C. Dual-stage feedback network for lightweight color image compression artifact reduction. Neural Netw 2024; 179:106555. [PMID: 39068676 DOI: 10.1016/j.neunet.2024.106555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 05/29/2024] [Accepted: 07/18/2024] [Indexed: 07/30/2024]
Abstract
Lossy image coding techniques usually result in various undesirable compression artifacts. Recently, deep convolutional neural networks have seen encouraging advances in compression artifact reduction. However, most of them focus on the restoration of the luma channel without considering the chroma components. Besides, most deep convolutional neural networks are hard to deploy in practical applications because of their high model complexity. In this article, we propose a dual-stage feedback network (DSFN) for lightweight color image compression artifact reduction. Specifically, we propose a novel curriculum learning strategy to drive a DSFN to reduce color image compression artifacts in a luma-to-RGB manner. In the first stage, the DSFN is dedicated to reconstructing the luma channel, whose high-level features containing rich structural information are then rerouted to the second stage by a feedback connection to guide the RGB image restoration. Furthermore, we present a novel enhanced feedback block for efficient high-level feature extraction, in which an adaptive iterative self-refinement module is carefully designed to refine the low-level features progressively, and an enhanced separable convolution is advanced to exploit multiscale image information fully. Extensive experiments show the notable advantage of our DSFN over several state-of-the-art methods in both quantitative indices and visual effects with lower model complexity.
Affiliation(s)
- Zhengxin Chen
- College of Electronic and Information Engineering, Sichuan University, Chengdu, 610065, China
- Xiaohai He
- College of Electronic and Information Engineering, Sichuan University, Chengdu, 610065, China
- Tingrong Zhang
- College of Electronic and Information Engineering, Sichuan University, Chengdu, 610065, China
- Shuhua Xiong
- College of Electronic and Information Engineering, Sichuan University, Chengdu, 610065, China
- Chao Ren
- College of Electronic and Information Engineering, Sichuan University, Chengdu, 610065, China
5
Ma L, Zhao Y, Peng P, Tian Y. Sensitivity Decouple Learning for Image Compression Artifacts Reduction. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2024; 33:3620-3633. [PMID: 38787669 DOI: 10.1109/tip.2024.3403034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2024]
Abstract
With the benefit of deep learning techniques, recent research has made significant progress in image compression artifacts reduction. Despite their improved performance, prevailing methods only focus on learning a mapping from the compressed image to the original one but ignore the intrinsic attributes of the given compressed images, which greatly harms the performance of downstream parsing tasks. Different from these methods, we propose to decouple the intrinsic attributes into two complementary features for artifacts reduction, i.e., the compression-insensitive features to regularize the high-level semantic representations during training and the compression-sensitive features to be aware of the compression degree. To achieve this, we first employ adversarial training to regularize the compressed and original encoded features for retaining high-level semantics, and we then develop the compression quality-aware feature encoder for compression-sensitive features. Based on these dual complementary features, we propose a Dual Awareness Guidance Network (DAGN) to utilize these awareness features as transformation guidance during the decoding phase. In our proposed DAGN, we develop a cross-feature fusion module to maintain the consistency of compression-insensitive features by fusing compression-insensitive features into the artifacts reduction baseline. Our method achieves an average PSNR gain of 2.06 dB on BSD500, outperforming state-of-the-art methods, and only requires 29.7 ms to process one image on BSD500. Besides, the experimental results on LIVE1 and LIU4K also demonstrate the efficiency, effectiveness, and superiority of the proposed method in terms of quantitative metrics, visual quality, and downstream machine vision tasks.
6
Yu A, Shan L, Zhu W, Jie J, Hou B. A novel improved total variation algorithm for the elimination of scratch-type defects in high-voltage cable cross-sections. PLoS One 2024; 19:e0300260. [PMID: 38626015 PMCID: PMC11020849 DOI: 10.1371/journal.pone.0300260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 02/23/2024] [Indexed: 04/18/2024] Open
Abstract
In the quality inspection process of high-voltage cables, several commonly used indicators include cable length, insulation thickness, and the number of conductors within the core. Among these factors, the count of conductors holds particular significance as a key determinant of cable quality. Machine vision technology has found extensive application in automatically detecting the number of conductors in cross-sectional images of high-voltage cables. However, the presence of scratch-type defects in cut high-voltage cable cross-sections can significantly compromise the precision of conductor count detection. To address this problem, this paper introduces a novel improved total variation (TV) algorithm, marking the first-ever application of the TV algorithm in this domain. Considering the staircase effect, the direct use of the TV algorithm is prone to cause serious loss of image edge information. The proposed algorithm firstly introduces multimodal features to effectively mitigate the staircase effect. While eliminating scratch-type defects, the algorithm endeavors to preserve the original image's edge information, consequently yielding a noteworthy enhancement in detection accuracy. Furthermore, a dataset was curated, comprising images of cross-sections of high-voltage cables of varying sizes, each displaying an assortment of scratch-type defects. Experimental findings conclusively demonstrate the algorithm's exceptional efficiency in eradicating diverse scratch-type defects within high-voltage cable cross-sections. The average scratch elimination rate surpasses 90%, with an impressive 96.15% achieved on cable sample 4. A series of conducted ablation experiments in this paper substantiate a significant enhancement in cable image quality. Notably, the Edge Preservation Index (EPI) exhibits an improvement of approximately 20%, resulting in a substantial boost to conductor count detection accuracy, thus effectively enhancing the quality of high-voltage cable production.
Affiliation(s)
- Aihua Yu
- School of Automation and Electrical Engineering, Zhejiang University of Science and Technology, Hangzhou, Zhejiang Province, China
- Lina Shan
- School of Automation and Electrical Engineering, Zhejiang University of Science and Technology, Hangzhou, Zhejiang Province, China
- Wen Zhu
- School of Automation and Electrical Engineering, Zhejiang University of Science and Technology, Hangzhou, Zhejiang Province, China
- Jing Jie
- School of Automation and Electrical Engineering, Zhejiang University of Science and Technology, Hangzhou, Zhejiang Province, China
- Beiping Hou
- School of Automation and Electrical Engineering, Zhejiang University of Science and Technology, Hangzhou, Zhejiang Province, China
7
Cai Z, Xie X, Deng J, Dou Z, Tong B, Ma X. Image restoration with group sparse representation and low‐rank group residual learning. IET IMAGE PROCESSING 2024; 18:741-760. [DOI: 10.1049/ipr2.12982] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 11/01/2023] [Indexed: 01/11/2025]
Abstract
Image restoration, as a fundamental research topic of image processing, aims to reconstruct the original image from a degraded signal using prior knowledge of the image. Group sparse representation (GSR) is powerful for image restoration; however, it often leads to undesirable sparse solutions in practice. In order to improve the quality of image restoration based on GSR, the sparsity residual model expects the representation learned from degraded images to be as close as possible to the true representation. In this article, a group residual learning based on low‐rank self‐representation is proposed to automatically estimate the true group sparse representation. It makes full use of the relation among patches and explores the subgroup structures within the same group, which gives the sparse residual model better interpretability and, furthermore, results in high‐quality restored images. Extensive experimental results on two typical image restoration tasks (image denoising and deblocking) demonstrate that the proposed algorithm outperforms many other popular or state‐of‐the‐art image restoration methods.
Affiliation(s)
- Zhaoyuan Cai
- School of Computer Science and Technology Xidian University Xi'an Shaanxi China
- Xianghua Xie
- Department of Computer Science Swansea University Swansea UK
- Jingjing Deng
- Department of Computer Science Durham University Durham UK
- Zengfa Dou
- 20th Research Institute China Electronic Science and Technology Group Co., Ltd Xi'an Shaanxi China
- Bo Tong
- Xi'an Thermal Power Research Institute Co., Ltd Xi'an China
- Xiaoke Ma
- School of Computer Science and Technology Xidian University Xi'an Shaanxi China
8
Chen K, Chen J, Zeng H, Shen X. Fast-MFQE: A Fast Approach for Multi-Frame Quality Enhancement on Compressed Video. SENSORS (BASEL, SWITZERLAND) 2023; 23:7227. [PMID: 37631763 PMCID: PMC10457967 DOI: 10.3390/s23167227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 08/11/2023] [Accepted: 08/12/2023] [Indexed: 08/27/2023]
Abstract
For compressed images and videos, quality enhancement is essential. Though there have been remarkable achievements related to deep learning, deep learning models are too large to apply to real-time tasks. Therefore, a fast multi-frame quality enhancement method for compressed video, named Fast-MFQE, is proposed to meet the requirement of video-quality enhancement for real-time applications. There are three main modules in this method. One is the image pre-processing building module (IPPB), which is used to reduce redundant information of input images. The second one is the spatio-temporal fusion attention (STFA) module. It is introduced to effectively merge temporal and spatial information of input video frames. The third one is the feature reconstruction network (FRN), which is developed to effectively reconstruct and enhance the spatio-temporal information. Experimental results demonstrate that the proposed method outperforms state-of-the-art methods in terms of lightweight parameters, inference speed, and quality enhancement performance. Even at a resolution of 1080p, the Fast-MFQE achieves a remarkable inference speed of over 25 frames per second, while providing a PSNR increase of 19.6% on average when QP = 37.
Affiliation(s)
- Kemi Chen
- College of Information Science and Engineering, Huaqiao University, Xiamen 361021, China; (K.C.)
- Jing Chen
- College of Information Science and Engineering, Huaqiao University, Xiamen 361021, China; (K.C.)
- Huanqiang Zeng
- College of Information Science and Engineering, Huaqiao University, Xiamen 361021, China; (K.C.)
- College of Engineering, Huaqiao University, Quanzhou 362021, China
- Xueyuan Shen
- College of Information Science and Engineering, Huaqiao University, Xiamen 361021, China; (K.C.)
9
Ali AM, Benjdira B, Koubaa A, El-Shafai W, Khan Z, Boulila W. Vision Transformers in Image Restoration: A Survey. SENSORS (BASEL, SWITZERLAND) 2023; 23:2385. [PMID: 36904589 PMCID: PMC10006889 DOI: 10.3390/s23052385] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/18/2023] [Revised: 02/14/2023] [Accepted: 02/17/2023] [Indexed: 06/18/2023]
Abstract
The Vision Transformer (ViT) architecture has been remarkably successful in image restoration. For a while, Convolutional Neural Networks (CNN) predominated in most computer vision tasks. Now, both CNN and ViT are efficient approaches that demonstrate powerful capabilities to restore a better version of an image given in a low-quality format. In this study, the efficiency of ViT in image restoration is studied extensively. The ViT architectures are classified for every task of image restoration. Seven image restoration tasks are considered: Image Super-Resolution, Image Denoising, General Image Enhancement, JPEG Compression Artifact Reduction, Image Deblurring, Removing Adverse Weather Conditions, and Image Dehazing. The outcomes, the advantages, the limitations, and the possible areas for future research are detailed. Overall, it is noted that incorporating ViT in the new architectures for image restoration is becoming a rule. This is due to some advantages compared to CNN, such as better efficiency, especially when more data are fed to the network, robustness in feature extraction, and a better feature learning approach that sees better the variances and characteristics of the input. Nevertheless, some drawbacks exist, such as the need for more data to show the benefits of ViT over CNN, the increased computational cost due to the complexity of the self-attention block, a more challenging training process, and the lack of interpretability. These drawbacks represent the future research direction that should be targeted to increase the efficiency of ViT in the image restoration domain.
Affiliation(s)
- Anas M. Ali
- Robotics and Internet-of-Things Laboratory, Prince Sultan University, Riyadh 12435, Saudi Arabia
- Department of Electronics and Electrical Communications Engineering, Faculty of Electronic Engineering, Menoufia University, Menouf 32952, Egypt
- Bilel Benjdira
- Robotics and Internet-of-Things Laboratory, Prince Sultan University, Riyadh 12435, Saudi Arabia
- SE & ICT Laboratory, LR18ES44, ENICarthage, University of Carthage, Tunis 1054, Tunisia
- Anis Koubaa
- Robotics and Internet-of-Things Laboratory, Prince Sultan University, Riyadh 12435, Saudi Arabia
- Walid El-Shafai
- Department of Electronics and Electrical Communications Engineering, Faculty of Electronic Engineering, Menoufia University, Menouf 32952, Egypt
- Security Engineering Laboratory, Computer Science Department, Prince Sultan University, Riyadh 11586, Saudi Arabia
- Zahid Khan
- Robotics and Internet-of-Things Laboratory, Prince Sultan University, Riyadh 12435, Saudi Arabia
- Wadii Boulila
- Robotics and Internet-of-Things Laboratory, Prince Sultan University, Riyadh 12435, Saudi Arabia
- RIADI Laboratory, University of Manouba, Manouba 2010, Tunisia
10
Zhang X, Wu X. Multi-Modality Deep Restoration of Extremely Compressed Face Videos. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:2024-2037. [PMID: 35259095 DOI: 10.1109/tpami.2022.3157388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Arguably the most common and salient object in daily video communications is the talking head, as encountered in social media, virtual classrooms, teleconferences, news broadcasting, talk shows, etc. When communication bandwidth is limited by network congestion or cost effectiveness, compression artifacts in talking head videos are inevitable. The resulting video quality degradation is highly visible and objectionable due to the high acuity of the human visual system to faces. To solve this problem, we develop a multi-modality deep convolutional neural network method for restoring face videos that are aggressively compressed. The main innovation is a new DCNN architecture that incorporates known priors of multiple modalities: the video-synchronized speech signal and semantic elements of the compression code stream, including motion vectors, code partition map and quantization parameters. These priors strongly correlate with the latent video and hence they are able to enhance the capability of deep learning to remove compression artifacts. Ample empirical evidence is presented to validate the superior performance of the proposed DCNN method on face videos over the existing state-of-the-art methods.
11
Liu Y, Anwar S, Qin Z, Ji P, Caldwell S, Gedeon T. Disentangling Noise from Images: A Flow-Based Image Denoising Neural Network. SENSORS (BASEL, SWITZERLAND) 2022; 22:9844. [PMID: 36560213 PMCID: PMC9787817 DOI: 10.3390/s22249844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Revised: 11/14/2022] [Accepted: 12/08/2022] [Indexed: 06/17/2023]
Abstract
The prevalent convolutional neural network (CNN)-based image denoising methods extract features of images to restore the clean ground truth, achieving high denoising accuracy. However, these methods may ignore the underlying distribution of clean images, inducing distortions or artifacts in denoising results. This paper proposes a new perspective to treat image denoising as a distribution learning and disentangling task. Since the noisy image distribution can be viewed as a joint distribution of clean images and noise, the denoised images can be obtained via manipulating the latent representations to the clean counterpart. This paper also provides a distribution-learning-based denoising framework. Following this framework, we present an invertible denoising network, FDN, without any assumptions on either clean or noise distributions, as well as a distribution disentanglement method. FDN learns the distribution of noisy images, which is different from the previous CNN-based discriminative mapping. Experimental results demonstrate FDN's capacity to remove synthetic additive white Gaussian noise (AWGN) on both category-specific and remote sensing images. Furthermore, the performance of FDN surpasses that of previously published methods in real image denoising with fewer parameters and faster speed.
Affiliation(s)
- Yang Liu
- The Research School of Computer Science, The Australian National University, Canberra, ACT 2600, Australia
- Imaging and Computer Vision, Data61, CSIRO, Canberra, ACT 2600, Australia
- Saeed Anwar
- The Research School of Computer Science, The Australian National University, Canberra, ACT 2600, Australia
- Imaging and Computer Vision, Data61, CSIRO, Canberra, ACT 2600, Australia
- The School of Computer Science, The University of Technology Sydney, 15 Broadway Ultimo, Sydney, NSW 2007, Australia
- Zhenyue Qin
- The Research School of Computer Science, The Australian National University, Canberra, ACT 2600, Australia
- Pan Ji
- The OPPO US Research, San Francisco, CA 94303, USA
- Sabrina Caldwell
- The Research School of Computer Science, The Australian National University, Canberra, ACT 2600, Australia
- Tom Gedeon
- The Research School of Computer Science, The Australian National University, Canberra, ACT 2600, Australia
12
Zha Z, Yuan X, Wen B, Zhang J, Zhu C. Nonconvex Structural Sparsity Residual Constraint for Image Restoration. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:12440-12453. [PMID: 34161250 DOI: 10.1109/tcyb.2021.3084931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
This article proposes a novel nonconvex structural sparsity residual constraint (NSSRC) model for image restoration, which integrates structural sparse representation (SSR) with nonconvex sparsity residual constraint (NC-SRC). Although SSR itself is powerful for image restoration by combining the local sparsity and nonlocal self-similarity in natural images, in this work, we explicitly incorporate the novel NC-SRC prior into SSR. Our proposed approach provides more effective sparse modeling for natural images by applying a more flexible sparse representation scheme, leading to high-quality restored images. Moreover, an alternating minimizing framework is developed to solve the proposed NSSRC-based image restoration problems. Extensive experimental results on image denoising and image deblocking validate that the proposed NSSRC achieves better results than many popular or state-of-the-art methods over several publicly available datasets.
13
Fu X, Wang M, Cao X, Ding X, Zha ZJ. A Model-Driven Deep Unfolding Method for JPEG Artifacts Removal. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:6802-6816. [PMID: 34081590 DOI: 10.1109/tnnls.2021.3083504] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 06/12/2023]
Abstract
Deep learning-based methods have achieved notable progress in removing blocking artifacts caused by lossy JPEG compression on images. However, most deep learning-based methods handle this task by designing black-box network architectures to directly learn the relationships between the compressed images and their clean versions. These network architectures always lack sufficient interpretability, which limits their further improvements in deblocking performance. To address this issue, in this article, we propose a model-driven deep unfolding method for JPEG artifacts removal, with interpretable network structures. First, we build a maximum a posteriori (MAP) model for deblocking using convolutional dictionary learning and design an iterative optimization algorithm using proximal operators. Second, we unfold this iterative algorithm into a learnable deep network structure, where each module corresponds to a specific operation of the iterative algorithm. In this way, our network inherits the benefits of both the powerful modeling ability of data-driven deep learning methods and the interpretability of traditional model-driven methods. By training the proposed network in an end-to-end manner, all learnable modules can be automatically explored to well characterize the representations of both JPEG artifacts and image content. Experiments on synthetic and real-world datasets show that our method is able to generate competitive or even better deblocking results, compared with state-of-the-art methods both quantitatively and qualitatively.
14
Zha Z, Wen B, Yuan X, Zhou J, Zhu C, Kot AC. A Hybrid Structural Sparsification Error Model for Image Restoration. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:4451-4465. [PMID: 33625989 DOI: 10.1109/tnnls.2021.3057439] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/12/2023]
Abstract
Recent works on structural sparse representation (SSR), which exploit image nonlocal self-similarity (NSS) prior by grouping similar patches for processing, have demonstrated promising performance in various image restoration applications. However, conventional SSR-based image restoration methods directly fit the dictionaries or transforms to the internal (corrupted) image data. The trained internal models inevitably suffer from overfitting to data corruption, thus generating the degraded restoration results. In this article, we propose a novel hybrid structural sparsification error (HSSE) model for image restoration, which jointly exploits image NSS prior using both the internal and external image data that provide complementary information. Furthermore, we propose a general image restoration scheme based on the HSSE model, and an alternating minimization algorithm for a range of image restoration applications, including image inpainting, image compressive sensing and image deblocking. Extensive experiments are conducted to demonstrate that the proposed HSSE-based scheme outperforms many popular or state-of-the-art image restoration methods in terms of both objective metrics and visual perception.
15
Compression Loss-Based Spatial-Temporal Attention Module for Compressed Video Quality Enhancement. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.05.111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
16
Li Z, Wang F, Cui L, Liu J. Dual Mixture Model Based CNN for Image Denoising. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:3618-3629. [PMID: 35576410 DOI: 10.1109/tip.2022.3173814] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/15/2023]
Abstract
Non-Gaussian residual error and noise are common in real applications, and they can be efficiently addressed by some non-quadratic fidelity terms in the classic variational method. However, they have not been well integrated into the architecture design of convolutional neural network (CNN) based image denoising methods. In this paper, we propose a deep learning approach to handle non-Gaussian residual error. Our method is developed on a universal approximation property for the probability density functions of the non-Gaussian error/noise. By considering the duality of the maximum likelihood estimation for the non-Gaussian error, an adaptive weighting strategy can be derived for image fidelity. To get a good image prior, a learnable regularizer is adopted. Solving such a problem iteratively can be unrolled as a weighted residual CNN architecture. The main advantage of our method is that the weighted residual block can well handle the non-Gaussian residual, especially for noise with a spatially non-uniform distribution. Numerical results show that it has better performance on non-Gaussian noise (e.g. Gaussian mixture, random-valued impulse noise) removal than the related existing methods.
Collapse
|
17
|
Tang B, He X, Wu X, Chen H, Xiong S. Sequential Enhancement for Compressed Video Using Deep Convolutional Generative Adversarial Network. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10865-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
|
18
|
Sheng X, Li L, Liu D, Xiong Z. Attribute Artifacts Removal for Geometry-Based Point Cloud Compression. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:3399-3413. [PMID: 35503831 DOI: 10.1109/tip.2022.3170722] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/14/2023]
Abstract
Geometry-based point cloud compression (G-PCC) can achieve remarkable compression efficiency for point clouds. However, it still leads to serious attribute compression artifacts, especially under low bitrate scenarios. In this paper, we propose a Multi-Scale Graph Attention Network (MS-GAT) to remove the artifacts of point cloud attributes compressed by G-PCC. We first construct a graph based on point cloud geometry coordinates and then use the Chebyshev graph convolutions to extract features of point cloud attributes. Considering that one point may be correlated with points both near and far away from it, we propose a multi-scale scheme to capture the short- and long-range correlations between the current point and its neighboring and distant points. To address the problem that various points may have different degrees of artifacts caused by adaptive quantization, we introduce the quantization step per point as an extra input to the proposed network. We also incorporate a weighted graph attentional layer into the network to pay special attention to the points with more attribute artifacts. To the best of our knowledge, this is the first attribute artifacts removal method for G-PCC. We validate the effectiveness of our method over various point clouds. Objective comparison results show that our proposed method achieves an average of 9.74% BD-rate reduction compared with Predlift and 10.13% BD-rate reduction compared with RAHT. Subjective comparison results present that visual artifacts such as color shifting, blurring, and quantization noise are reduced.
|
19
|
A nonlocal HEVC in-loop filter using CNN-based compression noise estimation. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03259-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
20
|
An effective nonlocal means image denoising framework based on non-subsampled shearlet transform. Soft comput 2022. [DOI: 10.1007/s00500-022-06845-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
21
|
Abstract
The deep image prior showed that a randomly initialized network with a suitable architecture can be trained to solve inverse imaging problems by simply optimizing its parameters to reconstruct a single degraded image. However, it suffers from two practical limitations. First, it remains unclear how to control the prior beyond the choice of the network architecture. Second, training requires an oracle stopping criterion, as during the optimization the performance degrades after reaching an optimum value. To address these challenges we introduce a frequency-band correspondence measure to characterize the spectral bias of the deep image prior, where low-frequency image signals are learned faster and better than high-frequency counterparts. Based on our observations, we propose techniques to prevent the eventual performance degradation and accelerate convergence. We introduce a Lipschitz-controlled convolution layer and a Gaussian-controlled upsampling layer as plug-in replacements for layers used in the deep architectures. The experiments show that with these changes the performance does not degrade during optimization, relieving us from the need for an oracle stopping criterion. We further outline a stopping criterion to avoid superfluous computation. Finally, we show that our approach obtains favorable results compared to current approaches across various denoising, deblocking, inpainting, super-resolution and detail enhancement tasks. Code is available at https://github.com/shizenglin/Measure-and-Control-Spectral-Bias.
|
22
|
He J, He X, Zhang M, Xiong S, Chen H. Deep dual-domain semi-blind network for compressed image quality enhancement. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107870] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/17/2022]
|
23
|
Chen H, He X, Yang H, Qing L, Teng Q. A Feature-Enriched Deep Convolutional Neural Network for JPEG Image Compression Artifacts Reduction and its Applications. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:430-444. [PMID: 34793307 DOI: 10.1109/tnnls.2021.3124370] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/13/2023]
Abstract
The amount of multimedia data, such as images and videos, has been increasing rapidly with the development of various imaging devices and the Internet, bringing more stress and challenges to information storage and transmission. The redundancy in images can be reduced to decrease data size via lossy compression, such as the most widely used standard Joint Photographic Experts Group (JPEG). However, the decompressed images generally suffer from various artifacts (e.g., blocking, banding, ringing, and blurring) due to the loss of information, especially at high compression ratios. This article presents a feature-enriched deep convolutional neural network for compression artifacts reduction (FeCarNet, for short). Taking the dense network as the backbone, FeCarNet enriches features to gain valuable information via introducing multi-scale dilated convolutions, along with the efficient 1 × 1 convolution for lowering both parameter complexity and computation cost. Meanwhile, to make full use of different levels of features in FeCarNet, a fusion block that consists of attention-based channel recalibration and dimension reduction is developed for local and global feature fusion. Furthermore, short and long residual connections both in the feature and pixel domains are combined to build a multi-level residual structure, thereby benefiting the network training and performance. In addition, aiming at reducing computation complexity further, pixel-shuffle-based image downsampling and upsampling layers are, respectively, arranged at the head and tail of the FeCarNet, which also enlarges the receptive field of the whole network. Experimental results show the superiority of FeCarNet over state-of-the-art compression artifacts reduction approaches in terms of both restoration capacity and model complexity. The applications of FeCarNet on several computer vision tasks, including image deblurring, edge detection, image segmentation, and object detection, demonstrate the effectiveness of FeCarNet further.
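The pixel-shuffle-based downsampling and upsampling mentioned in this abstract can be illustrated with a small NumPy sketch. It is not FeCarNet code: the function names `pixel_unshuffle` and `pixel_shuffle` and the (H, W, C) array layout are assumptions made for illustration. Downsampling by a factor r moves each r × r spatial block into the channel dimension, so no pixel values are discarded and the operation is exactly invertible.

```python
import numpy as np

def pixel_unshuffle(x, r):
    """Space-to-depth: (H, W, C) -> (H/r, W/r, r*r*C), keeping every value."""
    h, w, c = x.shape
    if h % r or w % r:
        raise ValueError("height and width must be divisible by r")
    x = x.reshape(h // r, r, w // r, r, c)   # split rows and columns into blocks
    x = x.transpose(0, 2, 1, 3, 4)           # (H/r, W/r, r, r, C)
    return x.reshape(h // r, w // r, r * r * c)

def pixel_shuffle(x, r):
    """Depth-to-space: exact inverse of pixel_unshuffle."""
    h, w, c = x.shape
    if c % (r * r):
        raise ValueError("channel count must be divisible by r*r")
    c_out = c // (r * r)
    x = x.reshape(h, w, r, r, c_out)         # recover the in-block offsets
    x = x.transpose(0, 2, 1, 3, 4)           # (H, r, W, r, C)
    return x.reshape(h * r, w * r, c_out)

# Illustrative round trip on a toy 4 x 6 RGB array.
image = np.arange(4 * 6 * 3, dtype=np.float64).reshape(4, 6, 3)
low_res = pixel_unshuffle(image, 2)          # shape (2, 3, 12)
restored = pixel_shuffle(low_res, 2)         # shape (4, 6, 3)
```

Because the spatial size shrinks by r in each dimension, convolutions applied between the two operations cover a larger area of the original image per kernel tap, which is consistent with the abstract's remark about the receptive field.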
|
24
|
VidalMata RG, Banerjee S, RichardWebster B, Albright M, Davalos P, McCloskey S, Miller B, Tambo A, Ghosh S, Nagesh S, Yuan Y, Hu Y, Wu J, Yang W, Zhang X, Liu J, Wang Z, Chen HT, Huang TW, Chin WC, Li YC, Lababidi M, Otto C, Scheirer WJ. Bridging the Gap Between Computational Photography and Visual Recognition. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2021; 43:4272-4290. [PMID: 32750769 DOI: 10.1109/tpami.2020.2996538] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/11/2023]
Abstract
What is the current state-of-the-art for image restoration and enhancement applied to degraded images acquired under less than ideal circumstances? Can the application of such algorithms as a pre-processing step improve image interpretability for manual analysis or automatic visual recognition to classify scene content? While there have been important advances in the area of computational photography to restore or enhance the visual quality of an image, the capabilities of such techniques have not always translated in a useful way to visual recognition tasks. Consequently, there is a pressing need for the development of algorithms that are designed for the joint problem of improving visual appearance and recognition, which will be an enabling factor for the deployment of visual recognition tools in many real-world scenarios. To address this, we introduce the UG² dataset as a large-scale benchmark composed of video imagery captured under challenging conditions, and two enhancement tasks designed to test algorithmic impact on visual quality and automatic object recognition. Furthermore, we propose a set of metrics to evaluate the joint improvement of such tasks as well as individual algorithmic advances, including a novel psychophysics-based evaluation regime for human assessment and a realistic set of quantitative measures for object recognition performance. We introduce six new algorithms for image restoration or enhancement, which were created as part of the IARPA sponsored UG² Challenge workshop held at CVPR 2018. Under the proposed evaluation regime, we present an in-depth analysis of these algorithms and a host of deep learning-based and classic baseline approaches. From the observed results, it is evident that we are in the early days of building a bridge between computational photography and visual recognition, leaving many opportunities for innovation in this area.
|
25
|
Deep multi-task learning for image/video distortions identification. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-06576-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
26
|
|
27
|
Reduction of Compression Artifacts Using a Densely Cascading Image Restoration Network. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11177803] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022]
Abstract
Since high quality realistic media are widely used in various computer vision applications, image compression is one of the essential technologies to enable real-time applications. Image compression generally causes undesired compression artifacts, such as blocking artifacts and ringing effects. In this study, we propose a densely cascading image restoration network (DCRN), which consists of an input layer, a densely cascading feature extractor, a channel attention block, and an output layer. The densely cascading feature extractor has three densely cascading (DC) blocks, and each DC block contains two convolutional layers, five dense layers, and a bottleneck layer. To optimize the proposed network architectures, we investigated the trade-off between quality enhancement and network complexity. Experimental results revealed that the proposed DCRN can achieve a better peak signal-to-noise ratio and structural similarity index measure for compressed joint photographic experts group (JPEG) images compared to the previous methods.
|
28
|
Zhang Y, Tian Y, Kong Y, Zhong B, Fu Y. Residual Dense Network for Image Restoration. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2021; 43:2480-2495. [PMID: 31985406 DOI: 10.1109/tpami.2020.2968521] [Citation(s) in RCA: 143] [Impact Index Per Article: 35.8] [Indexed: 06/10/2023]
Abstract
Recently, deep convolutional neural networks (CNNs) have achieved great success for image restoration (IR) and provided hierarchical features at the same time. However, most deep CNN based IR models do not make full use of the hierarchical features from the original low-quality images, thereby resulting in relatively low performance. In this work, we propose a novel and efficient residual dense network (RDN) to address this problem in IR, by making a better tradeoff between efficiency and effectiveness in exploiting the hierarchical features from all the convolutional layers. Specifically, we propose a residual dense block (RDB) to extract abundant local features via densely connected convolutional layers. RDB further allows direct connections from the state of the preceding RDB to all the layers of the current RDB, leading to a contiguous memory mechanism. To adaptively learn more effective features from preceding and current local features and stabilize the training of a wider network, we propose local feature fusion in RDB. After fully obtaining dense local features, we use global feature fusion to jointly and adaptively learn global hierarchical features in a holistic way. We demonstrate the effectiveness of RDN with several representative IR applications: single image super-resolution, Gaussian image denoising, image compression artifact reduction, and image deblurring. Experiments on benchmark and real-world datasets show that our RDN achieves favorable performance against state-of-the-art methods for each IR task quantitatively and visually.
|
29
|
Zha Z, Wen B, Yuan X, Zhou JT, Zhou J, Zhu C. Triply Complementary Priors for Image Restoration. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:5819-5834. [PMID: 34133279 DOI: 10.1109/tip.2021.3086049] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Recent works that utilized deep models have achieved superior results in various image restoration (IR) applications. Such an approach is typically supervised, which requires a corpus of training images with distributions similar to the images to be recovered. On the other hand, the shallow methods, which are usually unsupervised, retain promising performance in many inverse problems, e.g., image deblurring and image compressive sensing (CS), as they can effectively leverage nonlocal self-similarity priors of natural images. However, most such methods are patch-based, leading to restored images with various artifacts due to naive patch aggregation, in addition to slow speed. Using either approach alone usually limits performance and generalizability in IR tasks. In this paper, we propose a joint low-rank and deep (LRD) image model, which contains a pair of triply complementary priors, namely, internal and external, shallow and deep, and non-local and local priors. We then propose a novel hybrid plug-and-play (H-PnP) framework based on the LRD model for IR. Following this, a simple yet effective algorithm is developed to solve the proposed H-PnP based IR problems. Extensive experimental results on several representative IR tasks, including image deblurring, image CS and image deblocking, demonstrate that the proposed H-PnP algorithm achieves favorable performance compared to many popular or state-of-the-art IR methods in terms of both objective and visual perception.
|
30
|
Xu Z, Foi A. Anisotropic Denoising of 3D Point Clouds by Aggregation of Multiple Surface-Adaptive Estimates. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2021; 27:2851-2868. [PMID: 31841412 DOI: 10.1109/tvcg.2019.2959761] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]
Abstract
3D point clouds commonly contain positional errors which can be regarded as noise. We propose a point cloud denoising algorithm based on aggregation of multiple anisotropic estimates computed on local coordinate systems. These local estimates are adaptive to the shape of the surface underlying the point cloud, leveraging an extension of the Local Polynomial Approximation (LPA) - Intersection of Confidence Intervals (ICI) technique to 3D point clouds. The adaptivity due to LPA-ICI is further strengthened by the dense aggregation with data-driven weights. Experimental results demonstrate state-of-the-art restoration quality of both sharp features and smooth areas.
|
31
|
Zha Z, Wen B, Yuan X, Zhou J, Zhu C. Image Restoration via Reconciliation of Group Sparsity and Low-Rank Models. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:5223-5238. [PMID: 34010133 DOI: 10.1109/tip.2021.3078329] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 06/12/2023]
Abstract
Image nonlocal self-similarity (NSS) property has been widely exploited via various sparsity models such as joint sparsity (JS) and group sparse coding (GSC). However, the existing NSS-based sparsity models are either too restrictive, e.g., JS enforces the sparse codes to share the same support, or too general, e.g., GSC imposes only plain sparsity on the group coefficients, which limit their effectiveness for modeling real images. In this paper, we propose a novel NSS-based sparsity model, namely, low-rank regularized group sparse coding (LR-GSC), to bridge the gap between the popular GSC and JS. The proposed LR-GSC model simultaneously exploits the sparsity and low-rankness of the dictionary-domain coefficients for each group of similar patches. An alternating minimization with an adaptive adjusted parameter strategy is developed to solve the proposed optimization problem for different image restoration tasks, including image denoising, image deblocking, image inpainting, and image compressive sensing. Extensive experimental results demonstrate that the proposed LR-GSC algorithm outperforms many popular or state-of-the-art methods in terms of objective and perceptual metrics.
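As a rough illustration of what low-rank regularization of a group of similar patches can look like, the sketch below applies singular value soft-thresholding, the proximal operator of the nuclear norm. This is a generic textbook building block and not the LR-GSC algorithm described in the abstract; the function name `singular_value_threshold`, the parameter `tau`, and the layout of the patch group (one vectorized patch per column) are assumptions for illustration only.

```python
import numpy as np

def singular_value_threshold(group, tau):
    """Soft-threshold the singular values of a patch-group matrix.

    group: 2-D array whose columns are vectorized similar patches (assumed layout).
    tau:   non-negative threshold; larger values give a lower-rank result.
    """
    u, s, vt = np.linalg.svd(group, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)      # shrink each singular value toward zero
    return (u * s_shrunk) @ vt               # rebuild the matrix from shrunk values

# Illustrative use: a rank-1 group corrupted by small noise.
rng = np.random.default_rng(0)
clean = np.outer(rng.standard_normal(16), np.ones(8))
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
estimate = singular_value_threshold(noisy, tau=0.5)
```

Small singular values, which mostly carry noise when the patches in a group are truly similar, are set to zero, while the dominant ones are only reduced by `tau`.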
|
32
|
Guan Z, Xing Q, Xu M, Yang R, Liu T, Wang Z. MFQE 2.0: A New Approach for Multi-Frame Quality Enhancement on Compressed Video. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2021; 43:949-963. [PMID: 31581073 DOI: 10.1109/tpami.2019.2944806] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Indexed: 06/10/2023]
Abstract
The past few years have witnessed great success in applying deep learning to enhance the quality of compressed image/video. The existing approaches mainly focus on enhancing the quality of a single frame, not considering the similarity between consecutive frames. Since heavy fluctuation exists across compressed video frames as investigated in this paper, frame similarity can be utilized for quality enhancement of low-quality frames given their neighboring high-quality frames. This task is Multi-Frame Quality Enhancement (MFQE). Accordingly, this paper proposes an MFQE approach for compressed video, as the first attempt in this direction. In our approach, we first develop a Bidirectional Long Short-Term Memory (BiLSTM) based detector to locate Peak Quality Frames (PQFs) in compressed video. Then, a novel Multi-Frame Convolutional Neural Network (MF-CNN) is designed to enhance the quality of compressed video, in which the non-PQF and its nearest two PQFs are the input. In MF-CNN, motion between the non-PQF and PQFs is compensated by a motion compensation subnet. Subsequently, a quality enhancement subnet fuses the non-PQF and compensated PQFs, and then reduces the compression artifacts of the non-PQF. Also, PQF quality is enhanced in the same way. Finally, experiments validate the effectiveness and generalization ability of our MFQE approach in advancing the state-of-the-art quality enhancement of compressed video.
|
33
|
A recurrent video quality enhancement framework with multi-granularity frame-fusion and frame difference based attention. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.12.019] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/24/2022]
|
34
|
Chen C, Xiong Z, Tian X, Zha ZJ, Wu F. Real-World Image Denoising with Deep Boosting. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2020; 42:3071-3087. [PMID: 31180840 DOI: 10.1109/tpami.2019.2921548] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 06/09/2023]
Abstract
We propose a Deep Boosting Framework (DBF) for real-world image denoising by integrating the deep learning technique into the boosting algorithm. The DBF replaces conventional handcrafted boosting units by elaborate convolutional neural networks, which brings notable advantages in terms of both performance and speed. We design a lightweight Dense Dilated Fusion Network (DDFN) as an embodiment of the boosting unit, which addresses the vanishing of gradients during training due to the cascading of networks while promoting the efficiency of limited parameters. The capabilities of the proposed method are first validated on several representative simulation tasks including non-blind and blind Gaussian denoising and JPEG image deblocking. We then focus on a practical scenario to tackle complex and challenging real-world noise. To facilitate learning-based methods including ours, we build a new Real-world Image Denoising (RID) dataset, which contains 200 pairs of high-resolution images with diverse scene content under various shooting conditions. Moreover, we conduct a comprehensive analysis of the domain shift issue for real-world denoising and propose an effective one-shot domain transfer scheme to address this issue. Comprehensive experiments on widely used benchmarks demonstrate that the proposed method significantly surpasses existing methods on the task of real-world image denoising. Code and dataset are available at https://github.com/ngchc/deepBoosting.
|
35
|
Sun W, He X, Chen H, Sheriff RE, Xiong S. A quality enhancement framework with noise distribution characteristics for high efficiency video coding. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.06.048] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/26/2022]
|
36
|
Shape Adaptive Neighborhood Information-Based Semi-Supervised Learning for Hyperspectral Image Classification. REMOTE SENSING 2020. [DOI: 10.3390/rs12182976] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
Hyperspectral image (HSI) classification is an important research topic in detailed analysis of the Earth’s surface. However, the performance of the classification is often hampered by the high-dimensional features and limited training samples of HSIs, which has fostered research on semi-supervised learning (SSL). In this paper, we propose a shape adaptive neighborhood information (SANI) based SSL (SANI-SSL) method that takes full advantage of the adaptive spatial information to select valuable unlabeled samples in order to improve the classification ability. The improvement of the classification mainly relies on two aspects: (1) the improvement of the feature discriminability, which is accomplished by exploiting spectral-spatial information, and (2) the improvement of the training samples’ representativeness, which is accomplished by exploiting the SANI for both labeled and unlabeled samples. First, the SANI of labeled samples is extracted, and the breaking ties (BT) method is used in order to select valuable unlabeled samples from the labeled samples’ neighborhood. Second, the SANI of unlabeled samples is also used to find more valuable samples, with the classifier combination method being used as a strategy to ensure confidence and the adaptive interval method used as a strategy to ensure informativeness. The experimental comparison results on three benchmark HSI datasets demonstrate the significantly superior performance of our proposed method.
|
37
|
Zha Z, Yuan X, Zhou J, Zhu C, Wen B. Image Restoration via Simultaneous Nonlocal Self-Similarity Priors. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; PP:8561-8576. [PMID: 32822296 DOI: 10.1109/tip.2020.3015545] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 06/11/2023]
Abstract
Through exploiting the image nonlocal self-similarity (NSS) prior by clustering similar patches to construct patch groups, recent studies have revealed that structural sparse representation (SSR) models can achieve promising performance in various image restoration tasks. However, most existing SSR methods only exploit the NSS prior from the input degraded (internal) image, and few methods utilize the NSS prior from external clean image corpus; how to jointly exploit the NSS priors of internal image and external clean image corpus is still an open problem. In this paper, we propose a novel approach for image restoration by simultaneously considering internal and external nonlocal self-similarity (SNSS) priors that offer mutually complementary information. Specifically, we first group nonlocal similar patches from images of a training corpus. Then a group-based Gaussian mixture model (GMM) learning algorithm is applied to learn an external NSS prior. We exploit the SSR model by integrating the NSS priors of both internal and external image data. An alternating minimization with an adaptive parameter adjusting strategy is developed to solve the proposed SNSS-based image restoration problems, which makes the entire algorithm more stable and practical. Experimental results on three image restoration applications, namely image denoising, deblocking and deblurring, demonstrate that the proposed SNSS produces superior results compared to many popular or state-of-the-art methods in both objective and perceptual quality measurements.
|
38
|
Makinen Y, Azzari L, Foi A. Collaborative Filtering of Correlated Noise: Exact Transform-Domain Variance for Improved Shrinkage and Patch Matching. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; PP:8339-8354. [PMID: 32784137 DOI: 10.1109/tip.2020.3014721] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Indexed: 05/11/2023]
Abstract
Collaborative filters perform denoising through transform-domain shrinkage of a group of similar patches extracted from an image. Existing collaborative filters of stationary correlated noise have all used simple approximations of the transform noise power spectrum adopted from methods which do not employ patch grouping and instead operate on a single patch. We note the inaccuracies of these approximations and introduce a method for the exact computation of the noise power spectrum. Unlike earlier methods, the calculated noise variances are exact even when noise in one patch is correlated with noise in any of the other patches. We discuss the adoption of the exact noise power spectrum within shrinkage, in similarity testing (patch matching), and in aggregation. We also introduce effective approximations of the spectrum for faster computation. Extensive experiments support the proposed method over earlier crude approximations used by image denoising filters such as Block-Matching and 3D-filtering (BM3D), demonstrating dramatic improvement in many challenging conditions.
|
39
|
Chen H, He X, An C, Nguyen TQ. Adaptive image coding efficiency enhancement using deep convolutional neural networks. Inf Sci (N Y) 2020. [DOI: 10.1016/j.ins.2020.03.042] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 10/24/2022]
|
40
|
Denoise magnitude diffusion magnetic resonance images via variance-stabilizing transformation and optimal singular-value manipulation. Neuroimage 2020; 215:116852. [PMID: 32305566 PMCID: PMC7292796 DOI: 10.1016/j.neuroimage.2020.116852] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 09/03/2019] [Revised: 04/07/2020] [Accepted: 04/10/2020] [Indexed: 12/12/2022] Open
Abstract
Although shown to have a great utility for a wide range of neuroscientific and clinical applications, diffusion-weighted magnetic resonance imaging (dMRI) faces a major challenge of low signal-to-noise ratio (SNR), especially when pushing the spatial resolution for improved delineation of brain's fine structure or increasing the diffusion weighting for increased angular contrast or both. Here, we introduce a comprehensive denoising framework for denoising magnitude dMRI. The framework synergistically combines the variance stabilizing transform (VST) with optimal singular value manipulation. The purpose of VST is to transform the Rician data to Gaussian-like data so that an asymptotically optimal singular value manipulation strategy tailored for Gaussian data can be used. The output of the framework is the estimated underlying diffusion signal for each voxel in the image domain. The usefulness of the proposed framework for denoising magnitude dMRI is demonstrated using both simulation and real-data experiments. Our results show that the proposed denoising framework can significantly improve SNR across the entire brain, leading to substantially enhanced performances for estimating diffusion tensor related indices and for resolving crossing fibers when compared to another competing method. More encouragingly, the proposed method when used to denoise a single average of 7 Tesla Human Connectome Project-style diffusion acquisition provided comparable performances relative to those achievable with ten averages for resolving multiple fiber populations across the brain. As such, the proposed denoising method is expected to have a great utility for high-quality, high-resolution whole-brain dMRI, desirable for many neuroscientific and clinical applications.
|
41
|
Sun M, He X, Xiong S, Ren C, Li X. Reduction of JPEG compression artifacts based on DCT coefficients prediction. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.12.015] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/25/2022]
|
42
|
Yang X, Xu Y, Quan Y, Ji H. Image Denoising via Sequential Ensemble Learning. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 29:5038-5049. [PMID: 32167898 DOI: 10.1109/tip.2020.2978645] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 06/10/2023]
Abstract
Image denoising is about removing measurement noise from input image for better signal-to-noise ratio. In recent years, there has been great progress on the development of data-driven approaches for image denoising, which introduce various techniques and paradigms from machine learning in the design of image denoisers. This paper aims at investigating the application of ensemble learning in image denoising, which combines a set of simple base denoisers to form a more effective image denoiser. Based on different types of image priors, two types of base denoisers in the form of transform-shrinkage are proposed for constructing the ensemble. Then, with an effective re-sampling scheme, several ensemble-learning-based image denoisers are constructed using different sequential combinations of multiple proposed base denoisers. The experiments showed that sequential ensemble learning can effectively boost the performance of image denoising.
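A transform-shrinkage denoiser of the general kind this abstract refers to can be sketched in a few lines: transform a patch, suppress small coefficients, and transform back. The version below uses an orthonormal 2-D DCT with hard thresholding on a single square patch. It is a minimal generic sketch, not the base denoisers proposed in the paper; the names `dct_matrix` and `hard_threshold_patch` and the choice of DCT are assumptions for illustration.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (rows are basis vectors)."""
    k = np.arange(n)[:, None]                # frequency index
    i = np.arange(n)[None, :]                # sample index
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)                  # rescale the DC row for orthonormality
    return d

def hard_threshold_patch(patch, threshold):
    """Denoise one square patch by zeroing DCT coefficients below threshold."""
    d = dct_matrix(patch.shape[0])
    coeffs = d @ patch @ d.T                 # forward separable 2-D DCT
    kept = np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)
    return d.T @ kept @ d                    # inverse transform

# Illustrative use on a noisy constant 8 x 8 patch.
rng = np.random.default_rng(0)
noisy_patch = 5.0 + 0.1 * rng.standard_normal((8, 8))
denoised_patch = hard_threshold_patch(noisy_patch, threshold=1.0)
```

In practice the threshold is usually tied to an estimate of the noise level, and overlapping patches are averaged to form the output image; both steps are omitted here.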
|
43
|
Mu J, Xiong R, Fan X, Liu D, Wu F, Gao W. Graph-Based Non-Convex Low-Rank Regularization for Image Compression Artifact Reduction. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 29:5374-5385. [PMID: 32149688 DOI: 10.1109/tip.2020.2975931] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/10/2023]
Abstract
Block transform coded images usually suffer from annoying artifacts at low bit-rates, because of the independent quantization of DCT coefficients. Image prior models play an important role in compressed image reconstruction. Natural image patches in a small neighborhood of the high-dimensional image space usually exhibit an underlying sub-manifold structure. To model the distribution of signal, we extract sub-manifold structure as prior knowledge. We utilize graph Laplacian regularization to characterize the sub-manifold structure at patch level. And similar patches are exploited as samples to estimate distribution of a particular patch. Instead of using Euclidean distance as similarity metric, we propose to use graph-domain distance to measure the patch similarity. Then we perform low-rank regularization on the similar-patch group, and incorporate a non-convex lp penalty to surrogate matrix rank. Finally, an alternatively minimizing strategy is employed to solve the non-convex problem. Experimental results show that our proposed method is capable of achieving more accurate reconstruction than the state-of-the-art methods in both objective and perceptual qualities.
|
44
|
Rizvi S, Cao J, Zhang K, Hao Q. Deringing and denoising in extremely under-sampled Fourier single pixel imaging. OPTICS EXPRESS 2020; 28:7360-7374. [PMID: 32225966 DOI: 10.1364/oe.385233] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 12/10/2019] [Accepted: 01/16/2020] [Indexed: 06/10/2023]
Abstract
Undersampling in Fourier single pixel imaging (FSI) is often employed to reduce imaging time for real-time applications. However, the undersampled reconstruction contains ringing artifacts (Gibbs phenomenon) that occur because the high-frequency target information is not recorded. Furthermore, by employing 3-step FSI strategy (reduced measurements with low noise suppression) with a low-grade sensor (i.e., photodiode), this ringing is coupled with noise to produce unwanted artifacts, lowering image quality. To improve the imaging quality of real-time FSI, a fast image reconstruction framework based on deep convolutional autoencoder network (DCAN) is proposed. The network through context learning over FSI artifacts is capable of deringing, denoising, and recovering details in 256 × 256 images. The promising experimental results show that the proposed deep-learning-based FSI outperforms conventional FSI in terms of image quality even at very low sampling rates (1-4%).
|
45
|
Zha Z, Yuan X, Wen B, Zhou J, Zhang J, Zhu C. From Rank Estimation to Rank Approximation: Rank Residual Constraint for Image Restoration. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:3254-3269. [PMID: 31841410 DOI: 10.1109/tip.2019.2958309] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 06/10/2023]
Abstract
In this paper, we propose a novel approach for the rank minimization problem, termed rank residual constraint (RRC). Different from existing low-rank based approaches, such as the well-known nuclear norm minimization (NNM) and the weighted nuclear norm minimization (WNNM), which estimate the underlying low-rank matrix directly from the corrupted observation, we progressively approximate (approach) the underlying low-rank matrix via minimizing the rank residual. Through integrating the image nonlocal self-similarity (NSS) prior with the proposed RRC model, we apply it to image restoration tasks, including image denoising and image compression artifacts reduction. Toward this end, we first obtain a good reference of the original image groups by using the image NSS prior, and then the rank residual of the image groups between this reference and the degraded image is minimized to achieve a better estimate to the desired image. In this manner, both the reference and the estimated image in each iteration are improved gradually and jointly. Based on the group-based sparse representation model, we further provide a theoretical analysis on the feasibility of the proposed RRC model. Experimental results demonstrate that the proposed RRC model outperforms many state-of-the-art schemes in both the objective and perceptual qualities.
|
46
|
Hyperspectral Image Super-Resolution via Adaptive Dictionary Learning and Double l1 Constraint. REMOTE SENSING 2019. [DOI: 10.3390/rs11232809] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
Abstract
Hyperspectral image (HSI) super-resolution (SR) is an important technique for improving the spatial resolution of HSI. Recently, a method based on sparse representation improved the performance of HSI SR significantly. However, the spectral dictionary was learned under a fixed size, empirically, without considering the training data. Moreover, most of the existing methods fail to explore the relationship among the sparse coefficients. To address these crucial issues, an effective method for HSI SR is proposed in this paper. First, a spectral dictionary is learned, which can adaptively estimate a suitable size according to the input HSI without any prior information. Then, the proposed method exploits the nonlocal correlation of the sparse coefficients. Double l1 regularized sparse representation is then introduced to achieve better reconstructions for HSI SR. Finally, a high spatial resolution HSI is generated by the obtained coefficients matrix and the learned adaptive size spectral dictionary. To evaluate the performance of the proposed method, we conduct experiments on two famous datasets. The experimental results demonstrate that it can outperform some relatively state-of-the-art methods in terms of the popular universal quality evaluation indexes.
|
47
|
Zhao W, Liu Q, Lv Y, Qin B. Texture Variation Adaptive Image Denoising With Nonlocal PCA. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:5537-5551. [PMID: 31135359 DOI: 10.1109/tip.2019.2916976] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 06/09/2023]
Abstract
Image textures, as a kind of local variations, provide important information for the human visual system. Many image textures, especially the small-scale or stochastic textures, are rich in high-frequency variations, and are difficult to be preserved. Current state-of-the-art denoising algorithms typically adopt a nonlocal approach consisting of image patch grouping and group-wise denoising filtering. To achieve a better image denoising while preserving the variations in texture, we first adaptively group high correlated image patches with the same kinds of texture elements (texels) via an adaptive clustering method. This adaptive clustering method is applied in an over-clustering-and-iterative-merging approach, where its noise robustness is improved with a custom merging threshold relating to the noise level and cluster size. For texture-preserving denoising of each cluster, considering that the variations in texture are captured and wrapped in not only the between-dimension energy variations but also the within-dimension variations of PCA transform coefficients, we further propose a PCA-transform-domain variation adaptive filtering method to preserve the local variations in textures. Experiments on natural images show the superiority of the proposed transform-domain variation adaptive filtering to traditional PCA-based hard or soft threshold filtering. As a whole, the proposed denoising method achieves a favorable texture-preserving performance both quantitatively and visually, especially for irregular textures, which is further verified in camera raw image denoising.
|
48
|
Lu G, Zhang X, Ouyang W, Xu D, Chen L, Gao Z. Deep Non-local Kalman Network for Video Compression Artifact Reduction. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:1725-1737. [PMID: 31567090 DOI: 10.1109/tip.2019.2943214] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 06/10/2023]
Abstract
Video compression algorithms are widely used to reduce the huge size of video data, but they also introduce unpleasant visual artifacts due to the lossy compression. In order to improve the quality of the compressed videos, we proposed a deep non-local Kalman network for compression artifact reduction. Specifically, the video restoration is modeled as a Kalman filtering procedure and the decoded frames can be restored from the proposed deep Kalman model. Instead of using the noisy previous decoded frames as temporal information, the less noisy previous restored frame is employed in a recursive way, which provides the potential to generate high quality restored frames. In the proposed framework, several deep neural networks are utilized to estimate the corresponding states in the Kalman filter and integrated together in the deep Kalman filtering network. More importantly, we also exploit the non-local prior information by incorporating the spatial and temporal non-local networks for better restoration. Our approach takes the advantages of both the model-based methods and learning-based methods, by combining the recursive nature of the Kalman model and powerful representation ability of neural networks. Extensive experimental results on the Vimeo-90k and HEVC benchmark datasets demonstrate the effectiveness of our proposed method.
|
49
|
Sarno A, Andreozzi E, De Caro D, Di Meo G, Strollo AGM, Cesarelli M, Bifulco P. Real-time algorithm for Poissonian noise reduction in low-dose fluoroscopy: performance evaluation. Biomed Eng Online 2019; 18:94. [PMID: 31511017 PMCID: PMC6737613 DOI: 10.1186/s12938-019-0713-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/06/2019] [Accepted: 08/31/2019] [Indexed: 11/21/2022]
Abstract
BACKGROUND Quantum noise intrinsically limits the quality of fluoroscopic images. The lower is the X-ray dose the higher is the noise. Fluoroscopy video processing can enhance image quality and allows further patient's dose lowering. This study aims to assess the performances achieved by a Noise Variance Conditioned Average (NVCA) spatio-temporal filter for real-time denoising of fluoroscopic sequences. The filter is specifically designed for quantum noise suppression and edge preservation. It is an average filter that excludes neighborhood pixel values exceeding noise statistic limits, by means of a threshold which depends on the local noise standard deviation, to preserve the image spatial resolution. The performances were evaluated in terms of contrast-to-noise-ratio (CNR) increment, image blurring (full width of the half maximum of the line spread function) and computational time. The NVCA filter performances were compared to those achieved by simple moving average filters and the state-of-the-art video denoising block matching-4D (VBM4D) algorithm. The influence of the NVCA filter size and threshold on the final image quality was evaluated too.
RESULTS For NVCA filter mask size of 5 × 5 × 5 pixels (the third dimension represents the temporal extent of the filter) and a threshold level equal to 2 times the local noise standard deviation, the NVCA filter achieved a 10% increase of the CNR with respect to the unfiltered sequence, while the VBM4D achieved a 14% increase. In the case of NVCA, the edge blurring did not depend on the speed of the moving objects; on the other hand, the spatial resolution worsened of about 2.2 times by doubling the objects speed with VBM4D. The NVCA mask size and the local noise-threshold level are critical for final image quality. The computational time of the NVCA filter was found to be just few percentages of that required for the VBM4D filter.
CONCLUSIONS The NVCA filter obtained a better image quality compared to simple moving average filters, and a lower but comparable quality when compared with the VBM4D filter. The NVCA filter showed to preserve edge sharpness, in particular in the case of moving objects (performing even better than VBM4D). The simplicity of the NVCA filter and its low computational burden make this filter suitable for real-time video processing and its hardware implementation is ready to be included in future fluoroscopy devices, offering further lowering of patient's X-ray dose.
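The conditioned-average idea described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function name `nvca_filter` and its parameters `size` and `k` are hypothetical, and the noise model is an assumption: pure Poisson noise, so the local standard deviation is taken as the square root of the centre pixel value. Each output pixel is the mean of those neighbours in a `size`³ spatio-temporal window whose values lie within `k` standard deviations of the centre pixel.

```python
import numpy as np


def nvca_filter(seq, size=5, k=2.0):
    """Conditioned spatio-temporal average over a (T, H, W) sequence.

    Hypothetical sketch: neighbours are averaged only if they differ from
    the centre pixel by at most k * sigma, where sigma = sqrt(centre value)
    (assumed pure-Poisson noise model).
    """
    seq = np.asarray(seq, dtype=np.float64)
    r = size // 2
    # Mirror the borders so every pixel has a full window.
    padded = np.pad(seq, r, mode="reflect")
    sigma = np.sqrt(np.maximum(seq, 0.0))
    total = np.zeros_like(seq)
    count = np.zeros_like(seq)
    T, H, W = seq.shape
    # Visit every offset in the window and accumulate accepted neighbours.
    for dt in range(size):
        for dy in range(size):
            for dx in range(size):
                neigh = padded[dt:dt + T, dy:dy + H, dx:dx + W]
                keep = np.abs(neigh - seq) <= k * sigma
                total += np.where(keep, neigh, 0.0)
                count += keep
    # The centre pixel always passes the test, so count is never zero.
    return total / count
```

With `size=5` and `k=2.0` the sketch mirrors the mask size and threshold level quoted in the RESULTS paragraph. Neighbours on the far side of a strong edge fall outside the threshold and are excluded from the average, which is why this kind of filter smooths flat regions without blurring edges.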
Affiliation(s)
- A Sarno
  - Università di Napoli, "Federico II", dip. di Fisica "E. Pancini" & INFN sez. di Napoli, Via Cintia, 80126, Naples, Italy.
- E Andreozzi
  - Department of Electrical Engineering and Information Technologies, Università di Napoli "Federico II", Via Claudio, 21, 80125, Naples, Italy
  - Istituti Clinici Scientifici Maugeri S.p.A.-Società Benefit, Via S. Maugeri, 4, 27100, Pavia, Italy
- D De Caro
  - Department of Electrical Engineering and Information Technologies, Università di Napoli "Federico II", Via Claudio, 21, 80125, Naples, Italy
- G Di Meo
  - Department of Electrical Engineering and Information Technologies, Università di Napoli "Federico II", Via Claudio, 21, 80125, Naples, Italy
- A G M Strollo
  - Department of Electrical Engineering and Information Technologies, Università di Napoli "Federico II", Via Claudio, 21, 80125, Naples, Italy
- M Cesarelli
  - Department of Electrical Engineering and Information Technologies, Università di Napoli "Federico II", Via Claudio, 21, 80125, Naples, Italy
  - Istituti Clinici Scientifici Maugeri S.p.A.-Società Benefit, Via S. Maugeri, 4, 27100, Pavia, Italy
- P Bifulco
  - Department of Electrical Engineering and Information Technologies, Università di Napoli "Federico II", Via Claudio, 21, 80125, Naples, Italy
  - Istituti Clinici Scientifici Maugeri S.p.A.-Società Benefit, Via S. Maugeri, 4, 27100, Pavia, Italy
|
50
|
Kong Z, Yang X. Color Image and Multispectral Image Denoising Using Block Diagonal Representation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:4247-4259. [PMID: 30908228 DOI: 10.1109/tip.2019.2907478] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 06/09/2023]
Abstract
Filtering images of more than one channel are challenging in terms of both efficiency and effectiveness. By grouping similar patches to utilize the self-similarity and sparse linear approximation of natural images, recent nonlocal and transform-domain methods have been widely used in color and multispectral image (MSI) denoising. Many related methods focus on the modeling of group level correlation to enhance sparsity, which often resorts to a recursive strategy with a large number of similar patches. The importance of the patch level representation is understated. In this paper, we mainly investigate the influence and potential of representation at patch level by considering a general formulation with a block diagonal matrix. We further show that by training a proper global patch basis, along with a local principal component analysis transform in the grouping dimension, a simple transform-threshold-inverse method could produce very competitive results. Fast implementation is also developed to reduce the computational complexity. The extensive experiments on both the simulated and real datasets demonstrate its robustness, effectiveness, and efficiency.
|