1. Wang J, Wei X, Lu H, Chen Y, He D. ConDiff-rPPG: Robust Remote Physiological Measurement to Heterogeneous Occlusions. IEEE J Biomed Health Inform 2024;28:7090-7102. [PMID: 39052463] [DOI: 10.1109/JBHI.2024.3433461]
Abstract
Remote photoplethysmography (rPPG) is a contactless technique for measuring physiological signals and cardiac activity from facial video recordings, and it holds tremendous potential for a wide range of applications. However, existing rPPG methods often fail to account for the heterogeneous occlusions that commonly occur in real-world scenarios, such as a person's transient movements or actions in the video, or dust on the camera lens; leaving these occlusions unaddressed can compromise the accuracy of rPPG algorithms. To address this issue, we propose ConDiff-rPPG, a novel method that improves the robustness of rPPG measurement under various occlusions. First, we compress the damaged face video into a spatio-temporal representation with several types of masks. Second, a diffusion model recovers the missing information, conditioned on the observed values. Moreover, a novel low-rank decomposition regularization is proposed to eliminate background noise and maximize informative features. ConDiff-rPPG ensures consistency of the optimization goals during training. Extensive experiments, including intra- and cross-dataset evaluations as well as ablation studies, demonstrate the robustness and generalization ability of the proposed model.
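The conditioning strategy resembles mask-guided diffusion inpainting: at each reverse step, observed entries are re-imposed from the (appropriately noised) input while occluded entries come from the denoiser. Below is a minimal numpy sketch of one such reverse step; the denoiser stub, noise schedule, and all names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def denoiser_stub(x_t, t):
    """Placeholder for the trained diffusion denoiser (assumed here to
    predict the clean spatio-temporal map from the noisy one)."""
    return 0.9 * x_t  # stand-in; a real model would be learned

def conditional_reverse_step(x_t, observed, mask, t, alpha_bar):
    """One reverse-diffusion step conditioned on observed values.
    mask == 1 marks observed (unoccluded) entries."""
    x0_hat = denoiser_stub(x_t, t)                      # model's clean estimate
    noise = rng.standard_normal(x_t.shape)
    # Noise the observation to the current step so both halves match scales.
    obs_t = np.sqrt(alpha_bar) * observed + np.sqrt(1 - alpha_bar) * noise
    # Keep observed entries, fill occluded ones from the model.
    return mask * obs_t + (1 - mask) * x0_hat

# Toy spatio-temporal representation (rows: facial regions, cols: frames)
observed = rng.standard_normal((8, 64))
mask = (rng.random((8, 64)) > 0.3).astype(float)        # ~30% occluded
x = rng.standard_normal((8, 64))                        # start from noise
for t, ab in zip(range(10, 0, -1), np.linspace(0.05, 0.99, 10)):
    x = conditional_reverse_step(x, observed, mask, t, ab)
```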
2. Ma L, Zhao Y, Peng P, Tian Y. Sensitivity Decouple Learning for Image Compression Artifacts Reduction. IEEE Trans Image Process 2024;33:3620-3633. [PMID: 38787669] [DOI: 10.1109/TIP.2024.3403034]
Abstract
With the benefit of deep learning techniques, recent research has made significant progress in image compression artifacts reduction. Despite their improved performance, prevailing methods focus only on learning a mapping from the compressed image to the original one, ignoring the intrinsic attributes of the given compressed images, which greatly harms the performance of downstream parsing tasks. Unlike these methods, we propose to decouple the intrinsic attributes into two complementary features for artifacts reduction: compression-insensitive features, which regularize the high-level semantic representations during training, and compression-sensitive features, which are aware of the compression degree. To achieve this, we first employ adversarial training to regularize the compressed and original encoded features so that high-level semantics are retained, and we then develop a compression quality-aware feature encoder for the compression-sensitive features. Based on these dual complementary features, we propose a Dual Awareness Guidance Network (DAGN) that uses them as transformation guidance during the decoding phase. Within DAGN, a cross-feature fusion module maintains the consistency of compression-insensitive features by fusing them into the artifacts-reduction baseline. Our method achieves an average 2.06 dB PSNR gain on BSD500, outperforming state-of-the-art methods, while requiring only 29.7 ms to process one image. Experimental results on LIVE1 and LIU4K further demonstrate the efficiency, effectiveness, and superiority of the proposed method in terms of quantitative metrics, visual quality, and downstream machine vision tasks.
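One way to picture the cross-feature fusion is channel-wise concatenation of the two feature types followed by a learned 1×1 projection that produces the decoding guidance. The numpy sketch below is built on that assumption only; channel counts, names, and the fusion rule are hypothetical, not DAGN's actual module.

```python
import numpy as np

rng = np.random.default_rng(1)
H, W = 32, 32
c_ins, c_sen, c_out = 16, 8, 16     # hypothetical channel counts

f_insensitive = rng.standard_normal((c_ins, H, W))  # semantics-preserving branch
f_sensitive   = rng.standard_normal((c_sen, H, W))  # compression-degree-aware branch

# A 1x1 convolution over channels is a per-pixel linear map; these
# weights would be learned in practice.
W1 = rng.standard_normal((c_out, c_ins + c_sen)) * 0.1

fused = np.concatenate([f_insensitive, f_sensitive], axis=0)  # (c_ins+c_sen, H, W)
guidance = np.einsum('oc,chw->ohw', W1, fused)                # (c_out, H, W)
```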
3. Fu X, Wang M, Cao X, Ding X, Zha ZJ. A Model-Driven Deep Unfolding Method for JPEG Artifacts Removal. IEEE Trans Neural Netw Learn Syst 2022;33:6802-6816. [PMID: 34081590] [DOI: 10.1109/TNNLS.2021.3083504]
Abstract
Deep learning-based methods have achieved notable progress in removing blocking artifacts caused by lossy JPEG compression. However, most of them handle this task by designing black-box network architectures that directly learn the relationship between a compressed image and its clean version. Such architectures often lack sufficient interpretability, which limits further improvements in deblocking performance. To address this issue, we propose a model-driven deep unfolding method for JPEG artifacts removal with interpretable network structures. First, we build a maximum a posteriori (MAP) model for deblocking using convolutional dictionary learning and design an iterative optimization algorithm based on proximal operators. Second, we unfold this iterative algorithm into a learnable deep network in which each module corresponds to a specific operation of the algorithm. In this way, our network inherits both the representational power of data-driven deep learning and the interpretability of traditional model-driven methods. By training the proposed network end-to-end, all learnable modules can be automatically explored to characterize the representations of both JPEG artifacts and image content. Experiments on synthetic and real-world datasets show that our method generates competitive or even better deblocking results than state-of-the-art methods, both quantitatively and qualitatively.
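Each unfolded stage corresponds to one iteration of a proximal-gradient scheme: a data-fidelity gradient step followed by a proximal step whose parameters become learnable network weights. A generic numpy sketch of the kind of iteration being unfolded is below; the dense operator A, step size, and threshold are illustrative stand-ins, not the paper's convolutional dictionary.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(2)
A = rng.standard_normal((64, 128))       # stand-in linear degradation operator
x_true = soft_threshold(rng.standard_normal(128), 1.0)  # sparse ground truth
y = A @ x_true                           # "compressed" observation

eta = 1.0 / np.linalg.norm(A, 2) ** 2    # step size from the spectral norm
tau = 0.01
x = np.zeros(128)
for k in range(50):                      # unfolding fixes this count as the depth
    x = soft_threshold(x - eta * A.T @ (A @ x - y), tau)
    # In a deep-unfolded network, eta and tau (and the operator's role)
    # are learned per stage rather than fixed.
```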
4. Chen H, He X, Yang H, Qing L, Teng Q. A Feature-Enriched Deep Convolutional Neural Network for JPEG Image Compression Artifacts Reduction and its Applications. IEEE Trans Neural Netw Learn Syst 2022;33:430-444. [PMID: 34793307] [DOI: 10.1109/TNNLS.2021.3124370]
Abstract
The amount of multimedia data, such as images and videos, has been increasing rapidly with the development of various imaging devices and the Internet, putting more stress on information storage and transmission. Redundancy in images can be reduced via lossy compression, most commonly with the Joint Photographic Experts Group (JPEG) standard. However, the decompressed images generally suffer from various artifacts (e.g., blocking, banding, ringing, and blurring) due to the loss of information, especially at high compression ratios. This article presents a feature-enriched deep convolutional neural network for compression artifacts reduction (FeCarNet for short). Taking a dense network as the backbone, FeCarNet enriches features with multi-scale dilated convolutions, together with efficient 1×1 convolutions that lower both parameter count and computation cost. Meanwhile, to make full use of the different levels of features in FeCarNet, a fusion block consisting of attention-based channel recalibration and dimension reduction is developed for local and global feature fusion. Furthermore, short and long residual connections in both the feature and pixel domains are combined into a multi-level residual structure, benefiting network training and performance. In addition, to further reduce computational complexity, pixel-shuffle-based image downsampling and upsampling layers are arranged at the head and tail of FeCarNet, respectively, which also enlarges the receptive field of the whole network. Experimental results show the superiority of FeCarNet over state-of-the-art compression artifacts reduction approaches in terms of both restoration capacity and model complexity. Applications of FeCarNet to several computer vision tasks, including image deblurring, edge detection, image segmentation, and object detection, further demonstrate its effectiveness.
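Multi-scale dilated convolutions enlarge the receptive field without extra parameters by inserting zeros between kernel taps. A small numpy/scipy sketch of building and applying dilated versions of one 3×3 kernel follows; the kernel values and dilation rates are illustrative, not FeCarNet's learned filters.

```python
import numpy as np
from scipy.signal import convolve2d

def dilate_kernel(k, rate):
    """Insert (rate - 1) zeros between the taps of a 2-D kernel."""
    r, c = k.shape
    out = np.zeros(((r - 1) * rate + 1, (c - 1) * rate + 1))
    out[::rate, ::rate] = k
    return out

img = np.random.default_rng(3).random((64, 64))
base = np.ones((3, 3)) / 9.0             # stand-in for a learned 3x3 kernel

# Same 9 taps, receptive fields 3x3, 5x5, 7x7 -> multi-scale features
features = [convolve2d(img, dilate_kernel(base, r), mode='same')
            for r in (1, 2, 3)]
stacked = np.stack(features)             # would feed the fusion block
```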
5. Gandam A, Sidhu JS, Verma S, Jhanjhi NZ, Nayyar A, Abouhawwash M, Nam Y. An efficient post-processing adaptive filtering technique to rectifying the flickering effects. PLoS One 2021;16:e0250959. [PMID: 33970949] [PMCID: PMC8109823] [DOI: 10.1371/journal.pone.0250959]
Abstract
Compression at very low bit rates (≤ 0.5 bpp) degrades video frames decoded with standard algorithms such as H.261, H.262, H.264, MPEG-1, and MPEG-4, producing numerous artifacts. This paper presents an efficient pre- and post-processing technique (PP-AFT) to rectify quantization error, ringing, blocking artifacts, and the flickering effect, all of which significantly degrade the visual quality of video frames. PP-AFT uses an activity function to partition blocked images or frames into regions and applies adaptive filters designed for each classified region. The method also introduces an adaptive flicker extraction and removal step and a 2-D filter to remove ringing effects in edge regions. PP-AFT is evaluated on various videos and compared with existing techniques using performance metrics such as PSNR-B, MSSIM, and GBIM. Simulation results show significant improvement in the subjective quality of video frames. The proposed method outperforms state-of-the-art deblocking methods in PSNR-B, with average gains between 0.7 and 1.9 dB, while reducing average GBIM by 35.83-47.7% and keeping MSSIM very close to that of the original sequence (about 0.978).
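The region classification can be illustrated with a simple activity measure such as local variance: low-activity (flat) blocks receive a stronger deblocking filter, high-activity (edge/texture) blocks a milder one. The numpy/scipy sketch below works under that assumption; the paper's exact activity function and filter designs are not reproduced.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, uniform_filter

def local_variance(img, size=8):
    """Blockwise activity measure: moving-window variance."""
    mean = uniform_filter(img, size)
    return uniform_filter(img * img, size) - mean * mean

frame = np.random.default_rng(4).random((128, 128))
activity = local_variance(frame)
thresh = np.median(activity)                 # illustrative split point

smooth = gaussian_filter(frame, sigma=1.5)   # strong filter for flat regions
detail = gaussian_filter(frame, sigma=0.5)   # mild filter for active regions
out = np.where(activity < thresh, smooth, detail)
```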
Affiliation(s)
- Anudeep Gandam: Department of Electronics and Communication Engineering, IKG-Punjab Technical University Jalandhar, Punjab, India
- Jagroop Singh Sidhu: Department of Electronics and Communication Engineering, DAVIET Jalandhar, Punjab, India
- Sahil Verma: Department of Computer Science and Engineering, Chandigarh University, Mohali, Punjab, India
- N. Z. Jhanjhi: School of Computer Science and Engineering, SCE Taylor’s University, Subang Jaya, Malaysia
- Anand Nayyar: Graduate School and Faculty of Information Technology, Duy Tan University, Da Nang, Viet Nam
- Mohamed Abouhawwash: Department of Mathematics, Faculty of Science, Mansoura University, Mansoura, Egypt; Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan, United States of America
- Yunyoung Nam: Department of Computer Science and Engineering, Soonchunhyang University, Asan, Korea
6. Mu J, Xiong R, Fan X, Liu D, Wu F, Gao W. Graph-Based Non-Convex Low-Rank Regularization for Image Compression Artifact Reduction. IEEE Trans Image Process 2020;29:5374-5385. [PMID: 32149688] [DOI: 10.1109/TIP.2020.2975931]
Abstract
Block-transform-coded images usually suffer from annoying artifacts at low bit rates because DCT coefficients are quantized independently. Image prior models play an important role in compressed-image reconstruction: natural image patches in a small neighborhood of the high-dimensional image space usually lie on an underlying sub-manifold. To model the signal distribution, we extract this sub-manifold structure as prior knowledge, using graph Laplacian regularization to characterize it at the patch level and exploiting similar patches as samples to estimate the distribution of a particular patch. Instead of the Euclidean distance, we propose a graph-domain distance to measure patch similarity. We then perform low-rank regularization on each group of similar patches, incorporating a non-convex lp penalty as a surrogate for matrix rank, and solve the resulting non-convex problem with an alternating minimization strategy. Experimental results show that the proposed method achieves more accurate reconstruction than state-of-the-art methods in both objective and perceptual quality.
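A non-convex lp penalty on singular values is commonly handled by a reweighted soft-thresholding of the SVD of the similar-patch matrix. The numpy sketch below shows one such shrinkage step under that assumption; the parameters lam and p, the inner iteration count, and the toy patch matrix are illustrative, not the paper's solver.

```python
import numpy as np

def lp_shrink_singular_values(M, lam=1.0, p=0.5, inner_iters=5):
    """Approximate prox of lam * ||sigma||_p^p via reweighted
    soft-thresholding of the singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    x = s.copy()
    for _ in range(inner_iters):
        w = lam * p * np.power(np.maximum(x, 1e-8), p - 1.0)  # lp gradient weight
        x = np.maximum(s - w, 0.0)
    return U @ np.diag(x) @ Vt

rng = np.random.default_rng(5)
low_rank = rng.standard_normal((64, 10)) @ rng.standard_normal((10, 40))
group = low_rank + 0.3 * rng.standard_normal((64, 40))  # similar-patch matrix
restored = lp_shrink_singular_values(group, lam=2.0, p=0.5)
```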
7. Jia C, Wang S, Zhang X, Wang S, Liu J, Pu S, Ma S. Content-Aware Convolutional Neural Network for In-Loop Filtering in High Efficiency Video Coding. IEEE Trans Image Process 2019;28:3343-3356. [PMID: 30714920] [DOI: 10.1109/TIP.2019.2896489]
Abstract
Recently, convolutional neural networks (CNNs) have attracted tremendous attention and achieved great success in many image processing tasks. In this paper, we combine CNN-based image restoration with video coding and propose a content-aware CNN-based in-loop filter for High Efficiency Video Coding (HEVC). In particular, we quantitatively analyze the structure of the proposed CNN model along multiple dimensions to make it interpretable and effective for CNN-based loop filtering. More specifically, each coding tree unit (CTU) is treated as an independent region for processing, and a content-aware multimodel filtering mechanism restores different regions with different CNN models under the guidance of a discriminative network. To adapt to the image content, the discriminative network learns to analyze the content characteristics of each region and select the appropriate deep model. CTU-level control is also enabled in the sense of rate-distortion optimization. To learn the CNN models, an iterative training method is proposed that simultaneously labels filter categories at the CTU level and fine-tunes the model parameters. The CNN-based in-loop filter is placed after sample adaptive offset in HEVC, and extensive experiments show that the proposed approach significantly improves coding performance, achieving up to 10.0% bit-rate reduction; on average, 4.1%, 6.0%, 4.7%, and 6.0% bit-rate reductions are obtained under the all-intra, low-delay, low-delay-P, and random-access configurations, respectively.
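The multimodel mechanism reduces to: for each CTU, pick (or skip) the restoration model whose output is cheapest in a rate-distortion sense. The numpy toy below uses plain MSE against the original as the cost and simple filters as stand-ins for the CNN models and the discriminative network; everything here is a simplified assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(6)
orig = rng.random((128, 128))                           # original (encoder side)
recon = orig + 0.05 * rng.standard_normal(orig.shape)   # pre-filter reconstruction

models = [lambda b: b,                         # skip: CTU-level off switch
          lambda b: gaussian_filter(b, 0.8),   # stand-in "model 1"
          lambda b: gaussian_filter(b, 1.5)]   # stand-in "model 2"

ctu = 64
out = np.empty_like(recon)
for i in range(0, recon.shape[0], ctu):
    for j in range(0, recon.shape[1], ctu):
        block = recon[i:i+ctu, j:j+ctu]
        target = orig[i:i+ctu, j:j+ctu]
        candidates = [m(block) for m in models]
        costs = [np.mean((c - target) ** 2) for c in candidates]  # RD-cost proxy
        out[i:i+ctu, j:j+ctu] = candidates[int(np.argmin(costs))]
```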
8. Zhang M, Desrosiers C. High-Quality Image Restoration Using Low-Rank Patch Regularization and Global Structure Sparsity. IEEE Trans Image Process 2018;28:868-879. [PMID: 30296228] [DOI: 10.1109/TIP.2018.2874284]
Abstract
In recent years, approaches based on nonlocal self-similarity and global structure regularization have led to significant improvements in image restoration. Nonlocal self-similarity exploits the repetitiveness of small image patches as a powerful prior in the reconstruction process; global structure regularization rests on the principle that object structure is carried by a relatively small portion of pixels, so enforcing sparsity on this structural information reduces reconstruction artifacts. So far, most image restoration approaches have considered one of these two strategies, but not both. This paper presents a novel image restoration method that combines nonlocal self-similarity and global structure sparsity in a single efficient model. Groups of similar patches are reconstructed simultaneously via an adaptive regularization technique based on the weighted nuclear norm. Moreover, global structure is preserved with a strategy that decomposes the image into a smooth component and a sparse residual, the latter regularized with the l1 norm. An optimization technique based on the Alternating Direction Method of Multipliers (ADMM) recovers corrupted images efficiently. The performance of the proposed method is evaluated on two important restoration tasks, image completion and super-resolution, and experimental results show it to outperform state-of-the-art approaches for various types and levels of image corruption.
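The weighted nuclear norm proximal step shrinks each singular value by a weight inversely proportional to its magnitude, so dominant structure is shrunk less than noise-dominated components. A numpy sketch of that step follows; the weight rule is the common WNNM choice, and the constants and toy patch group are illustrative.

```python
import numpy as np

def wnn_prox(M, c=1.0, eps=1e-6):
    """Proximal step for the weighted nuclear norm: larger singular
    values get smaller weights and are therefore shrunk less."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    w = c / (s + eps)                   # common WNNM weighting
    s_shrunk = np.maximum(s - w, 0.0)
    return U @ np.diag(s_shrunk) @ Vt

rng = np.random.default_rng(7)
patches = rng.standard_normal((49, 8)) @ rng.standard_normal((8, 30))
noisy_group = patches + 0.2 * rng.standard_normal(patches.shape)
cleaned_group = wnn_prox(noisy_group, c=2.0)
```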
9. Song Q, Xiong R, Liu D, Xiong Z, Wu F, Gao W. Fast Image Super-Resolution via Local Adaptive Gradient Field Sharpening Transform. IEEE Trans Image Process 2018;27:1966-1980. [PMID: 33156782] [DOI: 10.1109/TIP.2017.2789323]
Abstract
This paper proposes a single-image super-resolution scheme built on a gradient field sharpening transform that converts the blurry gradient field of an upsampled low-resolution (LR) image into the much sharper gradient field of the original high-resolution (HR) image. Unlike existing methods, which must recover the whole gradient profile structure and locate edge points, our approach sharpens the gradient field adaptively using only the pixels in a small neighborhood. To maintain image contrast, the gradient is adaptively scaled so that the integral of the gradient field remains stable. Finally, the HR image is reconstructed by fusing the LR image with the sharpened HR gradient field. Experimental results demonstrate that the proposed algorithm generates a more accurate gradient field and produces super-resolved images of better objective and visual quality. Another advantage is that the gradient sharpening transform is very fast and suitable for low-complexity applications.
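One way to picture the sharpening step is to raise the local gradient magnitude nonlinearly and then rescale so the total gradient energy (and hence contrast) is unchanged; the final fusion with the LR image is omitted here. The numpy sketch below is only an illustration under those assumptions, with a gamma exponent standing in for the paper's adaptive transform.

```python
import numpy as np

def sharpen_gradient_field(img, gamma=1.5, eps=1e-8):
    """Sharpen gradients by exponentiating magnitude (gamma > 1),
    then rescale so the integral of |grad| is preserved."""
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    sharp = np.power(mag + eps, gamma)        # steeper edge profiles
    sharp *= mag.sum() / sharp.sum()          # keep overall contrast stable
    scale = sharp / (mag + eps)
    return gx * scale, gy * scale             # sharpened gradient field

up = np.random.default_rng(8).random((64, 64))  # stand-in upsampled LR image
gx_s, gy_s = sharpen_gradient_field(up)
# The HR image would then be reconstructed by fusing `up` with (gx_s, gy_s).
```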
10. Ghorai M, Mandal S, Chanda B. A Group-Based Image Inpainting Using Patch Refinement in MRF Framework. IEEE Trans Image Process 2018;27:556-567. [PMID: 29136609] [DOI: 10.1109/TIP.2017.2768180]
Abstract
This paper presents a Markov random field (MRF)-based image inpainting algorithm that selects patches from groups of similar patches and assigns them optimally through joint patch refinement. For patch selection, a novel group formation strategy based on subspace clustering restricts the search for candidate patches to the relevant source region only, improving patch searching in both quality and time. We also propose an efficient patch refinement scheme using higher-order singular value decomposition to capture the underlying pattern shared among the candidate patches, which eliminates random variation and unwanted artifacts. Finally, a weight term computed from the refined patches is incorporated into the objective function of the MRF model to improve optimal patch assignment. Experimental results on a large number of natural images, along with comparisons with well-known existing methods, demonstrate the efficacy and superiority of the proposed method.
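As a simplified stand-in for the paper's higher-order SVD, the refinement idea can be shown with a plain truncated SVD of the candidate-patch matrix: keeping only the leading components retains the pattern shared across candidates and discards random variation. Rank, sizes, and data below are illustrative.

```python
import numpy as np

def refine_patch_group(patches, rank=3):
    """patches: (n_candidates, patch_pixels). Truncated SVD keeps the
    pattern shared across candidates and suppresses outlier variation.
    (Plain SVD here; the paper applies higher-order SVD to the stack.)"""
    U, s, Vt = np.linalg.svd(patches, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

rng = np.random.default_rng(9)
pattern = rng.random(64)                                 # shared 8x8 patch pattern
group = pattern + 0.15 * rng.standard_normal((10, 64))   # 10 noisy candidates
refined = refine_patch_group(group, rank=2)
```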
11. Liu X, Cheung G, Wu X, Zhao D. Random Walk Graph Laplacian-Based Smoothness Prior for Soft Decoding of JPEG Images. IEEE Trans Image Process 2017;26:509-524. [PMID: 27849534] [DOI: 10.1109/TIP.2016.2627807]
Abstract
Given the prevalence of Joint Photographic Experts Group (JPEG) compressed images, optimizing image reconstruction from the compressed format remains an important problem. Instead of simply reconstructing a pixel block from the centers of the indexed discrete cosine transform (DCT) coefficient quantization bins (hard decoding), soft decoding reconstructs a block by selecting appropriate coefficient values within the indexed bins with the help of signal priors. The challenge lies in defining suitable priors and applying them effectively. In this paper, we combine three image priors, a Laplacian prior for DCT coefficients, a sparsity prior, and a graph-signal smoothness prior for image patches, to construct an efficient JPEG soft decoding algorithm. Specifically, we first use the Laplacian prior to compute a minimum mean square error initial solution for each code block. Next, we show that while the sparsity prior can reduce block artifacts, limiting the size of the overcomplete dictionary (to lower computation) leads to poor recovery of high DCT frequencies. To alleviate this problem, we design a new graph-signal smoothness prior, under which the desired signal has mainly low graph frequencies, based on the left eigenvectors of the random walk graph Laplacian matrix (LERaG). Compared with previous graph-signal smoothness priors, LERaG has desirable image filtering properties with low computational overhead. We demonstrate how LERaG facilitates recovery of the high DCT frequencies of a piecewise smooth signal via an interpretation of low-graph-frequency components as relaxed solutions to the normalized cut in spectral clustering. Finally, we construct a soft decoding algorithm using the three signal priors with appropriate prior weights. Experimental results show that our proposal noticeably outperforms state-of-the-art soft decoding algorithms in both objective and subjective evaluations.
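Soft decoding is anchored by the hard constraint that each DCT coefficient must lie inside its indexed quantization bin, so any prior-driven estimate is projected back onto that interval. A numpy sketch of this projection is below, assuming a uniform quantizer with rounding; the prior-driven estimate is a random stand-in for what the paper's priors would produce.

```python
import numpy as np

def project_to_bins(coef_est, bin_idx, q_step):
    """Clip estimated DCT coefficients into the quantization bins
    implied by the transmitted indices (bin center = idx * q_step)."""
    lo = (bin_idx - 0.5) * q_step
    hi = (bin_idx + 0.5) * q_step
    return np.clip(coef_est, lo, hi)

rng = np.random.default_rng(10)
coef_true = 5.0 * rng.standard_normal((8, 8))
q = 4.0
idx = np.round(coef_true / q)                  # what the JPEG stream carries
hard = idx * q                                 # hard decoding: bin centers
prior_estimate = hard + rng.standard_normal((8, 8))   # prior-driven guess
soft = project_to_bins(prior_estimate, idx, q)        # feasible soft decoding
```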
12. Xiong R, Liu H, Zhang X, Zhang J, Ma S, Wu F, Gao W. Image Denoising via Bandwise Adaptive Modeling and Regularization Exploiting Nonlocal Similarity. IEEE Trans Image Process 2016;25:5793-5805. [PMID: 28114070] [DOI: 10.1109/TIP.2016.2614160]
Abstract
This paper proposes a new image denoising algorithm based on adaptive signal modeling and regularization. It improves image quality by regularizing each image patch using bandwise distribution modeling in the transform domain. Instead of a single global model for all patches, it employs content-dependent adaptive models to address the non-stationarity of image signals and the diversity among transform bands. The distribution model is estimated adaptively for each patch, varying both from one patch location to another and across bands; in particular, the estimated distribution is allowed to have a non-zero expectation. To estimate the expectation and variance parameters for every band of a particular patch, we exploit the nonlocal correlation within the image, collecting a set of highly similar patches as data samples for the distribution while excluding irrelevant patches, so the adaptively learned model is more accurate than a global one. The image is ultimately restored via bandwise adaptive soft-thresholding, based on a Laplacian approximation of the distribution of the similar-patch group's transform coefficients. Experimental results demonstrate that the proposed scheme outperforms several state-of-the-art denoising methods in both objective and perceptual quality.
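With a Laplacian model of non-zero mean per band, MAP shrinkage reduces to soft-thresholding each coefficient toward the mean estimated from the similar-patch group. The numpy sketch below assumes that setup; the synthetic coefficient group, noise level, and the textbook Laplacian-MAP threshold are illustrative, not the paper's exact estimator.

```python
import numpy as np

def soft(v, tau):
    """Soft-thresholding (MAP shrinkage under a Laplacian prior)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(11)
# Rows: transform bands of one patch; columns: similar patches.
group = rng.laplace(loc=0.0, scale=1.0, size=(64, 12))
group[:8] += 3.0                                    # low bands: non-zero expectation

mu = group.mean(axis=1, keepdims=True)              # per-band expectation
b = np.abs(group - mu).mean(axis=1, keepdims=True)  # Laplacian scale estimate
sigma_n = 0.5                                       # assumed noise std
tau = sigma_n ** 2 / np.maximum(b, 1e-8)            # Laplacian-MAP soft threshold

denoised = mu + soft(group - mu, tau)               # shrink deviations toward mean
```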