1. Huang Y, Li J, Hu Y, Gao X, Huang H. Transitional Learning: Exploring the Transition States of Degradation for Blind Super-resolution. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023;45:6495-6510. PMID: 36107902. DOI: 10.1109/tpami.2022.3206870.
Abstract
Because they depend heavily on iterative estimation of the degradation prior or on optimizing the model from scratch, existing blind super-resolution (SR) methods are generally time-consuming and less effective: the degradation estimate starts from a blind initialization and lacks an interpretable representation of the degradation. To address this, this article proposes a transitional learning method for blind SR that uses an end-to-end network with no additional iterations at inference, and explores an effective representation for unknown degradations. We first analyze and demonstrate the transitionality of degradations as interpretable prior information for indirectly inferring an unknown degradation model, covering the widely used additive and convolutive degradations. We then propose a novel Transitional Learning method for blind Super-Resolution (TLSR), which adaptively infers a transitional transformation function to handle unknown degradations without any iterative operations at inference. Specifically, the end-to-end TLSR network consists of a degree-of-transitionality (DoT) estimation network, a homogeneous feature extraction network, and a transitional learning module. Quantitative and qualitative evaluations on blind SR tasks demonstrate that TLSR achieves superior performance at lower complexity than state-of-the-art blind SR methods. The code is available at github.com/YuanfeiHuang/TLSR.
2. Yan J, Zhang K, Luo S, Xu J, Lu J, Xiong Z. Learning graph-constrained cascade regressors for single image super-resolution. Applied Intelligence 2022. DOI: 10.1007/s10489-021-02904-3.
3. Huang Y, Li J, Gao X, Hu Y, Lu W. Interpretable Detail-Fidelity Attention Network for Single Image Super-Resolution. IEEE Transactions on Image Processing 2021;30:2325-2339. PMID: 33481708. DOI: 10.1109/tip.2021.3050856.
Abstract
Benefiting from the strong feature representation and nonlinear mapping capabilities of deep CNNs, deep-learning-based methods have achieved excellent performance in single image super-resolution. However, most existing SR methods depend on the high capacity of networks initially designed for visual recognition, and rarely consider the original aim of super-resolution: detail fidelity. Pursuing this aim raises two challenging issues: (1) learning operators that adapt to the diverse characteristics of smooth regions and details; (2) improving the model's ability to preserve low-frequency smooth regions and reconstruct high-frequency details. To solve these problems, we propose a purposeful and interpretable detail-fidelity attention network that progressively processes smooth regions and details in a divide-and-conquer manner, a novel perspective on image super-resolution aimed at improving detail fidelity. The proposed method moves beyond blindly designing or reusing deep CNN architectures solely for feature representation in local receptive fields. In particular, we propose Hessian filtering for interpretable high-profile feature representation for detail inference, together with a dilated encoder-decoder and a distribution alignment cell that refine the inferred Hessian features in morphological and statistical manners, respectively. Extensive experiments demonstrate that the proposed method achieves superior performance compared with state-of-the-art methods, both quantitatively and qualitatively. The code is available at github.com/YuanfeiHuang/DeFiAN.
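Hessian filtering, as used here for separating details from smooth regions, builds on second-order image derivatives. The following is a minimal illustrative sketch of that general idea using finite differences, not the DeFiAN implementation; the function name and the Frobenius-norm detail indicator are assumptions for illustration:

```python
import numpy as np

def hessian_detail_map(img):
    """Rough detail map from second-order finite differences.

    The per-pixel Hessian [[Ixx, Ixy], [Ixy, Iyy]] responds strongly
    on edges and textures and weakly on smooth regions; its Frobenius
    norm serves here as a simple detail indicator.
    """
    Ixx = np.gradient(np.gradient(img, axis=1), axis=1)
    Iyy = np.gradient(np.gradient(img, axis=0), axis=0)
    Ixy = np.gradient(np.gradient(img, axis=0), axis=1)
    return np.sqrt(Ixx**2 + Iyy**2 + 2 * Ixy**2)

# A flat image yields a (near-)zero detail map; a step edge does not.
flat = np.ones((8, 8))
edge = np.zeros((8, 8)); edge[:, 4:] = 1.0
assert hessian_detail_map(flat).max() < 1e-12
assert hessian_detail_map(edge).max() > 0
```

A smooth/detail split can then be obtained by thresholding this map, which is the spirit of the divide-and-conquer processing described above.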
4. Jiang J, Yu Y, Wang Z, Tang S, Hu R, Ma J. Ensemble Super-Resolution With a Reference Dataset. IEEE Transactions on Cybernetics 2020;50:4694-4708. PMID: 30843812. DOI: 10.1109/tcyb.2018.2890149.
Abstract
By developing sophisticated image priors or designing deep(er) architectures, a variety of image super-resolution (SR) approaches have recently been proposed and have achieved very promising performance. A natural question is whether these methods can be reformulated into a unifying framework, and whether this framework assists SR reconstruction. In this paper, we present a simple but effective single-image SR method based on ensemble learning, which can outperform any of the ensembled SR methods (called component super-resolvers). Based on the assumption that a better component super-resolver should receive a larger ensemble weight, we present a maximum a posteriori (MAP) estimation framework for inferring the optimal ensemble weights. In particular, we introduce a reference dataset composed of high-resolution (HR) and low-resolution (LR) image pairs to measure the SR ability (prior knowledge) of each component super-resolver. To obtain the optimal ensemble weights, we incorporate into the MAP framework both the reconstruction constraint, which states that the degraded HR estimate should match the LR observation, and the prior knowledge of the ensemble weights. Moreover, the resulting optimization problem admits an analytical solution. We evaluate the proposed method against competitive approaches, including four state-of-the-art non-deep-learning methods, four recent deep-learning methods, and one ensemble-learning method, and demonstrate its effectiveness and superiority on general image datasets and face image datasets.
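A MAP weight inference of this kind, a reconstruction constraint plus a Gaussian-style prior on the weights, admits a closed-form solution. The sketch below is a simplified stand-in, not the paper's exact solver; `degrade`, `prior_w`, and `lam` are assumed placeholders for the degradation operator, the reference-dataset prior, and the balance parameter:

```python
import numpy as np

def ensemble_weights(hr_estimates, lr_obs, degrade, prior_w, lam=0.1):
    """Closed-form MAP-style ensemble weights (illustrative).

    Minimizes ||lr_obs - sum_k w_k * degrade(hr_k)||^2 + lam * ||w - prior_w||^2,
    i.e. the reconstruction constraint plus a prior centred on weights
    learned from a reference dataset.
    """
    # Stack each degraded component estimate as a column of A.
    A = np.stack([degrade(h).ravel() for h in hr_estimates], axis=1)
    b = lr_obs.ravel()
    K = A.shape[1]
    # Normal equations of the regularized least-squares problem.
    w = np.linalg.solve(A.T @ A + lam * np.eye(K), A.T @ b + lam * prior_w)
    return w

# Toy check: with an identity "degradation", the solver puts more
# weight on the component super-resolver that matches the observation.
rng = np.random.default_rng(0)
truth = rng.standard_normal((4, 4))
good, bad = truth, truth + 5.0
w = ensemble_weights([good, bad], truth, lambda x: x, np.array([0.5, 0.5]), lam=1e-3)
assert w[0] > w[1]
```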
6. Generative Adversarial Network for Image Super-Resolution Combining Texture Loss. Applied Sciences (Basel) 2020. DOI: 10.3390/app10051729.
Abstract
Objective: Super-resolution reconstruction is an increasingly important area of computer vision. To alleviate two problems of GAN-based super-resolution models, difficult training and artifacts in the reconstructed results, we propose a novel and improved algorithm. Methods: This paper presents TSRGAN (Super-Resolution Generative Adversarial Network Combining Texture Loss), also based on generative adversarial networks, with redefined generator and discriminator networks. First, regarding network structure, the generator is built from residual dense blocks without extra batch normalization layers, and the Visual Geometry Group (VGG)19 network is adopted as the backbone of the discriminator. Second, regarding the loss function, a weighted combination of four losses, texture loss, perceptual loss, adversarial loss, and content loss, serves as the generator objective. Texture loss encourages local information matching; perceptual loss is strengthened by computing it on features before the activation layer; adversarial loss is optimized following WGAN-GP (Wasserstein GAN with Gradient Penalty); and content loss preserves the accuracy of low-frequency information. During optimization, the target image is thus reconstructed from both high- and low-frequency perspectives. Results: Experiments showed that our method achieved an average Peak Signal-to-Noise Ratio of 27.99 dB and an average Structural Similarity Index of 0.778 on the reconstructed images without sacrificing much speed, outperforming the comparison algorithms on objective evaluation metrics. Moreover, TSRGAN markedly improved subjective visual quality, such as brightness and texture details: it generates images with more realistic textures and more accurate brightness, better matching human visual evaluation. Conclusions: Our changes to the network structure reduce the model's computational cost and stabilize training, and the proposed generator loss provides stronger supervision for restoring realistic textures and achieving brightness consistency. Experimental results prove the effectiveness and superiority of the TSRGAN algorithm.
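The four-term generator objective described in the Methods is a weighted sum of loss values. A minimal sketch follows; the individual loss values and the weight coefficients below are hypothetical placeholders, not the paper's tuned settings or networks:

```python
# Hypothetical weights for illustration only; the paper tunes its own.
LOSS_WEIGHTS = {"content": 1.0, "perceptual": 0.006,
                "texture": 1e-4, "adversarial": 1e-3}

def generator_objective(losses, weights=LOSS_WEIGHTS):
    """Weighted sum of the four generator losses (a sketch of a
    TSRGAN-style objective; each loss value would come from its own
    network or feature comparison in a real training loop)."""
    return sum(weights[name] * value for name, value in losses.items())

# Example: combine four raw loss values into one scalar objective.
losses = {"content": 0.5, "perceptual": 10.0, "texture": 100.0, "adversarial": 2.0}
total = generator_objective(losses)
assert abs(total - (0.5 + 0.06 + 0.01 + 0.002)) < 1e-12
```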
7. Jiang J, Yu Y, Wang Z, Liu X, Ma J. Graph-Regularized Locality-Constrained Joint Dictionary and Residual Learning for Face Sketch Synthesis. IEEE Transactions on Image Processing 2019;28:628-641. PMID: 30235127. DOI: 10.1109/tip.2018.2870936.
Abstract
Face sketch synthesis is a crucial problem in digital entertainment and law enforcement, as it can bridge the considerable texture discrepancy between face photos and sketches. Most current face sketch synthesis approaches learn the relationship between photos and sketches directly, which makes it very difficult for them to generate individual-specific features, which we call rare characteristics. In this paper, we propose a novel face sketch synthesis approach based on residual learning. In contrast to traditional approaches, which reconstruct a sketch image directly (i.e., learn the mapping from photo to sketch), we predict the residual image, i.e., the difference between the photo and sketch, by learning the mapping from an observed photo to its residual. The residual mapping is easier to optimize than the original mapping and better recovers rare characteristic information. We also introduce a joint dictionary learning algorithm that preserves the local geometric structure of the data space. Through the learned joint dictionary, we transform face sketch synthesis from the image space into a new, compact space spanned by the learned dictionary atoms, where the manifold assumption can be further guaranteed. Comparisons with several state-of-the-art techniques, including recently proposed deep-learning-based approaches, show that the proposed method performs impressively on the face sketch synthesis task across three public face sketch datasets and various real-world photos.
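The residual parameterization above can be sketched in a few lines; `predict_residual` is an assumed stand-in for whatever learned regressor produces the residual, not the paper's dictionary-based model:

```python
import numpy as np

def synthesize_sketch(photo, predict_residual):
    """Residual-learning formulation: instead of mapping photo -> sketch
    directly, predict only residual = sketch - photo and add it back."""
    return photo + predict_residual(photo)

# With an oracle residual predictor the sketch is recovered exactly,
# illustrating that the residual parameterization loses nothing.
photo = np.linspace(0, 1, 16).reshape(4, 4)
sketch = 1.0 - photo  # toy "sketch"
oracle = lambda p: sketch - p
out = synthesize_sketch(photo, oracle)
assert np.allclose(out, sketch)
```

The practical point is that the residual is typically sparser and lower-energy than the sketch itself, which is what makes the mapping easier to optimize.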
8. Jiang J, Yu Y, Tang S, Ma J, Aizawa A, Aizawa K. Context-Patch Face Hallucination Based on Thresholding Locality-Constrained Representation and Reproducing Learning. IEEE Transactions on Cybernetics 2018;50:324-337. PMID: 30334810. DOI: 10.1109/tcyb.2018.2868891.
Abstract
Face hallucination reconstructs high-resolution (HR) faces from low-resolution (LR) faces using prior knowledge learned from HR/LR face pairs. Most state-of-the-art methods leverage position-patch prior knowledge of the human face to estimate optimal representation coefficients for each image patch. However, they use only position information and usually ignore the context of the image patch; moreover, when confronted with misalignment or the small sample size (SSS) problem, their hallucination performance is very poor. To this end, this paper incorporates the contextual information of the image patch and proposes a powerful and efficient context-patch-based face hallucination approach, namely thresholding locality-constrained representation and reproducing learning (TLcR-RL). Within the context-patch framework, we develop a thresholding-based representation method that improves reconstruction accuracy and reduces computational complexity. To further improve performance, we propose a promotion strategy called reproducing learning: by adding the estimated HR face to the training set, which simulates the case where the HR version of the input LR face is present in the training set, the final hallucination result is iteratively enhanced. Experiments demonstrate that TLcR-RL yields substantially better hallucination results, both subjectively and objectively. In addition, the proposed framework is more robust to face misalignment and the SSS problem, and it still hallucinates HR faces well when the LR test face comes from the real world. The MATLAB source code is available at https://github.com/junjun-jiang/TLcR-RL.
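The thresholding-plus-locality idea, keep only the k nearest training patches and solve a small regularized least-squares problem, can be sketched as below. This is a simplified illustration, not the released TLcR-RL code; the function name, the ridge term `tau`, and the patch dimensions are assumptions:

```python
import numpy as np

def tlcr_patch(lr_patch, lr_train, hr_train, k=5, tau=0.1):
    """Sketch of thresholding locality-constrained patch reconstruction.

    Only the k training patches nearest to the input contribute
    (the thresholding step); a small ridge term keeps the weights
    stable; the same weights then combine the paired HR patches.
    """
    d = np.linalg.norm(lr_train - lr_patch, axis=1)   # distances to dictionary
    idx = np.argsort(d)[:k]                           # thresholding: keep k nearest
    A = lr_train[idx].T                               # (dim, k) local dictionary
    G = A.T @ A + tau * np.eye(k)
    w = np.linalg.solve(G, A.T @ lr_patch)
    w /= w.sum()                                      # sum-to-one constraint
    return hr_train[idx].T @ w                        # transfer weights to HR patches

# Toy usage with random paired LR/HR training patches.
rng = np.random.default_rng(1)
lr_train = rng.standard_normal((50, 8))
hr_train = rng.standard_normal((50, 32))
out = tlcr_patch(lr_train[3], lr_train, hr_train)
assert out.shape == (32,)
```

Discarding distant patches both speeds up the solve (k unknowns instead of the whole training set) and removes unreliable, dissimilar candidates, which matches the accuracy/complexity claim above.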
9. Pan H, Jing Z, Qiao L, Li M. Discriminative Structured Dictionary Learning on Grassmann Manifolds and Its Application on Image Restoration. IEEE Transactions on Cybernetics 2018;48:2875-2886. PMID: 28952956. DOI: 10.1109/tcyb.2017.2751585.
Abstract
Image restoration is a difficult and challenging problem in many imaging applications. Despite the benefits of a single overcomplete dictionary, several challenges remain in capturing the geometric structure of the image of interest. To represent the local structures of the underlying signals more accurately, we propose a new sparse representation formulation with a block-orthogonality constraint. There are three contributions. First, we propose a framework for discriminative structured dictionary learning, which leads to a smooth manifold structure and quotient search spaces. Second, we propose an alternating minimization scheme that accounts for both the cost function and the constraints, iteratively alternating between updating the block structure of the dictionary, defined on a Grassmann manifold, and automatically sparsifying the dictionary atoms. Third, a Riemannian conjugate gradient method is used to track local subspaces efficiently, with a convergence guarantee. Extensive experiments on various datasets demonstrate that the proposed method outperforms state-of-the-art methods in removing mixed Gaussian-impulse noise.
10. Alajarmeh A, Salam R, Abdulrahim K, Marhusin M, Zaidan A, Zaidan B. Real-time framework for image dehazing based on linear transmission and constant-time airlight estimation. Information Sciences 2018. DOI: 10.1016/j.ins.2018.01.009.
11. Zeng X, Huang H, Qi C. Expanding Training Data for Facial Image Super-Resolution. IEEE Transactions on Cybernetics 2018;48:716-729. PMID: 28166514. DOI: 10.1109/tcyb.2017.2655027.
Abstract
The quality of training data is very important for learning-based facial image super-resolution (SR): the more similar the training data is to the test input, the better the SR results. To generate a better set of low-/high-resolution training facial images for a particular test input, this paper is the first to propose expanding the training data to improve facial image SR. Observing that facial images are highly structured, we propose three constraints, the local structure constraint, the correspondence constraint, and the similarity constraint, to generate new training data in which local patches are expanded with different expansion parameters. The expanded training data can be used by both patch-based and global facial SR methods. Extensive tests on benchmark databases and real-world images validate the effectiveness of training data expansion in improving SR quality.
12. Liu L, Yang B, Huang H. No-reference stereopair quality assessment based on singular value decomposition. Neurocomputing 2018. DOI: 10.1016/j.neucom.2017.10.017.
13. Zhang H, Yang J, Qian J, Luo W. Nonconvex relaxation based matrix regression for face recognition with structural noise and mixed noise. Neurocomputing 2017. DOI: 10.1016/j.neucom.2016.12.095.
14. Jiang J, Ma J, Chen C, Jiang X, Wang Z. Noise Robust Face Image Super-Resolution Through Smooth Sparse Representation. IEEE Transactions on Cybernetics 2017;47:3991-4002. PMID: 28113611. DOI: 10.1109/tcyb.2016.2594184.
Abstract
Face image super-resolution has attracted much attention in recent years, and many algorithms have been proposed. Among them, sparse representation (SR)-based face image super-resolution approaches achieve competitive performance. However, these approaches perform well only when the input is noiseless or mildly noisy; when the input is corrupted by heavy noise, the reconstruction weights (or coefficients) of the input low-resolution (LR) patches become seriously unstable, leading to poor reconstruction results. To this end, we propose a novel sparse-representation-based face image super-resolution approach that incorporates smoothness priors to enforce similar sparse coding coefficients for similar training patches. Specifically, we add a fused least absolute shrinkage and selection operator (LASSO)-based smooth constraint and a locality-based smooth constraint to the least-squares patch representation in order to obtain stable reconstruction weights, especially when the noise level of the input LR image is high. Experiments on the benchmark FEI face database and the CMU+MIT face database show, both visually and quantitatively, that the proposed method yields superior reconstruction results when the input LR face image is contaminated by strong noise.
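A locality-based smooth constraint of the kind described can be written as a distance-weighted ridge penalty on the coding coefficients. The sketch below is a simplified stand-in for the paper's formulation (it omits the fused-LASSO term); the function name and the quadratic penalty form are assumptions:

```python
import numpy as np

def smooth_coding(lr_patch, dictionary, lam=0.1):
    """Locality-smooth least-squares coding (illustrative).

    Atoms far from the input are penalized more heavily, so the
    coefficients vary smoothly with patch similarity and stay more
    stable under noise than unconstrained least squares.
    """
    d = np.linalg.norm(dictionary - lr_patch, axis=1)  # per-atom distance
    P = np.diag(lam * d**2)                            # locality penalty
    A = dictionary.T                                   # (dim, n_atoms)
    w = np.linalg.solve(A.T @ A + P, A.T @ lr_patch)
    return w

# When the query equals a dictionary atom, that atom's coefficient
# dominates, since it is the only unpenalized explanation.
rng = np.random.default_rng(2)
D = rng.standard_normal((20, 6))   # 20 atoms, 6-dimensional patches
w = smooth_coding(D[0], D)
assert np.argmax(np.abs(w)) == 0
```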